Challenges of Big Data analysis

Source
National Science Review, 2014, Issue 2
Affiliations
Department of Operations Research and Financial Engineering, Princeton University; Department of Biostatistics, Johns Hopkins University
Funding
Supported by the National Science Foundation [DMS-1206464 to JQF, III-1116730 and III-1332109 to HL] and the National Institutes of Health [R01-GM100474 and R01-GM072611 to JQF].
Abstract

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity and measurement errors. These challenges are distinguished and require new computational and statistical paradigm. This paper gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize on the viability of the sparsest solution in high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
