Big Data and the Missing Links

Cited by: 8
Authors
De Veaux, Richard D. [1]
Hoerl, Roger W. [2]
Snee, Ronald D. [3]
Affiliations
[1] Williams Coll, Dept Stat, Williamstown, MA 01267 USA
[2] Union Coll, Dept Math, Schenectady, NY 12308 USA
[3] Snee Associates LLC, Newark, DE 19711 USA
Keywords
data mining; statistical engineering; quality; data integrity; data science; data quality; ethics
DOI
10.1002/sam.11303
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Although Big Data has the potential to help researchers in science and industry solve large and complex problems, basic statistical ideas are often ignored in the Big Data literature. It is not true that simply having massive amounts of data renders subject-matter models and experiments obsolete, removes the need to ensure data quality, or makes it unnecessary for variables to measure accurately what they are supposed to measure. We refer to these fundamentals as missing links in the Big Data process. In this paper, we illustrate the challenges of making decisions from Big Data through a series of case studies. We offer some strategies to help ensure that projects based on Big Data analyses are successful. © 2016 Wiley Periodicals, Inc.
Pages: 411-416
Page count: 6