If these data could talk

Cited by: 0
Authors
Thomas Pasquier
Matthew K. Lau
Ana Trisovic
Emery R. Boose
Ben Couturier
Mercè Crosas
Aaron M. Ellison
Valerie Gibson
Chris R. Jones
Margo Seltzer
Affiliations
[1] School of Engineering and Applied Sciences
[2] Harvard University
[3] Cambridge
[4] Harvard Forest
[5] Harvard University
[6] Petersham
[7] European Organization for Nuclear Research (CERN)
[8] Cavendish Laboratory
[9] University of Cambridge
[10] Institute for Quantitative Social Science
[11] Harvard University
Source
Scientific Data, Volume 4
DOI
Not available
Abstract
In the last few decades, data-driven methods have come to dominate many fields of scientific inquiry. Open data and open-source software have enabled the rapid implementation of novel methods to manage and analyze the growing flood of data. However, it has become apparent that many scientific fields exhibit distressingly low rates of reproducibility. Although there are many dimensions to this issue, we believe that one of them is a lack of formalism in describing published results end to end, from the data source through the analysis to the final publication. Even when authors do their best to make their research and data accessible, this lack of formalism reduces the clarity and efficiency of reporting, which contributes to issues of reproducibility. Data provenance aids reproducibility through systematic and formal records of the relationships among data sources, processes, datasets, publications and researchers.