Data-mining discovery of pattern and process in ecological systems

Cited by: 105
Authors
Hochachka, Wesley M. [1]
Caruana, Rich
Fink, Daniel
Munson, Art
Riedewald, Mirek
Sorokina, Daria
Kelling, Steve
Affiliations
[1] Cornell Univ, Ornithol Lab, Ithaca, NY 14850 USA
[2] Cornell Univ, Dept Comp Sci, Ithaca, NY 14853 USA
Keywords
bagging; data mining; decision trees; exploratory data analysis; hypothesis generation; machine learning; prediction
DOI
10.2193/2006-503
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
Most ecologists use statistical methods as their main analytical tools when analyzing data to identify relationships between a response and a set of predictors; thus, they treat all analyses as hypothesis tests or exercises in parameter estimation. However, little or no prior knowledge about a system can lead to creation of a statistical model or models that do not accurately describe major sources of variation in the response variable. We suggest that under such circumstances data mining is more appropriate for analysis. In this paper we 1) present the distinctions between data-mining (usually exploratory) analyses and parametric statistical (confirmatory) analyses, 2) illustrate 3 strengths of data-mining tools for generating hypotheses from data, and 3) suggest useful ways in which data mining and statistical analyses can be integrated into a thorough analysis of data to facilitate rapid creation of accurate models and to guide further research.
Pages: 2427-2437
Number of pages: 11
Related papers (first 10 of 50 shown)
  • [1] Collation and data-mining of literature bioactivity data for drug discovery
    Bellis, Louisa J.
    Akhtar, Ruth
    Al-Lazikani, Bissan
    Atkinson, Francis
    Bento, A. Patricia
    Chambers, Jon
    Davies, Mark
    Gaulton, Anna
    Hersey, Anne
    Ikeda, Kazuyoshi
    Krueger, Felix A.
    Light, Yvonne
    McGlinchey, Shaun
    Santos, Rita
    Stauch, Benjamin
    Overington, John P.
    BIOCHEMICAL SOCIETY TRANSACTIONS, 2011, 39 : 1365 - 1370
  • [2] Application of Data-Mining and Knowledge Discovery in Automotive Data Engineering
    Keller, Joerg
    Bauer, Valerij
    Kwedlo, Wojciech
    LECTURE NOTES IN COMPUTER SCIENCE, 2000, 1910 : 464 - 469
  • [3] A DATA-MINING BASED METHOD FOR THE GAIT PATTERN ANALYSIS
    Rudek, Marcelo
    Silva, Nicoli Maria
    Steinmetz, Jean-Paul
    Jahnen, Andreas
    FACTA UNIVERSITATIS-SERIES MECHANICAL ENGINEERING, 2015, 13 (03) : 205 - 215
  • [4] Data-Mining for Processes in Chemistry, Materials, and Engineering
    Li, Hao
    Zhang, Zhien
    Zhao, Zhe-Ze
    PROCESSES, 2019, 7 (03):
  • [5] MetaMirClust: Discovery of miRNA cluster patterns using a data-mining approach
    Chan, Wen-Ching
    Ho, Meng-Ru
    Li, Sung-Chou
    Tsai, Kuo-Wang
    Lai, Chun-Hung
    Hsu, Chun-Nan
    Lin, Wen-Chang
    GENOMICS, 2012, 100 (03) : 141 - 148
  • [6] Software Vulnerability Analysis and Discovery Using Machine-Learning and Data-Mining Techniques: A Survey
    Ghaffarian, Seyed Mohammad
    Shahriari, Hamid Reza
    ACM COMPUTING SURVEYS, 2017, 50 (04)
  • [7] Raw Wind Data Preprocessing: A Data-Mining Approach
    Zheng, Le
    Hu, Wei
    Min, Yong
    IEEE TRANSACTIONS ON SUSTAINABLE ENERGY, 2015, 6 (01) : 11 - 19
  • [8] A data-mining approach for the validation of aerosol retrievals
    Vucetic, Slobodan
    Han, Bo
    Mi, Wen
    Li, Zhanquing
    Obradovic, Zoran
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2008, 5 (01) : 113 - 117
  • [9] A data-mining approach to predict influent quality
    Kusiak, Andrew
    Verma, Anoop
    Wei, Xiupeng
    ENVIRONMENTAL MONITORING AND ASSESSMENT, 2013, 185 (03) : 2197 - 2210
  • [10] Special Issue on Data-Mining and Statistical Science
    Washio, Takashi
    NEW GENERATION COMPUTING, 2009, 27 (04) : 281 - 284