Data leakage jeopardizes ecological applications of machine learning

Cited by: 9
Authors
Stock, Andy [1]
Gregr, Edward J. [1,2]
Chan, Kai M. A. [1]
Affiliations
[1] Univ British Columbia, Inst Resources Environm & Sustainabil, Vancouver, BC, Canada
[2] SciTech Environm Consulting, Vancouver, BC, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
VALIDATION
DOI
10.1038/s41559-023-02162-1
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
Machine learning is a popular tool in ecology but many scientific applications suffer from data leakage, causing misleading results. We highlight common pitfalls in ecological machine-learning methods and argue that discipline-specific model info sheets must be developed to aid in model evaluations.
Pages: 1743-1745
Page count: 3