Predicting species abundance using machine learning approach: a comparative assessment of random forest spatial variants and performance metrics

Citations: 2
Authors
Mushagalusa, Ciza Arsene [1, 2]
Fandohan, Adande Belarmain [3]
Kakai, Romain Glele [1]
Affiliations
[1] Univ Abomey Calavi, Fac Sci Agron, Lab Biomath & Estimat Forestieres, 04 PB 1525, Cotonou, Benin
[2] Univ Evangel Africa UEA, Fac Agr & Environm Sci, Bukavu 3323, DEM REP CONGO
[3] Univ Natl Agr, Ecole Foresterie Trop, Unite Rech Foresterie & Conservat Bioressources, BP 43, Ketou, Benin
Keywords
Ecological modelling; Machine learning; Random forest; Spatial analysis; Population estimation; Geographically weighted regression; Imperfect detection; Linear model; Climate; Classification; Interpolation; Autocorrelation; Dependence; Diversity; Powerful
DOI
10.1007/s40808-024-02055-7
Chinese Library Classification (CLC) number
X [Environmental science, safety science]
Subject classification code
08; 0830
Abstract
For informed decision-making in biodiversity conservation and ecological management, accurate predictions of species abundance are essential. This study assessed the predictive performance of random forest (RF) spatial variants in modelling species abundance distribution, compared with standard RF, the Poisson generalised linear model (GLM), their hybrids with ordinary kriging (OK), and the random generalised linear model (RGLM). Model performance in abundance modelling has rarely been quantified with a comprehensive index; evaluations have relied on single statistical indices instead. Therefore, modified Taylor diagrams were used to evaluate each model's overall ability to predict spatial patterns of species abundance, taking into account abundance class and detection probability. An exponential correlation function was used to generate spatially correlated random effects, with and without a quadratic term and at two variation strengths. Which RF spatial variant performs best depends on the species abundance class and on the relationship between abundance and the independent variables. Spatial RF variants outperformed conventional models in prediction accuracy and power, particularly when spatial autocorrelation and species detection probabilities were high. For common species, RF spatial variants were less precise than RGLM and GLM-OK, which predicted species abundance better when spatial autocorrelation was low or absent. However, no model outperformed the others for all prediction goals, highlighting the need to combine performance metrics when evaluating species abundance distribution models. The study highlights the importance of model specification in ecological research and cautions against using RF algorithms as a black box.
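As a rough illustration of two ideas named in the abstract, the sketch below simulates Poisson counts with a spatially correlated random effect drawn from an exponential correlation function, and computes the three statistics that a standard Taylor diagram summarises (correlation, standard deviation ratio, centred RMSE). It is not the authors' simulation design or their modified Taylor diagram: the function names, the coefficients, the unit-square study area, and the placement of the quadratic term on a single covariate are all assumptions made for illustration.

```python
import numpy as np


def exponential_correlation(coords, range_param):
    """Exponential correlation matrix, rho(d) = exp(-d / range_param)."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return np.exp(-dists / range_param)


def simulate_abundance(n_sites=200, range_param=0.3, sigma=1.0,
                       quadratic=True, seed=0):
    """Simulate counts at random sites (all parameter values are hypothetical)."""
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0.0, 1.0, size=(n_sites, 2))  # sites in a unit square
    x = rng.normal(size=n_sites)                       # one covariate
    corr = exponential_correlation(coords, range_param)
    # Cholesky factor of the covariance; small jitter for numerical stability.
    chol = np.linalg.cholesky(sigma ** 2 * corr + 1e-10 * np.eye(n_sites))
    w = chol @ rng.normal(size=n_sites)                # spatial random effect
    eta = 0.5 + 0.8 * x + w                            # log-linear predictor
    if quadratic:
        eta = eta - 0.4 * x ** 2                       # optional quadratic term
    counts = rng.poisson(np.exp(eta))
    return coords, x, w, counts


def taylor_statistics(observed, predicted):
    """Statistics summarised by a standard (unmodified) Taylor diagram."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    sd_obs, sd_pred = obs.std(), pred.std()
    r = np.corrcoef(obs, pred)[0, 1]
    centred = (pred - pred.mean()) - (obs - obs.mean())
    crmse = np.sqrt(np.mean(centred ** 2))
    return {"correlation": r, "sd_obs": sd_obs, "sd_pred": sd_pred,
            "sd_ratio": sd_pred / sd_obs, "crmse": crmse}
```

The three statistics are linked by the identity crmse² = sd_obs² + sd_pred² − 2 · sd_obs · sd_pred · correlation, which is why a single diagram can display them together. `sigma` plays the role of a variation strength here, and `range_param` controls how quickly spatial correlation decays with distance.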
Pages: 5145-5171
Page count: 27