Combining the outputs of various k-nearest neighbor anomaly detectors to form a robust ensemble model for high-dimensional geochemical anomaly detection

Cited by: 39
Authors
Chen, Yongliang [1]
Zhao, Qingying [1]
Lu, Laijun [1]
Affiliations
[1] Jilin Univ, Coll Earth Sci, Changchun 130061, Jilin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
K-nearest neighbor; Gaussian mixture model; One-class support vector machine; Isolation forest; Combination algorithm; Geochemical anomaly detection; MIXED-EFFECTS MODELS; MACHINE
DOI
10.1016/j.gexplo.2021.106875
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
Machine learning techniques provide useful methods for high-dimensional geochemical anomaly detection in mineral exploration targeting. However, the instability of machine learning models often makes the results of high-dimensional geochemical anomaly detection uncertain. Combining various individual models to form an adaptive ensemble anomaly detector is a feasible way to enhance the robustness of machine learning anomaly detectors. In this study, the average method, maximization method, average of maximum (AOM) method, and maximum of average (MOA) method were adopted to combine the outputs of various k-nearest neighbor (KNN) anomaly detectors and so improve the robustness of the KNN models for high-dimensional geochemical anomaly detection in the Baishan district (Jilin Province, China). The effectiveness of the four combination algorithms was evaluated in the case study by comparing the resulting ensemble models with the single KNN model, the Gaussian mixture model (GMM), the one-class support vector machine (OCSVM), and the isolation forest (IForest). The results show that the four ensemble models (a) perform similarly well in high-dimensional geochemical anomaly detection, and (b) perform better than the single KNN model, GMM, OCSVM, and IForest. Therefore, the average method, maximization method, AOM method, and MOA method are potentially useful algorithms for combining the outputs of various KNN models to form robust ensemble models for high-dimensional geochemical anomaly detection.
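The four combination rules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the k-th nearest neighbor distance as the anomaly score, the z-score standardization, the choice of k values, the two contiguous buckets used for AOM and MOA, and the synthetic data are all assumptions made for illustration, following the common outlier-ensemble convention. The helper names `knn_scores`, `zscore`, and `combine` are hypothetical.

```python
import math
import random
import statistics


def knn_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor (assumed score)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def zscore(scores):
    """Standardize one detector's scores so that detectors are comparable."""
    mu = statistics.fmean(scores)
    sd = statistics.pstdev(scores) or 1.0
    return [(s - mu) / sd for s in scores]


def combine(score_matrix, method, n_buckets=2):
    """Combine per-detector scores; score_matrix[d][i] is detector d, sample i."""
    n_samples = len(score_matrix[0])
    columns = [[row[i] for row in score_matrix] for i in range(n_samples)]
    if method == "average":
        return [statistics.fmean(c) for c in columns]
    if method == "maximum":
        return [max(c) for c in columns]
    # AOM and MOA first split the detectors into buckets
    # (assumption: contiguous groups of near-equal size).
    size = math.ceil(len(score_matrix) / n_buckets)
    buckets = [[c[s:s + size] for s in range(0, len(c), size)] for c in columns]
    if method == "aom":  # average of the per-bucket maxima
        return [statistics.fmean([max(b) for b in bs]) for bs in buckets]
    if method == "moa":  # maximum of the per-bucket averages
        return [max(statistics.fmean(b) for b in bs) for bs in buckets]
    raise ValueError(f"unknown method: {method}")


# Synthetic stand-in for geochemical samples: 60 background points in 5 dimensions
# plus one obvious anomaly at index 60.
random.seed(0)
data = [[random.gauss(0, 1) for _ in range(5)] for _ in range(60)]
data.append([8.0] * 5)

# One KNN detector per k value (the k values are illustrative only).
detectors = [zscore(knn_scores(data, k)) for k in (3, 5, 7, 9)]
ensemble = {m: combine(detectors, m) for m in ("average", "maximum", "aom", "moa")}

for name, scores in ensemble.items():
    top = max(range(len(scores)), key=scores.__getitem__)
    print(name, "-> highest-scoring sample index:", top)
```

With equal-sized buckets, the AOM and MOA scores of a sample always lie between its average and maximum scores, which is one reason they are often described as a compromise between the two simpler rules.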
Pages: 11