Machine learning and data augmentation approach for identification of rare earth element potential in Indiana Coals, USA

Cited by: 32
Authors
Chatterjee, Snehamoy [1]
Mastalerz, Maria [2]
Drobniak, Agnieszka [2]
Karacan, C. Ozgen [3]
Affiliations
[1] Michigan Technol Univ, 1400 Townsend Dr, Houghton, MI 49931 USA
[2] Indiana Univ, Indiana Geol & Water Survey, 1001 East 10th St, Bloomington, IN 47405 USA
[3] US Geol Survey, Geol Energy & Minerals Sci Ctr, Reston, VA 22092 USA
Keywords
Rare earth potential; Indiana coal; Machine learning; Data augmentation; Feature selection; Random forest; Regression
DOI
10.1016/j.coal.2022.104054
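Chinese Library Classification (CLC) codes
TE [Petroleum and natural gas industry]; TK [Energy and power engineering]
Discipline classification codes
0807; 0820
Abstract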
Rare earth elements and yttrium (REYs) are critical elements and valuable commodities because of their limited availability and high demand across a wide range of applications, especially high-technology products. Rising demand and geopolitical pressures motivate the search for alternative sources of REYs, and coal, coal waste, and coal ash are being considered as new sources of these critical elements. This research evaluates the REY potential of coals from Indiana (USA). Although the coal data revealed REY potential, few samples had complete REY measurements. We therefore explore whether machine learning (ML) models and data augmentation techniques can be used to evaluate REY potential in Indiana and in other coal basin areas, using selected coal parameters (Al2O3, Fe2O3, C, ash, S, P, Mo, Zn, and As contents) as covariates (indicators). Because of the relatively small number of samples with complete REY data in the Indiana Coal Database, two data augmentation techniques (Random Over-Sampling Examples and Synthetic Minority Over-Sampling Technique) were used. Four machine learning algorithms (linear discriminant analysis, support vector machine, random forest, and artificial neural networks) were applied to model REY potential as a classification problem. The results show that applying the Synthetic Minority Over-Sampling Technique before developing the support vector machine (SVM) models produced the best REY classification, with an accuracy of 95%. These encouraging results based on Indiana coal data suggest that a similar approach could be used in other coal basins to screen for locations with REY potential. Those locations can then be targeted for more detailed geochemical surveys to identify the most promising areas and evaluate overall REY resources.
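The interpolation idea behind the Synthetic Minority Over-Sampling Technique named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, the neighbour count `k`, the seed, and the two-covariate toy samples are all hypothetical and chosen only for illustration. Each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; parameters are assumptions,
    not values reported in the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + u * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two covariates per sample (not from the paper)
minority = [[12.0, 3.1], [14.5, 3.8], [13.2, 3.4], [15.1, 4.0]]
new_points = smote(minority, n_new=6, k=2, seed=42)
```

Because every synthetic point is a convex combination of two real samples, the augmented data stay within the range of the observed covariates. In a workflow like the one the abstract describes, the augmented set would then be passed to a classifier such as an SVM.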
Pages: 14