Evaluation of machine learning methods for classification of rotational absorption spectra for gases in the 220-330 GHz range

Cited by: 11
Authors
Chowdhury, M. Arshad Zahangir [1]
Rice, Timothy E. [1]
Oehlschlaeger, Matthew A. [1]
Affiliations
[1] Rensselaer Polytech Inst, Dept Mech Aerosp & Nucl Engn, Troy, NY 12180 USA
Source
APPLIED PHYSICS B-LASERS AND OPTICS | 2021 / Volume 127 / Issue 03
Funding
U.S. National Science Foundation;
Keywords
CHEMICAL RECOGNITION; NEURAL-NETWORK; IMPLEMENTATION; ALGORITHM;
DOI
10.1007/s00340-021-07582-0
Chinese Library Classification number
O43 [Optics];
Discipline classification code
070207; 0803;
Abstract
Machine learning (ML) methods are implemented to classify rotational absorption spectra for gas-phase compounds in the THz region, specifically 220-330 GHz, where experimental data are available. Eight ML methods were trained in both standard and one-versus-rest (OVR) implementations using simulated absorption spectra for 12 volatile organic compounds and halogenated hydrocarbons of interest in industrial and environmental gas sensing applications. The performance of the resulting ML classifiers was compared against simulated training spectra in both a 70-30 training-testing split and in tenfold cross-validation studies, with the classifiers exhibiting accuracies in the range of 88-99% for simulated spectra. The classifiers were then tested for their ability to classify noisy experimental rotational spectra for methanol, ethanol, formic acid, acetaldehyde, acetonitrile, and chloromethane. The OVR implementations of the support vector machine (SVM) classifier with both linear and radial basis function kernels and the multi-layer perceptron (MLP) classifier achieved average classification accuracies of 87-94% for the experimental dataset. The study shows that THz spectra in the present frequency region provide a sufficient spectral fingerprint for ML classifiers to learn and predict speciation, allowing automated gas sensing. The present methods can be extrapolated to different frequency ranges, compounds, and conditions.
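The one-versus-rest scheme and the 70-30 training-testing split mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the three compound names and their line centers are hypothetical, the spectra are toy Lorentzian lines with Gaussian noise, and a simple perceptron stands in for the SVM and MLP classifiers of the study.

```python
import random

# Hypothetical line centers in GHz; NOT taken from any real line list.
TOY_LINES = {
    "compound_A": [230.0, 265.0, 310.0],
    "compound_B": [245.0, 280.0, 325.0],
    "compound_C": [225.0, 255.0, 300.0],
}
FREQS = [220.0 + i for i in range(111)]  # 220-330 GHz on a 1 GHz grid


def simulate_spectrum(centers, rng, noise=0.05, width=0.5):
    """Toy spectrum: unit-height Lorentzian lines plus Gaussian noise."""
    return [
        sum(width**2 / ((f - c) ** 2 + width**2) for c in centers)
        + rng.gauss(0.0, noise)
        for f in FREQS
    ]


def train_perceptron(X, y, epochs=50):
    """Binary perceptron; y holds +1 (target class) or -1 (rest)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if t * score <= 0:  # misclassified: move the boundary
                w = [wi + t * xi for wi, xi in zip(w, x)]
                b += t
    return w, b


def train_ovr(X, labels):
    """One-versus-rest: train one binary classifier per class."""
    return {
        c: train_perceptron(X, [1 if lab == c else -1 for lab in labels])
        for c in sorted(set(labels))
    }


def predict_ovr(models, x):
    """Pick the class whose binary classifier gives the highest score."""
    return max(
        models,
        key=lambda c: sum(wi * xi for wi, xi in zip(models[c][0], x)) + models[c][1],
    )


rng = random.Random(0)
data = [
    (simulate_spectrum(centers, rng), name)
    for name, centers in TOY_LINES.items()
    for _ in range(30)
]
rng.shuffle(data)
split = len(data) * 7 // 10  # 70-30 training-testing split
train, test = data[:split], data[split:]

models = train_ovr([x for x, _ in train], [lab for _, lab in train])
accuracy = sum(predict_ovr(models, x) == lab for x, lab in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Each class gets its own binary "this compound versus everything else" classifier, and the predicted label is the class with the highest score. Tenfold cross-validation would repeat the same training and scoring over ten rotating held-out folds instead of a single split.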
Pages: 20