Interval-based sparse ensemble multi-class classification algorithm for terahertz data

Cited by: 0
Authors
Zheng, Chengyong [1]
Zha, Xiaowen [1]
Cai, Shengjie [2]
Cui, Jing [3]
Li, Qian [4]
Ye, Zhijing [5]
Affiliations
[1] Wuyi Univ, Sch Math & Computat Sci, Jiangmen 529000, Peoples R China
[2] Shenzhen Kangguan Technol Co LTD, Shenzhen 518129, Peoples R China
[3] Guangdong Jiangmen Chinese Tradit Med Coll, Jiangmen 529020, Peoples R China
[4] Terahertz Technol Applicat Guangdong Co Ltd, Guangzhou 510700, Peoples R China
[5] Macau Univ Sci & Technol, Fac Innovat Engn, Taipa, Macau, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Terahertz spectrum; Classification; Sparse ensemble; Interval; Cross entropy;
DOI
10.1016/j.heliyon.2024.e27743
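Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes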
07 ; 0710 ; 09 ;
Abstract
Terahertz time-domain spectroscopy (THz-TDS) has been widely used for food and drug identification. The class-discriminating information in a THz spectrum usually does not span the whole spectral band but is confined to one or a few small intervals, so feature selection is indispensable in THz-based substance identification. However, most THz-based identification methods simply truncate the THz absorption coefficients to an empirically chosen low-frequency band for analysis. To find the important intervals of THz spectra adaptively, an interval-based sparse ensemble multi-class classifier (ISEMCC) for THz spectral data classification is proposed. In ISEMCC, the THz spectra are first divided into several small intervals by a sliding window. The training-sample data in each interval are then extracted to train base classifiers. Finally, a robust final classifier is obtained as a nonnegative sparse combination of the trained base classifiers. Using the ℓ1-norm, two objective functions are established, one based on Mean Square Error (MSE) and one on Cross Entropy (CE). For these two objective functions, two iterative algorithms are built, based on the Alternating Direction Method of Multipliers (ADMM) and Gradient Descent (GD), respectively. ISEMCC thus transforms the problem of interval feature selection and decision-level fusion into a nonnegative sparse optimization problem, and the sparsity constraint ensures that only a few important spectral segments are selected. To verify the performance of the proposed algorithm, comparative experiments are carried out on identifying the origin of Bupleurum and the harvesting year of tangerine peel. The base classifiers used by ISEMCC are Support Vector Machine (SVM) and Decision Tree (DT). The experimental results show that the proposed algorithm outperforms six typical classifiers in classification accuracy: Random Forest (RF), AdaBoost, RUSBoost, ExtraTree, and the two base classifiers.
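The pipeline described in the abstract (sliding-window intervals, then a nonnegative sparse combination of per-interval base classifiers) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy scores, the regularization weight `lam`, and in particular the solver. The paper reports ADMM for the MSE objective and GD for the CE objective; this sketch swaps in a plain proximal-gradient iteration on an MSE-style objective because it fits in a few lines, and it is not the authors' algorithm.

```python
def sliding_intervals(n_points, width, step):
    """Return (start, stop) index pairs for windows slid across a spectrum."""
    return [(s, s + width) for s in range(0, n_points - width + 1, step)]


def fit_nonneg_sparse_weights(scores, targets, lam=0.01, n_iter=500):
    """Proximal-gradient sketch (NOT the paper's ADMM/GD) for

        min_w  (1 / 2n) * sum_i (sum_k w[k] * scores[i][k] - targets[i])**2
               + lam * sum_k w[k]      subject to  w[k] >= 0

    scores[i][k] is the output of hypothetical base classifier k (one per
    interval) on row i, where a row is one (sample, class) pair;
    targets[i] is the matching one-hot label entry.
    """
    n, k = len(scores), len(scores[0])
    # Squared Frobenius norm / n upper-bounds the gradient's Lipschitz constant.
    lip = sum(v * v for row in scores for v in row) / n
    step = 1.0 / lip if lip > 0 else 1.0
    w = [0.0] * k
    for _ in range(n_iter):
        # Residual of the current weighted combination on every row.
        resid = [sum(wj * s for wj, s in zip(w, row)) - t
                 for row, t in zip(scores, targets)]
        # Gradient of the smooth (squared-error) term.
        grad = [sum(r * row[j] for r, row in zip(resid, scores)) / n
                for j in range(k)]
        # For w >= 0 the l1 norm equals sum(w), so the proximal step is a
        # shifted gradient step followed by clamping at zero.
        w = [max(0.0, wj - step * (gj + lam)) for wj, gj in zip(w, grad)]
    return w


if __name__ == "__main__":
    # Toy data: classifier 0 is informative, 1 is constant, 2 is inverted.
    targets = [1, 0, 0, 1, 0, 1]
    scores = [[t, 0.5, 1 - t] for t in targets]
    print(sliding_intervals(10, 4, 3))
    print(fit_nonneg_sparse_weights(scores, targets))
```

On this toy input the weight of the informative classifier settles near 0.98 and the other two weights are driven to zero, which is the interval-selection effect the sparsity constraint is meant to produce. In a real setting the columns of `scores` would come from SVM or DT models trained on each interval returned by `sliding_intervals`.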
Pages: 9