Automatic feature subset selection for decision tree-based ensemble methods in the prediction of bioactivity

Cited by: 39
Authors
Cao, Dong-Sheng [1]
Xu, Qing-Song [2]
Liang, Yi-Zeng [1]
Chen, Xian [1]
Li, Hong-Dong [1]
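Affiliations
[1] Central South University, Research Center of Modernization of Traditional Chinese Medicines, Changsha 410083, People's Republic of China
[2] Central South University, School of Mathematical Sciences and Computing Technology, Changsha 410083, People's Republic of China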
Keywords
Feature selection; Bagging; Boosting; Random Forest (RF); Classification and Regression Tree (CART); Ensemble learning; QSAR MODELS; COMPOUND CLASSIFICATION; RANDOM FOREST; REGRESSION; INHIBITORS; QSPR; TOOL
DOI
10.1016/j.chemolab.2010.06.008
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In structure-activity relationship (SAR) studies, a learning algorithm usually faces the problem of selecting a compact subset of descriptors related to the property of interest while ignoring the rest. This paper presents a new method of molecular descriptor selection that couples three commonly used decision tree (DT)-based ensemble methods with a backward elimination strategy (BES). The proposed method eliminates descriptor redundancy automatically and searches for a more compact descriptor subset tailored to DT-based ensemble methods. Six real SAR datasets related to different categorical bioactivities of compounds are used to evaluate the proposed method. The results indicate that DT-based ensemble methods coupled with BES, especially the boosting tree model, yield better classification performance for compounds related to ADMET. (C) 2010 Elsevier B.V. All rights reserved.
Pages: 129-136
Page count: 8
Related papers
50 in total
  • [21] Tree-based Ensemble Classifier Learning for Automatic Brain Glioma Segmentation
    Amiri, Samya
    Mahjoub, Mohamed Ali
    Rekik, Islem
    NEUROCOMPUTING, 2018, 313 : 135 - 142
  • [22] A wrapper feature selection method for combined tree-based classifiers
    Gatnar, E
    FROM DATA AND INFORMATION ANALYSIS TO KNOWLEDGE ENGINEERING, 2006 : 119 - 125
  • [23] Prediction of Material Removal Rate for Chemical Mechanical Planarization Using Decision Tree-Based Ensemble Learning
    Li, Zhixiong
    Wu, Dazhong
    Yu, Tianyu
    JOURNAL OF MANUFACTURING SCIENCE AND ENGINEERING-TRANSACTIONS OF THE ASME, 2019, 141 (03)
  • [24] Comparison of regression tree-based methods in genomic selection
    Ashoori-Banaei, Sahar
    Ghafouri-Kesbi, Farhad
    Ahmadi, Ahmad
    JOURNAL OF GENETICS, 2021, 100 (02)
  • [25] Generalizing Gain Penalization for Feature Selection in Tree-Based Models
    Wundervald, Bruna
    Parnell, Andrew C.
    Domijan, Katarina
    IEEE ACCESS, 2020, 8 : 190231 - 190239
  • [26] Comparison of regression tree-based methods in genomic selection
    Sahar Ashoori-Banaei
    Farhad Ghafouri-Kesbi
    Ahmad Ahmadi
    Journal of Genetics, 2021, 100
  • [27] Application of decision tree-based ensemble learning in the classification of breast cancer
    Ghiasi, Mohammad M.
    Zendehboudi, Sohrab
    COMPUTERS IN BIOLOGY AND MEDICINE, 2021, 128
  • [28] RADE: resource-efficient supervised anomaly detection using decision tree-based ensemble methods
    Vargaftik, Shay
    Keslassy, Isaac
    Orda, Ariel
    Ben-Itzhak, Yaniv
    MACHINE LEARNING, 2021, 110 (10) : 2835 - 2866
  • [29] RADE: resource-efficient supervised anomaly detection using decision tree-based ensemble methods
    Shay Vargaftik
    Isaac Keslassy
    Ariel Orda
    Yaniv Ben-Itzhak
    Machine Learning, 2021, 110 : 2835 - 2866
  • [30] Feature subset selection for irrelevant data removal using Decision Tree Algorithm
    Evangeline, D. Preetha
    Sandhiya, C.
    Anandhakumar, P.
    Raj, G. Deepti
    Rajendran, T.
    2013 FIFTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (ICOAC), 2013 : 268 - 274