Two-step hybrid modeling for variable selection and estimation: An application to quantitative structure activity relationship study

Cited by: 2
Authors
Oranye, Henrietta Ebele [1, 2]
Ugwuowo, Fidelis Ifeanyi [1 ]
Arum, Kingsley Chinedu [1 ]
Affiliations
[1] Univ Nigeria, Dept Stat, Nsukka, Nigeria
[2] Univ Nigeria, Dept Stat, Nsukka, Enugu, Nigeria
Keywords
cross-validation; jackknife; molecular descriptors; random forest; variable selection; ADAPTIVE LASSO; REGRESSION; QSAR; CLASSIFICATION
DOI
10.1002/cem.3522
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this study, we developed a simple technique for effective parameter estimation and prediction in quantitative structure-activity relationship (QSAR) studies using a two-step procedure. The first step is to choose the important molecular descriptors using random forest regression, and the second step is to optimally predict the biological activity of the selected chemical compounds using the following estimators: ridge regression, jackknife ridge, Liu regression, jackknife Liu, Kibria-Lukman, and jackknife Kibria-Lukman. We conducted a simulation study and a real-life application with QSAR data containing 2540 descriptors after preprocessing. The optimal prediction is determined using the cross-validation error: the estimator with the minimum cross-validation error is considered best. The performance of the methods is judged using the root mean squared error of prediction. The results indicate that performing jackknife estimation after random forest selection is preferred.
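The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic data, the helper names (`select_descriptors`, `cv_rmse`), the number of retained descriptors, and the shrinkage parameters `k = 1.0` and `d = 0.5` are all hypothetical choices. Only the ridge, Liu, and Kibria-Lukman estimators are shown, in their standard closed forms; the jackknife variants are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold


def select_descriptors(X, y, n_keep=10, seed=0):
    """Step 1: keep the n_keep descriptors with the largest forest importance."""
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X, y)
    top = np.argsort(forest.feature_importances_)[::-1][:n_keep]
    return np.sort(top)


def ridge(X, y, k):
    """Ridge estimator: (X'X + kI)^{-1} X'y."""
    S = X.T @ X
    return np.linalg.solve(S + k * np.eye(S.shape[0]), X.T @ y)


def liu(X, y, d):
    """Liu estimator: (X'X + I)^{-1} (X'y + d * b_OLS)."""
    S = X.T @ X
    b_ols = np.linalg.solve(S, X.T @ y)
    return np.linalg.solve(S + np.eye(S.shape[0]), X.T @ y + d * b_ols)


def kibria_lukman(X, y, k):
    """Kibria-Lukman estimator: (X'X + kI)^{-1} (X'X - kI) b_OLS."""
    S = X.T @ X
    eye = np.eye(S.shape[0])
    b_ols = np.linalg.solve(S, X.T @ y)
    return np.linalg.solve(S + k * eye, (S - k * eye) @ b_ols)


def cv_rmse(estimator, X, y, param, n_splits=5, seed=0):
    """Step 2: K-fold cross-validated root mean squared error of prediction."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    squared_errors = []
    for train, test in folds.split(X):
        # Center with training-fold means so no intercept term is needed.
        x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
        beta = estimator(X[train] - x_mean, y[train] - y_mean, param)
        pred = (X[test] - x_mean) @ beta + y_mean
        squared_errors.append((y[test] - pred) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(squared_errors))))


# Synthetic stand-in for a descriptor matrix (assumption: 120 compounds,
# 200 descriptors, only three of which drive the response).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 200))
y = X[:, [3, 17, 42]] @ np.array([3.0, -2.0, 1.5]) + rng.normal(scale=0.5, size=120)

selected = select_descriptors(X, y, n_keep=10)
X_sel = X[:, selected]

results = {
    "ridge": cv_rmse(ridge, X_sel, y, 1.0),
    "liu": cv_rmse(liu, X_sel, y, 0.5),
    "kibria_lukman": cv_rmse(kibria_lukman, X_sel, y, 1.0),
}
best = min(results, key=results.get)  # estimator with minimum CV error
print(results, best)
```

For brevity the sketch selects descriptors once on the full data set before cross-validating, which can make the reported errors optimistic; repeating the selection inside each training fold would avoid that. The Liu and Kibria-Lukman forms also require the ordinary least squares solution to exist, so the number of retained descriptors must stay below the training-fold size.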
Pages: 9