Enhanced QSAR Model Performance by Integrating Structural and Gene Expression Information

Cited by: 5
Authors
Chen, Qian [1]
Wu, Leihong [1]
Liu, Wei [1]
Xing, Li [1]
Fan, Xiaohui [1]
Affiliations
[1] Zhejiang Univ, Coll Pharmaceut Sci, Pharmaceut Informat Inst, Hangzhou 310058, Zhejiang, Peoples R China
Funding
National Science Foundation (United States);
Keywords
quantitative structure-activity relationships (QSAR); SAR paradox; molecular modeling; gene expression; integrative analysis; RISK-ASSESSMENT; METALLOTHIONEIN; SELECTION; CANCER; PREDICTION; TOXICITY; CARCINOGENESIS; CLASSIFICATION; MECHANISMS; PARADIGM;
DOI
10.3390/molecules180910789
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Despite decades of intensive research and a number of demonstrable successes, quantitative structure-activity relationship (QSAR) models still fail to yield reasonably accurate predictions in some circumstances, especially when the QSAR paradox occurs. To avoid the QSAR paradox, this study proposes a novel integrated approach that improves model performance by using both structural and biological information about compounds. As a proof of concept, integrated models were built on a toxicological dataset to predict the non-genotoxic carcinogenicity of compounds, using not only conventional molecular descriptors but also the expression profiles of significant genes selected from microarray data. On the test set, the prediction accuracy of the QSAR model increased from 0.57 to 0.67 when the expression data of just one selected signature gene were incorporated. This successful integration of biological information into the classic QSAR model provides new insight and a methodology for building predictive models, especially when the QSAR paradox occurs.
Pages: 10789-10801
Page count: 13
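The integration idea in the abstract (appending gene expression values to conventional molecular descriptors before training a classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic toy values, not the paper's dataset; the nearest-centroid classifier is a simple stand-in, not the authors' modelling method; and the names `integrate`, `fit_centroids`, `predict` and `accuracy` are hypothetical.

```python
from math import dist

def integrate(descriptors, expression):
    """Append each compound's gene-expression values to its descriptor vector."""
    return [d + e for d, e in zip(descriptors, expression)]

def fit_centroids(X, y):
    """Compute one mean feature vector (centroid) per class label."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Assign x to the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], x))

def accuracy(centroids, X, y):
    """Fraction of rows in X whose predicted label matches y."""
    hits = sum(predict(centroids, x) == lab for x, lab in zip(X, y))
    return hits / len(y)

# Synthetic toy data (NOT from the paper). Structurally similar compounds
# carry different labels, mimicking the paradox the abstract describes.
train_desc = [[0.10, 0.20], [0.12, 0.18], [0.90, 0.80], [0.88, 0.82]]
train_expr = [[2.0], [0.1], [1.8], [0.2]]   # one hypothetical signature gene
train_y = [1, 0, 1, 0]                      # 1 = carcinogen, 0 = non-carcinogen

test_desc = [[0.11, 0.21], [0.89, 0.79], [0.13, 0.19], [0.87, 0.81]]
test_expr = [[1.9], [0.0], [0.15], [2.1]]
test_y = [1, 0, 0, 1]

# Structure-only model: molecular descriptors alone.
acc_structure = accuracy(fit_centroids(train_desc, train_y), test_desc, test_y)

# Integrated model: descriptors plus the expression of the signature gene.
train_int = integrate(train_desc, train_expr)
test_int = integrate(test_desc, test_expr)
acc_integrated = accuracy(fit_centroids(train_int, train_y), test_int, test_y)

print(f"structure only: {acc_structure:.2f}, integrated: {acc_integrated:.2f}")
```

In this toy setup the descriptors alone cannot separate the two classes, because each active compound has a near-identical inactive neighbour, while the added expression feature does. Real descriptor and expression values would normally need to be scaled to comparable ranges before being concatenated.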