Improved QSAR models for PARP-1 inhibition using data balancing, interpretable machine learning, and matched molecular pair analysis

Cited by: 0

Authors:

Gomatam, Anish [1]

Hirlekar, Bhakti Umesh [1]

Singh, Krishan Dev [1]

Murty, Upadhyayula Suryanarayana [1]

Dixit, Vaibhav A. [1]

Affiliations:

[1] Govt India, Natl Inst Pharmaceut Educ & Res, Dept Med Chem,NIPER Guwahati, Dept Pharmaceut,Minist Chem & Fertilizers, PO Changsari, Gauhati 781101, Assam, India

Source:

MOLECULAR DIVERSITY | 2024 / Volume 28 / Issue 04

Keywords:

PARP-1; inhibitor; Machine learning; QSAR; Data balancing; MMPA; IN-SILICO; PREDICTION; OPTIMIZATION; SOLUBILITY; REGRESSION; ENSEMBLES; SELECTION; DOCKING; DESIGN;

DOI:

10.1007/s11030-024-10809-9

Chinese Library Classification (CLC):

Q5 [Biochemistry]; Q7 [Molecular Biology];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

The poly(ADP-ribose) polymerase-1 (PARP-1) enzyme is an important target in the treatment of breast cancer. Current treatment options include the drugs Olaparib, Niraparib, Rucaparib, and Talazoparib; however, these drugs can cause severe side effects, including hematological toxicity and cardiotoxicity. Although in silico models for predicting PARP-1 activity have been developed, they suffer from low specificity, a narrow applicability domain, and a lack of interpretability. To address these issues, a comprehensive machine learning (ML)-based quantitative structure-activity relationship (QSAR) approach for the informed prediction of PARP-1 activity is presented. Classification models based on the K-nearest neighbor algorithm, built after balancing the data with the Synthetic Minority Oversampling Technique (SMOTE), were robust and predictive (accuracy 0.86, sensitivity 0.88, specificity 0.80). Regression models were built on structurally congeneric datasets, with the models for the phthalazinone class and fused cyclic compounds giving the best performance. In accordance with the Organisation for Economic Co-operation and Development (OECD) guidelines, a mechanistic interpretation is proposed using Shapley Additive Explanations (SHAP) to identify the topological features that are important for differentiating PARP-1 actives from inactives. Moreover, an analysis of the PARP-1 dataset revealed a prevalence of activity cliffs, which may negatively impact the models' predictive performance. Finally, a set of chemical transformation rules was extracted using matched molecular pair analysis (MMPA); these rules provide mechanistic insights and can guide medicinal chemists in the design of novel PARP-1 inhibitors.
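As a rough illustration of two ideas named in the abstract, the sketch below shows how SMOTE-style oversampling creates synthetic minority-class points by interpolation, and how accuracy, sensitivity, and specificity follow from a confusion matrix. It is not the authors' implementation: the function names `smote` and `classification_metrics`, the two-dimensional toy "descriptor" vectors, and the toy labels are all hypothetical, and a real study would more likely rely on an established library.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Toy SMOTE: make n_new synthetic points, each lying on the segment
    between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (true-positive rate), specificity (true-negative rate).
    Label 1 = active, label 0 = inactive (an assumption of this sketch)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical minority-class descriptor vectors (made-up values).
minority_class = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority_class, n_new=4, k=2)
balanced_minority = minority_class + new_points

# Hypothetical true and predicted labels for six compounds.
metrics = classification_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

Because each synthetic point is an interpolation between two existing minority samples, it always lies between them in descriptor space; this is why oversampling of this kind balances the classes without duplicating compounds exactly.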

Pages: 2135 - 2152

Page count: 18

