TB-IECS: an accurate machine learning-based scoring function for virtual screening

Cited by: 11
Authors
Zhang, Xujun [1]
Shen, Chao [1]
Jiang, Dejun [1]
Zhang, Jintu [1]
Ye, Qing [1]
Xu, Lei [2]
Hou, Tingjun [1]
Pan, Peichen [1]
Kang, Yu [1]
Affiliations
[1] Zhejiang Univ, Innovat Inst Artificial Intelligence Med, Coll Pharmaceut Sci, Hangzhou 310058, Zhejiang, Peoples R China
[2] Jiangsu Univ Technol, Inst Bioinformat & Med Engn, Sch Elect & Informat Engn, Changzhou 213001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Scoring function; Machine learning; Virtual screening; Theory-based interaction energy component; PROTEIN-LIGAND DOCKING; GENETIC ALGORITHM; BINDING-AFFINITY; FORCE-FIELD; VALIDATION; PREDICTION; GLIDE;
DOI
10.1186/s13321-023-00731-x
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Machine learning-based scoring functions (MLSFs) have shown potential for improving virtual screening capabilities over classical scoring functions (SFs). Because feature generation is computationally expensive, the number of descriptors used in MLSFs and the characterization of protein-ligand interactions are always limited, which may affect overall accuracy and efficiency. Here, we propose a new SF called TB-IECS (theory-based interaction energy component score), which combines energy terms from Smina and NNScore version 2 and uses the eXtreme Gradient Boosting (XGBoost) algorithm for model training. In this study, the energy terms decomposed from 15 traditional SFs were first categorized based on their formulas and physicochemical principles, and 324 feature combinations were generated accordingly. The five best feature combinations were selected for further evaluation of model performance with respect to feature vectors of various lengths, interaction types, and ML algorithms. The virtual screening power of TB-IECS was assessed on the DUD-E and LIT-PCBA datasets, as well as on seven target-specific datasets from the ChemDiv database. The results showed that TB-IECS outperformed classical SFs, including Glide SP and Dock, and effectively balanced efficiency and accuracy for practical virtual screening.
Pages: 17
Related papers
50 in total
  • [1] TB-IECS: an accurate machine learning-based scoring function for virtual screening
    Xujun Zhang
    Chao Shen
    Dejun Jiang
    Jintu Zhang
    Qing Ye
    Lei Xu
    Tingjun Hou
    Peichen Pan
    Yu Kang
    Journal of Cheminformatics, 15
  • [2] Beware of the generic machine learning-based scoring functions in structure-based virtual screening
    Shen, Chao
    Hu, Ye
    Wang, Zhe
    Zhang, Xujun
    Pang, Jinping
    Wang, Gaoang
    Zhong, Haiyang
    Xu, Lei
    Cao, Dongsheng
    Hou, Tingjun
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (03)
  • [3] Machine-learning scoring functions for structure-based virtual screening
    Li Hongjian
    Sze, Kam-Heung
    Lu Gang
    Ballester, Pedro J.
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE, 2021, 11 (01)
  • [4] Enhancing Scoring Performance of Docking-Based Virtual Screening Through Machine Learning
    Silva, Candida G.
    Simoes, Carlos J. V.
    Carreiras, Pedro
    Brito, Rui M. M.
    CURRENT BIOINFORMATICS, 2016, 11 (04) : 408 - 420
  • [5] Improving structure-based virtual screening performance via learning from scoring function components
    Xiong, Guo-Li
    Ye, Wen-Ling
    Shen, Chao
    Lu, Ai-Ping
    Hou, Ting-Jun
    Cao, Dong-Sheng
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (03)
  • [6] Machine Learning-Based Scoring Functions, Development and Applications with SAnDReS
    Bitencourt-Ferreira, Gabriela
    Rizzotto, Camila
    de Azevedo Junior, Walter Filgueira
    CURRENT MEDICINAL CHEMISTRY, 2021, 28 (09) : 1746 - 1756
  • [7] Cobdock: an accurate and practical machine learning-based consensus blind docking method
    Ugurlu, Sadettin Y.
    Mcdonald, David
    Lei, Huangshu
    Jones, Alan M.
    Li, Shu
    Tong, Henry Y.
    Butler, Mark S.
    He, Shan
    JOURNAL OF CHEMINFORMATICS, 2024, 16 (01)
  • [8] Machine Learning-based Virtual Screening and Its Applications to Alzheimer's Drug Discovery: A Review
    Carpenter, Kristy A.
    Huang, Xudong
    CURRENT PHARMACEUTICAL DESIGN, 2018, 24 (28) : 3347 - 3358
  • [9] Vinardo: A Scoring Function Based on Autodock Vina Improves Scoring, Docking, and Virtual Screening
    Quiroga, Rodrigo
    Villarreal, Marcos A.
    PLOS ONE, 2016, 11 (05)
  • [10] Machine Learning-Based Virtual Screening for the Identification of Cdk5 Inhibitors
    Di Stefano, Miriana
    Galati, Salvatore
    Ortore, Gabriella
    Caligiuri, Isabella
    Rizzolio, Flavio
    Ceni, Costanza
    Bertini, Simone
    Bononi, Giulia
    Granchi, Carlotta
    Macchia, Marco
    Poli, Giulio
    Tuccinardi, Tiziano
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2022, 23 (18)