TranScreen: Transfer Learning on Graph-Based Anti-Cancer Virtual Screening Model

Cited by: 6

Authors:

Salem, Milad [1]

Khormali, Aminollah [1]

Arshadi, Arash Keshavarzi [2]

Webb, Julia [2]

Yuan, Jiann-Shiun [1]

Affiliations:

[1] Univ Cent Florida, Dept Elect & Comp Engn, Orlando, FL 32816 USA

[2] Univ Cent Florida, Burnett Sch Biomed Sci, Orlando, FL 32816 USA

Source:

BIG DATA AND COGNITIVE COMPUTING | 2020 / Vol. 4 / Issue 03

Keywords:

cancer; drug discovery; machine learning; transfer learning; virtual screening; p53

DOI:

10.3390/bdcc4030016

Chinese Library Classification number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Automatic feature extraction by deep learning has been shown to outperform traditional fingerprint-based features in virtual screening models. However, these models face multiple challenges in early drug discovery, such as over-training and poor generalization to unseen data, because the available datasets are inherently small and unbalanced. In this work, the TranScreen pipeline is proposed, which uses transfer learning and a collection of weight initializations to overcome these challenges. A total of 182 graph convolutional neural networks are trained on molecular source datasets, and the learned knowledge is transferred to the target task for fine-tuning. The target task of p53-based bioactivity prediction, an important factor in anti-cancer discovery, is chosen to showcase the capability of the pipeline. Once the collection of source models has been trained, three different approaches are implemented to compare and rank them for a given task before fine-tuning. The results show improved model performance in multiple cases, with the best model increasing the area under the receiver operating characteristic curve (ROC-AUC) from 0.75 to 0.91 and the recall from 0.25 to 1. This improvement is vital for practical virtual screening because it lowers the number of false negatives, and it demonstrates the potential of transfer learning. The code and pre-trained models are made accessible online.
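The two metrics reported in the abstract can be illustrated with a minimal sketch. The labels, predictions, and function names below are hypothetical toy values chosen only to show the arithmetic; they are not the paper's data, code, or models. Recall is the fraction of true actives that the model flags, and ROC-AUC is computed here in its pairwise form: the probability that a randomly chosen active receives a higher score than a randomly chosen inactive.

```python
def recall(y_true, y_pred):
    """Fraction of true actives (label 1) that are predicted active."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Pairwise ROC-AUC: share of (active, inactive) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one active and one inactive")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical unbalanced toy set: 4 actives followed by 12 inactives.
y_true = [1, 1, 1, 1] + [0] * 12

# A model that finds only one of the four actives misses three hits.
baseline_pred = [1, 0, 0, 0] + [0] * 12

# A model that finds all four actives, at the cost of two false positives.
improved_pred = [1, 1, 1, 1] + [1, 1] + [0] * 10

print("baseline recall:", recall(y_true, baseline_pred))
print("improved recall:", recall(y_true, improved_pred))
print("toy ROC-AUC:", roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

In a screening setting every false negative is an active compound that is never followed up, which is why a recall gain matters even when it comes with a few extra false positives.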


Pages: 1 / 20

Page count: 20

