Drug Target Interaction Predictions using PU-Learning under different experimental setting for four formulations namely known drug target pair prediction, drug prediction, target prediction and unknown drug target pair prediction

Cited by: 0
Authors
Rajpura, Hetal Rahul [1]
Ngom, Alioune [1]
Affiliation
[1] Univ Windsor, Sch Comp Sci, Windsor, ON, Canada
Source
2018 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB) | 2018
Keywords
Drug target prediction; Support vector machine; Positive un-labelled learning; One class classifier; Chemical fingerprint; Protein motif; DATABASE;
DOI
Not available
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Predicting new drug-target interactions through wet-lab experiments is time- and resource-intensive. Computational drug-target interaction prediction supports drug discovery and drug repositioning and uncovers interesting patterns in chemogenomics research. Drugs and targets are heterogeneous nodes in a network of interactions: an edge between two nodes indicates a positive interaction, whereas the absence of an edge indicates an unknown interaction. Classification-based machine learning algorithms are widely applied in this area, but they need both positive and negative data to perform well. The major problem in this field is the lack of negative data, because the data found in public databases are positive interaction samples. Treating unknown drug-target pairs as negative data can severely degrade prediction performance. We therefore propose a positive-unlabelled (PU) learning approach that uses a one-class support vector machine as the learning algorithm. The algorithm learns the positive distribution from the unified feature vector space of drugs and targets and regards unknown pairs as unlabelled instead of labelling them as negative. To represent drugs, we use a high-dimensional, highly discriminative feature vector that combines 4860 Klekota-Roth fingerprint features with 881 PubChem fingerprint features. To represent proteins, we build a protein-motif matrix from sliding-window scores that record the probability of a motif pattern occurring within a given protein sequence. We evaluate prediction performance separately, using 5-fold nested cross-validation under different experimental settings, for each of four formulations: 1) known drug-target pair prediction, 2) drug prediction, 3) target prediction and 4) unknown drug-target pair prediction. We show that our approach yields a better AUC score than previous benchmark techniques and outperforms most recent work based on one-class classifiers and PU learning.
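The abstract does not say how the sliding-window score is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each motif is given as a position probability matrix (one residue-to-probability mapping per motif position), scores every window of the protein sequence as the product of per-position probabilities, and keeps the best window as that protein's entry in the protein-motif matrix. The names `window_score`, `motif_score`, `protein_motif_matrix` and the `floor` parameter are hypothetical.

```python
from typing import Dict, List

# Assumption: a motif is a position probability matrix, i.e. one dict per
# motif position that maps a residue letter to its probability there.
Motif = List[Dict[str, float]]


def window_score(window: str, motif: Motif, floor: float = 1e-3) -> float:
    """Product of per-position probabilities of `window` under `motif`.

    Residues missing from a motif column get the small `floor` probability
    (an illustrative choice, not taken from the paper).
    """
    score = 1.0
    for residue, column in zip(window, motif):
        score *= column.get(residue, floor)
    return score


def motif_score(sequence: str, motif: Motif) -> float:
    """Best window score over all window positions in `sequence`.

    Returns 0.0 when the sequence is shorter than the motif.
    """
    width = len(motif)
    if width == 0 or len(sequence) < width:
        return 0.0
    return max(
        window_score(sequence[start:start + width], motif)
        for start in range(len(sequence) - width + 1)
    )


def protein_motif_matrix(sequences: List[str], motifs: List[Motif]) -> List[List[float]]:
    """Rows are proteins, columns are motifs, entries are sliding-window scores."""
    return [[motif_score(seq, motif) for motif in motifs] for seq in sequences]


if __name__ == "__main__":
    # Toy two-position motif that prefers "A" followed by "C".
    toy_motif = [{"A": 0.9, "C": 0.1}, {"A": 0.2, "C": 0.8}]
    for row in protein_motif_matrix(["GGACG", "CCCC"], [toy_motif]):
        print(row)
```

Each row of the resulting matrix could then be concatenated with a drug's fingerprint vector to form the unified drug-target feature vector that the abstract describes. Taking the maximum over windows is one design choice; summing or averaging window scores would be equally consistent with the abstract's wording.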
Pages: 97-103
Number of pages: 7