Compositional framework for multitask learning in the identification of cleavage sites of HIV-1 protease

Cited by: 11
Authors
Singh, Deepak [1]
Sisodia, Dilip Singh [1]
Singh, Pradeep [1]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, GE Rd Raipur, Raipur 492001, Chhattisgarh, India
Keywords
HIV-1; protease; Multifactorial evolution; Multitask learning; Multiple Kernel learning; Protein encoding; MULTIFACTORIAL INHERITANCE; SUBNUCLEAR LOCALIZATION; CULTURAL TRANSMISSION; NEURAL-NETWORKS; PREDICTION; ENSEMBLE; CLASSIFIERS; SELECTION
DOI
10.1016/j.jbi.2020.103376
Chinese Library Classification
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Scarce patient samples and the high cost of generating annotated data lead to small datasets in the biomedical domain. As a result, a model trained on a single small dataset typically captures only the associations in that dataset and fails to yield robust insights. To cope with this data sparsity, a promising strategy used in various applications is to combine data from different related tasks. Motivated by successful work in various bioinformatics applications, we propose a multi-kernel-based multitask learning model that exploits the dependencies among related tasks. This work aims to combine knowledge from experimental studies on different datasets to build stronger predictive models for HIV-1 protease cleavage site prediction. In this study, a set of peptide data from one source is referred to as a 'task'; to integrate interactions from multiple tasks, our method exploits common features and parameter sharing across the data sources. The proposed framework uses feature integration, feature selection, multi-kernel learning, and a multifactorial evolutionary algorithm to model multitask learning. The framework considers seven feature descriptors and four kernel variants of support vector machines to form the optimal multi-kernel learning model. To validate the effectiveness of the model, performance measures such as average accuracy and area under the curve were evaluated. We also carried out the Friedman test and a post hoc statistical test to substantiate the significance of the improvement achieved by the proposed framework. The results of extensive experiments confirm that multitask learning can improve performance in cleavage site identification.
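The multi-kernel idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the four kernels or the seven descriptors, so the sketch assumes linear, polynomial, RBF, and sigmoid kernels, a plain orthogonal (one-hot) encoding of octapeptides, and hand-picked kernel weights. In the paper the weights would presumably be tuned by the evolutionary search; here they are simply passed in. All function names and parameter values are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(peptide):
    """Orthogonal encoding: 20 binary indicators per residue (assumed descriptor)."""
    vec = []
    for residue in peptide:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    return (dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.1):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, y, gamma=0.01, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


# Assumed kernel set; the abstract only says "four kernel variants".
KERNELS = [linear_kernel, polynomial_kernel, rbf_kernel, sigmoid_kernel]


def combined_kernel(x, y, weights):
    """Convex combination of the base kernels (weights are normalised to sum to 1)."""
    if len(weights) != len(KERNELS):
        raise ValueError("one weight per base kernel is required")
    if any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and not all zero")
    total = sum(weights)
    return sum((w / total) * k(x, y) for w, k in zip(weights, KERNELS))


def gram_matrix(peptides, weights):
    """Pairwise combined-kernel values for a list of equal-length peptides."""
    encoded = [one_hot(p) for p in peptides]
    return [[combined_kernel(a, b, weights) for b in encoded] for a in encoded]


if __name__ == "__main__":
    # Illustrative octamer strings only; no cleavage labels are implied.
    peptides = ["SQNYPIVQ", "ARVLAEAM", "AAAAAAAA"]
    for row in gram_matrix(peptides, weights=[0.4, 0.2, 0.3, 0.1]):
        print(["%.3f" % value for value in row])
```

A Gram matrix of this kind could be passed to a support vector machine that accepts precomputed kernels. Note that the sigmoid kernel is not guaranteed to be positive semidefinite, so a real pipeline might restrict or check the combination.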
Pages: 17