Multi-Instance Metric Transfer Learning for Genome-Wide Protein Function Prediction

Cited by: 7
Authors
Xu, Yonghui [1]
Min, Huaqing [2]
Wu, Qingyao [2, 3]
Song, Hengjie [2]
Ye, Bicui [4]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] South China Univ Technol, Sch Software Engn, Guangzhou 510006, Guangdong, Peoples R China
[3] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
[4] Wuzhou Red Cross Hosp, Wuzhou 543002, Peoples R China
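Keywords
DOMAIN; SYSTEM
DOI
10.1038/srep41831
Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract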
Multi-Instance (MI) learning has proven effective for genome-wide protein function prediction problems in which each training example is associated with multiple instances. Many studies in the literature have sought an appropriate Multi-Instance Learning (MIL) method for genome-wide protein function prediction under the usual assumption that the underlying distribution of the testing data (target domain, TD) is the same as that of the training data (source domain, SD). However, this assumption may be violated in practice. To tackle this problem, we propose a Multi-Instance Metric Transfer Learning (MIMTL) approach for genome-wide protein function prediction. In MIMTL, we first transfer the source domain distribution to the target domain distribution by utilizing bag weights. Then, we construct a distance metric learning method with the reweighted bags. Finally, we develop an alternative optimization scheme for MIMTL. Comprehensive experimental evidence on seven real-world organisms verifies the effectiveness and efficiency of the proposed MIMTL approach over several state-of-the-art methods.
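The two ingredients named in the abstract, bag reweighting and a bag-level distance under a learned metric, can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not say how the bag weights or the metric are computed. The sketch assumes, purely for illustration, that each bag is summarised by its mean instance, that bag weights are a Gaussian kernel density ratio between target and source bag means, and that the bag distance is the average pairwise squared Mahalanobis distance. The function names `bag_weights` and `bag_distance` and the `bandwidth` parameter are hypothetical.

```python
import numpy as np


def _kernel_density(points, queries, bandwidth):
    """Unnormalised Gaussian kernel density of `points`, evaluated at `queries`."""
    sq_dist = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1)


def bag_weights(source_bags, target_bags, bandwidth=1.0, eps=1e-12):
    """Importance weight per source bag (assumption: density ratio of bag means).

    Source bags that look like target bags receive larger weights.
    Weights are rescaled so that their mean is approximately 1.
    """
    src = np.stack([np.asarray(b, dtype=float).mean(axis=0) for b in source_bags])
    tgt = np.stack([np.asarray(b, dtype=float).mean(axis=0) for b in target_bags])
    ratio = _kernel_density(tgt, src, bandwidth) / (
        _kernel_density(src, src, bandwidth) + eps
    )
    return ratio * len(ratio) / (ratio.sum() + eps)


def bag_distance(bag_a, bag_b, M):
    """Average pairwise squared Mahalanobis distance between two bags.

    `M` is a (d, d) positive semi-definite matrix; the identity matrix
    recovers the squared Euclidean distance.
    """
    a = np.asarray(bag_a, dtype=float)
    b = np.asarray(bag_b, dtype=float)
    diff = a[:, None, :] - b[None, :, :]              # shape (n_a, n_b, d)
    sq = np.einsum("ijk,kl,ijl->ij", diff, M, diff)   # (x - y)^T M (x - y)
    return sq.mean()
```

In a reweighted metric learning step, each pairwise term in the training loss would be multiplied by the weights of the bags involved, so that source bags resembling the target domain contribute more when `M` is updated. An alternating scheme would then switch between updating the weights and updating `M`, although the exact update rules are not given in the abstract.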
Pages: 15