LightGBM-LncLoc: A LightGBM-Based Computational Predictor for Recognizing Long Non-Coding RNA Subcellular Localization

Cited by: 16
Authors
Lyu, Jianyi [1]
Zheng, Peijie [1]
Qi, Yue [1]
Huang, Guohua [1]
Affiliations
[1] Shaoyang Univ, Sch Informat Engn, Shaoyang 422000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
lncRNA; subcellular localization; lightGBM; reverse complement k-mer; machine learning; CD-HIT; PROTEIN; GENOME
DOI
10.3390/math11030602
Chinese Library Classification
O1 [Mathematics]
Subject classification codes
0701; 070101
Abstract
Long non-coding RNAs (lncRNAs) are a class of RNA transcripts longer than 200 nucleotides. LncRNAs play versatile roles in cellular processes and have therefore become a hot topic in biomedicine. The function of lncRNAs has been found to be closely associated with their subcellular localization. Although many methods have been developed to identify the subcellular localization of lncRNAs, there is still much room for improvement. Herein, we present a LightGBM-based computational predictor for recognizing lncRNA subcellular localization, called LightGBM-LncLoc. LightGBM-LncLoc encodes lncRNAs using reverse complement k-mer and position-specific trinucleotide propensity based on the single strand for multi-class sequences, and employs LightGBM as the learning algorithm. LightGBM-LncLoc reaches state-of-the-art performance in five-fold cross-validation and independent tests over datasets covering five categories of lncRNA subcellular localization. We also implemented LightGBM-LncLoc as a user-friendly web server.
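The reverse complement k-mer encoding named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`reverse_complement`, `canonical`, `rc_kmer_features`), the choice to map U to T, the handling of ambiguous bases, and the frequency normalization are all assumptions made for illustration. The idea shown is only the general one: a k-mer and its reverse complement are counted as a single feature, which shrinks the feature space (for example, from 16 to 10 features at k = 2).

```python
from collections import Counter
from itertools import product

# Complement table on the DNA alphabet; RNA input is mapped U -> T first (assumption).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(kmer):
    """Return the reverse complement of a k-mer over A/C/G/T."""
    return "".join(COMPLEMENT[base] for base in reversed(kmer))


def canonical(kmer):
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def rc_kmer_features(seq, k=3):
    """Hypothetical helper: reverse complement k-mer frequencies of a sequence.

    Returns a dict mapping each canonical k-mer to its relative frequency.
    Windows containing characters other than A/C/G/T are skipped (assumption).
    """
    seq = seq.upper().replace("U", "T")
    keys = sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return {key: (counts[key] / total if total else 0.0) for key in keys}


features = rc_kmer_features("ACGUACGGAUCC", k=3)
vector = [features[key] for key in sorted(features)]
```

A fixed-length vector such as `vector` could then be passed, together with other encodings, to a gradient-boosting classifier; the position-specific trinucleotide propensity encoding and the LightGBM training step described in the abstract are not sketched here.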
Pages: 13