Towards a better prediction of subcellular location of long non-coding RNA

Times cited: 35
Authors
Zhang, Zhao-Yue
Sun, Zi-Jie
Yang, Yu-He
Lin, Hao [1]
Affiliations
[1] Univ Elect Sci & Technol China, Key Lab NeuroInformat, Minist Educ, Sch Life Sci & Technol, Chengdu 610054, Peoples R China
Keywords
lncRNA; subcellular localization; support vector machine; mutual information; Web server; LNCRNA; LOCALIZATION; DATABASE; ASSOCIATIONS; GENES; SITES;
DOI
10.1007/s11704-021-1015-3
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The spatial distribution of long non-coding RNAs (lncRNAs) within the cell is closely related to their function. As publicly available subcellular location data have accumulated, a number of computational methods have been developed to predict the subcellular localization of lncRNAs. Unfortunately, these methods suffer from the low discriminative power of redundant features or from overfitting caused by oversampling. To address these issues and improve prediction performance, we present a support vector machine-based approach that incorporates a mutual information algorithm and an incremental feature selection strategy. As a result, the new predictor achieves an overall accuracy of 91.60%. The highly automated web tool is available at . It will help researchers gain knowledge of lncRNA subcellular localization.
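
The abstract names two feature-handling steps: ranking features by mutual information, then incremental feature selection (adding ranked features one at a time and keeping the subset with the best accuracy). The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the toy data, the equal-width binning used to estimate mutual information, the leave-one-out evaluation, and in particular the nearest-centroid classifier, which stands in for the support vector machine so that the example needs only the Python standard library.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def discretize(column, bins=4):
    """Equal-width binning so that MI can be estimated from counts."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # guard against constant columns
    return [min(int((v - lo) / width), bins - 1) for v in column]


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Hypothetical stand-in for the SVM named in the abstract.
    """
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, Counter()
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(feats))
            for k, f in enumerate(feats):
                acc[k] += row[f]
            counts[label] += 1

        def dist(label):
            # Squared distance from sample i to the centroid of `label`.
            return sum((X[i][f] - s / counts[label]) ** 2
                       for f, s in zip(feats, sums[label]))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y, bins=4):
    """Rank features by MI, then grow the subset one feature at a time."""
    n_features = len(X[0])
    columns = [[row[f] for row in X] for f in range(n_features)]
    scores = [mutual_information(discretize(col, bins), y) for col in columns]
    ranking = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    best_acc, best_k = -1.0, 0
    for k in range(1, n_features + 1):
        acc = loo_accuracy(X, y, ranking[:k])
        if acc > best_acc:  # keep the smallest subset reaching the best score
            best_acc, best_k = acc, k
    return ranking[:best_k], best_acc, ranking


# Toy data (assumed): feature 0 separates the two classes, features 1-3 are noise.
rng = random.Random(0)
y = [0] * 20 + [1] * 20
X = [[rng.random() + 2 * label] + [rng.random() for _ in range(3)] for label in y]

selected, best_acc, ranking = incremental_feature_selection(X, y)
print("selected features:", selected, "accuracy:", best_acc)
```

In a real pipeline the columns of `X` would be sequence-derived descriptors and `loo_accuracy` would be replaced by cross-validated SVM accuracy. The strict `>` comparison is a design choice that prefers the smallest feature subset among those tied for the best score.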
Pages: 7