Linc2function: A Comprehensive Pipeline and Webserver for Long Non-Coding RNA (lncRNA) Identification and Functional Predictions Using Deep Learning Approaches

Cited by: 1
Authors
Ramakrishnaiah, Yashpal [1, 2]
Morris, Adam P. [3]
Dhaliwal, Jasbir [2]
Philip, Melcy [1]
Kuhlmann, Levin [4]
Tyagi, Sonika [1, 2]
Affiliations
[1] Monash Univ, Cent Clin Sch, Melbourne, Vic 3000, Australia
[2] RMIT Univ, Sch Comp Technol, Royal Melbourne Inst Technol, Melbourne, Vic 3000, Australia
[3] Monash Univ, Monash Data Futures Inst, Clayton, Vic 3800, Australia
[4] Monash Univ, Fac Informat Technol, Clayton, Vic 3800, Australia
Keywords
lncRNA; non-coding RNA; machine learning; functional annotation; deep learning; DATABASE; GENE
DOI
10.3390/epigenomes7030022
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Long non-coding RNAs (lncRNAs), comprising a significant portion of the human transcriptome, serve as vital regulators of cellular processes and potential disease biomarkers. However, the function of most lncRNAs remains unknown, and furthermore, existing approaches have focused on gene-level investigation. Our work emphasizes the importance of transcript-level annotation to uncover the roles of specific transcript isoforms. We propose that understanding the mechanisms of lncRNA in pathological processes requires solving their structural motifs and interactomes. A complete lncRNA annotation first involves discriminating them from their coding counterparts and then predicting their functional motifs and target bio-molecules. Current in silico methods mainly perform primary-sequence-based discrimination using a reference model, limiting their comprehensiveness and generalizability. We demonstrate that integrating secondary structure and interactome information, in addition to using transcript sequence, enables a comprehensive functional annotation. Annotating lncRNA for newly sequenced species is challenging due to inconsistencies in functional annotations, specialized computational techniques, limited accessibility to source code, and the shortcomings of reference-based methods for cross-species predictions. To address these challenges, we developed a pipeline for identifying and annotating transcript sequences at the isoform level. We demonstrate the effectiveness of the pipeline by comprehensively annotating the lncRNA associated with two specific disease groups. The source code of our pipeline is available under the MIT license for local use by researchers to make new predictions using the pre-trained models or to re-train models on new sequence datasets. Non-technical users can access the pipeline through a web server setup.
Pages: 17