MSLP: mRNA subcellular localization predictor based on machine learning techniques

Cited by: 9
Authors
Musleh, Saleh [1]
Islam, Mohammad Tariqul [2]
Qureshi, Rizwan [1]
Alajez, Nihad [3, 4]
Alam, Tanvir [1]
Affiliations
[1] Hamad Bin Khalifa Univ, Coll Sci & Engn, Doha, Qatar
[2] Southern Connecticut State Univ, Comp Sci Dept, New Haven, CT USA
[3] Hamad Bin Khalifa Univ, Qatar Biomed Res Inst QBRI, Translat Canc & Immun Ctr TC, Doha, Qatar
[4] Hamad Bin Khalifa Univ, Coll Hlth & Life Sci, Doha, Qatar
Keywords
RNA; mRNA; Machine learning; Sequence analysis; Localization prediction; Subcellular localization; NERVOUS-SYSTEM; RNALOCATE; SEQUENCES; RESOURCE;
DOI
10.1186/s12859-023-05232-0
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Subcellular localization of messenger RNAs (mRNAs) plays a pivotal role in the regulation of gene expression, cell migration, and cellular adaptation. Experimental techniques for pinpointing the subcellular localization of mRNAs are laborious, time-consuming, and expensive. In silico approaches for this purpose are therefore attracting considerable attention in the RNA community.
Methods: In this article, we propose MSLP, a machine learning-based method to predict the subcellular localization of mRNAs. MSLP uses a novel combination of four types of features as input to machine learning algorithms: k-mer composition, pseudo k-tuple nucleotide composition (PseKNC), physicochemical properties of nucleotides, and a 3D representation of sequences based on the Z-curve transformation.
Results: Using the combination of these features, ensemble-based models achieved state-of-the-art results in mRNA subcellular localization prediction tasks on multiple benchmark datasets. We evaluated the performance of our method on ten subcellular locations: cytoplasm, nucleus, endoplasmic reticulum (ER), extracellular region (ExR), mitochondria, cytosol, pseudopodium, posterior, exosome, and ribosome. An ablation study showed that k-mer and PseKNC features were more dominant than the other features for predicting cytoplasm, nucleus, and ER localizations, whereas physicochemical properties and Z-curve based features contributed the most to ExR and mitochondria detection. SHAP-based analysis revealed the relative importance of the features and provided better insight into the proposed approach.
Availability: We have implemented a Docker container and an API for end users to run their sequences on our model. The datasets, the API code, and the Docker container are shared with the community on GitHub at: https://github.com/smusleh/MSLP.
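
To make two of the four feature types named in the abstract more concrete, the following is a minimal Python sketch of normalized k-mer frequencies and the standard Z-curve transformation of a nucleotide sequence. It is an illustration only, not the MSLP implementation: the function names `kmer_frequencies` and `z_curve`, the default `k=3`, the conversion of U to T, and the handling of ambiguous bases are all assumptions, and the abstract does not state which k values or normalization the authors used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Return the relative frequency of every possible k-mer in seq.

    Assumption: RNA input is mapped to the DNA alphabet (U -> T), and
    windows containing non-ACGT characters are ignored.
    """
    seq = seq.upper().replace("U", "T")
    all_kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in all_kmers)
    return {m: (counts[m] / total if total else 0.0) for m in all_kmers}


def z_curve(seq):
    """Return the cumulative Z-curve coordinates (x, y, z) of seq.

    x: purines (A, G) minus pyrimidines (C, T)
    y: amino bases (A, C) minus keto bases (G, T)
    z: weak H-bond bases (A, T) minus strong H-bond bases (C, G)
    Assumption: ambiguous bases (e.g. N) are skipped.
    """
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    points = []
    for base in seq.upper().replace("U", "T"):
        if base not in steps:
            continue
        dx, dy, dz = steps[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    example = "AUGGCUAAC"  # hypothetical toy sequence
    print(kmer_frequencies(example, k=2)["GC"])
    print(z_curve(example)[-1])
```

A feature vector for a classifier could then be formed by concatenating the k-mer frequencies with summary statistics of the Z-curve coordinates. How MSLP actually combines and scales its features is described in the paper and repository, not here.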
Pages: 23