Biological Data Classification and Analysis Using Convolutional Neural Network

Cited by: 2
Authors
Ahmed, Iftikhar [1]
Iqbal, Muhammad Javed [2]
Basheri, Mohammad [1]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Informat Technol Dept, Jeddah 21589, Saudi Arabia
[2] Univ Engn & Technol, Dept Comp Sci, Taxila 47080, Pakistan
Keywords
Bioinformatics; Protein Sequence Classification; Deep Learning; Convolutional Neural Network; Sequence Encoding; Genes Healthcare Informatics;
DOI
10.1166/jmihi.2020.3179
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
The volume of data gathered from ongoing biological and clinical studies is increasing at an exponential rate. This biological data mainly comprises DNA genes, proteins, and a variety of proteomics and genetic disease data. Additionally, DNA microarray data is available for the early diagnosis and prediction of various types of cancer. This data may hold vital information about genes, their structure, and their important biological functions. The huge volume and constant growth of the extracted biological data have opened several challenges. Many bioinformatics and machine learning models have been developed, but they fail to address key challenges present in the efficient and accurate analysis of a variety of complex biological data, such as data on genetic diseases. The reliable and robust classification of the extracted data into different classes, based on the information hidden in the sample data, is also an interesting and open problem. This research work mainly focuses on overcoming major challenges in accurate protein classification, building on the success of deep learning models in natural language processing by treating protein sequences as a language. The learning ability and overall classification performance of the proposed system can be validated with deep learning classification models. The proposed system can classify the mentioned datasets more accurately than previous approaches and shows better results. In-depth analysis of multifaceted biological data may also help in the early diagnosis of diseases caused by gene mutations and in overcoming emerging challenges in the development of large-scale healthcare systems.
Pages: 2459-2465
Page count: 7
Related papers
40 in total
  • [1] Anastasiadis AD, 2003, LECT NOTES COMPUT SC, V2810, P430, DOI 10.1007/978-3-540-45231-7_40
  • [2] Angadi Ulavappa B, 2010, J Bioinform Comput Biol, V8, P825, DOI 10.1142/S0219720010004951
  • [3] Bandyopadhyay S., 2005, SYSTEMS, V152, P5
  • [4] Motif-based protein sequence classification using neural networks
    Blekas, K
    Fotiadis, DI
    Likas, A
    [J]. JOURNAL OF COMPUTATIONAL BIOLOGY, 2005, 12 (01) : 64 - 82
  • [5] Cui Y., 2019, BMC bioinformatics, V20, P1
  • [6] A neural network based approach for protein structural class prediction
    Datta, Ayan
    Talukdar, Veera
    Konar, Amit
    Jain, Lakhmi C.
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2009, 20 (1-2) : 61 - 71
  • [7] Dianhui W., 2002, P 9 INT C NEUR INF P
  • [8] Computational intelligence techniques in bioinformatics
    Hassanien, Aboul Ella
    Al-Shammari, Eiman Tamah
    Ghali, Neveen I.
    [J]. COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2013, 47 : 37 - 47
  • [9] DeepSF: deep convolutional neural network for mapping protein sequences to folds
    Hou, Jie
    Adhikari, Badri
    Cheng, Jianlin
    [J]. BIOINFORMATICS, 2018, 34 (08) : 1295 - 1303
  • [10] Iqbal M.J., 2015, COMPUTATIONAL INTELL, V33, P32