A Kernelized Classification Approach for Cancer Recognition Using Markovian Analysis of DNA Structure Patterns as Feature Mining

Times cited: 6
Authors
Kalal, Vijay [1]
Jha, Brajesh Kumar [1]
Affiliations
[1] Pandit Deendayal Energy Univ, Sch Technol, Dept Math, Gandhinagar 382007, Gujarat, India
Keywords
Cancer and non-cancer; Dinucleotide analysis; KLR and SVM; Nucleotide sequences; Markov chain model; ANTICANCER PEPTIDES; MODEL; IDENTIFICATION; SELECTION; HEALTHY; GENES;
DOI
10.1007/s12013-024-01336-3
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
DNA and RNA are nucleotide-based molecules essential to many biological processes in both normal and cancerous cells. They carry the genetic material needed for normal cell growth and function. The DNA structure patterns that make up the genetic code affect cell growth, behavior, and regulation, and different patterns indicate different physiological effects in the cell. Knowledge of these patterns is necessary to identify the molecular origins of cancer and other disorders, and analyzing them can support early disease detection, which is essential for effective cancer research and therapy. The novelty of this study lies in examining dinucleotide structure patterns across several genomic regions: the non-coding region sequence (N-CDS), the coding region sequence (CDS), and the whole raw DNA sequence (W.R. sequence). It provides an in-depth discussion of dinucleotide patterns in these genomic contexts for both malignant and non-malignant DNA sequences. Markovian modeling, which predicts dinucleotide probabilities, also reduces feature complexity and minimizes computational cost compared to the approaches of Kernelized Logistic Regression (KLR) and Support Vector Machine (SVM). The technique is evaluated on case studies using accuracy metrics and 10-fold cross-validation. The classifier and the feature reduction generated by Markovian probability work well together and can help predict cancer. The findings distinguish DNA sequences related to cancer from those associated with non-cancerous diseases by analyzing the W.R. DNA sequence as well as its CDS and N-CDS regions.
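The abstract does not give the exact estimation procedure, so the following is only a minimal sketch of how dinucleotide probabilities might be derived from a first-order Markov chain over the bases A, C, G, T. It assumes maximum-likelihood transition estimates from adjacent-base counts; the function names `dinucleotide_transition_probs` and `feature_vector` are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_transition_probs(seq):
    """Estimate first-order Markov transition probabilities P(b | a).

    Assumption: plain maximum-likelihood estimates from adjacent-base
    counts; the paper may use a different estimator.
    """
    seq = seq.upper()
    # Count adjacent pairs, skipping any pair containing a non-ACGT symbol.
    pair_counts = Counter(
        (a, b) for a, b in zip(seq, seq[1:]) if a in BASES and b in BASES
    )
    probs = {}
    for a in BASES:
        row_total = sum(pair_counts[(a, b)] for b in BASES)
        for b in BASES:
            # A base with no observed successor gets an all-zero row.
            probs[a + b] = pair_counts[(a, b)] / row_total if row_total else 0.0
    return probs


def feature_vector(seq):
    """Flatten the 4x4 transition matrix into 16 features (AA, AC, ..., TT)."""
    probs = dinucleotide_transition_probs(seq)
    return [probs[a + b] for a, b in product(BASES, repeat=2)]
```

Under these assumptions each sequence, whatever its length, reduces to a fixed 16-value vector, which is the kind of compact representation that could then be passed to a kernel classifier such as an SVM and scored with 10-fold cross-validation.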
Pages: 2249-2274
Number of pages: 26