A Novel Machine-Learning Approach to Predict Stress-Responsive Genes in Arabidopsis

Cited by: 2
Authors
Nazari, Leyla [1]
Ghotbi, Vida [2]
Nadimi, Mohammad [3]
Paliwal, Jitendra [3]
Affiliations
[1] Agr Res Educ & Extens Org AREEO, Fars Agr & Nat Resources Res & Educ Ctr, Crop & Hort Sci Res Dept, Shiraz 7155863511, Iran
[2] Agr Res Educ & Extens Org AREEO, Seed & Plant Improvement Inst, Karaj 3135933151, Iran
[3] Univ Manitoba, Dept Biosyst Engn, Winnipeg, MB R3T 5V6, Canada
Keywords
LASSO; information gain; ReliefF; classifiers; random forest; SELECTION; TRANSCRIPTOMICS; EXPRESSION; TIME
DOI
10.3390/a16090407
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes a hybrid gene selection method to identify and predict key genes in Arabidopsis associated with various stresses (including salt, heat, cold, high-light, and flagellin), aiming to enhance crop tolerance. An open-source microarray dataset (GSE41935) comprising 207 samples and 30,380 genes was analyzed using several machine learning tools, including the synthetic minority oversampling technique (SMOTE), information gain (IG), ReliefF, and least absolute shrinkage and selection operator (LASSO), along with various classifiers (BayesNet, logistic, multilayer perceptron, sequential minimal optimization (SMO), and random forest). We identified 439 differentially expressed genes (DEGs), of which only three were down-regulated (AT3G20810, AT1G31680, and AT1G30250). The top 20 genes selected by IG and by ReliefF were evaluated with the classifiers mentioned above on the task of separating stressed from non-stressed samples. Random forest outperformed the other classifiers, with accuracies of 97.91% and 98.51% for the IG and ReliefF gene sets, respectively. Additionally, LASSO regression identified 42 genes from all 30,380 genes. The top 20 genes from each feature selection method were compared to find three common genes (AT5G44050, AT2G47180, and AT1G70700), which formed a three-gene signature. The efficiency of these three genes was evaluated using the random forest and XGBoost algorithms. Further validation was performed using an independent RNA-seq dataset and random forest. These gene signatures can be exploited in plant breeding to improve stress tolerance in a variety of crops.
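The step in which the abstract intersects the top-ranked genes from several selectors can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the median-split discretization used for information gain, the function names, the gene names (`gene_a` and so on), and the toy expression values are all hypothetical, and the ReliefF and LASSO rankings are hand-written stand-ins rather than computed.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene after a median split (assumed discretization)."""
    cut = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < cut]
    high = [y for v, y in zip(values, labels) if v >= cut]
    conditional = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - conditional


def top_k(scores, k):
    """Names of the k highest-scoring genes, best first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [gene for gene, _ in ranked[:k]]


def common_signature(rankings, k):
    """Genes that appear in the top k of every ranking."""
    return sorted(set.intersection(*(set(r[:k]) for r in rankings)))


# Toy data (hypothetical): 0 = non-stressed, 1 = stressed.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_c": [0.5, 0.4, 0.6, 1.5, 1.7, 1.6],
}

ig_scores = {g: information_gain(v, labels) for g, v in expression.items()}
ig_ranking = top_k(ig_scores, 3)

# Stand-ins for rankings that ReliefF and LASSO would produce (assumed).
relieff_ranking = ["gene_c", "gene_a", "gene_b"]
lasso_ranking = ["gene_a", "gene_c", "gene_b"]

signature = common_signature([ig_ranking, relieff_ranking, lasso_ranking], k=2)
print(signature)
```

In the study the cutoff was the top 20 genes per method; here `k=2` is used only because the toy data has three genes.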
Pages: 14