A Novel Machine-Learning Approach to Predict Stress-Responsive Genes in Arabidopsis

Cited by: 2
Authors
Nazari, Leyla [1]
Ghotbi, Vida [2]
Nadimi, Mohammad [3]
Paliwal, Jitendra [3]
Affiliations
[1] Agr Res Educ & Extens Org AREEO, Fars Agr & Nat Resources Res & Educ Ctr, Crop & Hort Sci Res Dept, Shiraz 7155863511, Iran
[2] Agr Res Educ & Extens Org AREEO, Seed & Plant Improvement Inst, Karaj 3135933151, Iran
[3] Univ Manitoba, Dept Biosyst Engn, Winnipeg, MB R3T 5V6, Canada
Keywords
LASSO; information gain; ReliefF; classifiers; random forest; SELECTION; TRANSCRIPTOMICS; EXPRESSION; TIME
DOI
10.3390/a16090407
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This study proposes a hybrid gene selection method to identify and predict key genes in Arabidopsis associated with various stresses (salt, heat, cold, high light, and flagellin), with the aim of improving crop stress tolerance. An open-source microarray dataset (GSE41935) comprising 207 samples and 30,380 genes was analyzed using the synthetic minority oversampling technique (SMOTE), three feature selection methods (information gain (IG), ReliefF, and the least absolute shrinkage and selection operator (LASSO)), and five classifiers (BayesNet, logistic regression, multilayer perceptron, sequential minimal optimization (SMO), and random forest). We identified 439 differentially expressed genes (DEGs), of which only three were down-regulated (AT3G20810, AT1G31680, and AT1G30250). The top 20 genes selected by IG and by ReliefF were evaluated with these classifiers on the task of separating stressed from non-stressed samples. Random forest outperformed the other classifiers, with accuracies of 97.91% for IG and 98.51% for ReliefF. In addition, LASSO regression selected 42 genes from the full set of 30,380. Comparing the top 20 genes from each feature selection method yielded three genes common to all of them (AT5G44050, AT2G47180, and AT1G70700), which formed a three-gene signature. The predictive performance of this signature was evaluated using the random forest and XGBoost algorithms, and it was further validated with random forest on an independent RNA-seq dataset. These gene signatures can be exploited in plant breeding to improve stress tolerance in a variety of crops.
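The step of combining several feature rankings into a shared signature can be illustrated with a minimal sketch. It is not the authors' implementation: the median-split discretization used for information gain, the helper names, the toy expression values, and the stand-in ReliefF and LASSO scores are all assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene after splitting its expression at the median.

    The median split is an assumed discretization, chosen for simplicity.
    """
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    low = [y for x, y in zip(values, labels) if x < median]
    high = [y for x, y in zip(values, labels) if x >= median]
    weighted = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - weighted


def top_k(scores, k):
    """Names of the k highest-scoring genes in one ranking."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def common_signature(rankings, k):
    """Genes that appear in the top k of every ranking."""
    return set.intersection(*(top_k(scores, k) for scores in rankings))


# Hypothetical toy data: three genes, six samples.
labels = ["stress", "stress", "stress", "control", "control", "control"]
expression = {
    "gene_a": [9.1, 8.7, 9.4, 2.0, 2.3, 1.9],  # separates the two classes
    "gene_b": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8],  # carries no class signal
    "gene_c": [7.5, 7.9, 8.1, 3.0, 3.2, 2.8],  # separates the two classes
}
ig_scores = {gene: information_gain(v, labels) for gene, v in expression.items()}

# Stand-in scores for the other two methods (made up, not computed here).
relieff_scores = {"gene_a": 0.8, "gene_b": 0.1, "gene_c": 0.6}
lasso_scores = {"gene_a": 0.5, "gene_b": 0.2, "gene_c": 0.0}

signature = common_signature([ig_scores, relieff_scores, lasso_scores], k=2)
print(signature)
```

In the study the same idea is applied with k = 20 across IG, ReliefF, and LASSO; here k = 2 only because the toy data has three genes.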
Pages: 14