Predicting transcriptional responses to heat and drought stress from genomic features using a machine learning approach in rice

Cited by: 3
Authors
Smet, Dajo [1, 2]
Opdebeeck, Helder [1, 2]
Vandepoele, Klaas [1, 2, 3]
Affiliations
[1] Univ Ghent, Dept Plant Biotechnol & Bioinformat, Ghent, Belgium
[2] VIB, Ctr Plant Syst Biol, Ghent, Belgium
[3] Univ Ghent, Bioinformat Inst Ghent, Ghent, Belgium
Source
FRONTIERS IN PLANT SCIENCE | 2023 / Vol. 14
Keywords
rice; regulatory elements; regulation of heat stress; regulation of drought stress; machine learning interpretation; GENE-EXPRESSION; ARABIDOPSIS; NETWORKS; E2F;
DOI
10.3389/fpls.2023.1212073
Chinese Library Classification
Q94 [Botany]
Discipline classification code
071001
Abstract
Plants have evolved various mechanisms to adapt to adverse environmental stresses, such as the modulation of gene expression. Expression of stress-responsive genes is controlled by specific regulators, including transcription factors (TFs), that bind to sequence-specific binding sites, which are key components of cis-regulatory elements and regulatory networks. Our understanding of the underlying regulatory code remains, however, incomplete. Recent studies have shown that, by training machine learning (ML) algorithms on genomic sequence features, it is possible to predict which genes will transcriptionally respond to a specific stress. By identifying the most important features for gene expression prediction, these trained ML models can, in theory, help to further elucidate the regulatory code underlying the transcriptional response to abiotic stress. Here, we trained random forest ML models to predict gene expression in rice (Oryza sativa) in response to heat or drought stress. In addition to thoroughly assessing model performance and robustness across various input training data, we evaluated the importance of promoter and gene body sequence features for training the ML models. The use of enriched promoter oligomers, complementing known TF binding sites, allowed us to gain novel insights into DNA motifs contributing to the stress regulatory code. By comparing genomic feature importance scores for drought and heat stress over time, we identified general and stress-specific genomic features contributing to the performance of the learned models, as well as their temporal variation. This study provides a solid foundation for building and interpreting ML models that accurately predict transcriptional responses, and it enables novel insights into biological sequence features that are important for abiotic stress responses.
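The abstract mentions "enriched promoter oligomers" as model features. The sketch below illustrates one common way such oligomers (k-mers) could be selected: a one-sided hypergeometric test for over-representation among stress-responsive genes. It is an illustration only, not the authors' pipeline; the function names, the toy promoter sequences, the gene labels, and the choice of k and alpha are all assumptions made for this example, and no multiple-testing correction is applied.

```python
from math import comb


def kmer_presence(seq, k):
    """Return the set of k-mers found in seq, skipping any with non-ACGT letters."""
    seq = seq.upper()
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    }


def hypergeom_sf(hits, n_pos, n_with, n_total):
    """P(X >= hits) when n_pos genes are drawn from n_total genes,
    of which n_with carry the k-mer (one-sided enrichment p-value)."""
    upper = min(n_pos, n_with)
    total = comb(n_total, n_pos)
    tail = sum(
        comb(n_with, x) * comb(n_total - n_with, n_pos - x)
        for x in range(hits, upper + 1)
    )
    return tail / total


def enriched_kmers(promoters, responsive, k=6, alpha=0.05):
    """List (kmer, hits, n_with, p) for k-mers over-represented in responsive genes.

    promoters:  dict mapping gene id -> promoter sequence (hypothetical input)
    responsive: set of gene ids labelled as stress-responsive (hypothetical input)
    """
    presence = {gene: kmer_presence(seq, k) for gene, seq in promoters.items()}
    all_kmers = set().union(*presence.values())
    n_total, n_pos = len(promoters), len(responsive)
    results = []
    for kmer in all_kmers:
        carriers = {gene for gene, kmers in presence.items() if kmer in kmers}
        hits = len(carriers & responsive)
        p = hypergeom_sf(hits, n_pos, len(carriers), n_total)
        if p < alpha:
            results.append((kmer, hits, len(carriers), p))
    return sorted(results, key=lambda r: r[3])


# Toy data (made up for illustration): four "responsive" promoters share CACGTG.
toy_promoters = {
    "g1": "TTACACGTGAAT", "g2": "GGCACGTGTTTA",
    "g3": "AACACGTGCCAT", "g4": "CACGTGTTAAGG",
    "g5": "TTAAGGCCTTAA", "g6": "GGCCTTAAGGCC",
    "g7": "AATTCCGGAATT", "g8": "CCGGAATTCCGG",
}
toy_responsive = {"g1", "g2", "g3", "g4"}

for kmer, hits, n_with, p in enriched_kmers(toy_promoters, toy_responsive):
    print(kmer, hits, n_with, round(p, 4))
```

In a setting like the one the abstract describes, the presence or absence of each selected k-mer per gene could then serve as one input column for a classifier such as a random forest, whose feature importance scores indicate which k-mers contribute most to the prediction.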
Pages: 18