Differential Gene Expression Prediction by Ensemble Deep Networks on Histone Modification Data

Times cited: 1
Authors
Huang, Zimo [1]
Wang, Jun [1, 2]
Yan, Zhongmin [1]
Wan, Lin [1]
Guo, Maozu [3 ]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Shandong, Peoples R China
[2] Shandong Univ, Joint SDU NTU Ctr Artificial Intellige Res, Jinan 250101, Shandong, Peoples R China
[3] Beijing Univ Civil Engn & Architecture, Coll Elect & Informat Engn, Beijing 102616, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Histone modification; differential expressed gene; deep neural networks; ensemble learning; feature fusion; CHROMATIN STATE; GENOME; LANGUAGE;
DOI
10.1109/TCBB.2021.3139634
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010 ; 081704 ;
Abstract
Predicting differential gene expression (DGE) from histone modification (HM) signals is crucial for understanding how HMs control functional heterogeneity among cells by influencing differential gene regulation. Most existing prediction methods represent HM signals with fixed-length bins and feed these bins into a single machine learning model to predict differentially expressed genes for a single cell type or a cell type pair. However, an inappropriate bin length may split an important HM segment and cause information loss. Furthermore, the bias of a single learning model may limit prediction accuracy. To address these problems, we propose EnDGE, an Ensemble deep neural network framework for predicting Differential Gene Expression. EnDGE applies different feature extractors to input HM signal data binned at different lengths and fuses the resulting feature vectors for DGE prediction. Ensembling multiple learning models with different HM signal cutting strategies helps preserve the integrity and consistency of the genetic information in each signal segment and offsets the bias of individual models. In addition to popular feature extractors, we propose a new Residual Network based model with higher prediction accuracy to increase the diversity of the feature extractors. Experiments on real datasets from the Roadmap Epigenomics Project (REMC) show that, for all cell type pairs, EnDGE significantly outperforms state-of-the-art baselines for differential gene expression prediction.
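The multi-bin-length fusion idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the bin lengths (50, 100, 200), the five HM marks, the 10,000-position window, and the function names `bin_signal`, `toy_extractor`, and `fused_features`. The paper uses deep networks as feature extractors; the summary statistics below are only a stand-in to show how features from several binnings are concatenated.

```python
import numpy as np


def bin_signal(signal, bin_length):
    """Average a (marks x positions) signal into fixed-length bins.

    Positions that do not fill a complete final bin are dropped.
    """
    n_marks, n_pos = signal.shape
    n_bins = n_pos // bin_length
    trimmed = signal[:, : n_bins * bin_length]
    return trimmed.reshape(n_marks, n_bins, bin_length).mean(axis=2)


def toy_extractor(binned):
    """Hypothetical stand-in for a learned extractor: per-mark mean and max."""
    return np.concatenate([binned.mean(axis=1), binned.max(axis=1)])


def fused_features(signal, bin_lengths=(50, 100, 200)):
    """Extract features at each (assumed) bin length and concatenate them."""
    parts = [toy_extractor(bin_signal(signal, b)) for b in bin_lengths]
    return np.concatenate(parts)


# Synthetic read counts: 5 hypothetical HM marks over a 10,000-position window.
rng = np.random.default_rng(0)
signal = rng.poisson(2.0, size=(5, 10_000)).astype(float)
features = fused_features(signal)
```

In a setup closer to the one the abstract describes, each bin length would have its own trained network, and the fused vector would be passed to a final predictor of differential expression.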
Pages: 340-351
Number of pages: 12