StackEPI: identification of cell line-specific enhancer-promoter interactions based on stacking ensemble learning

Cited by: 4
Authors
Fan, Yongxian [1]
Peng, Binchao [1]
Affiliations
[1] Guilin University of Electronic Technology, School of Computer Science and Information Security, Guilin 541004, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Enhancer-promoter interaction; Bioinformatics; Machine learning; Stacking strategy; Feature extraction; 3D GENOME; PRINCIPLES; PSEKNC;
DOI
10.1186/s12859-022-04821-9
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Enhancer-promoter interactions (EPIs) regulate the expression of specific genes in cells, so understanding them contributes to the understanding of gene regulation, cell differentiation, and related processes; identifying EPIs, however, remains a challenging task. Traditional wet-lab experimental methods for identifying EPIs require substantial human labor and time. Existing computational methods achieve good recognition performance but generally require long training times.

Results: In this study, we examined the EPIs of six human cell lines and designed StackEPI, a cell line-specific EPI prediction method based on a stacking ensemble learning strategy, which offers better prediction performance and faster training. Specifically, by combining different encoding schemes and machine learning methods, StackEPI extracts cell line-specific information from enhancer and promoter sequences comprehensively and from multiple perspectives, and accurately identifies cell line-specific EPIs. The source code for StackEPI and the experimental data are available at https://github.com/20032303092/StackEPI.git.

Conclusions: The comparison results show that our model delivers better performance in identifying cell line-specific EPIs and outperforms other state-of-the-art models. In addition, our model is computationally more efficient.
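The abstract describes combining several sequence encodings with several learners under a stacking strategy, but does not say which encodings or learners are used. The following is a minimal, illustrative sketch of the general stacking idea only, not the StackEPI implementation. It assumes k-mer frequency encoding (with k = 1 and k = 2) and a toy nearest-centroid base learner; all function names, the toy sequences, and the fold scheme are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_freqs(seq, k):
    """Normalized k-mer frequency vector over the alphabet ACGT (assumed encoding)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)
    return [counts[m] / total for m in kmers]


def encode_pair(enh, prom, k):
    """Concatenate the enhancer and promoter encodings into one feature vector."""
    return kmer_freqs(enh, k) + kmer_freqs(prom, k)


def _mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def _dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class CentroidScorer:
    """Toy learner (a stand-in, not StackEPI's): positive score means closer to class 1."""

    def fit(self, X, y):
        self.pos = _mean([x for x, t in zip(X, y) if t == 1])
        self.neg = _mean([x for x, t in zip(X, y) if t == 0])
        return self

    def score(self, x):
        return _dist(x, self.neg) - _dist(x, self.pos)


def fit_stack(pairs, labels, ks=(1, 2), n_folds=2):
    """Train one base learner per encoding, then a meta-learner on out-of-fold scores."""
    n = len(pairs)
    meta_X = [[0.0] * len(ks) for _ in range(n)]
    bases = []
    for j, k in enumerate(ks):
        X = [encode_pair(e, p, k) for e, p in pairs]
        for f in range(n_folds):
            train = [i for i in range(n) if i % n_folds != f]
            held = [i for i in range(n) if i % n_folds == f]
            model = CentroidScorer().fit([X[i] for i in train],
                                         [labels[i] for i in train])
            for i in held:
                # Out-of-fold score: the model never saw sample i during fitting.
                meta_X[i][j] = model.score(X[i])
        # Refit each base learner on all data for use at prediction time.
        bases.append(CentroidScorer().fit(X, labels))
    meta = CentroidScorer().fit(meta_X, labels)
    return ks, bases, meta


def predict_stack(stack, enh, prom):
    """Return 1 (interacting) or 0 (non-interacting) for one enhancer-promoter pair."""
    ks, bases, meta = stack
    z = [b.score(encode_pair(enh, prom, k)) for b, k in zip(bases, ks)]
    return 1 if meta.score(z) > 0 else 0


# Made-up toy data: GC-rich pairs are labeled 1, AT-rich pairs 0.
PAIRS = [
    ("GCGCGGCC", "CCGGGCGC"), ("GGCCGCGC", "GCGCCGGC"),
    ("ATATTAAT", "TTAATATA"), ("TATAATTA", "AATTATAT"),
    ("CGCGGCGA", "GCCGCGGT"), ("GCGGCCGC", "CGCGCGCC"),
    ("ATTATAAC", "TATATTAG"), ("AATATTAT", "TTATAATA"),
]
LABELS = [1, 1, 0, 0, 1, 1, 0, 0]

if __name__ == "__main__":
    stack = fit_stack(PAIRS, LABELS)
    print(predict_stack(stack, "GCGCGCGC", "CCGGCCGG"))
    print(predict_stack(stack, "ATATATAT", "TTAATTAA"))
```

The key design point in any stacking scheme is that the meta-learner is fitted on out-of-fold base scores, so it does not learn from predictions the base learners made on their own training samples. A real pipeline would presumably use stronger learners and proper cross-validation.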
Pages: 18