ILGBMSH: an interpretable classification model for the shRNA target prediction with ensemble learning algorithm

Times cited: 13
Authors
Zhao, Chengkui [1]
Xu, Nan [2]
Tan, Jingwen [2]
Cheng, Qi [1]
Xie, Weixin [1]
Xu, Jiayu [1]
Wei, Zhenyu [1]
Ye, Jing [3]
Yu, Lei [3, 4]
Feng, Weixing [1]
Affiliations
[1] Harbin Engn Univ, Coll Intelligent Syst Sci & Engn, Harbin, Peoples R China
[2] Shanghai Unicar Therapy Biomed Technol Co Ltd, Res & Dev, Shanghai, Peoples R China
[3] Shanghai Unicar Therapy Biomed Technol Co Ltd, Shanghai, Peoples R China
[4] East China Normal Univ, Sch Chem & Mol Engn, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
shRNA prediction; ensemble learning; deep learning; knockdown experiment; DESIGN;
DOI
10.1093/bib/bbac429
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Short hairpin RNA (shRNA)-mediated gene silencing is an important technique for achieving RNA interference, and its success depends on designing potent and reliable shRNA molecules. However, selecting efficient shRNA targets experimentally is expensive and time-consuming, so a more precise and efficient computational design method is needed. In this work, we present ILGBMSH, an interpretable classification model for shRNA target prediction based on the Light Gradient Boosting Machine algorithm. Rather than using only shRNA sequence features, we extracted 554 biological and deep learning features that were not considered in previous shRNA prediction research. We compared the performance of our model with that of state-of-the-art shRNA target prediction models. We also examined feature contributions using the model's parameters and the interpretability method SHapley Additive exPlanations (SHAP), which yielded biological insights from the model. We used independent shRNA experimental data from other sources to test the predictive ability and robustness of our model. Finally, we used our model to design miR30-shRNA sequences and conducted a gene knockdown experiment. The experimental results agreed closely with our predictions, with a Pearson correlation coefficient of 0.985. In summary, the ILGBMSH model achieves state-of-the-art shRNA prediction performance and provides biological insights from the parameters of the machine learning model.
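The abstract reports agreement between model predictions and the knockdown experiment as a Pearson correlation coefficient. As a minimal sketch of how such a coefficient is computed, the snippet below implements the standard formula in plain Python. The function name `pearson_r` and both input lists are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only; NOT data from the paper.
predicted_scores = [0.10, 0.35, 0.50, 0.70, 0.90]
measured_knockdown = [0.15, 0.30, 0.55, 0.65, 0.95]

r = pearson_r(predicted_scores, measured_knockdown)
print(f"Pearson r = {r:.3f}")
```

A value near 1 indicates that measured knockdown rises almost linearly with the predicted score. It does not by itself show that the two quantities are on the same scale.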
Pages: 10