BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches

Cited by: 279
Authors
Liu, Bin [1, 2]
Gao, Xin [3]
Zhang, Hanyu [3]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing, Peoples R China
[2] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing, Peoples R China
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Shenzhen, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
AMINO-ACID-COMPOSITION; PREDICTION; DATABASE; AUTOCORRELATION; CLASSIFICATION; COLLOCATION; RECOGNITION; PROMOTERS; PROFILES; DESIGN;
DOI
10.1093/nar/gkz740
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
BioSeq-Analysis was the first web server to analyze various biological sequences at the sequence level based on machine learning approaches, and many powerful predictors in the field of computational biology have been developed with its assistance. However, BioSeq-Analysis can be applied only to sequence-level analysis tasks, which prevents its application to residue-level analysis tasks. An intelligent tool that can automatically generate various predictors for biological sequence analysis at both the residue level and the sequence level is therefore highly desired. In this regard, we published an important updated server called BioSeq-Analysis2.0 (http://bliulab.net/BioSeq-Analysis2.0/), covering a total of 26 features at the residue level and 90 features at the sequence level. Users only need to upload the benchmark dataset, and BioSeq-Analysis2.0 generates the predictors for both residue-level and sequence-level analysis tasks. Furthermore, the corresponding stand-alone tool is also provided and can be downloaded from http://bliulab.net/BioSeq-Analysis2.0/download/. To the best of our knowledge, BioSeq-Analysis2.0 is the first tool for generating predictors for biological sequence analysis tasks at the residue level. The experimental results indicated that the predictors developed by BioSeq-Analysis2.0 can achieve performance comparable to or even better than that of the existing state-of-the-art predictors.
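The distinction the abstract draws between sequence-level and residue-level analysis can be illustrated with a minimal sketch. This is not the BioSeq-Analysis2.0 implementation and does not reproduce any of its 26 or 90 features; the function names, the protein-only alphabet, the k-mer composition feature, and the one-hot sliding window are all assumptions chosen for illustration. The point is only the shape of the output: one vector per sequence versus one vector per residue.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids; DNA/RNA would use a 4-letter alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_level_features(seq, k=1):
    """Hypothetical sequence-level feature: normalized k-mer frequencies.

    Returns ONE fixed-length vector for the whole sequence.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    n_windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts[m] / n_windows for m in kmers]


def residue_level_features(seq, window=5, pad="X"):
    """Hypothetical residue-level feature: one-hot window around each residue.

    Returns ONE vector PER RESIDUE; positions beyond the sequence ends are
    padded with a symbol outside the alphabet, which encodes as all zeros.
    """
    half = window // 2
    padded = pad * half + seq + pad * half
    vectors = []
    for i in range(len(seq)):
        vec = []
        for ch in padded[i:i + window]:
            vec.extend(1 if ch == aa else 0 for aa in AMINO_ACIDS)
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    seq = "ACDA"
    print(len(sequence_level_features(seq)))   # one vector of length 20
    print(len(residue_level_features(seq)))    # four vectors, one per residue
```

In a sequence-level task each sequence would carry a single label, whereas in a residue-level task each per-residue vector would be paired with its own label.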
Pages: E127
Number of pages: 12