CRISPRloci: comprehensive and accurate annotation of CRISPR-Cas systems

Cited by: 25
Authors
Alkhnbashi, Omer S. [1 ]
Mitrofanov, Alexander [1 ]
Bonidia, Robson [2 ]
Raden, Martin [1 ]
Van Dinh Tran [1 ]
Eggenhofer, Florian [1 ]
Shah, Shiraz A. [3 ]
Oeztuerk, Ekrem [1 ]
Padilha, Victor A. [2 ]
Sanches, Danilo S. [4 ]
de Carvalho, Andre C. P. L. F. [2 ]
Backofen, Rolf [1 ,5 ,6 ]
Affiliations
[1] Univ Freiburg, Dept Comp Sci, Bioinformat Grp, Georges Koehler Allee 106, D-79110 Freiburg, Germany
[2] Univ Sao Paulo, Inst Math & Comp Sci, Sao Carlos, SP, Brazil
[3] Univ Copenhagen, Herlev & Gentofte Hosp, Copenhagen Prospect Studies Asthma Childhood, Copenhagen, Denmark
[4] Univ Tecnol Fed Parana, Campus Cornelio Procopio, BR-86300000 Cornelio Procopio, PR, Brazil
[5] Univ Freiburg, Signalling Res Ctr BIOSS, Schaenzlestr 18, D-79104 Freiburg, Germany
[6] Univ Freiburg, CIBSS, Schaenzlestr 18, D-79104 Freiburg, Germany
Funding
São Paulo Research Foundation (FAPESP), Brazil;
Keywords
EVOLUTIONARY CLASSIFICATION;
DOI
10.1093/nar/gkab456
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
CRISPR-Cas systems are adaptive immune systems in prokaryotes, providing resistance against invading viruses and plasmids. The identification of CRISPR loci is currently a non-standardized, ambiguous process, requiring the manual combination of multiple tools, where existing tools detect only parts of the CRISPR-systems, and lack quality control, annotation and assessment capabilities of the detected CRISPR loci. Our CRISPRloci server provides the first resource for the prediction and assessment of all possible CRISPR loci. The server integrates a series of advanced Machine Learning tools within a seamless web interface featuring: (i) prediction of all CRISPR arrays in the correct orientation; (ii) definition of CRISPR leaders for each locus; and (iii) annotation of cas genes and their unambiguous classification. As a result, CRISPRloci is able to accurately determine the CRISPR array and associated information, such as: the Cas subtypes; cassette boundaries; accuracy of the repeat structure, orientation and leader sequence; virus-host interactions; self-targeting; as well as the annotation of cas genes, all of which have been missing from existing tools. This annotation is presented in an interactive interface, making it easy for scientists to gain an overview of the CRISPR system in their organism of interest. Predictions are also rendered in GFF format, enabling in-depth genome browser inspection. In summary, CRISPRloci constitutes a full suite for CRISPR-Cas system characterization that offers annotation quality previously available only after manual inspection.
Pages: W125-W130
Page count: 6
Related papers
25 in total
[1]   CRISPR-Cas bioinformatics [J].
Alkhnbashi, Omer S. ;
Meier, Tobias ;
Mitrofanov, Alexander ;
Backofen, Rolf ;
Voss, Bjoern .
METHODS, 2020, 172 :3-11
[2]   Characterizing leader sequences of CRISPR loci [J].
Alkhnbashi, Omer S. ;
Shah, Shiraz A. ;
Garrett, Roger A. ;
Saunders, Sita J. ;
Costa, Fabrizio ;
Backofen, Rolf .
BIOINFORMATICS, 2016, 32 (17) :576-585
[3]   CRISPRstrand: predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci [J].
Alkhnbashi, Omer S. ;
Costa, Fabrizio ;
Shah, Shiraz A. ;
Garrett, Roger A. ;
Saunders, Sita J. ;
Backofen, Rolf .
BIOINFORMATICS, 2014, 30 (17) :I489-I496
[4]   Weight Loss after Gastric Bypass Surgery in Human Obesity Remodels Promoter Methylation [J].
Barres, Romain ;
Kirchner, Henriette ;
Rasmussen, Morten ;
Yan, Jie ;
Kantor, Francisc R. ;
Krook, Anna ;
Naslund, Erik ;
Zierath, Juleen R. .
CELL REPORTS, 2013, 3 (04) :1020-1027
[5]   RNA Accessibility in cubic time [J].
Bernhart, Stephan H. ;
Mueckstein, Ullrike ;
Hofacker, Ivo L. .
ALGORITHMS FOR MOLECULAR BIOLOGY, 2011, 6
[6]  
Breiman L., 1984, Classification and Regression Trees, DOI 10.1201/9781315139470
[7]  
Cherkassky V, 1997, IEEE Trans Neural Netw, V8, P1564, DOI 10.1109/TNN.1997.641482
[8]   CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins [J].
Couvin, David ;
Bernheim, Aude ;
Toffano-Nioche, Claire ;
Touchon, Marie ;
Michalik, Juraj ;
Neron, Bertrand ;
Rocha, Eduardo P. C. ;
Vergnaud, Gilles ;
Gautheret, Daniel ;
Pourcel, Christine .
NUCLEIC ACIDS RESEARCH, 2018, 46 (W1) :W246-W251
[9]   Extremely randomized trees [J].
Geurts, P ;
Ernst, D ;
Wehenkel, L .
MACHINE LEARNING, 2006, 63 (01) :3-42
[10]   Bioconda: sustainable and comprehensive software distribution for the life sciences [J].
Gruening, Bjoern ;
Dale, Ryan ;
Sjoedin, Andreas ;
Chapman, Brad A. ;
Rowe, Jillian ;
Tomkins-Tinch, Christopher H. ;
Valieris, Renan ;
Koester, Johannes ;
Team, Bioconda .
NATURE METHODS, 2018, 15 (07) :475-476