Chaos game representation dataset of SARS-CoV-2 genome

Cited by: 9
Authors
Barbosa, Raquel de M. [1]
Fernandes, Marcelo A. C. [2, 3, 4]
Affiliations
[1] MIT, Dept Chem Engn, Cambridge, MA 02142 USA
[2] Univ Fed Rio Grande do Norte, IMD nPITI, Lab Machine Learning & Intelligent Instrumentat, BR-59078970 Natal, RN, Brazil
[3] Univ Fed Rio Grande do Norte, Dept Comp Engn & Automat, BR-59078970 Natal, RN, Brazil
[4] Harvard Univ, John A Paulson Sch Engn & Appl Sci, Cambridge, MA 02138 USA
Keywords
SARS-CoV-2; CGR; COVID-19
DOI
10.1016/j.dib.2020.105618
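Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract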
As of April 16, 2020, the novel coronavirus disease (COVID-19) had spread to more than 185 countries/regions, with more than 142,000 deaths and more than 2,000,000 confirmed cases. In bioinformatics, one of the crucial tasks is the analysis of the virus nucleotide sequences using approaches such as data stream, digital signal processing, and machine learning techniques and algorithms. However, to make this approach feasible, the nucleotide sequence strings must first be transformed into a numerical representation. This dataset therefore provides a chaos game representation (CGR) of SARS-CoV-2 virus nucleotide sequences. It contains the CGR of 100 instances of the SARS-CoV-2 virus, 11,540 instances of other viruses from the Virus-Host DB dataset, and three instances of Riboviria viruses from NCBI (Betacoronavirus RaTG13, bat-SL-CoVZC45, and bat-SL-CoVZXC21). (C) 2020 The Author(s). Published by Elsevier Inc.
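The transformation from a nucleotide string to numerical values that the abstract describes can be illustrated with a minimal Python sketch of a classic chaos game representation. This is not the authors' code: the corner layout (A, C, G, T on the corners of the unit square), the starting point at the center, the handling of ambiguous bases, and the function name `cgr_points` are all assumptions made for illustration, and the dataset itself may use a different convention.

```python
from typing import List, Tuple

# Assumed corner layout on the unit square; the dataset may use another one.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(sequence: str) -> List[Tuple[float, float]]:
    """Return one (x, y) point per recognized nucleotide (hypothetical helper)."""
    x, y = 0.5, 0.5  # assumed start: center of the square
    points: List[Tuple[float, float]] = []
    for base in sequence.upper():
        if base == "U":  # treat RNA uracil as thymine
            base = "T"
        corner = CORNERS.get(base)
        if corner is None:
            continue  # assumption: skip ambiguous symbols such as N
        # Move halfway from the current point toward the base's corner.
        x = (x + corner[0]) / 2
        y = (y + corner[1]) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

Each nucleotide thus maps to a pair of real coordinates, so a genome of length n becomes a numerical sequence of n points that signal-processing or machine-learning methods can consume.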
Pages: 5
Related papers
7 in total
[1] Numerical encoding of DNA sequences by chaos game representation with application in similarity comparison [J]. Hoang, Tung; Yin, Changchuan; Yau, Stephen S.-T. GENOMICS, 2016, 108 (3-4): 134-142
[2] CHAOS GAME REPRESENTATION OF GENE STRUCTURE [J]. JEFFREY, HJ. NUCLEIC ACIDS RESEARCH, 1990, 18 (08): 2163-2170
[3] Linking Virus Genomes with Host Taxonomy [J]. Mihara, Tomoko; Nishimura, Yosuke; Shimizu, Yugo; Nishiyama, Hiroki; Yoshikawa, Genki; Uehara, Hideya; Hingamp, Pascal; Goto, Susumu; Ogata, Hiroyuki. VIRUSES-BASEL, 2016, 8 (03)
[4] NCBI, 2020, SARS COV 2 SEV AC RE
[5] Randhawa GS, 2020, BIORXIV, DOI 10.1101/2020.02.03.932350
[6] Virus-HostDB, 2020, HOST IND
[7] Yin C., 2017, arXiv:1712.04546