CNSA: a data repository for archiving omics data

Cited by: 286
Authors
Guo, Xueqin [1]
Chen, Fengzhen [1]
Gao, Fei [1]
Li, Ling [1]
Liu, Ke [1]
You, Lijin [1]
Hua, Cong [1]
Yang, Fan [1]
Liu, Wanliang [1]
Peng, Chunhua [1]
Wang, Lina [1]
Yang, Xiaoxia [1]
Zhou, Feiyu [1]
Tong, Jiawei [1]
Cai, Jia [1]
Li, Zhiyong [1]
Wan, Bo [1]
Zhang, Lei [1]
Yang, Tao [1]
Zhang, Minwen [1]
Yang, Linlin [1]
Yang, Yawen [1]
Zeng, Wenjun [1]
Wang, Bo [1]
Wei, Xiaofeng [1]
Xu, Xun [1, 2, 3]
Affiliations
[1] China National GeneBank, Shenzhen 518120, People's Republic of China
[2] BGI-Shenzhen, Shenzhen 518083, People's Republic of China
[3] Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen 518120, People's Republic of China
Source
Database: The Journal of Biological Databases and Curation | 2020
Keywords
Genomes
DOI
10.1093/database/baaa055
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
With the application and development of high-throughput sequencing technology in life and health sciences, massive multi-omics data brings the problem of efficient management and utilization. Database development and biocuration are the prerequisites for the reuse of these big data. Here, relying on China National GeneBank (CNGB), we present CNGB Sequence Archive (CNSA) for archiving omics data, including raw sequencing data and its further analyzed results which are organized into six objects, namely Project, Sample, Experiment, Run, Assembly and Variation at present. Moreover, CNSA has created a correlation model of living samples, sample information and analytical data on some projects. Both living samples and analytical data are directly correlated with the sample information. From either one, information or data of the other two can be obtained, so that all data can be traced throughout the life cycle from the living sample to the sample information to the analytical data. Complying with the data standards commonly used in the life sciences, CNSA is committed to building a comprehensive and curated data repository for storing, managing and sharing of omics data. We will continue to improve the data standards and provide free access to open-data resources for worldwide scientific communities to support academic research and the bio-industry.
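The correlation model described in the abstract can be pictured with a small sketch. This is not CNSA's implementation or schema; it only illustrates, under stated assumptions, how sample information could act as a hub so that a lookup starting from a living sample, the sample information, or an analytical data item returns the other two. The class names `SampleRecord` and `TraceIndex`, the method names, and identifiers such as `SAMPLE-1` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SampleRecord:
    """Sample information, used here as the hub of the correlation model."""
    sample_id: str  # hypothetical identifier, not a real CNSA accession
    living_sample_ids: list[str] = field(default_factory=list)
    data_ids: list[str] = field(default_factory=list)


class TraceIndex:
    """Hypothetical index linking living samples and analytical data to samples."""

    def __init__(self) -> None:
        self._samples: dict[str, SampleRecord] = {}
        # Maps a living-sample id or a data id back to its sample id.
        self._owner: dict[str, str] = {}

    def add_sample(self, sample_id: str) -> None:
        self._samples[sample_id] = SampleRecord(sample_id)

    def link_living_sample(self, sample_id: str, living_id: str) -> None:
        self._samples[sample_id].living_sample_ids.append(living_id)
        self._owner[living_id] = sample_id

    def link_data(self, sample_id: str, data_id: str) -> None:
        self._samples[sample_id].data_ids.append(data_id)
        self._owner[data_id] = sample_id

    def trace(self, any_id: str) -> SampleRecord:
        """Return the full record starting from any of the three kinds of id."""
        sample_id = any_id if any_id in self._samples else self._owner[any_id]
        return self._samples[sample_id]


# Usage with made-up identifiers.
index = TraceIndex()
index.add_sample("SAMPLE-1")
index.link_living_sample("SAMPLE-1", "LIVING-1")
index.link_data("SAMPLE-1", "RUN-1")
record = index.trace("RUN-1")
print(record.sample_id, record.living_sample_ids, record.data_ids)
```

Keeping both links on the sample record mirrors the abstract's statement that living samples and analytical data are each directly correlated with the sample information rather than with each other.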
Pages: 6