The Allele Catalog Tool: a web-based interactive tool for allele discovery and analysis

Cited by: 11
Authors
Chan, Yen On [1 ,2 ]
Dietz, Nicholas [3 ]
Zeng, Shuai [4 ]
Wang, Juexin [2 ,4 ]
Flint-Garcia, Sherry [5 ]
Salazar-Vidal, M. Nancy [3 ,6 ]
Skrabisova, Maria [7 ]
Bilyeu, Kristin [5 ]
Joshi, Trupti [1 ,2 ,4 ,8 ]
Affiliations
[1] Univ Missouri Columbia, MU Inst Data Sci & Informat, Columbia, MO 65211 USA
[2] Univ Missouri Columbia, Christopher S Bond Life Sci Ctr, Columbia, MO 65211 USA
[3] Univ Missouri Columbia, Div Plant Sci & Technol, Columbia, MO USA
[4] Univ Missouri Columbia, Dept Elect Engn & Comp Sci, Columbia, MO 65211 USA
[5] USDA ARS, Plant Genet Res Unit, Columbia, MO 65211 USA
[6] Univ Calif Davis, Dept Evolut & Ecol, Davis, CA USA
[7] Palacky Univ Olomouc, Fac Sci, Dept Biochem, Olomouc, Czech Republic
[8] Univ Missouri Columbia, Dept Hlth Management & Informat, Columbia, MO 65211 USA
Keywords
Variant Calling Pipeline; Allele Catalog Pipeline; Allele Catalog Tool; Alleles in Gene; Data Visualization; SEED DORMANCY; ARABIDOPSIS; ADAPTATION; DOG1;
DOI
10.1186/s12864-023-09161-3
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Advances in sequencing technologies have made a large volume of whole-genome re-sequenced (WGRS) data publicly available. However, using WGRS data for research without further processing is nearly impossible. To address this problem, our research group has developed an interactive Allele Catalog Tool that enables researchers to explore the coding-region allelic variation present in over 1,000 re-sequenced accessions each for soybean, Arabidopsis, and maize.

Results: The Allele Catalog Tool was originally designed with soybean genomic data and resources. The Allele Catalog datasets were generated using our variant calling pipeline (SnakyVC) and the Allele Catalog pipeline (AlleleCatalog). The variant calling pipeline processes raw sequencing reads in parallel to generate Variant Call Format (VCF) files. The Allele Catalog pipeline takes the VCF files, performs imputation and functional effect prediction, and assembles the alleles for each gene into curated Allele Catalog datasets. Both pipelines were used to generate the data panels (VCF files and Allele Catalog files), whose WGRS accessions were collected from various sources and currently represent over 1,000 diverse accessions each for soybean, Arabidopsis, and maize. The main features of the Allele Catalog Tool are data query, visualization of results, categorical filtering, and download functions. Queries are built from user input, and results are returned in tabular form: summary results by categorical description, and genotype results of the alleles for each gene. The categorical information is specific to each species; where available, detailed meta-information is provided in modal popups. The genotypic information contains the variant positions, the reference or alternate genotypes, the functional effect classes, and the amino-acid changes for each accession. The results can also be downloaded for other research purposes.

Conclusions: The Allele Catalog Tool is a web-based tool that currently supports three species: soybean, Arabidopsis, and maize. The Soybean Allele Catalog Tool is hosted on the SoyKB website, while the Allele Catalog Tool for Arabidopsis and maize is hosted on the KBCommons website. Researchers can use this tool to connect variant alleles of genes with the meta-information of each species.
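The allele assembly and categorical summary described in the abstract can be illustrated with a minimal Python sketch. This is not the AlleleCatalog pipeline's actual code: the function names, the in-memory data layout, the accession identifiers, and the category labels are all hypothetical, and the genotypes are assumed to be already imputed so that every accession has a call at every position.

```python
from collections import Counter, defaultdict

# Hypothetical imputed genotype calls for one gene: accession -> {position: genotype}
GENOTYPES = {
    "acc1": {101: "A", 250: "G"},
    "acc2": {101: "A", 250: "G"},
    "acc3": {101: "T", 250: "G"},
}
# Hypothetical variant positions inside the gene's coding region
POSITIONS = [101, 250]
# Hypothetical categorical description for each accession
CATEGORIES = {"acc1": "Cultivar", "acc2": "Landrace", "acc3": "Cultivar"}


def assemble_alleles(genotypes, positions):
    """Represent each accession's allele as the tuple of its genotypes
    at the gene's variant positions, in position order."""
    return {
        accession: tuple(calls[pos] for pos in positions)
        for accession, calls in genotypes.items()
    }


def summarize_by_category(alleles, categories):
    """Count how many accessions of each category carry each allele."""
    summary = defaultdict(Counter)
    for accession, allele in alleles.items():
        summary[allele][categories[accession]] += 1
    return {allele: dict(counts) for allele, counts in summary.items()}


if __name__ == "__main__":
    alleles = assemble_alleles(GENOTYPES, POSITIONS)
    for allele, counts in summarize_by_category(alleles, CATEGORIES).items():
        print(allele, counts)
```

Grouping accessions by the full tuple of genotypes, rather than by individual variants, is what allows a summary table to show one row per allele of a gene with per-category accession counts.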
Pages: 14