ARG-based genome-wide analysis of cacao cultivars

Cited by: 4
Authors
Utro, Filippo [1]
Cornejo, Omar Eduardo [2]
Livingstone, Donald [3]
Motamayor, Juan Carlos [4]
Parida, Laxmi [1]
Affiliations
[1] IBM TJ Watson Res, Computat Biol Ctr, Yorktown Hts, NY 10598 USA
[2] Stanford Univ, Sch Med, Dept Genet, Stanford, CA 94305 USA
[3] USDA, Miami, FL 33186 USA
[4] Mars Inc, Miami, FL 33158 USA
Source
BMC BIOINFORMATICS | 2012 / Vol. 13
Keywords
ANCESTRAL RECOMBINATIONS GRAPH; GENETIC DIVERSITY; NETWORKS; PATTERNS; MARKERS;
DOI
10.1186/1471-2105-13-S19-S17
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: The ancestral recombinations graph (ARG) is a topological structure that captures the relationship between extant genomic sequences in terms of genetic events, including recombinations. IRiS is a system that estimates the ARG on sequences of individuals, at genomic scales, capturing the relationship between these individuals of the species. Recently, this system was used to estimate the ARG of the recombining X chromosome of a collection of human populations using relatively dense, bi-allelic SNP data.
Results: While the ARG is a natural model for capturing the inter-relationship between a single chromosome of the individuals of a species, it is not immediately apparent how the model can utilize whole-genome (across chromosomes) diploid data. Also, the sheer complexity of an ARG structure presents a challenge to graph visualization techniques. In this paper we examine ARG reconstruction for data that are (1) genome-wide, i.e., spanning multiple chromosomes, (2) multi-allelic, and (3) extremely sparse. To aid in visualizing the reconstructed ARG, we additionally construct a much simplified topology, a classification tree, suggested by the ARG. As the test case, we study the problem of extracting the relationship between populations of Theobroma cacao. The chocolate tree is an outcrossing species in the wild, due to self-incompatibility mechanisms at play. Thus a principled approach to understanding the inter-relationships between the different populations must take the shuffling of the genomic segments into account. The polymorphisms in the test data are short tandem repeats (STRs) and are multi-allelic (sometimes with as many as 30 distinct possible values at a locus). Each is at a genomic location that is bilaterally transmitted, hence the ARG is a natural model for this data. Another characteristic of this plant data set is that, while it is genome-wide, spanning 10 linkage groups or chromosomes, it is very sparse: only 96 loci from a genome of approximately 400 megabases. The results are visualized both as MDS plots and as classification trees. To evaluate the accuracy of the ARG approach, we compare the results with those available in the literature.
Conclusions: We have extended the ARG model to incorporate genome-wide (ensemble of multiple chromosomes) data in a natural way. We present a simple scheme to implement this in practice. Finally, this is the first time that a plant population data set has been studied by estimating its underlying ARG. We demonstrate an overall precision of 0.92 and an overall recall of 0.93 for the ARG-based classification, with respect to the gold standard. While we have corroborated the classification of the samples with that in the literature, this opens the door to other potential studies that can be carried out on the ARG.
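The precision and recall figures quoted in the abstract compare an ARG-based classification of samples against a gold-standard grouping. The abstract does not say how the "overall" values were aggregated, so the following is only a minimal sketch that assumes macro-averaging over groups. The function name precision_recall, the group labels "A", "B", "C" and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Return per-group (precision, recall) and their macro-averages.

    Assumption: "overall" means the unweighted mean over groups; the
    source abstract does not state which averaging was used.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    gold_count = Counter(gold)        # samples truly in each group
    pred_count = Counter(predicted)   # samples assigned to each group
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    per_group = {}
    for group in sorted(set(gold) | set(predicted)):
        prec = true_pos[group] / pred_count[group] if pred_count[group] else 0.0
        rec = true_pos[group] / gold_count[group] if gold_count[group] else 0.0
        per_group[group] = (prec, rec)

    n = len(per_group)
    macro_precision = sum(p for p, _ in per_group.values()) / n
    macro_recall = sum(r for _, r in per_group.values()) / n
    return per_group, macro_precision, macro_recall


# Toy data (hypothetical): six samples, three groups, one misassignment.
gold = ["A", "A", "A", "B", "B", "C"]
predicted = ["A", "A", "B", "B", "B", "C"]
per_group, macro_p, macro_r = precision_recall(gold, predicted)
print(per_group, round(macro_p, 3), round(macro_r, 3))
```

A micro-averaged variant (pooling all samples before dividing) would be an equally plausible reading of "overall"; for single-label classification it reduces to plain accuracy.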
Pages: 11