Biogeographical Ancestry Inference from Genotype: A Comparison of Ancestral Informative SNPs and Genome-wide SNPs

Times cited: 0
Authors
Qu, Yue [1]
Tran, Dat [1]
Martinez-Marroquin, Elisa [1]
Affiliations
[1] Univ Canberra, Fac Sci & Technol, Canberra, ACT, Australia
Source
2020 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI) | 2020
Keywords
Biogeographical ancestry (BGA); Genome-wide analysis; Hidden Markov Model (HMM); Support Vector Machine (SVM); Convolutional Neural Network (CNN); GENETIC ANCESTRY; POPULATION-STRUCTURE; POLYMORPHISM; PANEL; IDENTIFICATION; PREDICTION; DIVERSITY; ADMIXTURE; MARKERS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Biogeographical ancestry (BGA) information can provide supporting information in epidemiology and investigative leads in forensics. Several sets of ancestral informative markers (AIMs) have been proposed to facilitate BGA inference. Although a small set of markers improves efficiency, it is limited in its ability to balance different populations and to differentiate sub-populations. Genome-wide SNPs provide much more comprehensive information about an individual's ancestry. In this paper, we study the problem of BGA inference given abundant high-density genome-wide data. We studied 1043 individuals from 7 continental populations of the Human Genome Diversity Panel at 32212 genome-wide autosomal single nucleotide polymorphism (SNP) loci. We detected the population structure and compared BGA inference accuracy using three widely used genetic sequence analysis algorithms, first with AIMs and then with genome-wide SNPs. Our results show that genome-wide SNPs reveal population structure with clearer clustering and provide more accurate BGA inference, confirming the rich information carried by genome-wide SNPs. The findings help to give a clearer picture of an individual's candidate ancestral population groups and may help BGA inference at a fine population scale.
Pages: 64-70
Page count: 7