Large-scale selection of highly informative microhaplotypes for ancestry inference and population specific informativeness

Times cited: 0
Authors
Rodrigues, Maria Luisa de Barros [1]
Rodrigues, Marcelo Porto
Norton, Heather L. [2]
Mendes-Junior, Celso Teixeira [3]
Simoes, Aguinaldo Luiz [4]
Lawson, Daniel John [5, 6]
Affiliations
[1] Univ Sao Paulo, Fac Med Ribeirao Preto, Programa Posgraduacao Genet, Ave Bandeirantes 3900, BR-14049900 Ribeirao Preto, SP, Brazil
[2] Univ Cincinnati, Dept Anthropol, Cincinnati, OH 45221 USA
[3] Univ Sao Paulo, Fac Filosofia, Dept Quim, Lab Pesquisas Forenses & Genom, Ciencias & Letras R, BR-14040901 Ribeirao Preto, SP, Brazil
[4] Univ Sao Paulo, Fac Med Ribeirao Preto, Dept Genet, Ave Bandeirantes 3900, BR-14049900 Ribeirao Preto, SP, Brazil
[5] Univ Bristol, Inst Stat Sci, Sch Math, Woodland Rd, Bristol BS8 1UG, England
[6] Univ Bristol, Sch Med, MRC Integrat Epidemiol Unit, Oakfield Grove, Bristol BS8 2BN, England
Funding
European Union Horizon 2020;
Keywords
Microhaplotypes; Ancestry; Informativeness; Brazilian population; Native Americans; Microarray; BIOGEOGRAPHICAL ANCESTRY; ADMIXTURE; PANEL; SOFTWARE; DATABASE; GENOMES; MARKERS; ORIGIN; FORMAT;
DOI
10.1016/j.fsigen.2024.103153
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
Microhaplotypes (MHs) are sets of physically close genetic markers that are inherited together, and they are gaining prominence due to their efficiency in forensic, clinical, and population studies. They excel in kinship analysis, DNA mixture detection, and ancestry inference, offering advantages in precision over individual SNPs and STRs. In this study, a pipeline was developed to efficiently select highly informative MHs from large-scale genomic datasets. Over 120,000 MHs were identified from almost a million markers, which allows the non-independent information these markers carry to be used efficiently for inference. The MHs were compared to SNPs in terms of their informativeness and the performance of their subsets in ancestry inference, and all results consistently favored MHs. A method for ranking markers by population-specific informativeness was also introduced; it improved the accuracy of Native American ancestry estimation, overcoming the challenges posed by the underrepresentation of this ancestry in datasets. In conclusion, this study presents a comprehensive approach to selecting highly informative MHs for accurate ancestry inference. The proposed approach and the subsets selected by population-specific informativeness offer valuable tools for improving ancestry inference accuracy, particularly for admixed populations, as demonstrated for a Brazilian dataset.
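The abstract does not say which informativeness statistic the pipeline uses. As an illustration only, the sketch below assumes the informativeness-for-assignment measure of Rosenberg et al. (2003), I_n, and applies it to a made-up two-population example in which each SNP alone carries no ancestry information but the two-SNP microhaplotype does. The function names `informativeness` and `marginal`, and all frequencies, are hypothetical and are not taken from the paper.

```python
import math


def informativeness(pop_freqs):
    """I_n (Rosenberg et al. 2003) for one marker.

    pop_freqs: list of K dicts, one per population, mapping
    allele (or haplotype) -> frequency in that population.
    NOTE: the choice of I_n is an assumption, not the paper's stated method.
    """
    k = len(pop_freqs)
    alleles = set().union(*pop_freqs)  # all alleles seen in any population
    total = 0.0
    for allele in alleles:
        p = [f.get(allele, 0.0) for f in pop_freqs]
        mean = sum(p) / k
        if mean > 0:
            total -= mean * math.log(mean)
        total += sum(x * math.log(x) for x in p if x > 0) / k
    return total


def marginal(pop_freqs, pos):
    """Collapse haplotype frequencies to the single SNP at index pos."""
    out = []
    for f in pop_freqs:
        m = {}
        for hap, p in f.items():
            m[hap[pos]] = m.get(hap[pos], 0.0) + p
        out.append(m)
    return out


# Hypothetical haplotype frequencies in two populations (illustrative only).
mh = [{"AG": 0.5, "CT": 0.5}, {"AT": 0.5, "CG": 0.5}]

snp1 = informativeness(marginal(mh, 0))  # first SNP alone
snp2 = informativeness(marginal(mh, 1))  # second SNP alone
hap = informativeness(mh)                # the two SNPs as one microhaplotype
print(snp1, snp2, hap)
```

In this toy case both single-SNP values are approximately zero, because each allele has the same frequency in both populations, while the microhaplotype reaches log 2, the maximum for two populations. Ranking markers by such a score, computed overall or against one target population, is one plausible reading of the selection step described in the abstract.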
Pages: 10