An automated framework for fast cognate detection and Bayesian phylogenetic inference in computational historical linguistics

Cited by: 0
Authors
Rama, Taraka [1]
List, Johann-Mattis [2]
Affiliations
[1] Univ North Texas, Dept Linguist, Denton, TX 76203 USA
[2] MPI SHH, DLCE, Jena, Germany
Source
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019) | 2019
Funding
European Research Council;
Keywords
CHAIN MONTE-CARLO; DNA-SEQUENCES;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a fully automated workflow for phylogenetic reconstruction on large datasets, consisting of two novel methods, one for fast detection of cognates and one for fast Bayesian phylogenetic inference. Our results show that the methods take less than a few minutes to process language families that have so far required large amounts of time and computational power. Moreover, the cognates and the trees inferred from the method are quite close, both to gold standard cognate judgments and to expert language family trees. Given its speed and ease of application, our framework is specifically useful for the exploration of very large datasets in historical linguistics.
Pages: 6225-6235
Page count: 11