On the use of machine learning algorithms in forensic anthropology

Cited by: 50
Authors
Nikita, Efthymia [1]
Nikitas, Panos [2]
Affiliations
[1] Cyprus Inst, Sci & Technol Archaeol & Culture Res Ctr, 20 Konstantinou Kavafi St, CY-2121 Nicosia, Cyprus
[2] Aristotle Univ Thessaloniki, Dept Chem, Univ Campus, Thessaloniki 54124, Greece
Keywords
Forensic anthropology; Sex estimation; Ancestry estimation; Cranium; Pelvis; Statistical methods; DISCRIMINANT FUNCTION-ANALYSIS; ESTIMATING ANCESTRY; SEXUAL-DIMORPHISM; HUMAN INNOMINATE; CLASSIFICATION; TRAITS; MODELS; SHAPE
DOI
10.1016/j.legalmed.2020.101771
Chinese Library Classification (CLC) number
DF [Law]; D9 [Law]; R [Medicine and Health]
Discipline classification code
0301; 10
Abstract
The classification performance of the statistical methods binary logistic regression (BLR), multinomial and penalized multinomial logistic regression (MLR, pMLR), and linear discriminant analysis (LDA), and of the machine learning algorithms naive Bayes classification (NBC), decision trees (DT), random forest (RF), artificial neural networks (ANN), support vector machines (linear, polynomial or radial) (SVM), multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGB), is examined in skeletal sex/ancestry estimation. The datasets used to test the performance of these methods were obtained from a documented human skeletal collection, the Athens Collection, and from the Howells Craniometric data set. For their implementation, an R package has been written to search for the optimum tuning parameters under cross-validation and to perform sex/ancestry classification. It was found that the classification performance may vary significantly depending on the problem. Of the methods tested, LDA and the machine learning technique of linear SVM exhibit the best performance, with high prediction accuracy and relatively low bias in most of the tests. ANN and pMLR can generally be considered to give satisfactory predictions, whereas NBC (when using metric traits) and DT are the worst of the classification methods examined. The possibility of making the models developed via the machine learning algorithms applicable to other assemblages without the use of a training sample is also discussed.
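The cross-validated evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' R package: it is a pure-Python, two-feature, two-class LDA scored by k-fold cross-validation, and every function name (`fit_lda`, `predict`, `cross_validate`, `make_synthetic`) is hypothetical. The data are synthetic Gaussian draws standing in for two measurements, not values from the Athens Collection or the Howells data set. Plain LDA has no tuning parameters, so the tuning search mentioned in the abstract is omitted here.

```python
import math
import random


def fit_lda(X, y):
    """Fit two-class LDA on two-feature data.

    Returns (w, c); a case x is assigned to class 1 when w.x + c > 0.
    """
    groups = {0: [], 1: []}
    for row, label in zip(X, y):
        groups[label].append(row)
    means = {k: [sum(col) / len(rows) for col in zip(*rows)]
             for k, rows in groups.items()}
    # Pooled within-class covariance matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for k, rows in groups.items():
        for r in rows:
            d = [r[0] - means[k][0], r[1] - means[k][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(X) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [means[1][i] - means[0][i] for i in range(2)]
    w = [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]
    mid = [(means[1][i] + means[0][i]) / 2 for i in range(2)]
    # Log prior ratio taken from the class proportions in the training data.
    log_prior = math.log(len(groups[1]) / len(groups[0]))
    c = -(w[0] * mid[0] + w[1] * mid[1]) + log_prior
    return w, c


def predict(w, c, x):
    """Return the predicted class label (0 or 1) for one case."""
    return 1 if w[0] * x[0] + w[1] * x[1] + c > 0 else 0


def cross_validate(X, y, k=5, seed=0):
    """k-fold cross-validation; returns overall and per-class accuracy."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w, c = fit_lda([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            totals[y[i]] += 1
            hits[y[i]] += predict(w, c, X[i]) == y[i]
    per_class = {label: hits[label] / totals[label] for label in hits}
    overall = sum(hits.values()) / sum(totals.values())
    return {"overall": overall, "per_class": per_class}


def make_synthetic(n_per_class=100, seed=0):
    """Synthetic two-feature data (assumed class centres, not real measurements)."""
    rng = random.Random(seed)
    centers = {0: (40.0, 90.0), 1: (45.0, 96.0)}
    X, y = [], []
    for label, (m1, m2) in centers.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(m1, 2.5), rng.gauss(m2, 2.5)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    print(cross_validate(X, y, k=5, seed=1))
```

Reporting accuracy per class alongside the overall figure makes it visible when a classifier favours one group over the other, which a single pooled accuracy would hide.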
Pages: 8