Likelihood ratio-based probabilistic classifier

Authors
Martyna, Agnieszka [1]
Nordgaard, Anders [2]
Affiliations
[1] Univ Silesia Katowice, Inst Chem, Forens Chem Res Grp, Szkolna 9, PL-40006 Katowice, Poland
[2] Linkoping Univ, Swedish Natl Forens Ctr, Dept Comp & Informat Sci, S-58194 Linkoping, Sweden
Keywords
Probabilistic classification; Likelihood ratio; Multimodal class distributions; Multivariate data; Model
DOI
10.1016/j.chemolab.2023.104862
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modern classification methods are likely to misclassify samples whose data are rare but class-specific and are more similar (less distant) to the data of another class than to those of their own class. This happens because such methods tend to focus on the majority of the data and practically ignore the information carried by the rare data. That information is nevertheless invaluable and should support the classification of samples with such data, despite their low frequency. Current solutions that take rarity into account involve likelihood ratio (LR) models. We modify the existing LR models to establish the class membership of the analysed samples by comparing them with samples of known class label. If two compared samples share rare but class-specific features, the analysed sample becomes much more likely to belong to that class than to any other, even when its features are less distant from the features of most samples in other classes. The fundamental advantage of the developed methodology is that it includes information about rare, class-specific features, which ordinary classifiers neglect. Converting the LR values into the probabilities with which a sample belongs to each of the classes under consideration yields a powerful tool for probabilistic classification.
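The abstract does not state how LR values are converted into class probabilities, so the following is only a minimal sketch of the textbook route, Bayes' rule in odds form, and not necessarily the authors' procedure. The function names `lr_to_posterior` and `class_probabilities`, the default equal priors, and all numeric inputs are assumptions made for illustration.

```python
def lr_to_posterior(lr, prior=0.5):
    """Posterior probability of class membership from a likelihood ratio.

    Assumes lr = f(x | class) / f(x | alternative) and uses Bayes' rule in
    odds form: posterior odds = LR * prior odds. `prior` is an assumed
    prior probability of the class (hypothetical default: 0.5).
    """
    if lr < 0:
        raise ValueError("a likelihood ratio cannot be negative")
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1 + posterior_odds)


def class_probabilities(likelihoods, priors=None):
    """Posterior probabilities for several classes at once.

    `likelihoods` holds f(x | class k) for each class k; equal priors are
    assumed when none are given. The result sums to 1.
    """
    n = len(likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    weighted = [f * p for f, p in zip(likelihoods, priors)]
    total = sum(weighted)
    if total == 0:
        raise ValueError("all class likelihoods are zero")
    return [w / total for w in weighted]
```

With two classes and equal priors the two functions agree: an LR of f1/f2 gives the same probability as normalising the pair (f1, f2). A likelihood model that assigns high density to a rare, class-specific mode therefore raises the probability of that class even when the sample lies closer to the bulk of another class.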
Pages: 12