Outliers in statistical pattern recognition and an application to automatic chromosome classification

Cited by: 118
Authors
Ritter, G
Gallegos, MT
Affiliation
[1] Fak. für Math. und Informatik, Universität Passau
Keywords
outlier estimation; mixture distributions; trimming method; Bayesian classification; statistical pattern recognition; automatic chromosome classification; karyotyping; diagnostic classification; biomedical data model;
DOI
10.1016/S0167-8655(97)00049-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a heuristic method of parameter estimation in mixture models for data with outliers and design a Bayesian classifier for the assignment of m objects to n ≥ m classes under constraints. This method of outlier handling, combined with the classifier, is applied to the well-known problem of automatic, constrained classification of chromosomes into their biological classes. We show that it decreases the error rate relative to the classical normal model by more than 50%. When applied to the Edinburgh feature data of the large Copenhagen image data set Cpr, our best classifier yields an error rate close to 1.3% relative to chromosomes; 4 out of 5 cells are correctly classified. © 1997 Elsevier Science B.V.
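The idea of assigning m objects to n ≥ m classes under constraints can be illustrated with a small sketch. It is not the authors' classifier: it assumes, purely for illustration, that the constraint is "each class receives at most one object", that per-object class posteriors are already available, and it uses brute-force enumeration in place of a proper assignment-problem solver. The function name `constrained_assign` and the toy posteriors are hypothetical.

```python
from itertools import permutations
import math


def constrained_assign(log_post):
    """Pick distinct classes for all objects so that the summed log posterior is maximal.

    log_post[i][j] is the log posterior of object i belonging to class j
    (m rows, n >= m columns). Assumed constraint: at most one object per class.
    Brute force over all injective assignments; only practical for small m and n.
    """
    m, n = len(log_post), len(log_post[0])
    if n < m:
        raise ValueError("need at least as many classes as objects")
    best_score, best = -math.inf, None
    for classes in permutations(range(n), m):  # every injective object -> class map
        score = sum(log_post[i][c] for i, c in enumerate(classes))
        if score > best_score:
            best_score, best = score, classes
    return best, best_score


# Toy posteriors (made up): both objects individually prefer class 0.
post = [[0.6, 0.3, 0.1],
        [0.7, 0.2, 0.1]]
log_post = [[math.log(p) for p in row] for row in post]

assignment, score = constrained_assign(log_post)

# Unconstrained Bayes rule: each object independently takes its most probable class.
unconstrained = tuple(max(range(len(row)), key=row.__getitem__) for row in log_post)
```

In this toy case the unconstrained rule places both objects in class 0, while the constrained assignment moves the first object to class 1 because that gives the highest joint posterior among the admissible assignments. For realistic sizes, an assignment-problem algorithm would replace the enumeration.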
Pages: 525-539
Number of pages: 15