Classification with discrete and continuous variables via general mixed-data models

Cited by: 14
Authors
de Leon, A. R. [1]
Soo, A. [2]
Williamson, T. [2]
Affiliations
[1] Univ Calgary, Dept Math & Stat, Calgary, AB T2N 1N4, Canada
[2] Univ Calgary, Dept Community Hlth Sci, Calgary, AB T2N 1N4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
error rate; general location model; grouped continuous model; maximum likelihood; measurement level; minimum distance probability; misclassification probability; plug-in estimates; DISCRIMINANT-ANALYSIS; DISTANCE; ROBUSTNESS; LOCATION; BINARY;
DOI
10.1080/02664761003758976
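Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code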
020208 ; 070103 ; 0714 ;
Abstract
We study the problem of classifying an individual into one of several populations based on mixed nominal, continuous, and ordinal data. Specifically, we obtain a classification procedure as an extension to the so-called location linear discriminant function, by specifying a general mixed-data model for the joint distribution of the mixed discrete and continuous variables. We outline methods for estimating misclassification error rates. Results of simulations of the performance of proposed classification rules in various settings vis-a-vis a robust mixed-data discrimination method are reported as well. We give an example utilizing data on croup in children.
Pages: 1021-1032
Number of pages: 12