Optimal Ratio of Continuous to Categorical Variables for the Two Group Location Model

Cited by: 0

Authors:

Baah, Philemon [1]

Adebanji, Atinuke [1]

Kakaie, Romain Glele [2]

Affiliations:

[1] Kwame Nkrumah Univ Sci & Technol, Dept Math, Kumasi, Ghana

[2] Univ Abomey Calavi, Fac Agron Sci, Cotonou, Benin

Source:

INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS & STATISTICS | 2013 / Vol. 42 / No. 12

Keywords:

Location model; classification; categorical to continuous variables; contingency table; leave-one-out method;

DOI:

Not available

Chinese Library Classification:

O29 [Applied Mathematics];

Discipline classification code:

070104 ;

Abstract:

We investigated the effect of different ratios of $p$ continuous to $q$ categorical variables, and of an increasing group centroid separation function ($\delta = 1, 2, 3$), on the performance of the Location model for two groups ($\Pi_i$, $i = 1, 2$). The number of predictor variables was 4 or 8, with 1:3, 1:1 and 3:1 as the predetermined ratios for $p : q$. We generated samples from $N(\mu_1, I)$ of sizes 40, 80 and 120 with MatLab R2007b for the $p$ variables within $2^q$ binary cells in $\Pi_1$. The size of $\Pi_2$ was determined using sample ratios 1:1, 1:2, 1:3 and 1:4 for $n_1 : n_2$ within the $2^q$ cells. Group 1 has mean $\mu_1^{(1)} = 0$ in the first cell (for the $p$ continuous variables) and group 2 has $\mu_2^{(1)} = \delta$; in subsequent cells, $\mu_i^{(m+1)} = \mu_i^{(m)} + 1$. Error rates fell more rapidly with increasing $\delta$ than with increasing sample size (asymptotically). The optimal $p : q$ was 3:1, and the model deteriorated at 1:3, with larger variability. The 8-variable model performed better than the 4-variable model for large sample sizes at $p : q = 1 : 1$ and outperformed it for all sample sizes at $p : q = 3 : 1$. The results show that when the Location model is used for classification problems with as many categorical variables as continuous ones (or more), this should be compensated for with a larger distance function and larger sample sizes.
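
The simulation design described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' MatLab code: the function names (`cell_mean`, `simulate_group`, `design_grid`) are hypothetical, every one of the p continuous variables is assumed to share the same scalar cell mean, and observations are assumed to fall into the 2^q cells with equal probability, since the abstract does not specify either point.

```python
import random

def cell_mean(group, cell, delta):
    """Scalar mean for `group` (1 or 2) in 0-based `cell`.

    First cell: 0 for group 1, delta for group 2; each later cell adds 1.
    """
    start = 0.0 if group == 1 else float(delta)
    return start + cell

def simulate_group(group, n, p, q, delta, rng):
    """Draw n observations as (cell, x) pairs with x ~ N(mean * 1_p, I_p)."""
    n_cells = 2 ** q
    data = []
    for _ in range(n):
        cell = rng.randrange(n_cells)  # assumption: cells are equally likely
        mean = cell_mean(group, cell, delta)
        x = [rng.gauss(mean, 1.0) for _ in range(p)]  # identity covariance
        data.append((cell, x))
    return data

def design_grid():
    """(p, q) splits for 4 and 8 predictors at p:q ratios 1:3, 1:1 and 3:1."""
    ratios = [(1, 3), (1, 1), (3, 1)]
    return [(total * a // (a + b), total * b // (a + b))
            for total in (4, 8) for a, b in ratios]

rng = random.Random(0)
p, q, delta, n1 = 3, 1, 2, 40
group1 = simulate_group(1, n1, p, q, delta, rng)
group2 = simulate_group(2, 2 * n1, p, q, delta, rng)  # n1 : n2 = 1 : 2
print(design_grid())
print(len(group1), len(group2))
```

Looping over `design_grid()`, the three values of delta, the three group 1 sizes and the four sample ratios would enumerate the full set of simulated configurations; the classification step and the leave-one-out error estimate are not shown here.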

Pages: 18 - 26

Page count: 9

[21] Graphics and statistics for cardiology: comparing categorical and continuous variables
Rice, Kenneth
Lumley, Thomas
HEART, 2016, 102 (05) : 349 - 355
[22] Optimal design of frame structures with mixed categorical and continuous design variables using the Gumbel-Softmax method
Ebrahimi, Mehran
Cheong, Hyunmin
Jayaraman, Pradeep Kumar
Javid, Farhad
STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION, 2024, 67 (03)
[23] Interpreting and Modeling the Nugget Effect for Mixed Categorical and Continuous Variables
Miguel-Silva, Victor
MINING METALLURGY & EXPLORATION, 2025, 42 (01) : 295 - 306
[24] A SGeMS code for pattern simulation of continuous and categorical variables: FILTERSIM
Wu, Jianbing
Boucher, Alexandre
Zhang, Tuanfeng
COMPUTERS & GEOSCIENCES, 2008, 34 (12) : 1863 - 1876
[25] The analysis of continuous and categorical independent variables: Alternatives to dichotomization.
Brauer, M
ANNEE PSYCHOLOGIQUE, 2002, 102 (03) : 449 - 484
[26] DISTANCE BETWEEN POPULATIONS USING MIXED CONTINUOUS AND CATEGORICAL VARIABLES
KRZANOWSKI, WJ
BIOMETRIKA, 1983, 70 (01) : 235 - 243
[27] Continuous, categorical, and time to event cocaine use outcome variables: degree of intercorrelation and sensitivity to treatment group differences
McKay, JR
Alterman, AI
Koppenhaver, JM
Mulvaney, FD
Bovasso, GB
Ward, K
DRUG AND ALCOHOL DEPENDENCE, 2001, 62 (01) : 19 - 30
[28] Informational distances and related statistics in mixed continuous and categorical variables
Morales, D
Pardo, L
Zografos, K
JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 1998, 75 (01) : 47 - 63
[29] ON MODEL-CHOICE IN DISCRIMINATION WITH CATEGORICAL VARIABLES
WERNECKE, KD
HAERTING, J
KALB, G
STUERZEBECHER, E
BIOMETRICAL JOURNAL, 1989, 31 (03) : 289 - 296
[30] Decomposing Group Differences of Latent Means of Ordered Categorical Variables within a Genetic Factor Model
Seung Bin Cho
Phillip K. Wood
Andrew C. Heath
Behavior Genetics, 2009, 39 : 101 - 122
