Feature screening for ultrahigh dimensional categorical data with covariates missing at random
Cited by: 9
Authors:
Ni, Lyu [1]
Fang, Fang [2]
Shao, Jun [2,3]
Affiliations:
[1] East China Normal Univ, Sch Data Sci & Engn, Shanghai, Peoples R China
[2] East China Normal Univ, Sch Stat, Key Lab Adv Theory & Applicat Stat & Data Sci MOE, Shanghai, Peoples R China
[3] Univ Wisconsin, Dept Stat, Madison, WI 53706 USA
Feature screening;
Missing at random;
Missing covariate;
Pearson Chi-Square statistic;
Sure screening property;
VARIABLE SELECTION;
KOLMOGOROV FILTER;
MODEL SELECTION;
REGRESSION;
DOI: 10.1016/j.csda.2019.106824
Chinese Library Classification number:
TP39 [Computer applications];
Discipline classification codes:
081203;
0835;
Abstract:
Most existing feature screening methods assume that the data are fully observed. Developing screening methods for incomplete data is challenging because traditional missing data techniques cannot be applied directly in the ultrahigh dimensional case. A two-step, model-free feature screening procedure is developed for ultrahigh dimensional categorical data in which some covariate values are missing at random. For each covariate with missing data, the first step screens for the variables that enter the unspecified propensity function. In the second step, screening statistics such as the adjusted Pearson chi-square statistic are calculated by leveraging the variables obtained in the first step and the special structure of categorical data. Sure screening properties are established for the proposed method. Finite sample performance is investigated through simulation studies and a real data example. (C) 2019 Elsevier B.V. All rights reserved.
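The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' exact statistic, since the abstract gives no formulas. The sketch assumes integer-coded categorical data with missing entries coded as -1, and it assumes that the missingness of a covariate depends only on the response and on one fully observed covariate. The function names (`pearson_chi2`, `joint_table`, `adjusted_joint`, `screen`) are hypothetical. Step 1 is simplified to picking the single fully observed covariate most associated with the missingness indicator. Step 2 estimates the joint distribution by stratifying on that covariate, using the identity P(Y=r, X=k) = sum over u of P(X=k | Y=r, U=u, observed) P(Y=r, U=u), which holds under the stated missing-at-random assumption.

```python
import numpy as np

def pearson_chi2(joint):
    """Pearson chi-square-type index computed from joint probabilities p[r, k]."""
    row = joint.sum(axis=1, keepdims=True)
    col = joint.sum(axis=0, keepdims=True)
    expected = row * col
    mask = expected > 0
    return float(((joint - expected)[mask] ** 2 / expected[mask]).sum())

def joint_table(a, b, n_a, n_b):
    """Empirical joint probabilities of two integer-coded variables."""
    table = np.zeros((n_a, n_b))
    np.add.at(table, (a, b), 1.0)
    return table / max(len(a), 1)

def adjusted_joint(y, x, u, n_y, n_x, n_u):
    """Estimate P(Y=r, X=k) when x has missing entries coded as -1.

    Assumption: missingness in x depends only on (y, u), both fully observed.
    """
    observed = x >= 0
    joint = np.zeros((n_y, n_x))
    for r in range(n_y):
        for s in range(n_u):
            cell = (y == r) & (u == s)
            cell_obs = cell & observed
            if cell_obs.sum() == 0:
                continue  # no observed x in this stratum; it contributes nothing
            cond = np.bincount(x[cell_obs], minlength=n_x) / cell_obs.sum()
            joint[r] += cond * cell.mean()
    return joint

def screen(y, X, n_levels, n_y):
    """Return one screening score per column of X (missing entries coded -1)."""
    n, p = X.shape
    complete = [j for j in range(p) if (X[:, j] >= 0).all()]
    scores = np.zeros(p)
    for j in range(p):
        x = X[:, j]
        delta = (x >= 0).astype(int)
        if delta.all():
            # Fully observed covariate: ordinary chi-square index.
            scores[j] = pearson_chi2(joint_table(y, x, n_y, n_levels))
            continue
        if complete:
            # Step 1 (simplified): fully observed covariate most associated
            # with the missingness indicator of column j.
            assoc = [pearson_chi2(joint_table(delta, X[:, c], 2, n_levels))
                     for c in complete]
            u = X[:, complete[int(np.argmax(assoc))]]
        else:
            u = np.zeros(n, dtype=int)  # fall back to stratifying on y only
        # Step 2: chi-square index from the stratified joint estimate.
        scores[j] = pearson_chi2(adjusted_joint(y, x, u, n_y, n_levels, n_levels))
    return scores
```

Covariates would then be ranked by score and the top-ranked ones retained. How many to keep, and how many variables to carry over from step 1, are tuning choices that this sketch leaves open.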