We consider the model-free feature screening problem, which aims to discard non-informative features before downstream analysis. Most existing feature screening approaches have at least quadratic computational cost with respect to the sample size n and thus may suffer from a heavy computational burden when n is large. To alleviate this burden, we propose a scalable model-free sure independence screening approach. The approach is based on the so-called sliced-Wasserstein dependency, a novel metric that measures the dependence between two random variables. Specifically, we quantify the dependence between two random variables by the sliced-Wasserstein distance between their joint distribution and the product of their marginal distributions. For a predictor matrix of size n × d, the computational cost of the proposed algorithm is of order O(nd log n), even when the response variable is multivariate. Theoretically, we show that the proposed method enjoys both the sure screening and rank consistency properties under mild regularity conditions. Numerical studies on various synthetic and real-world datasets demonstrate the superior performance of the proposed method in comparison with mainstream competitors, while requiring significantly less computational time. Supplementary materials for this article are available online.
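To make the idea of comparing a joint distribution with the product of its marginals concrete, the following is a minimal Python sketch, not the authors' estimator. It rests on several assumptions that do not come from the abstract: the product of marginals is approximated by randomly permuting the response, the sliced-Wasserstein distance is approximated by averaging one-dimensional Wasserstein-1 distances over random projection directions, and each variable is standardized first. The names `sw_dependence`, `screen`, `n_proj`, and `keep` are hypothetical. Each one-dimensional distance only requires sorting, so the cost per feature is O(n log n) for a fixed number of projections.

```python
import numpy as np


def sw_dependence(x, y, n_proj=50, seed=0):
    """Illustrative sliced-Wasserstein dependence score between x and y.

    Assumption: the product of marginals is approximated by permuting y,
    and the sliced distance by a Monte Carlo average over random directions.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    # Standardize so the score does not depend on measurement units.
    x = (x - x.mean(axis=0)) / x.std(axis=0)
    y = (y - y.mean(axis=0)) / y.std(axis=0)
    n = x.shape[0]

    joint = np.hstack([x, y])                      # sample from the joint
    indep = np.hstack([x, y[rng.permutation(n)]])  # pairing broken by permutation

    # Random unit directions, one per column.
    dirs = rng.standard_normal((joint.shape[1], n_proj))
    dirs /= np.linalg.norm(dirs, axis=0)

    # 1-D Wasserstein-1 between equal-size samples: mean absolute
    # difference of sorted projections (O(n log n) per direction).
    proj_joint = np.sort(joint @ dirs, axis=0)
    proj_indep = np.sort(indep @ dirs, axis=0)
    return float(np.mean(np.abs(proj_joint - proj_indep)))


def screen(X, y, keep, n_proj=50, seed=0):
    """Rank the columns of X by dependence with y and keep the top `keep`."""
    scores = np.array(
        [sw_dependence(X[:, j], y, n_proj, seed) for j in range(X.shape[1])]
    )
    top = np.argsort(-scores)[:keep]
    return top, scores


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 20))
    y = 3.0 * X[:, 3] + 0.5 * rng.standard_normal(500)  # only feature 3 is informative
    top, scores = screen(X, y, keep=3)
    print("selected features:", top)
```

Because `y` is reshaped to a matrix, the same function accepts a multivariate response without modification; only the dimension of the projection directions changes.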