Robust and flexible learning of a high-dimensional classification rule using auxiliary outcomes

Cited by: 0

Authors:

Liang, Muxuan [1]

Park, Jaeyoung [2]

Lu, Qing [1]

Zhong, Xiang [3]

Affiliations:

[1] Univ Florida, Dept Biostat, 2004 Mowry Rd, 5th Floor CTRB, Gainesville, FL 32611 USA

[2] Univ Cent Florida, Sch Global Hlth Management & Informat, Orlando, FL 32816 USA

[3] Univ Florida, Dept Ind & Syst Engn, Gainesville, FL 32611 USA

Source:

BIOMETRICS | 2024, Vol. 80, Issue 04

Keywords:

auxiliary outcomes; classification; high-dimensional data; multi-task learning; transfer learning; MULTITASK; ALGORITHMS; PREDICT;

DOI:

10.1093/biomtc/ujae144

Chinese Library Classification (CLC) number:

Q [Biological Sciences]

Subject classification codes:

07; 0710; 09

Abstract:

Correlated outcomes are common in many practical problems. In some settings, one outcome is of particular interest, and others are auxiliary. To leverage information shared by all the outcomes, traditional multi-task learning (MTL) minimizes an averaged loss function over all the outcomes, which may lead to biased estimation for the target outcome, especially when the MTL model is misspecified. In this work, based on a decomposition of estimation bias into two types, within-subspace and against-subspace, we develop a robust transfer learning approach to estimating a high-dimensional linear decision rule for the outcome of interest in the presence of auxiliary outcomes. The proposed method includes an MTL step using all outcomes to gain efficiency and a subsequent calibration step using only the outcome of interest to correct both types of biases. We show that the final estimator can achieve a lower estimation error than the one using only the single outcome of interest. Simulations and real data analysis are conducted to justify the superiority of the proposed method.
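
The two-step structure described in the abstract (an MTL step on all outcomes, then a calibration step on the outcome of interest alone) can be illustrated with a rough sketch. This is not the authors' algorithm: the abstract does not specify the estimator, so the sketch substitutes a generic pooled L1-penalized logistic regression for the MTL step and a sparse offset correction for the calibration step, and it does not implement the within-subspace and against-subspace bias decomposition. The function names (`l1_logistic`, `two_step_rule`), the penalty levels, the absence of an intercept, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def l1_logistic(X, y, lam, offset=None, n_iter=300):
    """L1-penalized logistic regression (no intercept) via proximal gradient descent."""
    n, p = X.shape
    if offset is None:
        offset = np.zeros(n)
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature of the averaged logistic loss.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta + offset) - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

def two_step_rule(X, Y, target=0, lam_mtl=0.02, lam_cal=0.05):
    """Hypothetical two-step estimator: pooled fit, then target-only correction."""
    K = Y.shape[1]
    # Step 1 (assumed MTL stand-in): minimize the loss averaged over all outcomes
    # with a single shared coefficient vector, by stacking the K outcomes.
    X_pool = np.tile(X, (K, 1))
    y_pool = Y.T.reshape(-1)
    beta_mtl = l1_logistic(X_pool, y_pool, lam_mtl)
    # Step 2 (assumed calibration stand-in): use only the target outcome and fit a
    # sparse correction, holding the step-1 linear predictor fixed as an offset.
    delta = l1_logistic(X, Y[:, target], lam_cal, offset=X @ beta_mtl)
    return beta_mtl, beta_mtl + delta

# Simulated data (illustrative only): the target outcome shares most of its signal
# with two auxiliary outcomes but has one extra active coefficient.
rng = np.random.default_rng(0)
n, p, K = 200, 20, 3
X = rng.standard_normal((n, p))
beta_shared = np.zeros(p)
beta_shared[:3] = 1.0
beta_target = beta_shared.copy()
beta_target[3] = 0.5
betas = [beta_target] + [beta_shared] * (K - 1)
Y = np.column_stack([(rng.random(n) < sigmoid(X @ b)).astype(float) for b in betas])

beta_mtl, beta_cal = two_step_rule(X, Y, target=0)
# Linear decision rule for the outcome of interest: classify by the sign of X @ beta_cal.
predicted = (X @ beta_cal > 0).astype(int)
```

In this sketch the pooled fit borrows strength from the auxiliary outcomes, while the offset correction is free to move the coefficients wherever the target outcome disagrees with the pooled direction. With a very large `lam_cal` the correction shrinks to zero and the rule falls back to the pooled estimate.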


Pages: 9
