Random projection ensemble conformal prediction for high-dimensional classification

Cited by: 1
Authors
Qian, Xiaoyu [1]
Wu, Jinru [1]
Wei, Ligong [3]
Lin, Youwu [1, 2, 4]
Affiliations
[1] Guilin Univ Elect Technol, Guangxi Coll & Univ Key Lab Data Anal & Computat, Sch Math & Comp Sci, Guilin 541002, Peoples R China
[2] Ctr Appl Math Guangxi GUET, Guilin 541002, Peoples R China
[3] Guangxi Acad Sci High Tech Grp Co LTD, Nanning 530007, Peoples R China
[4] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Keywords
High-dimensional classification; Conformal prediction; Random projection; Ensemble algorithm; Johnson-Lindenstrauss
DOI
10.1016/j.chemolab.2024.105225
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In classification problems, many high-performing models fail to provide confidence estimates or intervals for individual predictions. This lack of reliability information poses risks in real-world applications and makes these models difficult to trust. Conformal prediction, a distribution-free and model-free approach with a finite-sample coverage guarantee, has recently been widely used to construct prediction sets for classification models. However, traditional conformal prediction methods produce only set-valued results without specifying a definitive predicted class. In complex settings in particular, these methods do not help models address challenges such as high dimensionality, which results in ambiguous prediction sets with low statistical efficiency, i.e., prediction sets that contain many false classes. In this study, a novel ensemble conformal prediction algorithm based on random projection and a designed voting strategy, RPECP, is developed to tackle these challenges. First, a procedure for selecting approximately oracle random projections and classifiers is executed to best leverage the internal information and structure of the data. Next, based on the selected approximately oracle random projections and underlying classifiers, conformal prediction is performed on new test samples in a lower-dimensional space, yielding multiple independent prediction sets. Finally, an accurate predicted class and a precise prediction set with high coverage and statistical efficiency are produced through the designed voting strategy. Compared with several base classifiers, RPECP obtains higher classification accuracy; compared with other conformal prediction algorithms, it achieves less ambiguous prediction sets with fewer false classes while guaranteeing high coverage. For illustration, this paper demonstrates RPECP's superiority over other methods in four cases: two high-dimensional settings and two real-world datasets.
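The three stages named in the abstract (projection selection, conformal prediction in the projected space, voting) can be illustrated with a minimal sketch. It is not the authors' RPECP implementation: the nearest-centroid base classifier, selection by training accuracy, the `1 - p_true` nonconformity score, the majority-vote rule, and every function and parameter name below are assumptions made for illustration only.

```python
import numpy as np

def centroid_probs(X_train, y_train, X, n_classes):
    """Hypothetical base classifier: softmax over scaled squared distances to class centroids."""
    centroids = np.stack([X_train[y_train == k].mean(axis=0) for k in range(n_classes)])
    sigma2 = (X_train - centroids[y_train]).var()  # pooled within-class variance (assumed > 0)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / (2.0 * sigma2)
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def conformal_sets(probs_cal, y_cal, probs_test, alpha):
    """Split conformal prediction with score 1 - p(true class); returns a boolean membership matrix."""
    n = len(y_cal)
    scores = 1.0 - probs_cal[np.arange(n), y_cal]
    k = int(np.ceil((n + 1) * (1 - alpha)))        # finite-sample corrected rank
    qhat = np.inf if k > n else np.sort(scores)[k - 1]
    return (1.0 - probs_test) <= qhat

def rp_ensemble_conformal(X_tr, y_tr, X_cal, y_cal, X_te, n_classes,
                          d=5, n_proj=50, n_keep=10, alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)
    p = X_tr.shape[1]
    # Stage 1 (assumed rule): draw Gaussian projections and keep those with the best training accuracy.
    candidates = []
    for _ in range(n_proj):
        A = rng.normal(size=(p, d)) / np.sqrt(d)
        Z = X_tr @ A
        acc = (centroid_probs(Z, y_tr, Z, n_classes).argmax(axis=1) == y_tr).mean()
        candidates.append((acc, A))
    candidates.sort(key=lambda t: t[0], reverse=True)
    kept = [A for _, A in candidates[:n_keep]]
    # Stage 2: one conformal prediction set per kept projection, in the low-dimensional space.
    votes = np.zeros((X_te.shape[0], n_classes))
    prob_sum = np.zeros((X_te.shape[0], n_classes))
    for A in kept:
        Z_tr, Z_cal, Z_te = X_tr @ A, X_cal @ A, X_te @ A
        probs_cal = centroid_probs(Z_tr, y_tr, Z_cal, n_classes)
        probs_te = centroid_probs(Z_tr, y_tr, Z_te, n_classes)
        votes += conformal_sets(probs_cal, y_cal, probs_te, alpha)
        prob_sum += probs_te
    # Stage 3 (assumed rule): keep a class if most sets contain it; predict by averaged probability.
    final_sets = votes > len(kept) / 2
    y_hat = prob_sum.argmax(axis=1)
    return y_hat, final_sets

# Synthetic high-dimensional example: 3 classes, 100 features.
rng = np.random.default_rng(1)
n, p, K = 600, 100, 3
means = 2.0 * rng.normal(size=(K, p))
y = rng.integers(0, K, size=n)
X = means[y] + rng.normal(size=(n, p))
X_tr, y_tr = X[:300], y[:300]
X_cal, y_cal = X[300:450], y[300:450]
X_te, y_te = X[450:], y[450:]

y_hat, sets = rp_ensemble_conformal(X_tr, y_tr, X_cal, y_cal, X_te, K)
print("accuracy:", (y_hat == y_te).mean())
print("coverage:", sets[np.arange(len(y_te)), y_te].mean())
print("mean set size:", sets.sum(axis=1).mean())
```

Projections are selected on the training split only, so the calibration split stays untouched by the selection step. Merging sets by majority vote does not in general preserve the nominal 1 - alpha level of each individual set, so the coverage printed here should be read as an empirical check rather than a guarantee.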
Pages: 10
Related papers
50 in total
  • [1] Random projection ensemble conformal prediction for high-dimensional classification (Vol 253, 105225, 2024)
    Qian, Xiaoyu
    Wu, Jinru
    Wei, Ligong
    Lin, Youwu
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2024, 254
  • [2] Random projection ensemble classification with high-dimensional time series
    Zhang, Fuli
    Chan, Kung-Sik
    BIOMETRICS, 2023, 79 (02) : 964 - 974
  • [3] Classification in High-Dimensional Feature Spaces: Random Subsample Ensemble
    Serpen, Gursel
    Pathical, Santhosh
    EIGHTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2009, : 740 - 745
  • [4] Targeted Random Projection for Prediction From High-Dimensional Features
    Mukhopadhyay, Minerva
    Dunson, David B.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115 (532) : 1998 - 2010
  • [5] High-Dimensional Ensemble Learning Classification: An Ensemble Learning Classification Algorithm Based on High-Dimensional Feature Space Reconstruction
    Zhao, Miao
    Ye, Ning
    APPLIED SCIENCES-BASEL, 2024, 14 (05):
  • [6] Ensemble Method for Classification of High-Dimensional Data
    Piao, Yongjun
    Park, Hyun Woo
    Jin, Cheng Hao
    Ryu, Keun Ho
    2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 245 - +
  • [7] Random-projection ensemble classification
    Cannings, Timothy I.
    Samworth, Richard J.
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2017, 79 (04) : 959 - 1035
  • [8] Ensemble of optimal trees, random forest and random projection ensemble classification
    Khan, Zardad
    Gul, Asma
    Perperoglou, Aris
    Miftahuddin, Miftahuddin
    Mahmoud, Osama
    Adler, Werner
    Lausen, Berthold
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2020, 14 (01) : 97 - 116
  • [9] Ensemble of optimal trees, random forest and random projection ensemble classification
    Zardad Khan
    Asma Gul
    Aris Perperoglou
    Miftahuddin Miftahuddin
    Osama Mahmoud
    Werner Adler
    Berthold Lausen
    Advances in Data Analysis and Classification, 2020, 14 : 97 - 116
  • [10] High-dimensional data classification model based on random projection and Bagging-support vector machine
    Sun, Yujia
    Platos, Jan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (09):