Machine and deep learning performance in out-of-distribution regressions

Cited by: 0
Authors
Shmuel, Assaf [1]
Glickman, Oren [1]
Lazebnik, Teddy [2]
Affiliations
[1] Department of Computer Science, Bar Ilan University, Ramat Gan
[2] Department of Cancer Biology, Cancer Institute, University College London, London
Source
Machine Learning: Science and Technology | 2024 / Vol. 5 / No. 04
Keywords
data-driven model generalization; feature engineering; machine learning robustness; out of distribution; symbolic regression
DOI
10.1088/2632-2153/ada221
Abstract
Machine learning (ML) and deep learning (DL) models are gaining popularity due to their effectiveness in many computational tasks. These models rest on an intuitive, but frequently unsatisfied, assumption that the training data represents the task at hand well. When this assumption fails, the out-of-distribution (OOD) challenge arises, which can cause an unexpected drop in a data-driven model's performance. In this study, we evaluate the performance of various ML and DL models in in-distribution (ID) versus OOD prediction. While the degradation in OOD performance is well acknowledged, to the best of our knowledge, this is one of the first studies to quantify it for various models on a large benchmark of n = 15 real-world regression datasets. We extensively (n > 40 000 runs) compare the ID versus OOD performance of XGBoost, random forest, K-nearest-neighbors, support vector machine, and linear regression models, as well as AutoML models (Tree-based Pipeline Optimization Tool and AutoKeras). In addition, to tackle this challenge, we propose integrating symbolic regression (SR), used as a feature engineering method, with an ML or DL model to improve its performance on OOD samples. Our results show that incorporating SR-derived features significantly enhances the predictive capabilities of both ML and DL models on OOD samples, by 3.70% and 10.20% on average, respectively, without reducing ID performance and in fact improving it to a slightly lower extent. As such, this method can help produce more generalized and robust data-driven models. © 2025 The Author(s). Published by IOP Publishing Ltd.
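The ID-versus-OOD comparison described in the abstract can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic target y = x^2, the range-based ID/OOD split, the hand-written least-squares fit, and the x^2 column, which merely stands in for a feature that a symbolic regression tool might discover. This record does not describe the authors' actual splitting protocol, SR tool, or metric.

```python
import random

def fit_ols(rows, targets):
    """Least-squares fit with intercept, solved via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, targets))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coef, rows):
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]

def mae(pred, true):
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def evaluate(featurize, train, test_id, test_ood):
    """Fit on the training split, return (ID error, OOD error)."""
    coef = fit_ols([featurize(x) for x, _ in train], [y for _, y in train])

    def score(split):
        preds = predict(coef, [featurize(x) for x, _ in split])
        return mae(preds, [y for _, y in split])

    return score(test_id), score(test_ood)

rng = random.Random(0)

def sample(lo, hi, n):
    # Hypothetical ground truth y = x^2 with small Gaussian noise
    return [(x, x * x + rng.gauss(0, 0.01))
            for x in (rng.uniform(lo, hi) for _ in range(n))]

train = sample(0.0, 1.0, 200)     # training range
test_id = sample(0.0, 1.0, 100)   # in-distribution: same range as training
test_ood = sample(2.0, 3.0, 100)  # out-of-distribution: unseen range

# Baseline uses the raw feature only; the augmented model adds x^2 as a
# stand-in for an SR-derived feature (an assumption, not the paper's pipeline).
raw_id, raw_ood = evaluate(lambda x: [x], train, test_id, test_ood)
sr_id, sr_ood = evaluate(lambda x: [x, x * x], train, test_id, test_ood)

print(f"raw features:     ID MAE={raw_id:.4f}  OOD MAE={raw_ood:.4f}")
print(f"with x^2 feature: ID MAE={sr_id:.4f}  OOD MAE={sr_ood:.4f}")
```

In this toy setting the baseline's error grows sharply outside the training range, while the model given the engineered feature extrapolates far better. The example is deliberately favorable, since the added feature matches the true functional form exactly; real SR-derived features would only approximate it.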