Estimating non-overfitted convex production technologies: A stochastic machine learning approach

Authors
Guillen, Maria D. [1]
Charles, Vincent [2]
Aparicio, Juan [1, 3]
Affiliations
[1] Miguel Hernandez Univ Elche, Ctr Operat Res, Avda Univ S-N, Elche 03202, Spain
[2] Queens Univ Belfast, Queens Business Sch, Belfast BT9 5EE, Northern Ireland
[3] ValgrAI Valencian Grad Sch & Res Network Artificia, Joint Res Unit, Camino Vera S-N, Valencia 46022, Spain
Keywords
Data Envelopment Analysis; Technical efficiency measurement; Stochastic gradient boosting; Machine learning; Measuring efficiency; Bootstrap; Models; DEA
DOI
10.1016/j.ejor.2024.11.030
Chinese Library Classification number
C93 [Management science]
Discipline classification code
12; 1201; 1202; 120202
Abstract
Overfitting is a classical statistical issue that occurs when a model fits a particular observed data sample too closely, potentially limiting its generalizability. While Data Envelopment Analysis (DEA) is a powerful nonparametric method for assessing the relative efficiency of decision-making units (DMUs), its reliance on the minimal extrapolation principle can lead to concerns about overfitting, particularly when the goal extends beyond evaluating the specific DMUs in the sample to making broader inferences. In this paper, we propose an adaptation of Stochastic Gradient Boosting to estimate production possibility sets that mitigate overfitting while satisfying shape constraints such as convexity and free disposability. Our approach is not intended to replace DEA but to complement it, offering an additional tool for scenarios where generalization is important. Through simulation experiments, we demonstrate that the proposed method performs well compared to DEA, especially in high-dimensional settings. Furthermore, the new machine learning-based technique is compared to the Corrected Concave Non-parametric Least Squares (C2NLS), showing competitive performance. We also illustrate how the usual efficiency measures in DEA can be implemented under our approach. Finally, we provide an empirical example based on data from the Program for International Student Assessment (PISA) to demonstrate the applicability of the new method.
Pages: 224-240
Page count: 17
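
The abstract contrasts the proposed method with standard DEA and its minimal extrapolation principle. As a rough illustration of that baseline only (not the authors' boosting approach), the sketch below computes output-oriented DEA scores under variable returns to scale for a toy data set with one input and one output. The function name and the data are hypothetical. With a single input and a single output, the best attainable output at a given input level is reached either at one observed unit or on the segment between two observed units, so a search over pairs is enough and no LP solver is needed.

```python
from itertools import combinations


def dea_vrs_output_efficiency(xs, ys, x0, y0):
    """Output-oriented VRS DEA score for one input and one output.

    Returns the factor by which output y0 could be scaled up while staying
    inside the convex, free-disposal hull of the observed (x, y) points
    and using no more than x0 units of input. A score of 1 means the unit
    lies on the estimated frontier.
    """
    points = list(zip(xs, ys))

    # Free disposability: any observed unit using at most x0 is attainable.
    feasible = [y for x, y in points if x <= x0]
    if not feasible:
        raise ValueError("no observed unit uses input <= x0")
    best = max(feasible)

    # Convexity: interpolate between a unit below x0 and a unit above x0.
    for (xa, ya), (xb, yb) in combinations(points, 2):
        (xl, yl), (xh, yh) = sorted([(xa, ya), (xb, yb)])
        if xl < x0 < xh:
            w = (x0 - xl) / (xh - xl)
            best = max(best, (1 - w) * yl + w * yh)

    return best / y0


# Hypothetical toy sample: four units A, B, C, D.
inputs = [1.0, 2.0, 4.0, 3.0]
outputs = [1.0, 3.0, 4.0, 2.0]

scores = [
    dea_vrs_output_efficiency(inputs, outputs, x, y)
    for x, y in zip(inputs, outputs)
]
print(scores)
```

In this toy sample the first three units define the frontier and score 1 by construction, while the fourth is benchmarked against the segment joining the second and third units. The frontier wraps the observed data as tightly as the shape constraints allow, which is the behaviour the abstract associates with overfitting concerns.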