A Data-Centric AI Paradigm for Socio-Industrial and Global Challenges

Times cited: 1
Authors
Majeed, Abdul [1]
Hwang, Seong Oun [1]
Affiliations
[1] Gachon Univ, Dept Comp Engn, Seongnam 13120, South Korea
Funding
National Research Foundation of Singapore;
Keywords
AI technology development; data-centric AI; data quality; AI models; model-centric AI; scarce training data; global problems; social issues; artificial intelligence; hyper-parameter tuning; ARTIFICIAL-INTELLIGENCE; MODELS; PLATFORM;
DOI
10.3390/electronics13112156
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Owing to large investments by both the public and private sectors, artificial intelligence (AI) has made tremendous progress in solving multiple real-world problems such as disease diagnosis, chatbot misbehavior, and crime control. However, the large-scale development and widespread adoption of AI have been hindered by a model-centric mindset that focuses only on improving the code/architecture of AI models (e.g., tweaking the network architecture, shrinking the model size, or tuning hyper-parameters). Generally, AI encompasses a model (or code) that solves a given problem by extracting salient features from the underlying data. When the AI model yields low performance, developers iteratively improve the code/algorithm without paying due attention to other aspects such as data. This model-centric AI (MC-AI) approach is limited to the few businesses/applications (language models, text analysis, etc.) where big data readily exist, and it cannot offer a feasible solution when good data are not available. In many real-world cases, however, giant datasets either do not exist or cannot be curated. The AI community is therefore searching for ways to compensate for the lack of giant datasets without compromising model performance. In this context, we need a data-centric AI (DC-AI) approach to solve the problems faced by the conventional MC-AI approach and to extend the applicability of AI technology to domains where data are limited. From this perspective, we analyze and compare MC-AI and DC-AI and highlight their working mechanisms. We then describe the crucial problems (social, performance, drift, affordance, etc.) of the conventional MC-AI approach and identify opportunities to solve them with DC-AI. We also provide details concerning the development of the DC-AI approach and discuss many techniques that are vital in bringing DC-AI from theory to practice. Finally, we highlight enabling technologies that can contribute to realizing DC-AI and discuss various noteworthy use cases where DC-AI is more suitable than MC-AI. Through this analysis, we intend to open up a new direction in AI technology to solve global problems (e.g., climate change, supply chain disruption) that threaten human well-being around the globe.
Number of pages: 48