Data-centric AI: Techniques and Future Perspectives

Cited by: 9
Authors
Zha, Daochen [1]
Lai, Kwei-Herng [2]
Yang, Fan [3]
Zou, Na [4]
Gao, Huiji [1]
Hu, Xia [2]
Affiliations
[1] Airbnb Inc, San Francisco, CA 94103 USA
[2] Rice Univ, Houston, TX USA
[3] Wake Forest Univ, Winston Salem, NC USA
[4] Texas A&M Univ, College Stn, TX USA
DOI
10.1145/3580305.3599553
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The role of data in AI has been significantly magnified by the emerging concept of data-centric AI. In contrast to the traditional model-centric paradigm, which focuses on developing more effective models given fixed datasets, data-centric AI emphasizes the systematic engineering of data in building AI systems. However, as a new concept, many critical aspects of data-centric AI remain ambiguous, such as its definitions, associated tasks, algorithms, challenges, and benchmarks. This tutorial aims to review and discuss this emerging field, with a particular focus on the three general data-centric AI goals: training data development, inference data development, and data maintenance. The objective of this tutorial is threefold: (1) to formally categorize the field of data-centric AI using a goal-driven taxonomy and discuss the needs and challenges of each goal, (2) to comprehensively review the state-of-the-art techniques, and (3) to discuss the future perspectives and open research directions to inspire further innovations in this field.
Pages: 5839-5840
Page count: 2
Related papers
50 in total
  • [21] Data-centric Edge-AI: A Symbolic Representation Use Case
    Ilager, Shashikant
    De Maio, Vincenzo
    Lujic, Ivan
    Brandic, Ivona
    2023 IEEE INTERNATIONAL CONFERENCE ON EDGE COMPUTING AND COMMUNICATIONS, EDGE, 2023, : 301 - 308
  • [22] A Data-Centric AI Paradigm for Socio-Industrial and Global Challenges
    Majeed, Abdul
    Hwang, Seong Oun
    ELECTRONICS, 2024, 13 (11)
  • [23] Data-Centric Communication and Containerization for Future Automotive Software Architectures
    Kugele, Stefan
    Hettler, David
    Peter, Jan
    2018 IEEE 15TH INTERNATIONAL CONFERENCE ON SOFTWARE ARCHITECTURE (ICSA), 2018, : 65 - 74
  • [24] ydata-profiling: Accelerating data-centric AI with high-quality data
    Clemente, Fabiana
    Ribeiro, Goncalo Martins
    Quemy, Alexandre
    Santos, Miriam Seoane
    Pereira, Ricardo Cardoso
    Barros, Alex
    NEUROCOMPUTING, 2023, 554
  • [25] Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark
    Hansen, Lasse
    Seedat, Nabeel
    van der Schaar, Mihaela
    Petrovic, Andrija
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [26] Towards Unlocking the Hidden Potentials of the Data-Centric AI Paradigm in the Modern Era
    Majeed, Abdul
    Hwang, Seong Oun
    APPLIED SYSTEM INNOVATION, 2024, 7 (04)
  • [27] Data-centric automated data mining
    Campos, MM
    Stengard, PJ
    Milenova, BL
    ICMLA 2005: FOURTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, PROCEEDINGS, 2005, : 97 - 104
  • [28] Rapidly predicting Kohn–Sham total energy using data-centric AI
    Hasan Kurban
    Mustafa Kurban
    Mehmet M. Dalkilic
    Scientific Reports, 12
  • [29] RDF Data-Centric Storage
    Levandoski, Justin J.
    Mokbel, Mohamed F.
    2009 IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, VOLS 1 AND 2, 2009, : 911 - 918
  • [30] Data-Centric and Model-Centric AI: Twin Drivers of Compact and Robust Industry 4.0 Solutions
    Hamid, Oussama H.
    APPLIED SCIENCES-BASEL, 2023, 13 (05):