Enhancing Collaboration and Agility in Data-Centric AI Projects

Cited by: 0
Authors
Stieler, Fabian [1]
Bauer, Bernhard [1]
Affiliations
[1] Univ Augsburg, Software Methodol Distributed Syst, Augsburg, Germany
Source
EVALUATION OF NOVEL APPROACHES TO SOFTWARE ENGINEERING, ENASE 2023 | 2024, Vol. 2028
Keywords
Software engineering for machine learning; Agile development; MLOps; AI development; Data-centric AI;
DOI
10.1007/978-3-031-64182-4_15
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Mature Artificial Intelligence (AI) projects are usually developed by teams that combine several roles, such as data engineers, data scientists, software engineers and machine learning (ML) engineers. These roles often pursue highly heterogeneous approaches, leading to new challenges in collaboration, particularly regarding software quality, data versioning and the traceability of model metrics and other resulting artifacts. These challenges are further intensified when AI projects rely on dynamic datasets, introducing an entirely new dimension that teams must deal with. Adopting principles from the machine learning operations (MLOps) paradigm becomes essential in this context. To go beyond existing process models and develop actionable guidelines, our work introduces a Git workflow for AI projects. We present basic instructions for data and code while outlining a minimal infrastructure setup. Building upon abstract concepts, we delve into concrete, actionable steps by examining the proposed branching workflow. Through a case study, we apply the development methodology to two use cases and demonstrate that the principles and approaches positively impact project outcomes.
Pages: 321-343
Page count: 23