Scanflow: A multi-graph framework for Machine Learning workflow management, supervision, and debugging

Cited by: 5
Authors
Bravo-Rocca, Gusseppe [1]
Liu, Peini [1]
Guitart, Jordi [1, 2]
Dholakia, Ajay [3]
Ellison, David [3]
Falkanger, Jeffrey [3]
Hodak, Miroslav [3]
Affiliations
[1] Barcelona Supercomp Ctr BSC, Emerging Technol Artificial Intelligence, Barcelona, Spain
[2] Univ Politecn Catalunya UPC, Comp Architecture Dept, Barcelona, Spain
[3] Lenovo, Lenovo Infrastruct Solut Grp, Morrisville, NC USA
Keywords
Machine Learning; Symbolic knowledge; Graph; Robustness; Containerization; Concept drift
DOI
10.1016/j.eswa.2022.117232
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine Learning (ML) is more than just training models; the whole workflow must be considered. Once deployed, an ML model needs to be constantly monitored, supervised, and debugged to guarantee its validity and robustness in unexpected situations. Debugging in ML aims to identify (and address) model weaknesses in non-trivial contexts. Several techniques have been proposed to identify different types of model weaknesses, such as bias in classification, model decay, and adversarial attacks, yet there is no generic framework that allows them to work in a collaborative, modular, portable, and iterative way and that is, more importantly, flexible enough to support both human- and machine-driven techniques. In this paper, we propose a novel containerized directed graph framework to support and accelerate end-to-end ML workflow management, supervision, and debugging. The framework allows defining and deploying ML workflows in containers, tracking their metadata, checking their behavior in production, and improving the models by using both learned and human-provided knowledge. We demonstrate these capabilities by integrating into the framework two hybrid systems for detecting data distribution drift. These systems identify samples that are far from the latent space of the original distribution, ask for human intervention, and then either retrain the model or wrap it with a filter that removes the noise of corrupted data at inference time. We test these systems on the MNIST-C, CIFAR-10-C, and FashionMNIST-C datasets, obtaining promising accuracy results with the help of human involvement.
Pages: 19