Sintel: A Machine Learning Framework to Extract Insights from Signals

Cited by: 6
Authors
Alnegheimish, Sarah [1]
Liu, Dongyu [1]
Sala, Carles [1]
Berti-Equille, Laure [2]
Veeramachaneni, Kalyan [1]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] IRD, Marseille, France
Source
PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22) | 2022
Keywords
Machine Learning Framework; Anomaly Detection; Human-In-the-Loop AI; Time Series Data; Data Science Pipeline; Time-Series
DOI
10.1145/3514221.3517910
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The detection of anomalies in time series data is a critical task with many monitoring applications. Existing systems often fail to encompass an end-to-end detection process, to facilitate comparative analysis of various anomaly detection methods, or to incorporate human knowledge to refine output. This precludes current methods from being used in real-world settings by practitioners who are not ML experts. In this paper, we introduce Sintel, a machine learning framework for end-to-end time series tasks such as anomaly detection. The framework uses state-of-the-art approaches to support all steps of the anomaly detection process. Sintel logs the entire anomaly detection journey, providing detailed documentation of anomalies over time. It enables users to analyze signals, compare methods, and investigate anomalies through an interactive visualization tool, where they can annotate, modify, create, and remove events. Using these annotations, the framework leverages human knowledge to improve the anomaly detection pipeline. We demonstrate the usability, efficiency, and effectiveness of Sintel through a series of experiments on three public time series datasets, as well as one real-world use case involving spacecraft experts tasked with anomaly analysis tasks. Sintel's framework, code, and datasets are open-sourced at https://github.com/sintel-dev/.
Pages: 1855-1865
Page count: 11