Can Large Language Models be Anomaly Detectors for Time Series?

Cited by: 1
Authors
Alnegheimish, Sarah [1]
Nguyen, Linh [1]
Berti-Equille, Laure [2]
Veeramachaneni, Kalyan [1]
Affiliations
[1] MIT, LIDS, Cambridge, MA 02139 USA
[2] IRD ESPACE DEV, Marseille, France
Source
2024 IEEE 11TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, DSAA 2024 | 2024
Keywords
anomaly detection; time series; large language models;
DOI
10.1109/DSAA61799.2024.10722786
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The flexible nature of large language models allows them to be used for diverse applications. Recent studies have showcased numerous abilities of these models, including performing time series forecasting. In this paper, we present a novel study of large language models used for the challenging task of time series anomaly detection. This problem entails two novel aspects for LLMs specifically: first, the model needs to be able to identify part of an input sequence (or multiple parts) as anomalous; and second, the model needs to work with time series data rather than with text input. We introduce SIGLLM, a framework for time series anomaly detection using large language models. Our framework includes a time-series-to-text conversion module, as well as end-to-end pipelines that prompt language models to perform time series anomaly detection. We investigate two paradigms for testing the abilities of large language models to perform the detection task. First, we present a prompt-based detection method that directly asks a language model to indicate which elements of the input are anomalies. Second, we leverage the forecasting capability of a large language model to guide the anomaly detection process. We evaluated our framework on 11 datasets spanning various sources and 10 pipelines. We show that the forecasting method significantly outperformed the prompting method on all 11 datasets with respect to the F1 score. Moreover, while large language models are capable of finding anomalies, state-of-the-art deep learning models are still superior in performance, achieving a 30% improvement over large language models.
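The two ideas named in the abstract, converting a time series to text and using forecasts to guide detection, can be illustrated with a minimal Python sketch. This is not the SIGLLM implementation: the function names series_to_text and flag_by_forecast_error, the shift-and-scale encoding, and the mean-plus-k-standard-deviations threshold are all assumptions chosen for illustration, and the forecast is passed in as a plain list rather than obtained from a language model.

```python
from typing import List, Sequence


def series_to_text(values: Sequence[float], decimals: int = 2) -> str:
    """Hypothetical encoding: shift values to be non-negative, scale them
    to integers, and join them with commas so they can go into a prompt."""
    low = min(values)
    scale = 10 ** decimals
    return ",".join(str(round((v - low) * scale)) for v in values)


def flag_by_forecast_error(
    actual: Sequence[float], forecast: Sequence[float], k: float = 2.0
) -> List[int]:
    """Hypothetical detector: return the indices whose absolute forecast
    error exceeds mean + k * std of all absolute errors."""
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    if not errors:
        return []
    mean = sum(errors) / len(errors)
    std = (sum((e - mean) ** 2 for e in errors) / len(errors)) ** 0.5
    return [i for i, e in enumerate(errors) if e > mean + k * std]


if __name__ == "__main__":
    print(series_to_text([1.5, 2.0, 1.75]))
    observed = [1.0] * 9 + [10.0]
    predicted = [1.0] * 10  # stand-in for a model's forecast
    print(flag_by_forecast_error(observed, predicted))
```

In this sketch the forecast-based path only needs numeric predictions, so any forecaster could be substituted for the stand-in list; the threshold k is a free parameter and would need tuning per dataset.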
Pages: 218-227
Number of pages: 10