Efficient Classification of Long Documents via State-Space Models

Cited by: 0

Authors:

Lu, Peng [1,2]

Wang, Suyuchen [1,2]

Rezagholizadeh, Mehdi [1]

Liu, Bang [2]

Kobyzev, Ivan [1]

Affiliations:

[1] Huawei Noah's Ark Lab, Toronto, ON, Canada

[2] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ, Canada

Source:

2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, EMNLP 2023 | 2023

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Transformer-based models have achieved state-of-the-art performance on numerous NLP applications. However, long documents, which are prevalent in real-world scenarios, cannot be processed efficiently by transformers with the vanilla self-attention module because of its quadratic computational complexity and limited length-extrapolation ability. Instead of tackling the computational difficulty of self-attention with sparse or hierarchical structures, in this paper we investigate the use of State-Space Models (SSMs) for long document classification tasks. We conducted extensive experiments on six long document classification datasets, covering binary, multi-class, and multi-label classification, comparing SSMs (with and without pre-training) to self-attention-based models. We also introduce the SSM-pooler model and demonstrate that it achieves comparable performance while being on average 36% more efficient. Additionally, our method exhibits higher robustness to input noise, even in the extreme scenario of 40%.


Pages: 6559-6565

Page count: 7
