Efficient Classification of Long Documents via State-Space Models

Authors
Lu, Peng [1, 2]
Wang, Suyuchen [1, 2]
Rezagholizadeh, Mehdi [1 ]
Liu, Bang [2 ]
Kobyzev, Ivan [1 ]
Affiliations
[1] Huawei Noah's Ark Lab, Toronto, ON, Canada
[2] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ, Canada
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Transformer-based models have achieved state-of-the-art performance on numerous NLP applications. However, long documents, which are prevalent in real-world scenarios, cannot be efficiently processed by transformers with the vanilla self-attention module because of its quadratic computational complexity and limited length-extrapolation ability. Instead of tackling the computational cost of self-attention with sparse or hierarchical structures, in this paper we investigate the use of State-Space Models (SSMs) for long document classification tasks. We conducted extensive experiments on six long document classification datasets, covering binary, multi-class, and multi-label classification, and compared SSMs (with and without pre-training) to self-attention-based models. We also introduce the SSM-pooler model and demonstrate that it achieves comparable performance while being on average 36% more efficient. Additionally, our method exhibits higher robustness to input noise, even in the extreme scenario of 40% noise.
Pages: 6559-6565
Number of pages: 7