Efficient Classification of Long Documents via State-Space Models

Authors
Lu, Peng [1 ,2 ]
Wang, Suyuchen [1 ,2 ]
Rezagholizadeh, Mehdi [1 ]
Liu, Bang [2 ]
Kobyzev, Ivan [1 ]
Affiliations
[1] Huawei Noah's Ark Lab, Toronto, ON, Canada
[2] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ, Canada
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer-based models have achieved state-of-the-art performance on numerous NLP applications. However, long documents, which are prevalent in real-world scenarios, cannot be processed efficiently by transformers with the vanilla self-attention module because of its quadratic computational complexity and limited length extrapolation ability. Instead of tackling the computational difficulty of self-attention with sparse or hierarchical structures, in this paper we investigate the use of State-Space Models (SSMs) for long document classification tasks. We conducted extensive experiments on six long document classification datasets, covering binary, multi-class, and multi-label classification, comparing SSMs (with and without pre-training) to self-attention-based models. We also introduce the SSM-pooler model and demonstrate that it achieves comparable performance while being on average 36% more efficient. Additionally, our method exhibits higher robustness to input noise, even in the extreme scenario of 40%.
Pages: 6559-6565
Page count: 7