Efficient Classification of Long Documents via State-Space Models

Cited by: 0
Authors
Lu, Peng [1 ,2 ]
Wang, Suyuchen [1 ,2 ]
Rezagholizadeh, Mehdi [1 ]
Liu, Bang [2 ]
Kobyzev, Ivan [1 ]
Affiliations
[1] Huawei Noah's Ark Lab, Toronto, ON, Canada
[2] Univ Montreal, Dept Comp Sci & Operat Res, Montreal, PQ, Canada
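Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract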
Transformer-based models have achieved state-of-the-art performance on numerous NLP applications. However, long documents, which are prevalent in real-world scenarios, cannot be processed efficiently by transformers with the vanilla self-attention module because of its quadratic computational complexity and limited length-extrapolation ability. Instead of tackling the computational difficulty of self-attention with sparse or hierarchical structures, in this paper we investigate the use of State-Space Models (SSMs) for long document classification tasks. We conducted extensive experiments on six long document classification datasets, covering binary, multi-class, and multi-label classification, and compared SSMs (with and without pre-training) to self-attention-based models. We also introduce the SSM-pooler model and demonstrate that it achieves comparable performance while being on average 36% more efficient. Additionally, our method exhibits higher robustness to input noise, even in the extreme scenario of 40% noise.
Pages: 6559-6565
Page count: 7