Efficient Classification of Long Documents via State-Space Models

Authors
Lu, Peng [1 ,2 ]
Wang, Suyuchen [1 ,2 ]
Rezagholizadeh, Mehdi [1 ]
Liu, Bang [2 ]
Kobyzev, Ivan [1 ]
Affiliations
[1] Huawei Noah's Ark Lab, Toronto, ON, Canada
[2] Université de Montréal, Department of Computer Science and Operations Research, Montreal, QC, Canada
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer-based models have achieved state-of-the-art performance on numerous NLP applications. However, long documents, which are prevalent in real-world scenarios, cannot be processed efficiently by transformers with the vanilla self-attention module because of its quadratic computational complexity and limited length-extrapolation ability. Instead of tackling the computational difficulty of self-attention with sparse or hierarchical structures, in this paper we investigate the use of State-Space Models (SSMs) for long document classification tasks. We conducted extensive experiments on six long document classification datasets, covering binary, multi-class, and multi-label classification, and compared SSMs (with and without pre-training) to self-attention-based models. We also introduce the SSM-pooler model and demonstrate that it achieves comparable performance while being on average 36% more efficient. Additionally, our method exhibits higher robustness to input noise, even in the extreme scenario of 40%.
Pages: 6559-6565
Page count: 7