Hierarchical and Self-Attended Sequence Autoencoder

Times Cited: 17
Authors
Chien, Jen-Tzung [1 ]
Wang, Chun-Wei [1 ]
Affiliations
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu 30010, Taiwan
Keywords
Decoding; Stochastic processes; Training; Semantics; Recurrent neural networks; Natural languages; Data models; Sequence generation; recurrent neural network; variational autoencoder; hierarchical model; self attention
DOI
10.1109/TPAMI.2021.3068187
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Inferring stochastic latent semantics is important and challenging for natural language applications. The difficulty in stochastic sequential learning is caused by posterior collapse in variational inference, where the input sequence is disregarded in the estimated latent variables. This paper proposes three components to tackle this difficulty and build the variational sequence autoencoder (VSAE), in which sufficient latent information is learned for sophisticated sequence representation. First, complementary encoders based on a long short-term memory (LSTM) and a pyramid bidirectional LSTM are merged to characterize the global and structural dependencies of an input sequence, respectively. Second, a stochastic self-attention mechanism is incorporated into the recurrent decoder; the latent information is attended over to encourage the interaction between inference and generation in the encoder-decoder training procedure. Third, an autoregressive Gaussian prior on the latent variable is used to preserve the information bound. Different variants of VSAE are proposed to mitigate posterior collapse in sequence modeling. A series of experiments demonstrates that the proposed individual and hybrid sequence autoencoders substantially improve performance in variational sequential learning for language modeling and for semantic understanding in document classification and summarization.
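As an illustration of the first component, the following sketch shows how a plain LSTM encoder and a pyramid bidirectional LSTM encoder can be merged into a single Gaussian posterior over the latent variable. It assumes PyTorch; the layer sizes, the mean pooling, the concatenation-based time merge, and the class names (PyramidBiLSTMEncoder, VSAEEncoder) are illustrative assumptions rather than the authors' exact configuration.

# Minimal sketch, assuming PyTorch; hyperparameters and the merge step are illustrative.
import torch
import torch.nn as nn

class PyramidBiLSTMEncoder(nn.Module):
    """Bidirectional LSTM stack that halves the time resolution between layers by
    concatenating adjacent steps, summarizing structural dependencies of the sequence."""

    def __init__(self, input_dim, hidden_dim, num_layers=2):
        super().__init__()
        self.layers = nn.ModuleList()
        dim = input_dim
        for _ in range(num_layers):
            self.layers.append(
                nn.LSTM(dim, hidden_dim, batch_first=True, bidirectional=True))
            dim = 4 * hidden_dim  # next layer sees 2 directions x 2 merged time steps

    def forward(self, x):                          # x: (batch, time, input_dim)
        for i, lstm in enumerate(self.layers):
            x, _ = lstm(x)                         # (batch, time, 2 * hidden_dim)
            if i < len(self.layers) - 1:           # shrink the time axis between layers
                if x.size(1) % 2 == 1:             # pad to an even length
                    x = torch.cat([x, x[:, -1:, :]], dim=1)
                b, t, h = x.shape
                x = x.reshape(b, t // 2, 2 * h)    # merge adjacent time steps
        return x.mean(dim=1)                       # pooled structural summary (batch, 2H)

class VSAEEncoder(nn.Module):
    """Complementary encoders: a plain LSTM summarizes global dependencies, the pyramid
    BiLSTM summarizes structural ones; both are merged into a Gaussian posterior."""

    def __init__(self, input_dim, hidden_dim, latent_dim):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.pyramid = PyramidBiLSTMEncoder(input_dim, hidden_dim)
        self.to_mean = nn.Linear(3 * hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(3 * hidden_dim, latent_dim)

    def forward(self, x):
        _, (h, _) = self.lstm(x)                   # final hidden state: (1, batch, H)
        global_summary = h[-1]                     # (batch, H)
        structural_summary = self.pyramid(x)       # (batch, 2H)
        merged = torch.cat([global_summary, structural_summary], dim=-1)
        mean, logvar = self.to_mean(merged), self.to_logvar(merged)
        z = mean + torch.randn_like(mean) * torch.exp(0.5 * logvar)  # reparameterization
        return z, mean, logvar

# Example: encode a batch of 8 sequences of length 20 with 128-dimensional embeddings.
encoder = VSAEEncoder(input_dim=128, hidden_dim=256, latent_dim=32)
z, mean, logvar = encoder(torch.randn(8, 20, 128))

Halving the time resolution between pyramid layers forces the upper layers to summarize progressively longer spans, which is what allows this branch to complement the purely sequential summary from the plain LSTM.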
Pages: 4975-4986
Number of Pages: 12