Semi-supervised Sequence Learning

Cited by: 0
Authors
Dai, Andrew M. [1]
Le, Quoc V. [1]
Affiliations
[1] Google Inc, Mountain View, CA 94043 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015) | 2015 / Vol. 28
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
We present two approaches that use unlabeled data to improve sequence learning with recurrent networks. The first approach is to predict what comes next in a sequence, i.e., a conventional language model in natural language processing. The second approach is to use a sequence autoencoder, which reads the input sequence into a vector and then predicts the input sequence again. These two algorithms can serve as a "pretraining" step for a later supervised sequence learning algorithm; in other words, the parameters obtained from the pretraining step can be used as a starting point for other supervised training models. In our experiments, we find that long short-term memory recurrent networks pretrained with the two approaches are more stable to train and generalize better. With pretraining, we were able to achieve strong performance on many classification tasks, such as text classification on IMDB and DBpedia, and image recognition on CIFAR-10.
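The abstract describes two pretraining schemes, next-step prediction (a language model) and a sequence autoencoder, whose parameters then initialize a supervised LSTM. Below is a minimal PyTorch sketch of the sequence-autoencoder variant plus the weight-copying step; it is not the authors' implementation, and the module names, dimensions, vocabulary size, and start-token handling are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SequenceAutoencoder(nn.Module):
    """Unsupervised pretraining model: encode the input sequence into a vector,
    then decode (reconstruct) the same sequence token by token."""

    def __init__(self, vocab_size: int, embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor, sos_id: int = 0) -> torch.Tensor:
        # tokens: (batch, seq_len) integer ids of an unlabeled sequence
        _, state = self.encoder(self.embed(tokens))           # sequence -> vector state (h, c)
        # teacher forcing: decoder reads <sos> + tokens[:-1] and must reproduce `tokens`
        sos = torch.full_like(tokens[:, :1], sos_id)
        dec_in = self.embed(torch.cat([sos, tokens[:, :-1]], dim=1))
        dec_out, _ = self.decoder(dec_in, state)
        return self.out(dec_out)                              # (batch, seq_len, vocab_size) logits


class LSTMClassifier(nn.Module):
    """Supervised model whose embedding/LSTM can be initialized from pretrained weights."""

    def __init__(self, vocab_size: int, num_classes: int,
                 embed_dim: int = 128, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        _, (h, _) = self.lstm(self.embed(tokens))
        return self.head(h[-1])                               # classify from the final hidden state


# Pretrain on unlabeled text (cross-entropy reconstruction loss), then copy the weights
# into the classifier as a starting point for supervised training:
#   sa = SequenceAutoencoder(vocab_size=20000)
#   ... train sa on unlabeled sequences ...
#   clf = LSTMClassifier(vocab_size=20000, num_classes=2)
#   clf.embed.load_state_dict(sa.embed.state_dict())
#   clf.lstm.load_state_dict(sa.encoder.state_dict())
#   ... fine-tune clf on labeled examples ...
```

The language-model variant from the abstract would differ only in that the pretraining objective predicts the next token of the raw sequence instead of reconstructing it from an encoded vector.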
Pages: 9