Distributed Deep Learning for Question Answering

Cited by: 4
Authors
Feng, Minwei [1 ]
Xiang, Bing [1 ]
Zhou, Bowen [1 ]
Affiliations
[1] IBM Watson, Yorktown Hts, NY 10598 USA
Source
CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT | 2016
Keywords
distributed training; deep learning; question answering;
DOI
10.1145/2983323.2983377
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper presents an empirical study of distributed deep learning for two question answering subtasks: answer selection and question classification. Comparison studies of the SGD, MSGD, ADADELTA, ADAGRAD, ADAM/ADAMAX, RMSPROP, DOWNPOUR and EASGD/EAMSGD algorithms are presented. Experimental results show that a distributed framework based on the message passing interface (MPI) accelerates convergence at a sublinear scale, demonstrating the importance of distributed training. For example, with 48 workers a 24x speedup is achievable on the answer selection task, reducing running time from 138.2 hours to 5.81 hours and significantly increasing productivity.
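To illustrate one of the compared algorithms, the following is a minimal single-process sketch of synchronous EASGD (elastic averaging SGD), with the distributed workers simulated as a Python list rather than MPI ranks. The toy quadratic objective, the hyperparameter values, and all variable names (`target`, `alpha`, `easgd`) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def easgd(num_workers=4, steps=200, lr=0.1, alpha=0.05, dim=8, seed=0):
    """Synchronous EASGD on a toy quadratic f(x) = 0.5 * ||x - target||^2."""
    rng = np.random.default_rng(seed)
    target = rng.normal(size=dim)                      # optimum of the toy objective
    workers = [rng.normal(size=dim) for _ in range(num_workers)]
    center = np.zeros(dim)                             # shared "elastic" center variable

    for _ in range(steps):
        diffs = []
        for i in range(num_workers):
            grad = workers[i] - target                 # gradient of the quadratic
            diff = workers[i] - center                 # elastic force toward the center
            workers[i] = workers[i] - lr * grad - alpha * diff
            diffs.append(diff)
        # the center moves toward the (pre-update) average of the workers
        center = center + alpha * sum(diffs)

    return center, target

center, target = easgd()
print(float(np.linalg.norm(center - target)))          # small residual: workers and center agree on the optimum
```

The elastic term is what distinguishes EASGD from plain data-parallel SGD: each worker explores locally but is pulled toward a slowly moving center, which in turn averages the workers, allowing more exploration than synchronous gradient averaging.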
Pages: 2413 / 2416
Page count: 4