Distributed Deep Learning for Question Answering

Cited by: 4
Authors
Feng, Minwei [1 ]
Xiang, Bing [1 ]
Zhou, Bowen [1 ]
Affiliations
[1] IBM Watson, Yorktown Hts, NY 10598 USA
Source
CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT | 2016
Keywords
distributed training; deep learning; question answering;
DOI
10.1145/2983323.2983377
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper presents an empirical study of distributed deep learning for two question answering subtasks: answer selection and question classification. Comparison studies of the SGD, MSGD, ADADELTA, ADAGRAD, ADAM/ADAMAX, RMSPROP, DOWNPOUR and EASGD/EAMSGD algorithms are presented. Experimental results show that a distributed framework based on the message passing interface (MPI) accelerates convergence at a sublinear scale, demonstrating the importance of distributed training. For example, with 48 workers a 24x speedup is achievable on the answer selection task, reducing running time from 138.2 hours to 5.81 hours and significantly increasing productivity.
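To illustrate one of the compared algorithms, the following is a minimal single-process sketch of synchronous EASGD (elastic averaging SGD), with the distributed workers simulated as a Python list rather than MPI ranks. The toy quadratic objective, the hyperparameter values, and all variable names (`target`, `alpha`, `easgd`) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def easgd(num_workers=4, steps=200, lr=0.1, alpha=0.05, dim=8, seed=0):
    """Synchronous EASGD on a toy quadratic f(x) = 0.5 * ||x - target||^2."""
    rng = np.random.default_rng(seed)
    target = rng.normal(size=dim)                      # optimum of the toy objective
    workers = [rng.normal(size=dim) for _ in range(num_workers)]
    center = np.zeros(dim)                             # shared "elastic" center variable

    for _ in range(steps):
        diffs = []
        for i in range(num_workers):
            grad = workers[i] - target                 # gradient of the quadratic
            diff = workers[i] - center                 # elastic force toward the center
            workers[i] = workers[i] - lr * grad - alpha * diff
            diffs.append(diff)
        # the center moves toward the (pre-update) average of the workers
        center = center + alpha * sum(diffs)

    return center, target

center, target = easgd()
print(float(np.linalg.norm(center - target)))          # small residual: workers and center agree on the optimum
```

The elastic term is what distinguishes EASGD from plain data-parallel SGD: each worker explores locally but is pulled toward a slowly moving center, which in turn averages the workers, allowing more exploration than synchronous gradient averaging.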
Pages: 2413 / 2416
Page count: 4