Training Google Neural Machine Translation on an Intel CPU Cluster

Cited by: 1
Authors
Kalamkar, Dhiraj D. [1 ]
Banerjee, Kunal [1 ]
Srinivasan, Sudarshan [1 ]
Sridharan, Srinivas [1 ]
Georganas, Evangelos [2 ]
Smorkalov, Mikhail E. [3 ]
Xu, Cong [3 ]
Heinecke, Alexander [2 ]
Affiliations
[1] Intel Corp, Parallel Comp Lab, Bangalore, Karnataka, India
[2] Intel Corp, Parallel Comp Lab, Santa Clara, CA USA
[3] Intel Corp, Intel Arch Graph & Sw, Nizhnii Novgorod, Russia
Source
2019 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER) | 2019
Keywords
machine translation; recurrent neural networks; TensorFlow; LIBXSMM; Intel architecture;
DOI
10.1109/cluster.2019.8891019
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Google's neural machine translation (GNMT) is a state-of-the-art recurrent neural network (RNN/LSTM) based language translation application. It is computationally more demanding than well-studied convolutional neural networks (CNNs). Moreover, in contrast to CNNs, RNNs heavily mix compute-bound and memory-bound layers, which requires careful tuning on a latency-oriented machine to exploit fast on-die memories for the best single-processor performance. Additionally, due to the massive compute demand, it is essential to distribute the workload among several processors and even compute nodes. To the best of our knowledge, this is the first work that attempts to scale this application on an Intel CPU cluster. Our CPU-based GNMT optimization, the first of its kind, proceeds in four steps: (i) we choose a monolithic long short-term memory (LSTM) cell implementation from the LIBXSMM library (specifically tuned for CPUs) and integrate it into TensorFlow; (ii) we modify the GNMT code to use a fused time-step LSTM op for the encoding stage; (iii) we combine the Horovod and Intel MLSL scaling libraries for improved performance on multiple nodes; and (iv) we extend the bucketing logic, which groups sentences of similar length together, across multiple nodes to achieve load balance across ranks. In summary, we demonstrate that these changes allow us to outperform Google's stock CPU-based GNMT implementation by ~2x on a single node and potentially enable a more than 25x speedup on a 16-node CPU cluster.
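The multi-node bucketing idea in step (iv) can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation (which lives inside the GNMT/TensorFlow input pipeline); the function names and the round-robin assignment policy are hypothetical. The key point is that batching similar-length sentences reduces padding waste, and dealing those batches out across ranks keeps per-rank work comparable:

```python
from collections import defaultdict

def bucket_by_length(sentences, bucket_width=10):
    """Group sentences whose lengths fall into the same bucket.

    Batching similar-length sentences together minimizes the padding
    needed inside each batch, so less compute is wasted on pad tokens.
    """
    buckets = defaultdict(list)
    for s in sentences:
        buckets[len(s) // bucket_width].append(s)
    return dict(buckets)

def assign_batches_to_ranks(buckets, num_ranks, batch_size=2):
    """Deal batches out round-robin across ranks.

    Because each batch comes from a single length bucket, every rank
    receives batches of comparable sequence length, which balances
    work across ranks in synchronous data-parallel training.
    """
    per_rank = [[] for _ in range(num_ranks)]
    rank = 0
    for key in sorted(buckets):
        sents = buckets[key]
        for i in range(0, len(sents), batch_size):
            per_rank[rank].append(sents[i:i + batch_size])
            rank = (rank + 1) % num_ranks
    return per_rank
```

Without such cross-rank balancing, a rank that happens to draw long sentences stalls the others at every synchronization point, which is why the paper extends bucketing beyond a single node.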
Pages: 193-202
Page count: 10