AN EMPIRICAL STUDY OF TRANSFORMER-BASED NEURAL LANGUAGE MODEL ADAPTATION

Cited by: 0
Authors
Li, Ke [1 ,2 ]
Liu, Zhe [1 ]
He, Tianxing [3 ]
Huang, Hongzhao [1 ]
Peng, Fuchun [1 ]
Povey, Daniel
Khudanpur, Sanjeev [2 ]
Affiliations
[1] Facebook AI, Menlo Park, CA 94025, USA
[2] Johns Hopkins University, Center for Language & Speech Processing, Baltimore, MD 21218, USA
[3] MIT, 77 Massachusetts Ave, Cambridge, MA 02139, USA
Keywords
neural language model; language model adaptation; Transformer; linear interpolation; automatic speech recognition
DOI
10.1109/icassp40776.2020.9053399
CLC Number
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
We explore two adaptation approaches for deep Transformer-based neural language models (LMs) in automatic speech recognition. The first is a pretrain-finetune framework: a Transformer LM is first pretrained from scratch on a large-scale text corpus and then adapted to relatively small target domains via finetuning. The second is a mixer of dynamically weighted models trained separately on the source and target domains, which aims to improve on simple linear interpolation by making the interpolation weights dynamic. We compare the two approaches against three baselines (no adaptation, data merging, and simple linear interpolation) on Switchboard (SWBD) and Wall Street Journal (WSJ). Experiments show that the mixer model generally outperforms both the baselines and finetuning. Compared with no adaptation, finetuning and the mixer approach obtain up to 11.5% and 14.1% relative WER reductions on SWBD, respectively. The mixer model also outperforms linear interpolation and data merging. On WSJ, the mixer approach achieves a new state-of-the-art WER result.
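To make the two interpolation schemes contrasted in the abstract concrete, below is a minimal NumPy sketch, an illustration rather than the paper's implementation: simple linear interpolation mixes the two LMs' next-token distributions with one fixed weight, while the dynamically weighted mixer predicts the weight per context. The sigmoid gate over a toy context vector and all names (p_source, p_target, gate_w, gate_b) are assumptions made for illustration; the paper's exact mixer parameterization is not given in the abstract.

    import numpy as np

    def simple_interpolation(p_source, p_target, lam=0.5):
        # Static baseline: p(w|h) = lam * p_source(w|h) + (1 - lam) * p_target(w|h),
        # with a single weight lam tuned once on held-out target-domain data.
        return lam * p_source + (1.0 - lam) * p_target

    def dynamic_mixer(p_source, p_target, context_vec, gate_w, gate_b):
        # Dynamically weighted mixer (sketch): a sigmoid gate predicts the
        # interpolation weight from a representation of the history, so the
        # mix can lean toward the source or target LM token by token.
        lam = 1.0 / (1.0 + np.exp(-(context_vec @ gate_w + gate_b)))
        return lam * p_source + (1.0 - lam) * p_target

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        vocab = 10
        p_src = rng.dirichlet(np.ones(vocab))   # toy source-domain LM output
        p_tgt = rng.dirichlet(np.ones(vocab))   # toy target-domain LM output
        h = rng.normal(size=8)                  # toy context/hidden vector
        w, b = rng.normal(size=8), 0.0          # toy gate parameters
        print(simple_interpolation(p_src, p_tgt).sum())    # convex mix -> sums to 1.0
        print(dynamic_mixer(p_src, p_tgt, h, w, b).sum())  # convex mix -> sums to 1.0

In both cases the result remains a valid probability distribution because it is a convex combination of two distributions; the only difference is whether the weight is fixed or computed from the context.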
Pages: 7934-7938 (5 pages)