MEASURING THE IMPACT OF DOMAIN FACTORS IN SELF-SUPERVISED PRE-TRAINING

Cited by: 0

Authors:

Sanabria, Ramon [1]

Hsu, Wei-Ning [2]

Baevski, Alexei [2]

Auli, Michael [2]

Affiliations:

[1] Univ Edinburgh, Edinburgh, Midlothian, Scotland

[2] Meta AI, New York, NY USA

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW | 2023

Keywords:

speech recognition; self-supervised learning; domain mismatch

DOI:

10.1109/ICASSPW59220.2023.10193184

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Human speech data comprises a rich set of domain factors such as accent, syntactic and semantic variety, or acoustic environment. Previous work explores the effect of domain mismatch in automatic speech recognition between pre-training and fine-tuning as a whole [1] but does not dissect the contribution of individual factors. In this paper, we present a controlled study to better understand the effect of such factors on the performance of pre-trained representations on automatic speech recognition. To do so, we pre-train models either on modified natural speech or synthesized audio, with a single domain factor modified, and then measure performance after fine-tuning. Results show that phonetic domain factors play an important role during pre-training while grammatical and syntactic factors are far less important. To our knowledge, this is the first study to better understand the domain characteristics of pre-trained sets in self-supervised pre-training for speech.


Pages: 5

50 items in total

[1] Reducing Domain mismatch in Self-supervised speech pre-training
Baskar, Murali Karthick
Rosenberg, Andrew
Ramabhadran, Bhuvana
Zhang, Yu
INTERSPEECH 2022, 2022, : 3028 - 3032
[2] CDS: Cross-Domain Self-supervised Pre-training
Kim, Donghyun
Saito, Kuniaki
Oh, Tae-Hyun
Plummer, Bryan A.
Sclaroff, Stan
Saenko, Kate
2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9103 - 9112
[3] Self-supervised ECG pre-training
Liu, Han
Zhao, Zhenbo
She, Qiang
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2021, 70
[4] ENHANCING THE DOMAIN ROBUSTNESS OF SELF-SUPERVISED PRE-TRAINING WITH SYNTHETIC IMAGES
Hassan, Mohamad N. C.
Bhattacharya, Avigyan
da Costa, Victor G. Turrisi
Banerjee, Biplab
Ricci, Elisa
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024, 2024, : 5470 - 5474
[5] Self-Supervised Underwater Image Generation for Underwater Domain Pre-Training
Wu, Zhiheng
Wu, Zhengxing
Chen, Xingyu
Lu, Yue
Yu, Junzhi
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73 : 1 - 14
[6] Self-supervised Pre-training of Text Recognizers
Kiss, Martin
Hradis, Michal
DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT IV, 2024, 14807 : 218 - 235
[7] Self-supervised Pre-training for Mirror Detection
Lin, Jiaying
Lau, Rynson W. H.
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 12193 - 12202
[8] Self-supervised Pre-training for Nuclei Segmentation
Haq, Mohammad Minhazul
Huang, Junzhou
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT II, 2022, 13432 : 303 - 313
[9] EFFECTIVENESS OF SELF-SUPERVISED PRE-TRAINING FOR ASR
Baevski, Alexei
Mohamed, Abdelrahman
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7694 - 7698
[10] Self-Supervised Pre-training for Time Series Classification
Shi, Pengxiang
Ye, Wenwen
Qin, Zheng
2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
