SUPERB @ SLT 2022: CHALLENGE ON GENERALIZATION AND EFFICIENCY OF SELF-SUPERVISED SPEECH REPRESENTATION LEARNING

Cited by: 9

Authors:

Feng, Tzu-Hsun [1]
Dong, Annie [2]
Yeh, Ching-Feng [2]
Yang, Shu-Wen [1]
Lin, Tzu-Quan [1]
Shi, Jiatong
Chang, Kai-Wei [1]
Huang, Zili [4]
Wu, Haibin [1]
Chang, Xuankai [3]
Watanabe, Shinji [3]
Mohamed, Abdelrahman [2]
Li, Shang-Wen [2]
Lee, Hung-Yi [1]

Affiliations:

[1] Natl Taiwan Univ, Taipei City, Taiwan

[2] Meta, Menlo Pk, CA USA

[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA

[4] Johns Hopkins Univ, Baltimore, MD 21218 USA

Source:

2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022

Keywords:

Self-supervised Learning; Pre-training; Network Compression

DOI:

10.1109/SLT54892.2023.10022770

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised speech representations for better performance, generalization, and efficiency. The challenge builds upon the SUPERB benchmark and implements metrics to measure the computation requirements of self-supervised learning (SSL) representations and to evaluate their generalizability and performance across the diverse SUPERB tasks. The SUPERB benchmark provides comprehensive coverage of popular speech processing tasks, from speech and speaker recognition to audio generation and semantic understanding. As SSL has gained interest in the speech community and shown promising outcomes, we envision the challenge to uplevel the impact of SSL techniques by motivating more practical designs of techniques beyond task performance. We summarize the results of 14 submitted models in this paper. We also discuss the main findings from those submissions and the future directions of SSL research.


Pages: 1096-1103

Page count: 8
