SUPERB @ SLT 2022: CHALLENGE ON GENERALIZATION AND EFFICIENCY OF SELF-SUPERVISED SPEECH REPRESENTATION LEARNING

Cited by: 9
Authors
Feng, Tzu-Hsun [1 ]
Dong, Annie [2 ]
Yeh, Ching-Feng [2 ]
Yang, Shu-Wen [1 ]
Lin, Tzu-Quan [1 ]
Shi, Jiatong
Chang, Kai-Wei [1 ]
Huang, Zili [4 ]
Wu, Haibin [1 ]
Chang, Xuankai [3 ]
Watanabe, Shinji [3 ]
Mohamed, Abdelrahman [2 ]
Li, Shang-Wen [2 ]
Lee, Hung-Yi [1 ]
Affiliations
[1] Natl Taiwan Univ, Taipei City, Taiwan
[2] Meta, Menlo Pk, CA USA
[3] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[4] Johns Hopkins Univ, Baltimore, MD 21218 USA
Source
2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT | 2022
Keywords
Self-supervised Learning; Pre-training; Network Compression
DOI
10.1109/SLT54892.2023.10022770
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
We present the SUPERB challenge at SLT 2022, which aims at learning self-supervised speech representations for better performance, generalization, and efficiency. The challenge builds upon the SUPERB benchmark and implements metrics to measure the computation requirements of self-supervised learning (SSL) representations and to evaluate their generalizability and performance across the diverse SUPERB tasks. The SUPERB benchmark provides comprehensive coverage of popular speech processing tasks, from speech and speaker recognition to audio generation and semantic understanding. As SSL has gained interest in the speech community and shown promising outcomes, we envision the challenge will amplify the impact of SSL techniques by motivating technique designs that are practical beyond task performance alone. We summarize the results of the 14 submitted models in this paper. We also discuss the main findings from those submissions and future directions for SSL research.
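For concreteness, below is a minimal sketch of how the two kinds of metrics the abstract describes might be computed: a SUPERB-style generalizability score that linearly rescales each task's metric between a baseline and a state-of-the-art reference before averaging, and a trainable-parameter count as a simple efficiency proxy. The function names, task entries, and reference numbers are hypothetical illustrations, not the official challenge scorer or its reference values.

```python
# Hedged sketch (illustrative, not the official SUPERB challenge scorer).
from dataclasses import dataclass


@dataclass
class TaskResult:
    score: float     # submitted model's metric on this task
    baseline: float  # baseline reference metric (e.g., FBANK features)
    sota: float      # state-of-the-art reference metric


def superb_style_score(results: dict[str, TaskResult]) -> float:
    """Average of task metrics linearly rescaled so baseline -> 0, SOTA -> 1000.

    For error metrics (WER, EER) the baseline exceeds the SOTA value, so the
    same formula still rewards improvement without a special case.
    """
    scaled = [
        1000.0 * (r.score - r.baseline) / (r.sota - r.baseline)
        for r in results.values()
    ]
    return sum(scaled) / len(scaled)


def count_parameters(model) -> int:
    """Efficiency proxy: total trainable parameters of a torch.nn.Module."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


if __name__ == "__main__":
    # Illustrative numbers only, not official challenge references.
    demo = {
        "asr_wer": TaskResult(score=6.0, baseline=23.2, sota=3.0),   # lower is better
        "ks_acc": TaskResult(score=96.0, baseline=41.4, sota=97.0),  # higher is better
    }
    print(f"SUPERB-style score: {superb_style_score(demo):.1f}")
```

The linear rescaling makes heterogeneous metrics (accuracies, error rates, diarization scores) comparable before averaging, which is what lets a single number summarize generalizability across the diverse SUPERB tasks.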
Pages: 1096-1103
Number of pages: 8