Vector-Quantized Autoregressive Predictive Coding

Cited by: 47
Authors
Chung, Yu-An [1]
Tang, Hao [1]
Glass, James [1]
Affiliation
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
Keywords
self-supervised learning; unsupervised learning; representation learning; vector quantization; transfer learning
DOI
10.21437/Interspeech.2020-1228
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Autoregressive Predictive Coding (APC), as a self-supervised objective, has enjoyed success in learning representations from large amounts of unlabeled data, and the learned representations are rich for many downstream tasks. However, the connection between low self-supervised loss and strong performance in downstream tasks remains unclear. In this work, we propose Vector-Quantized Autoregressive Predictive Coding (VQ-APC), a novel model that produces quantized representations, allowing us to explicitly control the amount of information encoded in the representations. By studying a sequence of increasingly limited models, we reveal the constituents of the learned representations. In particular, we confirm the presence of information with probing tasks, while showing the absence of information with mutual information, uncovering the model's preference in preserving speech information as its capacity becomes constrained. We find that there exists a point where phonetic and speaker information are amplified to maximize a self-supervised objective. As a byproduct, the learned codes for a particular model capacity correspond well to English phones.
Pages: 3760-3764
Page count: 5