A HIGH-RATE EXTENSION TO SOUNDSTREAM

Cited by: 1

Authors:

Kang, Hong-Goo [1, 2]

Skoglund, Jan [1]

Kleijn, W. Bastiaan [1, 3]

Storus, Andrew [1]

Yeh, Hengchin [1]

Affiliations:

[1] Google LLC, San Francisco, CA 94105 USA

[2] Yonsei Univ, Elect & Elect Engn, Seoul, South Korea

[3] Victoria Univ Wellington, Sch Engn & Comp Sci, Wellington, New Zealand

Source:

2023 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, WASPAA | 2023

Keywords:

neural speech coding; convolutional transformer; embedding decomposition

DOI:

10.1109/WASPAA58266.2023.10248100

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

In this paper, we propose a high-rate extension of the SoundStream codec that generates almost transparent-quality audio at 16 kbps for wideband speech signals. SoundStream performs reasonably well at low bit rates (e.g., around 9 kbps), but its performance does not improve much when more bits are used to encode the latent embeddings. Motivated by experimental results showing that neural audio codec performance is highly related to characteristics of the latent embeddings, such as dimensionality, dependency, and probability density function shape, we propose a convolutional transformer architecture and an attention-based multi-scale latent decomposition method that significantly enhance codec performance when quantizing high-dimensional embeddings. Experimental results show the superiority of our proposed model over conventional approaches.

Pages: 5
