ENHANCED METHOD OF AUDIO CODING USING CNN-BASED SPECTRAL RECOVERY WITH ADAPTIVE STRUCTURE

Cited by: 0

Authors:

Shin, Seong-Hyeon [1]

Beack, Seung Kwon [2]

Lim, Wootaek [2]

Park, Hochong [1]

Affiliations:

[1] Kwangwoon University, Seoul, South Korea

[2] Electronics and Telecommunications Research Institute (ETRI), Daejeon, South Korea

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Keywords:

adaptive structure; audio coding; autoencoder; convolutional neural network; spectral recovery; SPEECH;

DOI:

10.1109/icassp40776.2020.9054409

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

Spectral recovery can enhance the performance of transform-based audio coding by transmitting only a portion of the spectral data and recovering the missing spectral data in the decoder. This study proposes an enhanced method of audio coding based on spectral recovery with an adaptive structure that yields improved sound quality compared with the previous method. The spectral data to be recovered are arranged in an adaptive pattern depending on the difficulty of recovery. In addition, depending on the spectral characteristics, prior information associated with these spectral data is selectively transmitted, which helps a neural network improve the performance of magnitude recovery. The prior information also provides the signs of the recovered magnitudes. A subjective performance evaluation shows that, for mono coding without window switching at 40 kbps, the proposed coding method provides better sound quality than the conventional method on average.


Pages: 351-355

Page count: 5
