HILCodec: High-Fidelity and Lightweight Neural Audio Codec

Cited by: 0

Authors:

Ahn, Sunghwan ^{[1,2]}

Woo, Beom Jun ^{[1,2]}

Han, Min Hyun ^{[1,2]}

Moon, Chanyeong ^{[1,2]}

Kim, Nam Soo ^{[1,2]}

Affiliations:

[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea

[2] Seoul Natl Univ, Inst New Media & Commun, Seoul 08826, South Korea

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2024 / Vol. 18 / No. 08

Keywords:

Codecs; Convolution; Decoding; Vocoders; Psychoacoustic models; Training; Speech coding; Spectrogram; Generative adversarial networks; Distortion; Acoustic signal processing; Audio coding; Residual neural networks

DOI:

10.1109/JSTSP.2024.3469530

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

The recent advancement of end-to-end neural audio codecs enables compressing audio at very low bitrates while reconstructing the output audio with high fidelity. Nonetheless, such improvements often come at the cost of increased model complexity. In this paper, we identify and address the problems of existing neural audio codecs. We show that the performance of the SEANet-based codec does not increase consistently as the network depth increases. We analyze the root cause of such a phenomenon and suggest a variance-constrained design. Also, we reveal various distortions in previous waveform domain discriminators and propose a novel distortion-free discriminator. The resulting model, HILCodec, is a real-time streaming audio codec that demonstrates state-of-the-art quality across various bitrates and audio types.


Pages: 1517-1530

Number of pages: 14
