HILCodec: High-Fidelity and Lightweight Neural Audio Codec

Authors
Ahn, Sunghwan [1, 2]
Woo, Beom Jun [1, 2]
Han, Min Hyun [1, 2]
Moon, Chanyeong [1, 2]
Kim, Nam Soo [1, 2]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, Inst New Media & Commun, Seoul 08826, South Korea
Keywords
Codecs; Convolution; Decoding; Vocoders; Psychoacoustic models; Training; Speech coding; Spectrogram; Generative adversarial networks; Distortion; Acoustic signal processing; Audio coding; Residual neural networks
DOI
10.1109/JSTSP.2024.3469530
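Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract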
The recent advancement of end-to-end neural audio codecs enables compressing audio at very low bitrates while reconstructing the output audio with high fidelity. Nonetheless, such improvements often come at the cost of increased model complexity. In this paper, we identify and address the problems of existing neural audio codecs. We show that the performance of the SEANet-based codec does not increase consistently as the network depth increases. We analyze the root cause of such a phenomenon and suggest a variance-constrained design. Also, we reveal various distortions in previous waveform domain discriminators and propose a novel distortion-free discriminator. The resulting model, HILCodec, is a real-time streaming audio codec that demonstrates state-of-the-art quality across various bitrates and audio types.
Pages: 1517-1530
Page count: 14