HILCodec: High-Fidelity and Lightweight Neural Audio Codec

Cited by: 0
Authors
Ahn, Sunghwan [1 ,2 ]
Woo, Beom Jun [1 ,2 ]
Han, Min Hyun [1 ,2 ]
Moon, Chanyeong [1 ,2 ]
Kim, Nam Soo [1 ,2 ]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, Inst New Media & Commun, Seoul 08826, South Korea
Keywords
Codecs; Convolution; Decoding; Vocoders; Psychoacoustic models; Training; Speech coding; Spectrogram; Generative adversarial networks; Distortion; Acoustic signal processing; audio coding; codecs; generative adversarial networks; residual neural networks;
DOI
10.1109/JSTSP.2024.3469530
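Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract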
The recent advancement of end-to-end neural audio codecs enables compressing audio at very low bitrates while reconstructing the output audio with high fidelity. Nonetheless, such improvements often come at the cost of increased model complexity. In this paper, we identify and address the problems of existing neural audio codecs. We show that the performance of the SEANet-based codec does not increase consistently as the network depth increases. We analyze the root cause of such a phenomenon and suggest a variance-constrained design. Also, we reveal various distortions in previous waveform domain discriminators and propose a novel distortion-free discriminator. The resulting model, HILCodec, is a real-time streaming audio codec that demonstrates state-of-the-art quality across various bitrates and audio types.
Pages: 1517 - 1530
Page count: 14
Related papers
50 items in total
  • [31] An Information Security Method Based on Optimized High-Fidelity Reversible Data Hiding
    Kong, Xiaoxi
    Cai, Zhanchuan
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (12) : 8529 - 8539
  • [33] High-Fidelity Illumination Normalization for Face Recognition Based on Auto-Encoder
    Li, Chunlu
    Da, Feipeng
    Wang, Chenxing
    IEEE ACCESS, 2020, 8 (08) : 95512 - 95522
  • [34] High-fidelity simulation in post-graduate training and assessment: an Irish perspective
    Langdon, M. G.
    Cunningham, A. J.
    IRISH JOURNAL OF MEDICAL SCIENCE, 2007, 176 (04) : 267 - 271
  • [35] An Empirical Analysis of Diffusion, Autoencoders, and Adversarial Deep Learning Models for Predicting Dementia Using High-Fidelity MRI
    Gajjar, Pranshav
    Garg, Manav
    Desai, Shivani
    Chhinkaniwala, Hitesh
    Sanghvi, Harshal A.
    Patel, Riki H.
    Gupta, Shailesh
    Pandya, Abhijit S.
    IEEE ACCESS, 2024, 12 : 131231 - 131243
  • [36] High-quality scalable audio codec - art. no. 67770E
    Kim, Miyoung
    Oh, Eunmi
    Kim, JungHoe
    MULTIMEDIA SYSTEMS AND APPLICATIONS X, 2007, 6777 : E7770 - E7770
  • [37] Fast, Nonlocal and Neural: A Lightweight High Quality Solution to Image Denoising
    Guo, Yu
    Davy, Axel
    Facciolo, Gabriele
    Morel, Jean-Michel
    Jin, Qiyu
    IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1515 - 1519
  • [38] FA-GAN: Artifacts-free and Phase-aware High-fidelity GAN-based Vocoder
    Shen, Rubing
    Ren, Yanzhen
    Sung, Zongkun
    INTERSPEECH 2024, 2024 : 3884 - 3888
  • [39] A Systematic Review and Meta-analysis of the Use of High-Fidelity Simulation in Obstetric Ultrasound
    Dromey, Brian P.
    Peebles, Donald M.
    Stoyanov, Danail V.
    SIMULATION IN HEALTHCARE-JOURNAL OF THE SOCIETY FOR SIMULATION IN HEALTHCARE, 2021, 16 (01) : 52 - 59
  • [40] Comparative performance of high-fidelity training models for flexible ureteroscopy: Are all models effective?
    Mishra, Shashikant
    Sharma, Rajan
    Kumar, Akhilesh
    Ganatra, Pradeep
    Sabnis, Ravindra B.
    Desai, Mahesh R.
    INDIAN JOURNAL OF UROLOGY, 2011, 27 (04) : 451 - 456