Generation and Analysis of Vocal Spectrograms: Combining Generative Adversarial Networks

Times cited: 0
Authors
Yang, Zhe [1]
Affiliation
[1] Weifang Engn Vocat Coll, Weifang 262500, Shandong, Peoples R China
Source
PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON MACHINE INTELLIGENCE AND DIGITAL APPLICATIONS, MIDA2024 | 2024
Keywords
Generative Adversarial Network; Vocal Music Spectrum Map; Speech Enhancement; Deep Learning;
DOI
10.1145/3662739.3672183
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A vocal spectrogram is a representation of sound in the frequency domain and has important applications in fields such as music and speech. With a generative adversarial network (GAN), realistic vocal spectrograms can be generated, and the generated spectrograms can then be used for analysis. This article introduces the basic principles and structure of GANs, including the design of the generator and discriminator networks; discusses data preparation and the definition of the loss function for vocal spectrogram generation; and describes in detail the steps of training a GAN, including alternating training of the generator and discriminator to produce more realistic vocal spectrograms. It then introduces how to analyze the generated spectrograms with corresponding techniques and algorithms, and studies evaluation indicators for the generated vocal spectrograms and the analysis results. The vocal spectrograms generated by the GAN show high performance, with the highest clarity reaching 92%. The generation and analysis of vocal spectrograms can play an increasingly important role in audio processing and acoustic research and bring new breakthroughs to the development of audio technology.
Pages: 534-539
Number of pages: 6
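Illustration (not from the paper): the abstract describes a vocal spectrogram as a frequency-domain representation of sound. The sketch below shows one common way such a representation might be computed, a Hann-windowed short-time Fourier transform, using only the Python standard library. The function names, frame length, hop size, sample rate, and the synthetic 1 kHz test tone are all assumptions chosen for the example; the paper's actual preprocessing is not specified in this record.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def magnitude_spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames, each holding frame_len // 2 + 1 magnitudes."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT: O(frame_len^2) per frame, acceptable for a demonstration.
        mags = []
        for k in range(n_bins):
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test tone: 1 kHz sine sampled at 8 kHz (not data from the paper).
SAMPLE_RATE = 8000
FRAME_LEN = 64
tone = [math.sin(2.0 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(512)]

spec = magnitude_spectrogram(tone, frame_len=FRAME_LEN, hop=32)

# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
```

In a GAN setting such as the one the abstract outlines, a two-dimensional array like `spec` (time frames by frequency bins) would typically be what the generator learns to produce and the discriminator learns to judge. A practical pipeline would normally use an FFT library instead of the naive DFT shown here.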