Music genre classification based on fusing audio and lyric information

Authors
You Li
Zhihai Zhang
Han Ding
Liang Chang
Affiliations
[1] Guilin University of Electronic Technology, Guangxi Key Laboratory of Trusted Software
[2] Guilin University of Electronic Technology, School of Electronic Engineering and Automation
Source
Multimedia Tools and Applications | 2023 / Vol. 82, No. 13
Keywords
Music genre classification; Audio information; Lyric information; Information fusion
DOI
Not available
Abstract
Music genre classification (MGC) has a wide range of application scenarios. Traditional MGC methods consider either audio information or lyric information alone, which leads to unsatisfactory classification performance. In this paper, we propose a multimodal music genre classification framework that integrates both audio and lyric information. By exploiting the complementarity of the two modalities, music genres can be represented more comprehensively. First, the framework computes the mel-spectrogram of the audio, and a convolutional neural network extracts audio features from it. In parallel, BERT is used to obtain a distributed representation of the lyrics. The two modalities are then fused using different strategies, such as feature-level and decision-level fusion. To address the serious mismatch in convergence speed between the audio channel and the lyric channel, we start training the two channels asynchronously and use different learning rates for them. A series of experiments is carried out to verify the effectiveness of the proposed model. The proposed model achieves an F1 score of 0.87 for music genre classification, which is approximately 4% higher than that of the best baseline in the experiment.
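The difference between the two fusion strategies named in the abstract, and the idea of giving each channel its own start epoch and learning rate, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors stand in for CNN and BERT outputs, the linear classifiers are random, and all names, dimensions, fusion weights, start epochs and learning rates below are hypothetical values chosen for illustration only.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Return W @ x + b for a weight matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def feature_level_fusion(audio_feat, lyric_feat, weights, bias):
    """Concatenate both feature vectors, then classify the joint vector."""
    fused = audio_feat + lyric_feat  # list concatenation
    return softmax(linear(weights, bias, fused))


def decision_level_fusion(audio_probs, lyric_probs, audio_weight=0.5):
    """Combine per-modality class probabilities with a weighted average."""
    return [audio_weight * a + (1.0 - audio_weight) * l
            for a, l in zip(audio_probs, lyric_probs)]


def channel_learning_rate(epoch, base_lr, start_epoch):
    """Hypothetical schedule: a channel is frozen (rate 0.0) until its
    start epoch, then trains with its own base learning rate."""
    return base_lr if epoch >= start_epoch else 0.0


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
NUM_GENRES, AUDIO_DIM, LYRIC_DIM = 4, 6, 8  # assumed sizes

# Stand-ins for the CNN audio features and the BERT lyric representation.
audio_feat = [rng.uniform(-1.0, 1.0) for _ in range(AUDIO_DIM)]
lyric_feat = [rng.uniform(-1.0, 1.0) for _ in range(LYRIC_DIM)]
zero_bias = [0.0] * NUM_GENRES

# Feature-level fusion: one classifier over the concatenated features.
early = feature_level_fusion(
    audio_feat, lyric_feat,
    random_matrix(rng, NUM_GENRES, AUDIO_DIM + LYRIC_DIM), zero_bias)

# Decision-level fusion: one classifier per modality, outputs averaged.
audio_probs = softmax(
    linear(random_matrix(rng, NUM_GENRES, AUDIO_DIM), zero_bias, audio_feat))
lyric_probs = softmax(
    linear(random_matrix(rng, NUM_GENRES, LYRIC_DIM), zero_bias, lyric_feat))
late = decision_level_fusion(audio_probs, lyric_probs, audio_weight=0.6)

# Per-channel schedule with different start epochs and rates (assumed values).
for epoch in range(4):
    lr_a = channel_learning_rate(epoch, base_lr=1e-3, start_epoch=0)
    lr_b = channel_learning_rate(epoch, base_lr=2e-5, start_epoch=2)
    print(epoch, lr_a, lr_b)
```

In this sketch, feature-level fusion lets a single classifier see both modalities at once, while decision-level fusion keeps the two classifiers independent and only merges their predictions. The abstract does not state which channel starts later or which rates were used, so the schedule above is only one possible reading.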
Pages: 20157 - 20176
Page count: 19