Music genre classification based on fusing audio and lyric information

Authors
You Li
Zhihai Zhang
Han Ding
Liang Chang
Affiliations
[1] Guilin University of Electronic Technology, Guangxi Key Laboratory of Trusted Software
[2] Guilin University of Electronic Technology, School of Electronic Engineering and Automation
Source
Multimedia Tools and Applications | 2023 / Vol. 82
Keywords
Music genre classification; Audio information; Lyric information; Information fusion
Abstract
Music genre classification (MGC) has a wide range of application scenarios. Traditional MGC methods consider either audio information or lyric information alone, which yields unsatisfactory recognition performance. In this paper, we propose a multimodal music genre classification framework that integrates both audio and lyric information. By exploiting the complementarity of the two modalities, music genres can be represented more comprehensively. First, the framework computes the mel-spectrogram of the audio, and a convolutional neural network extracts audio features from it. In parallel, BERT is used to obtain a distributed representation of the lyrics. The two modalities are then fused using different strategies, such as feature-level and decision-level fusion. To address the severe mismatch in convergence speed between the audio channel and the lyric channel, we start training the two channels asynchronously and assign them different learning rates. A series of experiments verifies the effectiveness of the proposed model. The proposed model achieves an F1 score of 0.87 for music genre classification, approximately 4% higher than that of the best baseline in the experiments.
Pages: 20157-20176
Page count: 19
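
The abstract names two fusion strategies (feature level and decision level) and a training scheme with asynchronous channel start and different learning rates. The following is a minimal sketch of those ideas in plain Python, not the authors' implementation. All function names, the fusion weight, the learning-rate values, the start epoch, and the choice of which channel starts late are hypothetical assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feature_level_fusion(audio_feat, lyric_feat):
    """Feature-level (early) fusion: concatenate the two feature vectors.

    In a full model, a joint classifier would consume this vector.
    """
    return list(audio_feat) + list(lyric_feat)


def decision_level_fusion(audio_logits, lyric_logits, audio_weight=0.5):
    """Decision-level (late) fusion: weighted average of class probabilities.

    audio_weight is a hypothetical parameter, not taken from the paper.
    """
    p_audio = softmax(audio_logits)
    p_lyric = softmax(lyric_logits)
    return [
        audio_weight * a + (1.0 - audio_weight) * l
        for a, l in zip(p_audio, p_lyric)
    ]


def channel_learning_rates(epoch, audio_lr=1e-3, lyric_lr=2e-5,
                           lyric_start_epoch=3):
    """Per-channel learning rates with a delayed start for one channel.

    Assumption: the lyric channel is the one held back (rate 0.0) until
    lyric_start_epoch. The abstract does not say which channel starts
    first, and all numeric defaults here are illustrative only.
    """
    lyric_active = epoch >= lyric_start_epoch
    return {"audio": audio_lr, "lyric": lyric_lr if lyric_active else 0.0}


# Example usage with made-up feature vectors and class scores.
joint_features = feature_level_fusion([0.1, 0.2], [0.3])
fused_probs = decision_level_fusion([2.0, 0.0], [0.0, 2.0])
schedule = [channel_learning_rates(epoch) for epoch in range(5)]
```

In a deep learning framework, the per-channel rates would typically be passed to the optimizer as separate parameter groups, and the delayed channel could instead be frozen until its start epoch; both are design choices this sketch leaves open.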