Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space

Cited by: 1
Authors
Su, Feng [1]
Xue, Hao [1]
Affiliations
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China
Source
MULTIMEDIA MODELING (MMM 2017), PT I | 2017, Vol. 10132
Funding
US National Science Foundation
Keywords
Music mood classification; Multimodal; Graph learning; Locality Preserving Projection; Bag of sentences; Emotion classification
DOI
10.1007/978-3-319-51811-4_13
CLC Classification
TP [Automation and computer technology]
Discipline Code
0812
Abstract
Automatic music mood classification is an important and challenging problem in the field of music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to enhance classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. Then, we project the paired low-level features of the two modalities into a learned common discriminative latent space, which not only eliminates between-modality heterogeneity but also increases the discriminability of the resulting descriptions. On the basis of this latent representation, we employ a graph-learning-based multimodal classification model for music mood, which takes the cross-modality similarity between local audio and lyrics descriptions into account to effectively exploit correlations between the modalities. The per-sentence mood predictions are then aggregated by a simple voting scheme. The effectiveness of the proposed method is demonstrated in experiments on a real dataset composed of more than 3,000 min of music and corresponding lyrics.
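The pipeline the abstract describes can be sketched as follows. This is a minimal illustration only: the projection matrices `W_audio` and `W_lyrics` are random stand-ins for the discriminative latent space the paper actually learns, and a nearest-centroid rule substitutes for the paper's graph-based classifier; only the data flow (project per sentence, classify, majority-vote over sentences) reflects the described method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned projections mapping each modality's low-level
# sentence features into a shared 4-dim latent space (random stand-ins
# for the paper's discriminatively learned projections).
W_audio = rng.standard_normal((10, 4))   # 10-dim audio features -> latent
W_lyrics = rng.standard_normal((8, 4))   # 8-dim lyrics features -> latent

def to_latent(a, l):
    """Project and fuse one sentence's audio and lyrics features."""
    return a @ W_audio + l @ W_lyrics

def classify_sentence(z, centroids):
    """Nearest-centroid stand-in for the paper's graph-based classifier."""
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))

def song_mood(audio_feats, lyrics_feats, centroids):
    """Aggregate per-sentence mood predictions by majority voting."""
    votes = [classify_sentence(to_latent(a, l), centroids)
             for a, l in zip(audio_feats, lyrics_feats)]
    labels, counts = np.unique(votes, return_counts=True)
    return int(labels[np.argmax(counts)])

# Toy usage: a song with 5 sentences, 3 candidate mood classes
audio = rng.standard_normal((5, 10))
lyrics = rng.standard_normal((5, 8))
mood_centroids = rng.standard_normal((3, 4))
print(song_mood(audio, lyrics, mood_centroids))  # one of 0, 1, 2
```

The voting step makes the song-level decision robust to occasional per-sentence misclassifications, which is why the paper can classify at sentence granularity yet report song-level accuracy.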
Pages: 152-163 (12 pages)