Audio-Based Semantic Concept Classification for Consumer Video

Cited by: 57
Authors
Lee, Keansub [1]
Ellis, Daniel P. W. [1]
Affiliations
[1] Columbia Univ, Dept Elect Engn, Lab Recognit & Org Speech & Audio LabROSA, New York, NY 10027 USA
Source
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2010 / Vol. 18 / No. 06
Funding
National Science Foundation (USA);
Keywords
Audio classification; consumer video classification; semantic concept detection; soundtrack analysis; RETRIEVAL; MUSIC; SEGMENTATION;
DOI
10.1109/TASL.2009.2034776
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips.
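The single-Gaussian variant described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the regularization constant `reg`, the scale `gamma`, the symmetrized form of the Kullback-Leibler divergence, and the exponential mapping from distance to kernel value are all assumptions made for illustration. Random matrices stand in for real MFCC frames.

```python
import numpy as np

def fit_gaussian(frames, reg=1e-6):
    """Summarize one clip (n_frames x n_dims feature matrix) by its mean
    and full covariance. `reg` is an assumed small ridge term that keeps
    the covariance invertible."""
    mu = frames.mean(axis=0)
    cov = np.cov(frames, rowvar=False) + reg * np.eye(frames.shape[1])
    return mu, cov

def kl_gaussian(mu0, cov0, mu1, cov1):
    """Closed-form KL(N0 || N1) between two multivariate Gaussians."""
    d = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff
                  - d + logdet1 - logdet0)

def symmetric_kl(g0, g1):
    """Symmetrized divergence: KL(g0 || g1) + KL(g1 || g0)."""
    return kl_gaussian(*g0, *g1) + kl_gaussian(*g1, *g0)

def kernel_matrix(gaussians, gamma=0.01):
    """Map pairwise divergences to similarities via exp(-gamma * D).
    This mapping is an assumption; it is not guaranteed to be positive
    semidefinite for every choice of divergence."""
    n = len(gaussians)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = symmetric_kl(gaussians[i], gaussians[j])
    return np.exp(-gamma * dist)

# Toy data: three "clips" of 200 frames with 5 feature dimensions each.
rng = np.random.default_rng(0)
clips = [rng.normal(loc=m, size=(200, 5)) for m in (0.0, 0.5, 2.0)]
gaussians = [fit_gaussian(c) for c in clips]
K = kernel_matrix(gaussians)
print(K.shape)
```

A matrix like `K` could then be handed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`, with one binary classifier trained per concept.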
Pages: 1406 - 1416
Page count: 11
Related papers
50 items in total
  • [31] A Novel Video Summarization Based on Mining the Story-Structure and Semantic Relations Among Concept Entities
    Chen, Bo-Wei
    Wang, Jia-Ching
    Wang, Jhing-Fa
    IEEE TRANSACTIONS ON MULTIMEDIA, 2009, 11 (02) : 295 - 312
  • [32] Multi-level feature representations for video semantic concept detection
    Li, Haojie
    Liu, Lijuan
    Sun, Fuming
    Bao, Yu
    Liu, Chenxin
    NEUROCOMPUTING, 2016, 172 : 64 - 70
  • [33] Audio classification method based on machine learning
    Rong, Feng
    2016 INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION, BIG DATA & SMART CITY (ICITBS), 2017, : 81 - 84
  • [34] Audio Songs Classification Based on Music Patterns
    Sharma, Rahul
    Murthy, Y. V. Srinivasa
    Koolagudi, Shashidhar G.
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION TECHNOLOGIES, IC3T 2015, VOL 3, 2016, 381 : 157 - 166
  • [35] Hierarchical VS Non-hierarchical Audio Indexation and Classification for Video Genres
    Dammak, Nouha
    BenAyed, Yassine
    TENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2017), 2018, 10696
  • [36] An audio-based anger detection algorithm using a hybrid artificial neural network and fuzzy logic model
    Surana, Arihant
    Rathod, Manish
    Gite, Shilpa
    Patil, Shruti
    Kotecha, Ketan
    Selvachandran, Ganeshsree
    Quek, Shio Gai
    Abraham, Ajith
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (13) : 38909 - 38929
  • [37] An audio-based anger detection algorithm using a hybrid artificial neural network and fuzzy logic model
    Arihant Surana
    Manish Rathod
    Shilpa Gite
    Shruti Patil
    Ketan Kotecha
    Ganeshsree Selvachandran
    Shio Gai Quek
    Ajith Abraham
    Multimedia Tools and Applications, 2024, 83 : 38909 - 38929
  • [38] An Audio Classification Approach Based on Machine Learning
    Dan, Wu
    2019 INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION, BIG DATA & SMART CITY (ICITBS), 2019, : 626 - 629
  • [39] AUDIO CLASSIFICATION BASED ON WEAKLY LABELED DATA
    Cheng, Chieh-Feng
    Anderson, David V.
    Davenport, Mark A.
    Rashidi, Abbas
    2018 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2018, : 568 - 572
  • [40] SPORTS AUDIO CLASSIFICATION BASED ON MFCC AND GMM
    Liu Jiqing
    Dong Yuan
    Huang Jun
    Zhao Xianyu
    Wang Haila
    PROCEEDINGS OF 2009 2ND IEEE INTERNATIONAL CONFERENCE ON BROADBAND NETWORK & MULTIMEDIA TECHNOLOGY, 2009, : 482 - +