Audio-Based Semantic Concept Classification for Consumer Video

Cited by: 57

Authors:

Lee, Keansub [1]

Ellis, Daniel P. W. [1]

Affiliations:

[1] Columbia Univ, Dept Elect Engn, Lab Recognit & Org Speech & Audio LabROSA, New York, NY 10027 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2010 / Vol. 18 / No. 06

Funding:

National Science Foundation (USA);

Keywords:

Audio classification; consumer video classification; semantic concept detection; soundtrack analysis; RETRIEVAL; MUSIC; SEGMENTATION;

DOI:

10.1109/TASL.2009.2034776

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

This paper presents a novel method for automatically classifying consumer video clips based on their soundtracks. We use a set of 25 overlapping semantic classes, chosen for their usefulness to users, viability of automatic detection and of annotator labeling, and sufficiency of representation in available video collections. A set of 1873 videos from real users has been annotated with these concepts. Starting with a basic representation of each video clip as a sequence of mel-frequency cepstral coefficient (MFCC) frames, we experiment with three clip-level representations: single Gaussian modeling, Gaussian mixture modeling, and probabilistic latent semantic analysis of a Gaussian component histogram. Using such summary features, we produce support vector machine (SVM) classifiers based on the Kullback-Leibler, Bhattacharyya, or Mahalanobis distance measures. Quantitative evaluation shows that our approaches are effective for detecting interesting concepts in a large collection of real-world consumer video clips.
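As a rough illustration of the simplest pipeline named in the abstract (single Gaussian modeling of a clip's frames, followed by a Kullback-Leibler based comparison that could feed an SVM), the sketch below uses NumPy only. It is not the authors' implementation: the random 13-dimensional frames stand in for real MFCCs, and the helper names (`fit_gaussian`, `symmetric_kl`, `kernel_matrix`), the exponentiated-distance kernel form, and the `gamma` value are assumptions made for this example.

```python
import numpy as np

def fit_gaussian(frames):
    """Summarize a clip's (n_frames, n_dims) feature matrix as (mean, covariance)."""
    mu = frames.mean(axis=0)
    # Small diagonal term keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(frames.shape[1])
    return mu, cov

def kl_gaussian(p, q):
    """Closed-form KL(p || q) between two multivariate Gaussians given as (mean, cov)."""
    mu_p, cov_p = p
    mu_q, cov_q = q
    d = mu_p.shape[0]
    inv_q = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(inv_q @ cov_p) + diff @ inv_q @ diff - d
                  + logdet_q - logdet_p)

def symmetric_kl(p, q):
    """Symmetrized KL divergence, usable as a clip-to-clip distance."""
    return kl_gaussian(p, q) + kl_gaussian(q, p)

def kernel_matrix(models, gamma=0.01):
    """Turn pairwise distances into similarities via exp(-gamma * distance)."""
    n = len(models)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = np.exp(-gamma * symmetric_kl(models[i], models[j]))
    return K

# Synthetic stand-ins for MFCC frames: three "clips" of 200 frames x 13 dims.
rng = np.random.default_rng(0)
clips = [rng.normal(loc=shift, size=(200, 13)) for shift in (0.0, 0.1, 2.0)]
models = [fit_gaussian(c) for c in clips]
K = kernel_matrix(models)
```

A matrix like `K` could be passed to an SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Note that an exponentiated symmetric KL divergence is not guaranteed to be positive semidefinite, so this choice is a practical heuristic rather than a strict kernel.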

Citation

Pages: 1406-1416

Page count: 11

50 records in total

[1] Audio-based context recognition
Eronen, AJ
Peltonen, VT
Tuomi, JT
Klapuri, AP
Fagerlund, S
Sorsa, T
Lorho, G
Huopaniemi, J
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (01) : 321 - 329
[2] Audio-Visual Atoms for Generic Video Concept Classification
Jiang, Wei
Cotton, Courtenay
Chang, Shih-Fu
Ellis, Dan
Loui, Alexander C.
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2010, 6 (03)
[3] Semantic concept detection for video based on extreme learning machine
Lu, Bo
Wang, Guoren
Yuan, Ye
Han, Dong
NEUROCOMPUTING, 2013, 102 : 176 - 183
[4] Audio-based description and structuring of videos
Harb H.
Chen L.
International Journal on Digital Libraries, 2006, 6 (1) : 70 - 81
[5] Audio-based queries for video retrieval over Java enabled mobile devices
Ahmad, I
Cheikh, FA
Kiranyaz, S
Gabbouj, M
MULTIMEDIA ON MOBILE DEVICES II, 2006, 6074
[6] Audio-Based Hate Speech Classification from Online Short-Form Videos
Ibanez, Michael
Sapinit, Ranz
Reyes, Lloyd Antonie
Hussien, Mohammed
Imperial, Joseph Marvin
Rodriguez, Ramon
2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 72 - 77
[7] "Are You Playing a Shooter Again?!" Deep Representation Learning for Audio-Based Video Game Genre Recognition
Amiriparian, Shahin
Cummins, Nicholas
Gerczuk, Maurice
Pugachevskiy, Sergey
Ottl, Sandra
Schuller, Bjorn
IEEE TRANSACTIONS ON GAMES, 2020, 12 (02) : 145 - 154
[8] EXPLORING AUDIO SEMANTIC CONCEPTS FOR EVENT-BASED VIDEO RETRIEVAL
Wang, Yipei
Rawat, Shourabh
Metze, Florian
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
[9] Audio-Video based Segmentation and Classification Using SVM
Subashini, K.
Palanivel, S.
Ramaligam, V.
2012 THIRD INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION & NETWORKING TECHNOLOGIES (ICCCNT), 2012,
[10] Tensor semantic model for an audio classification system
Xing Ling
Ma Qiang
Zhu Min
SCIENCE CHINA-INFORMATION SCIENCES, 2013, 56 (06) : 1 - 9
