Audio-based shot classification for audiovisual indexing using PCA, MGD and fuzzy algorithm

Cited by: 14
Authors
Nitanda, Naoki [1]
Haseyama, Miki [1]
Affiliation
[1] Hokkaido Univ, Grad Sch Informat Sci & Technol, Sapporo, Hokkaido 0600814, Japan
Keywords
audiovisual classification; PCA; MGD; fuzzy algorithm
DOI
10.1093/ietfec/e90-a.8.1542
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
An audio-based shot classification method for audiovisual indexing is proposed in this paper. The proposed method mainly consists of two parts, an audio analysis part and a shot classification part. In the audio analysis part, the proposed method utilizes both principal component analysis (PCA) and Mahalanobis generalized distance (MGD). The effective features for the analysis can be automatically obtained by using PCA, and these features are analyzed based on MGD, which can take into account the correlations of the data set. Thus, accurate analysis results can be obtained by the combined use of PCA and MGD. In the shot classification part, the proposed method utilizes a fuzzy algorithm. By using the fuzzy algorithm, the mixing rate of the multiple audio sources can be roughly measured, and thereby accurate shot classification can be attained. Results of experiments performed by applying the proposed method to actual audiovisual materials are shown to verify the effectiveness of the proposed method.
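The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each audio frame has already been reduced (for example by PCA) to a two-component feature vector, and it swaps in a simple inverse-distance rule in place of the paper's fuzzy algorithm, whose membership functions are not given here. The class names `speech` and `music`, the toy training points, and all function names are hypothetical.

```python
import math

def mean_and_cov(points):
    """Sample mean and 2x2 covariance (sxx, sxy, syy) of (x, y) feature pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of x from a class; accounts for feature correlation."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # d^T S^{-1} d, with S^{-1} = (1/det) * [[syy, -sxy], [-sxy, sxx]]
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))  # clamp tiny negative rounding errors

def fuzzy_memberships(distances):
    """Map per-class distances to membership grades in [0, 1] that sum to 1.

    Assumption: an inverse-squared-distance weighting, chosen only for
    illustration; it is not the membership function used in the paper.
    """
    weights = {name: 1.0 / (1.0 + d * d) for name, d in distances.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical training features for two audio classes.
training = {
    "speech": [(0.0, 0.0), (1.0, 1.2), (2.0, 1.8), (3.0, 3.0)],
    "music": [(5.0, 1.0), (6.0, 0.2), (7.0, -0.2), (8.0, -1.0)],
}
models = {name: mean_and_cov(pts) for name, pts in training.items()}

def classify(x):
    """Return membership grades of feature vector x for every class."""
    distances = {name: mahalanobis(x, m, c) for name, (m, c) in models.items()}
    return fuzzy_memberships(distances)

grades = classify((2.0, 2.0))
```

Here `grades` holds one membership value per class. Reading those values as a rough mixing rate of the audio sources within a shot is the idea the abstract describes, though the exact rule in the paper may differ.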
Pages: 1542-1548
Number of pages: 7