Audio-visual event detection based on mining of semantic audio-visual labels

Cited by: 0
Authors
Goh, KS [1]
Miyahara, K [1]
Radhakrishnan, R [1]
Xiong, ZY [1]
Divakaran, A [1]
Affiliations
[1] Mitsubishi Elect Res Labs, Cambridge, MA USA
Source
STORAGE AND RETRIEVAL METHODS AND APPLICATIONS FOR MULTIMEDIA 2004 | 2004 / Vol. 5307
Keywords
commercial detection; unsupervised clustering; audio classification; motion activity;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Removing commercials from television programs is a much sought-after feature for a personal video recorder. In this paper, we employ an unsupervised clustering scheme (CM-Detect) to detect commercials in television programs. Each program is first divided into W_s-minute chunks, and we extract audio and visual features from each of these chunks. Next, we apply k-means clustering to assign each chunk a commercial/program label. In contrast to other methods, we do not make any assumptions regarding the program content. Thus, our method is highly content-adaptive and computationally inexpensive. Through empirical studies on various content, including American news, Japanese news, and sports programs, we demonstrate that our method is able to filter out most of the commercials without falsely removing the regular program.
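The chunk-labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' CM-Detect implementation. The two features per chunk (an audio-energy value and a motion-activity value), the toy numbers, the function names `kmeans_two` and `label_chunks`, and the rule that the smaller cluster is the commercial cluster are all assumptions made for illustration; the abstract does not state how clusters are mapped to labels.

```python
from math import dist


def kmeans_two(points, iters=100):
    """Plain k-means with k=2 on a list of equal-length feature tuples."""
    # Deterministic initialization (an illustrative choice): the first
    # point, and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: dist(p, c0))
    centroids = [c0, c1]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            0 if dist(p, centroids[0]) <= dist(p, centroids[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels


def label_chunks(features):
    """Map the two clusters to 'commercial' / 'program' labels."""
    labels = kmeans_two(features)
    # ASSUMPTION (not from the source): commercials take up less airtime
    # than the program, so the smaller cluster is called 'commercial'.
    minority = 1 if sum(labels) * 2 < len(labels) else 0
    return ["commercial" if lab == minority else "program" for lab in labels]


# Hypothetical per-chunk features: (audio_energy, motion_activity).
chunks = [
    (0.20, 0.10), (0.25, 0.15), (0.90, 0.80),
    (0.22, 0.12), (0.85, 0.90), (0.18, 0.10),
]
print(label_chunks(chunks))
```

Because the clustering is unsupervised, no labeled training data is involved; only the final naming of the two clusters needs a rule, and the minority rule used here is one hypothetical choice among several.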
Pages: 292-299
Page count: 8