Audio Feature Extraction and Analysis for Scene Segmentation and Classification

Cited by: 0
Authors
Zhu Liu
Yao Wang
Tsuhan Chen
Affiliations
[1] Polytechnic University
[2] Carnegie Mellon University
Source
Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology | 1998 / Vol. 20
Keywords
Audio Signal; Audio Feature; Scene Change; Football Game; Audio Clip
DOI
Not available
Abstract
Understanding the scene content of a video sequence is very important for content-based indexing and retrieval of multimedia databases. Research in this area in the past several years has focused on the use of speech recognition and image analysis techniques. As a complementary effort to the prior work, we have focused on using the associated audio information (mainly the nonspeech portion) for video scene analysis. As an example, we consider the problem of discriminating five types of TV programs, namely commercials, basketball games, football games, news reports, and weather forecasts. A set of low-level audio features is proposed for characterizing the semantic content of short audio clips. The linear separability of different classes under the proposed feature space is examined using a clustering analysis. The effective features are identified by evaluating the intracluster and intercluster scattering matrices of the feature space. Using these features, a neural net classifier was successful in separating the above five types of TV programs. By evaluating the changes between the feature vectors of adjacent clips, we can also identify scene breaks in an audio sequence quite accurately. These results demonstrate the capability of the proposed audio features for characterizing the semantic content of an audio sequence.
Pages: 61-79
Page count: 18