Video handling with music and speech detection

Cited by: 46
Authors
Minami, K [1]
Akutsu, A [1]
Hamada, H [1]
Tonomura, Y [1]
Affiliations
[1] Nippon Telegraph & Tel Corp, Human Interface Labs, Kanagawa 2390847, Japan
DOI
10.1109/93.713301
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The audio-based approach to video indexing described here detects music and speech independently even when they occur simultaneously. The indexed video segments, when presented on the Video Sound Browser, let users randomly access the video. The Video in Time system provides different video condensation levels based on video structuring that can link the video segments and the director's intentions.
Pages: 17-25
Page count: 9