FRAMEWORK FOR EVALUATION OF SOUND EVENT DETECTION IN WEB VIDEOS

Cited by: 0
Authors
Badlani, Rohan [2]
Shah, Ankit [1]
Elizalde, Benjamin [1]
Kumar, Anurag [1]
Raj, Bhiksha [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
[2] BITS Pilani, Dept Comp Sci, Hyderabad, Telangana, India
Source
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2018
Keywords
Sound Event Detection; Convolutional Neural Network; Large-Scale audio event detection; Video Content Analysis;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Web videos are the largest source of sound events. Most videos lack segment-level sound event labels; however, a significant number of them do respond to text queries, through matches that search engines find in the video metadata. In this paper, we explore the extent to which a search query can be used as the true label for detecting sound events in videos. We present a framework for large-scale sound event recognition on web videos. The framework crawls videos using search queries corresponding to 78 sound event labels drawn from three datasets. The datasets are used to train three classifiers, and we obtain predictions on 3.7 million web video segments. We evaluate performance using the search query as the true label and compare it with human labeling. Both types of ground truth yielded close performance, to within 10%, and a similar performance trend as the number of evaluated segments increased. Hence, our experiments show the potential of using the search query as a preliminary true label for sound event recognition in web videos.
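As an illustration of the comparison described in the abstract, the following minimal sketch scores the same classifier predictions against two kinds of ground truth: the search query used to crawl each segment and a human annotation. The abstract does not name the metric, so the precision-over-the-first-k-segments measure, the function name `precision_at_k`, and all labels and data below are assumptions made for illustration only; they are not the authors' implementation or results.

```python
from typing import Sequence


def precision_at_k(predicted: Sequence[str], reference: Sequence[str], k: int) -> float:
    """Fraction of the first k segments whose predicted label equals the reference label.

    Segments are assumed to be ordered already (for example by classifier
    confidence); that ordering is an assumption of this sketch.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if k <= 0 or k > len(predicted):
        raise ValueError("k must be between 1 and the number of segments")
    hits = sum(p == r for p, r in zip(predicted[:k], reference[:k]))
    return hits / k


# Hypothetical toy data: six segments crawled with the query "dog bark".
predicted = ["dog_bark", "dog_bark", "siren", "dog_bark", "siren", "dog_bark"]
# Ground truth 1: the search query itself is taken as the label of every segment.
query_labels = ["dog_bark"] * 6
# Ground truth 2: made-up human annotations for the same segments.
human_labels = ["dog_bark", "dog_bark", "siren", "dog_bark", "dog_bark", "siren"]

# Compare the two kinds of ground truth as more segments are evaluated.
for k in (2, 4, 6):
    p_query = precision_at_k(predicted, query_labels, k)
    p_human = precision_at_k(predicted, human_labels, k)
    print(f"k={k}: query={p_query:.3f} human={p_human:.3f} gap={abs(p_query - p_human):.3f}")
```

Tracking the gap between the two scores as k grows mirrors the kind of trend comparison the abstract reports, under the stated assumptions.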
Pages: 3096 - 3100
Page count: 5
Related papers
50 in total
  • [1] A FRAMEWORK FOR THE ROBUST EVALUATION OF SOUND EVENT DETECTION
    Bilen, Cagdas
    Ferroni, Giacomo
    Tuveri, Francesco
    Azcarreta, Juan
    Krstulovic, Sacha
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 61 - 65
  • [2] Hierarchic ConvNets Framework for Rare Sound Event Detection
    Vesperini, Fabio
    Droghini, Diego
    Principi, Emanuele
    Gabrielli, Leonardo
    Squartini, Stefano
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 1497 - 1501
  • [3] MULTIMODAL EVALUATION METHOD FOR SOUND EVENT DETECTION
    Modaresi, Seyed M. R.
    Osmani, Aomar
    Razzazi, Mohammadreza
    Chibani, Abdelghani
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1026 - 1030
  • [4] THRESHOLD INDEPENDENT EVALUATION OF SOUND EVENT DETECTION SCORES
    Ebbers, Janek
    Haeb-Umbach, Reinhold
    Serizel, Romain
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 1021 - 1025
  • [5] A MUTUAL LEARNING FRAMEWORK FOR FEW-SHOT SOUND EVENT DETECTION
    Yang, Dongchao
    Wang, Helin
    Zou, Yuexian
    Ye, Zhongjie
    Wang, Wenwu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 811 - 815
  • [6] A safety-oriented framework for sound event detection in driving scenarios
    Castorena, Carlos
    Cobos, Maximo
    Lopez-Ballester, Jesus
    Ferri, Francesc J.
    APPLIED ACOUSTICS, 2024, 215
  • [7] Novel sound event and sound activity detection framework based on intrinsic mode functions and deep learning
    Vahid Hajihashemi
    Abdorreza Alavigharahbagh
    J. J. M. Machado
    João Manuel R. S. Tavares
    Multimedia Tools and Applications, 2025, 84 (14) : 13515 - 13543
  • [8] Robust Sound Event Detection in Continuous Audio Environments
    Zhang, Haomin
    McLoughlin, Ian
    Song, Yan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 2977 - 2981
  • [9] Towards a system for automatic traffic sound event detection
    Chavdar, Marko
    Gerazov, Branislav
    Ivanovski, Zoran
    Kartalov, Tomislav
    2020 28TH TELECOMMUNICATIONS FORUM (TELFOR), 2020, : 209 - 212
  • [10] Event Specific Attention for Polyphonic Sound Event Detection
    Sundar, Harshavardhan
    Sun, Ming
    Wang, Chao
    INTERSPEECH 2021, 2021, : 566 - 570