SARdB: A dataset for audio scene source counting and analysis

Cited by: 5
Authors
Nigro, Michael [1]
Krishnan, Sridhar [1]
Affiliation
[1] Ryerson Univ, Dept Elect Comp & Biomed Engn, 350 Victoria St, Toronto, ON M5B 2K3, Canada
Keywords
Source counting; Speaker count estimation; Audio scene analysis; Speaker diarization; Sound event detection
DOI
10.1016/j.apacoust.2021.107985
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Determining the number of sources in a signal is an important consideration for many audio scene analysis tasks. However, source counting is not as actively researched as many other audio tasks. This work creates SARdB, from Ryerson University's Signal Analysis Research (SAR) group: a multimodal audio-text dataset intended to promote research on source counting and audio scene analysis. SARdB consists of 10-second acoustic scenes containing between 1 and 4 speakers and 0 to 5 sound events, for a total of approximately 21 hours of data. We demonstrate the utility of performing source counting and how it can benefit audio scene analysis tasks in general. Crown Copyright (C) 2021 Published by Elsevier Ltd. All rights reserved.
Pages: 5