SINGLE-CHANNEL SPEAKER DIARIZATION BASED ON SPATIAL FEATURES

Cited by: 0
Authors
Hu, Mathieu [1]
Parada, Pablo Peso [2]
Sharma, Dushyant [2]
Doclo, Simon [3, 4]
van Waterschoot, Toon [5]
Brookes, Mike [1]
Naylor, Patrick A. [1]
Affiliations
[1] Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, London SW7 2AZ, England
[2] Nuance Commun Inc, Voicemail To Text Res, Marlow, Bucks, England
[3] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26111 Oldenburg, Germany
[4] Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4All, D-26111 Oldenburg, Germany
[5] Katholieke Univ Leuven, Dept Elect Engn ESAT STADIUS ETC, Leuven, Belgium
Source
2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2015
Keywords
Speaker diarization; direct-to-reverberant ratio; spatial acoustic features
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Speaker diarization has gained much importance over the past five years in helping overcome key challenges faced by automatic meeting transcription systems. Current state-of-the-art algorithms can only utilize spatial information when multi-microphone recordings are available. In this paper, we propose the novel use of reverberation as a source of spatial information obtained from single-channel recordings to perform speaker diarization. The proposed system is shown to reduce speaker classification errors by 34% when compared with current MFCC-based single-channel systems.
Pages: 5
Related papers
50 in total
  • [1] MULTI-CHANNEL SPEAKER DIARIZATION USING SPATIAL FEATURES FOR MEETINGS
    Zheng, Naijun
    Li, Na
    Yu, JianWei
    Weng, Chao
    Su, Dan
    Liu, XunYing
    Meng, Helen
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7337 - 7341
  • [2] Simultaneous Speech Detection With Spatial Features for Speaker Diarization
    Zelenak, Martin
    Segura, Carlos
    Luque, Jordi
    Hernando, Javier
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (02): : 436 - 446
  • [3] Speaker Separation Using Visual Speech Features and Single-channel Audio
    Khan, Faheem
    Milner, Ben
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 3263 - 3267
  • [4] FILTERBANK SLOPE BASED FEATURES FOR SPEAKER DIARIZATION
    Madikeri, Srikanth
    Bourlard, Herve
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [5] Speaker Diarization Based on Intensity Channel Contribution
    Barra-Chicote, Roberto
    Manuel Pardo, Jose
    Ferreiros, Javier
    Manuel Montero, Juan
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): : 754 - 761
  • [6] Overlap Detection for Speaker Diarization by Fusing Spectral and Spatial Features
    Zelenak, Martin
    Segura, Carlos
    Hernando, Javier
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2302 - 2305
  • [7] Speaker Verification-Based Evaluation of Single-Channel Speech Separation
    Maciejewski, Matthew
    Watanabe, Shinji
    Khudanpur, Sanjeev
    INTERSPEECH 2021, 2021, : 3520 - 3524
  • [8] Channel and channel subband selection for speaker diarization
    Ahmed, Ahmed Isam
    Chiverton, John P.
    Ndzi, David L.
    Al-Faris, Mahmoud M.
    COMPUTER SPEECH AND LANGUAGE, 2022, 75
  • [9] Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement
    Taherian, Hassan
    Wang, Zhong-Qiu
    Chang, Jorge
    Wang, DeLiang
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1293 - 1302