LONG-TERM CONVERSATION ANALYSIS: PRIVACY-UTILITY TRADE-OFF UNDER NOISE AND REVERBERATION

Times cited: 0
Authors
Pohlhausen, Jule [1,2]
Nespoli, Francesco [3,4]
Bitzer, Joerg [1,5]
Affiliations
[1] Jade Univ Appl Sci, Inst Hearing Technol & Audiol, Oldenburg, Germany
[2] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, Oldenburg, Germany
[3] Microsoft, London, England
[4] Imperial Coll, Dept Elect & Elect Engn, London, England
[5] Fraunhofer IDMT Dept HSA, Oldenburg, Germany
Source
2024 18TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT, IWAENC 2024 | 2024
Keywords
privacy; conversation analysis; speech recognition; speaker recognition; voice activity detection; speaker diarization
DOI
10.1109/IWAENC61483.2024.10694640
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recordings in everyday life require privacy preservation of the speech content and speaker identity. This contribution explores the influence of noise and reverberation on the trade-off between privacy and utility for low-cost privacy-preserving methods feasible for edge computing. These methods comprise spectral and temporal smoothing, speaker anonymization using the McAdams coefficient, sampling at a very low sampling rate, and combinations of these. Privacy is assessed by automatic speech and speaker recognition, while utility is assessed by voice activity detection and speaker diarization. Overall, our evaluation shows that additional noise degrades the performance of all models more than reverberation. This degradation corresponds to enhanced speech privacy, while utility is less deteriorated for some methods.
Pages: 404-408
Page count: 5