LONG-TERM CONVERSATION ANALYSIS: PRIVACY-UTILITY TRADE-OFF UNDER NOISE AND REVERBERATION

Cited by: 0

Authors:

Pohlhausen, Jule [1,2]

Nespoli, Francesco [3,4]

Bitzer, Joerg [1,5]

Affiliations:

[1] Jade Univ Appl Sci, Inst Hearing Technol & Audiol, Oldenburg, Germany

[2] Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, Oldenburg, Germany

[3] Microsoft, London, England

[4] Imperial Coll, Dept Elect & Elect Engn, London, England

[5] Fraunhofer IDMT Dept HSA, Oldenburg, Germany

Source:

2024 18TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT, IWAENC 2024 | 2024

Keywords:

privacy; conversation analysis; speech recognition; speaker recognition; voice activity detection; speaker diarization

DOI:

10.1109/IWAENC61483.2024.10694640

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

Recordings in everyday life require privacy preservation of the speech content and speaker identity. This contribution explores the influence of noise and reverberation on the trade-off between privacy and utility for low-cost privacy-preserving methods feasible for edge computing. These methods comprise spectral and temporal smoothing, speaker anonymization using the McAdams coefficient, sampling with a very low sampling rate, and combinations of these. Privacy is assessed by automatic speech and speaker recognition, while utility is assessed by voice activity detection and speaker diarization. Overall, our evaluation shows that additive noise degrades the performance of all models more than reverberation does. This degradation corresponds to enhanced speech privacy, while utility deteriorates less for some methods.


Pages: 404-408

Page count: 5
