PRIVACY SENSITIVE SPEECH ANALYSIS USING FEDERATED LEARNING TO ASSESS DEPRESSION

Cited by: 13

Authors:

Suhas, B. N. [1]

Abdullah, Saeed [1]

Affiliations:

[1] Penn State Univ, Coll Informat Sci & Technol, University Pk, PA 16802 USA

Source:

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2022

Keywords:

speech classification; depression; privacy; paralinguistics; mHealth

DOI:

10.1109/ICASSP43922.2022.9746827

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Recent studies have used speech signals to assess depression. However, speech features can lead to serious privacy concerns. To address these concerns, prior work has used privacy-preserving speech features. However, using a subset of features can lead to information loss and, consequently, non-optimal model performance. Furthermore, prior work relies on a centralized approach to support continuous model updates, posing privacy risks. This paper proposes to use Federated Learning (FL) to enable decentralized, privacy-preserving speech analysis to assess depression. Using an existing dataset (DAIC-WOZ), we show that FL models enable a robust assessment of depression with only 4-6% accuracy loss compared to a centralized approach. These models also outperform prior work using the same dataset. Furthermore, the FL models have short inference latency and small memory footprints while being energy-efficient. These models, thus, can be deployed on mobile devices for real-time, continuous, and privacy-preserving depression assessment at scale.

Pages: 6272-6276

Page count: 5
