Artificial intelligence empowered voice generation for amyotrophic lateral sclerosis patients

Cited by: 0
Authors
Regondi, Stefano [1, 4]
Donvito, Giordana [1 ]
Frontoni, Emanuele [1, 2]
Kostovic, Milutin [1 ]
Minazzi, Fabio [3 ]
Bratieres, Sebastien [3 ]
Filosto, Massimiliano [4, 5]
Pugliese, Raffaele [1 ]
Affiliations
[1] ASST GOM Niguarda Ca Granda Hosp, NeMO Lab, Milan, Italy
[2] Univ Macerata, SPOCRI Dept, VRAI Lab, Macerata, Italy
[3] Translated, Rome, Italy
[4] Fdn Serena Onlus, NEuroMuscular Omnictr NEMO, Milan, Italy
[5] Univ Brescia, Dept Clin & Expt Sci, Brescia, Italy
Source
SCIENTIFIC REPORTS | 2025 / Vol. 15 / Issue 01
Keywords
Amyotrophic lateral sclerosis; Artificial intelligence; HiFi-GAN; Voice banking; Synthetic voice; Voice generation; Alternative augmentative communication
DOI
10.1038/s41598-024-84728-y
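Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract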
Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease that can result in a progressive loss of speech due to bulbar dysfunction, which can have a significant negative impact on the patient's mental well-being. Alternative Augmentative Communication (AAC) strategies based on synthetic voices have been shown to assist patients in maintaining communication and improving their Quality of Life (QoL). However, such synthetic voices are often perceived as impersonal and fail to capture the unique voice and identity of the patient. To tackle this issue, combining voice banking (VB) and artificial intelligence (AI) has emerged as a more natural communication strategy, enabling individuals to preserve their voice for use with AAC devices as needed. This involves recording speech samples to generate a synthetic voice closely resembling the individual's own. Despite the increasing interest in VB, there is a lack of clear strategies for its effective implementation in rapidly progressing diseases like ALS. Additionally, the perceptual quality of VB on patients with preserved speech, especially when offered early in the disease, remains poorly understood. In light of these challenges, this study aims to assess the effectiveness and the perceptual impact of AI-generated voices on ALS patients with preserved speech, utilizing a personalized voice synthesis system based on machine learning. The AI-generated patient-specific voice is achieved through voice recording, followed by fine-tuning using a Generative Adversarial Network for Efficient and High Fidelity Speech Synthesis (HiFi-GAN), resulting in a model capable of producing speech highly similar to the patient's own voice, with exceptional expressive and audio quality. By addressing these aspects, this study intends to offer valuable insights into the potential benefits and challenges of combining VB with AI voices to enhance communication support for ALS patients.
Pages: 12