Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks

Cited by: 0
Authors
Lee, Seo-Hyun [1]
Lee, Young-Eun [1]
Kim, Soowon [2]
Ko, Byung-Kwan [2]
Kim, Jun-Young [2]
Affiliations
[1] Korea Univ, Dept Brain & Cognit Engn, Seoul, South Korea
[2] Korea Univ, Dept Artificial Intelligence, Seoul, South Korea
Source
2024 12th International Winter Conference on Brain-Computer Interface (BCI 2024), 2024
Keywords
brain-computer interface; deep neural networks; electroencephalogram; generative adversarial network; imagined speech; speech synthesis; communication; imagery
DOI
10.1109/BCI60775.2024.10480503
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Brain-to-speech technology is an interdisciplinary fusion of artificial intelligence, brain-computer interfaces, and speech synthesis. Intention decoding and speech synthesis based on neural representation learning connect neural activity directly to the means of human linguistic communication, which may greatly enhance the naturalness of communication. With recent discoveries in representation learning and advances in speech synthesis technology, direct translation of brain signals into speech has shown great promise. In particular, when deep generative models are used to generate speech from brain signals, the processed input features and neural speech embeddings given to the network play a significant role in overall performance. In this paper, we introduce current brain-to-speech technology and the possibility of speech synthesis from brain signals, which may ultimately facilitate innovation in nonverbal communication. We also perform a comprehensive analysis of the neural features and neural speech embeddings underlying neurophysiological activation during speech, which may play a significant role in speech synthesis work.
Pages: 4