An Interpretable Deep Learning Model for Speech Activity Detection Using Electrocorticographic Signals

Cited by: 6
Authors
Stuart, Morgan [1 ]
Lesaja, Srdjan [2 ]
Shih, Jerry J. [3 ]
Schultz, Tanja [4 ]
Manic, Milos [1 ]
Krusienski, Dean J. [2 ]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
[2] Virginia Commonwealth Univ, Dept Biomed Engn, Richmond, VA 23284 USA
[3] UCSD Hlth, Neurol Dept, San Diego, CA 92093 USA
[4] Univ Bremen, Cognit Syst Lab, D-28359 Bremen, Germany
Keywords
Electrodes; Brain modeling; Band-pass filters; Decoding; Convolution; Computer architecture; Deep learning; Brain-Computer Interfaces (BCIs); electroencephalography
DOI
10.1109/TNSRE.2022.3207624
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline Code
0831
Abstract
Numerous state-of-the-art solutions for neural speech decoding and synthesis incorporate deep learning into the processing pipeline. These models are typically opaque and can require significant computational resources for training and execution. This work presents a deep learning architecture that learns input bandpass filters capturing task-relevant spectral features directly from the data. Incorporating such explainable feature extraction into the model furthers the goal of creating end-to-end architectures that enable automated subject-specific parameter tuning while yielding interpretable results. The model is evaluated on intracranial brain recordings collected during a speech task. Operating on raw, unprocessed time samples, the model detects the presence of speech at every time sample in a causal manner, making it suitable for online application. Model performance is comparable or superior to that of existing approaches requiring substantial signal preprocessing, and the learned frequency bands converge to ranges supported by previous studies.
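The two ingredients the abstract highlights, bandpass filters parameterized by their cutoff frequencies (so they can be learned and then read off for interpretation) and strictly causal filtering for online use, can be sketched in NumPy. This is an illustrative reconstruction, not the paper's implementation: the function names, kernel length, window choice, and the 70-170 Hz example band are all assumptions.

```python
import numpy as np

def sinc_bandpass_kernel(f_low, f_high, kernel_size, fs):
    """Bandpass FIR kernel built as the difference of two windowed-sinc
    low-pass filters with cutoffs f_high and f_low (Hz). In a trainable
    model, the cutoffs would be the learnable parameters, so the fitted
    band can be inspected directly (illustrative sketch)."""
    t = (np.arange(kernel_size) - (kernel_size - 1) / 2) / fs
    # np.sinc(x) = sin(pi*x)/(pi*x): np.sinc(2*f*t) is an ideal low-pass
    # impulse response with cutoff f, up to amplitude scaling.
    lp_high = (2 * f_high / fs) * np.sinc(2 * f_high * t)
    lp_low = (2 * f_low / fs) * np.sinc(2 * f_low * t)
    return (lp_high - lp_low) * np.hamming(kernel_size)

def causal_filter(x, kernel):
    """Causal convolution: zero-pad only the past, so each output sample
    depends on current and previous inputs alone, as required for
    per-sample online detection."""
    pad = len(kernel) - 1
    xp = np.concatenate([np.zeros(pad), x])
    return np.convolve(xp, kernel, mode="valid")

# Example: isolate a high-gamma-like band (70-170 Hz assumed for
# illustration) from a toy two-tone signal sampled at 1 kHz.
fs = 1000.0
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 120 * t)
k = sinc_bandpass_kernel(70.0, 170.0, 129, fs)
y = causal_filter(x, k)  # same length as x; 10 Hz tone is suppressed
```

After fitting, the learned `f_low`/`f_high` values can be reported per electrode, which is the interpretability property the abstract emphasizes.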
Pages: 2783-2792
Page count: 10