A novel attention model across heterogeneous features for stuttering event detection

Cited by: 3
Authors
Al-Banna, Abedal-Kareem [1, 3, 4]
Fang, Hui [1]
Edirisinghe, Eran [2]
Affiliations
[1] Loughborough Univ, Dept Comp Sci, Loughborough LE11 3TU, England
[2] Keele Univ, Sch Comp & Math, Newcastle ST5 5BG, England
[3] Dept Artificial Intelligence & Data Sci, Amman 11196, Jordan
[4] Univ Petra, Amman 11196, Jordan
Keywords
Stuttering; Stuttering events detection; Multi-feature attention model; Stuttering severity systems; Dysfluency; Deep learning; SPEECH DISFLUENCIES;
DOI
10.1016/j.eswa.2023.122967
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Stuttering is a prevalent speech disorder affecting millions worldwide. To provide an automatic and objective stuttering assessment tool, Stuttering Event Detection (SED) is under extensive investigation for advanced speech research and applications. Despite significant progress achieved by various machine learning and deep learning models, SED directly from the speech signal is still challenging because stuttered speech is heterogeneous and its events overlap. This paper presents a novel SED approach using multi-feature fusion and attention mechanisms. The model utilises multiple acoustic features, covering pitch, time-domain, frequency-domain, and automatic speech recognition features, to detect core stuttering behaviours more accurately and reliably. In addition, we exploit both spatial and temporal attention mechanisms as well as Bidirectional Long Short-Term Memory (BI-LSTM) modules to learn better representations and improve SED performance. The experimental evaluation and analysis demonstrate that our proposed model surpasses the state-of-the-art models on two popular stuttering datasets, by 4% and 3% in overall F1 score, respectively. These results indicate that the proposed method, supported by both multi-feature fusion and attention mechanisms, performs consistently across different stuttering event datasets.
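The abstract does not give the architecture in detail, so the following is only a minimal sketch of the two ideas it names: fusing several per-frame feature streams and pooling over time with attention. It is not the authors' model: the stream names, the toy values, the `query` vector, and the helper functions `fuse_features` and `temporal_attention_pool` are all hypothetical, and the BI-LSTM and spatial attention stages are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(streams):
    """Concatenate per-frame vectors from several feature streams.

    streams: list of streams, each a list of T frame vectors.
    Returns a list of T fused vectors.
    """
    n_frames = len(streams[0])
    if any(len(s) != n_frames for s in streams):
        raise ValueError("all feature streams must have the same number of frames")
    return [[x for s in streams for x in s[t]] for t in range(n_frames)]


def temporal_attention_pool(frames, query):
    """Weight each frame by softmax(dot(frame, query)) and sum over time."""
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


# Toy input (assumed values): 3 frames, a 1-D "pitch" stream and a 2-D "spectral" stream.
pitch = [[0.1], [0.9], [0.2]]
spectral = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused = fuse_features([pitch, spectral])
query = [1.0, 0.0, 1.0]  # hypothetical stand-in for a learned attention query
pooled, weights = temporal_attention_pool(fused, query)
```

In a trained model the query (or a small scoring network) would be learned, and the pooled vector would feed a classifier over stuttering event types; here it only shows how attention weights concentrate on the frames that score highest.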
Pages: 12