Memristor-Based Progressive Hierarchical Conformer Architecture for Speech Emotion Recognition

Cited by: 0
Authors
Zhao, Tianhao [1]
Zhou, Yue [1, 2]
Hu, Xiaofang [1, 2]
Affiliations
[1] Southwest Univ, Coll Artificial Intelligence, Chongqing 400715, Peoples R China
[2] Southwest Univ, Chongqing Key Lab Brain inspired Comp & Intelligen, Chongqing 400715, Peoples R China
Source
INTERNATIONAL JOURNAL OF BIFURCATION AND CHAOS | 2024 / Vol. 34 / No. 09
Funding
National Natural Science Foundation of China;
Keywords
Memristor; self-attention mechanism; speech emotion recognition; conformer; circuit; CIRCUIT IMPLEMENTATION; FEATURES; SYSTEM;
DOI
10.1142/S0218127424501177
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
Speech Emotion Recognition (SER) is a challenging task characterized by the diversity and complexity of emotional expression. Owing to its powerful feature extraction capabilities, the Transformer Network (TN) shows advantages and potential in SER. However, the limited size of available datasets and the difficulty of decoupling emotional features restrain its performance and make it challenging to implement SER on edge devices. To address these issues, we present a Memristor-based Progressive Hierarchical Conformer Architecture (MPCA) and design a conformer submodule that leverages convolution to mitigate TN's limitations in SER. We propose attention-based feature decoupling, which employs hierarchical extraction to decouple speaker characteristics and retain the relevant components, thereby obtaining reliable emotional features. Furthermore, we propose a reconfigurable circuit implementation scheme for MPCA based on operator multiplexing, which yields flexible modules that can be dynamically adjusted to the resources of edge devices; the stability of the designed circuit is analyzed through simulation experiments in PSPICE. We show that the proposed MPCA achieves state-of-the-art performance in SER while significantly reducing system power consumption, offering a solution for SER implementation on edge devices.
Pages: 14