An audio-based anger detection algorithm using a hybrid artificial neural network and fuzzy logic model

Cited by: 0
Authors
Arihant Surana
Manish Rathod
Shilpa Gite
Shruti Patil
Ketan Kotecha
Ganeshsree Selvachandran
Shio Gai Quek
Ajith Abraham
Affiliations
[1] Symbiosis Institute of Technology, Symbiosis International (Deemed University)
[2] Symbiosis Centre for Applied Artificial Intelligence, School of Business
[3] Symbiosis International (Deemed University), Institute of Actuarial Science and Data Analytics
[4] Monash University Malaysia, School of Computer Science Engineering & Technology
[5] UCSI University
[6] Bennett University
Source
Multimedia Tools and Applications | 2024 / Vol. 83
Keywords
Audio Emotion Recognition; Variable Audio Sources; Audio Classification; ANN; Fuzzy Logic
DOI
Not available
Abstract
Audio Emotion Recognition (AER) is an important component of human emotion analysis, with or without accompanying visual cues. Such audio varies in parameters such as rhythm, tone, and pitch. Emotions, however, are highly complex; humans perceive the emotion conveyed in a voice almost instantly, an ability refined over thousands of years of human evolution. Artificial intelligence (AI)-enabled AER has attracted worldwide attention in the last few years and has gained importance among AI researchers in various fields. Its importance has grown especially since the start of the Covid-19 pandemic, when large-scale lockdowns and movement control orders around the world led to working from home, online schooling, and online learning on a mass scale. The audio quality on online platforms differs from device to device and depends on the quality or bandwidth of the Internet connection used. As the world recovers from the Covid-19 pandemic, an algorithm for anger detection is necessary for maintaining public security and general safety, and can also help in the early detection of mental health or anger management issues. This is because an angry person in public can pose a threat to the people nearby and a risk of damage to public property. Detecting anger through voices in public places therefore serves as a first line of defense against outbreaks of public nuisance or even violent crime. Moreover, the more pronounced a person's anger, the more attention public security forces must give to that person.
This study uses a collection of 364 audio files from the CREMA-D dataset as input, recorded by 91 actors, each expressing three degrees of anger and a neutral emotion. All audio files in this collection use the script “It’s eleven o’clock”. A hybrid algorithm combining an artificial neural network (ANN) and fuzzy logic is introduced, along with a dedicated preprocessing technique for handling audio files. A comprehensive discussion and analysis of the results is presented, in which the proposed algorithm is compared with other audio classification algorithms in the literature, many of which merely deploy a ready-made, general-purpose neural network-based algorithm. This brute-force approach of relying on an overly complicated computational structure proves inefficient, as the number of nodes involved in the computation far exceeds the number of preprocessed inputs. In addition, the descriptions of preprocessing procedures for audio classification in recent works are found to be unclear. Finally, the limitations of the experimental setup, suggestions for improving it, and the potential applications of the findings are discussed in the conclusion of this study.
Pages: 38909-38929
Page count: 20