Combining Data Augmentations for CNN-Based Voice Command Recognition

Cited by: 8
Authors
Azarang, Arian [1]
Hansen, John [1]
Kehtarnavaz, Nasser [1]
Affiliations
[1] Univ Texas Dallas, Dept Elect & Comp Engn, Richardson, TX 75080 USA
Source
2019 12TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI) | 2019
Keywords
Combining data augmentation methods for voice command recognition; CNN-based voice command recognition; voice command human interaction systems; CONVOLUTIONAL NEURAL-NETWORKS
DOI
10.1109/hsi47298.2019.8942638
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper combines two data augmentation methods, speed perturbation and room impulse response reverberation, to improve the generalization capability of convolutional neural networks used for voice command recognition. Speed perturbation generates the voice command variations that arise when different speakers utter a command over shorter or longer durations. Room impulse response reverberation generates the voice command variations caused by reflected sound paths. The combination of the two methods is evaluated on a public domain dataset of voice commands. Experimental results, measured by word error rate, indicate that combining the two augmentation methods improves voice command recognition relative to using either method alone.
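The two augmentations named in the abstract can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract gives no perturbation factors, impulse responses, or order of application, so the factors 0.9 and 1.1, the synthetic exponentially decaying impulse response, and all function names (`speed_perturb`, `synthetic_rir`, `reverberate`, `augment`) are hypothetical.

```python
import numpy as np


def speed_perturb(signal, factor):
    """Resample so the clip plays `factor` times faster (factor > 1 shortens it)."""
    n_out = int(round(len(signal) / factor))
    # Positions in the original clip that each output sample reads from.
    src_positions = np.linspace(0.0, len(signal) - 1, num=n_out)
    return np.interp(src_positions, np.arange(len(signal)), signal)


def synthetic_rir(sample_rate, rt60=0.3, seed=0):
    """Toy room impulse response (assumption): direct path plus decaying noise."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * sample_rate)
    t = np.arange(n) / sample_rate
    # Amplitude drops by 60 dB (a factor of 10**-3) over rt60 seconds.
    decay = 10.0 ** (-3.0 * t / rt60)
    rir = 0.1 * rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct sound path
    return rir


def reverberate(signal, rir):
    """Convolve with the impulse response, keep the original length, peak-normalize."""
    wet = np.convolve(signal, rir)[: len(signal)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


def augment(signal, sample_rate, factors=(0.9, 1.0, 1.1)):
    """Return a speed-only and a speed-plus-reverb copy for each factor."""
    rir = synthetic_rir(sample_rate)
    clips = []
    for factor in factors:
        perturbed = speed_perturb(signal, factor)
        clips.append(perturbed)                    # speed perturbation only
        clips.append(reverberate(perturbed, rir))  # both augmentations combined
    return clips
```

Calling `augment(clip, 16000)` on a one-second clip would yield six training variants. In practice a measured room impulse response would likely replace the synthetic one used here.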
Pages: 17-21
Page count: 5