Combining Data Augmentations for CNN-Based Voice Command Recognition

Cited by: 8

Authors:

Azarang, Arian [1]

Hansen, John [1]

Kehtarnavaz, Nasser [1]

Affiliations:

[1] Univ Texas Dallas, Dept Elect & Comp Engn, Richardson, TX 75080 USA

Source:

2019 12TH INTERNATIONAL CONFERENCE ON HUMAN SYSTEM INTERACTION (HSI) | 2019

Keywords:

Combining data augmentation methods for voice command recognition; CNN-based voice command recognition; voice command human interaction systems; CONVOLUTIONAL NEURAL-NETWORKS;

DOI:

10.1109/hsi47298.2019.8942638

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper combines two data augmentation methods, speed perturbation and room impulse response reverberation, to improve the generalization capability of convolutional neural networks used for voice command recognition. Speed perturbation generates voice command variations caused by the shorter or longer durations with which different speakers utter commands. Room impulse response reverberation generates voice command variations caused by reflected sound paths. The combination of these two augmentation methods is examined on a public domain dataset of voice commands. Experimental results, measured by word error rate, indicate that combining these data augmentation methods improves voice command recognition relative to using either method individually.
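The two augmentations named in the abstract can be illustrated with a minimal NumPy sketch. This record does not give the paper's implementation, perturbation factors, or impulse responses, so everything below is an assumption made for illustration: the function names (`speed_perturb`, `synthetic_rir`, `reverberate`, `augment`) are hypothetical, speed perturbation is approximated by linear-interpolation resampling (no anti-aliasing filter), and the room impulse response is a toy exponentially decaying noise burst rather than a measured or simulated room response.

```python
import numpy as np


def speed_perturb(signal, factor):
    """Resample so the clip plays `factor` times faster (factor > 1 shortens it)."""
    n_out = int(round(len(signal) / factor))
    # Positions in the original signal that each output sample reads from.
    src_positions = np.linspace(0.0, len(signal) - 1, num=n_out)
    return np.interp(src_positions, np.arange(len(signal)), signal)


def synthetic_rir(sample_rate, rt60=0.3, seed=0):
    """Toy room impulse response: white noise with an exponential decay (assumption)."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * sample_rate)
    t = np.arange(n) / sample_rate
    # Amplitude envelope drops by 60 dB at t = rt60, since exp(-6.908) is about 1e-3.
    decay = np.exp(-6.908 * t / rt60)
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct sound path
    return rir / np.max(np.abs(rir))


def reverberate(signal, rir):
    """Convolve with the impulse response, keep the original length, peak-normalize."""
    wet = np.convolve(signal, rir)[: len(signal)]
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet


def augment(signal, sample_rate, factor, rt60):
    """Apply speed perturbation first, then reverberation."""
    return reverberate(speed_perturb(signal, factor), synthetic_rir(sample_rate, rt60))


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    clip = np.sin(2 * np.pi * 440.0 * t)  # 1 s test tone standing in for a command
    for factor in (0.9, 1.0, 1.1):  # illustrative factors, not taken from the paper
        out = augment(clip, sr, factor, rt60=0.3)
        print(factor, len(out))
```

In this sketch the perturbation is applied before the reverberation, so the simulated room response is not itself stretched or compressed; whether the paper uses this ordering is not stated in the abstract.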


Pages: 17-21

Page count: 5
