Wavoice: An mmWave-Assisted Noise-Resistant Speech Recognition System

Cited by: 2
Authors
Liu, Tiantian [1]
Wang, Chao [1]
Li, Zhengxiong [2]
Huang, Ming-Chun [3]
Xu, Wenyao [4]
Lin, Feng [1]
Affiliations
[1] Zhejiang Univ, Sch Cyber Sci & Technol, ZJU Hangzhou Global Sci & Technol Innovat Ctr, Hangzhou 310027, Peoples R China
[2] Univ Colorado Denver, Dept Comp Sci & Engn, Denver, CO USA
[3] Duke Kunshan Univ, Dept Data & Computat Sci, Suzhou 215316, Jiangsu, Peoples R China
[4] SUNY Buffalo, Dept Comp Sci & Engn, Buffalo, NY 14261 USA
Keywords
Multi-modal systems; mmWave sensing; speech recognition; biometrics; ENHANCEMENT;
DOI
10.1145/3597457
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As automatic speech recognition has evolved, deployment of the voice user interface (VUI) has expanded rapidly. Since the COVID-19 pandemic in particular, the VUI has gained more attention in online communication owing to its contactless nature. However, the VUI is difficult to apply in public settings because ambient noise degrades the received audio signals. In this article, we propose Wavoice, the first noise-resistant multi-modal speech recognition system that fuses two distinct voice-sensing modalities (i.e., millimeter-wave signals and audio signals from a microphone). One key contribution is a model of the inherent correlation between millimeter-wave and audio signals. Based on this correlation, Wavoice supports real-time noise-resistant voice activity detection and targeting of a user among multiple speakers. Additionally, we design two novel multi-modal fusion modules embedded in the neural network, leading to accurate speech recognition. Extensive experiments demonstrate the effectiveness of Wavoice under adverse conditions: the character recognition error rate stays below 1% within a range of 7 m. In terms of robustness and accuracy, Wavoice considerably outperforms existing audio-only speech recognition methods, achieving lower character error and word error rates.
Pages: 29