A multistage algorithm for spotting new words in speech

Cited by: 21
Authors
Dharanipragada, S [1]
Roukos, S [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2002, Vol. 10, No. 8
Keywords
audio indexing; fast match; keyword spotting; multimedia browsing; new-word detection;
DOI
10.1109/TSA.2002.804543
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
In this paper, we present a fast, vocabulary-independent algorithm for spotting words in speech. The algorithm consists of a phone-ngram representation (indexing) stage and a coarse-to-detailed search stage for spotting a word/phone sequence in speech. The phone-ngram representation stage provides a phoneme-level representation of the speech that can be searched efficiently. We present a novel method for phoneme recognition that uses a vocabulary prefix tree to guide the creation of the phone-ngram index. The coarse search, consisting of phone-ngram matching, identifies regions of speech as putative word hits. The detailed acoustic match is then conducted only at the putative hits identified in the coarse match. This gives us vocabulary independence and the desired accuracy and speed in wordspotting. Current lattice-based phoneme-matching algorithms are similar to the coarse-match step of our algorithm. We show that our combined algorithm gives a factor of two improvement over the coarse match. The algorithm has wide-ranging use in distributed and pervasive speech recognition applications such as audio indexing, spoken message retrieval, and video browsing.
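The coarse-to-detailed search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the speech has already been decoded into a flat phone sequence, uses a plain inverted index of phone trigrams for the coarse stage, and substitutes a phone-level edit distance for the detailed acoustic match, which in the paper operates on the audio itself. All function names and parameters (`build_index`, `coarse_match`, `spot`, `min_hits`, `max_dist`) are hypothetical.

```python
from collections import defaultdict

def phone_ngrams(phones, n=3):
    """Return the contiguous phone n-grams of a phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]

def build_index(phones, n=3):
    """Indexing stage (simplified): map each n-gram to its start positions."""
    index = defaultdict(list)
    for pos, gram in enumerate(phone_ngrams(phones, n)):
        index[gram].append(pos)
    return index

def coarse_match(index, query, n=3, min_hits=1):
    """Coarse stage: vote for query start positions implied by n-gram hits."""
    votes = defaultdict(int)
    for offset, gram in enumerate(phone_ngrams(query, n)):
        for pos in index.get(gram, ()):
            votes[pos - offset] += 1  # align the hit to the implied query start
    return sorted(start for start, v in votes.items() if v >= min_hits)

def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

def spot(phones, query, n=3, min_hits=1, max_dist=1):
    """Run the coarse match, then rescore only the putative hits.

    ASSUMPTION: edit distance stands in for the detailed acoustic match.
    """
    index = build_index(phones, n)
    hits = []
    for start in coarse_match(index, query, n, min_hits):
        if start < 0:
            continue  # implied start falls before the beginning of the audio
        window = phones[start:start + len(query)]
        if edit_distance(window, query) <= max_dist:
            hits.append(start)
    return hits
```

Calling `spot(phones, query)` on a decoded phone list returns the start positions that survive both stages. The point of the design is that the expensive rescoring runs only on the few positions the index lookup proposes, rather than on every position in the audio.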
Pages: 542-550
Page count: 9
Related papers
50 in total
  • [21] An Anchor-Free Detector for Continuous Speech Keyword Spotting
    Zhao, Zhiyuan
    Tang, Chuanxin
    Yao, Chengdong
    Luo, Chong
    INTERSPEECH 2022, 2022, : 3228 - 3232
  • [22] Keyword Spotting Based On CTC and RNN For Mandarin Chinese Speech
    Wang, Yiyan
    Long, Yanhua
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 374 - 378
  • [23] THE 2013 BBN VIETNAMESE TELEPHONE SPEECH KEYWORD SPOTTING SYSTEM
    Tsakalidis, Stavros
    Hsiao, Roger
    Karakos, Damianos
    Ng, Tim
    Ranjan, Shivesh
    Saikumar, Guruprasad
    Zhang, Le
    Nguyen, Long
    Schwartz, Richard
    Makhoul, John
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [25] Querying out-of-vocabulary words in lexicon-based keyword spotting
    Puigcerver, Joan
    Toselli, Alejandro H.
    Vidal, Enrique
    NEURAL COMPUTING & APPLICATIONS, 2017, 28 (09) : 2373 - 2382
  • [26] Speech Keyword Spotting Method Based on Swin-Transformer Model
    Sun, Chengli
    Chen, Bikang
    Chen, Feilong
    Leng, Yan
    Guo, Qiaosheng
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [27] DOMAIN ADVERSARIAL TRAINING FOR IMPROVING KEYWORD SPOTTING PERFORMANCE OF ESL SPEECH
    Hou, Jingyong
    Guo, Pengcheng
    Sun, Sining
    Soong, Frank K.
    Hu, Wenping
    Xie, Lei
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 8122 - 8126
  • [28] Keyword spotting method based on speech feature space trace matching
    Wu, YD
    Liu, BL
    2003 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-5, PROCEEDINGS, 2003, : 3188 - 3192
  • [29] Keyword Spotting in Continuous Speech Using Spectral and Prosodic Information Fusion
    Laxmi Pandey
    Rajesh M. Hegde
    Circuits, Systems, and Signal Processing, 2019, 38 : 2767 - 2791
  • [30] Robust Dual-Modal Speech Keyword Spotting for XR Headsets
    Cai, Zhuojiang
    Ma, Yuhan
    Lu, Feng
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (05) : 2507 - 2516