New word detection in audio-indexing

Cited by: 0
Authors
Dharanipragada, S [1]
Roukos, S [1]
Affiliation
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA
Source
1997 IEEE Workshop on Automatic Speech Recognition and Understanding, Proceedings | 1997
DOI
10.1109/ASRU.1997.659135
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For an audio-indexing system that uses a fixed-vocabulary speech recognizer to be practical, it must be able to detect out-of-vocabulary (new) words at query time. In this paper we present a fast, vocabulary-independent algorithm for spotting words in speech. The algorithm consists of a preprocessing stage and a coarse-to-detailed search strategy for spotting a word or phone sequence in speech. The preprocessing stage produces a phone-level representation of the speech that can be searched efficiently. The coarse search, which consists of phone n-gram matching, identifies regions of speech as putative word hits. The detailed acoustic match is then conducted only at the putative hits identified by the coarse search. Restricting the expensive match to these regions gives the desired speed in word spotting.
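The coarse-to-detailed strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the n-gram order, the index layout, or the detailed acoustic match. The sketch assumes the preprocessing stage has already produced a flat list of phone labels, uses a hypothetical inverted index of phone n-grams for the coarse search, and substitutes a simple phone-string edit distance for the detailed acoustic match. All function names and parameters (`build_ngram_index`, `coarse_search`, `spot_word`, `n`, `max_dist`) are illustrative assumptions.

```python
from collections import defaultdict


def build_ngram_index(phones, n=3):
    """Preprocessing (assumed form): map each phone n-gram to its start positions."""
    index = defaultdict(list)
    for i in range(len(phones) - n + 1):
        index[tuple(phones[i:i + n])].append(i)
    return index


def coarse_search(index, query, n=3, min_votes=1):
    """Coarse search: each matching n-gram votes for the query start it implies."""
    votes = defaultdict(int)
    for j in range(len(query) - n + 1):
        for pos in index.get(tuple(query[j:j + n]), ()):
            votes[pos - j] += 1
    # Keep non-negative starts with enough votes as putative hits.
    return sorted(s for s, v in votes.items() if s >= 0 and v >= min_votes)


def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def spot_word(phones, index, query, n=3, max_dist=1):
    """Run the detailed match only at the putative hits from the coarse search.

    The edit distance here is a stand-in (an assumption) for the paper's
    detailed acoustic match, which the abstract does not describe.
    """
    hits = []
    for start in coarse_search(index, query, n):
        window = phones[start:start + len(query)]
        dist = edit_distance(query, window)
        if dist <= max_dist:
            hits.append((start, dist))
    return hits


if __name__ == "__main__":
    phones = "dh ax k ae t s ae t aa n dh ax m ae t".split()
    query = "k ae t".split()
    index = build_ngram_index(phones, n=2)
    print(spot_word(phones, index, query, n=2, max_dist=1))
```

Because the index is built once over the phone stream, a query word that is missing from the recognizer's vocabulary only needs a phone transcription at query time. The costlier comparison then runs on a small number of candidate regions rather than on the whole recording.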
Pages: 551-557
Page count: 7