2-D Processing of Speech for Multi-Pitch Analysis

Cited by: 0

Authors:

Wang, Tianyu T. [1]

Quatieri, Thomas F. [1]

Affiliations:

[1] MIT Lincoln Lab, Lexington, MA USA

Source:

INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 | 2009

Keywords:

2-D speech processing; Grating Compression Transform; multi-pitch analysis; segmental pitch dynamics;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically-related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are analyzed to extract pitch candidates. These candidates are then combined across multiple regions for obtaining separate pitch estimates of each speech-signal component at a single point in time. We refer to this as multi-region analysis (MRA). By explicitly accounting for pitch dynamics within localized time segments, this separability is distinct from that which can be obtained using short-time autocorrelation methods typically employed in state-of-the-art multi-pitch tracking algorithms. We illustrate the feasibility of MRA for multi-pitch estimation on mixtures of synthetic and real speech.
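As a rough illustration of the single-region step described in the abstract, the Python sketch below takes the 2-D Fourier transform magnitude of one localized patch of a narrowband spectrogram and reads a pitch candidate from the location of its dominant peak along the frequency axis. It is not the authors' implementation: the function names, the 40 ms Hamming window, the hop size, the patch bounds, the zero-padding length, and the 80-400 Hz search range are all assumptions chosen for this example. It uses a single synthetic constant-pitch source and leaves out the multi-region combination (MRA) step entirely.

```python
import numpy as np


def narrowband_spectrogram(x, win_len, hop, n_fft):
    """Magnitude STFT with a long Hamming window so individual harmonics are resolved."""
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + win_len] * window for i in range(n_frames)], axis=1
    )
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=0))  # shape: (freq bins, frames)


def pitch_candidate(patch, df, f0_range=(80.0, 400.0), pad=2048):
    """Pitch candidate in Hz from the 2-D FFT magnitude of one spectrogram patch.

    patch: (freq bins, frames) region of the spectrogram; df: bin spacing in Hz.
    f0_range and pad are illustrative assumptions, not values from the paper.
    """
    n_freq, n_time = patch.shape
    # Remove the mean and taper in both dimensions to limit leakage from DC.
    taper = np.outer(np.hamming(n_freq), np.hamming(n_time))
    gct = np.abs(np.fft.fft2((patch - patch.mean()) * taper, s=(pad, n_time)))
    # Row k of the padded transform corresponds to a harmonic spacing of pad*df/k Hz.
    rows = np.arange(1, pad // 2)
    implied_f0 = pad * df / rows
    keep = rows[(implied_f0 >= f0_range[0]) & (implied_f0 <= f0_range[1])]
    # Search all time-modulation columns; only the row of the peak sets the pitch.
    peak_row, _ = np.unravel_index(np.argmax(gct[keep, :]), (len(keep), n_time))
    return pad * df / keep[peak_row]


# Synthetic test signal: 19 equal-amplitude harmonics of a constant 200 Hz pitch.
fs, f0 = 8000, 200.0
t = np.arange(int(0.3 * fs)) / fs
x = sum(np.cos(2 * np.pi * k * f0 * t) for k in range(1, 20))

n_fft = 1024
spec = narrowband_spectrogram(x, win_len=320, hop=40, n_fft=n_fft)
df = fs / n_fft                 # frequency-bin spacing in Hz
patch = spec[64:192, 10:30]     # roughly 500-1500 Hz over 100 ms
estimate = pitch_candidate(patch, df)
print(f"pitch candidate: {estimate:.1f} Hz")
```

For a source whose pitch changes within the patch, the harmonic lines tilt and the peak moves off the zero time-modulation column, which is why the sketch searches the full 2-D plane instead of a single column. Separating two simultaneous talkers would additionally require keeping several peaks per patch and combining them across regions, which this example does not attempt.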


Pages: 2795-2798

Page count: 4

References (6 in total):

[1] Ezzat, T. ISCA INTERSPEECH, 2007.
[2] Medan, Y; Yair, E; Chazan, D. Super resolution pitch determination of speech signals [J]. IEEE Transactions on Signal Processing, 1991, 39 (01): 40-48.
[3] Quatieri, TF. ISCA INTERSPEECH, 2002.
[4] Stevens, Kenneth N. Acoustic Phonetics, 1998.
[5] Tolonen, T; Karjalainen, M. A computationally efficient multipitch analysis model [J]. IEEE Transactions on Speech and Audio Processing, 2000, 8 (06): 708-716.
[6] Wu, MY; Wang, DL; Brown, GJ. A multipitch tracking algorithm for noisy speech [J]. IEEE Transactions on Speech and Audio Processing, 2003, 11 (03): 229-241.
