A SPECTRO-TEMPORAL TECHNIQUE FOR ESTIMATING APERIODICITY AND VOICED/UNVOICED DECISION BOUNDARIES OF SPEECH SIGNALS

Cited by: 0

Authors:

Dhiman, Jitendra Kumar [1]

Seelamantula, Chandra Sekhar [1]

Affiliations:

[1] Indian Inst Sci, Dept Elect Engn, Bangalore 560012, Karnataka, India

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Keywords:

aperiodicity in 2-D; band-wise aperiodicity parameters; carrier spectrogram; coherence map; DEMODULATION; SYSTEM

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In contrast to a 1-D short-time analysis of speech, 2-D approaches aim at characterizing the speech signal attributes jointly in time and frequency. In this paper, we focus on the quasi-periodicity of a voiced spectro-temporal patch and quantify it by proposing an aperiodicity measure defined using the underlying frequency modulations in the patch. We further propose a time-frequency aperiodicity map obtained by overlapping and adding the aperiodicity measures across patches. The proposed aperiodicity map is utilized to obtain band-wise aperiodicity parameters, which are essential for high-quality speech synthesis. The aperiodicity in unvoiced patches is addressed by identifying them using the coherence of the patch. In addition, the proposed technique provides voiced/unvoiced decision boundaries of a speech signal. The effectiveness of the proposed band-wise aperiodicity parameters and voiced/unvoiced decisions is verified by incorporating them in an existing state-of-the-art vocoder for speech synthesis. Subjective listening tests show that the quality of the reconstructed speech is on par with that of the state-of-the-art WORLD vocoder in terms of mean opinion score, indicating that spectro-temporal approaches are highly promising for speech analysis and synthesis applications.
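The abstract does not define the aperiodicity measure, the patch geometry, or the band layout, so the following is only a minimal sketch of the two generic steps it names: overlap-adding per-patch values into a time-frequency map, and averaging that map into band-wise parameters. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `overlap_add_map` and `bandwise_parameters`, the Hann weighting, the normalization by the summed window, and the use of a plain mean within each band.

```python
import numpy as np

def overlap_add_map(patch_values, patch_shape, hop, map_shape):
    """Overlap-add one scalar per patch into a (frequency, time) map.

    patch_values: dict mapping patch index (i, j) to a scalar measure.
    Assumes every patch lies fully inside the map (hypothetical layout).
    """
    acc = np.zeros(map_shape)
    weight = np.zeros(map_shape)
    # Strictly positive 2-D Hann window (end zeros trimmed); an assumed choice.
    win_f = np.hanning(patch_shape[0] + 2)[1:-1]
    win_t = np.hanning(patch_shape[1] + 2)[1:-1]
    window = np.outer(win_f, win_t)
    for (i, j), value in patch_values.items():
        f0, t0 = i * hop[0], j * hop[1]
        f1, t1 = f0 + patch_shape[0], t0 + patch_shape[1]
        acc[f0:f1, t0:t1] += value * window
        weight[f0:f1, t0:t1] += window
    # Normalize by the summed window so overlapping patches are averaged.
    return np.divide(acc, weight, out=np.zeros_like(acc), where=weight > 0)

def bandwise_parameters(ap_map, band_edges):
    """Average the map over the frequency bins of each band.

    band_edges: increasing bin indices; returns shape (n_bands, n_frames).
    """
    pairs = zip(band_edges[:-1], band_edges[1:])
    return np.stack([ap_map[lo:hi].mean(axis=0) for lo, hi in pairs])

# Toy usage: nine 4x4 patches with hop 2 tile an 8x8 map.
patch_values = {(i, j): 0.5 for i in range(3) for j in range(3)}
ap_map = overlap_add_map(patch_values, (4, 4), (2, 2), (8, 8))
bands = bandwise_parameters(ap_map, [0, 4, 8])
```

With a constant per-patch value, the normalized map is constant as well, which is a quick sanity check on the weighting.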

Citation

Pages: 6510-6514

Page count: 5