Robust Estimation of Fundamental Frequency using Single Frequency Filtering Approach

Cited by: 20
Authors
Pannala, Vishala [1]
Aneeja, G. [1]
Kadiri, Sudarsana Reddy [1]
Yegnanarayana, B. [1]
Affiliations
[1] Int Inst Informat Technol, Speech & Vis Lab, Hyderabad, Andhra Pradesh, India
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Keywords
Fundamental frequency; Single frequency filtering; High SNR regions; Harmonics; Root cepstrum; PITCH DETECTION; ALGORITHM; SPEECH;
DOI
10.21437/Interspeech.2016-1401
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper proposes a new method for robust estimation of the fundamental frequency (F0) from the speech signal. The method exploits the high-SNR regions of speech in both the time and frequency domains, as they appear in the outputs of single frequency filtering (SFF) of the speech signal. The high resolution in the frequency domain brings out the harmonic structure of speech clearly, and the harmonic spacing in the high-SNR regions of the spectrum determines F0. The root cepstrum is used to reduce the effect of vocal tract resonances on the F0 estimate. The method is evaluated on clean speech and on noisy speech simulated for 15 different degradations at different noise levels, and its performance is compared with that of four other standard F0 extraction methods. The results indicate that the proposed method is robust for most types of degradation.
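
The root-cepstrum idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses an ordinary windowed DFT spectrum in place of the SFF outputs and does no selection of high-SNR regions. The function name `root_cepstrum_f0`, the exponent `gamma`, and the F0 search range are assumptions made for illustration only. The magnitude spectrum is compressed by a fractional power (rather than a logarithm), and the inverse transform shows a peak at the quefrency equal to the pitch period.

```python
import numpy as np

def root_cepstrum_f0(frame, fs, gamma=0.5, f0_min=60.0, f0_max=400.0, n_fft=1024):
    """Estimate F0 (Hz) of one voiced frame from the peak of its root cepstrum.

    Illustrative sketch only: a plain DFT spectrum is used, not SFF.
    gamma, f0_min, f0_max and n_fft are assumed values, not from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    # Zero-pad to at least twice the frame length for finer frequency sampling.
    n_fft = max(n_fft, 2 * len(frame))
    windowed = frame * np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed, n_fft))
    # Root cepstrum: inverse transform of the magnitude spectrum raised to gamma.
    root_ceps = np.fft.irfft(magnitude ** gamma, n_fft)
    # Search only quefrencies (in samples) that correspond to plausible pitch periods.
    q_lo = int(fs / f0_max)
    q_hi = int(fs / f0_min)
    period = q_lo + int(np.argmax(root_ceps[q_lo:q_hi + 1]))
    return fs / period

# Usage: a synthetic 40 ms frame with harmonics of 200 Hz, sampled at 8 kHz.
fs = 8000
t = np.arange(320) / fs
frame = sum(np.sin(2 * np.pi * k * 200 * t) / k for k in range(1, 11))
print(root_cepstrum_f0(frame, fs))
```

The fractional power flattens the spectral envelope less aggressively than a logarithm, which is one common motivation for the root cepstrum; the value of `gamma` trades off envelope suppression against sensitivity to noise in low-energy regions.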
Pages: 2155-2159
Number of pages: 5
Related papers
28 in total
[1]   Single Frequency Filtering Approach for Discriminating Speech and Nonspeech [J].
Aneeja, G. ;
Yegnanarayana, B. .
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (04) :705-717
[2]  
[Anonymous], 2011, INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association
[3]  
Boersma P., 2018, Praat: doing phonetics by computer (Version 5.3) Computer software
[4]   A sawtooth waveform inspired pitch estimator for speech and music [J].
Camacho, Arturo ;
Harris, John G. .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2008, 124 (03) :1638-1652
[5]   SAFE: A Statistical Approach to F0 Estimation Under Clean and Noisy Conditions [J].
Chu, Wei ;
Alwan, Abeer .
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2012, 20 (03) :933-944
[6]   YIN, a fundamental frequency estimator for speech and music [J].
de Cheveigné, A ;
Kawahara, H .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2002, 111 (04) :1917-1930
[7]  
De Cheveigne A., 1991, P 12 INT C PHONETIC, P218
[8]   Noh voice quality [J].
Fujimura, Osamu ;
Honda, Kiyoshi ;
Kawahara, Hideki ;
Konparu, Yasuyuki ;
Morise, Masanori ;
Williams, J. C. .
LOGOPEDICS PHONIATRICS VOCOLOGY, 2009, 34 (04) :157-170
[9]   PEFAC - A Pitch Estimation Algorithm Robust to High Levels of Noise [J].
Gonzalez, Sira ;
Brookes, Mike .
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (02) :518-530
[10]   MEASUREMENT OF PITCH BY SUBHARMONIC SUMMATION [J].
HERMES, DJ .
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1988, 83 (01) :257-264