Development of a femininity estimator using speaker recognition techniques for voice therapy of gender identity disorder clients

Cited by: 0
Authors
Minematsu, Nobuaki [1]
Maruyama, Kazutaka [1]
Sakuraba, Kyoko [2]
Hirose, Keikichi [1]
Tayama, Niro [3]
Imaizumi, Satoshi [4]
Yamauchi, Toshio [5]
Affiliations
[1] Univ Tokyo, Tokyo, Japan
[2] Kiyose Shi Welfare Ctr Handicapped, Tokyo, Japan
[3] Int Med Ctr, Tokyo, Japan
[4] Prefectural Univ Hiroshima, Hiroshima, Japan
[5] Saitama Med Univ, Moroyama, Saitama, Japan
Source
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007
Keywords
femininity; GID; speaker recognition; GMM;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
This paper describes the development of an estimator of the perceptual femininity (PF) of an input utterance using speaker recognition techniques. The estimator was designed for clinical use, and the target speakers are Gender Identity Disorder (GID) clients, especially MtF (Male-to-Female) transsexuals. Voice therapy for MtF clients consists of three kinds of training: 1) raising the baseline F0 range, 2) changing the baseline voice quality, and 3) enhancing F0 dynamics to produce an exaggerated intonation pattern. The first two focus on static acoustic properties of speech. Voice quality is mainly controlled by the size and shape of the articulators, which can be characterized acoustically by the spectral envelope. Gaussian Mixture Models (GMMs) of F0 values and spectra were built separately for biologically male and female speakers. Using the resulting four models, PF was estimated automatically for each of 142 utterances from 111 MtF speakers. The estimated values were compared with PF values obtained through listening tests. The results showed a very high correlation (R = 0.86), comparable to the intra-rater correlation.
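The abstract does not say how the four GMMs are combined into a single PF estimate. The following is a minimal sketch of one plausible approach, assuming a per-frame log-likelihood ratio (female model versus male model) averaged over the utterance and then mixed linearly with a spectral ratio. All model parameters, the weights `w_f0` and `w_spec`, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper. The spectral ratio is passed in as a precomputed number to keep the example one-dimensional.

```python
import math

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture."""
    terms = [
        math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

# Hypothetical F0 models in Hz: (weights, means, variances).
# These values are illustrative assumptions, not trained models.
MALE_F0 = ([0.6, 0.4], [110.0, 135.0], [15.0 ** 2, 20.0 ** 2])
FEMALE_F0 = ([0.5, 0.5], [200.0, 235.0], [20.0 ** 2, 25.0 ** 2])

def f0_llr(f0_frames):
    """Mean per-frame log-likelihood ratio, female model vs. male model."""
    diffs = [
        gmm_loglik(f, *FEMALE_F0) - gmm_loglik(f, *MALE_F0)
        for f in f0_frames
    ]
    return sum(diffs) / len(diffs)

def femininity_score(f0_frames, spectral_llr=0.0, w_f0=0.5, w_spec=0.5):
    """Assumed linear mix of the F0 ratio and a precomputed spectral ratio."""
    return w_f0 * f0_llr(f0_frames) + w_spec * spectral_llr

if __name__ == "__main__":
    print(femininity_score([210.0, 220.0, 230.0]))  # positive: closer to female model
    print(femininity_score([105.0, 115.0, 120.0]))  # negative: closer to male model
```

In a setup like this, the mixing weights would typically be fitted against listener ratings, which is how a correlation with perceptual scores could then be measured.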
Pages: 297 / +
Number of pages: 2
Related papers
1 item in total
  • [1] IMPROVED ESTIMATION OF FEMININITY USING GMM SUPERVECTORS AND SVR FOR VOICE THERAPY OF GENDER IDENTITY DISORDER CLIENTS
    Wang, Chengshuo
    Suzuki, Masayuki
    Minematsu, Nobuaki
    Sakuraba, Kyoko
    Hirose, Keikichi
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 7751 - 7754