'Putting the face to the voice': Matching identity across modality

Cited by: 131
Authors
Kamachi, M
Hill, H
Lander, K
Vatikiotis-Bateson, E
Affiliations
[1] ATR, Human Informat Sci Labs, Kyoto 6190288, Japan
[2] Univ Manchester, Dept Psychol, Manchester M13 9PL, Lancs, England
[3] Univ British Columbia, Dept Linguist, Vancouver, BC V6T 1Z1, Canada
DOI
10.1016/j.cub.2003.09.005
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject Classification Codes
071010; 081704;
Abstract
Speech perception provides compelling examples of a strong link between auditory and visual modalities [1, 2]. This link originates in the mechanics of speech production, which, in shaping the vocal tract, determine the movement of the face as well as the sound of the voice [3, 4]. In this paper, we present evidence that equivalent information about identity is available cross-modally from both the face and voice. Using a delayed matching to sample task, XAB, we show that people can match the video of an unfamiliar face, X, to an unfamiliar voice, A or B, and vice versa, but only when stimuli are moving and are played forward. The critical role of time-varying information is underlined by the ability to match faces to voices containing only the coarse spatial and temporal information provided by sine wave speech [5]. The effect of varying sentence content across modalities was small, showing that identity-specific information is not closely tied to particular utterances. We conclude that the physical constraints linking faces to voices result in bimodally available dynamic information, not only about what is being said, but also about who is saying it.
Pages: 1709 - 1714
Number of pages: 6
References
21 items in total
  • [11] Munhall K. G., 1998, HEARING EYE 2, P123
  • [12] Recognizing moving faces: a psychological and neural synthesis
    O'Toole, AJ
    Roark, DA
    Abdi, H
    [J]. TRENDS IN COGNITIVE SCIENCES, 2002, 6 (06) : 261 - 266
  • [13] Talker identification based on phonetic information
    Remez, RE
    Fellowes, JM
    Rubin, PE
    [J]. JOURNAL OF EXPERIMENTAL PSYCHOLOGY-HUMAN PERCEPTION AND PERFORMANCE, 1997, 23 (03) : 651 - 666
  • [14] Speech perception without traditional speech cues
    Remez, RE
    Rubin, PE
    Pisoni, DB
    Carrell, TD
    [J]. SCIENCE, 1981, 212 (4497) : 947 - 950
  • [15] Rosenblum L. D., 2002, ICSLP 2002
  • [16] Familiar voice recognition: patterns and parameters. Part I. Recognition of backward voices
    Van Lancker, D
    Kreiman, J
    Emmorey, K
    [J]. JOURNAL OF PHONETICS, 1985, 13 (01) : 19 - 38
  • [17] Eye movement of perceivers during audiovisual speech perception
    Vatikiotis-Bateson, E
    Eigsti, IM
    Yano, S
    Munhall, KG
    [J]. PERCEPTION & PSYCHOPHYSICS, 1998, 60 (06) : 926 - 940
  • [18] Vatikiotis-Bateson, E, 1996, T TECHNICAL COMMITTEE, P1
  • [19] Vatikiotis-Bateson, E, 1996, NATO ASI SERIES F, V150, P221
  • [20] Quantitative association of vocal-tract and facial behavior
    Yehia, H
    Rubin, P
    Vatikiotis-Bateson, E
    [J]. SPEECH COMMUNICATION, 1998, 26 (1-2) : 23 - 43