Perceptually Motivated Guidelines for Voice Synchronization in Film

Cited by: 4
Authors
Carter, Elizabeth J. [1]
Sharan, Lavanya [2]
Trutoiu, Laura [1]
Matthews, Iain [2]
Hodgins, Jessica K. [1, 2]
Affiliations
[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA
[2] Disney Res Pittsburgh, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA);
Keywords
Documentation; Languages; Multisensory perception and integration; human perception and performance; auditory perceptual research; visual psychophysics; SPEECH-PERCEPTION;
DOI
10.1145/1823738.1823741
Chinese Library Classification number
TP31 [Computer software];
Subject classification code
081202 ; 0835 ;
Abstract
We consume video content in a multitude of ways, including in movie theaters, on television, on DVDs and Blu-rays, online, on smart phones, and on portable media players. For quality control purposes, it is important to have a uniform viewing experience across these various platforms. In this work, we focus on voice synchronization, an aspect of video quality that is strongly affected by current post-production and transmission practices. We examined the synchronization of an actor's voice and lip movements in two distinct scenarios. First, we simulated the temporal mismatch between the audio and video tracks that can occur during dubbing or during broadcast. Next, we recreated the pitch changes that result from conversions between formats with different frame rates. We show, for the first time, that these audiovisual mismatches affect viewer enjoyment. When temporal synchronization is noticeably absent, there is a decrease in the perceived performance quality and the perceived emotional intensity of a performance. For pitch changes, we find that higher-pitched voices are not preferred, especially for male actors. Based on our findings, we caution that mismatched audio and video signals degrade the viewer experience.
Pages: 12