The Effect of Real-Time Constraints on Automatic Speech Animation

Cited by: 4
Authors
Websdale, Danny [1]
Taylor, Sarah [1]
Milner, Ben [1]
Affiliations
[1] Univ East Anglia, Norwich, Norfolk, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Real-time speech animation; automatic lip sync
DOI
10.21437/Interspeech.2018-2066
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine learning has previously been applied successfully to speech-driven facial animation. To account for carry-over and anticipatory coarticulation, a common approach is to predict the facial pose using a symmetric window of acoustic speech that includes both past and future context. Using future context limits this approach for animating the faces of characters in real-time and networked applications, such as online gaming. An acceptable latency for conversational speech is 200 ms, and network transmission times will typically consume a significant part of this. Consequently, we consider asymmetric windows by investigating the extent to which decreasing the future context affects the quality of predicted animation using both deep neural networks (DNNs) and bi-directional LSTM recurrent neural networks (BiLSTMs). Specifically, we investigate future contexts from 170 ms (fully symmetric) to 0 ms (fully asymmetric). We find that a BiLSTM trained using 70 ms of future context is able to predict facial motion of equivalent quality to a DNN trained with 170 ms, while increasing processing time by only 5 ms. Subjective tests using the BiLSTM show that reducing the future context from 170 ms to 50 ms does not significantly decrease perceived realism. Below 50 ms, the perceived realism begins to deteriorate, creating a trade-off between realism and latency.
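The asymmetric windowing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 10 ms frame step, the edge-padding behaviour, and the names `ms_to_frames` and `asymmetric_windows` are assumptions made for illustration, and the past context is assumed to stay at 170 ms while the future context shrinks.

```python
from typing import List, Sequence

FRAME_STEP_MS = 10  # assumed frame step; the abstract does not state one


def ms_to_frames(context_ms: int, frame_step_ms: int = FRAME_STEP_MS) -> int:
    """Convert a context length in milliseconds to a whole number of frames."""
    return context_ms // frame_step_ms


def asymmetric_windows(
    frames: Sequence[Sequence[float]], past: int, future: int
) -> List[List[float]]:
    """Stack each frame with `past` earlier and `future` later frames.

    Indices outside the utterance are clamped to the first or last frame
    (edge padding), which is an illustrative choice, not the paper's.
    """
    if not frames:
        return []
    last = len(frames) - 1
    windows: List[List[float]] = []
    for t in range(len(frames)):
        window: List[float] = []
        for offset in range(-past, future + 1):
            idx = min(max(t + offset, 0), last)  # clamp to valid range
            window.extend(frames[idx])
        windows.append(window)
    return windows


# Context sizes from the abstract, converted under the assumed frame step.
past_frames = ms_to_frames(170)    # 17 frames of past context
future_frames = ms_to_frames(50)   # 5 frames of future context

# Toy utterance: 5 frames of 1-dimensional features, with a small window
# (2 past frames, 1 future frame) so the result is easy to inspect.
features = [[0.0], [1.0], [2.0], [3.0], [4.0]]
inputs = asymmetric_windows(features, past=2, future=1)
```

Under these assumptions, the lookahead latency introduced by the window is the number of future frames multiplied by the frame step, so shrinking the future context directly reduces the delay before a facial pose can be predicted.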
Pages: 2479-2483
Page count: 5