A Multimodal Approach to Understanding Human Vocal Expressions and Beyond

Cited by: 2
Author
Narayanan, Shrikanth [1 ,2 ]
Affiliations
[1] Univ Southern Calif, Niki & CL Max Nikias Chair Engn, Los Angeles, CA 90007 USA
[2] Univ Southern Calif, Elect Engn & jointly Comp Sci Linguist Psychol Ne, Los Angeles, CA 90007 USA
Source
ICMI'18: PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION | 2018
Keywords
Human signals; speech; seeing speech; real-time MRI; individual variability; biometrics; affective computing; behavioral informatics;
DOI
10.1145/3242969.3243391
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline code
0812 ;
Abstract
Human verbal and nonverbal expressions carry crucial information not only about intent but also about emotions, individual identity, and the state of health and wellbeing. From a basic science perspective, understanding how such rich information is encoded in these signals can illuminate the underlying production mechanisms, including the variability therein, within and across individuals. From a technology perspective, finding ways to automatically process and decode this complex information continues to be of interest across a variety of applications. The convergence of sensing, communication, and computing technologies is allowing access to data, in diverse forms and modalities, in ways that were unimaginable even a few years ago. These include data that afford multimodal analysis and interpretation of the generation of human expressions. The first part of the talk will highlight advances that allow us to investigate the dynamics of vocal production using real-time imaging and audio modeling, offering insights into how we produce speech and song with the vocal instrument. The second part of the talk will focus on the production of vocal expressions in conjunction with other signals from the face and body, especially in encoding affect. The talk will draw on data from various domains, notably health, to illustrate some of the applications.
Pages: 1 / 1
Page count: 1