Joint correlation analysis of audio-visual dance figures

Cited by: 0
Authors
Ofli, F. [1]
Demir, Y. [1]
Erzin, E. [1]
Yemez, Y. [1]
Tekalp, A. M. [1]
Affiliation
[1] Koc Univ, Goru Grafik Lab, TR-34450 Istanbul, Turkey
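Source
2007 IEEE 15TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS, VOLS 1-3 | 2007
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code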
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we present a framework for the analysis of dance figures from audio-visual data. Our audio-visual data is a multiview video of a dancing actor, acquired using 8 synchronized cameras. The multi-camera motion capture technique of this framework is based on 3D tracking of markers attached to the dancer's body, using stereo color information. The extracted 3D points are used to calculate the body motion features as 3D displacement vectors, while mel-frequency cepstral (MFC) coefficients serve as the audio features. In the first stage of the two-stage analysis task, we perform Hidden Markov Model (HMM) based unsupervised temporal segmentation of the audio and body motion features, separately, to extract the recurrent elementary audio and body motion patterns. In the second stage, the correlation of body motion patterns with audio patterns is investigated to create a correlation model that can be used during the synthesis of an audio-driven body animation.
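As a rough illustration of two steps named in the abstract, the sketch below computes frame-to-frame 3D displacement vectors from tracked marker positions and builds a simple co-occurrence table between audio and body motion pattern labels. The abstract does not say how the correlation model is estimated, so the co-occurrence counting here is an assumption made for illustration, not the authors' method. The function names, array shapes, and the idea that the HMM segmentation yields one integer pattern label per segment are likewise hypothetical.

```python
import numpy as np


def displacement_features(markers):
    """Frame-to-frame 3D displacement vectors.

    markers: array of shape (T, M, 3) holding the 3D position of each of
    M markers in each of T frames (assumed layout).
    Returns an array of shape (T - 1, 3 * M).
    """
    markers = np.asarray(markers, dtype=float)
    disp = np.diff(markers, axis=0)           # (T - 1, M, 3)
    return disp.reshape(disp.shape[0], -1)    # flatten markers per frame


def cooccurrence_model(audio_labels, motion_labels, n_audio, n_motion):
    """Estimate P(motion pattern | audio pattern) by counting.

    audio_labels, motion_labels: equal-length integer sequences giving the
    pattern index of each time-aligned segment (assumed to come from a
    prior segmentation step). Rows with no observations stay all zero.
    """
    audio_labels = np.asarray(audio_labels, dtype=int)
    motion_labels = np.asarray(motion_labels, dtype=int)
    counts = np.zeros((n_audio, n_motion))
    # Unbuffered add so repeated (audio, motion) pairs are all counted.
    np.add.at(counts, (audio_labels, motion_labels), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)
```

In a synthesis setting, a row of the resulting table could be used to pick the most likely body motion pattern for a detected audio pattern. This is one possible reading of "correlation model" rather than a description of the published system.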
Pages: 604-607
Number of pages: 4