Spatiotemporal segmentation and tracking of objects for visualization of videoconference image sequences

Cited by: 40
Authors
Kompatsiaris, I [1]
Strintzis, MG
Affiliations
[1] Aristotelian Univ Salonika, Dept Elect & Comp Engn, Informat Proc Lab, GR-54006 Salonika, Greece
[2] Ctr Res & Technol Hellas, Informat & Telemat Inst, Salonika 54639, Greece
Keywords
model-based image sequence analysis; spatiotemporal image sequence segmentation; VRML;
DOI
10.1109/76.889030
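Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract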
In this paper, a procedure is described for the segmentation, content-based coding, and visualization of videoconference image sequences. First, image sequence analysis is used to estimate the shape and motion parameters of the person facing the camera. A spatiotemporal filter, taking into account the intensity differences between consecutive frames, is applied in order to separate the moving person from the static background. The foreground is segmented into a number of regions in order to identify the face. For this purpose, we propose a novel K-Means with connectivity constraint algorithm as a general segmentation algorithm combining several types of information, including intensity, motion, and compactness. In this algorithm, the use of spatiotemporal regions is introduced, since a number of frames are analyzed simultaneously and, as a result, the same region is present in consecutive frames. Based on this information, a 3-D ellipsoid is adapted to the person's face using an efficient and robust algorithm. The rigid 3-D motion is estimated next using a least median of squares approach. Finally, a Virtual Reality Modeling Language (VRML) file is created containing all the above information; this file may be viewed using any VRML 2.0-compliant browser.
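As a rough illustration of the first step in the abstract (separating a moving person from a static background using intensity differences between consecutive frames), a minimal sketch follows. It is not the spatiotemporal filter of the paper: the function name `foreground_mask`, the `threshold` value, and the toy frames are all assumptions made for illustration, and the rule used here is plain thresholded frame differencing.

```python
def foreground_mask(frames, threshold=20):
    """Mark pixels whose intensity changes between consecutive frames.

    frames: list of equally sized 2-D lists of grayscale intensities.
    threshold: hypothetical cutoff on the absolute frame-to-frame difference.
    Returns a 2-D list of booleans (True = pixel moved at least once).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    height, width = len(frames[0]), len(frames[0][0])
    mask = [[False] * width for _ in range(height)]
    # Compare each frame with its successor and accumulate the changes.
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > threshold:
                    mask[y][x] = True
    return mask


# Toy sequence (assumed data): one bright pixel moving over a dark background.
frames = [
    [[10, 10, 10], [10, 200, 10], [10, 10, 10]],
    [[10, 10, 10], [10, 10, 200], [10, 10, 10]],
    [[10, 10, 10], [10, 10, 10], [10, 10, 200]],
]
mask = foreground_mask(frames)
```

Analyzing several frames at once, rather than a single pair, is what lets the same moving region be marked across the whole sequence; the paper's own filter and its K-Means with connectivity constraint segmentation go well beyond this simple accumulation.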
Pages: 1388-1402
Number of pages: 15