Automatic Annotation of Human Actions in Video

Cited by: 105
Authors
Duchenne, Olivier [1]
Laptev, Ivan [1]
Sivic, Josef [1]
Bach, Francis [1]
Ponce, Jean [1]
Affiliations
[1] INRIA, Ecole Normale Supérieure, Paris, France
Source
2009 IEEE 12th International Conference on Computer Vision (ICCV) | 2009
DOI
10.1109/ICCV.2009.5459279
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper addresses the problem of automatic temporal annotation of realistic human actions in video using minimal manual supervision. To this end, we consider two associated problems: (a) weakly-supervised learning of action models from readily available annotations, and (b) temporal localization of human actions in test videos. To avoid the prohibitive cost of manual annotation for training, we use movie scripts as a means of weak supervision. Scripts, however, provide only implicit, noisy, and imprecise information about the type and location of actions in video. We address this problem with a kernel-based discriminative clustering algorithm that locates actions in the weakly-labeled training data. Using the obtained action samples, we train temporal action detectors and apply them to locate actions in the raw video data. Our experiments demonstrate that the proposed method for weakly-supervised learning of action models leads to significant improvement in action detection. We present detection results for three action classes in four feature-length movies with challenging and realistic video data.
Pages: 1491-1498
Page count: 8