Cross-task weakly supervised learning from instructional videos

Cited by: 97

Authors:

Zhukov, Dimitri [1,2]

Alayrac, Jean-Baptiste [1,3]

Cinbis, Ramazan Gokberk [4]

Fouhey, David [5]

Laptev, Ivan [1,2]

Sivic, Josef [1,2,6]

Affiliations:

[1] Inria, Rocquencourt, France

[2] PSL Res Univ, Ecole Normale Super, Dept Informat, Paris, France

[3] DeepMind, London, England

[4] Middle East Tech Univ, Ankara, Turkey

[5] Univ Michigan, Ann Arbor, MI 48109 USA

[6] Czech Tech Univ, CIIRC Czech Inst Informat Robot & Cybernet, Prague, Czech Republic

Source:

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) | 2019

Funding:

European Research Council

Keywords:

DOI:

10.1109/CVPR.2019.00365

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: "pour egg" should be trained jointly with other tasks involving "pour" and "egg". We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Past data does not permit systematic studying of sharing and so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level and that our component model can parse previously unseen tasks by virtue of its compositionality.
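The sharing idea in the abstract can be illustrated with a toy sketch. It assumes, purely for illustration, that a step name is split into word components and that a step's score is the sum of per-component linear scores; the class `SharedComponentScorer`, the helper `split_components`, the update rule, and the 2-dimensional features are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def split_components(step):
    """Split a step name such as 'pour egg' into its word components."""
    return step.lower().split()


class SharedComponentScorer:
    """Toy linear scorer in which steps share per-component weight vectors."""

    def __init__(self, dim):
        self.dim = dim
        # One weight vector per component (hypothetical storage layout).
        self.weights = defaultdict(lambda: [0.0] * dim)

    def score(self, step, features):
        """Sum the linear scores w_c . x over the components c of `step`."""
        return sum(
            sum(w * x for w, x in zip(self.weights[c], features))
            for c in split_components(step)
        )

    def update(self, step, features, lr=1.0):
        """Move every component of `step` toward `features` (illustrative rule only)."""
        for c in split_components(step):
            self.weights[c] = [w + lr * x for w, x in zip(self.weights[c], features)]


scorer = SharedComponentScorer(dim=2)
scorer.update("pour egg", [1.0, 0.0])       # touches "pour" and "egg"
scorer.update("pour milk", [1.0, 0.0])      # touches "pour" again, plus "milk"
scorer.update("whisk mixture", [0.0, 1.0])  # touches "whisk" and "mixture"

# "pour mixture" was never seen as a step, but both of its components were.
unseen = scorer.score("pour mixture", [1.0, 1.0])
```

Because "pour" is updated by both "pour egg" and "pour milk", it accumulates evidence from two steps, and the unseen step "pour mixture" still receives a non-zero score from components learned elsewhere. This mirrors, in a very simplified form, the compositional behaviour the abstract describes.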


Pages: 3532-3540

Number of pages: 9

