Cross-task weakly supervised learning from instructional videos

Cited by: 97
Authors
Zhukov, Dimitri [1, 2]
Alayrac, Jean-Baptiste [1, 3]
Cinbis, Ramazan Gokberk [4]
Fouhey, David [5]
Laptev, Ivan [1, 2]
Sivic, Josef [1, 2, 6]
Affiliations
[1] Inria, Rocquencourt, France
[2] PSL Res Univ, Ecole Normale Super, Dept Informat, Paris, France
[3] DeepMind, London, England
[4] Middle East Tech Univ, Ankara, Turkey
[5] Univ Michigan, Ann Arbor, MI 48109 USA
[6] Czech Tech Univ, CIIRC Czech Inst Informat Robot & Cybernet, Prague, Czech Republic
Funding
European Research Council
DOI
10.1109/CVPR.2019.00365
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we investigate learning visual models for the steps of ordinary tasks using weak supervision via instructional narrations and an ordered list of steps instead of strong supervision via temporal annotations. At the heart of our approach is the observation that weakly supervised learning may be easier if a model shares components while learning different steps: "pour egg" should be trained jointly with other tasks involving "pour" and "egg". We formalize this in a component model for recognizing steps and a weakly supervised learning framework that can learn this model under temporal constraints from narration and the list of steps. Past data does not permit systematic studying of sharing and so we also gather a new dataset, CrossTask, aimed at assessing cross-task sharing. Our experiments demonstrate that sharing across tasks improves performance, especially when done at the component level and that our component model can parse previously unseen tasks by virtue of its compositionality.
Pages: 3532-3540
Number of pages: 9