AMEGO: Active Memory from Long EGOcentric Videos

Cited by: 0
Authors
Goletto, Gabriele [1 ]
Nagarajan, Tushar [2 ]
Averta, Giuseppe [1 ]
Damen, Dima [3 ]
Affiliations
[1] Politecn Torino, Turin, Italy
[2] Meta, FAIR, Austin, TX USA
[3] Univ Bristol, Bristol, Avon, England
Source
COMPUTER VISION - ECCV 2024, PT XIII | 2025 / Vol. 15071
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Long video understanding; Egocentric vision;
DOI
10.1007/978-3-031-72624-8_6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very-long egocentric videos. Inspired by humans' ability to retain information from a single viewing, AMEGO focuses on constructing a self-contained representation from one egocentric video, capturing key locations and object interactions. This representation is semantic-free and facilitates multiple queries without the need to reprocess the entire visual content. Additionally, to evaluate our understanding of very-long egocentric videos, we introduce the new Active Memories Benchmark (AMB), composed of more than 20K highly challenging visual queries from EPIC-KITCHENS. These queries cover different levels of video reasoning (sequencing, concurrency and temporal grounding) to assess detailed video understanding capabilities. We showcase improved performance of AMEGO on AMB, surpassing other video QA baselines by a substantial margin.
Pages: 92 - 110
Page count: 19