AMEGO: Active Memory from Long EGOcentric Videos

Cited by: 0

Authors:

Goletto, Gabriele [1]

Nagarajan, Tushar [2]

Averta, Giuseppe [1]

Damen, Dima [3]

Affiliations:

[1] Politecn Torino, Turin, Italy

[2] Meta, FAIR, Austin, TX USA

[3] Univ Bristol, Bristol, Avon, England

Source:

COMPUTER VISION - ECCV 2024, PT XIII | 2025 / Vol. 15071

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Long video understanding; Egocentric vision;

DOI:

10.1007/978-3-031-72624-8_6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very-long egocentric videos. Inspired by the human ability to retain information from a single viewing, AMEGO focuses on constructing a self-contained representation from one egocentric video, capturing key locations and object interactions. This representation is semantic-free and facilitates multiple queries without the need to reprocess the entire visual content. Additionally, to evaluate our understanding of very-long egocentric videos, we introduce the new Active Memories Benchmark (AMB), composed of more than 20K highly challenging visual queries from EPIC-KITCHENS. These queries cover different levels of video reasoning (sequencing, concurrency and temporal grounding) to assess detailed video understanding capabilities. We showcase improved performance of AMEGO on AMB, surpassing other video QA baselines by a substantial margin.

Citation

Pages: 92-110

Page count: 19

