AMEGO: Active Memory from Long EGOcentric Videos

Cited by: 0

Authors:

Goletto, Gabriele [1]

Nagarajan, Tushar [2]

Averta, Giuseppe [1]

Damen, Dima [3]

Affiliations:

[1] Politecnico di Torino, Turin, Italy

[2] Meta, FAIR, Austin, TX, USA

[3] University of Bristol, Bristol, England

Source:

COMPUTER VISION - ECCV 2024, PT XIII | 2025 / Vol. 15071

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK

Keywords:

Long video understanding; Egocentric vision

DOI:

10.1007/978-3-031-72624-8_6

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very-long egocentric videos. Inspired by the human ability to retain information from a single viewing, AMEGO focuses on constructing a self-contained representation from one egocentric video, capturing key locations and object interactions. This representation is semantic-free and facilitates multiple queries without the need to reprocess the entire visual content. Additionally, to evaluate our understanding of very-long egocentric videos, we introduce the new Active Memories Benchmark (AMB), composed of more than 20K highly challenging visual queries from EPIC-KITCHENS. These queries cover different levels of video reasoning (sequencing, concurrency and temporal grounding) to assess detailed video understanding capabilities. We showcase improved performance of AMEGO on AMB, surpassing other video QA baselines by a substantial margin.

Pages: 92-110

Page count: 19
