AMEGO: Active Memory from Long EGOcentric Videos

Authors
Goletto, Gabriele [1]
Nagarajan, Tushar [2]
Averta, Giuseppe [1]
Damen, Dima [3]
Affiliations
[1] Politecn Torino, Turin, Italy
[2] Meta, FAIR, Austin, TX USA
[3] Univ Bristol, Bristol, Avon, England
Source
COMPUTER VISION - ECCV 2024, PT XIII | 2025 / Vol. 15071
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Long video understanding; Egocentric vision;
DOI
10.1007/978-3-031-72624-8_6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Egocentric videos provide a unique perspective into individuals' daily experiences, yet their unstructured nature presents challenges for perception. In this paper, we introduce AMEGO, a novel approach aimed at enhancing the comprehension of very-long egocentric videos. Inspired by the human ability to maintain information from a single watching, AMEGO focuses on constructing a self-contained representation from one egocentric video, capturing key locations and object interactions. This representation is semantic-free and facilitates multiple queries without the need to reprocess the entire visual content. Additionally, to evaluate our understanding of very-long egocentric videos, we introduce the new Active Memories Benchmark (AMB), composed of more than 20K highly challenging visual queries from EPIC-KITCHENS. These queries cover different levels of video reasoning (sequencing, concurrency and temporal grounding) to assess detailed video understanding capabilities. We showcase improved performance of AMEGO on AMB, surpassing other video QA baselines by a substantial margin.
Pages: 92 - 110
Number of pages: 19