Enhancing video temporal grounding with large language model-based data augmentation

Cited by: 0
Authors
Tian, Yun [1]
Guo, Xiaobo [1]
Wang, Jinsong [1]
Li, Bin [2]
Affiliations
[1] Changchun Univ Sci & Technol, Sch Optoelect Engn, Changchun 130022, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
Keywords
Video temporal grounding; Large language model; Data augmentation; Video description; Semantic enrichment; ANNOTATION; QUALITY
DOI
10.1007/s11227-025-07159-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Given an untrimmed video and a natural language query, video temporal grounding (VTG) aims to identify the temporal segment of the video that semantically matches the query. Existing datasets for this task often provide manually annotated queries that are overly simplistic and lack the semantic richness needed to fully capture the video's content. This limitation hinders a model's ability to comprehend complex semantic scenarios and degrades its overall performance. To address these challenges, we introduce a novel, low-cost, large language model-based data augmentation method that enriches the original samples and expands the dataset without requiring external data. We propose a fine-grained image captioning module with a noise filter to extract unexploited information from videos. Additionally, we design a hierarchical semantic prompting framework to guide GPT-3.5 in producing semantically rich and contextually coherent natural language queries. When combined with 2D-TAN and VSLNet, our method outperforms the state-of-the-art method MRTNet across three public VTG datasets, particularly excelling in complex semantics and long-duration segment localization.
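The abstract names a caption noise filter and a hierarchical prompt but gives no implementation details, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. The function names `filter_captions` and `build_prompt`, the `min_words` threshold, the filtering rule (drop very short captions and repeats), and the prompt wording are all assumptions made for illustration.

```python
from typing import List


def filter_captions(captions: List[str], min_words: int = 3) -> List[str]:
    """Hypothetical noise filter: drop very short captions and repeated ones."""
    seen = set()
    kept = []
    for caption in captions:
        # Normalize case and whitespace so near-identical captions compare equal.
        normalized = " ".join(caption.lower().split())
        if len(normalized.split()) < min_words or normalized in seen:
            continue
        seen.add(normalized)
        kept.append(caption.strip())
    return kept


def build_prompt(original_query: str, captions: List[str]) -> str:
    """Assemble a layered prompt (assumed layout): task, query, captions, instruction."""
    caption_lines = "\n".join(f"- {c}" for c in filter_captions(captions))
    return (
        "Task: rewrite a video temporal grounding query.\n"
        f"Original query: {original_query}\n"
        f"Frame captions from the annotated segment:\n{caption_lines}\n"
        "Instruction: write one richer query that stays consistent with the captions."
    )
```

The resulting string would be sent to a language model of choice; that call is left out here because the abstract does not describe how the model is invoked.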
Pages: 31