Enhancing video temporal grounding with large language model-based data augmentation

Cited by: 0
Authors
Tian, Yun [1]
Guo, Xiaobo [1]
Wang, Jinsong [1]
Li, Bin [2]
Affiliations
[1] Changchun Univ Sci & Technol, Sch Optoelect Engn, Changchun 130022, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
Keywords
Video temporal grounding; Large language model; Data augmentation; Video description; Semantic enrichment; ANNOTATION; QUALITY
DOI
10.1007/s11227-025-07159-0
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Given an untrimmed video and a natural language query, the task of video temporal grounding (VTG) aims to precisely identify the temporal segment in the video that semantically matches the query. Existing datasets for this task often provide manually annotated natural language queries that are overly simplistic and lack the semantic richness needed to fully capture the video's content. This limitation hinders a model's ability to comprehend complex semantic scenarios and degrades its overall performance. To address these challenges, we introduce a novel, low-cost, large language model-based data augmentation method that enriches the original samples and expands the dataset without requiring external data. We propose a fine-grained image captioning module with a noise filter to extract unexploited information from videos. Additionally, we design a hierarchical semantic prompting framework to guide GPT-3.5 in producing semantically rich and contextually coherent natural language queries. When combined with 2D-TAN and VSLNet, our method outperforms the state-of-the-art (SOTA) method MRTNet across three public VTG datasets, particularly excelling in complex semantics and long-duration segment localization.
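The augmentation flow outlined in the abstract (caption the video, filter noisy captions, then prompt a language model to rewrite the query) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the `Sample` class, the `filter_captions` heuristic (dropping very short and repeated captions), the two-level prompt layout, and the example captions are all hypothetical. The sketch only builds the prompt text and does not call any captioning model or GPT-3.5.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One VTG sample: a query tied to a [start, end] span in seconds (hypothetical)."""
    query: str
    start: float
    end: float


def filter_captions(captions: list[str], min_words: int = 3) -> list[str]:
    """Hypothetical noise filter: drop very short captions and case-insensitive repeats."""
    kept: list[str] = []
    seen: set[str] = set()
    for caption in captions:
        key = " ".join(caption.lower().split())  # normalise case and whitespace
        if len(key.split()) < min_words or key in seen:
            continue
        seen.add(key)
        kept.append(caption.strip())
    return kept


def build_prompt(sample: Sample, captions: list[str]) -> str:
    """Assemble a two-level prompt: frame-level details first, then the original query."""
    details = "\n".join(f"- {c}" for c in filter_captions(captions))
    return (
        "Frame-level descriptions of the video segment:\n"
        f"{details}\n"
        f"Original query for {sample.start:.1f}s to {sample.end:.1f}s: {sample.query}\n"
        "Rewrite the query so that it stays true to the original event "
        "while adding relevant detail from the descriptions."
    )


# Made-up example data, standing in for the output of a captioning model.
sample = Sample(query="person opens the door", start=4.0, end=9.5)
captions = [
    "A man in a red jacket walks to a wooden door",
    "a man in a red jacket walks to a wooden door",
    "blurry",
    "The man turns the handle and pushes the door open",
]
prompt = build_prompt(sample, captions)
print(prompt)
```

In this sketch the rewritten query would reuse the original sample's time span, which is one way to read the abstract's claim that the dataset is expanded without external data.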
Pages: 31