Contrastive Learning of Global-Local Video Representations

Citations: 0
Authors
Ma, Shuang [1]
Zeng, Zhaoyang [2]
McDuff, Daniel [3]
Song, Yale [3]
Affiliations
[1] Microsoft, Redmond, WA 98052 USA
[2] Sun Yat Sen Univ, Guangzhou, Peoples R China
[3] Microsoft Res, Redmond, WA USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021) | 2021 / Vol. 34
Keywords
NETWORK; SOUND;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Contrastive learning has delivered impressive results for various tasks in the self-supervised regime. However, existing approaches optimize for learning representations specific to downstream scenarios, i.e., global representations suitable for tasks such as classification or local representations for tasks such as detection and localization. While they produce satisfactory results in the intended downstream scenarios, they often fail to generalize to tasks that they were not originally designed for. In this work, we propose to learn video representations that generalize to both the tasks that require global semantic information (e.g., classification) and the tasks that require local fine-grained spatio-temporal information (e.g., localization). We achieve this by optimizing two contrastive objectives that together encourage our model to learn global-local visual information given audio signals. We show that the two objectives mutually improve the generalizability of the learned global-local representations, significantly outperforming their disjointly learned counterparts. We demonstrate our approach on various tasks including action/sound classification, lip reading, deepfake detection, and event and sound localization.
Pages: 16