Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

Cited by: 0
Authors
Wei, Pengfei [1]
Kong, Lingdong [2]
Qu, Xinghua [1]
Ren, Yi [1]
Xu, Zhiqiang [3]
Affiliations
[1] ByteDance, AI Lab, Beijing, People's Republic of China
[2] National University of Singapore, Singapore, Singapore
[3] MBZUAI, Abu Dhabi, United Arab Emirates
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Unsupervised video domain adaptation is a practical yet challenging task. In this work, for the first time, we tackle it from a disentanglement perspective. Our key idea is to handle the spatial and temporal domain divergence separately through disentanglement. Specifically, we consider the generation of cross-domain videos from two sets of latent factors, one encoding the static information and the other encoding the dynamic information. A Transfer Sequential VAE (TranSVAE) framework is then developed to model such generation. To better serve adaptation, we propose several objectives to constrain the latent factors. With these constraints, the spatial divergence can be readily removed by disentangling out the static domain-specific information, and the temporal divergence is further reduced at both the frame and video levels through adversarial learning. Extensive experiments on the UCF-HMDB, Jester, and Epic-Kitchens datasets verify the effectiveness and superiority of TranSVAE compared with several state-of-the-art approaches.
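The two-factor generation described in the abstract can be illustrated with a toy sketch. This is not the TranSVAE implementation: the linear decoder, the first-order Gaussian transition with its 0.9 and 0.1 coefficients, the function name `generate_video`, and all dimensions are assumptions made only to show one static latent shared across a video and one dynamic latent per frame.

```python
import numpy as np


def generate_video(rng, num_frames=8, static_dim=4, dynamic_dim=3, frame_dim=16):
    """Toy generative pass: one static latent per video, one dynamic latent per frame."""
    # Hypothetical linear decoder weights; a real model would use learned networks.
    w_static = rng.normal(size=(static_dim, frame_dim))
    w_dynamic = rng.normal(size=(dynamic_dim, frame_dim))

    # Static factor: sampled once and shared by every frame of the video.
    z_static = rng.normal(size=static_dim)

    # Dynamic factors: an assumed first-order Gaussian chain over time.
    z_dynamic = np.zeros((num_frames, dynamic_dim))
    z_dynamic[0] = rng.normal(size=dynamic_dim)
    for t in range(1, num_frames):
        z_dynamic[t] = 0.9 * z_dynamic[t - 1] + 0.1 * rng.normal(size=dynamic_dim)

    # Each frame is decoded from the shared static factor plus its own dynamic factor.
    frames = z_static @ w_static + z_dynamic @ w_dynamic
    return z_static, z_dynamic, frames


if __name__ == "__main__":
    z_s, z_d, video = generate_video(np.random.default_rng(0))
    print(z_s.shape, z_d.shape, video.shape)
```

In this sketch the static latent plays the role of appearance information that stays fixed over a clip, while the per-frame latents carry motion; the constraints and adversarial objectives mentioned in the abstract are not modelled here.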
Pages: 20