Hierarchical Dynamic Image Harmonization

Cited by: 10
Authors
Chen, Haoxing [1, 2]
Gu, Zhangxuan [2 ]
Li, Yaohui [1 ]
Lan, Jun [2 ]
Meng, Changhua
Wang, Weiqiang [2 ]
Li, Huaxiong [1 ]
Affiliations
[1] Nanjing Univ, Nanjing, Peoples R China
[2] Ant Grp, Hangzhou, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China
Keywords
image harmonization; hierarchical dynamics; K-nearest neighbor
DOI
10.1145/3581783.3611747
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Image harmonization is a critical task in computer vision that aims to adjust the foreground of a composite image so that it is compatible with the background. Recent works mainly focus on using global transformations (i.e., normalization and color curve rendering) to achieve visual consistency. However, these models ignore local visual consistency, and their large model sizes limit their harmonization ability on edge devices. In this paper, we propose a hierarchical dynamic network (HDNet) that adapts features from a local to a global view for better feature transformation in efficient image harmonization. Inspired by the success of various dynamic models, we propose a local dynamic (LD) module and a mask-aware global dynamic (MGD) module. Specifically, LD matches local representations between the foreground and background regions based on semantic similarity, then adaptively adjusts every foreground local representation according to the appearance of its K-nearest neighbor background regions. In this way, LD can produce more realistic images at a finer-grained level while also preserving semantic alignment. MGD applies distinct convolutions to the foreground and background, learning the representations of foreground and background regions as well as their correlations to the global harmonization, which facilitates local visual consistency for the images much more efficiently. Experimental results demonstrate that the proposed HDNet reduces the total number of model parameters by more than 80% compared to previous methods, while still attaining state-of-the-art performance on the popular iHarmony4 dataset. Additionally, we introduce a lightweight version of HDNet, HDNet-lite, which has only 0.65 MB of parameters yet still achieves competitive performance. Our code is available at https://github.com/chenhaoxing/HDNet.
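The K-nearest-neighbor matching step that the abstract attributes to the LD module can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked in the abstract for that). The function name `knn_local_adjust`, the cosine-similarity measure, the softmax weighting, and the blending factor `alpha` are all assumptions chosen for illustration; the paper's module learns its adjustment, whereas this sketch simply blends each foreground feature toward a weighted average of its most similar background features.

```python
import numpy as np


def knn_local_adjust(features, mask, k=3, alpha=0.5):
    """Hypothetical, simplified stand-in for KNN-based local adjustment.

    features: (N, C) float array of local representations (one row per region).
    mask:     (N,) bool array, True marks a foreground region.
    k:        number of nearest background regions to use (assumed parameter).
    alpha:    blend strength in [0, 1] (assumed parameter, not from the paper).
    """
    features = np.asarray(features, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    fg, bg = features[mask], features[~mask]
    if len(bg) == 0:
        raise ValueError("at least one background region is required")
    k = min(k, len(bg))

    # Cosine similarity between every foreground and background region.
    fg_n = fg / (np.linalg.norm(fg, axis=1, keepdims=True) + 1e-8)
    bg_n = bg / (np.linalg.norm(bg, axis=1, keepdims=True) + 1e-8)
    sim = fg_n @ bg_n.T                                   # (N_fg, N_bg)

    # Indices and similarities of the k most similar background regions.
    idx = np.argsort(-sim, axis=1)[:, :k]                 # (N_fg, k)
    top_sim = np.take_along_axis(sim, idx, axis=1)        # (N_fg, k)

    # Softmax over the k neighbors gives the aggregation weights.
    w = np.exp(top_sim - top_sim.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)

    # Weighted average of the neighbors' features, then blend into the foreground.
    target = (w[..., None] * bg[idx]).sum(axis=1)         # (N_fg, C)
    out = features.copy()
    out[mask] = (1.0 - alpha) * fg + alpha * target
    return out


# Usage: 6 regions with 4-dimensional features, the first two are foreground.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 4))
fg_mask = np.array([True, True, False, False, False, False])
adjusted = knn_local_adjust(feats, fg_mask, k=2, alpha=0.5)
```

Background rows are returned unchanged, so only the foreground is pulled toward the appearance of its semantically closest background regions, which is the behavior the abstract describes at a high level.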
Pages: 1422-1430
Page count: 9