RTHEN: Unsupervised deep homography estimation based on dynamic attention for repetitive texture image stitching

Cited: 2
Authors
Yan, Ni [1 ,2 ,3 ,4 ,5 ]
Mei, Yupeng [1 ,2 ,3 ,4 ,5 ]
Yang, Tian [1 ,2 ,3 ,4 ,5 ]
Yu, Huihui [6 ]
Chen, Yingyi [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] China Agr Univ, Natl Innovat Ctr Digital Fishery, Beijing 100083, Peoples R China
[2] Minist Agr & Rural Affairs, Key Lab Smart Farming Aquat Anim & Livestock, Beijing 100083, Peoples R China
[3] China Agr Univ, Beijing Engn & Technol Res Ctr Internet Things Agr, Beijing 100083, Peoples R China
[4] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[5] China Agr Univ, State Key Lab Efficient Utilizat Agr Water Resourc, Beijing 100083, Peoples R China
[6] Beijing Forestry Univ, Sch Informat Sci & Technol, Beijing 100083, Peoples R China
Keywords
Homography estimation; Repetitive textures; Deep learning; Dynamic attention; Triplet loss
DOI
10.1016/j.displa.2024.102670
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Homography estimation is regarded as one of the key challenges in image alignment, where the goal is to estimate the projective transformation between two images of the same plane. Unsupervised learning methods are becoming increasingly popular because of their strong performance and their lack of need for labeled data. However, in scenes with repetitive textures, the correspondence between local features can be ambiguous, which degrades homography estimation accuracy. This paper proposes a new unsupervised deep homography method, RTHEN, to address this problem. To capture repetitive texture features effectively, a multi-scale Feature Pyramid Siamese Network (FPSN) is designed. Specifically, we dynamically allocate the weights of repetitive texture features through a dynamic attention module and introduce a channel attention module that provides rich contextual information for repetitive texture areas. We also propose a hard triplet loss function based on overlap constraints to optimize the matching results. In addition, we collected a repetitive texture image dataset (RTID) for homography estimation training and evaluation. Experimental results show that our method outperforms existing learning-based methods in repetitive texture scenes and offers competitive performance with state-of-the-art traditional methods.
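To make the loss family named in the abstract concrete, the sketch below shows a minimal margin-based hard triplet loss over dense feature maps, restricted to an overlap region. It is an illustrative assumption, not the paper's exact formulation: the function name hard_triplet_loss, the margin value, and the construction of overlap_mask are hypothetical, since the abstract does not specify how the overlap constraint enters the loss.

```python
# Minimal sketch (assumed, not the paper's implementation) of a
# margin-based hard triplet loss over dense features in PyTorch.
import torch
import torch.nn.functional as F

def hard_triplet_loss(feat_a, feat_b_warped, feat_b, overlap_mask, margin=1.0):
    """feat_a:        features of image A,                    (B, C, H, W)
       feat_b_warped: features of B warped by the estimated homography
       feat_b:        unwarped features of B (negative source)
       overlap_mask:  1 inside the estimated overlap region,  (B, 1, H, W)
    """
    # Positive distance: A vs. warped B, inside the overlap region only.
    d_pos = torch.norm(feat_a - feat_b_warped, dim=1, keepdim=True) * overlap_mask
    # Negative distance: A vs. unwarped B -- a misaligned ("hard") pairing,
    # especially ambiguous where repetitive textures make patches look alike.
    d_neg = torch.norm(feat_a - feat_b, dim=1, keepdim=True) * overlap_mask
    # Hinge: pull aligned features together, push misaligned ones apart.
    loss = F.relu(d_pos - d_neg + margin)
    return loss.sum() / overlap_mask.sum().clamp(min=1.0)
```

Averaging over the overlap mask (rather than the whole feature map) keeps non-overlapping pixels, which have no valid correspondence, from diluting the gradient signal.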
Pages: 8