Focal Aggregation Transformer for Light Field Image Super-Resolution

Times Cited: 0
Authors
Wang, Shunzhou [1,3]
Lu, Yao [2,3]
Xia, Wang [3]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Shenzhen Grad Sch, Shenzhen 518055, Peoples R China
[2] Shenzhen MSU BIT Univ, Guangdong Lab Machine Percept & Intelligent Comp, Dept Engn, Shenzhen 518172, Peoples R China
[3] Beijing Inst Technol, Sch Comp Sci, Beijing 100081, Peoples R China
Source
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VIII | 2025, Vol. 15038
Keywords
Light field; Image super-resolution; Inter-intra view feature aggregation; Hierarchical feature aggregation; Transformer; NETWORK;
DOI
10.1007/978-981-97-8685-5_37
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Transformer has achieved significant progress in light field image super-resolution (LFSR) due to its ability to learn long-range dependencies for inter- and intra-view feature aggregation. However, Transformer-based intra-view and inter-view aggregation ignores the local information of each sub-aperture view, hampering high-quality light field image reconstruction. To this end, we propose a global-to-local aggregation approach, termed Focal Aggregation, for LFSR. Specifically, Focal Aggregation comprises two strategies: inter-view global-to-local aggregation (InterG2L) and intra-view global-to-local aggregation (IntraG2L). InterG2L gathers complementary information from different views, while IntraG2L extracts efficient representations of a single sub-aperture view. InterG2L and IntraG2L are organized in a cascade so that the global information of the input is gathered for each sub-aperture image in a coarse-to-fine manner. In addition, we develop a global-to-local hierarchical feature aggregation approach, named HierG2L, which enhances the last hierarchical feature used for light field reconstruction according to the input. Based on these three global-to-local aggregation strategies, we construct a focal aggregation transformer (FAT) for LFSR. Experiments on commonly used LFSR benchmarks demonstrate that FAT outperforms other leading methods on both synthetic and real-world data.
Pages: 524 - 538
Number of Pages: 15
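
To make the cascaded inter-/intra-view global-to-local aggregation described in the abstract concrete, here is a minimal PyTorch sketch that pairs global multi-head self-attention with a depthwise convolution acting as the local branch. The names InterG2L and IntraG2L come from the abstract; everything else (the attention layout, the convolutional local operator, tensor shapes, channel sizes) is an assumption for illustration, not the authors' implementation, and HierG2L is omitted.

```python
# A minimal sketch of cascaded global-to-local aggregation for a light field,
# under the assumptions stated above. Not the paper's actual architecture.
import torch
import torch.nn as nn


class G2LAttention(nn.Module):
    """Global multi-head self-attention followed by a depthwise convolution.

    The convolution stands in for the local aggregation step; the abstract
    does not specify the paper's actual local operator.
    """

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)

    def forward(self, x: torch.Tensor, h: int, w: int) -> torch.Tensor:
        # x: (batch, tokens, dim), with tokens laid out on an h x w grid.
        y = self.norm(x)
        y, _ = self.attn(y, y, y)                    # global aggregation
        x = x + y
        b, n, c = x.shape
        loc = x.transpose(1, 2).reshape(b, c, h, w)  # back to a 2-D grid
        loc = self.local(loc).reshape(b, c, n).transpose(1, 2)
        return x + loc                               # inject locality


class FocalAggregationBlock(nn.Module):
    """Cascade of inter-view and intra-view global-to-local aggregation."""

    def __init__(self, dim: int):
        super().__init__()
        self.inter_g2l = G2LAttention(dim)  # attends across views
        self.intra_g2l = G2LAttention(dim)  # attends within each view

    def forward(self, lf: torch.Tensor, ang: int) -> torch.Tensor:
        # lf: (batch, views, dim, height, width), views = ang * ang.
        b, v, c, hh, ww = lf.shape
        # InterG2L: every spatial location attends over the angular grid.
        x = lf.permute(0, 3, 4, 1, 2).reshape(b * hh * ww, v, c)
        x = self.inter_g2l(x, ang, ang)
        x = x.reshape(b, hh, ww, v, c).permute(0, 3, 4, 1, 2)
        # IntraG2L: every view attends over its own spatial tokens.
        y = x.permute(0, 1, 3, 4, 2).reshape(b * v, hh * ww, c)
        y = self.intra_g2l(y, hh, ww)
        return y.reshape(b, v, hh, ww, c).permute(0, 1, 4, 2, 3)


if __name__ == "__main__":
    feats = torch.randn(1, 25, 32, 16, 16)  # 5x5 views, 32-channel features
    out = FocalAggregationBlock(32)(feats, ang=5)
    print(out.shape)  # torch.Size([1, 25, 32, 16, 16])
```

Stacking several such blocks and finishing with pixel-shuffle upsampling would mirror typical Transformer-based LFSR pipelines; the coarse-to-fine behavior comes from each block first gathering global context and then refining it locally.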