LFNet: A Novel Bidirectional Recurrent Convolutional Neural Network for Light-Field Image Super-Resolution

Cited by: 149

Authors:

Wang, Yunlong ^{[1,2]}

Liu, Fei ^{[2]}

Zhang, Kunbo ^{[2]}

Hou, Guangqi ^{[2]}

Sun, Zhenan ^{[2]}

Tan, Tieniu ^{[2]}

Affiliations:

[1] Univ Sci & Technol China, Hefei 230027, Anhui, Peoples R China

[2] Chinese Acad Sci, Inst Automat, Ctr Res Intelligent Percept & Comp, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2018 / Vol. 27 / No. 09

Funding:

National Natural Science Foundation of China;

Keywords:

Implicitly multi-scale fusion; bidirectional recurrent convolutional neural network; light-field; super-resolution; RESOLUTION;

DOI:

10.1109/TIP.2018.2834819

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The low spatial resolution of light-field images poses significant difficulties in exploiting their advantages. To mitigate the dependency on accurate depth or disparity information as priors for light-field image super-resolution, we propose an implicitly multi-scale fusion scheme to accumulate contextual information from multiple scales for super-resolution reconstruction. The implicitly multi-scale fusion scheme is then incorporated into a bidirectional recurrent convolutional neural network, which aims to iteratively model spatial relations between horizontally or vertically adjacent sub-aperture images of light-field data. Within the network, the recurrent convolutions are modified to be more effective and flexible in modeling the spatial correlations between neighboring views. A horizontal sub-network and a vertical sub-network of the same network structure are ensembled for final outputs via stacked generalization. Experimental results on synthetic and real-world data sets demonstrate that the proposed method outperforms other state-of-the-art methods by a large margin in peak signal-to-noise ratio and gray-scale structural similarity indexes, and also achieves superior quality for human visual systems. Furthermore, the proposed method can enhance the performance of light-field applications such as depth estimation.
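The abstract reports results in peak signal-to-noise ratio (PSNR). As a rough illustration of that metric only, and not of the authors' evaluation code, the sketch below computes PSNR between a reference image and a reconstructed one using the standard definition. The function names `psnr` and `mean_psnr`, the 8-bit peak value of 255, and the averaging over sub-aperture views are assumptions made for this example; the abstract does not say how scores are aggregated.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized gray-scale images.

    Images are nested lists (rows of pixel values). `peak` is the maximum
    possible pixel value; 255 assumes 8-bit data (an assumption here).
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est) or not ref:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_psnr(ref_views, est_views, peak=255.0):
    """Average PSNR over corresponding sub-aperture views (hypothetical convention)."""
    scores = [psnr(r, e, peak) for r, e in zip(ref_views, est_views)]
    return sum(scores) / len(scores)
```

A higher PSNR means a lower mean squared error relative to the peak pixel value, so it rewards pixel-wise fidelity; this is presumably why the abstract pairs it with a structural similarity index.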

Citation

Pages: 4274-4286

Page count: 13
