Towards real-time photorealistic 3D holography with deep neural networks

Cited by: 432

Authors:

Shi, Liang [1,2]

Li, Beichen [1,2]

Kim, Changil [1,2]

Kellnhofer, Petr [1,2]

Matusik, Wojciech [1,2]

Affiliations:

[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA

[2] MIT, Elect Engn & Comp Sci Dept, Cambridge, MA 02139 USA

Source:

NATURE | 2021 / Vol. 591 / Issue 7849

Keywords:

COMPUTER-GENERATED HOLOGRAMS; ALGORITHM; DISPLAY; FIELD

DOI:

10.1038/s41586-020-03152-0

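Chinese Library Classification (CLC) codes:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]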

Discipline classification codes:

07; 0710; 09

Abstract:

The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality, human-computer interaction, education and training. Computer-generated holography (CGH) enables high-spatio-angular-resolution 3D projection via numerical simulation of diffraction and interference(1). Yet, existing physically based methods fail to produce holograms with both per-pixel focal control and accurate occlusion(2,3). The computationally taxing Fresnel diffraction simulation further places an explicit trade-off between image quality and runtime, making dynamic holography impractical(4). Here we demonstrate a deep-learning-based CGH pipeline capable of synthesizing a photorealistic colour 3D hologram from a single RGB-depth image in real time. Our convolutional neural network (CNN) is extremely memory efficient (below 620 kilobytes) and runs at 60 hertz for a resolution of 1,920 × 1,080 pixels on a single consumer-grade graphics processing unit. Leveraging low-power on-device artificial intelligence acceleration chips, our CNN also runs interactively on mobile (iPhone 11 Pro at 1.1 hertz) and edge (Google Edge TPU at 2.0 hertz) devices, promising real-time performance in future-generation virtual and augmented-reality mobile headsets. We enable this pipeline by introducing a large-scale CGH dataset (MIT-CGH-4K) with 4,000 pairs of RGB-depth images and corresponding 3D holograms. Our CNN is trained with differentiable wave-based loss functions(5) and physically approximates Fresnel diffraction. With an anti-aliasing phase-only encoding method, we experimentally demonstrate speckle-free, natural-looking, high-resolution 3D holograms. Our learning-based approach and the Fresnel hologram dataset will help to unlock the full potential of holography and enable applications in metasurface design(6,7), optical and acoustic tweezer-based microscopic manipulation(8-10), holographic microscopy(11) and single-exposure volumetric 3D printing(12,13).

Citation

Pages: 234-+

Page count: 20

57 entries in total

[1] Benton S.A., 2008, Holographic Imaging, DOI 10.1038/srep06211

[2] Bjelkhagen H. I., 2011, JR PRACTICAL HOLOGRA, V7957, P13

[3] Chen, J-S.; Chu, D. P. Improved layer-based method for rapid hologram generation and real-time interactive holographic display applications [J]. OPTICS EXPRESS, 2015, 23 (14): 18143-18155

[4] Cimpoi, Mircea; Maji, Subhransu; Kokkinos, Iasonas; Mohamed, Sammy; Vedaldi, Andrea. Describing Textures in the Wild [J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 3606-3613

[5] Dai, Dengxin; Riemenschneider, Hayko; Van Gool, Luc. The Synthesizability of Texture Examples [J]. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 3027-3034

[6] Eybposh M. H., 2020, OPTICS BRAIN BTU2C 2

[7] Haeussler, R.; Gritsai, Y.; Zschau, E.; Missbach, R.; Sahm, H.; Stock, M.; Stolle, H. Large real-time holographic 3D displays: enabling components and results [J]. APPLIED OPTICS, 2017, 56 (13): F45-F52

[8] Hamann, Stephen; Shi, Liang; Solgaard, Olav; Wetzstein, Gordon. Time-multiplexed light field synthesis via factored Wigner distribution function [J]. OPTICS LETTERS, 2018, 43 (03): 599-602

[9] Hasegawa, Naotaka; Shimobaba, Tomoyoshi; Kakue, Takashi; Ito, Tomoyoshi. Acceleration of hologram generation by optimizing the arrangement of wavefront recording planes [J]. APPLIED OPTICS, 2017, 56 (01): A97-A103

[10] Hirayama, Ryuji; Plasencia, Diego Martinez; Masuda, Nobuyuki; Subramanian, Sriram. A volumetric display for visual, tactile and audio presentation using acoustic trapping [J]. NATURE, 2019, 575 (7782): 320-+
