Image De-occlusion via Event-enhanced Multi-modal Fusion Hybrid Network

Cited by: 7
Authors
Li, Si-Qi [1,2,3,4]
Gao, Yue [1,2,3,4]
Dai, Qiong-Hai [1,2,3,5]
Affiliations
[1] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Inst Brain & Cognit Sci, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Beijing Lab Brain & Cognit Intelligence, Beijing Municipal Educ Commiss, Beijing 100084, Peoples R China
[4] Tsinghua Univ, Sch Software, Key Lab Informat Syst Secur, Beijing 100084, Peoples R China
[5] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Funding
Beijing Natural Science Foundation;
Keywords
Event camera; multi-modal fusion; image de-occlusion; spiking neural network (SNN); image reconstruction;
DOI
10.1007/s11633-022-1350-3
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Seeing through dense occlusions and reconstructing the occluded scene is an important but challenging task. Traditional frame-based image de-occlusion methods can fail severely under extremely dense occlusions because the limited occluded input frames contain too little valid scene information. Event cameras are bio-inspired vision sensors that asynchronously record brightness changes at each pixel with high temporal resolution. However, synthesizing images solely from event streams is ill-posed: an event stream records only brightness changes, so the initial absolute brightness is unknown. In this paper, we propose an event-enhanced multi-modal fusion hybrid network for image de-occlusion, which uses event streams to provide complete scene information and frames to provide color and texture information. An event stream encoder based on a spiking neural network (SNN) is proposed to encode and denoise the event stream efficiently, and a comparison loss is proposed to produce clearer results. Experimental results on a large-scale event-based and frame-based image de-occlusion dataset demonstrate that our method achieves state-of-the-art performance.
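For intuition about the sensing model behind the abstract: an event camera emits an event (x, y, t, p) with polarity p in {-1, +1} whenever the log-intensity at a pixel changes by more than a contrast threshold, which is why absolute brightness cannot be recovered from events alone. The sketch below illustrates, under simplifying assumptions, how a leaky integrate-and-fire (LIF) layer, the basic neuron model of SNN encoders such as the one the abstract mentions, can convert a sparse event stream into a dense spike tensor for a downstream network. The function lif_encode, its parameter values, and the fixed-count time binning are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def lif_encode(events, shape, n_steps=10, tau=0.8, v_th=1.0):
    """Integrate an event stream with leaky integrate-and-fire (LIF)
    neurons, producing a spike tensor of shape (n_steps, H, W).

    events: iterable of (t, x, y, p) with timestamps t normalized to
            [0, 1) and polarity p in {-1, +1}.
    tau:    membrane leak factor per time step (0 < tau < 1).
    v_th:   firing threshold; a neuron spikes and resets on crossing it.
    """
    h, w = shape
    v = np.zeros((h, w))                     # membrane potentials
    spikes = np.zeros((n_steps, h, w))
    bins = [[] for _ in range(n_steps)]      # slice events into n_steps windows
    for t, x, y, p in events:
        bins[min(int(t * n_steps), n_steps - 1)].append((x, y, p))
    for step in range(n_steps):
        v *= tau                             # leak toward rest
        for x, y, p in bins[step]:
            v[y, x] += p                     # integrate event "current"
        fired = v >= v_th                    # threshold crossing
        spikes[step][fired] = 1.0
        v[fired] = 0.0                       # hard reset after a spike
    return spikes

# Example: a tiny synthetic stream on an 8x8 sensor. Repeated positive
# events at one pixel accumulate and fire; the lone negative event decays
# without firing, hinting at why an LIF layer can also suppress noise.
events = [(0.05, 1, 2, +1), (0.12, 1, 2, +1), (0.40, 5, 5, -1)]
print(lif_encode(events, shape=(8, 8), v_th=1.5).shape)  # (10, 8, 8)
```

The leak-integrate-fire loop is why such an encoder doubles as a denoiser: isolated spurious events are forgotten by the leak, while temporally correlated events from real scene structure push the membrane past threshold.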
Pages: 307-318
Number of pages: 12