Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI

Cited by: 1

Authors:

Meng, Lu [1]

Yang, Chuanhao [1]

Affiliation:

[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China

Source:

BIOENGINEERING-BASEL | 2023 / Vol. 10 / Issue 10

Funding:

National Natural Science Foundation of China;

Keywords:

visual reconstruction; diffusion model; brain decoding; fMRI;

DOI:

10.3390/bioengineering10101117

Chinese Library Classification (CLC) number:

Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];

Discipline classification codes:

071005 ; 0836 ; 090102 ; 100705 ;

Abstract:

The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task with crucial research value in the fields of neuroscience and machine learning. Previous studies tend to emphasize reconstructing pixel-level features (contours, colors, etc.) or semantic features (object category) of the stimulus image, but typically, these properties are not reconstructed together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). Initially, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from fMRI data, capturing the underlying details of the original image. Subsequently, the Bootstrapping Language-Image Pre-training (BLIP) model is utilized to provide a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model is utilized to recover natural images from the fMRI patterns guided by both visual and semantic information. The experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, the best performance is achieved by DBDM in reconstructing the semantic details of the original image; the Inception, CLIP and SwAV distances are 0.611, 0.225 and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research.

Pages: 16
