Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI

Times cited: 1
Authors
Meng, Lu [1]
Yang, Chuanhao [1]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
Source
BIOENGINEERING-BASEL | 2023 / Volume 10 / Issue 10
Funding
National Natural Science Foundation of China
Keywords
visual reconstruction; diffusion model; brain decoding; fMRI
DOI
10.3390/bioengineering10101117
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task with crucial research value in the fields of neuroscience and machine learning. Previous studies tend to emphasize reconstructing pixel-level features (contours, colors, etc.) or semantic features (object category) of the stimulus image, but typically, these properties are not reconstructed together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). Initially, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from fMRI data, capturing the underlying details of the original image. Subsequently, the Bootstrapping Language-Image Pre-training (BLIP) model is utilized to provide a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model is utilized to recover natural images from the fMRI patterns guided by both visual and semantic information. The experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, the best performance is achieved by DBDM in reconstructing the semantic details of the original image; the Inception, CLIP and SwAV distances are 0.611, 0.225 and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research.
Pages: 16
Related papers
41 in total
  • [1] Beliy R., 2019, Advances in Neural Information Processing Systems, Volume 32
  • [2] FUNCTIONAL MAPPING OF THE HUMAN VISUAL-CORTEX BY MAGNETIC-RESONANCE-IMAGING
    BELLIVEAU, JW
    KENNEDY, DN
    MCKINSTRY, RC
    BUCHBINDER, BR
    WEISSKOFF, RM
    COHEN, MS
    VEVEA, JM
    BRADY, TJ
    ROSEN, BR
    [J]. SCIENCE, 1991, 254 (5032) : 716 - 719
  • [3] Caron M, 2020, ADV NEUR IN, V33
  • [4] Chen Z., 2023, arXiv, DOI 10.48550/arxiv.2211.06956
  • [5] Child R., 2021, arXiv
  • [6] Functional magnetic resonance imaging (fMRI) "brain reading": detecting and classifying distributed patterns of fMRI activity in human visual cortex
    Cox, DD
    Savoy, RL
    [J]. NEUROIMAGE, 2003, 19 (02) : 261 - 270
  • [7] Hyperrealistic neural decoding for reconstructing faces from fMRI activations via the GAN latent space
    Dado, Thirza
    Gucluturk, Yagmur
    Ambrogioni, Luca
    Ras, Gabrielle
    Bosch, Sander
    van Gerven, Marcel
    Guclu, Umut
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [8] Decoding the representation of numerical values from brain activation patterns
    Damarla, Saudamini Roy
    Just, Marcel Adam
    [J]. HUMAN BRAIN MAPPING, 2013, 34 (10) : 2624 - 2634
  • [9] Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
  • [10] Donahue J, 2019, ADV NEUR IN, V32