End-to-End Deep Image Reconstruction From Human Brain Activity

Cited by: 99

Authors:

Shen, Guohua ^{[1]}

Dwivedi, Kshitij ^{[1]}

Majima, Kei ^{[2]}

Horikawa, Tomoyasu ^{[1]}

Kamitani, Yukiyasu ^{[1,2]}

Affiliations:

[1] Adv Telecommun Res Inst Int, Computat Neurosci Labs, Kyoto, Japan

[2] Kyoto Univ, Grad Sch Informat, Kyoto, Japan

Source:

FRONTIERS IN COMPUTATIONAL NEUROSCIENCE | 2019 / Volume 13

Keywords:

brain decoding; visual image reconstruction; functional magnetic resonance imaging; deep neural networks; generative adversarial networks; NEURAL-NETWORKS; REPRESENTATIONS;

DOI:

10.3389/fncom.2019.00021

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Deep neural networks (DNNs) have recently been applied successfully to brain decoding and image reconstruction from functional magnetic resonance imaging (fMRI) activity. However, direct training of a DNN with fMRI data is often avoided because the size of available data is thought to be insufficient for training a complex network with numerous parameters. Instead, a pre-trained DNN usually serves as a proxy for hierarchical visual representations, and fMRI data are used to decode individual DNN features of a stimulus image using a simple linear model, which are then passed to a reconstruction module. Here, we directly trained a DNN model with fMRI data and the corresponding stimulus images to build an end-to-end reconstruction model. We accomplished this by training a generative adversarial network with an additional loss term that was defined in high-level feature space (feature loss) using up to 6,000 training data samples (natural images and fMRI responses). The above model was tested on independent datasets and directly reconstructed images using an fMRI pattern as the input. Reconstructions obtained from our proposed method resembled the test stimuli (natural and artificial images) and reconstruction accuracy increased as a function of training-data size. Ablation analyses indicated that the feature loss that we employed played a critical role in achieving accurate reconstruction. Our results show that the end-to-end model can learn a direct mapping between brain activity and perception.
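
The abstract describes a generator trained with an adversarial objective plus a loss defined in high-level feature space. The following is a minimal sketch of how such a combined objective might be computed, not the authors' implementation. The function names, the weights `w_feat` and `w_adv`, the non-saturating form of the adversarial term, and the use of mean squared error for the feature loss are all assumptions made for illustration; the record above does not specify them. Feature vectors are plain Python lists standing in for the activations of a pre-trained network.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def adversarial_loss(d_prob_fake, eps=1e-12):
    """Non-saturating generator loss, -log D(G(fMRI)) (assumed form).

    d_prob_fake is the discriminator's probability that the
    reconstructed image is real.
    """
    return -math.log(max(d_prob_fake, eps))


def generator_loss(recon_feat, target_feat, d_prob_fake,
                   w_feat=1.0, w_adv=1.0):
    """Weighted sum of a feature-space loss and an adversarial loss.

    recon_feat / target_feat: hypothetical high-level features of the
    reconstructed image and of the true stimulus image.
    w_feat / w_adv: hypothetical weights, not taken from the paper.
    """
    feat = mse(recon_feat, target_feat)
    adv = adversarial_loss(d_prob_fake)
    return w_feat * feat + w_adv * adv


# Toy values only; real features would come from a pre-trained DNN.
recon_feat = [0.2, 0.8, 0.5]
target_feat = [0.0, 1.0, 0.5]
loss = generator_loss(recon_feat, target_feat, d_prob_fake=0.5)
print(loss)
```

Setting `w_feat=0.0` in this sketch removes the feature term, which mirrors the kind of ablation the abstract mentions, although the actual ablation protocol is not described in this record.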


Pages: 11

