Temporal Information-Guided Generative Adversarial Networks for Stimuli Image Reconstruction From Human Brain Activities

Cited by: 5
Authors
Huang, Shuo [1 ]
Sun, Liang [1 ]
Yousefnezhad, Muhammad [1 ]
Wang, Meiling [1 ]
Zhang, Daoqiang [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, MIIT Key Lab Pattern Anal & Machine Intelligence, Nanjing 211106, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Visualization; Neuroscience; Functional magnetic resonance imaging; Reconstruction algorithms; Generative adversarial networks; Brain modeling; Loss measurement; Functional magnetic resonance imaging (fMRI); generative adversarial networks (GANs); long short-term memory; stimuli image reconstruction; NATURAL IMAGES
DOI
10.1109/TCDS.2021.3098743
CLC Number
TP18 [Theory of artificial intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Understanding how the human brain works has attracted increasing attention in both neuroscience and machine learning. Previous studies use autoencoders and generative adversarial networks (GANs) to improve the quality of stimuli image reconstruction from functional magnetic resonance imaging (fMRI) data. However, these methods mainly focus on acquiring relevant features between two different modalities of data, i.e., stimuli images and fMRI, while ignoring the temporal information of fMRI data, thus leading to suboptimal performance. To address this issue, in this article, we propose a temporal information-guided GAN (TIGAN) to reconstruct visual stimuli from human brain activities. Specifically, the proposed method consists of three key components: 1) an image encoder that maps the stimuli images into a latent space; 2) a long short-term memory (LSTM) generator for fMRI feature mapping, which captures temporal information in fMRI data; and 3) a discriminator for image reconstruction, which makes the reconstructed image more similar to the original image. In addition, to better measure the relationship between the two modalities of data (i.e., fMRI and natural images), we leverage a pairwise ranking loss that ranks stimuli images and fMRI so that strongly associated pairs appear at the top and weakly related ones at the bottom. The experimental results on real-world data sets suggest that the proposed TIGAN achieves better performance in comparison with several state-of-the-art image reconstruction approaches.
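The bidirectional pairwise ranking loss described in the abstract can be sketched as a hinge-style objective over matched (image, fMRI) feature pairs. The cosine similarity, margin value, and feature shapes below are illustrative assumptions, not the paper's exact formulation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def pairwise_ranking_loss(img_feats, fmri_feats, margin=0.2):
    """Hinge ranking loss: each matched (image i, fMRI i) pair should score
    at least `margin` higher than any mismatched pair, in both directions.
    The margin of 0.2 is an assumed hyperparameter for illustration."""
    n = len(img_feats)
    loss = 0.0
    for i in range(n):
        pos = cosine(img_feats[i], fmri_feats[i])  # matched-pair score
        for j in range(n):
            if j == i:
                continue
            # image i should match fMRI i better than fMRI j
            loss += max(0.0, margin - pos + cosine(img_feats[i], fmri_feats[j]))
            # fMRI i should match image i better than image j
            loss += max(0.0, margin - pos + cosine(img_feats[j], fmri_feats[i]))
    return loss
```

With perfectly aligned orthogonal features the loss is zero, and it grows as matched pairs become less similar than mismatched ones, which is the ranking behavior the abstract describes.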
Pages: 1104-1118
Page count: 15