Structured Neural Decoding With Multitask Transfer Learning of Deep Neural Network Representations

Cited by: 28
Authors
Du, Changde [1, 2, 3]
Du, Changying [4]
Huang, Lijie [1]
Wang, Haibao [1, 2]
He, Huiguang [1, 2, 5]
Affiliations
[1] Chinese Acad Sci, Res Ctr Brain Inspired Intelligence, Natl Lab Pattern Recognit, Inst Automat, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Huawei Cloud BU EI Innovat Lab, Beijing 100085, Peoples R China
[4] Huawei Noahs Ark Lab, Beijing 100085, Peoples R China
[5] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Decoding; Image reconstruction; Functional magnetic resonance imaging (fMRI); Visualization; Task analysis; Brain; Correlation; Deep neural network (DNN); Multioutput regression; Neural decoding; Natural images; Reconstruction; Faces
DOI
10.1109/TNNLS.2020.3028167
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reconstructing visual information from human brain activity is an important research topic in brain decoding. Existing methods ignore the structural information underlying brain activity and visual features, which severely limits their performance and interpretability. Here, we propose a hierarchically structured neural decoding framework that uses multitask transfer learning of deep neural network (DNN) representations and a matrix-variate Gaussian prior. Our framework consists of two stages, Voxel2Unit and Unit2Pixel. In Voxel2Unit, we decode the functional magnetic resonance imaging (fMRI) data to the intermediate features of a pretrained convolutional neural network (CNN). In Unit2Pixel, we invert the predicted CNN features back to the visual images. The matrix-variate Gaussian prior allows us to take into account the structures between feature dimensions and between regression tasks, which are useful for improving decoding effectiveness and interpretability. This contrasts with existing single-output regression models, which usually ignore these structures. We conduct extensive experiments on two real-world fMRI data sets, and the results show that our method predicts CNN features more accurately and reconstructs the perceived natural images and faces with higher quality.
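To make the role of a matrix-variate Gaussian prior in multioutput regression concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a plain linear model Y = XW + noise with isotropic Gaussian noise, and it assumes the row covariance (between input dimensions) and column covariance (between regression tasks) are known and fixed, whereas a full method would typically estimate them. The function name, variable names, and toy data are all hypothetical.

```python
import numpy as np

def map_multioutput_regression(X, Y, row_cov, col_cov, noise_var=1.0):
    """MAP weights for Y = X W + noise under W ~ MN(0, row_cov, col_cov).

    Illustrative assumption: both covariances are given, not learned.
    """
    d, t = X.shape[1], Y.shape[1]
    row_prec = np.linalg.inv(row_cov)  # precision between input dimensions
    col_prec = np.linalg.inv(col_cov)  # precision between regression tasks
    # Setting the gradient of the negative log posterior to zero gives
    #   X^T X W + noise_var * row_prec W col_prec = X^T Y.
    # With column-major vec, vec(A W B) = (B^T kron A) vec(W).
    lhs = np.kron(np.eye(t), X.T @ X) + noise_var * np.kron(col_prec, row_prec)
    rhs = (X.T @ Y).reshape(-1, order="F")
    w = np.linalg.solve(lhs, rhs)
    return w.reshape((d, t), order="F")

# Toy data: 40 samples, 6 input dimensions, 3 correlated output tasks.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
W_true = rng.standard_normal((6, 3))
Y = X @ W_true + 0.1 * rng.standard_normal((40, 3))

row_cov = np.eye(6)                      # no structure between inputs
col_cov = 0.5 * np.eye(3) + 0.5          # tasks share positive correlation
W_map = map_multioutput_regression(X, Y, row_cov, col_cov, noise_var=0.01)
print(W_map.shape)
```

When both covariances are identity matrices, the estimate reduces to ordinary ridge regression fitted independently per output, which is the single-output setting the abstract contrasts against. The Kronecker system grows as (d·t)², so this direct solve is only practical for small toy problems.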
Pages: 600-614
Page count: 15