A dual-channel language decoding from brain activity with progressive transfer training

Cited by: 5
Authors
Huang, Wei [1 ]
Yan, Hongmei [1 ]
Cheng, Kaiwen [2 ]
Wang, Yuting [1 ]
Wang, Chong [1 ]
Li, Jiyi [1 ]
Li, Chen [3 ]
Li, Chaorong [1 ]
Zuo, Zhentao [4 ]
Chen, Huafu [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Clin Hosp,Chengdu Brain Sci Inst, Sch Life Sci & Technol, MOE Key Lab Neuroinformat,High Field Magnet Reson, Chengdu 610054, Peoples R China
[2] Sichuan Int Studies Univ, Sch Language Intelligence, Chongqing, Peoples R China
[3] Sichuan Univ, Dept Med Informat Engn, Chengdu, Peoples R China
[4] Chinese Acad Sci, State Key Lab Brain & Cognit Sci, Beijing MR Ctr Brain Res, Inst Biophys, Beijing 100101, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
artificial intelligence; functional magnetic resonance imaging; language decoding; progressive transfer; visual cortex; CATEGORIES; REPRESENTATION; OBJECTS;
DOI
10.1002/hbm.25603
CLC number
Q189 [Neuroscience];
Discipline code
071006;
Abstract
When we view a scene, the visual cortex extracts and processes its visual information through various kinds of neural activity. Previous studies have decoded such neural activity into single or multiple semantic category tags, which can caption the scene to some extent. However, these tags are isolated words with no grammatical structure and therefore convey only part of what the scene contains. Textual language (sentences or phrases) is superior to isolated words both in disclosing the meaning of images and in reflecting how people actually understand them. Here, drawing on artificial intelligence techniques, we built a dual-channel language decoding model (DC-LDM) to decode image-evoked neural activity into language (phrases or short sentences). The DC-LDM consists of five modules: Image-Extractor, Image-Encoder, Nerve-Extractor, Nerve-Encoder, and Language-Decoder. In addition, we employed a progressive transfer strategy to train the DC-LDM and improve its language-decoding performance. The results showed that the texts decoded by the DC-LDM described natural image stimuli accurately and vividly. We adopted six metrics to quantitatively evaluate the difference between the decoded texts and the annotated texts of the corresponding visual images, and found that Word2vec-Cosine similarity (WCS) best reflected the similarity between decoded and annotated texts. Moreover, across visual cortices, text decoded from the higher visual cortex was more consistent with the description of the natural image than text decoded from the lower visual cortex. Our decoding model may inform the exploration of language-based brain-computer interfaces.
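Of the six metrics, WCS is the most straightforward to reproduce. The record does not spell out its exact computation, so the sketch below assumes a common formulation: represent each text by the mean of its word2vec vectors and take the cosine similarity between the two means. The toy embedding table `emb` and the helper names `text_vector` and `wcs` are illustrative assumptions, not the authors' code; in practice one would load pretrained word2vec vectors (e.g., with gensim's `KeyedVectors.load_word2vec_format`).

```python
import numpy as np

def text_vector(words, embeddings):
    """Mean-pool the vectors of `words`; `embeddings` maps word -> np.ndarray."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    return np.mean(vecs, axis=0) if vecs else None

def wcs(decoded, annotated, embeddings):
    """Word2vec-Cosine similarity between two texts, in [-1, 1]."""
    u = text_vector(decoded.lower().split(), embeddings)
    v = text_vector(annotated.lower().split(), embeddings)
    if u is None or v is None:
        return 0.0  # one of the texts has no known words
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy 300-dimensional embeddings for illustration only; real WCS scoring
# would use pretrained word2vec vectors instead of random ones.
vocab = list(dict.fromkeys("a man rides a horse on the beach person riding".split()))
emb = {w: np.random.default_rng(i).standard_normal(300) for i, w in enumerate(vocab)}

print(wcs("a man rides a horse", "a person riding a horse", emb))
```

With real pretrained vectors, a decoded text that paraphrases the annotation ("a person riding a horse" vs. "a man rides a horse") scores high even when few surface words overlap, which is presumably why WCS tracked semantic similarity better than purely lexical metrics.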
Pages: 5089-5100
Page count: 12