Attention, please! A survey of neural attention models in deep learning

Authors
Alana de Santana Correia
Esther Luna Colombini
Affiliations
[1] University of Campinas, Laboratory of Robotics and Cognitive Systems (LaRoCS), Institute of Computing
Source
Artificial Intelligence Review | 2022, Vol. 55
Keywords
Survey; Attention mechanism; Neural networks; Deep learning; Attention models
DOI
Not available
Abstract
In humans, attention is a core property of all perceptual and cognitive operations. Given our limited capacity to process competing sources of information, attention mechanisms select, modulate, and focus on the information most relevant to behavior. For decades, the concepts and functions of attention have been studied in philosophy, psychology, neuroscience, and computing. For the last six years, this property has been widely explored in deep neural networks, and neural attention models now define the state of the art in deep learning across several application domains. This survey provides a comprehensive overview and analysis of developments in neural attention models. We systematically reviewed hundreds of architectures in the area, identifying and discussing those in which attention has shown a significant impact. We also developed and made public an automated methodology to facilitate the development of reviews in the area. By critically analyzing 650 works, we describe the primary uses of attention in convolutional networks, recurrent networks, and generative models, identifying common subgroups of uses and applications. Furthermore, we describe the impact of attention in different application domains and its effect on neural networks' interpretability. Finally, we list possible trends and opportunities for further research, hoping that this review will provide a succinct overview of the main attentional models in the area and guide researchers in developing future approaches that will drive further improvements.
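As context for the survey's subject, the sketch below illustrates scaled dot-product attention, the canonical mechanism underlying most modern neural attention models. It is a generic, minimal illustration and is not code taken from the paper; the names (Q, K, V for queries, keys, and values) follow standard convention.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """Generic scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

        Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v).
        Returns the attended output (n_queries, d_v) and the attention weights.
        """
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                 # similarity of each query to each key
        scores -= scores.max(axis=-1, keepdims=True)    # stabilize the softmax numerically
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # each row is a distribution over keys
        return weights @ V, weights

    # Toy usage: 2 queries attending over 3 key/value pairs
    rng = np.random.default_rng(0)
    Q = rng.normal(size=(2, 4))
    K = rng.normal(size=(3, 4))
    V = rng.normal(size=(3, 8))
    out, w = scaled_dot_product_attention(Q, K, V)
    print(out.shape, w.shape)  # (2, 8) (2, 3)

The weight matrix w makes the "select, modulate, and focus" behavior described in the abstract explicit: each row concentrates mass on the inputs most relevant to that query.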
Pages: 6037-6124
Page count: 87