Decoupled and Explainable Associative Memory for Effective Knowledge Propagation

Cited by: 0

Authors:

Fernando, Tharindu [1]

Priyasad, Darshana [1]

Sridharan, Sridha [1]

Fookes, Clinton [1]

Affiliations:

[1] Queensland University of Technology, Signal Processing, Artificial Intelligence and Vision Technologies (SAIVT), Brisbane, Qld 4000, Australia

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2025 / Vol. 36 / Issue 07

Funding:

Australian Research Council

Keywords:

Vectors; Memory management; Memory modules; Associative memory; Periodic structures; Machine learning; Learning systems; Three-dimensional displays; Technological innovation; Long short term memory; Auxiliary memory; decoupling; explainability; key-value memory networks (KVMNs); NETWORKS;

DOI:

10.1109/TNNLS.2024.3492133

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Long-term memory often plays a pivotal role in human cognition through the analysis of contextual information. Machine learning researchers have attempted to emulate this process through the development of memory-augmented neural networks (MANNs) to leverage indirectly related but resourceful historical observations during learning and inference. The area of MANN, however, is still in its infancy and significant research effort is required to enable machines to achieve performance close to the human cognition process. This article presents an innovative MANN framework for the advanced incorporation of historical knowledge into a predictive framework. Within the key-value memory structure, we propose to decouple the key representations from the learned value memory embeddings to offer improved associations between the inputs and latent memory embeddings. We argue that the keys should be static, sparse, and unique representations of a particular observation to offer robust input to memory associations, while the value embeddings could be trainable, dense latent vectors such that they can better capture historical knowledge. Moreover, we introduce a novel memory update procedure that preserves the explainability of the historical knowledge extraction process, which would enable the human end-users to interpret the deep machine learning model decisions, fostering their trust. With extensive experiments conducted on three different datasets using audio, text, and image modalities, we demonstrate that our proposed innovations collectively allow this framework to outperform the current state-of-the-art methods by significant margins, irrespective of the modalities or the downstream tasks. The code is available at https://github.com/tha725/DE-KVMN/tree/main.
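The decoupling idea in the abstract can be illustrated with a small sketch. This is not the authors' DE-KVMN code: the function names (`make_sparse_keys`, `read_memory`), the dot-product similarity, the softmax read, and all sizes are assumptions chosen for illustration, and the value vectors are random stand-ins for embeddings that would be trained in practice. It shows fixed, sparse, unique keys addressing a separate table of dense values, with the read weights returned so a reader can see which memory slots contributed to the output.

```python
import math
import random


def make_sparse_keys(num_slots, key_dim, active, seed=0):
    """Build fixed keys: each slot gets a distinct set of `active` nonzero positions."""
    rng = random.Random(seed)
    seen, keys = set(), []
    while len(keys) < num_slots:
        idx = tuple(sorted(rng.sample(range(key_dim), active)))
        if idx in seen:  # enforce uniqueness of every key
            continue
        seen.add(idx)
        keys.append([1.0 if i in idx else 0.0 for i in range(key_dim)])
    return keys


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(query, keys, values):
    """Match the query against the static keys, then mix the dense values.

    Returns the readout and the per-slot weights; the weights are what
    makes the lookup inspectable.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    readout = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return readout, weights


# Keys are static and sparse; values are dense (random here, trainable in practice).
keys = make_sparse_keys(num_slots=4, key_dim=16, active=3)
value_rng = random.Random(1)
values = [[value_rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]

# A query aligned with slot 2 should put most of its weight on slot 2.
query = [5.0 * k for k in keys[2]]
readout, weights = read_memory(query, keys, values)
best_slot = max(range(len(weights)), key=lambda i: weights[i])
```

Because the keys never change, the mapping from an input to a slot stays stable while the values are updated, which is the property the abstract argues for.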


Pages: 13204-13218

Number of pages: 15
