Attention-based vector quantized variational autoencoder for anomaly detection constraints

Cited: 0
Authors
Yu, Qien [1 ]
Dai, Shengxin [2 ]
Dong, Ran [3 ]
Ikuno, Soichiro [4 ]
Affiliations
[1] Chongqing Jiaotong Univ, Sch Informat Sci & Engn, Chongqing 400074, Peoples R China
[2] Sichuan Univ, Coll Comp Sci, Chengdu 610065, Peoples R China
[3] Chukyo Univ, Sch Engn, Toyota, Aichi 4700393, Japan
[4] Tokyo Univ Technol, Sch Comp Sci, Hachioji, Tokyo 1920982, Japan
Funding
National Natural Science Foundation of China;
Keywords
Industrial image; Anomaly detection; Subspace projection; Attention mechanism; Vector quantized variational autoencoder;
DOI
10.1016/j.patcog.2025.111500
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
This paper introduces a framework for anomaly detection in industrial product image datasets: a vector quantized variational autoencoder (VQVAE) enhanced with orthogonal subspace constraints (OSC) and pyramid criss-cross attention (PCCA). Previous studies that model low-dimensional feature distributions have been unable to effectively separate normal features from noisy/abnormal information; this study addresses that limitation with OSC. The vector quantization mechanism is then applied within the two complementary subspaces, yielding normal and abnormal embedding subspaces with discrete representations for normal and noisy information, respectively. The proposed approach robustly represents low-dimensional discrete manifolds, capturing the information in normal data with a limited number of feature vectors. Additionally, two PCCA modules are proposed to capture feature maps from different layers of the encoder and decoder, benefiting the low-dimensional mapping and reconstruction processes. Features from different layers serve as the query (Q), key (K), and value (V), capturing both low-level and high-level features and incorporating comprehensive contextual information. The effectiveness of the proposed framework is assessed by comparing its performance with that of state-of-the-art approaches on various publicly available industrial product image datasets.
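The two core operations the abstract describes — decomposing latent features into complementary subspaces and quantizing them against a discrete codebook — can be sketched as below. This is an illustrative NumPy sketch only, not the authors' implementation: the function names `vector_quantize` and `split_orthogonal`, and the assumption that the normal-subspace basis `B` has orthonormal columns, are our own simplifications.

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each latent vector in z to its nearest codebook entry.

    z:        (N, D) encoder outputs
    codebook: (K, D) learned embedding vectors
    Returns the quantized vectors and their codebook indices.
    """
    # Squared Euclidean distance from every latent to every code: (N, K)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)          # index of nearest code per latent
    return codebook[idx], idx

def split_orthogonal(z, B):
    """Split z into a component inside span(B) (a stand-in for the
    'normal' subspace) and its orthogonal complement (the residual
    'noisy/abnormal' component), assuming B has orthonormal columns.
    """
    z_normal = z @ B @ B.T          # projection onto the subspace
    z_resid = z - z_normal          # orthogonal residual
    return z_normal, z_resid
```

In a full VQVAE the codebook and subspace basis would be learned jointly with the encoder (with a straight-through estimator passing gradients through the non-differentiable `argmin`); here both are fixed arrays so the geometry of the two operations is easy to inspect.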
Pages: 18