Masked Transformer for Image Anomaly Localization

Cited by: 25
Authors
De Nardin, Axel [1]
Mishra, Pankaj [1]
Foresti, Gian Luca [1]
Piciarelli, Claudio [1]
Affiliations
[1] Univ Udine, Dept Math Comp Sci & Phys, Via Sci 206, I-33100 Udine, Italy
Keywords
Anomaly detection; vision transformer; image inpainting; self-supervised learning
DOI
10.1142/S0129065722500307
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Image anomaly detection consists of detecting images or image portions that are visually different from the majority of the samples in a dataset. The task is of practical importance for real-world applications such as biomedical image analysis, visual inspection in industrial production, banking, and traffic management. Most current deep learning approaches rely on image reconstruction: the input image is projected into a latent space and then reconstructed, on the assumption that the network (trained mostly on normal data) will not be able to reconstruct the anomalous portions. However, this assumption does not always hold. We thus propose a new model based on the Vision Transformer architecture with patch masking: the input image is split into several patches, and each patch is reconstructed only from the surrounding data, thus ignoring the potentially anomalous information contained in the patch itself. We then show that multi-resolution patches and their collective embeddings provide a large improvement in the model's performance compared to the exclusive use of traditional square patches. The proposed model has been tested on popular anomaly detection datasets such as MVTec and Head CT and achieved good results when compared to other state-of-the-art approaches.
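The patch-masking idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the Vision Transformer is replaced here by a deliberately trivial stand-in that predicts each hidden patch as the mean of its neighbouring patches, and all function names (split_into_patches, reconstruct_from_neighbours, anomaly_map) are hypothetical. It only shows the principle that a patch is scored by how poorly it can be predicted from its surroundings, without ever looking at the patch itself.

```python
from typing import List

Patch = List[List[float]]


def split_into_patches(image: List[List[float]], patch: int) -> List[List[Patch]]:
    """Split a 2-D image into a grid of square patches of side `patch`."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    return [
        [[row[c:c + patch] for row in image[r:r + patch]]
         for c in range(0, w, patch)]
        for r in range(0, h, patch)
    ]


def reconstruct_from_neighbours(grid: List[List[Patch]], r: int, c: int) -> Patch:
    """Assumed stand-in for the masked transformer: predict patch (r, c)
    as the element-wise mean of its adjacent patches, never reading the
    content of patch (r, c) itself."""
    rows, cols, p = len(grid), len(grid[0]), len(grid[0][0])
    neighbours = [
        grid[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
        if (i, j) != (r, c)
    ]
    if not neighbours:
        raise ValueError("need at least two patches to reconstruct from context")
    n = len(neighbours)
    return [[sum(nb[y][x] for nb in neighbours) / n for x in range(p)]
            for y in range(p)]


def anomaly_map(image: List[List[float]], patch: int) -> List[List[float]]:
    """Per-patch anomaly score: mean squared error between each patch and
    its reconstruction from the surrounding patches."""
    grid = split_into_patches(image, patch)
    scores = []
    for r in range(len(grid)):
        row_scores = []
        for c in range(len(grid[0])):
            pred = reconstruct_from_neighbours(grid, r, c)
            actual = grid[r][c]
            sq_err = sum(
                (a - b) ** 2
                for actual_row, pred_row in zip(actual, pred)
                for a, b in zip(actual_row, pred_row)
            )
            row_scores.append(sq_err / (patch * patch))
        scores.append(row_scores)
    return scores


if __name__ == "__main__":
    # Uniform 6x6 image with one bright 2x2 "defect" in the centre.
    img = [[0.5] * 6 for _ in range(6)]
    for y in (2, 3):
        for x in (2, 3):
            img[y][x] = 1.0
    for row in anomaly_map(img, patch=2):
        print(row)
```

In this toy setting the defective centre patch receives the highest score because its neighbours give no hint of the bright region. A learned model would replace the neighbour mean with a network trained on normal images, and the multi-resolution patches mentioned in the abstract are not covered by this sketch.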
Pages: 16