Vision Transformer (ViT)-based Applications in Image Classification

Cited by: 7

Authors:

Huo, Yingzi ^{[1]}

Jin, Kai ^{[2]}

Cai, Jiahong ^{[1]}

Xiong, Huixuan ^{[1]}

Pang, Jiacheng ^{[1]}

Affiliations:

[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Hunan Key Lab Serv Comp & Novel Software Technol, Xiangtan 411201, Peoples R China

[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410002, Peoples R China

Source:

2023 IEEE 9TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD, BIGDATASECURITY, IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC AND IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY, IDS | 2023

Keywords:

CNN; image classification; token; vision transformer; Vision Reservoir

DOI:

10.1109/BigDataSecurity-HPSC-IDS58521.2023.00033

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

In recent years, the Vision Transformer (ViT) model has been widely used in computer vision, especially for image classification tasks. This paper summarizes the application of ViT to image classification. It first introduces the image classification implementation process and the basic architecture of the ViT model. It then analyzes and summarizes image classification methods, including traditional methods, CNN-based methods, and ViT-based methods, and provides a comparative analysis of CNN and ViT. Finally, it outlines the application prospects and future development of ViT in image classification, along with some shortcomings of ViT and their solutions.


Pages: 135-140

Page count: 6

50 records in total

[31] Gait-ViT: Gait Recognition with Vision Transformer
Mogan, Jashila Nair
Lee, Chin Poo
Lim, Kian Ming
Muthu, Kalaiarasi Sonai
SENSORS, 2022, 22 (19)
[32] Refined Feature-Space Window Attention Vision Transformer for Image Classification
Yoo D.
Yoo J.
Transactions of the Korean Institute of Electrical Engineers, 2024, 73 (06) : 1004 - 1011
[33] Fundus Image Classification Research Based on Ensemble Convolutional Neural Network and Vision Transformer
Yuan Yuan
Chen Minghui
Ke Shuting
Wang Teng
He Longxi
Lu Linjie
Sun Hao
Liu Jiannan
CHINESE JOURNAL OF LASERS-ZHONGGUO JIGUANG, 2022, 49 (20)
[34] Hyperspectral Image Classification Based on Multi-stage Vision Transformer with Stacked Samples
Chen, Xiaoyue
Kamata, Sei-Ichiro
Zhou, Weilian
2021 IEEE REGION 10 CONFERENCE (TENCON 2021), 2021 : 441 - 446
[35] Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak Labels
Gul, Ahmet Gokberk
Cetin, Oezdemir
Reich, Christoph
Flinner, Nadine
Prangemeier, Tim
Koeppl, Heinz
MEDICAL IMAGING 2022: DIGITAL AND COMPUTATIONAL PATHOLOGY, 2022, 12039
[36] Privacy-Preserving Image Classification Using Vision Transformer
Qi, Zheng
MaungMaung, AprilPyone
Kinoshita, Yuma
Kiya, Hitoshi
2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022 : 543 - 547
[37] MedViT: A robust vision transformer for generalized medical image classification
Manzari, Omid Nejati
Ahmadabadi, Hamid
Kashiani, Hossein
Shokouhi, Shahriar B.
Ayatollahi, Ahmad
COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 157
[38] Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification
Shiri, Mohammad
Reddy, Monalika Padma
Sun, Jiangwen
2024 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION FOR DATA SCIENCE, IRI 2024, 2024 : 296 - 301
[39] Mask-ViT: An Object Mask Embedding in Vision Transformer for Fine-Grained Visual Classification
Su, Tong
Ye, Shuo
Song, Chengqun
Cheng, Jun
2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022 : 1626 - 1630
[40] AnisotropicBreast-ViT: Breast Cancer Classification in Ultrasound Images Using Anisotropic Filtering and Vision Transformer
Diniz, Joao Otavio Bandeira
Ribeiro, Neilson P.
Dias, Domingos A., Jr.
da Cruz, Luana B.
da Silva, Giovanni L. F.
Gomes, Daniel L., Jr.
de Paiva, Anselmo C.
Silva, Aristofanes C.
INTELLIGENT SYSTEMS, BRACIS 2024, PT III, 2025, 15414 : 95 - 109
