Vision Transformers in medical computer vision-A contemplative retrospection

Cited by: 118
Authors
Parvaiz, Arshi [1]
Khalid, Muhammad Anwaar [1]
Zafar, Rukhsana [1]
Ameer, Huma [1]
Ali, Muhammad [1]
Fraz, Muhammad Moazam [1]
Affiliation
[1] Natl Univ Sci & Technol NUST, Islamabad 44000, Pakistan
Keywords
Vision Transformers; Medical image analysis; Self attention; Medical computer vision; Diagnostic image analysis; Literature survey; CONVOLUTIONAL NEURAL-NETWORK; BARRETTS-ESOPHAGUS; IMAGE DATABASE; LUNG-CANCER; SEGMENTATION; COVID-19; TOMOGRAPHY; CNN; LOCALIZATION; PREDICTION;
DOI
10.1016/j.engappai.2023.106126
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Vision Transformers (ViTs), with their strong capacity to extract the information contained in images, have become one of the dominant contemporary architectures in computer vision. Many researchers use them both for new experiments and to revisit earlier ones. In this article, we investigate the intersection of Vision Transformers and medical imaging. We give an overview of the ViT-based frameworks that researchers have used to address problems in medical computer vision. We survey the applications of Vision Transformers across areas of medical computer vision, including image-based disease classification, anatomical structure segmentation, registration, region-based lesion detection, captioning, report generation, and reconstruction, using multiple medical imaging modalities that greatly assist medical diagnosis and, in turn, treatment. We also describe several imaging modalities used in medical computer vision. To provide deeper insight, we briefly explain the self-attention mechanism of transformers. Finally, we critically analyze the ViT-based solutions for each image analysis task, discuss open challenges, and offer pointers to possible solutions and future directions. We hope this review will open new research directions for medical computer vision researchers.
Pages: 38