Deep local-to-global feature learning for medical image super-resolution

Cited by: 5

Authors:

Huang, Wenfeng [1,2]

Liao, Xiangyun [1]

Chen, Hao [3,4]

Hu, Ying [1]

Jia, Wenjing [2]

Wang, Qiong [1]

Affiliations:

[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Prov Key Lab Comp Vis & Virtual Real Tec, Shenzhen 518000, Peoples R China

[2] Univ Technol Sydney, Fac Engn & Informat Technol, Broadway, NSW 2007, Australia

[3] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China

[4] Hong Kong Univ Sci & Technol, Dept Chem & Biol Engn, Hong Kong, Peoples R China

Source:

COMPUTERIZED MEDICAL IMAGING AND GRAPHICS | 2024 / Volume 115

Funding:

National Key Research and Development Program of China;

Keywords:

Medical images; Super-resolution; Feature learning; Vision transformer; NETWORK;

DOI:

10.1016/j.compmedimag.2024.102374

Chinese Library Classification number:

R318 [Biomedical Engineering];

Discipline classification code:

0831;

Abstract:

Medical images play a vital role in medical analysis by providing crucial information about patients' pathological conditions. However, the quality of these images can be compromised by many factors, such as the limited resolution of the instruments, artifacts caused by movement, and the complexity of the scanned areas. As a result, low-resolution (LR) images cannot provide sufficient information for diagnosis. To address this issue, researchers have attempted to apply image super-resolution (SR) techniques to restore high-resolution (HR) images from their LR counterparts. However, these techniques are designed for generic images and thus face many challenges unique to medical images. An obvious one is the diversity of the scanned objects: organs, tissues, and vessels typically appear in different sizes and shapes, and are thus hard to restore with standard convolutional neural networks (CNNs). In this paper, we develop a dynamic-local learning framework, consisting of deformable convolutions with adjustable kernel shapes, to capture the details of these diverse areas. Moreover, the global information between tissues and organs is vital for medical diagnosis. To preserve global information, we propose pixel-pixel and patch-patch global learning using a non-local mechanism and a vision transformer (ViT), respectively. The result is a novel CNN-ViT neural network with Local-to-Global feature learning for medical image SR, referred to as LGSR, which can accurately restore both local details and global information. We evaluate our method on six public datasets and one large-scale private dataset, which together include five different types of medical images (i.e., ultrasound, OCT, endoscope, CT, and MRI images). Experiments show that the proposed method achieves better PSNR/SSIM and visual quality than state-of-the-art methods at a competitive computational cost, measured in network parameters, runtime, and FLOPs. Furthermore, an experiment on OCT image segmentation as a downstream task demonstrates that LGSR has a significantly positive effect on performance.
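The abstract reports results in PSNR, so a minimal sketch of that metric may help readers interpret the numbers. This is the standard PSNR definition, not code from the paper; the function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two same-sized grayscale images.

    Images are given as nested lists of pixel intensities (an illustrative
    choice); max_value is the assumed peak intensity, 255 for 8-bit images.
    """
    ref_pixels = [p for row in reference for p in row]
    out_pixels = [p for row in restored for p in row]
    if not ref_pixels or len(ref_pixels) != len(out_pixels):
        raise ValueError("images must be non-empty and have the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_pixels, out_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 HR reference and SR output, each pixel off by one level.
hr = [[52, 55], [61, 59]]
sr = [[53, 54], [60, 60]]
print(f"{psnr(hr, sr):.2f} dB")  # prints "48.13 dB"
```

A higher PSNR means the restored image is closer to the HR reference in the mean-squared-error sense; SSIM, the other metric named in the abstract, compares local structure instead and is not covered by this sketch.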

Pages: 11