Driven by textual knowledge: A Text-View Enhanced Knowledge Transfer Network for lung infection region segmentation

Cited by: 0

Authors:

Fang, Lexin [1]

Li, Xuemei [1]

Xu, Yunyang [1]

Zhang, Fan [2]

Zhang, Caiming [1]

Affiliations:

[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China

[2] Shandong Technol & Business Univ, Sch Comp Sci & Technol, Yantai 264005, Peoples R China

Source:

MEDICAL IMAGE ANALYSIS | 2025 / Vol. 103

Funding:

National Natural Science Foundation of China;

Keywords:

Medical image segmentation; Text supervision; Feature enhancement; Knowledge transfer network; ATTENTION;

DOI:

10.1016/j.media.2025.103625

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Lung infections are the leading cause of death among infectious diseases, and accurate segmentation of the infected lung area is crucial for effective treatment. Currently, segmentation methods that rely solely on imaging data have limited accuracy. Incorporating text information enriched with expert knowledge into the segmentation process has emerged as a novel approach. However, previous methods often used unified text encoding strategies to extract textual features, which failed to adequately emphasize critical details in the text, particularly the spatial location of infected regions. Moreover, the semantic space inconsistency between text and image features complicates cross-modal information transfer. To close these gaps, we propose a Text-View Enhanced Knowledge Transfer Network (TVE-Net) that leverages key information from textual data to assist in segmentation and enhance the model's perception of lung infection locations. The method generates a text view by probabilistically modeling the location information of infected areas in the text using a robust, carefully designed positional probability function. By assigning lesion probabilities to each image region, the spatial information about infected areas carried by the text view is explicitly integrated into the segmentation model. Once the text view has been introduced, a unified image encoder can be employed to extract text-view features, so that both text and images are mapped into the same space. In addition, a self-supervised constraint based on text-view overlap and feature consistency is proposed to enhance the model's robustness and semi-supervised capability through feature augmentation. Meanwhile, the newly designed multi-stage knowledge transfer module uses a globally enhanced cross-attention mechanism to comprehensively learn the implicit correlations between image features and text-view features, enabling effective knowledge transfer from text-view features to image features. Extensive experiments demonstrate that TVE-Net outperforms unimodal and multimodal methods in both fully supervised and semi-supervised lung infection segmentation tasks, achieving significant improvements on the QaTa-COV19 and MosMedData+ datasets.
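
The idea of a text view, in which location phrases in a report are turned into per-region lesion probabilities over the image grid, can be illustrated with a minimal sketch. This is not the paper's positional probability function, which the abstract does not specify. The keyword table `REGION_CENTERS`, the Gaussian falloff, the `sigma` parameter, and the function name `text_view` are all hypothetical choices made for illustration. The sketch also assumes a frontal chest image in which the patient's left lung appears on the right side of the image.

```python
import numpy as np

# Hypothetical keyword table (an assumption, not from the paper):
# phrase -> (row, col) center in normalized [0, 1] image coordinates.
# Assumes the patient's left lung is shown on the right side of the image.
REGION_CENTERS = {
    "upper left": (0.25, 0.75),
    "middle left": (0.50, 0.75),
    "lower left": (0.75, 0.75),
    "upper right": (0.25, 0.25),
    "middle right": (0.50, 0.25),
    "lower right": (0.75, 0.25),
}


def text_view(report: str, size: int = 64, sigma: float = 0.15) -> np.ndarray:
    """Return a (size, size) map of illustrative lesion probabilities in [0, 1].

    Each location phrase found in the report contributes a Gaussian bump
    centered on its region; overlapping bumps are merged with a maximum.
    """
    coords = np.linspace(0.0, 1.0, size)
    rows, cols = np.meshgrid(coords, coords, indexing="ij")
    view = np.zeros((size, size), dtype=np.float64)
    text = report.lower()
    for phrase, (r0, c0) in REGION_CENTERS.items():
        if phrase in text:
            dist_sq = (rows - r0) ** 2 + (cols - c0) ** 2
            bump = np.exp(-dist_sq / (2.0 * sigma ** 2))
            view = np.maximum(view, bump)
    return view


if __name__ == "__main__":
    view = text_view("Unilateral pulmonary infection, one infected area, upper left lung.")
    print(view.shape, float(view.min()), float(view.max()))
```

Because the result is an image-shaped array, it could in principle be passed through the same kind of encoder as the input image, which is the property the abstract relies on to map text and image information into one feature space.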


Pages: 17
