HSNet: A hybrid semantic network for polyp segmentation

Cited by: 78
Authors
Zhang, Wenchao [1]
Fu, Chong [1, 2, 3]
Zheng, Yu [4]
Zhang, Fangyuan [5]
Zhao, Yanli [6]
Sham, Chiu-Wing [7]
Affiliations
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110819, Peoples R China
[2] Minist Educ, Engn Res Ctr Secur Technol Complex Network Syst, Shenyang, Peoples R China
[3] Northeastern Univ, Key Lab Intelligent Comp Med Image, Minist Educ, Shenyang 110819, Peoples R China
[4] Chinese Univ Hong Kong, Dept Informat Engn, Sha Tin, Hong Kong, Peoples R China
[5] China Med Univ, Dept Gen Surg, Shengjing Hosp, Shenyang, Peoples R China
[6] Ningxia Inst Sci & Technol, Sch Elect Informat Engn, Shizuishan 753000, Peoples R China
[7] Univ Auckland, Sch Comp Sci, Auckland, New Zealand
Keywords
Polyp segmentation; Hybrid semantic; Dual-branch; Long-range dependencies; Local details;
DOI
10.1016/j.compbiomed.2022.106173
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Automatic polyp segmentation helps physicians locate polyps (i.e., regions of interest) in clinical practice by screening colonoscopy images with the assistance of neural networks (NNs). However, two significant bottlenecks limit its effectiveness and fall short of physicians' expectations. (1) Polyps vary in scale, orientation, and illumination, which makes accurate segmentation difficult. (2) Current works built on the dominant encoder-decoder network tend to overlook the appearance details (e.g., textures) of tiny polyps, which degrades the accuracy of differentiating polyps. To alleviate these bottlenecks, we investigate a hybrid semantic network (HSNet) that combines the advantages of Transformers and convolutional neural networks (CNNs) to improve polyp segmentation. HSNet contains a cross-semantic attention (CSA) module, a hybrid semantic complementary (HSC) module, and a multi-scale prediction (MSP) module. Unlike previous works on polyp segmentation, we introduce the CSA module, which bridges the gap between low-level and high-level features through an interactive mechanism that exchanges two types of semantics from different NN attentions. Using a dual-branch structure of Transformer and CNN, we design the HSC module to capture both long-range dependencies and local appearance details. In addition, the MSP module learns weights for fusing the stage-level prediction masks of the decoder. Experimentally, we compared our work with 10 state-of-the-art works, including both recent and classical ones, and it shows improved accuracy (on 7 evaluation metrics) over 5 benchmark datasets, e.g., it achieves 0.926/0.877 mDice/mIoU on Kvasir-SEG, 0.948/0.905 mDice/mIoU on ClinicDB, 0.810/0.735 mDice/mIoU on ColonDB, 0.808/0.74 mDice/mIoU on ETIS, and 0.903/0.839 mDice/mIoU on Endoscene. The proposed model is available at https://github.com/baiboat/HSNet.
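For readers unfamiliar with the mean Dice and mean IoU scores reported in the abstract, the following is a minimal sketch of how such scores are commonly computed on binary masks. It is not taken from the HSNet code: the function names `dice_and_iou` and `mean_dice_iou`, the smoothing term `eps`, and the per-image averaging are assumptions for illustration, and the paper's exact evaluation protocol (thresholding, averaging order) is not stated in this record.

```python
from typing import Iterable, Sequence, Tuple

# Binary mask as nested sequences: 1 = polyp pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def dice_and_iou(pred: Mask, target: Mask, eps: float = 1e-8) -> Tuple[float, float]:
    """Return (Dice, IoU) for one pair of equally sized binary masks.

    `eps` is an assumed smoothing term that avoids division by zero
    when both masks are empty.
    """
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t      # pixel counted only if set in both masks
            pred_sum += p
            target_sum += t
    union = pred_sum + target_sum - inter
    dice = (2 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_dice_iou(pairs: Iterable[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """Average the per-image Dice and IoU over (prediction, ground truth) pairs."""
    scores = [dice_and_iou(pred, target) for pred, target in pairs]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n
```

Dice weights the overlap twice relative to the combined mask sizes, so for the same prediction it is never lower than IoU, which is consistent with each reported pair above.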
Pages: 10