Local Reversible Transformer for semantic segmentation of grape leaf diseases

Cited: 8
Authors
Zhang, Xinxin [1 ,2 ]
Li, Fei [1 ]
Jin, Haibin [1 ,2 ]
Mu, Weisong [1 ,2 ,3 ]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr, Key Lab Viticulture & Enol, Beijing 100083, Peoples R China
[3] China Agr Univ, POB 121,17 Tsinghua East Rd, Beijing 100083, Peoples R China
Keywords
Local learning bottleneck; Reversible downsampling; Grape leaf diseases; Semantic segmentation;
DOI
10.1016/j.asoc.2023.110392
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Segmentation of grape leaf diseases is an essential basis for the precise diagnosis and identification of diseases. However, complex backgrounds make it difficult to segment small disease areas precisely. Existing Transformers mainly focus on key and value downsampling to improve model performance, neglecting that this downsampling is irreversible and loses contextual information. To this end, this paper proposes a novel Local Reversible Transformer (LRT) segmentation model for grape leaf diseases in natural scene images, whose representation is learned in a reversible downsampling manner. Specifically, a Local Learning Bottleneck (LLB) is developed to enhance local perception and extract richer semantic information about grape leaf diseases via inverted residual convolution. Furthermore, motivated by wavelet theory, Reversible Attention (RA) is designed to replace the original downsampling operation by introducing the wavelet transform into multi-head attention, thereby addressing the difficulty of detecting and segmenting small disease targets against complex backgrounds. Extensive experiments demonstrate that LRT outperforms state-of-the-art models in segmentation performance at comparable GFLOPs and parameter counts. Moreover, LRT retains more multi-grained information and enlarges the receptive field to focus on small disease regions against complex backgrounds.
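The abstract pins LRT on two concrete mechanisms: an inverted-residual bottleneck for local perception (LLB) and a wavelet transform that replaces key/value downsampling inside multi-head attention (RA). The PyTorch sketch below is not the authors' implementation; it only illustrates the two ideas under stated assumptions. `InvertedResidual`, `haar_dwt`, and `WaveletKVAttention`, together with all their parameters, are hypothetical names, the Haar filter bank is just one standard wavelet choice, and the abstract does not specify how LRT fuses the sub-bands.

```python
import torch
import torch.nn as nn


class InvertedResidual(nn.Module):
    """Inverted residual convolution in the MobileNetV2 style:
    expand 1x1 -> depthwise 3x3 -> project 1x1, plus a skip connection.
    A plausible stand-in for the paper's Local Learning Bottleneck (LLB)."""

    def __init__(self, dim: int, expand: int = 4):
        super().__init__()
        hidden = dim * expand
        self.block = nn.Sequential(
            nn.Conv2d(dim, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.GELU(),
            nn.Conv2d(hidden, dim, 1, bias=False), nn.BatchNorm2d(dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.block(x)  # residual path keeps local detail


def haar_dwt(x: torch.Tensor) -> torch.Tensor:
    """One level of the 2D Haar wavelet transform: (B, C, H, W) -> (B, 4C, H/2, W/2).
    The transform is orthogonal, so the four sub-bands together carry exactly
    the information of the input; halving the resolution here loses nothing,
    unlike strided pooling."""
    a = x[:, :, 0::2, 0::2]   # top-left of each 2x2 patch
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2  # low-frequency approximation
    lh = (a - b + c - d) / 2  # horizontal detail
    hl = (a + b - c - d) / 2  # vertical detail
    hh = (a - b - c + d) / 2  # diagonal detail
    return torch.cat([ll, lh, hl, hh], dim=1)


class WaveletKVAttention(nn.Module):
    """Multi-head attention whose keys/values are downsampled with the Haar
    transform instead of a strided convolution; a hypothetical reading of
    the paper's Reversible Attention (RA)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.scale = heads, (dim // heads) ** -0.5
        self.q = nn.Conv2d(dim, dim, 1)
        self.kv = nn.Conv2d(4 * dim, 2 * dim, 1)  # fuse the four sub-bands
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, C, H, W = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (B, HW, C)
        k, v = self.kv(haar_dwt(x)).chunk(2, dim=1)     # (B, C, H/2, W/2) each
        k = k.flatten(2).transpose(1, 2)                # (B, HW/4, C)
        v = v.flatten(2).transpose(1, 2)
        split = lambda t: t.reshape(B, -1, self.heads, C // self.heads).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        attn = (q @ k.transpose(-2, -1)) * self.scale   # (B, h, HW, HW/4)
        out = attn.softmax(dim=-1) @ v                  # (B, h, HW, C/h)
        out = out.transpose(1, 2).reshape(B, H, W, C).permute(0, 3, 1, 2)
        return self.proj(out)


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)  # even H and W are assumed by haar_dwt
    y = WaveletKVAttention(64)(InvertedResidual(64)(x))
    print(y.shape)                  # torch.Size([2, 64, 32, 32])
```

Because the Haar transform is orthogonal, the four half-resolution sub-bands can be inverted exactly back to the input; that is the sense in which wavelet downsampling can be reversible, whereas a strided key/value reduction discards information for good.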
Pages: 13
Related papers
50 records in total
  • [31] Efficient Depth Fusion Transformer for Aerial Image Semantic Segmentation
    Yan, Li
    Huang, Jianming
    Xie, Hong
    Wei, Pengcheng
    Gao, Zhao
    REMOTE SENSING, 2022, 14 (05)
  • [32] Learning graph structures with transformer for weakly supervised semantic segmentation
    Sun, Wanchun
    Feng, Xin
    Ma, Hui
    Liu, Jingyao
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (06) : 7511 - 7521
  • [34] Transformer framework for depth-assisted UDA semantic segmentation
    Song, Yunna
    Shi, Jinlong
    Zou, Danping
    Liu, Caisheng
    Bai, Suqin
    Shu, Xin
    Qian, Qian
    Xu, Dan
    Yuan, Yu
    Sun, Yunhan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 137
  • [35] Transformer fusion for indoor RGB-D semantic segmentation
    Wu, Zongwei
    Zhou, Zhuyun
    Allibert, Guillaume
    Stolz, Christophe
    Demonceaux, Cedric
    Ma, Chao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [36] Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    SENSORS, 2023, 23 (02)
  • [37] Remote sensing image semantic segmentation based on cascaded Transformer
    Wang, F.
    Ji, J.
    Wang, Y.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024 : 4136 - 4148
  • [38] Privacy-Preserving Semantic Segmentation Using Vision Transformer
    Kiya, Hitoshi
    Nagamori, Teru
    Imaizumi, Shoko
    Shiota, Sayaka
    JOURNAL OF IMAGING, 2022, 8 (09)
  • [39] A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation
    Ma, Xianping
    Zhang, Xiaokang
    Pun, Man-On
    Liu, Ming
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [40] TBFormer: three-branch efficient transformer for semantic segmentation
    Wei, Can
    Wei, Yan
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 : 3661 - 3672