Local Reversible Transformer for semantic segmentation of grape leaf diseases

Cited: 8
Authors
Zhang, Xinxin [1 ,2 ]
Li, Fei [1 ]
Jin, Haibin [1 ,2 ]
Mu, Weisong [1 ,2 ,3 ]
Affiliations
[1] China Agr Univ, Coll Informat & Elect Engn, Beijing 100083, Peoples R China
[2] Minist Agr, Key Lab Viticulture & Enol, Beijing 100083, Peoples R China
[3] China Agr Univ, POB 121,17 Tsinghua East Rd, Beijing 100083, Peoples R China
Keywords
Local learning bottleneck; Reversible downsampling; Grape leaf diseases; Semantic segmentation;
DOI
10.1016/j.asoc.2023.110392
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Segmentation of grape leaf diseases is an essential basis for the precise diagnosis and identification of diseases. However, complex backgrounds make it difficult to segment small disease areas precisely. Existing Transformers mainly focus on key and value downsampling to improve model performance, neglecting that this downsampling is irreversible and loses contextual information. To this end, this paper proposes a novel Local Reversible Transformer (LRT) segmentation model for grape leaf diseases in natural scene images, whose representation is learned in a reversible downsampling manner. Specifically, a Local Learning Bottleneck (LLB) is developed to enhance local perception and extract richer semantic information about grape leaf diseases via inverted residual convolution. Furthermore, motivated by wavelet theory, Reversible Attention (RA) is designed to replace the original downsampling operation by introducing the wavelet transform into multi-head attention, thereby addressing the difficulty of detecting and segmenting small disease targets against complex backgrounds. Extensive experiments demonstrate that LRT outperforms state-of-the-art models in segmentation performance at comparable GFLOPs and parameter counts. Moreover, LRT retains more multi-grained information and enlarges the receptive field to focus on small disease regions against complex backgrounds.
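The abstract pins LRT on two concrete mechanisms: an inverted-residual bottleneck for local perception (LLB) and a wavelet transform that replaces key/value downsampling inside multi-head attention (RA). The PyTorch sketch below is not the authors' implementation; it only illustrates the two ideas under stated assumptions. `InvertedResidual`, `haar_dwt`, and `WaveletKVAttention`, together with all their parameters, are hypothetical names, the Haar filter bank is just one standard wavelet choice, and the abstract does not specify how LRT fuses the sub-bands.

```python
import torch
import torch.nn as nn


class InvertedResidual(nn.Module):
    """Inverted residual convolution in the MobileNetV2 style:
    expand 1x1 -> depthwise 3x3 -> project 1x1, plus a skip connection.
    A plausible stand-in for the paper's Local Learning Bottleneck (LLB)."""

    def __init__(self, dim: int, expand: int = 4):
        super().__init__()
        hidden = dim * expand
        self.block = nn.Sequential(
            nn.Conv2d(dim, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.GELU(),
            nn.Conv2d(hidden, dim, 1, bias=False), nn.BatchNorm2d(dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.block(x)  # residual path keeps local detail


def haar_dwt(x: torch.Tensor) -> torch.Tensor:
    """One level of the 2D Haar wavelet transform: (B, C, H, W) -> (B, 4C, H/2, W/2).
    The transform is orthogonal, so the four sub-bands together carry exactly
    the information of the input; halving the resolution here loses nothing,
    unlike strided pooling."""
    a = x[:, :, 0::2, 0::2]   # top-left of each 2x2 patch
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2  # low-frequency approximation
    lh = (a - b + c - d) / 2  # horizontal detail
    hl = (a + b - c - d) / 2  # vertical detail
    hh = (a - b - c + d) / 2  # diagonal detail
    return torch.cat([ll, lh, hl, hh], dim=1)


class WaveletKVAttention(nn.Module):
    """Multi-head attention whose keys/values are downsampled with the Haar
    transform instead of a strided convolution; a hypothetical reading of
    the paper's Reversible Attention (RA)."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.scale = heads, (dim // heads) ** -0.5
        self.q = nn.Conv2d(dim, dim, 1)
        self.kv = nn.Conv2d(4 * dim, 2 * dim, 1)  # fuse the four sub-bands
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, C, H, W = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (B, HW, C)
        k, v = self.kv(haar_dwt(x)).chunk(2, dim=1)     # (B, C, H/2, W/2) each
        k = k.flatten(2).transpose(1, 2)                # (B, HW/4, C)
        v = v.flatten(2).transpose(1, 2)
        split = lambda t: t.reshape(B, -1, self.heads, C // self.heads).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        attn = (q @ k.transpose(-2, -1)) * self.scale   # (B, h, HW, HW/4)
        out = attn.softmax(dim=-1) @ v                  # (B, h, HW, C/h)
        out = out.transpose(1, 2).reshape(B, H, W, C).permute(0, 3, 1, 2)
        return self.proj(out)


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)  # even H and W are assumed by haar_dwt
    y = WaveletKVAttention(64)(InvertedResidual(64)(x))
    print(y.shape)                  # torch.Size([2, 64, 32, 32])
```

Because the Haar transform is orthogonal, the four half-resolution sub-bands can be inverted exactly back to the input; that is the sense in which wavelet downsampling can be reversible, whereas a strided key/value reduction discards information for good.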
Pages: 13
Related papers
50 records in total
  • [31] Efficient Depth Fusion Transformer for Aerial Image Semantic Segmentation
    Yan, Li
    Huang, Jianming
    Xie, Hong
    Wei, Pengcheng
    Gao, Zhao
    REMOTE SENSING, 2022, 14 (05)
  • [32] Learning graph structures with transformer for weakly supervised semantic segmentation
    Sun, Wanchun
    Feng, Xin
    Ma, Hui
    Liu, Jingyao
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (06) : 7511 - 7521
  • [34] Transformer framework for depth-assisted UDA semantic segmentation
    Song, Yunna
    Shi, Jinlong
    Zou, Danping
    Liu, Caisheng
    Bai, Suqin
    Shu, Xin
    Qian, Qian
    Xu, Dan
    Yuan, Yu
    Sun, Yunhan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 137
  • [35] Transformer fusion for indoor RGB-D semantic segmentation
    Wu, Zongwei
    Zhou, Zhuyun
    Allibert, Guillaume
    Stolz, Christophe
    Demonceaux, Cedric
    Ma, Chao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [36] Enhancing Mask Transformer with Auxiliary Convolution Layers for Semantic Segmentation
    Xia, Zhengyu
    Kim, Joohee
    SENSORS, 2023, 23 (02)
  • [37] Remote sensing image semantic segmentation based on cascaded Transformer
    Wang, F.
    Ji, J.
    Wang, Y.
    IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, 2024 : 4136 - 4148
  • [38] Privacy-Preserving Semantic Segmentation Using Vision Transformer
    Kiya, Hitoshi
    Nagamori, Teru
    Imaizumi, Shoko
    Shiota, Sayaka
    JOURNAL OF IMAGING, 2022, 8 (09)
  • [39] A Multilevel Multimodal Fusion Transformer for Remote Sensing Semantic Segmentation
    Ma, Xianping
    Zhang, Xiaokang
    Pun, Man-On
    Liu, Ming
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 15
  • [40] TBFormer: three-branch efficient transformer for semantic segmentation
    Wei, Can
    Wei, Yan
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 : 3661 - 3672