Over the years, medical image segmentation has played a vital role in assisting healthcare professionals in disease treatment. Convolutional neural networks have achieved remarkable success in this domain, and among them the encoder-decoder architecture stands out as a classic and effective design for medical image segmentation. However, several challenges remain, including segmentation errors caused by indistinct boundaries, difficulty in segmenting lesions with irregular shapes, and inaccurate segmentation of small targets. To address these limitations, we propose the Encoder Activation Diffusion and Decoder Transformer Fusion Network (ADTF). Specifically, we propose a novel Lightweight Convolution Modulation (LCM) block, built on a gated attention mechanism that uses convolution to encode spatial features; LCM replaces the convolutional layers in the encoder-decoder network. In addition, to strengthen the integration of spatial information and dynamically extract more valuable high-order semantic information, we introduce Activation Diffusion blocks after the encoder (EAD), so that the network can produce complete segmentations of the target regions. Furthermore, we employ a Transformer-based multi-scale feature fusion module on the decoder (MDFT) to achieve global interaction among multi-scale features. To validate our approach, we conduct experiments on multiple medical image segmentation datasets. The results demonstrate that our model outperforms other state-of-the-art (SOTA) methods on commonly used evaluation metrics.
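To make the LCM idea concrete, the following is a minimal PyTorch sketch of a gated convolutional modulation block of the kind described above. The specific layer sizes, kernel size, normalization, and gating form are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of a gated convolutional modulation block in the spirit of LCM.
# All design details (7x7 depthwise gate, sigmoid gating, residual) are assumptions.
import torch
import torch.nn as nn


class LightweightConvModulation(nn.Module):
    """Gated attention via convolution: a depthwise conv encodes spatial context,
    and its sigmoid output modulates a pointwise value projection element-wise."""

    def __init__(self, channels: int, kernel_size: int = 7):
        super().__init__()
        self.norm = nn.BatchNorm2d(channels)
        # Value branch: 1x1 projection of the input features.
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        # Gate branch: depthwise conv captures local spatial structure cheaply.
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size,
                      padding=kernel_size // 2, groups=channels),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.norm(x)
        # Element-wise modulation: the spatial gate weights the value features.
        x = self.proj(self.gate(x) * self.value(x))
        return x + residual


if __name__ == "__main__":
    feats = torch.randn(2, 64, 32, 32)  # (batch, channels, height, width)
    block = LightweightConvModulation(64)
    print(block(feats).shape)  # torch.Size([2, 64, 32, 32])
```

In this reading, the block keeps the cost of a convolutional layer (depthwise plus pointwise convolutions) while the multiplicative gate provides attention-like spatial modulation, which is why it can stand in for the plain convolutional layers of the encoder-decoder.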