Multi-focus image fusion is an important computer vision task that synthesizes multiple images captured with different focal settings into a single image with higher clarity and an extended depth of field. Existing methods, however, struggle in complex scenes, suffering from information loss, edge blurring, and insufficient extraction of fine image detail. To address these challenges, this paper proposes a multi-focus image fusion method based on adaptive weighting and interactive information modulation. Specifically, a detail and structure information enhancement block employs the dual-tree complex wavelet transform (DTCWT) to decompose features into frequency sub-bands. The DTCWT combines directional selectivity, multi-scale analysis, and approximate translation invariance, enabling efficient capture of frequency information while suppressing noise and preserving edges. Learnable weights then adaptively reweight the sub-bands, allowing the network to adjust the importance of different frequency components and thereby enhance feature representations. Moreover, an interactive information modulation structure strengthens local feature extraction through multi-level, multi-dimensional feature interaction and information flow. The feature fusion stage is also refined to better aggregate information across scales and channels, improving the network's ability to produce accurate and natural fusion results. Evaluations of both visual quality and objective metrics demonstrate that the proposed method outperforms other state-of-the-art multi-focus image fusion methods.
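As a rough illustration of the DTCWT-based enhancement block with adaptive weighting, the sketch below decomposes feature maps into a lowpass band and per-level highpass bands, scales each band with a learnable weight, and reconstructs. It assumes the third-party `pytorch_wavelets` package for the transform; the class name `AdaptiveDTCWTEnhance` and the one-scalar-per-band weighting granularity are illustrative assumptions, since the abstract does not specify these details.

```python
import torch
import torch.nn as nn
from pytorch_wavelets import DTCWTForward, DTCWTInverse  # assumed dependency


class AdaptiveDTCWTEnhance(nn.Module):
    """Hypothetical sketch of a detail/structure enhancement block:
    decompose features with the DTCWT, reweight each frequency band
    with a learnable scalar, then reconstruct."""

    def __init__(self, levels: int = 2):
        super().__init__()
        self.xfm = DTCWTForward(J=levels)  # forward dual-tree complex wavelet transform
        self.ifm = DTCWTInverse()          # matching inverse transform
        # one learnable weight for the lowpass band and one per highpass level;
        # initialized at 1 so the block starts as a near-identity mapping
        self.low_w = nn.Parameter(torch.ones(1))
        self.high_w = nn.Parameter(torch.ones(levels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        yl, yh = self.xfm(x)                     # lowpass + per-level complex highpass bands
        yl = yl * self.low_w                     # adaptively weight structure (lowpass)
        yh = [h * w for h, w in zip(yh, self.high_w)]  # adaptively weight detail (highpass)
        return self.ifm((yl, yh))                # reconstruct enhanced features


if __name__ == "__main__":
    block = AdaptiveDTCWTEnhance(levels=2)
    feats = torch.randn(1, 16, 64, 64)           # batch of feature maps
    print(block(feats).shape)                    # torch.Size([1, 16, 64, 64])
```

Because the weights multiply whole sub-bands, the network can learn, for example, to amplify highpass detail near focus boundaries while leaving the lowpass structure unchanged; finer-grained weighting (per orientation or per channel) would follow the same pattern.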
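The interactive information modulation structure is likewise only summarized in the abstract. Below is a minimal sketch, assuming a two-stream design in which each stream's channel descriptor and spatial map gate the other stream before a 1x1 convolution aggregates the result across channels; all names (`InteractiveModulation`, `chan_gate`, `spat_gate`) are hypothetical, not the paper's.

```python
import torch
import torch.nn as nn


class InteractiveModulation(nn.Module):
    """Hypothetical sketch of interactive information modulation between two
    feature streams (e.g. from the two source images): descriptors of one
    stream gate the other, so information flows across both branches
    before fusion. Gates are shared between streams for brevity."""

    def __init__(self, channels: int):
        super().__init__()
        # channel-wise gate: global pooling -> 1x1 conv -> sigmoid
        self.chan_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # spatial gate: project to a single attention map -> sigmoid
        self.spat_gate = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # cross-modulation: each stream is gated by the other's descriptors
        a_mod = a * self.chan_gate(b) * self.spat_gate(b)
        b_mod = b * self.chan_gate(a) * self.spat_gate(a)
        return self.fuse(torch.cat([a_mod, b_mod], dim=1))  # aggregate across channels
```

The cross-gating is the key design choice in this sketch: letting one branch's statistics modulate the other is one plausible realization of "multi-dimensional feature interaction", combining channel-level and spatial-level information flow before fusion.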