The purpose of image fusion is to generate a single image that integrates the complementary information of the source images. Existing image fusion methods suffer from loss of detail, artifacts, and/or inconsistencies. To alleviate these problems, we propose a feature extraction network combined with axial attention, which captures long-range semantic information while extracting multi-scale features and therefore has a stronger feature representation capability. Existing fusion strategies likewise suffer from loss of detail. To address this, we propose a new fusion strategy in which a novel attention mechanism uses entropy features to aggregate edge and detail features. In addition, a new loss function is designed to constrain the network. To validate the effectiveness of the proposed method, experiments are performed on public datasets. Compared with other fusion methods, the proposed method achieves state-of-the-art results in both subjective and objective evaluations. Furthermore, ablation studies further support the effectiveness of the proposed method. © 2022 Published by Elsevier B.V.
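
The abstract does not specify how entropy features enter the attention mechanism, so the following is only a minimal sketch of the general idea that local entropy can serve as a measure of detail when weighting two sources. It is not the proposed network or fusion strategy. The function names `local_entropy` and `entropy_weighted_fusion`, the patch size, the bin count, and the simple normalized weighting are all assumptions made for illustration.

```python
import numpy as np


def local_entropy(img, patch=8, bins=16):
    """Shannon entropy (in bits) of each non-overlapping patch.

    Assumes a 2-D array with intensities in [0, 1].
    """
    rows, cols = img.shape[0] // patch, img.shape[1] // patch
    ent = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            block = img[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
            total = hist.sum()
            if total == 0:
                continue  # no in-range samples: leave entropy at 0
            p = hist[hist > 0] / total
            ent[i, j] = -np.sum(p * np.log2(p))
    return ent


def entropy_weighted_fusion(a, b, patch=8, bins=16, eps=1e-8):
    """Blend two same-sized images with weights proportional to local entropy.

    Illustrative assumption: a patch with higher entropy is treated as
    carrying more detail and therefore receives a larger weight.
    """
    ea = local_entropy(a, patch, bins)
    eb = local_entropy(b, patch, bins)
    wa = (ea + eps) / (ea + eb + 2 * eps)
    # Expand the per-patch weights back to pixel resolution.
    wa_full = np.kron(wa, np.ones((patch, patch)))
    h, w = wa_full.shape
    return wa_full * a[:h, :w] + (1.0 - wa_full) * b[:h, :w]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flat = np.full((16, 16), 0.5)      # no detail
    textured = rng.random((16, 16))    # plenty of detail
    fused = entropy_weighted_fusion(flat, textured)
    print(fused.shape)
```

In this sketch a flat source contributes almost nothing wherever the other source is textured, which is the behavior an entropy-guided weighting is meant to encourage. A learned attention module would replace the fixed normalization with trainable layers.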