BUS-M2AE: Multi-scale Masked Autoencoder for Breast Ultrasound Image Analysis

Cited by: 0
Authors
Yu, Le [1 ]
Gou, Bo [1 ,2 ]
Xia, Xun [2 ]
Yang, Yujia [3 ]
Yi, Zhang [1 ]
Min, Xiangde [4 ]
He, Tao [1 ]
Affiliations
[1] College of Computer Science, Sichuan University, Chengdu
[2] School of Clinical Medicine, The First Affiliated Hospital of Chengdu Medical College, Chengdu
[3] Department of Medical Ultrasound, West China Hospital of Sichuan University, Chengdu
[4] Department of Radiology, Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Wuhan
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
Masked Autoencoder; Multi-scale masking; Tumor segmentation; Ultrasound image classification;
DOI
10.1016/j.compbiomed.2025.110159
Abstract
Masked AutoEncoder (MAE) has demonstrated significant potential in medical image analysis by reducing the cost of manual annotation. However, MAE and its recent variants are not well developed for ultrasound images in breast cancer diagnosis, as they struggle to generalize to breast tumors of varying sizes; this limitation hinders the model's ability to adapt to the diverse morphological characteristics of breast tumors. In this paper, we propose a novel Breast UltraSound Multi-scale Masked AutoEncoder (BUS-M2AE) to address these limitations of the general MAE. BUS-M2AE incorporates multi-scale masking at both the token level, during the image patching stage, and the feature level, during the feature learning stage. These two multi-scale masking methods enable flexible strategies that match the explicit masked patches and the implicit features to varying tumor scales, allowing the pre-trained vision transformer to adaptively perceive and accurately distinguish breast tumors of different sizes and thereby improving its overall performance across diverse tumor morphologies. Comprehensive experiments demonstrate that BUS-M2AE outperforms recent MAE variants and commonly used supervised learning methods on breast cancer classification and tumor segmentation tasks. © 2025 Elsevier Ltd
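The record gives no implementation details, but the token-level multi-scale masking the abstract describes can be sketched roughly as: sample a block scale at random, then mask whole blocks of ViT patch tokens until a target ratio is reached. Everything below (the function name, the default 16x16 patch grid, the scale set) is an illustrative assumption, not the paper's actual method.

```python
import numpy as np

def multi_scale_mask(grid=16, mask_ratio=0.75, scales=(1, 2, 4), rng=None):
    """Illustrative multi-scale token masking (assumed, not from the paper).

    Randomly picks a block side length s from `scales`, then masks whole
    s x s blocks of ViT patch tokens until roughly `mask_ratio` of the
    blocks are masked. Assumes `grid` is divisible by each scale.
    Returns a boolean array of shape (grid*grid,), True = masked token.
    """
    rng = rng or np.random.default_rng()
    s = int(rng.choice(scales))            # block side length, in patches
    blocks = grid // s                     # blocks per side at this scale
    n_blocks = blocks * blocks
    n_masked = int(round(mask_ratio * n_blocks))
    chosen = rng.choice(n_blocks, size=n_masked, replace=False)
    mask = np.zeros((grid, grid), dtype=bool)
    for b in chosen:
        r, c = divmod(int(b), blocks)      # block's row/column in the grid
        mask[r * s:(r + 1) * s, c * s:(c + 1) * s] = True
    return mask.reshape(-1)
```

Larger scales remove larger contiguous regions, which is one plausible way to force the encoder to reconstruct (and hence represent) tumor-sized structures rather than only fine texture.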