UAPT: an underwater acoustic target recognition method based on pre-trained Transformer

Cited by: 1

Authors:

Tang, Jun [1]

Ma, Enxue [1]

Qu, Yang [1]

Gao, Wenbo [1]

Zhang, Yuchen [1]

Gan, Lin [2]

Affiliations:

[1] Tianjin Univ, Sch Civil Engn, Tianjin 300072, Peoples R China

[2] Northwestern Polytech Univ, Sch Automat, Xian 710072, Peoples R China

Source:

MULTIMEDIA SYSTEMS | 2025 / Vol. 31 / Issue 01

Keywords:

Underwater acoustic target recognition; Transformer; Transfer learning; Deep learning; Pre-train; MODEL;

DOI:

10.1007/s00530-024-01614-3

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In underwater acoustic target recognition (UATR) research, the Convolutional Neural Network (CNN) model is limited by its inability to capture long-distance dependencies, which impedes its capacity to focus on global information within the underwater acoustic signal. In contrast, the Transformer model has progressively emerged as the preferred choice in various studies, owing to its exclusive dependence on the attention mechanism for extracting global features from input data. The limited research applying the Transformer model to UATR has relied on an early ViT model. In this paper, two refined Transformer models, namely Swin Transformer and Biformer, are adopted as the foundational networks, and a novel Swin Biformer model is proposed that combines the strengths of the two. Experimental results demonstrate the consistent superiority of the three models over CNN and ViT in UATR, with the Swin Biformer model attaining the highest recognition accuracy of 94.3% on a dataset constructed from the Deepship database. This paper also proposes a UATR method based on a pre-trained Transformer. Its effectiveness is supported by experimental findings: a recognition accuracy of approximately 97% was achieved on a generalized dataset derived from the Shipsear database. Even with limited data samples and more stringent classification requirements, the method maintains a recognition accuracy of over 90% while significantly reducing the training duration.


Pages: 15

