XMP: A CROSS-ATTENTION MULTI-SCALE PERFORMER FOR FILE FRAGMENT CLASSIFICATION

Cited by: 2

Authors:

Park, Jeong Gyu [1]

Liu, Sisung [2]

Hong, Je Hyeong [1,2]

Affiliations:

[1] Hanyang Univ, Dept Elect Engn, Seoul, South Korea

[2] Hanyang Univ, Dept Artificial Intelligence, Seoul, South Korea

Source:

2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024) | 2024

Keywords:

file fragment classification; Transformer; multi-scale attention; cross-attention; Performer

DOI:

10.1109/ICASSP48485.2024.10447626

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

File fragment classification (FFC) is the task of identifying the file type given a small fraction of binary data, and plays a crucial role in digital forensics and cybersecurity. Recent studies have adopted convolutional neural networks (CNNs) for this problem, significantly improving accuracy over traditional methods that rely on handcrafted features. In this paper, we aim to expand on the recent performance gain by better leveraging the large amount of digital files available for training. We propose to achieve this by employing a Transformer encoder-based network, known for its weak inductive bias suited for large-scale training. Our model, XMP, is inspired by the CrossViT architecture for image recognition and utilizes multi-scale self- and cross-attention between CNN features extracted from the byte n-grams of input binary data. Experimental results on the latest public dataset show XMP achieving state-of-the-art accuracies in almost all scenarios without the need for additional preprocessing of binary data such as bit shifting, demonstrating the effectiveness of the Transformer-based architecture for FFC. The benefit of each proposed component is assessed through an ablation study. Our code is available at github.com/pank40/xmp.
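
As a rough illustration of the "byte n-grams" mentioned in the abstract, the sketch below slides a window of n consecutive bytes over a fragment. The function name `byte_ngrams` and the sample fragment are hypothetical and chosen only for illustration; the abstract does not say which values of n XMP uses or how the n-grams are fed to its CNN feature extractors, so this should not be read as the authors' implementation.

```python
from typing import List


def byte_ngrams(fragment: bytes, n: int) -> List[bytes]:
    """Return all overlapping runs of n consecutive bytes in a fragment.

    Hypothetical helper for illustration only; not taken from the XMP code.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A fragment shorter than n yields an empty list.
    return [fragment[i:i + n] for i in range(len(fragment) - n + 1)]


# Sample fragment (an assumption): the first five bytes of a PDF header.
fragment = b"%PDF-"
bigrams = byte_ngrams(fragment, 2)
trigrams = byte_ngrams(fragment, 3)
print(bigrams)
print(trigrams)
```

Extracting n-grams at more than one value of n gives views of the same fragment at different granularities, which is one plausible reading of "multi-scale" in this setting.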

Pages: 4505-4509

Page count: 5
