Robust Representation Learning via Sparse Attention Mechanism for Similarity Models

Cited by: 0
Authors
Ermilova, Alina [1 ]
Baramiia, Nikita [1 ]
Kornilov, Valerii [1 ]
Petrakov, Sergey [1 ]
Zaytsev, Alexey [1 ,2 ]
Affiliations
[1] Skolkovo Inst Sci & Technol, Moscow 121205, Russia
[2] Sber, Risk Management, Moscow 121165, Russia
Source
IEEE ACCESS, 2024, Vol. 12
Keywords
Transformers; Oil insulation; Task analysis; Time series analysis; Meteorology; Training; Deep learning; Representation learning; efficient transformer; robust transformer; representation learning; similarity learning; TRANSFORMER;
DOI
10.1109/ACCESS.2024.3418779
CLC number: TP [Automation technology; computer technology]
Subject classification code: 0812
Abstract
Attention-based models are widely used for time series data. However, because the complexity of attention is quadratic in the input sequence length, the application of Transformers is limited by high resource demands. Moreover, their modifications for industrial time series need to be robust to missing or noisy values, which complicates broadening their application. To cope with these issues, we introduce a class of efficient Transformers named Regularized Transformers (Reguformers). We implement a regularization technique inspired by dropout that improves robustness and reduces computational expense without significantly modifying the pipeline. Our experiments focus on oil & gas data. For the well-interval similarity task, our best Reguformer configuration reaches a ROC AUC of 0.97, which is comparable to Informer (0.978) and outperforms the baselines: the previous LSTM model (0.934), the classical Transformer (0.967), and three recent, promising modifications of the original Transformer, namely Performer (0.949), LRformer (0.955), and DropDim (0.777). We also conduct the corresponding experiments on three additional datasets from different domains and obtain superior results. The quality gain of the best Reguformer over the Transformer varies from 3.7% to 9.6% across datasets, while the gain over Informer spans a wider range: from 1.7% to 18.4%.
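To make the dropout-inspired mechanism concrete, below is a minimal PyTorch sketch under our own assumptions: the abstract does not specify where the regularization is applied, so the function name regularized_attention, the drop_prob parameter, and the choice to drop random key/value positions during training are illustrative rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def regularized_attention(q, k, v, drop_prob=0.2, training=True):
    # Scaled dot-product attention with dropout-style subsampling of
    # key/value positions (an assumed reading of the Reguformer idea,
    # not the paper's exact scheme).
    # Shapes: q is (B, Lq, D); k and v are (B, Lk, D).
    if training and drop_prob > 0:
        keep = torch.rand(k.size(1)) > drop_prob  # random mask over key positions
        if keep.any():  # guard against dropping every position
            k, v = k[:, keep, :], v[:, keep, :]
    # The score matrix is (B, Lq, Lk_kept): fewer kept keys means both a
    # smaller attention matrix (less compute) and a stochastic regularizer.
    scores = q @ k.transpose(-2, -1) / (k.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

# Usage: self-attention over a batch of 4 series of length 128, model dim 64.
x = torch.randn(4, 128, 64)
print(regularized_attention(x, x, x).shape)  # torch.Size([4, 128, 64])
```

Under this reading, dropping a fraction p of key positions shrinks the score matrix from Lq x Lk to roughly Lq x (1-p)Lk, which would let regularization and reduced computational expense come from a single operation, consistent with the abstract's claims.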
Pages: 97833-97850 (18 pages)
Related papers (50 in total; first 10 shown)
  • [1] Huang, Feiran; Jolfaei, Alireza; Bashir, Ali Kashif. Robust Multimodal Representation Learning With Evolutionary Adversarial Attention Networks. IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2021, 25(05): 856-868.
  • [2] Wang, Wenyan; Zuo, Enguang; Chen, Chen; Chen, Cheng; Zhong, Jie; Yan, Ziwei; Lv, Xiaoyi. Efficient time series adaptive representation learning via Dynamic Routing Sparse Attention. PATTERN RECOGNITION, 2025, 158.
  • [3] Wang, Ruohan; Falk, John Isak Texas; Pontil, Massimiliano; Ciliberto, Carlo. Robust Meta-Representation Learning via Global Label Inference and Classification. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46(04): 1996-2010.
  • [4] Huang, Shuyan; Liu, Zitao; Zhao, Xiangyu; Luo, Weiqi; Weng, Jian. Towards Robust Knowledge Tracing Models via k-Sparse Attention. PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023: 2441-2445.
  • [5] Zhang, Wenrui; Yang, Ling; Geng, Shijia; Hong, Shenda. Self-Supervised Time Series Representation Learning via Cross Reconstruction Transformer. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35(11): 16129-16138.
  • [6] Wen, Guangqi; Gao, Xin; Tan, Wenhui; Cao, Peng; Yang, Jinzhu; Li, Weiping; Zaiane, Osmar R. Exploring Attention and Self-Supervised Learning Mechanism for Graph Similarity Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024.
  • [7] Tan, Wenhui; Gao, Xin; Li, Yiyang; Wen, Guangqi; Cao, Peng; Yang, Jinzhu; Li, Weiping; Zaiane, Osmar R. Exploring attention mechanism for graph similarity learning. KNOWLEDGE-BASED SYSTEMS, 2023, 276.
  • [8] Liu, An; Zhang, Yifan; Zhang, Xiangliang; Liu, Guanfeng; Zhang, Yanan; Li, Zhixu; Zhao, Lei; Li, Qing; Zhou, Xiaofang. Representation Learning With Multi-Level Attention for Activity Trajectory Similarity Computation. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34(05): 2387-2400.
  • [9] Shi, Yi; Li, Rui-Xiang; Gan, Le; Zhan, De-Chuan; Ye, Han-Jia. Generalized Conditional Similarity Learning via Semantic Matching. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47(05): 3847-3862.
  • [10] Yang, Shuai; Zhang, Yuhong; Wang, Hao; Li, Peipei; Hu, Xuegang. Representation learning via serial robust autoencoder for domain adaptation. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 160.