Robust Representation Learning via Sparse Attention Mechanism for Similarity Models

Cited by: 0
Authors
Ermilova, Alina [1]
Baramiia, Nikita [1]
Kornilov, Valerii [1]
Petrakov, Sergey [1]
Zaytsev, Alexey [1,2]
Affiliations
[1] Skolkovo Inst Sci & Technol, Moscow 121205, Russia
[2] Sber, Risk Management, Moscow 121165, Russia
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Transformers; Oil insulation; Task analysis; Time series analysis; Meteorology; Training; Deep learning; Representation learning; efficient transformer; robust transformer; representation learning; similarity learning; TRANSFORMER;
DOI
10.1109/ACCESS.2024.3418779
CLC classification
TP [Automation technology; computer technology]
Subject classification code
0812
Abstract
Attention-based models are widely used for time series data. However, because the complexity of attention is quadratic in the input sequence length, the application of Transformers is limited by high resource demands. Moreover, their modifications for industrial time series need to be robust to missing or noisy values, which further narrows their application horizon. To cope with these issues, we introduce a class of efficient Transformers named Regularized Transformers (Reguformers). We implement a regularization technique inspired by dropout to improve robustness and reduce computational expense without significantly modifying the pipeline. Our experiments focus on oil & gas data. For the well-interval similarity task, our best Reguformer configuration reaches a ROC AUC of 0.97, which is comparable to Informer (0.978) and outperforms the baselines: the earlier LSTM model (0.934), the classical Transformer (0.967), and three recent, promising modifications of the original Transformer, namely Performer (0.949), LRformer (0.955), and DropDim (0.777). We also conduct the corresponding experiments on three additional datasets from different domains and obtain superior results. The quality gain of the best Reguformer over the Transformer ranges from 3.7% to 9.6% across datasets, while the gain over Informer spans a wider range: from 1.7% to 18.4%.
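
Illustrative sketch (not from the paper): the abstract only states that Reguformer applies a dropout-inspired regularization to attention to improve robustness and cut computational cost, without detailing the mechanism. The minimal PyTorch sketch below shows one way such an idea can work, randomly keeping a subset of key/value positions at training time so that attention is both regularized and cheaper to compute; the class name DropKeyAttention and the keep_ratio parameter are assumptions made for illustration only.

    # Hedged sketch: the exact Reguformer mechanism is not given in the abstract;
    # this only illustrates a dropout-style subsampling of key/value positions.
    import torch
    import torch.nn as nn

    class DropKeyAttention(nn.Module):
        """Single-head attention that, during training, keeps a random subset
        of key/value positions, regularizing the model and shrinking the
        quadratic attention cost. Name and keep_ratio are illustrative."""

        def __init__(self, d_model: int, keep_ratio: float = 0.5):
            super().__init__()
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)
            self.keep_ratio = keep_ratio
            self.scale = d_model ** -0.5

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model)
            q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
            if self.training and self.keep_ratio < 1.0:
                seq_len = x.size(1)
                n_keep = max(1, int(seq_len * self.keep_ratio))
                # Sample a random subset of key/value positions.
                idx = torch.randperm(seq_len, device=x.device)[:n_keep]
                k, v = k[:, idx, :], v[:, idx, :]
            attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
            return attn @ v

    if __name__ == "__main__":
        layer = DropKeyAttention(d_model=64, keep_ratio=0.5)
        out = layer(torch.randn(8, 128, 64))   # batch of 8 sequences of length 128
        print(out.shape)                        # torch.Size([8, 128, 64])

At inference time the module falls back to full attention (self.training is False), so only training pays the regularization trade-off in this sketch.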
Pages: 97833 - 97850
Page count: 18