Robust Representation Learning via Sparse Attention Mechanism for Similarity Models

Cited by: 0
Authors
Ermilova, Alina [1 ]
Baramiia, Nikita [1 ]
Kornilov, Valerii [1 ]
Petrakov, Sergey [1 ]
Zaytsev, Alexey [1 ,2 ]
Affiliations
[1] Skolkovo Inst Sci & Technol, Moscow 121205, Russia
[2] Sber, Risk Management, Moscow 121165, Russia
Source
IEEE ACCESS | 2024, Vol. 12
Keywords
Transformers; Oil insulation; Task analysis; Time series analysis; Meteorology; Training; Deep learning; Representation learning; efficient transformer; robust transformer; representation learning; similarity learning; TRANSFORMER;
DOI
10.1109/ACCESS.2024.3418779
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Discipline Classification Code
0812
Abstract
Attention-based models are widely used for time series data. However, due to the quadratic complexity of attention with respect to the input sequence length, the application of Transformers is limited by high resource demands. Moreover, their modifications for industrial time series need to be robust to missing or noisy values, which complicates the expansion of their application horizon. To cope with these issues, we introduce a class of efficient Transformers named Regularized Transformers (Reguformers). We implement a regularization technique inspired by dropout ideas to improve robustness and reduce computational expenses without significantly modifying the pipeline. Our experiments focus on oil and gas data. For the well-interval similarity task, our best Reguformer configuration reaches a ROC AUC of 0.97, which is comparable to Informer (0.978) and outperforms the baselines: the previous LSTM model (0.934), the classical Transformer model (0.967), and three of the most promising recent modifications of the original Transformer, namely Performer (0.949), LRformer (0.955), and DropDim (0.777). We also conduct the corresponding experiments on three additional datasets from different domains and obtain superior results. The improvement of the best Reguformer over the Transformer varies from 3.7% to 9.6% across datasets, while the improvement over Informer spans a wider range: from 1.7% to 18.4%.
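The abstract only summarizes the idea of dropout-inspired attention regularization; the minimal PyTorch sketch below illustrates one plausible reading of it, in which whole key positions are randomly masked during training so that the representation cannot rely on any single timestamp. This is an assumption for illustration, not the authors' implementation: the class name RegularizedSelfAttention, the drop_rate parameter, and the masking scheme are all hypothetical.

# Illustrative sketch (assumption, not the paper's code): dropout-style
# regularization applied inside self-attention for time series.
import torch
import torch.nn as nn


class RegularizedSelfAttention(nn.Module):
    """Single-head self-attention that randomly drops whole key positions
    during training. Hypothetical; only illustrates the dropout-inspired
    idea described in the abstract."""

    def __init__(self, d_model: int, drop_rate: float = 0.2):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.drop_rate = drop_rate
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model), e.g. a batch of well-log intervals.
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        if self.training and self.drop_rate > 0:
            # Mask out random key positions (columns of the score matrix),
            # so no single timestamp can dominate the representation.
            keep = torch.rand(x.size(0), 1, x.size(1), device=x.device) > self.drop_rate
            scores = scores.masked_fill(~keep, -1e9)  # large negative instead of -inf to avoid NaNs
        attn = torch.softmax(scores, dim=-1)
        return torch.matmul(attn, v)


if __name__ == "__main__":
    layer = RegularizedSelfAttention(d_model=64, drop_rate=0.2)
    intervals = torch.randn(8, 128, 64)  # 8 intervals, 128 time steps, 64 features
    print(layer(intervals).shape)        # torch.Size([8, 128, 64])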
Pages: 97833 - 97850
Page count: 18
Related Papers
50 records in total
  • [21] Improved Sparse Representation based Robust Hybrid Feature Extraction Models with Transfer and Deep Learning for EEG Classification
    Prabhakar, Sunil Kumar
    Lee, Seong-Whan
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 198
  • [22] Face Recognition via Deep Learning and Constraint Sparse Representation
    Zhang J.-W.
    Niu S.-Z.
    Cao Z.-Y.
    Wang X.-Y.
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2019, 39(3): 255-261
  • [23] DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition
    Li, Ming
    Fu, Huazhu
    He, Shengfeng
    Fan, Hehe
    Liu, Jun
    Keppo, Jussi
    Shou, Mike Zheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26: 6297-6309
  • [24] FL-GNNs: Robust Network Representation via Feature Learning Guided Graph Neural Networks
    Wang, Beibei
    Jiang, Bo
    Ding, Chris
    IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2024, 11(1): 750-760
  • [25] MuSAM: Mutual-Scenario-Aware Multimodal-Enhanced Representation Learning for Semantic Similarity
    Lai, Pei-Yuan
    Dai, Qing-Yun
    Liao, De-Zhang
    Wang, Zeng-Hui
    Wang, Chang-Dong
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20(9): 11161-11170
  • [26] Meta-Learning Based Tasks Similarity Representation for Cross Domain Lifelong Learning
    Shen, Mingge
    Chen, Dehu
    Ren, Teng
    IEEE ACCESS, 2023, 11: 36692-36701
  • [27] Depth as Attention for Face Representation Learning
    Uppal, Hardik
    Sepas-Moghaddam, Alireza
    Greenspan, Michael
    Etemad, Ali
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2021, 16: 2461-2476
  • [28] MSGA-Net: Progressive Feature Matching via Multi-Layer Sparse Graph Attention
    Gong, Zhepeng
    Xiao, Guobao
    Shi, Ziwei
    Chen, Riqing
    Yu, Jun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34(7): 5765-5775
  • [29] TransPath: Representation Learning for Heterogeneous Information Networks via Translation Mechanism
    Fang, Yang
    Zhao, Xiang
    Tan, Zhen
    Xiao, Weidong
    IEEE ACCESS, 2018, 6: 20712-20721
  • [30] Linking Sparse Coding Dictionaries for Representation Learning
    Barari, Nicki
    Kim, Edward
    2021 INTERNATIONAL CONFERENCE ON REBOOTING COMPUTING (ICRC 2021), 2021: 84-87