Seeing the world from its words: All-embracing Transformers for fingerprint-based indoor localization

Cited by: 2
Authors
Nguyen, Son Minh [1 ]
Le, Duc Viet [1 ]
Havinga, Paul J. M. [1 ,2 ]
Affiliations
[1] Univ Twente, Enschede, Netherlands
[2] TNO, The Hague, Netherlands
Funding
European Union Horizon 2020;
Keywords
RSS fingerprints; Indoor localization; Deep learning; Transformers;
DOI
10.1016/j.pmcj.2024.101912
CLC number
TP [Automation Technology, Computer Technology];
Discipline code
0812;
Abstract
In this paper, we present all-embracing Transformers (AaTs) that deftly manipulate the attention mechanism for Received Signal Strength (RSS) fingerprints in order to invigorate localization performance. Since most machine learning models applied to the RSS modality possess no attention mechanism, they can merely capture superficial representations. Moreover, compared to textual and visual modalities, the RSS modality is notoriously sensitive to environmental dynamics. These adversities deny such models access to the subtle but distinct representations that characterize a location, ultimately causing significant degradation in the testing phase. In contrast, a major appeal of AaTs is their ability to focus exclusively on relevant anchors in RSS sequences, giving full rein to the exploitation of subtle and distinct representations for specific locations. This also allows redundant clues formed by noisy ambient conditions to be disregarded, thus enhancing localization accuracy. Beyond that, explicitly resolving representation collapse (i.e., non-informative or homogeneous features, and gradient vanishing) can further invigorate the self-attention process in the transformer blocks, so that subtle but distinct representations of specific locations are captured with ease. For that purpose, we first enhance our proposed model with two sub-constraints, namely covariance and variance losses, at the Anchor2Vec stage. The proposed constraints are automatically mediated with the primary task in a novel multi-task learning manner. Going further, we refine the design with a few simple tweaks carefully crafted for the transformer encoder blocks, aiming to promote representation augmentation by stabilizing the inflow of gradients to these blocks and thereby tackling the representation collapse of regular Transformers. To evaluate our AaTs, we compare our models with state-of-the-art (SoTA) methods on three benchmark indoor localization datasets. The experimental results confirm our hypothesis and show that our proposed models deliver markedly higher and more stable accuracy.
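To make the core idea concrete, below is a minimal PyTorch sketch of self-attention applied over an RSS fingerprint, with each anchor (access point) reading treated as one token. This is not the paper's actual architecture: the class name, dimensions, and the per-anchor embedding standing in for Anchor2Vec are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RssSelfAttention(nn.Module):
    """Illustrative sketch: self-attention over an RSS fingerprint,
    one token per anchor (access point). Not the paper's architecture."""

    def __init__(self, n_anchors: int = 100, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Per-anchor scalar-to-vector embedding (a stand-in for Anchor2Vec).
        self.embed = nn.Linear(1, d_model)
        # Learned encoding of anchor identity, analogous to a positional encoding.
        self.anchor_id = nn.Parameter(torch.zeros(n_anchors, d_model))
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, rss: torch.Tensor) -> torch.Tensor:
        # rss: (batch, n_anchors) signal strengths, e.g. in dBm.
        x = self.embed(rss.unsqueeze(-1)) + self.anchor_id  # (batch, n_anchors, d_model)
        # The attention weights indicate which anchors each token relies on.
        out, _weights = self.attn(x, x, x)
        return out
```

The attention weights are the mechanism by which relevant anchors can dominate a location's representation while noisy or redundant ones are down-weighted.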
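The abstract does not spell out the covariance and variance sub-constraints; a common VICReg-style formulation that matches their stated purpose (keeping embedding features informative and non-homogeneous) would look like the hedged sketch below. The function name, `gamma`, and `eps` are assumed, not the paper's definitions.

```python
import torch

def variance_covariance_losses(z: torch.Tensor, gamma: float = 1.0, eps: float = 1e-4):
    """Hedged sketch of variance/covariance sub-constraints on embeddings
    z of shape (batch, dim); gamma and eps are assumed hyperparameters."""
    z = z - z.mean(dim=0)                      # center each feature over the batch
    std = torch.sqrt(z.var(dim=0) + eps)       # per-dimension standard deviation
    var_loss = torch.relu(gamma - std).mean()  # hinge keeps each dimension's std above gamma
    n, d = z.shape
    cov = (z.T @ z) / (n - 1)                  # (dim, dim) sample covariance
    off_diag = cov - torch.diag(torch.diag(cov))
    cov_loss = off_diag.pow(2).sum() / d       # penalize correlated (redundant) dimensions
    return var_loss, cov_loss
```

Under this reading, the variance term prevents homogeneous (collapsed) features and the covariance term decorrelates dimensions, and both would be balanced against the primary localization loss in the multi-task setup the abstract describes.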
Pages: 16