SAGHOG: Self-supervised Autoencoder for Generating HOG Features for Writer Retrieval

Cited by: 0
Authors
Peer, Marco [1]
Kleber, Florian [1]
Sablatnig, Robert [1]
Affiliations
[1] TU Wien, Comp Vis Lab, Vienna, Austria
Source
DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT II | 2024 / Vol. 14805
Keywords
Writer Retrieval; Self-Supervised Learning; Masked Autoencoder; Document Analysis;
DOI
10.1007/978-3-031-70536-6_8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper introduces Saghog, a self-supervised pretraining strategy for writer retrieval that uses HOG features of the binarized input image. Our preprocessing applies the Segment Anything technique to extract handwriting from various datasets, yielding about 24k documents; a vision transformer is then trained to reconstruct masked patches of the handwriting. Saghog is finetuned by appending NetRVLAD as an encoding layer to the pretrained encoder. Evaluation on three historical datasets, Historical-WI, HisFrag20, and GRK-Papyri, demonstrates the effectiveness of Saghog for writer retrieval. Additionally, we provide ablation studies on our architecture and evaluate unsupervised and supervised finetuning. Notably, on HisFrag20, Saghog outperforms related work with a mAP of 57.2%, a margin of 11.6% over the current state of the art, showcasing its robustness on challenging data. It is also competitive even on small datasets such as GRK-Papyri, where we achieve a Top-1 accuracy of 58.0%.
Pages: 121-138
Page count: 18
Related papers
50 in total
  • [1] Self-supervised Vision Transformers for Writer Retrieval
    Raven, Tim
    Matei, Arthur
    Fink, Gernot A.
    DOCUMENT ANALYSIS AND RECOGNITION-ICDAR 2024, PT II, 2024, 14805 : 380 - 396
  • [2] Autoencoder-based self-supervised hashing for cross-modal retrieval
    Li, Yifan
    Wang, Xuan
    Cui, Lei
    Zhang, Jiajia
    Huang, Chengkai
    Luo, Xuan
    Qi, Shuhan
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 17257 - 17274
  • [3] Autoencoder-based self-supervised hashing for cross-modal retrieval
    Yifan Li
    Xuan Wang
    Lei Cui
    Jiajia Zhang
    Chengkai Huang
    Xuan Luo
    Shuhan Qi
    Multimedia Tools and Applications, 2021, 80 : 17257 - 17274
  • [4] Feature Extraction using Self-Supervised Convolutional Autoencoder for Content based Image Retrieval
    Siradjuddin, Indah Agustien
    Wardana, Wrida Adi
    Sophan, Mochammad Kautsar
    2019 3RD INTERNATIONAL CONFERENCE ON INFORMATICS AND COMPUTATIONAL SCIENCES (ICICOS 2019), 2019,
  • [5] Context Autoencoder for Self-supervised Representation Learning
    Xiaokang Chen
    Mingyu Ding
    Xiaodi Wang
    Ying Xin
    Shentong Mo
    Yunhao Wang
    Shumin Han
    Ping Luo
    Gang Zeng
    Jingdong Wang
    International Journal of Computer Vision, 2024, 132 : 208 - 223
  • [6] Context Autoencoder for Self-supervised Representation Learning
    Chen, Xiaokang
    Ding, Mingyu
    Wang, Xiaodi
    Xin, Ying
    Mo, Shentong
    Wang, Yunhao
    Han, Shumin
    Luo, Ping
    Zeng, Gang
    Wang, Jingdong
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2023, 132 (1) : 208 - 223
  • [7] Self-supervised Variational Autoencoder for Recommender Systems
    Wang, Jing
    Liu, Gangdu
    Wu, Jun
    Jia, Caiyan
    Zhang, Zhifei
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 831 - 835
  • [8] A Self-supervised Graph Autoencoder with Barlow Twins
    Li, Jingci
    Lu, Guangquan
    Li, Jiecheng
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2022, 13630 : 501 - 512
  • [9] Self-supervised Vision Transformers with Data Augmentation Strategies Using Morphological Operations for Writer Retrieval
    Peer, Marco
    Kleber, Florian
    Sablatnig, Robert
    FRONTIERS IN HANDWRITING RECOGNITION, ICFHR 2022, 2022, 13639 : 122 - 136
  • [10] Mixed Autoencoder for Self-supervised Visual Representation Learning
    Chen, Kai
    Liu, Zhili
    Hong, Lanqing
    Xu, Hang
    Li, Zhenguo
    Yeung, Dit-Yan
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 22742 - 22751