Writer Retrieval using Compact Convolutional Transformers and NetMVLAD

Cited by: 2
Authors
Peer, Marco [1]
Kleber, Florian [1]
Sablatnig, Robert [1]
Affiliations
[1] TU Wien, Inst Visual Comp & Human Ctr Technol, Comp Vis Lab, Vienna, Austria
Source
2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2022
Keywords
IDENTIFICATION; FEATURES
DOI
10.1109/ICPR56361.2022.9956155
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a method for writer retrieval in which embeddings of patches extracted at SIFT keypoint locations are learned by a Compact Convolutional Transformer (CCT), a modified attention-based transformer architecture that includes convolutions, followed by a NetMVLAD layer and Generalized Max Pooling (GMP) to obtain global page descriptors. We introduce the application of CCTs to writer retrieval and show that they outperform the Convolutional Neural Network (CNN) used in current state-of-the-art methods for writer retrieval, namely ResNet18, while having only one-third as many parameters. Additionally, we propose NetMVLAD, an extension of NetVLAD with multiple vocabularies, which encodes information with different vocabulary sizes and improves on the original NetVLAD. The performance of CCTs is compared with that of ResNet18 on the ICDAR2013 Competition on Writer Identification dataset (ICDAR2013) and the CVL dataset, and the effect of multiple vocabularies within the NetVLAD layer is shown. CCT7 pretrained on CIFAR100 and combined with NetMVLAD achieves 89.3% Mean Average Precision (mAP) on the ICDAR2013 dataset and 96.5% on the CVL dataset.
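The idea of encoding local descriptors against several vocabularies of different sizes can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not say how NetMVLAD combines its vocabularies, so concatenating the per-vocabulary encodings is an assumption here. The cluster centers are fixed random arrays rather than learned parameters, plain residual summation stands in for GMP, and the names `soft_vlad`, `multi_vocab_vlad` and the sharpness parameter `alpha` are hypothetical.

```python
import numpy as np


def soft_vlad(x, centers, alpha=10.0):
    """VLAD encoding with soft assignment (NetVLAD-style, but not learned).

    x:       (N, D) local descriptors, e.g. patch embeddings of one page
    centers: (K, D) vocabulary (cluster centers); fixed here, learned in NetVLAD
    Returns an L2-normalized vector of length K * D.
    """
    # Squared Euclidean distance of every descriptor to every center: (N, K)
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

    # Softmax over centers gives the soft-assignment weights: (N, K)
    logits = -alpha * d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)

    # Weighted sum of residuals per center: sum_n a[n, k] * (x[n] - c[k])
    v = a.T @ x - a.sum(axis=0)[:, None] * centers  # (K, D)

    # Intra-normalization per center, then global L2 normalization
    v /= np.linalg.norm(v, axis=1, keepdims=True) + 1e-12
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)


def multi_vocab_vlad(x, vocabularies, alpha=10.0):
    """Assumed combination rule: encode per vocabulary, concatenate, renormalize."""
    parts = [soft_vlad(x, c, alpha) for c in vocabularies]
    v = np.concatenate(parts)
    return v / (np.linalg.norm(v) + 1e-12)


# Toy data: 200 descriptors of dimension 16, two vocabularies of sizes 4 and 8
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 16))
vocabularies = [rng.normal(size=(k, 16)) for k in (4, 8)]

desc = multi_vocab_vlad(x, vocabularies)
print(desc.shape)  # (192,), i.e. (4 + 8) * 16
```

With vocabularies of sizes 4 and 8 and 16-dimensional descriptors, the page descriptor has (4 + 8) × 16 = 192 entries. Retrieval would then rank pages by a distance between such descriptors; the abstract does not specify which distance the paper uses.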
Pages: 1571-1578
Page count: 8