Human Related Information Extraction from Chinese Archive Images

Cited by: 0
Authors
Jin, Xin [1]
Yin, Hangbing [1]
Chen, Xiaoyu [2]
Bi, Huimin [1]
Xiao, Chaoen [1]
Liu, Yijian [1]
Affiliations
[1] Beijing Elect Sci & Technol Inst, 7 Fufeng St, Beijing 100070, Peoples R China
[2] Grp Corp Ltd, Informat Ctr China North Ind, Beijing 100089, Peoples R China
Source
ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2023 | 2024 / Vol. 1998
Keywords
File image processing; Certificate photo extraction; PP-OCR; YOLOv5; Key information extraction;
DOI
10.1007/978-981-99-9109-9_14
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rise of the information age, digitization and paperless workflows have become the norm in managing archive images. However, research on extracting and managing the content of archive images is still in its early stages and focuses primarily on recognizing fixed-format archive images. As a result, there is a lack of technology for extracting key personal information that applies to all types of archive images. To address this, we identify two main tasks: extracting identity photos and extracting key personal information. To protect the confidentiality of real data, we created a dataset that simulates archive files containing certificate photos. We then trained a YOLOv5-based object detection network to detect certificate photos in archive images. We also combined PP-OCR text recognition with object detection to extract key information from document images.
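The abstract does not say how the detection and text-recognition outputs are combined, so the following is only a minimal sketch of one plausible approach: each recognized text line is assigned to a detected field region when the center of its box falls inside that region. The function names, field labels, box coordinates, and sample text are all hypothetical; in practice the regions would come from the object detector and the text lines from the OCR engine, neither of which is called here.

```python
from typing import Dict, List, Tuple

# Boxes are (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def center(box: Box) -> Tuple[float, float]:
    """Return the center point of a box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def contains(region: Box, point: Tuple[float, float]) -> bool:
    """True if the point lies inside the region (edges included)."""
    x0, y0, x1, y1 = region
    px, py = point
    return x0 <= px <= x1 and y0 <= py <= y1


def extract_fields(
    regions: List[Tuple[str, Box]],
    ocr_lines: List[Tuple[str, Box]],
) -> Dict[str, str]:
    """Assign OCR text lines to labeled regions by box-center containment.

    regions:   (label, box) pairs, assumed to come from an object detector.
    ocr_lines: (text, box) pairs, assumed to come from an OCR engine.
    """
    fields: Dict[str, str] = {}
    for label, region in regions:
        inside = [
            (box, text)
            for text, box in ocr_lines
            if contains(region, center(box))
        ]
        # Approximate reading order: top to bottom, then left to right.
        inside.sort(key=lambda item: (item[0][1], item[0][0]))
        fields[label] = " ".join(text for _, text in inside)
    return fields


# Made-up inputs for illustration only.
regions = [
    ("name", (100, 40, 300, 80)),
    ("birth_date", (100, 90, 300, 130)),
]
ocr_lines = [
    ("Zhang San", (110, 45, 200, 75)),
    ("1990-01-01", (110, 95, 220, 125)),
    ("Personnel Archive", (20, 5, 280, 30)),  # page title, outside every region
]
result = extract_fields(regions, ocr_lines)
print(result)
```

Center containment is a deliberately simple matching rule; an overlap-ratio threshold would be a reasonable alternative when text boxes straddle region borders.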
Pages: 139-146
Number of pages: 8