Human Related Information Extraction from Chinese Archive Images

Cited by: 0

Authors:

Jin, Xin ^{[1]}

Yin, Hangbing ^{[1]}

Chen, Xiaoyu ^{[2]}

Bi, Huimin ^{[1]}

Xiao, Chaoen ^{[1]}

Liu, Yijian ^{[1]}

Affiliations:

[1] Beijing Elect Sci & Technol Inst, 7 Fufeng St, Beijing 100070, Peoples R China

[2] Grp Corp Ltd, Informat Ctr China North Ind, Beijing 100089, Peoples R China

Source:

ARTIFICIAL INTELLIGENCE AND ROBOTICS, ISAIR 2023 | 2024, Vol. 1998

Keywords:

File image processing; Certificate photo extraction; PP-OCR; YOLOv5; Key information extraction;

DOI:

10.1007/978-981-99-9109-9_14

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the rise of the information age, digitalization and paperless processes have become the norm in managing archive images. However, research on extracting and managing image content from archives is still in its early stages and focuses primarily on recognizing fixed-format archive images. As a result, there is a lack of technology for extracting key personal information that applies to all types of archive images. To address this, we identify two main tasks: extracting identity photos and extracting key personal information. To keep real data confidential, we created a dataset that simulates certificate photo files. We then trained a YOLOv5-based object detection network to detect identity photos in archive images. We also combined PP-OCR text recognition with object detection to extract key information from document images.
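One way to picture how text recognition and object detection might be combined for key information extraction is sketched below. It is only an illustrative assumption, not the authors' implementation: it supposes that a detector has already produced labeled field regions and that an OCR engine has already produced text lines with bounding boxes, and it assigns each text line to the field region that contains most of it. The names `Box`, `overlap_fraction`, `assign_text_to_fields`, the `min_overlap` threshold, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def overlap_fraction(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer`."""
    inter = Box(
        max(inner.x1, outer.x1),
        max(inner.y1, outer.y1),
        min(inner.x2, outer.x2),
        min(inner.y2, outer.y2),
    ).area()
    inner_area = inner.area()
    return inter / inner_area if inner_area > 0 else 0.0


def assign_text_to_fields(field_regions, ocr_lines, min_overlap=0.5):
    """Attach each OCR line to the best-overlapping detected field region.

    field_regions: list of (label, Box) pairs, assumed to come from a detector.
    ocr_lines:     list of (text, Box) pairs, assumed to come from an OCR engine.
    min_overlap:   assumed threshold; lines below it are left unassigned.
    """
    collected = {label: [] for label, _ in field_regions}
    for text, text_box in ocr_lines:
        best_label, best_frac = None, min_overlap
        for label, field_box in field_regions:
            frac = overlap_fraction(text_box, field_box)
            if frac >= best_frac:
                best_label, best_frac = label, frac
        if best_label is not None:
            collected[best_label].append((text_box.y1, text_box.x1, text))
    # Join the lines of each field in reading order (top to bottom, left to right).
    return {
        label: " ".join(text for _, _, text in sorted(items))
        for label, items in collected.items()
    }


# Synthetic example data, not taken from the paper.
fields = [("name", Box(100, 10, 300, 40)), ("id_number", Box(100, 50, 400, 80))]
lines = [
    ("Zhang San", Box(110, 12, 200, 38)),
    ("ID-0000", Box(105, 52, 390, 78)),
    ("stray stamp", Box(500, 500, 600, 540)),
]
extracted = assign_text_to_fields(fields, lines)
```

In this sketch, text that falls outside every detected region, such as the stray stamp, is simply dropped; a real system would likely need a more careful policy for partial overlaps and multi-line fields.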

Pages: 139-146

Page count: 8
