A comprehensive end-to-end computer vision framework for restoration and recognition of low-quality engineering drawings

Cited: 0
Authors
Yang, Lvyang [1 ]
Zhang, Jiankang [2 ]
Li, Huaiqiang [2 ]
Ren, Longfei [2 ]
Yang, Chen [1 ]
Wang, Jingyu [1 ]
Shi, Dongyuan [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, State Key Lab Adv Electromagnet Technol, Wuhan 430074, Hubei, Peoples R China
[2] Northwest Branch State Grid Corp China, Xian 710048, Shaanxi, Peoples R China
Keywords
Collaborative learning; Computer vision; Deep learning; Engineering drawing; Graphical symbol recognition; Image restoration; CLASSIFICATION; DIGITIZATION; NETWORK;
DOI
10.1016/j.engappai.2024.108524
Chinese Library Classification (CLC)
TP [Automation technology; computer technology];
Subject classification code
0812;
Abstract
The digitization of engineering drawings is crucial for efficient reuse, distribution, and archiving. Existing computer vision approaches for digitizing engineering drawings typically assume the input drawings have high quality. However, in reality, engineering drawings are often blurred and distorted due to improper scanning, storage, and transmission, which may jeopardize the effectiveness of existing approaches. This paper focuses on restoring and recognizing low-quality engineering drawings, where an end-to-end framework is proposed to improve the quality of the drawings and identify the graphical symbols on them. The framework uses K-means clustering to classify different engineering drawing patches into simple and complex texture patches based on their gray level co-occurrence matrix statistics. Computer vision operations and a modified Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) model are then used to improve the quality of the two types of patches, respectively. A modified Faster Region-based Convolutional Neural Network (Faster R-CNN) model is used to recognize the quality-enhanced graphical symbols. Additionally, a multi-stage task-driven collaborative learning strategy is proposed to train the modified ESRGAN and Faster R-CNN models to improve the resolution of engineering drawings in the direction that facilitates graphical symbol recognition, rather than human visual perception. A synthetic data generation method is also proposed to construct quality-degraded samples for training the framework. Experiments on real-world electrical diagrams show that the proposed framework achieves an accuracy of 98.98% and a recall of 99.33%, demonstrating its superiority over previous approaches. Moreover, the framework is integrated into a widely-used power system software application to showcase its practicality. The reference codes and data can be found at https://github.com/Lattle-y/AIrecognition-for-lq-ed.git. Future work will focus on improving the generalizability of the proposed framework to different quality degradation scenarios and extrapolating the application to different engineering domains.
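The patch-classification step described in the abstract can be illustrated with a minimal sketch (not the authors' published code): each drawing patch is summarized by gray level co-occurrence matrix (GLCM) statistics and then grouped by K-means into simple- and complex-texture clusters. The patch size, the GLCM parameters, and the particular statistics below are illustrative assumptions, using scikit-image and scikit-learn rather than the repository's implementation.

```python
# Sketch of GLCM-based patch texture classification with K-means (assumed parameters).
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.cluster import KMeans

def glcm_features(patch: np.ndarray) -> np.ndarray:
    """Return a small vector of GLCM statistics for an 8-bit grayscale patch."""
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    stats = ["contrast", "homogeneity", "energy", "correlation"]
    return np.array([graycoprops(glcm, s).mean() for s in stats])

def split_patches(image: np.ndarray, size: int = 128):
    """Cut a grayscale drawing into non-overlapping square patches (illustrative size)."""
    h, w = image.shape
    return [image[i:i + size, j:j + size]
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def classify_patches(image: np.ndarray):
    """Cluster patches into two texture groups (simple vs. complex) via K-means."""
    patches = split_patches(image)
    features = np.stack([glcm_features(p) for p in patches])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
    return patches, labels
```

In the framework, the two resulting groups would then be routed to different restoration paths: conventional computer vision operations for simple-texture patches and the modified ESRGAN model for complex-texture patches.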
Pages: 13