A comprehensive end-to-end computer vision framework for restoration and recognition of low-quality engineering drawings

Authors
Yang, Lvyang [1]
Zhang, Jiankang [2]
Li, Huaiqiang [2]
Ren, Longfei [2]
Yang, Chen [1]
Wang, Jingyu [1]
Shi, Dongyuan [1]
Affiliations
[1] Huazhong Univ Sci & Technol, State Key Lab Adv Electromagnet Technol, Wuhan 430074, Hubei, Peoples R China
[2] Northwest Branch State Grid Corp China, Xian 710048, Shaanxi, Peoples R China
Keywords
Collaborative learning; Computer vision; Deep learning; Engineering drawing; Graphical symbol recognition; Image restoration; CLASSIFICATION; DIGITIZATION; NETWORK
DOI
10.1016/j.engappai.2024.108524
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The digitization of engineering drawings is crucial for efficient reuse, distribution, and archiving. Existing computer vision approaches for digitizing engineering drawings typically assume that the input drawings are of high quality. In practice, however, engineering drawings are often blurred and distorted by improper scanning, storage, and transmission, which may jeopardize the effectiveness of existing approaches. This paper focuses on restoring and recognizing low-quality engineering drawings and proposes an end-to-end framework that improves the quality of the drawings and identifies the graphical symbols on them. The framework uses K-means clustering to classify engineering drawing patches into simple-texture and complex-texture patches based on their gray-level co-occurrence matrix statistics. Computer vision operations are then used to improve the quality of the simple-texture patches, and a modified Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) model is used for the complex-texture patches. A modified Faster Region-based Convolutional Neural Network (Faster R-CNN) model recognizes the quality-enhanced graphical symbols. Additionally, a multi-stage task-driven collaborative learning strategy is proposed to train the modified ESRGAN and Faster R-CNN models so that the resolution of engineering drawings is improved in the direction that facilitates graphical symbol recognition rather than human visual perception. A synthetic data generation method is also proposed to construct quality-degraded samples for training the framework. Experiments on real-world electrical diagrams show that the proposed framework achieves an accuracy of 98.98% and a recall of 99.33%, demonstrating its superiority over previous approaches. Moreover, the framework is integrated into a widely used power system software application to showcase its practicality.
The reference code and data can be found at https://github.com/Lattle-y/AIrecognition-for-lq-ed.git. Future work will focus on improving the generalizability of the proposed framework to different quality degradation scenarios and extending the application to different engineering domains.
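The patch-classification step described in the abstract (K-means over gray-level co-occurrence matrix statistics) can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: the abstract does not say which GLCM statistics, quantization, neighbour offset, or K-means settings are used, so the choice of contrast and energy, the 4 gray levels, the horizontal offset, and the function names `glcm_stats` and `kmeans_two` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Patch = List[List[int]]  # grayscale patch, already quantized to 0..levels-1


def glcm_stats(patch: Patch, levels: int = 4) -> Tuple[float, float]:
    """Return (contrast, energy) of the horizontal-neighbour GLCM.

    Assumption: only the offset (0, 1) is used and only two statistics
    are computed; the paper may use other offsets and statistics.
    """
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in patch:
        for a, b in zip(row, row[1:]):  # each pixel and its right neighbour
            counts[a][b] += 1
            total += 1
    if total == 0:
        raise ValueError("patch must be at least 2 pixels wide")
    contrast = 0.0
    energy = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total  # normalized co-occurrence probability
            contrast += (i - j) ** 2 * p
            energy += p * p
    return contrast, energy


def _dist2(p: Sequence[float], c: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def kmeans_two(points: List[Tuple[float, float]], iters: int = 20) -> List[int]:
    """Lloyd's algorithm with k = 2; returns one cluster label per point.

    Initialization is deterministic (an assumption for reproducibility):
    cluster 0 starts at the lowest-contrast point, cluster 1 at the highest.
    Features are not rescaled here, which a real pipeline would likely do.
    """
    centers = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min((0, 1), key=lambda c: _dist2(p, centers[c])) for p in points]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


if __name__ == "__main__":
    flat = [[0] * 8 for _ in range(8)]                                   # uniform patch
    checker = [[3 * ((x + y) % 2) for x in range(8)] for y in range(8)]  # busy patch
    features = [glcm_stats(p) for p in (flat, flat, checker, checker)]
    print(kmeans_two(features))  # cluster 0: simple texture, cluster 1: complex texture
```

In a pipeline of the kind the abstract outlines, patches in the low-contrast cluster would be routed to conventional image operations and the remaining patches to the super-resolution model; that routing is not shown here.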
Pages: 13