A Character Position-Aware Compression Framework for Screen Text Image

Authors
Zhu, Chen [1]
Lu, Guo [1]
Chen, Huanbang [2]
Feng, Donghui [1]
Wang, Shen [1]
Zhao, Yan [1]
Xie, Rong [1]
Song, Li [1, 3]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[2] Huawei Technol Co Ltd, Shenzhen 518129, Peoples R China
[3] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
Keywords
Screen content coding; text detection; motion vector prediction and coding; in-loop filter; SCENE TEXT; SEGMENTATION; PREDICTION;
DOI
10.1109/TCSVT.2024.3379675
Chinese Library Classification number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract
Text patterns typically exhibit distinct boundaries and sparse color histograms. However, in current hybrid codec frameworks, the positions of coding units are often misaligned with the text patterns, so prediction and color mapping tools consume a large number of bits to represent these patterns. Recently, text detection and recognition methods have been proposed that accurately locate and analyze the text regions in screen images. Building on these techniques, we propose a character position-aware compression framework for screen text images. On the encoder side, a low-complexity detection method locates the text characters, and the detected characters are then copied to positions aligned with the coding unit (CU) grid to form a text layer. This text-layer representation can further increase the efficiency of existing screen content coding tools such as Intra Block Copy (IBC). Moreover, we design several compression tools based on this representation. We extend the two Motion Vector (MV) prediction modes, Adaptive Motion Vector Prediction (AMVP) and Merge. We modify the MV encoding syntax according to the layout characteristics of the text layer. We present a Gradient-guided In-loop Filter (GIF) that sharpens the text lines using a convolutional network. Experiments conducted on the VVC reference software VTM under the all_intra configuration show that the proposed framework achieves average bitrate savings of 4.6% with GIF and 3.6% without GIF, with corresponding increases in CPU encoding complexity of 72% and 10%.
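The grid-alignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `align_up` and `build_text_layer`, the single-row layout, the `(x, y, width, height)` box format, and the default `cu_size` of 8 are all assumptions made for illustration. It only shows the general idea of copying detected character boxes into slots whose left edges fall on CU-grid boundaries.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical format: (x, y, width, height) in pixels


def align_up(value: int, grid: int) -> int:
    """Round value up to the nearest multiple of grid."""
    return ((value + grid - 1) // grid) * grid


def build_text_layer(image: List[List[int]], boxes: List[Box],
                     cu_size: int = 8, background: int = 0):
    """Copy each character box into a CU-aligned slot of a new single-row layer.

    Returns the layer and the (x, y) destination of each box, which a decoder
    would need in order to paste the characters back (assumed, not from the paper).
    """
    if not boxes:
        return [], []
    # Layer height: tallest character, rounded up to the CU grid.
    layer_h = align_up(max(h for _, _, _, h in boxes), cu_size)
    # Layer width: each character occupies a slot rounded up to the CU grid.
    layer_w = sum(align_up(w, cu_size) for _, _, w, _ in boxes)
    layer = [[background] * layer_w for _ in range(layer_h)]

    placements = []
    cursor = 0  # left edge of the next slot, always a multiple of cu_size
    for x, y, w, h in boxes:
        for dy in range(h):
            for dx in range(w):
                layer[dy][cursor + dx] = image[y + dy][x + dx]
        placements.append((cursor, 0))
        cursor += align_up(w, cu_size)
    return layer, placements
```

Because every slot starts on a multiple of `cu_size`, identical characters end up at identical offsets within their coding units, which is the kind of regularity that block-matching tools such as IBC can exploit.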
Pages: 8821-8835
Page count: 15