A Character Position-Aware Compression Framework for Screen Text Image

Authors
Zhu, Chen [1]
Lu, Guo [1]
Chen, Huanbang [2]
Feng, Donghui [1]
Wang, Shen [1]
Zhao, Yan [1]
Xie, Rong [1]
Song, Li [1, 3]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[2] Huawei Technol Co Ltd, Shenzhen 518129, Peoples R China
[3] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, Shanghai 200240, Peoples R China
Keywords
Screen content coding; text detection; motion vector prediction and coding; in-loop filter; SCENE TEXT; SEGMENTATION; PREDICTION;
DOI
10.1109/TCSVT.2024.3379675
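Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract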
Text patterns typically exhibit distinct boundaries and sparse color histograms. In current hybrid codec frameworks, however, coding unit positions are often misaligned with these patterns, so prediction and color mapping tools spend a large number of bits signaling them. Recent text detection and recognition methods can accurately locate and analyze the text regions in screen images. Building on these techniques, we propose a character position-aware compression framework for screen text images. On the encoder side, a low-complexity detection method locates the text characters, and the detected characters are then copied to positions aligned with the coding unit (CU) grid to form a text layer. This text-layer representation further increases the efficiency of existing screen content coding tools such as Intra Block Copy (IBC). We also design several compression tools based on this representation: we extend the two Motion Vector (MV) prediction modes, Adaptive Motion Vector Prediction (AMVP) and Merge; we modify the MV encoding syntax according to the layout characteristics of the text layer; and we present a Gradient-guided In-loop Filter (GIF) that sharpens text lines using a convolutional network. Experiments on the VVC reference software VTM under the all_intra configuration show that the proposed framework achieves average bitrate savings of 4.6% with GIF and 3.6% without GIF, with corresponding increases in CPU encoding complexity of 72% and 10%.
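The text-layer step described in the abstract (copying detected characters to CU-grid-aligned positions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `align_up` and `build_text_layer`, the row-major packing order, the `cu_size` and `layer_width` parameters, and the single-channel image are all assumptions made for illustration, and the character boxes are assumed to come from some external detector.

```python
import numpy as np


def align_up(value, grid):
    """Round value up to the nearest multiple of grid."""
    return -(-value // grid) * grid


def build_text_layer(image, boxes, cu_size=8, layer_width=64, background=255):
    """Pack character crops into a layer so each crop starts on the CU grid.

    image: 2-D array (single channel, an assumption for simplicity).
    boxes: list of (x, y, w, h) character boxes from a hypothetical detector.
    Returns the text layer and a list of (box, (dst_x, dst_y)) placements.
    """
    if layer_width % cu_size != 0:
        raise ValueError("layer_width must be a multiple of cu_size")

    placements = []
    x = y = row_height = 0
    for (bx, by, bw, bh) in boxes:
        # Each character occupies a slot whose size is a multiple of cu_size.
        slot_w = align_up(bw, cu_size)
        slot_h = align_up(bh, cu_size)
        if slot_w > layer_width:
            raise ValueError("character box wider than the text layer")
        if x + slot_w > layer_width:
            # Current row is full: move to the start of the next row.
            x = 0
            y += row_height
            row_height = 0
        placements.append(((bx, by, bw, bh), (x, y)))
        x += slot_w
        row_height = max(row_height, slot_h)

    layer_height = max(y + row_height, cu_size)
    layer = np.full((layer_height, layer_width), background, dtype=image.dtype)
    for (bx, by, bw, bh), (dx, dy) in placements:
        # Copy the crop so its top-left corner sits on a grid point.
        layer[dy:dy + bh, dx:dx + bw] = image[by:by + bh, bx:bx + bw]
    return layer, placements
```

Calling `build_text_layer(image, boxes)` returns the packed layer together with the source-to-destination mapping, which a decoder-side counterpart would need in order to paste the characters back. How that mapping is signaled in the bitstream is not stated in the abstract and is left out of this sketch.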
Pages: 8821-8835
Page count: 15