Fast and scalable lock methods for video coding on many-core architecture

Cited by: 3
Authors
Xu, Weizhi [2, 6]
Yu, Hui [3]
Lu, Dianjie [4]
Song, Fenglong [2]
Wang, Da [2]
Ye, Xiaochun [2]
Pei, Songwei [5]
Fan, Dongrui [2]
Xie, Hongtao [1]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Natl Engn Lab Informat Secur Technol, Beijing, Peoples R China
[2] Tsinghua Univ, Inst Microelect, Beijing 100084, Peoples R China
[3] Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[4] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan, Peoples R China
[5] Beijing Univ Chem Technol, Dept Comp Sci & Technol, Beijing 100029, Peoples R China
[6] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
Keywords
Many-core; Hardware lock; Centralized lock; Distributed lock; Micro-benchmarks; Godson-T; Software lock; Single-core processor; SHARED-MEMORY MULTIPROCESSORS; HIGHLY PARALLEL FRAMEWORK; DEBLOCKING FILTER; HEVC; SYNCHRONIZATION; ALGORITHMS; PROCESSOR; PLATFORM;
DOI
10.1016/j.jvcir.2014.06.009
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Many-core processors are good candidates for speeding up video coding because the many-core architecture can exploit the parallelism of these applications more efficiently. Lock methods are important in a many-core architecture for ensuring correct program execution and communication between on-chip threads. The efficiency of the lock method is critical to the overall performance of an on-chip many-core processor. In this paper, we propose two types of hardware locks for on-chip many-core architectures: a centralized lock and a distributed lock. First, we design the architectures of the centralized lock and the distributed lock to implement the two hardware lock methods. Then, we evaluate the performance of the two hardware locks and a software lock using quantitative micro-benchmarks on Godson-T, a many-core processor simulator. The experimental results show that the locks with dedicated hardware support outperform the software lock, and that the distributed hardware lock is more scalable than the centralized hardware lock. (C) 2014 Elsevier Inc. All rights reserved.
Pages: 1758 - 1762
Number of pages: 5
Related papers
50 in total
  • [21] Stitch: Fusible Heterogeneous Accelerators Enmeshed with Many-Core Architecture for Wearables
    Tan, Cheng
    Karunaratne, Manupa
    Mitra, Tulika
    Peh, Li-Shiuan
    2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018, : 575 - 587
  • [22] Distributed SDN Architecture for NoC-based Many-core SoCs
    Ruaro, Marcelo
    Velloso, Nedison
    Jantsch, Axel
    Moraes, Fernando G.
    PROCEEDINGS OF THE 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON NETWORKS-ON-CHIP (NOCS'19), 2019,
  • [23] NoC-based Many-Core Processor Using CUSPARC Architecture
    Soliman, Muhammad R.
    Fahmy, Hossam A. H.
    Habib, S. E. -D.
    2014 26TH INTERNATIONAL CONFERENCE ON MICROELECTRONICS (ICM), 2014, : 84 - 87
  • [24] BLOCK-BASED HARDWARE SCHEDULER DESIGN ON MANY-CORE ARCHITECTURE
    Ju, Lihan
    Pan, Ping
    Quan, Baixing
    Chen, Tianzhou
    Wu, Minghui
    2012 IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE), 2012, : 814 - 819
  • [25] A Highly Parallel Framework for HEVC Coding Unit Partitioning Tree Decision on Many-core Processors
    Yan, Chenggang
    Zhang, Yongdong
    Xu, Jizheng
    Dai, Feng
    Li, Liang
    Dai, Qionghai
    Wu, Feng
    IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (05) : 573 - 576
  • [26] Scalable Collision Detection Using p-Partition Fronts on Many-Core Processors
    Zhang, Xinyu
    Kim, Young J.
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2014, 20 (03) : 447 - 456
  • [27] Architecture and Evaluation of Low Power Many-Core SoC with Two 32-Core Clusters
    Miyamori, Takashi
    Xu, Hui
    Usui, Hiroyuki
    Hosoda, Soichiro
    Sano, Toru
    Yamamoto, Kazumasa
    Kodaka, Takeshi
    Nonogaki, Nobuhiro
    Ozaki, Nau
    Tanabe, Jun
    IEICE TRANSACTIONS ON ELECTRONICS, 2014, E97C (04): : 360 - 368
  • [28] A Highly-Efficient and Tightly-Connected Many-Core Overlay Architecture
    Ben Abdelhamid, Riadh
    Yamaguchi, Yoshiki
    Boku, Taisuke
    IEEE ACCESS, 2021, 9 : 65277 - 65292
  • [29] Godson-T: An Efficient Many-Core Architecture for Parallel Program Executions
    Dong-Rui Fan
    Nan Yuan
    Jun-Chao Zhang
    Yong-Bin Zhou
    Wei Lin
    Feng-Long Song
    Xiao-Chun Ye
    He Huang
    Lei Yu
    Guo-Ping Long
    Hao Zhang
    Lei Liu
    Journal of Computer Science and Technology, 2009, 24 : 1061 - 1073
  • [30] Towards optimal scheduling policy for heterogeneous memory architecture in many-core system
    Park, Geunchul
    Rho, Seungwoo
    Kim, Jik-Soo
    Nam, Dukyun
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (01): : 121 - 133