Novel many-core architecture design for real-time image processing

Cited by: 0
Authors
Liu, Zhentao [1]
Li, Tao [2]
Huang, Hucai [2]
Han, Jungang [2]
Shen, Xubang [1]
Affiliations
[1] School of Microelectronics, Xidian Univ., Xi'an
[2] School of Electronic Engineering, Xi'an Univ. of Posts and Telecommunications, Xi'an
Source
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University | 2015, Vol. 42, No. 2
Keywords
Data-flow; Many-core; Parallel computing; Polymorphic reconfigurable architecture
DOI
10.3969/j.issn.1001-2400.2015.02.016
Abstract
Based on the data-flow model and reconfigurable hardware technology, a polymorphic reconfigurable many-core processor architecture is presented for image processing. It is a scalable, hierarchically organized parallel architecture that supports a dynamic mixture of multiple parallel computing models and overcomes the inefficiency of traditional data-flow implementations by using distributed shared memory and a neighbor interconnect architecture with hardware handshaking. From the start of the architecture design, an integrated simulation platform (ISE) was developed in VC++ to verify the architecture and evaluate the performance of the instruction set. The proposed architecture is also implemented on an FPGA. Experimental results show that the architecture can be used in many image processing applications, achieving throughput close to that of an ASIC and performance better than that of a GPU. © 2015, Science Press. All rights reserved.
Pages: 95-101
Number of pages: 6