Simulating cortical networks on heterogeneous multi-GPU systems

Cited by: 7
Authors
Nere, Andrew [1]
Franey, Sean [1]
Hashmi, Atif [1]
Lipasti, Mikko [1]
Affiliations
[1] Univ Wisconsin, Dept Elect & Comp Engn, Madison, WI 53706 USA
Funding
US National Science Foundation
Keywords
Cortical learning algorithms; CUDA; GPGPU; Profiling systems; Receptive fields; Functional architecture; Model
DOI
10.1016/j.jpdc.2012.02.006
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Recent advances in neuroscientific understanding have highlighted the highly parallel computation power of the mammalian neocortex. In this paper we describe a GPGPU-accelerated implementation of an intelligent learning model inspired by the structural and functional properties of the neocortex. Furthermore, we consider two inefficiencies inherent to our initial implementation and propose software optimizations to mitigate such problems. Analysis of our application's behavior and performance provides important insights into the GPGPU architecture, including the number of cores, the memory system, atomic operations, and the global thread scheduler. Additionally, we create a runtime profiling tool for the cortical network that proportionally distributes work across the host CPU as well as multiple GPGPUs available to the system. Using the profiling tool with these optimizations on Nvidia's CUDA framework, we achieve up to 60x speedup over a single-threaded CPU implementation of the model. © 2012 Elsevier Inc. All rights reserved.
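The abstract does not say how the profiling tool divides work, so the following is only a minimal sketch of one way proportional distribution could look, not the authors' method. It assumes each device's throughput has already been measured by some profiling pass. The function name `proportional_split`, the device labels, and the throughput figures are all hypothetical.

```python
def proportional_split(total_items, throughputs):
    """Split total_items across devices in proportion to measured throughput.

    throughputs maps a device label to a positive rate (for example, work
    items per second from a profiling run). All names and numbers here are
    illustrative assumptions, not values from the paper.
    """
    total_rate = sum(throughputs.values())
    if total_rate <= 0:
        raise ValueError("at least one device must have positive throughput")

    # Ideal (fractional) share for each device.
    raw = {dev: total_items * rate / total_rate
           for dev, rate in throughputs.items()}

    # Round down, then hand the leftover items to the devices with the
    # largest fractional remainders so the shares sum to total_items.
    shares = {dev: int(value) for dev, value in raw.items()}
    leftover = total_items - sum(shares.values())
    by_remainder = sorted(raw, key=lambda dev: raw[dev] - shares[dev],
                          reverse=True)
    for dev in by_remainder[:leftover]:
        shares[dev] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical profile: two GPUs of different speeds plus the host CPU.
    print(proportional_split(1000, {"cpu": 1.0, "gpu0": 6.0, "gpu1": 3.0}))
```

With these made-up rates, the faster GPU receives six tenths of the work, the slower GPU three tenths, and the host CPU the remaining tenth. A real scheduler would also need to account for memory capacity and transfer costs, which this sketch ignores.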
Pages: 953-971
Page count: 19