Simulating cortical networks on heterogeneous multi-GPU systems

Cited by: 7
|
Authors
Nere, Andrew [1 ]
Franey, Sean [1 ]
Hashmi, Atif [1 ]
Lipasti, Mikko [1 ]
Affiliations
[1] Univ Wisconsin, Dept Elect & Comp Engn, Madison, WI 53706 USA
Funding
US National Science Foundation;
Keywords
Cortical learning algorithms; CUDA; GPGPU; Profiling systems; RECEPTIVE-FIELDS; FUNCTIONAL ARCHITECTURE; MODEL;
DOI
10.1016/j.jpdc.2012.02.006
Chinese Library Classification
TP301 [Theory and Methods];
Discipline code
081202;
Abstract
Recent advances in neuroscientific understanding have highlighted the highly parallel computation power of the mammalian neocortex. In this paper we describe a GPGPU-accelerated implementation of an intelligent learning model inspired by the structural and functional properties of the neocortex. Furthermore, we consider two inefficiencies inherent to our initial implementation and propose software optimizations to mitigate such problems. Analysis of our application's behavior and performance provides important insights into the GPGPU architecture, including the number of cores, the memory system, atomic operations, and the global thread scheduler. Additionally, we create a runtime profiling tool for the cortical network that proportionally distributes work across the host CPU as well as multiple GPGPUs available to the system. Using the profiling tool with these optimizations on Nvidia's CUDA framework, we achieve up to 60x speedup over a single-threaded CPU implementation of the model. (c) 2012 Elsevier Inc. All rights reserved.
Pages: 953 - 971
Number of pages: 19
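The abstract describes a runtime profiler that splits the cortical network across the host CPU and the available GPGPUs in proportion to their measured throughput. The sketch below is a minimal illustration of that proportional-partitioning idea only, not the authors' actual tool; the names (DeviceProfile, partition_columns) and the sample throughput figures are hypothetical, and the real system profiles and dispatches CUDA work rather than plain host code.

```cpp
// Minimal sketch: distribute cortical (hyper)columns across heterogeneous
// devices in proportion to profiled throughput, so all devices finish at
// roughly the same time. All names and numbers here are illustrative.
#include <cstdio>
#include <string>
#include <vector>

struct DeviceProfile {
    std::string name;
    double columns_per_ms;  // throughput measured on a short calibration run
};

// Split total_columns proportionally to each device's measured throughput.
std::vector<int> partition_columns(const std::vector<DeviceProfile>& devices,
                                   int total_columns) {
    double total_rate = 0.0;
    for (const auto& d : devices) total_rate += d.columns_per_ms;

    std::vector<int> share(devices.size(), 0);
    int assigned = 0;
    for (size_t i = 0; i < devices.size(); ++i) {
        share[i] = static_cast<int>(total_columns * devices[i].columns_per_ms / total_rate);
        assigned += share[i];
    }
    // Hand leftover columns (lost to integer truncation) to the fastest device.
    size_t fastest = 0;
    for (size_t i = 1; i < devices.size(); ++i)
        if (devices[i].columns_per_ms > devices[fastest].columns_per_ms) fastest = i;
    share[fastest] += total_columns - assigned;
    return share;
}

int main() {
    // Hypothetical throughputs from a short profiling pass on each device.
    std::vector<DeviceProfile> devices = {
        {"host CPU", 40.0},
        {"GPU 0", 600.0},
        {"GPU 1", 360.0},
    };
    std::vector<int> share = partition_columns(devices, 10000);
    for (size_t i = 0; i < devices.size(); ++i)
        std::printf("%-8s -> %d columns\n", devices[i].name.c_str(), share[i]);
    return 0;
}
```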
Related papers
50 records in total
  • [21] Performance Optimization of Allreduce Operation for Multi-GPU Systems
    Nukada, Akira
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 3107 - 3112
  • [22] Efficient breadth first search on multi-GPU systems
    Mastrostefano, Enrico
    Bernaschi, Massimo
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (09) : 1292 - 1305
  • [23] Autonomous Execution for Multi-GPU Systems: Compiler Support
    Koç University, Istanbul, Turkey
    (further author details not specified)
    CA, United States
    Proc. SC-W: Workshops of the Int. Conf. on High Perform. Comput., Netw., Storage Anal., : 1129 - 1140
  • [24] Tensor Movement Orchestration in Multi-GPU Training Systems
    Lin, Shao-Fu
    Chen, Yi-Jung
    Cheng, Hsiang-Yun
    Yang, Chia-Lin
    2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, 2023, : 1140 - 1152
  • [25] Gossip: Efficient Communication Primitives for Multi-GPU Systems
    Kobus, Robin
    Juenger, Daniel
    Hundt, Christian
    Schmidt, Bertil
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
  • [26] Solving Multiple Tridiagonal Systems on a Multi-GPU Platform
    Dieguez, Adrian P.
    Amor, Margarita
    Doallo, Ramon
    2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018), 2018, : 759 - 763
  • [27] A multi-GPU algorithm for large-scale neuronal networks
    de Camargo, Raphael Y.
    Rozante, Luiz
    Song, Siang W.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (06): 556 - 572
  • [28] New multi-GPU implementation for smoothed particle hydrodynamics on heterogeneous clusters
    Dominguez, J. M.
    Crespo, A. J. C.
    Valdez-Balderas, D.
    Rogers, B. D.
    Gomez-Gesteira, M.
    COMPUTER PHYSICS COMMUNICATIONS, 2013, 184 (08) : 1848 - 1860
  • [29] Parallel Generation of Digitally Reconstructed Radiographs on Heterogeneous Multi-GPU Workstations
    Abdellah, Marwan
    Abdelaziz, Asem
    Ali, Eslam
    Abdelaziz, Sherief
    Sayed, Abdelrahman
    Owis, Mohamed I.
    Eldeib, Ayman
    2016 38TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2016, : 3953 - 3956
  • [30] HPSM: A Programming Framework for Multi-CPU and Multi-GPU Systems
    Lima, Joao V. F.
    Di Domenico, Daniel
    2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW), 2017, : 31 - 36