Simulating cortical networks on heterogeneous multi-GPU systems

Cited by: 7
|
Authors
Nere, Andrew [1 ]
Franey, Sean [1 ]
Hashmi, Atif [1 ]
Lipasti, Mikko [1 ]
Affiliations
[1] Univ Wisconsin, Dept Elect & Comp Engn, Madison, WI 53706 USA
Funding
US National Science Foundation;
Keywords
Cortical learning algorithms; CUDA; GPGPU; Profiling systems; RECEPTIVE-FIELDS; FUNCTIONAL ARCHITECTURE; MODEL;
DOI
10.1016/j.jpdc.2012.02.006
Chinese Library Classification
TP301 [Theory and Methods];
Discipline code
081202;
Abstract
Recent advances in neuroscientific understanding have highlighted the highly parallel computation power of the mammalian neocortex. In this paper we describe a GPGPU-accelerated implementation of an intelligent learning model inspired by the structural and functional properties of the neocortex. Furthermore, we consider two inefficiencies inherent to our initial implementation and propose software optimizations to mitigate such problems. Analysis of our application's behavior and performance provides important insights into the GPGPU architecture, including the number of cores, the memory system, atomic operations, and the global thread scheduler. Additionally, we create a runtime profiling tool for the cortical network that proportionally distributes work across the host CPU as well as multiple GPGPUs available to the system. Using the profiling tool with these optimizations on Nvidia's CUDA framework, we achieve up to 60x speedup over a single-threaded CPU implementation of the model. (c) 2012 Elsevier Inc. All rights reserved.
Pages: 953 - 971
Number of pages: 19
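The abstract describes a runtime profiler that splits the cortical network across the host CPU and the available GPGPUs in proportion to their measured throughput. The sketch below is a minimal illustration of that proportional-partitioning idea only, not the authors' actual tool; the names (DeviceProfile, partition_columns) and the sample throughput figures are hypothetical, and the real system profiles and dispatches CUDA work rather than plain host code.

```cpp
// Minimal sketch: distribute cortical (hyper)columns across heterogeneous
// devices in proportion to profiled throughput, so all devices finish at
// roughly the same time. All names and numbers here are illustrative.
#include <cstdio>
#include <string>
#include <vector>

struct DeviceProfile {
    std::string name;
    double columns_per_ms;  // throughput measured on a short calibration run
};

// Split total_columns proportionally to each device's measured throughput.
std::vector<int> partition_columns(const std::vector<DeviceProfile>& devices,
                                   int total_columns) {
    double total_rate = 0.0;
    for (const auto& d : devices) total_rate += d.columns_per_ms;

    std::vector<int> share(devices.size(), 0);
    int assigned = 0;
    for (size_t i = 0; i < devices.size(); ++i) {
        share[i] = static_cast<int>(total_columns * devices[i].columns_per_ms / total_rate);
        assigned += share[i];
    }
    // Hand leftover columns (lost to integer truncation) to the fastest device.
    size_t fastest = 0;
    for (size_t i = 1; i < devices.size(); ++i)
        if (devices[i].columns_per_ms > devices[fastest].columns_per_ms) fastest = i;
    share[fastest] += total_columns - assigned;
    return share;
}

int main() {
    // Hypothetical throughputs from a short profiling pass on each device.
    std::vector<DeviceProfile> devices = {
        {"host CPU", 40.0},
        {"GPU 0", 600.0},
        {"GPU 1", 360.0},
    };
    std::vector<int> share = partition_columns(devices, 10000);
    for (size_t i = 0; i < devices.size(); ++i)
        std::printf("%-8s -> %d columns\n", devices[i].name.c_str(), share[i]);
    return 0;
}
```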
Related papers
50 records in total
  • [21] Performance Optimization of Allreduce Operation for Multi-GPU Systems
    Nukada, Akira
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 3107 - 3112
  • [22] Efficient breadth first search on multi-GPU systems
    Mastrostefano, Enrico
    Bernaschi, Massimo
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2013, 73 (09) : 1292 - 1305
  • [23] Autonomous Execution for Multi-GPU Systems: Compiler Support
    Koç University, Istanbul, Turkey
    (further author details not specified)
    CA, United States
    Proc. SC-W: Workshops of the Int. Conf. on High Perform. Comput., Netw., Storage Anal., : 1129 - 1140
  • [24] Tensor Movement Orchestration in Multi-GPU Training Systems
    Lin, Shao-Fu
    Chen, Yi-Jung
    Cheng, Hsiang-Yun
    Yang, Chia-Lin
    2023 IEEE INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, HPCA, 2023, : 1140 - 1152
  • [25] Gossip: Efficient Communication Primitives for Multi-GPU Systems
    Kobus, Robin
    Juenger, Daniel
    Hundt, Christian
    Schmidt, Bertil
    PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019), 2019,
  • [26] Solving Multiple Tridiagonal Systems on a Multi-GPU Platform
    Dieguez, Adrian P.
    Amor, Margarita
    Doallo, Ramon
    2018 26TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2018), 2018, : 759 - 763
  • [27] A multi-GPU algorithm for large-scale neuronal networks
    de Camargo, Raphael Y.
    Rozante, Luiz
    Song, Siang W.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (06): 556 - 572
  • [28] New multi-GPU implementation for smoothed particle hydrodynamics on heterogeneous clusters
    Dominguez, J. M.
    Crespo, A. J. C.
    Valdez-Balderas, D.
    Rogers, B. D.
    Gomez-Gesteira, M.
    COMPUTER PHYSICS COMMUNICATIONS, 2013, 184 (08) : 1848 - 1860
  • [29] Parallel Generation of Digitally Reconstructed Radiographs on Heterogeneous Multi-GPU Workstations
    Abdellah, Marwan
    Abdelaziz, Asem
    Ali, Eslam
    Abdelaziz, Sherief
    Sayed, Abdelrahman
    Owis, Mohamed I.
    Eldeib, Ayman
    2016 38TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2016, : 3953 - 3956
  • [30] HPSM: A Programming Framework for Multi-CPU and Multi-GPU Systems
    Lima, Joao V. F.
    Di Domenico, Daniel
    2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW), 2017, : 31 - 36