A high-throughput readout architecture based on PCI-Express Gen3 and DirectGMA technology

Cited by: 8
Authors
Rota, L. [1]
Vogelgesang, M. [1]
Perez, L. E. Ardila [1]
Caselle, M. [1]
Chilingaryan, S. [1]
Dritschler, T. [1]
Zilio, N. [1]
Kopmann, A. [1]
Balzer, M. [1]
Weber, M. [1]
Affiliations
[1] Karlsruhe Inst Technol, Inst Data Proc & Elect, Herrmann von Helmholtz Pl 1, D-76021 Karlsruhe, Germany
Keywords
Trigger concepts and systems (hardware and software); Data acquisition concepts; Digital electronic circuits;
DOI
10.1088/1748-0221/11/02/P02007
Chinese Library Classification (CLC) number
TH7 [Instruments and meters];
Subject classification codes
0804 ; 080401 ; 081102 ;
Abstract
Modern physics experiments produce multi-GB/s data rates. Fast data links and high performance computing stages are required for continuous data acquisition and processing. Because of their intrinsic parallelism and computational power, GPUs have emerged as an ideal solution for processing this data in high performance computing applications. In this paper we present a high-throughput platform based on direct FPGA-GPU communication. The architecture consists of a Direct Memory Access (DMA) engine compatible with the Xilinx PCI-Express core, a Linux driver for register access, and high-level software to manage direct memory transfers using AMD's DirectGMA technology. Measurements with a Gen3 x8 link show a throughput of 6.4 GB/s for transfers to GPU memory and 6.6 GB/s to system memory. We also assess the possibility of using the architecture in low-latency systems: preliminary measurements show a round-trip latency as low as 1 µs for data transfers to system memory, while the additional latency introduced by OpenCL scheduling is the current limitation for GPU-based systems. Our implementation is suitable for real-time DAQ system applications ranging from photon science and medical imaging to High Energy Physics (HEP) systems.
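To put the reported throughput in context, the short sketch below compares the two measured figures from the abstract with the raw bandwidth of a PCI-Express Gen3 x8 link. It is an illustrative calculation, not part of the paper: the function names are hypothetical, and the raw figure assumes only the standard Gen3 parameters (8 GT/s per lane, 128b/130b encoding) and ignores packet header and flow-control overhead, so the usable payload limit is somewhat lower than the value computed here.

```python
def pcie_raw_bandwidth_gb_s(gt_per_s=8.0, lanes=8, encoding=128 / 130):
    """Raw one-directional link bandwidth in GB/s (1 GB = 1e9 bytes).

    Defaults describe a PCIe Gen3 x8 link: 8 GT/s per lane, 8 lanes,
    128b/130b line encoding. Packet (TLP) overhead is NOT included.
    """
    bits_per_second_g = gt_per_s * lanes * encoding  # usable Gbit/s on the wire
    return bits_per_second_g / 8.0                   # convert bits to bytes


def link_efficiency(measured_gb_s, raw_gb_s):
    """Fraction of the raw link bandwidth achieved by a measured throughput."""
    return measured_gb_s / raw_gb_s


raw = pcie_raw_bandwidth_gb_s()

# Throughput values are the ones quoted in the abstract.
for target, measured in (("GPU memory", 6.4), ("system memory", 6.6)):
    share = link_efficiency(measured, raw)
    print(f"{target}: {measured} GB/s is {share:.0%} of {raw:.2f} GB/s raw")
```

Under these assumptions both measurements sit above 80% of the raw link rate, which suggests that the remaining gap is largely explained by protocol overhead rather than by the DMA engine itself.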
Pages: 9