A method for efficient radio astronomical data gridding on multi-core vector processor

Cited by: 0
Authors
Wang, Hao [1]
Yu, Ce [1]
Xiao, Jian [1]
Tang, Shanjiang [1]
Lu, Yu [1]
Fu, Hao [2]
Kang, Bo [2]
Zheng, Gang [2]
Cui, Chenzhou [3]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, 135 Yaguan Road, Haihe Educ Pk, Tianjin 300350, Peoples R China
[2] Natl Supercomp Ctr Tianjin, Tianjin 300457, Peoples R China
[3] Chinese Acad Sci, Natl Astron Observ, 20A Datun Rd, Beijing 100101, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Radio astronomy gridding; Multi-core vector processor; Vectorization; Parallelization; M-DSP;
DOI
10.1016/j.parco.2022.102972
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Gridding is the performance-critical step in the data reduction pipeline for radio astronomy research, allowing astronomers to create the correct sky images for further analysis. Like 2D stencil computation, gridding iteratively updates the output cells by convolution, where the value at each output cell is computed as a weighted sum of neighboring point values. Existing state-of-the-art works have improved gridding performance on multi-core CPUs and GPUs in real-world applications, and they have shown that gridding is a type of scientific computation with high-density computing characteristics. However, low computational performance or high power consumption is their main limitation when processing large-scale astronomical data. The high-density computing feature of gridding provides an opportunity to accelerate it on multi-core vector processors with vector-SIMD architectures. However, the task parallelization and data transfer strategies of existing works (such as those implemented on CPUs or GPUs) are inefficient when used to perform gridding directly on a vector processor without a dedicated mapping algorithm. M-DSP is a multi-core vector processor with vector-SIMD architectures designed for the next-generation exascale supercomputer, delivering high performance with ultra-low power consumption. In this paper, we present, for the first time, a novel method to achieve efficient gridding on the M-DSP. Specifically, we propose a gridding workflow designed for vector-SIMD architectures and present a vectorized version of the gridding convolution algorithm to fully exploit the computational power of the M-DSP. In addition, centering on the processor architecture, we propose task-based parallelization strategies for block and line computing, as well as different data loading strategies, to achieve high parallel performance and high data transfer efficiency.
Experimental results show that our work on M-DSP exhibits very competitive performance compared to other methods running on CPUs or GPUs. This demonstrates the efficiency of our method and shows that the vector-SIMD architecture is beneficial for scientific computing with "high density" characteristics, which can exploit its wide vector core and achieve higher performance than its competitors.
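The convolution step described in the abstract, where each output cell is a weighted sum of neighboring sample values, can be illustrated with a minimal serial sketch. This is not the paper's vectorized M-DSP implementation: the Gaussian kernel, the normalization by the weight sum, the convention that cell centres sit at integer coordinates, and the names `grid_samples`, `sigma` and `support` are all assumptions made for illustration.

```python
import math


def grid_samples(samples, nx, ny, sigma=1.0, support=3.0):
    """Grid scattered (x, y, value) samples onto an nx-by-ny map.

    Each output cell is the kernel-weighted mean of the samples lying
    within `support` cell widths of the cell centre (hypothetical
    Gaussian kernel; cell centres are at integer coordinates).
    """
    acc = [[0.0] * nx for _ in range(ny)]   # sum of weight * value
    wsum = [[0.0] * nx for _ in range(ny)]  # sum of weights

    for x, y, value in samples:
        # Only cells inside the kernel support can receive a contribution.
        i_lo = max(0, math.ceil(x - support))
        i_hi = min(nx - 1, math.floor(x + support))
        j_lo = max(0, math.ceil(y - support))
        j_hi = min(ny - 1, math.floor(y + support))
        for j in range(j_lo, j_hi + 1):
            for i in range(i_lo, i_hi + 1):
                d2 = (i - x) ** 2 + (j - y) ** 2
                if d2 > support ** 2:
                    continue  # outside the circular support
                w = math.exp(-d2 / (2.0 * sigma ** 2))
                acc[j][i] += w * value
                wsum[j][i] += w

    # Cells that received no contribution are marked as NaN.
    return [
        [acc[j][i] / wsum[j][i] if wsum[j][i] > 0 else float("nan")
         for i in range(nx)]
        for j in range(ny)
    ]
```

The inner double loop over the cells inside the kernel support is the dense, regular part of the computation, which is the kind of work the abstract says maps well onto wide vector units.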
Pages: 10