Validating the simulation of large-scale parallel applications using statistical characteristics

Cited by: 1
Authors
Zhang D. [1]
Wilke J. [2]
Hendry G. [2]
Dechev D. [1]
Affiliations
[1] Department of Computer Science, University of Central Florida, 211 Harris Center (Building 116), 4000 Central Florida Boulevard, Orlando, FL 32816
[2] Sandia National Laboratories, California, P.O. Box 969, Livermore, CA 94551-0969
Keywords
Evaluation metrics; Simulation evaluation; Software skeleton
DOI
10.1145/2809778
Abstract
Simulation is a widely adopted method to analyze and predict the performance of large-scale parallel applications. Validating the hardware model is highly important for complex simulations with a large number of parameters. Common practice involves calculating the percent error between the projected and the real execution time of a benchmark program. However, in a high-dimensional parameter space, this coarse-grained approach often suffers from parameter insensitivity, which may not be known a priori. Moreover, the traditional approach cannot be applied to the validation of software models, such as application skeletons used in online simulations. In this work, we present a methodology and a toolset for validating both hardware and software models by quantitatively comparing fine-grained statistical characteristics obtained from execution traces. Although statistical information has been used in tasks like performance optimization, this is the first attempt to apply it to simulation validation. Our experimental results show that the proposed evaluation approach offers significant improvement in fidelity when compared to evaluation using total execution time, and the proposed metrics serve as reliable criteria that progress toward automating the simulation tuning process. © 2016 ACM.
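The contrast the abstract draws between a coarse total-time check and a fine-grained statistical comparison can be illustrated with a minimal sketch. The two-sample Kolmogorov-Smirnov distance used here is a stand-in chosen for illustration and is not claimed to be one of the paper's metrics; the function names and the per-call durations are hypothetical.

```python
from bisect import bisect_right


def percent_error(predicted_total, measured_total):
    """Coarse check: percent error between predicted and measured total time."""
    return abs(predicted_total - measured_total) / measured_total * 100.0


def ks_distance(sample_a, sample_b):
    """Fine-grained check (illustrative assumption): the largest gap between
    the empirical CDFs of two samples of per-event durations."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Hypothetical per-call durations in seconds, invented for this example.
measured = [2.0, 2.0, 2.0, 2.0]
simulated = [0.5, 0.5, 0.5, 6.5]

print(percent_error(sum(simulated), sum(measured)))  # totals agree
print(ks_distance(measured, simulated))              # distributions differ
```

In this invented case the two traces have identical total time, so the percent error reports a perfect match, while the distribution-level distance exposes that the simulated per-call behavior is quite different from the measured one.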