50 items in total
- [1] Hierarchical Roofline Performance Analysis for Deep Learning Applications. INTELLIGENT COMPUTING, VOL 2, 2021, 284: 473-491
- [2] Performance analysis of deep learning workloads using roofline trajectories. CCF Transactions on High Performance Computing, 2019, 1: 224-239
- [3] Time-Based Roofline for Deep Learning Performance Analysis. PROCEEDINGS OF 2020 IEEE/ACM 5TH WORKSHOP ON DEEP LEARNING ON SUPERCOMPUTERS (DLS 2020), 2020: 10-19
- [5] Performance Analysis of Distributed and Scalable Deep Learning. 2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020: 760-766
- [6] Modeling and Optimizing the Scaling Performance in Distributed Deep Learning Training. PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022: 1764-1773
- [7] Collective Communication Performance Evaluation for Distributed Deep Learning Training. APPLIED SCIENCES-BASEL, 2024, 14 (12)
- [9] Performance and Consistency Analysis for Distributed Deep Learning Applications. 2020 IEEE 39TH INTERNATIONAL PERFORMANCE COMPUTING AND COMMUNICATIONS CONFERENCE (IPCCC), 2020
- [10] Building a Performance Model for Deep Learning Recommendation Model Training on GPUs. 2022 IEEE 29TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, DATA, AND ANALYTICS, HIPC, 2022: 48-58