FFT-Based Dense Polynomial Arithmetic on Multi-cores

Cited by: 0
Authors
Maza, Marc Moreno [1]
Xie, Yuzhen [2]
Affiliations
[1] Univ Western Ontario, Ontario Res Ctr Comp Algebra, London, ON, Canada
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA USA
Source
HIGH PERFORMANCE COMPUTING SYSTEMS AND APPLICATIONS | 2010 / Vol. 5976
Funding
National Science Foundation (USA); Natural Sciences and Engineering Research Council of Canada
Keywords
Parallel polynomial arithmetic; parallel polynomial multiplication; parallel normal form; parallel multi-dimensional FFT/TFT; Cilk plus; multi-core;
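DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract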
We report efficient implementation techniques for FFT-based dense multivariate polynomial arithmetic over finite fields, targeting multi-cores. We have extended a preliminary study dedicated to polynomial multiplication and obtained a complete set of efficient parallel routines in Cilk++ for polynomial arithmetic such as normal form computation. Since bivariate multiplication applied to balanced data is a good kernel for these routines, we provide an in-depth study of the performance and the cut-off criteria of our different implementations of this operation. We also show that optimized parallel multiplication not only improves the performance of higher-level algorithms such as normal form computation, but that this composition is also necessary for parallel normal form computation to reach peak performance on a variety of problems that we have tested.
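As a rough illustration of the kind of kernel the abstract refers to, the following is a minimal serial sketch of FFT-based dense polynomial multiplication over a finite field. It is not the authors' Cilk++ code: it handles only the univariate case, uses no parallelism or cut-off criteria, and the prime `P = 998244353`, the generator `G = 3`, and the function names `ntt` and `multiply` are assumptions chosen for illustration.

```python
# Assumed field: Z/PZ with P - 1 = 119 * 2**23, so roots of unity of
# order 2**k exist for every k <= 23. G is a primitive root modulo P.
P = 998244353
G = 3


def ntt(a, invert=False):
    """Recursive radix-2 FFT over Z/PZ; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    # Primitive n-th root of unity (its inverse for the inverse transform).
    w_n = pow(G, (P - 1) // n, P)
    if invert:
        w_n = pow(w_n, P - 2, P)
    even = ntt(a[0::2], invert)
    odd = ntt(a[1::2], invert)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % P
        out[k] = (even[k] + t) % P
        out[k + n // 2] = (even[k] - t) % P
        w = w * w_n % P
    return out


def multiply(f, g):
    """Multiply dense coefficient lists f and g (lowest degree first) mod P."""
    if not f or not g:
        return []
    size = len(f) + len(g) - 1
    n = 1
    while n < size:
        n *= 2
    # Zero-pad to a power of two, transform, multiply pointwise, transform back.
    fa = ntt([c % P for c in f] + [0] * (n - len(f)))
    ga = ntt([c % P for c in g] + [0] * (n - len(g)))
    prod = ntt([x * y % P for x, y in zip(fa, ga)], invert=True)
    n_inv = pow(n, P - 2, P)  # the inverse transform needs a 1/n scaling
    return [c * n_inv % P for c in prod[:size]]


if __name__ == "__main__":
    # (1 + 2x + 3x^2) * (4 + 5x)
    print(multiply([1, 2, 3], [4, 5]))
```

A multivariate product can in principle be reduced to transforms of this kind applied along each variable, which is where the balance of the input data and the choice of cut-off between algorithms start to matter.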
Pages: 378 / +
Number of pages: 3