FFT-Based Dense Polynomial Arithmetic on Multi-cores

Authors
Moreno Maza, Marc [1]
Xie, Yuzhen [2]
Affiliations
[1] Univ Western Ontario, Ontario Res Ctr Comp Algebra, London, ON, Canada
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA USA
Source
HIGH PERFORMANCE COMPUTING SYSTEMS AND APPLICATIONS | 2010 / Vol. 5976
Funding
U.S. National Science Foundation; Natural Sciences and Engineering Research Council of Canada
Keywords
Parallel polynomial arithmetic; parallel polynomial multiplication; parallel normal form; parallel multi-dimensional FFT/TFT; Cilk plus; multi-core;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We report efficient implementation techniques for FFT-based dense multivariate polynomial arithmetic over finite fields, targeting multi-cores. We have extended a preliminary study dedicated to polynomial multiplication and obtained a complete set of efficient parallel routines in Cilk++ for polynomial arithmetic such as normal form computation. Since bivariate multiplication applied to balanced data is a good kernel for these routines, we provide an in-depth study of the performance and the cut-off criteria of our different implementations for this operation. We also show that not only can optimized parallel multiplication improve the performance of higher-level algorithms such as normal form computation, but this composition is also necessary for parallel normal form computation to reach peak performance on a variety of problems that we have tested.
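For readers unfamiliar with the kernel named in the abstract, the following is a minimal serial sketch of FFT-based dense polynomial multiplication over a finite field. It is not the authors' Cilk++ code: it is univariate and sequential, and the prime `P = 998244353`, the primitive root `G = 3`, and the function names `ntt`, `multiply`, and `multiply_naive` are assumptions chosen only for illustration.

```python
# Illustrative sketch only: serial, univariate FFT multiplication over Z/PZ.
# P, G and all function names are assumptions, not taken from the paper.
P = 998244353  # prime with P - 1 = 119 * 2**23, so power-of-two roots of unity exist
G = 3          # a primitive root modulo P


def ntt(a, invert=False):
    """Radix-2 FFT over Z/PZ. len(a) must be a power of two."""
    a = list(a)
    n = len(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        # w_len is a primitive length-th root of unity (inverted for the inverse transform).
        w_len = pow(G, (P - 1) // length, P)
        if invert:
            w_len = pow(w_len, P - 2, P)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % P
                a[k] = (u + v) % P
                a[k + half] = (u - v) % P
                w = w * w_len % P
        length <<= 1
    if invert:
        n_inv = pow(n, P - 2, P)  # modular inverse of n via Fermat's little theorem
        a = [x * n_inv % P for x in a]
    return a


def multiply(f, g):
    """Product of dense coefficient lists f and g (lowest degree first) modulo P."""
    if not f or not g:
        return []
    out_len = len(f) + len(g) - 1
    n = 1
    while n < out_len:
        n <<= 1
    fa = ntt(f + [0] * (n - len(f)))
    ga = ntt(g + [0] * (n - len(g)))
    prod = [x * y % P for x, y in zip(fa, ga)]  # pointwise product of evaluations
    return ntt(prod, invert=True)[:out_len]


def multiply_naive(f, g):
    """Quadratic-time reference used to check the FFT-based result."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[i + j] = (out[i + j] + x * y) % P
    return out
```

Here `multiply([1, 2, 3], [4, 5])` computes (1 + 2x + 3x^2)(4 + 5x) modulo `P`. A multivariate version would apply the same transform along each variable in turn, which is where the parallelism discussed in the abstract could be exploited.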
Pages: 378 / +
Number of pages: 3