An Optimized Cell BE Special Function Library Generated by Coconut

Cited by: 7
Authors
Anand, Christopher Kumar [1]
Kahl, Wolfram [1]
Affiliations
[1] McMaster Univ, Dept Comp & Software, ITB202, Hamilton, ON L8S 4K1, Canada
Keywords
Special function approximations; parallel and vector implementations; code generation; specialized application languages; SIMD processors; applicative (functional) programming
DOI
10.1109/TC.2008.223
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Coconut, a tool for developing high-assurance, high-performance kernels for scientific computing, contains an extensible domain-specific language (DSL) embedded in Haskell. The DSL supports interactive prototyping and unit testing, simplifying the process of designing efficient implementations of common patterns. Unscheduled C and scheduled assembly language output are supported. Using the patterns, even nonexpert users can write efficient function implementations, leveraging special hardware features. A production-quality library of elementary functions for the Cell BE SPU compute engines has been developed. Coconut-generated and -scheduled vector functions were more than four times faster than commercially distributed functions written in C with intrinsics (a nicer syntax for in-line assembly), wrapped in loops and scheduled by spuxlc. All Coconut functions were faster, but the difference was larger for hard-to-approximate functions for which register-level SIMD lookups made a bigger difference. Other helpful features in the language include facilities for translating interval and polynomial descriptions between GHCi, a Haskell interpreter used to prototype in the DSL, and Maple, used for exploration and minimax polynomial generation. This makes it easier to match mathematical properties of the functions with efficient calculational patterns in the SPU ISA. By using single, literate source files, the resulting functions are remarkably readable.
Pages: 1126-1138
Page count: 13