Data Dependence Graph Directed Scheduling for Clustered VLIW Architectures

Cited by: 0
Authors
杨旭
何虎
孙义和
Affiliation
[1] Institute of Microelectronics, Tsinghua University
Keywords
clustered VLIW processor; instruction scheduling; cluster assignment
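DOI
Not available
Chinese Library Classification (CLC) number
TP332 [Arithmetic units and controllers (CPU)]
Discipline classification code
081201
Abstract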
This paper presents an instruction scheduling and cluster assignment approach for clustered very long instruction word (VLIW) processors. The technique produces high-performance code by simultaneously balancing instructions among clusters and minimizing inter-cluster data communication. The scheme is evaluated on benchmarks extracted from UTDSP. Results show speed-ups of up to 44% over previously used techniques, with average speed-ups ranging from 14% (2-cluster) to 18% (4-cluster).
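The trade-off named in the abstract (balancing instructions across clusters while keeping inter-cluster communication low) can be illustrated with a minimal sketch. This is not the paper's algorithm, which the abstract does not detail; it is a generic greedy heuristic written for illustration. The function names, the input format (a topologically ordered node list plus producer/consumer edge pairs), and the `comm_weight` and `balance_weight` parameters are all assumptions.

```python
from collections import defaultdict


def assign_clusters(nodes, edges, n_clusters, comm_weight=1.0, balance_weight=1.0):
    """Greedy cluster assignment over a data dependence graph (illustrative only).

    nodes: instruction ids in topological order (assumed input format).
    edges: (producer, consumer) pairs of the dependence graph.
    Returns a dict mapping each instruction to a cluster index.
    """
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)

    load = [0] * n_clusters  # instructions placed on each cluster so far
    cluster_of = {}
    for node in nodes:
        best_cluster, best_cost = 0, float("inf")
        for c in range(n_clusters):
            # A predecessor placed on another cluster implies one inter-cluster copy.
            copies = sum(1 for p in preds[node] if cluster_of[p] != c)
            # Hypothetical cost: weighted sum of communication and current load.
            cost = comm_weight * copies + balance_weight * load[c]
            if cost < best_cost:
                best_cluster, best_cost = c, cost
        cluster_of[node] = best_cluster
        load[best_cluster] += 1
    return cluster_of


def count_copies(edges, cluster_of):
    """Number of dependence edges that cross a cluster boundary."""
    return sum(1 for src, dst in edges if cluster_of[src] != cluster_of[dst])


if __name__ == "__main__":
    # Two independent dependence chains, interleaved in topological order.
    nodes = ["a0", "b0", "a1", "b1", "a2", "b2"]
    edges = [("a0", "a1"), ("a1", "a2"), ("b0", "b1"), ("b1", "b2")]
    placement = assign_clusters(nodes, edges, n_clusters=2)
    print(placement, count_copies(edges, placement))
```

On the two-chain example, each chain ends up on its own cluster, so the load is even and no edge crosses a cluster boundary. A real scheduler would also model functional-unit availability, copy latency, and the critical path, none of which this sketch attempts.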
Pages: 299-306
Page count: 8
Related papers
16 in total
[1] Lakshmi K V, Sreedhar D, Raman E, et al. Integrating a new cluster assignment and scheduling algorithm into an experimental retargetable code generation framework. Proceedings of the International Conference on High Performance Computing. 2005
[2] Codina J M, Sanchez J, Gonzalez A. Virtual cluster scheduling through the scheduling graph. Proceedings of the International Symposium on Code Generation and Optimization. 2007
[3] Farkas K, Chow P, Jouppi N, et al. The multicluster architecture: Reducing cycle time through partitioning. Proceedings of the 30th Annual International Symposium on Microarchitecture. 1997
[4] Ellis J R. Bulldog: A Compiler for VLIW Architectures. 1986
[5] Fridman J, Greenfield Z. The TigerSHARC DSP Architecture. IEEE Micro Magazine. 2000
[6] Lee W, Puppin D, Swenson S, et al. Convergent scheduling. Proceedings of the 35th Annual IEEE/ACM International Symposium on Microarchitecture. 2002
[7] Aleta A, Codina J M, Sanchez J, et al. Graph-partitioning based instruction scheduling for clustered processors. Proceedings of the 34th International Symposium on ACM/IEEE. 2001
[8] Faraboschi P, Desoli G, Fisher J. Clustered instruction-level parallel processors. Technical Report HPL-98-204. Hewlett-Packard Laboratories. 1998
[9] Zhou Z X, He H, Zhang Y J, et al. A 2-dimension force-directed scheduling algorithm for register-file-connectivity clustered VLIW architecture. IEEE International Conf. on Application-specific Systems, Architectures and Processors. 2007
[10] Texas Instruments. TMS320C62x/67x CPU and Instruction Set Reference Guide. SPRU189. 1998