Selecting the Best Compiler Optimization by Adopting Natural Language Processing

Citations: 0
Authors
Ahmed, Hameeza [1]
Fahim Ul Haque, Muhammad [2, 4]
Raza Khan, Hashim [3]
Nadeem, Ghalib [3]
Arshad, Kamran [5, 6]
Assaleh, Khaled [5, 6]
Cesar Santos, Paulo [7]
Affiliations
[1] NED Univ Engn & Technol, Dept Comp & Informat Syst Engn, Karachi 75270, Pakistan
[2] NED Univ Engn & Technol, Dept Telecommun Engn, Karachi 75270, Pakistan
[3] Iqra Univ, Dept Engn Sci & Technol, Karachi 75500, Pakistan
[4] NED Univ Engn & Technol, Natl Ctr Artificial Intelligence, Neurocomputat Lab, Karachi 75270, Pakistan
[5] Ajman Univ, Coll Engn & Informat Technol, Dept Elect & Comp Engn, Ajman, U Arab Emirates
[6] Ajman Univ, Artificial Intelligence Res Ctr, Ajman, U Arab Emirates
[7] Univ Fed Rio Grande do Sul, Inst Informat, BR-91509900 Porto Alegre, Brazil
Keywords
Codes; Optimization; Natural language processing; Source coding; Feature extraction; Hardware; Bayes methods; Program processors; Performance evaluation; Compiler; optimization; source code analysis; natural language processing; vectorization; regression; classification
DOI
10.1109/ACCESS.2024.3451516
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A compiler is a tool that translates high-level language code into assembly code after applying the relevant optimizations. Automatically selecting suitable optimizations from a large optimization space is a non-trivial task that is mainly accomplished using hardware profiling and application-level features. These features are passed to an intelligent algorithm that predicts the desired optimizations. However, collecting these features requires executing the application beforehand, which incurs high overhead. With the evolution of Natural Language Processing (NLP), the performance of an application can be predicted at compile time solely through source code analysis. There has been substantial work on source code analysis using NLP, but most of it focuses on offloading computation to suitable devices or detecting code vulnerabilities; it has yet to be used to identify the best optimization sequence for an application. Similarly, most works have focused on finding the best machine learning or deep learning algorithm, ignoring the other important phases of the NLP pipeline. This paper pioneers the use of NLP to predict the best set of optimizations for a given application at compile time. Furthermore, it studies the impact of four vectorization and seven regression techniques on predicting application performance. For most applications, we show that TF-IDF vectorization and Huber regression give the best results. On average, the optimization sequence predicted by the proposed technique incurs a performance drop of 18% relative to the actual best combination, with a minimum drop of merely 0.5%.
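The abstract names TF-IDF vectorization and Huber regression but gives no implementation details. The following is a minimal pure-Python sketch of how such a pipeline could look; it is not the authors' method. The tokenizer, the `FLAG_O1`/`FLAG_O3` tokens, the idea of appending a flag token to the source text, the toy snippets, and the runtimes are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

def tokenize(source):
    # Split source text into identifiers, numbers and single punctuation tokens.
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)

def fit_tfidf(docs):
    # Learn a vocabulary and smoothed inverse document frequencies.
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf

def transform(doc, vocab, idf):
    # Map one document to an L2-normalised TF-IDF vector (unknown tokens are ignored).
    counts = Counter(tokenize(doc))
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def huber_grad(residual, delta):
    # Derivative of the Huber loss: linear near zero, clipped to +/-delta beyond it.
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta

def predict(x, w, b):
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def fit_huber(X, y, delta=1.0, lr=0.1, epochs=2000):
    # Linear model trained by batch gradient descent on the mean Huber loss.
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            g = huber_grad(predict(xi, w, b) - yi, delta)
            for j, xj in enumerate(xi):
                gw[j] += g * xj
            gb += g
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

# Hypothetical training set: source snippet plus a flag token -> measured runtime (s).
train_docs = [
    "for (i = 0; i < n; i++) a[i] = b[i] + c[i]; FLAG_O1",
    "for (i = 0; i < n; i++) a[i] = b[i] + c[i]; FLAG_O3",
    "while (p) { s += p->v; p = p->next; } FLAG_O1",
    "while (p) { s += p->v; p = p->next; } FLAG_O3",
]
runtimes = [2.0, 1.0, 3.0, 2.8]

vocab, idf = fit_tfidf(train_docs)
X = [transform(d, vocab, idf) for d in train_docs]
w, b = fit_huber(X, runtimes)

# Score each candidate flag for an unseen program without executing it.
new_source = "for (j = 0; j < m; j++) x[j] = y[j] * z[j];"
candidates = ["FLAG_O1", "FLAG_O3"]
predicted = {f: predict(transform(new_source + " " + f, vocab, idf), w, b)
             for f in candidates}
best_flag = min(predicted, key=predicted.get)
print(best_flag, predicted)
```

The Huber loss is used here because its clipped gradient limits the influence of outlying runtime measurements; in practice a library implementation of both steps would normally replace these hand-written functions.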
Pages: 121700-121711
Page count: 12