Transient Simulations of High-Speed Channels Using CNN-LSTM With an Adaptive Successive Halving Algorithm for Automated Hyperparameter Optimizations

Cited by: 22
Authors
Goay, Chan Hong [1]
Ahmad, Nur Syazreen [1]
Goh, Patrick [1]
Affiliation
[1] Univ Sains Malaysia, Sch Elect & Elect Engn, Nibong Tebal 14300, Penang, Malaysia
Keywords
Training; Transient analysis; Computational modeling; Convolutional neural networks; Time series analysis; Optimization; Integrated circuit modeling; Automated hyperparameter optimization; convolutional neural network (CNN); high-speed channel; long short-term memory (LSTM) network; progressive sampling; transient simulation; recurrent networks
DOI
10.1109/ACCESS.2021.3112134
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Transient simulations of high-speed channels can be very time-intensive. Recurrent neural network (RNN)-based methods can speed up the process by training an RNN model on a relatively short bit sequence and then using a multi-step rolling forecast to predict the subsequent bits. However, the performance of the RNN model is highly sensitive to its hyperparameters. We propose an algorithm named adaptive successive halving automated hyperparameter optimization (ASH-HPO), which combines successive halving, Bayesian optimization (BO), and progressive sampling to tune the hyperparameters of the RNN models. Modifications to the successive halving and progressive sampling algorithms are proposed for better efficiency on time series data. The ASH-HPO algorithm starts training on small subsets of the dataset, then progressively expands the training dataset and adaptively adds or removes models along the way. In this paper, we use the ASH-HPO algorithm to optimize the hyperparameters of convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and CNN-LSTM networks. We demonstrate the effectiveness of the ASH-HPO algorithm on a PCIe Gen 2 channel, a PCIe Gen 5 channel, and a PAM4 differential channel. We also investigate the effects of several settings and tunable variables of the ASH-HPO algorithm on its convergence speed. As a benchmark, we compare the ASH-HPO algorithm against three state-of-the-art HPO methods: BO, successive halving, and Hyperband. The results show that the ASH-HPO algorithm converges faster than the other HPO methods on transient simulation problems.
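To make the loop described in the abstract concrete, the following is a minimal Python sketch of an ASH-HPO-style search that couples progressive sampling with successive halving. It is an illustration under stated assumptions, not the authors' implementation: the search space, the subset sizes, and the helpers propose_hparams and train_and_score are hypothetical stand-ins, and the Bayesian-optimization proposal step is replaced by random sampling for brevity.

import random

# Hypothetical hyperparameter search space for a CNN-LSTM channel model.
SEARCH_SPACE = {
    "lstm_units": [32, 64, 128, 256],
    "cnn_filters": [8, 16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
}

def propose_hparams():
    """Stand-in for the BO proposal step: draw one random configuration."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def train_and_score(hparams, subset_size):
    """Placeholder: train a CNN-LSTM on the first `subset_size` samples of the
    transient waveform and return its validation error (lower is better)."""
    return random.random()  # a real implementation would build and fit the model here

def ash_hpo(n_initial=16, subset_sizes=(1000, 2000, 4000, 8000),
            keep_fraction=0.5, n_fresh=2):
    """Progressive sampling plus successive halving with adaptive refills."""
    candidates = [propose_hparams() for _ in range(n_initial)]
    for subset_size in subset_sizes:
        # Score every surviving configuration on the current (larger) training subset.
        scored = sorted(
            ((train_and_score(h, subset_size), h) for h in candidates),
            key=lambda pair: pair[0],
        )
        # Successive halving: keep only the best-performing fraction of models.
        n_keep = max(1, int(len(scored) * keep_fraction))
        candidates = [h for _, h in scored[:n_keep]]
        # Adaptive step: inject a few fresh proposals so the search can still
        # explore new regions while the training subset keeps growing.
        candidates += [propose_hparams() for _ in range(n_fresh)]
    return candidates[0]  # best scored survivor from the final rung

if __name__ == "__main__":
    print("Selected hyperparameters:", ash_hpo())

At each rung the training subset grows, the worst-scoring configurations are dropped, and a few fresh proposals are injected, which mirrors the abstract's description of adaptively adding and removing models as the dataset expands.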
Pages: 127644-127663
Number of pages: 20