No (good) loss no gain: systematic evaluation of loss functions in deep learning-based side-channel analysis

Cited by: 11

Authors:

Kerkhof, Maikel [1]

Wu, Lichao [1]

Perin, Guilherme [2]

Picek, Stjepan [2]

Affiliations:

[1] Delft Univ Technol, Cyber Secur Res Grp, Mekelweg 2, Delft, Netherlands

[2] Radboud Univ Nijmegen, Digital Secur Grp, Postbus 9010, Nijmegen, Netherlands

Source:

JOURNAL OF CRYPTOGRAPHIC ENGINEERING | 2023 / Volume 13 / Issue 03

Keywords:

Side-channel analysis; Deep Learning; Loss function; Evaluation; KEY;

DOI:

10.1007/s13389-023-00320-6

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Deep learning is a powerful direction for profiling side-channel analysis (SCA) as it can break targets protected with countermeasures even with a relatively small number of attack traces. Still, it is necessary to conduct hyperparameter tuning to reach strong attack performance, which can be far from trivial. Besides many options stemming from the machine learning domain, recent years also brought neural network elements specially designed for side-channel analysis. The loss function, which calculates the error or loss between the actual and desired output, is one of the most important neural network elements. The resulting loss values guide the updates of the weights associated with the connections between the neurons or filters of the deep neural network. Unfortunately, although the loss function is a highly relevant hyperparameter, there are no systematic comparisons of different loss functions regarding their effectiveness in side-channel attacks. This work provides a detailed study of the efficiency of different loss functions in the SCA context. We evaluate five loss functions commonly used in machine learning and three loss functions specifically designed for SCA. Our results show that an SCA-specific loss function (called CER) performs very well and outperforms other loss functions in most evaluated settings. Still, categorical cross-entropy represents a good option, especially considering the variety of neural network architectures.
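As a minimal illustration of the abstract's description of a loss function (the error between the actual and desired output), the sketch below computes categorical cross-entropy, one of the losses named in the abstract, for a single example. The function names, the raw scores, and the four-class setup are illustrative assumptions and are not taken from the paper; the SCA-specific CER loss is not reproduced here.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_class, eps=1e-12):
    """Loss for one example: negative log-probability of the desired class."""
    return -math.log(max(probs[true_class], eps))


# Hypothetical raw network outputs for one trace over four classes.
scores = [2.0, 0.5, 0.1, -1.0]
probs = softmax(scores)

# The loss is small when the desired class received a high probability
# and large when it received a low one.
loss_correct = categorical_cross_entropy(probs, true_class=0)
loss_wrong = categorical_cross_entropy(probs, true_class=3)
```

In training, a value like this would be averaged over a batch, and its gradient would drive the weight updates the abstract refers to.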


Pages: 311-324

Number of pages: 14

References (37 in total):

[1] A Comparison of Regression Models for Prediction of Graduate Admissions [J]. Acharya, Mohan S.; Armaan, Asfia; Antony, Aneeta S. 2019 SECOND INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN DATA SCIENCE (ICCIDS 2019), 2019.

[2] Barz B, 2020, IEEE WINT CONF APPL, P1360, DOI 10.1109/WACV45572.2020.9093286

[3] Deep learning for side-channel analysis and introduction to ASCAD database [J]. Benadjila, Ryad; Prouff, Emmanuel; Strullu, Remi; Cagli, Eleonora; Dumas, Cecile. JOURNAL OF CRYPTOGRAPHIC ENGINEERING, 2020, 10 (02): 163-188.

[4] Convolutional Neural Networks with Data Augmentation Against Jitter-Based Countermeasures Profiling Attacks Without Pre-processing [J]. Cagli, Eleonora; Dumas, Cecile; Prouff, Emmanuel. CRYPTOGRAPHIC HARDWARE AND EMBEDDED SYSTEMS - CHES 2017, 2017, 10529: 45-68.

[5] On the algorithmic implementation of multiclass kernel-based vector machines [J]. Crammer, K; Singer, Y. JOURNAL OF MACHINE LEARNING RESEARCH, 2002, 2 (02): 265-292.

[6] Combination of loss functions for robust breast cancer prediction [J]. Hajiabadi, Hamideh; Babaiyan, Vahide; Zabihzadeh, Davood; Hajiabadi, Moein. COMPUTERS & ELECTRICAL ENGINEERING, 2020, 84.

[7] Combination of loss functions for deep text classification [J]. Hajiabadi, Hamideh; Molla-Aliod, Diego; Monsefi, Reza; Yazdi, Hadi Sadoghi. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2020, 11 (04): 751-761.

[8] He K., 2016, 2016 IEEE C COMP VIS, DOI 10.1109/CVPR.2016.90

[9] Ioffe Sergey, 2015, Proceedings of Machine Learning Research, V37, P448

[10] Janocha Katarzyna, 2017, arXiv
