Improved generalization performance of convolutional neural networks with LossDA

Times Cited: 3
Authors
Liu, Juncheng [1 ]
Zhao, Yili [1 ]
Affiliation
[1] Southwest Forestry Univ, Coll Big Data & Intelligent Engn, Bailong Rd, Kunming 650224, Yunnan, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
Convolutional neural networks; Fully-connected layer; Dynamic adjustment; Generalization performance; Overfitting; CNN; FUSION; IDENTIFICATION; PREDICTION; DIAGNOSIS; FRAMEWORK;
DOI
10.1007/s10489-022-04208-6
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, convolutional neural networks (CNNs) have been applied in many fields. Modern CNNs have a high learning capacity, which comes with increasingly complex model architectures. Complex architectures allow CNNs to learn more data features, but they also tend to reduce a trained model's ability to generalize to unseen data and can lead to overfitting. Although many regularization methods have been proposed, such as data augmentation, batch normalization, and Dropout, improving generalization performance remains a central concern when training robust CNNs. In this paper, we propose a dynamically controllable adjustment method, called LossDA, which embeds a disturbance variable in the fully-connected layer. The trend of this variable is kept consistent with the training loss, while its magnitude can be preset to suit the training process of different models. Through this dynamic adjustment, the training process of CNNs can be adaptively regulated. The whole regularization process improves the generalization performance of CNNs while helping to suppress overfitting. To evaluate the method, we conduct comparative experiments on the MNIST, FashionMNIST, CIFAR-10, Cats_vs_Dogs, and miniImagenet datasets. The results show that the method improves the performance of both Light CNNs and Transfer CNNs (InceptionResNet, VGG19, ResNet50, and InceptionV3). For Light CNNs, the average maximum improvements are 4.62% in accuracy, 3.99% in F1, and 4.69% in Recall; for Transfer CNNs, they are 4.17% in accuracy, 5.64% in F1, and 4.05% in Recall.
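The abstract describes LossDA as a disturbance variable embedded in the fully-connected layer, whose trend follows the training loss and whose magnitude is preset per model. The paper's exact formulation is not given here, so the following is only a minimal illustrative sketch of that idea: a noise term added to a fully-connected layer's output, scaled by the current training loss and capped by a preset magnitude, so the perturbation decays as the loss falls. The function name, the Gaussian noise, and the multiplicative scaling are all assumptions for illustration, not the authors' published method.

```python
import numpy as np


def fc_forward_with_lossda(x, W, b, train_loss, magnitude=0.1, rng=None):
    """Fully-connected forward pass with a LossDA-style disturbance.

    A random disturbance is added to the affine output z = xW + b.
    Its scale is proportional to the current training loss (so its
    trend stays consistent with the loss) and is bounded by a preset
    `magnitude`, so the perturbation shrinks as training converges.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    z = x @ W + b  # standard fully-connected transform
    disturbance = magnitude * train_loss * rng.standard_normal(z.shape)
    return z + disturbance
```

With `train_loss = 0` the disturbance vanishes and the layer reduces to an ordinary fully-connected forward pass; early in training, when the loss is large, the added noise acts as a regularizer in the spirit the abstract describes.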
Pages: 13852-13866
Number of Pages: 15