Depth-restricted convolutional neural network-a model for Gujarati food image classification

Cited by: 5

Authors:

Shah, Bhoomi [1]

Bhavsar, Hetal [2]

Affiliations:

[1] Maharaja Sayajirao Univ Baroda, Fac Sci, Dept Comp Applicat, Vadodara 390001, Gujarat, India

[2] Maharaja Sayajirao Univ Baroda, Fac Technol & Engn, Dept Comp Sci & Engn, Vadodara 390001, Gujarat, India

Source:

VISUAL COMPUTER | 2024 / Vol. 40 / Issue 03

Keywords:

Image classification; Machine learning; Optimization; Predictive models; Supervised learning; Transfer learning

DOI:

10.1007/s00371-023-02893-z

Chinese Library Classification (CLC) number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

An effective dietary assessment system needs to keep track of the amount of food consumed. Food recognition is the first step toward calorie estimation, and image processing techniques are useful for achieving it. With food image classification, people can track the amount of food they eat and control their calorie intake, which helps to reduce the risk of serious health conditions such as hypertension, chronic diseases, and heart disease. Food is very diverse in nature, which makes food image classification more challenging. Deep learning methods for image classification give more accurate and efficient results than traditional methods. This research work focuses on classifying Gujarati food images, as no efforts have been made so far to classify them. A new dataset named "Traditional Gujarati Food Images Dataset (TGFD)" has been created. The dataset contains 1764 images belonging to five food classes of famous food items in Gujarat. The experiments start by applying transfer learning to five models: VGG16, VGG19, ResNet50, Inceptionv3, and AlexNet. All models were then fine-tuned to increase accuracy. After fine-tuning, the maximum accuracy achieved was 89.36%, on the Inceptionv3 model, but the loss was very high. Certain parameters, such as the number of convolutional layers, the number of neurons in the fully connected layers, the number of filters, and the filter size, directly affect a model's accuracy. Taking these parameters into consideration to improve accuracy and reduce loss, this research work proposes a model named "depth-restricted convolutional neural network (DRCNN)", which achieves 95.48% accuracy. The DRCNN model contains 482,069 parameters, 48 times fewer than the Inceptionv3 model, and its validation loss is 0.8041. Introducing batch normalization in the proposed model drastically improves performance with a lower number of parameters. DRCNN has been tested with an increasing number of classes in the dataset and on different types of food datasets. In both cases the model performs well, which indicates its versatility.
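The abstract names the number of convolutional layers, the number of filters, the filter size, and the number of fully connected neurons as the quantities that drive model size. As a rough illustration, the sketch below counts trainable parameters for a small, entirely hypothetical layer stack using the standard formulas for convolutional, dense, and batch normalization layers. The layer sizes are assumptions chosen for the example and are not the DRCNN architecture, which this record does not describe. The only figures taken from the abstract are the 482,069 parameter count and the factor of 48.

```python
# Illustrative sketch only: the layer sizes below are hypothetical and are
# NOT the DRCNN architecture, which this record does not describe.

def conv2d_params(in_channels, filters, kernel_size, use_bias=True):
    """Trainable parameters of a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


def dense_params(in_features, units):
    """Trainable parameters of a fully connected layer (weights + biases)."""
    return in_features * units + units


def batchnorm_trainable_params(channels):
    """Trainable scale (gamma) and shift (beta), one pair per channel."""
    return 2 * channels


def total_params(layers):
    """Sum the parameter counts of a list of (name, count) pairs."""
    return sum(count for _, count in layers)


# Hypothetical stack: two conv blocks, global average pooling, two dense layers.
HYPOTHETICAL_LAYERS = [
    ("conv 3x3, 3 -> 32", conv2d_params(3, 32, 3)),
    ("batch norm, 32", batchnorm_trainable_params(32)),
    ("conv 3x3, 32 -> 64", conv2d_params(32, 64, 3)),
    ("batch norm, 64", batchnorm_trainable_params(64)),
    # Global average pooling adds no parameters and leaves 64 features.
    ("dense 64 -> 128", dense_params(64, 128)),
    ("dense 128 -> 5", dense_params(128, 5)),  # five food classes
]

# Figures quoted in the abstract.
DRCNN_PARAMS = 482_069
SIZE_RATIO = 48
# Parameter count the ratio implies for the comparison model (an inference,
# not a number reported in this record).
IMPLIED_COMPARISON_PARAMS = DRCNN_PARAMS * SIZE_RATIO

if __name__ == "__main__":
    for name, count in HYPOTHETICAL_LAYERS:
        print(f"{name:<22}{count:>8,}")
    print(f"{'total':<22}{total_params(HYPOTHETICAL_LAYERS):>8,}")
    print(f"implied comparison model size: {IMPLIED_COMPARISON_PARAMS:,}")
```

Under these assumptions, batch normalization adds only two trainable values per channel, whereas widening a filter from 3x3 to 5x5 nearly triples the weights of that convolution. That pattern is consistent with the abstract's remark that batch normalization improves performance while the parameter count stays low.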

References

Pages: 1931-1946

Page count: 16

56 references in total

[1] Aguilar E, 2017, IMAGE ANAL PROCESSIN, P1

[2] Attokaren D, 2017, P IEEE REG 10 C

[3] Attokaren DJ, 2017, TENCON IEEE REGION, P2801, DOI 10.1109/TENCON.2017.8228338

[4] Ba J, 2014, ACS SYM SER

[5] The Do's and Don'ts for CNN-based Face Verification [J]. Bansal, Ankan; Castillo, Carlos; Ranjan, Rajeev; Chellappa, Rama. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017), 2017: 2545-2554

[6] Impact of fully connected layers on performance of convolutional neural networks for image classification [J]. Basha, S. H. Shabbeer; Dubey, Shiv Ram; Pulabaigari, Viswanath; Mukherjee, Snehasis. NEUROCOMPUTING, 2020, 378: 112-119

[7] Effect of pooling strategy on convolutional neural network for classification of hyperspectral remote sensing images [J]. Bera, Somenath; Shrivastava, Vimal K. IET IMAGE PROCESSING, 2020, 14(03): 480-486

[8] Bird Jordan J., 2020, 2020 IEEE 10th International Conference on Intelligent Systems (IS), P619, DOI 10.1109/IS48319.2020.9199968

[9] Bylander T, 2006, P 19 INT FLOR ART IN, P11

[10] Chawla NV, 2005, DATA MINING AND KNOWLEDGE DISCOVERY HANDBOOK, P853, DOI 10.1007/0-387-25465-X_40
