Semantic image representation for image recognition and retrieval using multilayer variational auto-encoder, InceptionNet and low-level image features

Cited by: 1

Authors:

Giveki, Davar ^{[1]}

Esfandyari, Sajad ^{[1]}

Affiliations:

[1] Malayer Univ, Dept Comp Engn, Malayer, Iran

Source:

JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 01

Keywords:

Image representation; Image recognition; Content-based image retrieval; Deep learning; FEATURE FUSION; SCENE; CLASSIFICATION; PERFORMANCE; INFORMATION; ATTENTION; NETWORK; CNN;

DOI:

10.1007/s11227-024-06792-5

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812

Abstract:

This paper presents a novel image descriptor that enhances performance in image recognition and retrieval by combining deep learning and handcrafted features. Our method integrates high-level semantic features extracted via InceptionResNet-V2 with color and texture features to create a comprehensive representation of image content. The descriptor's effectiveness is demonstrated through extensive experiments across a range of image recognition and retrieval tasks. Our approach is tested on six benchmark datasets (Corel-1K, VS, OT, QT, SUN-397, and ILSVRC-2012) for single-label classification, and on COCO and NUS-WIDE for multi-label classification, achieving high performance. The results establish that the proposed method is versatile and robust, excelling in single-label and multi-label recognition as well as image retrieval tasks, and that it outperforms several state-of-the-art methods. This work provides a significant advancement in image representation, with broad applicability in various computer vision domains.

Pages: 40
