Semantic embedding: scene image classification using scene-specific objects

Cited by: 3

Authors:

Parseh, Mohammad Javad [1]

Rahmanimanesh, Mohammad [1]

Keshavarzi, Parviz [1]

Azimifar, Zohreh [2]

Affiliations:

[1] Semnan Univ, Dept Elect & Comp Engn, Semnan, Iran

[2] Shiraz Univ, Dept Comp Sci & Engn, Shiraz, Iran

Source:

MULTIMEDIA SYSTEMS | 2023 / Vol. 29 / Issue 02

Keywords:

Scene classification; Semantic embedding; Scene-specific objects; RECOGNITION; MODEL; REPRESENTATION; FEATURES; CATEGORIZATION; NETWORKS; BANK;

DOI:

10.1007/s00530-022-01010-9

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Visual scene understanding is an active and challenging topic in image processing that aims to capture the general (global) concept of a scene image. In this paper, we propose a novel image embedding algorithm that uses a learned embedded space to provide a high-level semantic representation of scene images. As a semantic framework for visual concepts, the learned embedded space can be used in applications such as image captioning, Visual Question Answering (VQA), and scene recognition or classification. Inspired by the human inference mechanism in visual scene understanding, the proposed method learns a semantic embedded space of visual concepts using prior semantic knowledge. This prior knowledge is extracted, in the form of semantic vectors, from ConceptNet, one of the most comprehensive knowledge graphs, and is mapped into the learned embedded space by a transformation function. The transformation function is learned by solving a minimization problem. To evaluate the proposed approach, we introduce a scene image dataset called "Scene23", which is based on the VisualGenome dataset. A non-linear SVM classifier is used to assign the image representations to scene categories. The experimental results show 99.44% classification accuracy on the "Scene23" dataset. We also evaluated the proposed method on the "UIUC Sports" and "MIT67" datasets. The results indicate that the proposed method outperforms other state-of-the-art methods on the "UIUC Sports" dataset and achieves competitive results on the "MIT67" dataset.

Citation

Pages: 669-691

Page count: 23

