Environment Scene Classification Based on Images Using Bag-of-Words

Cited by: 2
Authors
Petraitis, Taurius [1]
Maskeliunas, Rytis [1]
Damasevicius, Robertas [1]
Polap, Dawid [2]
Wozniak, Marcin [2]
Gabryel, Marcin [3]
Affiliations
[1] Kaunas Univ Technol, Dept Multimedia Engn, LT-44249 Kaunas, Lithuania
[2] Silesian Tech Univ, Fac Appl Math, Inst Math, PL-44100 Gliwice, Poland
[3] Czestochowa Tech Univ, Inst Computat Intelligence, PL-42200 Czestochowa, Poland
Source
COMPUTATIONAL INTELLIGENCE, IJCCI 2017 | 2019 / Vol. 829
Keywords
Object recognition; Scene recognition; Image processing; Bag-of-Words; Assisted living; REPRESENTATION;
DOI
10.1007/978-3-030-16469-0_15
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We analyse environment scene classification methods based on the Bag of Words (BoW) model. The BoW model encodes an image as a bag of visual features, that is, a sparse histogram over a dictionary of visual features extracted from the image. We analyse five feature detectors (Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Features from Accelerated Segment Test (FAST), Maximally Stable Extremal Regions (MSER), and grid-based) and three feature descriptors (SIFT, SURF and U-SURF). Our experiments show that two combinations are most effective when classifying images with a Support Vector Machine (SVM) into eight outdoor scene categories (coast, forest, highway, inside city, mountain, open country, street, and high buildings): grid-based feature detection with the SIFT descriptor, and SURF feature detection with the U-SURF descriptor. Indoor scene classification into five categories (bedroom, industrial, kitchen, living room, and store) achieved worse results, and the most frequently confused categories were industrial and store. Classification of the full image dataset (15 outdoor and indoor categories) achieved an overall accuracy of 67.49 ± 1.50%, with most errors coming from misclassified indoor images. The results of the study are applicable to assisted living applications and security systems.
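The BoW encoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `bow_histogram`, the toy 2-D descriptors, and the hand-written three-word codebook are all hypothetical, chosen only to show how local descriptors are assigned to their nearest visual word and counted into a normalised histogram. In practice the descriptors would come from a detector/descriptor pair such as SIFT (128-dimensional vectors), and the dictionary would typically be learned by clustering, which is assumed here rather than stated in the abstract.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def bow_histogram(descriptors, codebook):
    """Return an L1-normalised histogram of nearest-codeword assignments.

    descriptors: iterable of equal-length numeric tuples (local features).
    codebook:    list of numeric tuples (the visual dictionary); hypothetical
                 here, normally learned from training descriptors.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        # Assign the descriptor to its closest visual word.
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalise so images with different numbers of features are comparable.
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy data (assumption): 2-D points stand in for real feature descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (0.0, 0.2)]
hist = bow_histogram(descriptors, codebook)
print(hist)  # the third word is unused, which is where the sparsity comes from
```

The resulting fixed-length vector is what a classifier such as an SVM would take as input, one vector per image.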
Pages: 281-303
Page count: 23