Real-Time Indoor Scene Description for the Visually Impaired Using Autoencoder Fusion Strategies with Visible Cameras

Cited by: 13

Authors:

Malek, Salim [1]

Melgani, Farid [1]

Mekhalfi, Mohamed Lamine [1]

Bazi, Yakoub [2]

Affiliations:

[1] Univ Trento, Dept Informat Engn & Comp Sci, Via Sommarive 9, I-38123 Trento, Italy

[2] King Saud Univ, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia

Source:

SENSORS | 2017 / Vol. 17 / Issue 11

Keywords:

assistive technologies; visible cameras; visually impaired (VI) people; coarse scene description; multiobject recognition; deep learning; feature fusion; image representation; REPRESENTATIONS;

DOI:

10.3390/s17112641

Chinese Library Classification number:

O65 [Analytical Chemistry];

Discipline classification code:

070302 ; 081704 ;

Abstract:

This paper describes three coarse image description strategies intended to give visually impaired individuals a rough perception of surrounding objects in indoor spaces. The described algorithms operate on images acquired by the user with a chest-mounted camera and output a list of objects likely to be present in the indoor scene. First, different colour-, texture-, and shape-based feature extractors are applied, followed by a feature learning step by means of AutoEncoder (AE) models. Second, the produced features are fused and fed into a multilabel classifier in order to list the potential objects. The conducted experiments show that fusing a set of AE-learned features yields higher classification rates than using the features individually. Furthermore, compared with reference works, the method (i) yields higher classification accuracies, and (ii) runs at least four times faster, which enables a potential full real-time application.
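The encode, fuse, and classify flow summarised in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`encode`, `fuse`, `multilabel_predict`), the sigmoid encoder, fusion by plain concatenation, the 0.5 decision threshold, and the hand-set weights, which stand in for weights that would in practice be learned by training the AutoEncoders and the classifier.

```python
import math


def sigmoid(x):
    """Logistic function used for both the encoder and the label scores."""
    return 1.0 / (1.0 + math.exp(-x))


def encode(features, weights, biases):
    """Encoder half of an autoencoder: hidden = sigmoid(W x + b).

    The weights are assumed to be already trained; no training happens here.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


def fuse(*codes):
    """Fuse several AE codes by concatenation (an assumed fusion rule)."""
    return [value for code in codes for value in code]


def multilabel_predict(fused, label_weights, label_biases, labels, threshold=0.5):
    """Score each object label independently and keep those above threshold."""
    scores = encode(fused, label_weights, label_biases)
    return [name for name, score in zip(labels, scores) if score >= threshold]


# Hypothetical two-dimensional colour and texture descriptors for one image.
colour_code = encode([0.0, 0.0], [[1.0, 2.0]], [0.0])
texture_code = encode([0.0, 0.0], [[3.0, 4.0]], [0.0])
fused = fuse(colour_code, texture_code)

# Hand-set classifier weights, chosen only so the example is easy to follow.
objects = multilabel_predict(
    fused,
    [[10.0, 10.0], [-10.0, -10.0]],
    [0.0, 0.0],
    ["chair", "door"],
)
print(objects)
```

Each label gets its own independent score, so an image can be assigned zero, one, or several objects, which is the defining property of a multilabel classifier.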

Pages: 14