Real-Time Indoor Scene Description for the Visually Impaired Using Autoencoder Fusion Strategies with Visible Cameras

Cited by: 13
Authors
Malek, Salim [1]
Melgani, Farid [1]
Mekhalfi, Mohamed Lamine [1]
Bazi, Yakoub [2]
Affiliations
[1] Univ Trento, Dept Informat Engn & Comp Sci, Via Sommarive 9, I-38123 Trento, Italy
[2] King Saud Univ, Coll Comp & Informat Sci, Riyadh 11543, Saudi Arabia
Keywords
assistive technologies; visible cameras; visually impaired (VI) people; coarse scene description; multiobject recognition; deep learning; feature fusion; image representation; REPRESENTATIONS
DOI
10.3390/s17112641
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
This paper describes three coarse image description strategies intended to give visually impaired individuals a rough perception of the objects around them in indoor spaces. The algorithms operate on images captured by the user with a chest-mounted camera and output a list of objects likely to be present in the indoor scene. First, different colour-, texture-, and shape-based features are extracted, followed by a feature learning step using AutoEncoder (AE) models. Second, the resulting features are fused and fed into a multilabel classifier to list the potential objects. The experiments show that fusing a set of AE-learned features yields higher classification rates than using the features individually. Furthermore, compared with reference works, the method (i) yields higher classification accuracies and (ii) runs at least four times faster, which opens the way to a fully real-time application.
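The pipeline outlined in the abstract (per-descriptor features, AE feature learning, fusion, multilabel classification) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the feature extractors, the AE architecture, the three fusion strategies, or the classifier, so the sketch assumes random stand-in feature vectors, a one-hidden-layer autoencoder, fusion by simple concatenation, and independent logistic outputs thresholded at 0.5. All names (`TinyAutoencoder`, `fit_multilabel`, `predict_objects`, the object labels) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyAutoencoder:
    """One-hidden-layer autoencoder (illustrative stand-in, not the paper's model)."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)

    def encode(self, X):
        return sigmoid(X @ self.W1 + self.b1)

    def reconstruction_error(self, X):
        R = self.encode(X) @ self.W2 + self.b2
        return float(np.mean((R - X) ** 2))

    def fit(self, X, epochs=200, lr=0.1):
        n = X.shape[0]
        for _ in range(epochs):
            H = self.encode(X)
            R = H @ self.W2 + self.b2            # linear decoder
            dR = (R - X) / n                     # gradient of the squared error
            dH = (dR @ self.W2.T) * H * (1 - H)  # backprop through the sigmoid
            self.W2 -= lr * H.T @ dR
            self.b2 -= lr * dR.sum(axis=0)
            self.W1 -= lr * X.T @ dH
            self.b1 -= lr * dH.sum(axis=0)
        return self

def fit_multilabel(F, Y, epochs=300, lr=0.5):
    """One independent logistic output per object class (assumed classifier)."""
    n, d = F.shape
    W, b = np.zeros((d, Y.shape[1])), np.zeros(Y.shape[1])
    for _ in range(epochs):
        G = (sigmoid(F @ W + b) - Y) / n
        W -= lr * F.T @ G
        b -= lr * G.sum(axis=0)
    return W, b

def predict_objects(F, W, b, names, threshold=0.5):
    """Return, per image, the names whose predicted probability passes the threshold."""
    P = sigmoid(F @ W + b)
    return [[names[j] for j in np.flatnonzero(p >= threshold)] for p in P]

# Synthetic stand-ins for colour, texture, and shape descriptors of 60 images.
rng = np.random.default_rng(42)
n_images = 60
object_names = ["chair", "door", "table"]  # hypothetical labels
views = {
    "colour": rng.random((n_images, 12)),
    "texture": rng.random((n_images, 16)),
    "shape": rng.random((n_images, 20)),
}
labels = (rng.random((n_images, len(object_names))) > 0.5).astype(float)

# Learn one AE per descriptor, then fuse the learned codes by concatenation.
codes = []
for name, X in views.items():
    ae = TinyAutoencoder(X.shape[1], n_hidden=4)
    if name == "colour":
        err_before = ae.reconstruction_error(X)
    ae.fit(X)
    if name == "colour":
        err_after = ae.reconstruction_error(X)
    codes.append(ae.encode(X))
fused = np.hstack(codes)

W, b = fit_multilabel(fused, labels)
predictions = predict_objects(fused[:3], W, b, object_names)
print(predictions)
```

Because the inputs and labels here are random, the printed object lists carry no meaning; the sketch only shows how the stages connect. Concatenation is just one possible way to fuse the learned codes, chosen for brevity.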
Pages: 14