Surface Material Retrieval Using Weakly Paired Cross-Modal Learning

Cited by: 25
Authors
Liu, Huaping [1 ,2 ]
Wang, Feng [1 ,2 ]
Sun, Fuchun [1 ,2 ]
Fang, Bin [1 ,2 ]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
Funding
U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
Cross-modal learning; multimodal data; surface material retrieval; MATERIAL RECOGNITION; FUSION;
DOI
10.1109/TASE.2018.2865000
Chinese Library Classification (CLC) number
TP [Automation technology; computer technology]
Discipline classification code
0812
Abstract
In this paper, we investigate the cross-modal material retrieval problem, in which a user submits a multimodal query comprising tactile and auditory signals and retrieves images from the visual modality. Because multiple significantly different modalities are involved, this task poses greater challenges than existing cross-modal retrieval tasks. Our focus is to learn cross-modal representations when the modalities differ significantly and only minimal supervision is available. A key novelty is a framework that combines weakly paired multimodal fusion for the heterogeneous tactile and auditory modalities with weakly paired cross-modal transfer to the visual modality. A structured dictionary learning method with low-rank and common-classifier constraints is developed to obtain modality-invariant representations. Finally, cross-modal validation experiments on publicly available data sets demonstrate the advantages of the proposed method.

Note to Practitioners: Cross-modal retrieval is an important task for industrial intelligence. In this paper, we establish a framework that effectively solves the cross-modal material retrieval problem. In the developed framework, the user may submit a multimodal query containing acceleration and sound signals recorded from an object, and the system returns the most relevant retrieved images. Such a framework may find extensive applications in many fields, because it flexibly handles multimodal queries and requires only minimal category-label supervision, without strong sample-pairing information between modalities. The proposed system also goes beyond previously proposed surface material classification approaches in that it returns an ordered list of perceptually similar surface materials for a query.
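As a rough illustration of how such a retrieval pipeline can operate, the Python/NumPy sketch below encodes each modality against a learned dictionary, fuses the tactile and auditory codes of a query, and ranks gallery images by cosine similarity in the shared code space. The random dictionaries, the generic ISTA sparse coder, and the simple average fusion are illustrative assumptions standing in for the paper's structured dictionary learning stage, not its exact formulation.

import numpy as np

# Illustrative stand-ins for learned per-modality dictionaries; in the paper
# these would come from structured dictionary learning with low-rank and
# common-classifier constraints.
rng = np.random.default_rng(0)
n_atoms, d_tactile, d_audio, d_visual = 64, 128, 96, 256
D_tactile = rng.standard_normal((d_tactile, n_atoms))
D_audio = rng.standard_normal((d_audio, n_atoms))
D_visual = rng.standard_normal((d_visual, n_atoms))

def sparse_code(x, D, lam=0.1, n_iter=100):
    """Generic ISTA solver for min_a 0.5*||x - D @ a||^2 + lam*||a||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = a - D.T @ (D @ a - x) / L                            # gradient step
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)    # shrinkage
    return a

def retrieve(q_tactile, q_audio, image_feats, top_k=5):
    """Fuse the query's tactile and auditory codes (simple averaging, an
    assumption) and rank gallery images by cosine similarity in code space."""
    q = 0.5 * (sparse_code(q_tactile, D_tactile) + sparse_code(q_audio, D_audio))
    codes = np.stack([sparse_code(v, D_visual) for v in image_feats])
    sims = codes @ q / (np.linalg.norm(codes, axis=1) * np.linalg.norm(q) + 1e-12)
    return np.argsort(-sims)[:top_k]       # indices of the most similar images

# Example usage on random features standing in for extracted descriptors.
images = rng.standard_normal((20, d_visual))
print(retrieve(rng.standard_normal(d_tactile), rng.standard_normal(d_audio), images))

In the actual method, the per-modality codes would additionally be coupled through a common classifier and a low-rank constraint so that codes from different modalities of the same material class align, which is what makes ranking in a shared code space meaningful.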
Pages: 781-791
Page count: 11