Visual–Semantic Fuzzy Interaction Network for Zero-Shot Learning

Cited by: 0
Authors
Hui, Xuemeng [1]
Liu, Zhunga [1]
Liu, Jiaxiang [1]
Zhang, Zuowei [1]
Wang, Longfei [1]
Affiliations
[1] Northwestern Polytechnical University, Key Laboratory of Information Fusion Technology of Ministry of Education, Xi’an, Shaanxi Province
Source
IEEE Transactions on Artificial Intelligence | 2025 / Vol. 6 / Issue 5
Funding
National Natural Science Foundation of China
Keywords
Fuzzy set theory; knowledge transfer; membership function; object recognition; zero-shot learning;
DOI
10.1109/TAI.2024.3524955
Abstract
Zero-shot learning (ZSL) aims to recognize images of unseen classes using manually defined semantic knowledge corresponding to both seen and unseen classes. The key to ZSL lies in building the interaction between precise image data and fuzzy semantic knowledge. The fuzziness stems from the difficulty of quantifying human knowledge. However, existing ZSL methods ignore the inherent fuzziness of semantic knowledge and treat it as precise data when building the visual–semantic interaction. This hinders the transfer of semantic knowledge from seen classes to unseen classes. To solve this problem, we propose a visual–semantic fuzzy interaction network (VSFIN) for ZSL. VSFIN utilizes an effective encoder–decoder structure, comprising a semantic prototype encoder (SPE) and a visual feature decoder (VFD). The SPE and VFD enable the visual features to interact with semantic knowledge via cross-attention. To achieve visual–semantic fuzzy interaction in the SPE and VFD, we introduce the concept of the membership function from fuzzy set theory and design a membership loss function. This loss function allows a certain degree of imprecision in the visual–semantic interaction, thereby enabling VSFIN to appropriately utilize the given semantic knowledge. Moreover, we introduce the concept of the rank sum test and propose a distribution alignment loss to alleviate the bias towards seen classes. Extensive experiments on three widely used benchmarks demonstrate that VSFIN outperforms current state-of-the-art methods under both conventional ZSL (CZSL) and generalized ZSL (GZSL) settings. © 2020 IEEE.
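The record does not include the paper's actual loss formulation, but the fuzzy-set idea the abstract leans on can be sketched. Below is a minimal, hypothetical illustration (not VSFIN's loss): a Gaussian membership function assigns each prediction a degree of belonging in [0, 1] to a target prototype, and a membership-style loss penalizes only the shortfall from full membership, tolerating small deviations instead of demanding exact equality. The function names and the choice of a Gaussian shape are assumptions for illustration.

```python
import numpy as np

def gaussian_membership(x, center, width):
    """Degree in [0, 1] to which x belongs to a fuzzy set
    centered at `center` with spread `width` (a common choice
    of membership function in fuzzy set theory)."""
    return np.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def membership_loss(pred, target, width=1.0):
    """Hypothetical sketch: penalize predictions by how far their
    membership to the target falls below full membership (1.0).
    Small deviations incur small loss rather than a hard penalty."""
    mu = gaussian_membership(pred, target, width)
    return float(np.mean(1.0 - mu))

# A prediction exactly at the prototype has full membership (loss 0);
# loss grows smoothly as predictions drift away from the prototype.
```

This imprecision-tolerant shape is what distinguishes a membership-based objective from a hard regression loss: the model is rewarded for landing near a semantic prototype, not only exactly on it.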
Pages: 1345–1359
Page count: 14