A Discriminant Information Theoretic Learning Framework for Multi-modal Feature Representation

Cited: 2
Authors
Gao, Lei [1 ]
Guan, Ling [1 ]
Affiliations
[1] Ryerson Univ, 350 Victoria St, Toronto, ON M5B 2K3, Canada
Keywords
Discriminative representation; complementary representation; information theoretic learning; multi-modal feature representation; image recognition; audio-visual emotion recognition; canonical correlation analysis; graph convolutional networks; feature fusion; level fusion; emotion; recognition; model; sparse; autoencoders; algorithms
DOI
10.1145/3587253
CLC classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As sensory and computing technology advances, multi-modal features have been playing a central role in ubiquitously representing patterns and phenomena for effective information analysis and recognition. As a result, multi-modal feature representation is becoming a progressively significant direction of academic research and real applications. Nevertheless, numerous challenges remain, especially in the joint utilization of the discriminative and complementary representations carried by multi-modal features. In this article, a discriminant information theoretic learning (DITL) framework is proposed to address these challenges. Under the proposed framework, the discriminative and complementary information within the given multi-modal features is exploited jointly, resulting in a high-quality feature representation. Based on the characteristics of the DITL framework, the newly generated feature representation is further optimized, leading to lower computational complexity and improved system performance. To demonstrate the effectiveness and generality of DITL, we conducted experiments on several recognition tasks, including static cases, such as handwritten digit recognition, face recognition, and object recognition, and dynamic cases, such as video-based human emotion recognition and action recognition. The results show that the proposed framework outperforms state-of-the-art algorithms.
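The abstract stops short of specifying the DITL objective. Purely as a rough, non-authoritative illustration of the two ingredients it names, the sketch below combines a scatter-based discriminative term with a cross-modal covariance term as a stand-in for "complementarity", and solves a generalized eigenproblem for a joint projection. The function fuse_two_modalities and every modeling choice in it are assumptions made for this sketch, not the authors' method.

# Hypothetical sketch only: jointly exploiting a discriminative term
# (class scatter) and a complementary term (cross-modal covariance)
# over two modalities. This is NOT the DITL algorithm from the paper.
import numpy as np
from scipy.linalg import eigh

def fuse_two_modalities(Xa, Xb, y, dim=2):
    # Concatenate the two modality features into one joint representation.
    Z = np.hstack([Xa, Xb])
    n, d = Z.shape
    mu = Z.mean(axis=0)
    Sw = np.zeros((d, d))   # within-class scatter (to be kept small)
    Sb = np.zeros((d, d))   # between-class scatter (to be made large)
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        Sb += len(Zc) * np.outer(mc - mu, mc - mu)
    # Crude "complementarity" term: covariance between the two modalities,
    # placed on the off-diagonal blocks of a d x d matrix.
    Ca = Xa - Xa.mean(axis=0)
    Cb = Xb - Xb.mean(axis=0)
    da = Xa.shape[1]
    Cab = np.zeros((d, d))
    Cab[:da, da:] = Ca.T @ Cb
    Cab[da:, :da] = (Ca.T @ Cb).T
    # Generalized eigenproblem: maximize (Sb + Cab) relative to Sw.
    reg = 1e-6 * np.eye(d)  # ridge term keeps Sw positive definite
    vals, vecs = eigh(Sb + Cab, Sw + reg)
    W = vecs[:, np.argsort(vals)[::-1][:dim]]  # top-dim directions
    return Z @ W

# Toy usage: 3 classes, two synthetic "modalities" of different widths.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)
Xa = rng.normal(size=(90, 5)) + y[:, None]
Xb = rng.normal(size=(90, 4)) + 0.5 * y[:, None]
print(fuse_two_modalities(Xa, Xb, y, dim=2).shape)  # (90, 2)

Solving the problem in closed form via a single eigendecomposition keeps the sketch cheap, loosely echoing the abstract's claim that the generated representation can be further optimized for lower computational complexity; the actual optimization in the paper may differ entirely.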
Pages: 24