A Discriminant Information Theoretic Learning Framework for Multi-modal Feature Representation

Cited by: 2
Authors
Gao, Lei [1]
Guan, Ling [1]
Affiliations
[1] Ryerson Univ, 350 Victoria St, Toronto, ON M5B 2K3, Canada
Keywords
Discriminative representation; complementary representation; information theoretic learning; multi-modal feature representation; image recognition; audio-visual emotion recognition; CANONICAL CORRELATION-ANALYSIS; GRAPH CONVOLUTIONAL NETWORKS; FEATURE FUSION; LEVEL FUSION; EMOTION; RECOGNITION; MODEL; SPARSE; AUTOENCODERS; ALGORITHMS
DOI
10.1145/3587253
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As sensory and computing technology advances, multi-modal features have been playing a central role in ubiquitously representing patterns and phenomena for effective information analysis and recognition. As a result, multi-modal feature representation is becoming an increasingly significant direction of academic research and real-world applications. Nevertheless, numerous challenges remain, especially in the joint utilization of discriminative and complementary representations from multi-modal features. In this article, a discriminant information theoretic learning (DITL) framework is proposed to address these challenges. Under the proposed framework, the discriminative and complementary information within the given multi-modal features is exploited jointly, resulting in a high-quality feature representation. Based on the characteristics of the DITL framework, the newly generated feature representation is further optimized, leading to lower computational complexity and improved system performance. To demonstrate the effectiveness and generality of DITL, we conducted experiments on several recognition tasks, including static cases, such as handwritten digit recognition, face recognition, and object recognition, and dynamic cases, such as video-based human emotion recognition and action recognition. The results show that the proposed framework outperforms state-of-the-art algorithms.
Pages: 24
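
The abstract describes the framework only at a high level: project each modality so that class-discriminative structure is emphasized, then combine the modalities so that their complementary information is retained. The sketch below illustrates that general idea with a simple scatter-ratio projection per modality followed by concatenation, loosely in the spirit of discriminant correlation analysis. It is an assumption-laden toy, not the paper's DITL algorithm; the function names and the choice of objective are illustrative only.

```python
# Illustrative sketch ONLY -- not the DITL algorithm from the paper.
# It mimics the abstract's two goals: per-modality discrimination
# (scatter-ratio projections) and cross-modality complementation
# (concatenation of the projected features).
import numpy as np


def class_means(X, labels):
    """Stack the per-class mean vectors of X (one row per class)."""
    return np.vstack([X[labels == c].mean(axis=0) for c in np.unique(labels)])


def top_directions(Sb, St, k):
    """Leading directions of the generalized problem Sb w = lambda St w,
    solved by whitening with St^(-1/2) (St is positive definite here)."""
    vals, vecs = np.linalg.eigh(St)
    St_inv_sqrt = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    evals, evecs = np.linalg.eigh(St_inv_sqrt @ Sb @ St_inv_sqrt)
    return St_inv_sqrt @ evecs[:, np.argsort(evals)[::-1][:k]]


def fuse_discriminative(X, Y, labels, dim=2, eps=1e-6):
    """Project each modality discriminatively, then concatenate so the
    complementary parts of both modalities are kept."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    # Between-class scatter per modality: covariance of the class means.
    Mx, My = class_means(X, labels), class_means(Y, labels)
    Sbx, Sby = Mx.T @ Mx / len(Mx), My.T @ My / len(My)

    # Regularized total scatter acts as the normalization constraint.
    Stx = X.T @ X / len(X) + eps * np.eye(X.shape[1])
    Sty = Y.T @ Y / len(Y) + eps * np.eye(Y.shape[1])

    Wx = top_directions(Sbx, Stx, dim)
    Wy = top_directions(Sby, Sty, dim)
    return np.hstack([X @ Wx, Y @ Wy])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat([0, 1, 2], 30)
    X = rng.normal(size=(90, 10)) + labels[:, None]        # modality 1
    Y = rng.normal(size=(90, 8)) + 0.5 * labels[:, None]   # modality 2
    print(fuse_discriminative(X, Y, labels, dim=2).shape)  # (90, 4)
```

Concatenation appears here only because it is the simplest fusion rule that preserves both modalities; the paper's information theoretic objective would take the place of the scatter-ratio criterion in `top_directions`.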