Multimodal Representation Learning via Maximization of Local Mutual Information

Cited by: 23
Authors
Liao, Ruizhi [1]
Moyer, Daniel [1]
Cha, Miriam [2]
Quigley, Keegan [2]
Berkowitz, Seth [3]
Horng, Steven [3]
Golland, Polina [1]
Wells, William M. [1, 4]
Affiliations
[1] MIT, CSAIL, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Lincoln Lab, 244 Wood St, Lexington, MA 02173 USA
[3] Harvard Med Sch, Beth Israel Deaconess Med Ctr, Boston, MA 02115 USA
[4] Harvard Med Sch, Brigham & Womens Hosp, Boston, MA 02115 USA
Keywords
Multimodal representation learning; Local feature representations; Mutual information maximization
DOI
10.1007/978-3-030-87196-3_26
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method trains image and text encoders by encouraging the resulting representations to exhibit high local mutual information. We make use of recent advances in mutual information estimation with neural network discriminators. We argue that the sum of local mutual information is typically a lower bound on the global mutual information. Our experimental results in the downstream image classification tasks demonstrate the advantages of using local features for image-text representation learning.
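The idea of scoring image-text pairs from local features and turning those scores into a mutual information lower bound can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an InfoNCE-style contrastive estimator (one common neural lower bound; the abstract does not say which estimator is used), a plain dot-product critic instead of a trained discriminator, and a hypothetical pooling rule over regions and sentences. The names `local_scores` and `infonce_lower_bound` are invented for this example.

```python
import numpy as np


def local_scores(img_feats, txt_feats):
    """Score every image/report pair from local features.

    img_feats: (B, R, D) array, R region features per image.
    txt_feats: (B, S, D) array, S sentence features per report.
    Returns a (B, B) matrix; entry [i, j] scores image i against report j.
    """
    # Dot product between every region of image i and every sentence of report j.
    sims = np.einsum("ird,jsd->ijrs", img_feats, txt_feats)
    # Assumed pooling: best-matching region per sentence, averaged over sentences.
    return sims.max(axis=2).mean(axis=2)


def infonce_lower_bound(scores):
    """InfoNCE estimate of a mutual information lower bound, in nats."""
    b = scores.shape[0]
    # Numerically stable log-sum-exp over the candidate reports for each image.
    row_max = scores.max(axis=1, keepdims=True)
    lse = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    # Matched pairs sit on the diagonal; the estimate is at most log(B).
    return float(np.mean(np.diag(scores) - lse) + np.log(b))


# Synthetic stand-in for encoder outputs: paired items share a latent vector.
rng = np.random.default_rng(0)
B, R, S, D = 8, 4, 3, 16
shared = rng.normal(size=(B, 1, D))
img = shared + 0.1 * rng.normal(size=(B, R, D))
txt = shared + 0.1 * rng.normal(size=(B, S, D))

paired = infonce_lower_bound(local_scores(img, txt))
mismatched = infonce_lower_bound(local_scores(img, np.roll(txt, 1, axis=0)))
print(f"paired: {paired:.3f}  mismatched: {mismatched:.3f}")
```

In training, the encoders that produce `img_feats` and `txt_feats` would be updated to increase this estimate; here the features are fixed synthetic arrays, so the script only shows that correctly paired inputs score higher than mismatched ones.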
Pages: 273-283
Page count: 11