MORE: A Multimodal Object-Entity Relation Extraction Dataset with a Benchmark Evaluation

Citations: 0
Authors
He, Liang [1 ]
Wang, Hongke [1 ]
Cao, Yongchang [1 ]
Wu, Zhen [1 ]
Zhang, Jianbing [1 ]
Dai, Xinyu [1 ]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China;
Keywords
dataset; multimodal; relation extraction; benchmark evaluation;
DOI
10.1145/3581783.3612209
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extracting relational facts from multimodal data is a crucial task in the field of multimedia and knowledge graphs, one that feeds into a wide range of real-world applications. Recent studies focus on recognizing relational facts in which both entities appear in a single modality, drawing supplementary information from the other modalities. However, such works disregard a substantial number of multimodal relational facts that span modalities, such as one entity appearing in the text and the other in an image. In this paper, we propose a new task, Multimodal Object-Entity Relation Extraction, which aims to extract "object-entity" relational facts from paired image and text data. To facilitate research on this task, we introduce MORE, a new dataset comprising 21 relation types and 20,136 multimodal relational facts annotated on 3,522 pairs of textual news titles and corresponding images. To show the challenges of Multimodal Object-Entity Relation Extraction, we evaluate recent state-of-the-art methods for multimodal relation extraction and conduct a comprehensive experimental analysis on MORE. Our results demonstrate significant challenges for existing methods, underlining the need for further research on this task. Based on our experiments, we identify several promising directions for future work. The MORE dataset and code are available at https://github.com/NJUNLP/MORE.
Pages: 4564-4573
Page count: 10
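To make the "object-entity" setting described in the abstract concrete, below is a minimal Python sketch of how one such annotation could be represented: the named entity comes from the news title while the related object appears only in the paired image. The field names, bounding-box convention, and relation label here are illustrative assumptions for exposition, not the dataset's published schema; see https://github.com/NJUNLP/MORE for the actual format.

# Hypothetical record layout for a single "object-entity" relational fact.
# All field names and example values are assumptions, not MORE's real schema.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ObjectEntityFact:
    title: str                              # textual news title
    image_id: str                           # identifier of the paired image
    entity: str                             # entity mention found in the title
    object_bbox: Tuple[int, int, int, int]  # (x, y, w, h) of the visual object
    relation: str                           # one of the dataset's 21 relation types

# Example: the entity appears only in the text, while the related object
# (here, a trophy) is visible only in the image.
fact = ObjectEntityFact(
    title="Messi lifts the trophy after the final",
    image_id="news_001.jpg",
    entity="Messi",
    object_bbox=(120, 45, 200, 310),
    relation="held_by",  # hypothetical label, not from the real label set
)
print(fact.entity, fact.relation, fact.object_bbox)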