VEG-MMKG: Multimodal knowledge graph construction for vegetables based on pre-trained model extraction

Citations: 0
Authors
Lv, Bowen [1 ,2 ,3 ,4 ]
Wu, Huarui [1 ,3 ,4 ]
Chen, Wenbai [2 ]
Chen, Cheng [1 ]
Miao, Yisheng [1 ,3 ,4 ]
Zhao, Chunjiang [1 ]
Affiliations
[1] Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China
[2] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China
[3] Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China
[4] Minist Agr & Rural Affairs, Key Lab Digital Village Technol, Beijing 100097, Peoples R China
Keywords
Knowledge graph; Multimodal fusion; Image-text pairs; Pre-trained model;
DOI
10.1016/j.compag.2024.109398
Chinese Library Classification (CLC)
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Knowledge graph technology is of great significance to modern agricultural information management and data-driven decision support. However, agricultural knowledge spans many types, and agricultural knowledge graphs built from text alone do not give users an intuitive and comprehensive picture of that knowledge. In view of this, this paper proposes a solution that uses a pre-trained language model to extract knowledge and construct an agricultural multimodal knowledge graph, taking two crops, cabbage and corn, as the research objects. First, a text-image collaborative representation learning method with a two-stream structure is adopted to combine the image-modality information of vegetables with the text-modality information, exploiting the correlation and complementarity between the two to achieve entity alignment. In addition, to address the high similarity among vegetable entities within fine-grained categories, a cross-modal fine-grained contrastive learning method is introduced; it alleviates the weak semantic association between modalities by contrasting individual words with local image regions. Finally, a visual multimodal knowledge graph user interface is built from the image-text matching results. Experimental results show that the fine-tuned pre-trained model achieves an image-text matching rate of 76.7% on the vegetable dataset and can match appropriate images to text entities. The resulting visual multimodal knowledge graph database allows users to query and filter knowledge according to their needs, supporting subsequent research on domain-specific applications such as multimodal agricultural question answering, crop pest and disease identification, and agricultural product recommendation.
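The abstract describes a two-stream (dual-encoder) image-text matching step that pairs text entities with candidate images before they are written into the knowledge graph. As a rough illustration only, the sketch below uses a publicly available CLIP-style dual encoder from the Hugging Face transformers library; the checkpoint name, image file, entity labels, and acceptance threshold are assumptions for demonstration and are not taken from the paper.

```python
# Illustrative sketch, not the authors' implementation: score a crop image
# against candidate text entities with a CLIP-style dual encoder and keep
# the match only if the best score clears a threshold.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_name = "openai/clip-vit-base-patch32"   # assumed CLIP-style checkpoint
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(model_name)

# Hypothetical text entities from the graph and one local crop image.
texts = ["cabbage seedling", "corn leaf blight", "cabbage black rot"]
image = Image.open("sample_vegetable.jpg")     # hypothetical file path

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the image's similarity to each candidate entity.
probs = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
best = int(probs.argmax())
if probs[best].item() > 0.5:                   # assumed acceptance threshold
    print(f"Matched entity: {texts[best]} (p={probs[best].item():.2f})")
else:
    print("No confident match; leave the entity unillustrated.")
```

In a dual-encoder setup like this, text and image embeddings can be precomputed independently, which is what makes large-scale entity-image alignment tractable; the paper's fine-grained word-region contrastive step would refine such coarse matches further.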
Pages: 13