VEG-MMKG: Multimodal knowledge graph construction for vegetables based on pre-trained model extraction

Cited by: 0

Authors:

Lv, Bowen [1,2,3,4]

Wu, Huarui [1,3,4]

Chen, Wenbai [2]

Chen, Cheng [1]

Miao, Yisheng [1,3,4]

Zhao, Chunjiang [1]

Affiliations:

[1] Natl Engn Res Ctr Informat Technol Agr, Beijing 100097, Peoples R China

[2] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China

[3] Beijing Acad Agr & Forestry Sci, Informat Technol Res Ctr, Beijing 100097, Peoples R China

[4] Minist Agr & Rural Affairs, Key Lab Digital Village Technol, Beijing 100097, Peoples R China

Source:

COMPUTERS AND ELECTRONICS IN AGRICULTURE | 2024 / Vol. 226

Keywords:

Knowledge graph; Multimodal fusion; Image-text pairs; Pre-trained model

DOI:

10.1016/j.compag.2024.109398

Chinese Library Classification (CLC) number:

S [Agricultural Sciences]

Subject classification code:

09

Abstract:

Knowledge graph technology is important for modern agricultural information management and data-driven decision support. However, agricultural knowledge comes in many types, and agricultural knowledge graph databases built from text alone do not give users an intuitive or comprehensive understanding of that knowledge. This paper therefore proposes a method that uses a pre-trained language model to extract knowledge and construct an agricultural multimodal knowledge graph, taking two plants, cabbage and corn, as research objects. First, a text-image collaborative representation learning method with a two-stream structure combines the image-modality information of vegetables with the text-modality information, and the correlation and complementarity between the two are used to achieve entity alignment. Second, to address the high similarity between vegetable entities within fine-grained categories, a cross-modal fine-grained contrastive learning method is introduced; contrasting individual words with small image regions addresses the insufficient semantic association between modalities. Finally, a visual multimodal knowledge graph user interface is constructed from the image-text matching results. Experimental results show that the image-text matching efficiency of the fine-tuned pre-trained model on the vegetable dataset is 76.7%, and that appropriate images can be matched to text entities. The constructed visual multimodal knowledge graph database lets users query and filter knowledge according to their needs, supporting subsequent research on domain-specific applications such as multimodal agricultural intelligent question answering, crop pest and disease identification, and agricultural product recommendation.
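The fine-grained contrastive step mentioned in the abstract (contrasting words with small image regions) can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring rule (each word token takes its best-matching image patch, then the scores are averaged), the symmetric InfoNCE-style loss, the temperature of 0.07, the toy embeddings, and all function names are assumptions chosen for illustration only.

```python
import math

def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def fine_grained_score(tokens, patches):
    """Average, over word tokens, of the best cosine similarity to any image patch."""
    tokens = [normalize(t) for t in tokens]
    patches = [normalize(p) for p in patches]
    best = [max(sum(a * b for a, b in zip(t, p)) for p in patches) for t in tokens]
    return sum(best) / len(best)

def similarity_matrix(texts, images):
    """sim[i][j] = fine-grained score between text i and image j."""
    return [[fine_grained_score(t, v) for v in images] for t in texts]

def contrastive_loss(sim, temperature=0.07):
    """Symmetric InfoNCE-style loss; matched pairs are assumed to lie on the diagonal."""
    def one_direction(rows):
        total = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            m = max(logits)  # subtract the max for numerical stability
            log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
            total += log_sum - logits[i]
        return total / len(rows)
    columns = [list(col) for col in zip(*sim)]  # image-to-text direction
    return 0.5 * (one_direction(sim) + one_direction(columns))

def best_image_for_each_text(sim):
    """Index of the highest-scoring image for every text entity."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]

# Hypothetical toy data: two entities, each with two word-token embeddings
# and two image-patch embeddings of dimension 3.
texts = [
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.9, 0.1]],
]
images = [
    [[1.0, 0.1, 0.0], [0.8, 0.0, 0.2]],
    [[0.1, 1.0, 0.0], [0.0, 0.8, 0.2]],
]

sim = similarity_matrix(texts, images)
matches = best_image_for_each_text(sim)
loss = contrastive_loss(sim)
print(matches, round(loss, 6))
```

In a real pipeline the token and patch embeddings would come from the text and image encoders of the two-stream model, and the loss would be minimized during fine-tuning; here the embeddings are hand-written so that each text entity is closest to the image with the same index.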


Pages: 13
