Automatic granary sweeping strategy using visual large language model

Cited by: 0
Authors
Zhang, Boqiang [1]
Yan, Jinhao [1]
Gao, Yuhe [1]
Yang, Genliang [1]
Zhang, Kunpeng [2]
Li, Junwu [3]
Affiliations
[1] Henan Univ Technol, Sch Mech & Elect Engn, Zhengzhou 450001, Peoples R China
[2] Henan Univ Technol, Coll Elect Engn, Zhengzhou 450001, Peoples R China
[3] COFCO Engn & Technol Zhengzhou Co Ltd, Zhengzhou 450001, Peoples R China
Keywords
Food security; Visual large language model; Grain sweep; Grain storage; Explainable AI; Gutter brushes
DOI
10.1016/j.jspr.2025.102619
Chinese Library Classification (CLC) number
Q96 [Entomology]
Abstract
Food security is fundamental to human survival, and reducing grain losses while preserving grain quality has significant practical value. Residual grain sweeping in granaries currently suffers from inefficient manual labor, incomplete coverage, and expensive equipment, which makes more intelligent granary systems particularly important. This work proposes a new method, the Residual Grain Sweeping Visual Large Model (RGSVLM), based on a Visual Large Language Model (VLLM). First, we constructed a semantic dataset containing images of various residual grain dispersal patterns captured in real granary environments. We also introduced an improved version of the Fast Segment Anything Model (FastSAM) algorithm to detect residual grains in the field images, extract visual features, and achieve accurate segmentation. In addition, we designed prompts that combine the image data to produce corresponding textual datasets that reflect real-world conditions. Next, we integrated this dataset with a chain-of-reasoning framework to fine-tune the visual large language model for the specific task. This approach compensates for the original model's limitations in logical reasoning, enabling it to simulate human thought processes and generate clear and reasonable answers. In a granary environment, RGSVLM performs better than the other models compared. The development and implementation of RGSVLM offers innovative concepts and techniques for building intelligent granaries.
Pages: 12