Visual large language model for wheat disease diagnosis in the wild

Cited by: 1
Authors
Zhang, Kunpeng [1, 2]
Ma, Li [1]
Cui, Beibei [1]
Li, Xin [1]
Zhang, Boqiang [3]
Xie, Na [4]
Affiliations
[1] Henan Univ Technol, Coll Elect Engn, Zhengzhou 450001, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[3] Henan Univ Technol, Coll Mech Engn, Zhengzhou 450001, Peoples R China
[4] Cent Univ Finance & Econ, Sch Management Sci & Engn, Beijing 100081, Peoples R China
Keywords
Plant disease; Wheat disease diagnosis; Wheat disease classification; Large language model; Explainable AI;
DOI
10.1016/j.compag.2024.109587
Chinese Library Classification (CLC) number
S [Agricultural Sciences];
Discipline classification code
09;
Abstract
Early detection of symptoms in wheat plants is crucial for mitigating disease effects and preventing their spread. Prompt phytosanitary treatment minimizes yield losses and enhances treatment efficacy. In recent years, numerous image analysis-based methodologies for automatic disease identification have been developed, with Convolutional Neural Networks (CNNs) achieving notable success in visual classification tasks. The existing methods often lack the necessary intelligence and reasoning for real-world applications. This study introduces an advanced wheat disease diagnosis approach using a Visual Language Model (VLM), named the Wheat Disease Language Model (WDLM). The WDLM first leverages the modified Segment Anything Model (SAM) to isolate key wheat features from complex wild environments. To enhance the logical reasoning abilities, the WDLM integrates a reasoning chain to generate clear, reasoned explanations for its diagnosis. By employing dedicated prompt engineering, this study establishes the Wheat Disease Semantic Dataset (WDSD) to fine-tune the VLM. The WDSD, which includes a diverse set of wheat images from various sources, bridges the gap between advanced VLM technology and wheat pathology. Tailored with task-specific data, the WDLM demonstrates superior intelligence by providing accurate classification of wheat diseases and suggesting potential treatment options. Compared to CNN-based models, Transformer-based models, and other VLMs, the WDLM shows improved performance in various scenarios. Integrated with mobile applications, the WDLM approach is readily applicable in the field, representing a promising advancement in the intelligent diagnosis of wheat diseases.
Pages: 14