Agricultural large language model for standardized production of distinctive agricultural products

Cited by: 0
Authors
Yi, Wenlong [1 ]
Zhang, Li [1 ]
Kuzmin, Sergey [2 ]
Gerasimov, Igor [2 ]
Liu, Muhua [3 ]
Affiliations
[1] Jiangxi Agr Univ, Sch Software, Nanchang 330045, Peoples R China
[2] St Petersburg Electrotech Univ LETI, Fac Comp Sci & Technol, St Petersburg 197022, Russia
[3] Jiangxi Agr Univ, Sch Engn, Nanchang 330045, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China;
Keywords
Agricultural products; Standardization; Knowledge engineering; Large models; Retrieval augmentation;
DOI
10.1016/j.compag.2025.110218
Chinese Library Classification (CLC)
S [Agricultural Sciences];
Discipline code
09;
Abstract
To address the diverse nature of specialty agricultural product standardization, its complex and cumbersome development process, and lengthy drafting cycles, while simultaneously tackling challenges such as outdated standardization documents and hallucinations caused by general large language models' delayed access to agricultural domain information, this study constructs a multi-stage cascaded large language model based on a hybrid retrieval-augmented mechanism. The model comprises three core modules: (1) a multi-source retrieval augmentation module that achieves comprehensive external knowledge acquisition through vector retrieval, keyword retrieval, and knowledge graph retrieval branches; (2) a knowledge fusion module that filters redundant information using reciprocal rank fusion and graph structure pruning to inject precise, high-quality knowledge; (3) a domain adaptation module that enhances the model's understanding of agricultural terminology through vertical domain fine-tuning. Experimental results show that in the standardization document summarization task, the model achieves chrF, BERTScore, and Gscore metrics of 34.85, 74.88, and 39.85, respectively, representing improvements of 59.52%, 35.28%, and 72.84% over the BART baseline model, and 58.54%, 24.25%, and 59.54% over the T5 model. This study enriches the theoretical foundation of large language models in agriculture and provides intelligent technical support for specialty agricultural product standardization development.
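The knowledge fusion step described in module (2) combines ranked results from the three retrieval branches. A minimal sketch of reciprocal rank fusion, following the standard formulation (score(d) = Σ 1/(k + rank(d)) over the ranked lists), is shown below; the document identifiers and k = 60 default are illustrative assumptions, not details taken from the paper.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs.

    Each document's fused score is the sum of 1/(k + rank) over every
    list in which it appears (ranks are 1-based); higher is better.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Return document IDs ordered by descending fused score.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from the three retrieval branches:
vector_hits  = ["d1", "d2", "d3"]   # dense vector retrieval
keyword_hits = ["d2", "d4", "d1"]   # sparse keyword retrieval
graph_hits   = ["d3", "d2", "d5"]   # knowledge graph retrieval

fused = reciprocal_rank_fusion([vector_hits, keyword_hits, graph_hits])
# "d2" ranks first: it appears near the top of all three lists.
```

The constant k damps the influence of top-ranked outliers from any single branch, which is why RRF is robust without score normalization across heterogeneous retrievers.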
Pages: 15