Spectroscopy Approaches for Food Safety Applications: Improving Data Efficiency Using Active Learning and Semi-supervised Learning

Cited by: 1
Authors
Zhang, Huanle [1]
Wisuthiphaet, Nicharee [2]
Cui, Hemiao [2]
Nitin, Nitin [2]
Liu, Xin [1]
Zhao, Qing [3]
Affiliations
[1] Univ Calif Davis, Dept Comp Sci, Davis, CA 95616 USA
[2] Univ Calif Davis, Dept Food Sci & Technol, Davis, CA USA
[3] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY USA
Source
FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2022 / Vol. 5
Funding
U.S. National Institute of Food and Agriculture;
Keywords
food science; spectroscopy analysis; machine learning; data efficiency; active learning; semi-supervised learning; TRANSFORM INFRARED-SPECTROSCOPY; OPTIMAL EXPERIMENTAL-DESIGNS; FLUORESCENCE SPECTROSCOPY; REGRESSION; QUALITY; IDENTIFICATION; CHEMOMETRICS; MODELS;
DOI
10.3389/frai.2022.863261
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The past decade has witnessed rapid development in measurement and monitoring technologies for food science. Among these technologies, spectroscopy has been widely used for the analysis of food quality, safety, and nutritional properties. Due to the complexity of food systems and the lack of comprehensive predictive models, rapid and simple measurements to predict complex properties in food systems are largely missing. Machine Learning (ML) has shown great potential to improve the classification and prediction of these properties. However, barriers to collecting large datasets for ML applications still persist. In this paper, we explore different approaches to data annotation and model training to improve data efficiency for ML applications. Specifically, we leverage Active Learning (AL) and Semi-Supervised Learning (SSL) and investigate four approaches: baseline passive learning, AL, SSL, and a hybrid of AL and SSL. To evaluate these approaches, we collect two spectroscopy datasets: predicting plasma dosage and detecting foodborne pathogens. Our experimental results show that, compared to the de facto passive learning approach, the advanced approaches (AL, SSL, and the hybrid) can greatly reduce the number of labeled samples required, in some cases by more than half.
Pages: 13