Data Augmentation Techniques for Machine Learning Applied to Optical Spectroscopy Datasets in Agrifood Applications: A Comprehensive Review

Cited by: 24
Authors
Moises, Ander Gracia [1, 2]
Pascual, Ignacio Vitoria [1, 2, 3]
Gonzalez, Jose Javier Imas [1, 3]
Zamarreno, Carlos Ruiz [1, 2, 3]
Affiliations
[1] Univ Publ Navarra, Dept Elect Elect & Commun Engn, Campus Arrosadia, Pamplona 31006, NA, Spain
[2] Pyroistech SL, C Tajonar 22, Pamplona 31006, NA, Spain
[3] Univ Publ Navarra, Inst Smart Cities, Campus Arrosadia, Pamplona 31006, NA, Spain
Keywords
optical spectroscopy; agrifood industry; artificial intelligence; data augmentation (DA); generative adversarial networks (GANs); near-infrared spectroscopy; progress; technology
DOI
10.3390/s23208562
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Machine learning (ML) and deep learning (DL) have achieved great success in a range of tasks, including computer vision, image segmentation, natural language processing, classification, time series evaluation, and the prediction of values from a series of variables. As artificial intelligence progresses, new techniques are being applied to areas such as optical spectroscopy and its uses in specific fields, such as the agrifood industry. The performance of ML and DL techniques generally improves with the amount of data available. However, it is not always possible to obtain all the data needed to build a robust dataset. In agrifood applications in particular, dataset collection is generally constrained to specific periods. Weather conditions can also make it harder to cover the entire range of classes, which results in imbalanced datasets. To address this issue, data augmentation (DA) techniques are employed to expand the dataset by adding slightly modified copies of existing data. The resulting dataset combines values from laboratory tests with synthetic data derived from the real data. This review presents the application of DA techniques to optical spectroscopy datasets obtained from real agrifood industry applications. The reviewed methods range from simple DA techniques, such as duplicating samples with slight changes, to more complex algorithms based on deep learning: generative adversarial networks (GANs) and semi-supervised generative adversarial networks (SGANs).
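The simplest family of techniques mentioned in the abstract, duplicating samples with slight changes, can be sketched as follows. This is a minimal illustration and not a method taken from the review: the function name `augment_spectra`, the choice of perturbations (baseline offset, multiplicative scaling, additive Gaussian noise), and all default magnitudes are assumptions chosen for the example, and suitable values would depend on the instrument and the dataset.

```python
import numpy as np

def augment_spectra(spectra, n_copies=5, noise_std=0.01,
                    offset_std=0.02, scale_std=0.02, seed=0):
    """Return the real spectra followed by perturbed copies of each one.

    All perturbation magnitudes are illustrative assumptions.
    Also returns, for every output row, the index of the real spectrum it
    came from, so that labels can be carried over to the synthetic rows.
    """
    rng = np.random.default_rng(seed)
    spectra = np.asarray(spectra, dtype=float)
    n_samples = spectra.shape[0]

    # Repeat every spectrum n_copies times: (n_samples * n_copies, n_points)
    base = np.repeat(spectra, n_copies, axis=0)

    # One baseline offset and one scale factor per synthetic spectrum
    offset = rng.normal(0.0, offset_std, size=(base.shape[0], 1))
    scale = 1.0 + rng.normal(0.0, scale_std, size=(base.shape[0], 1))
    # Independent additive noise at every spectral point
    noise = rng.normal(0.0, noise_std, size=base.shape)

    synthetic = scale * base + offset + noise
    source = np.concatenate([np.arange(n_samples),
                             np.repeat(np.arange(n_samples), n_copies)])
    return np.vstack([spectra, synthetic]), source

# Usage: 3 placeholder "spectra" of 100 points each, 4 synthetic copies per sample
real = np.random.default_rng(1).random((3, 100))
augmented, source = augment_spectra(real, n_copies=4)
print(augmented.shape)  # (15, 100)
```

Keeping the `source` index alongside the augmented array lets class labels or reference values be reused for the synthetic rows, which is one way to rebalance an under-represented class by generating more copies of its samples.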
Pages: 29