Unsupervised Outlier Detection Mechanism for Tea Traceability Data

Cited by: 3
Authors
Yang, Honggang [1]
Li, Shaowen [2]
Tu, Lijing [1]
Ma, Rongrong [1]
Chen, Yin [1]
Affiliations
[1] Anhui Agr Univ, Sch Informat & Comp Sci, Hefei 230036, Anhui, Peoples R China
[2] Anhui Prov Key Lab Smart Agr Technol & Equipment, Hefei 230036, Anhui, Peoples R China
Keywords
Anomaly detection; Machine learning; Tuning; Safety; Feature extraction; Data models; Time series analysis; Feature combination; LOKI algorithm; outlier detection mechanism; parameter tuning method; tea traceability; INFORMATION; SYSTEMS
DOI
10.1109/ACCESS.2022.3204760
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Outliers in tea traceability data can mislead customers and significantly damage the reputation and profits of tea companies. To address this problem, an unsupervised outlier detection mechanism for tea traceability data is proposed. First, the tea traceability data are uploaded to a MySQL database and preprocessed so that features are aggregated by relevance, which makes abnormal features easier to identify. Second, the LOF-KNN-IForest (LOKI) algorithm, which is based on the Local Outlier Factor (LOF), K-Nearest Neighbors (KNN), and Isolation Forest (IForest) algorithms, performs unsupervised outlier detection on the data. In addition, a tuning method based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is provided for unsupervised outlier detection algorithms. Finally, the detected outliers are classified by anomaly type so that the causes of the anomalies can be investigated and remedial procedures developed, and the analysis results are fed back to the tea companies. Experiments on real datasets show that the DBSCAN-based tuning method effectively helps unsupervised outlier detection algorithms optimize their parameters, and that the LOKI algorithm effectively identifies outliers in tea traceability data. These results indicate that the proposed mechanism can effectively safeguard the quality of tea traceability data.
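The abstract names the three detectors behind LOKI but does not say how their outputs are combined. As a rough illustration only, the sketch below assumes a simple majority vote over LOF, IForest, and a k-th-nearest-neighbor distance score, using scikit-learn. The function names, the `contamination` rate, and the voting rule are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors


def knn_outlier_flags(X, n_neighbors=5, contamination=0.05):
    """Flag points whose distance to their k-th nearest neighbor is unusually large."""
    # Request k + 1 neighbors because each point is its own nearest neighbor.
    nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
    distances, _ = nn.kneighbors(X)
    kth_distance = distances[:, -1]
    # Assumed rule: the top `contamination` fraction of distances are outliers.
    threshold = np.quantile(kth_distance, 1.0 - contamination)
    return kth_distance > threshold


def majority_vote_outliers(X, n_neighbors=5, contamination=0.05, random_state=0):
    """Return a boolean mask that is True where at least two detectors agree.

    The two-of-three vote is a hypothetical combination rule, not the
    published LOKI procedure.
    """
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, contamination=contamination)
    lof_flags = lof.fit_predict(X) == -1  # -1 marks an outlier

    iforest = IsolationForest(contamination=contamination, random_state=random_state)
    iforest_flags = iforest.fit_predict(X) == -1

    knn_flags = knn_outlier_flags(X, n_neighbors, contamination)

    votes = lof_flags.astype(int) + iforest_flags.astype(int) + knn_flags.astype(int)
    return votes >= 2
```

Calling `majority_vote_outliers(X)` on a numeric feature matrix `X` (rows are records, columns are preprocessed features) returns one boolean per record. A vote is only one plausible way to merge detectors; averaging normalized scores would be another.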
Pages: 94818-94831
Number of pages: 14