Air Big Data Outlier Detection Based on Infinite Gauss Bayesian and CNN

Cited by: 2
Authors
Zhou, LiangQi [1, 2]
Xu, HongZhen [1, 2]
Wei, Li [2 ]
Zhang, Quan [2 ]
Zhou, Fei [2 ]
Li, ZhuoPei [2 ]
Affiliations
[1] East China Univ Technol, Engn Lab Radioact Geosci & Big Data Technol, Nanchang, Jiangxi, Peoples R China
[2] East China Univ Technol, Sch Informat Engn, Nanchang, Jiangxi, Peoples R China
Source
ICMLC 2019: 2019 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING | 2019
Keywords
Air quality; outlier detection; Bayesian clustering; Dirichlet process; neural network
DOI
10.1145/3318299.3318384
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Air quality has long been a concern for the public, environmental protection departments, and the government. In massive air quality datasets, abnormal data can interfere with subsequent experiments and analysis, so it must be detected to improve the accuracy of the data. However, traditional air outlier detection methods require at least one year of data to make inferences about air quality. This paper first analyzes the characteristics of air quality big data and then proposes a framework based on Bayesian nonparametric clustering, namely a Dirichlet process (DP) clustering framework, for outlier detection in air quality data. Based on the results of the data analysis, the framework extends the Gaussian mixture model to an infinite Gaussian mixture model and uses a neural network to cluster the data processed by the infinite Gaussian mixture model, which effectively improves clustering accuracy and avoids the need to collect a large amount of training data.
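As a rough illustration of why a Dirichlet process prior removes the need to fix the number of mixture components in advance, the sketch below samples cluster assignments from a Chinese restaurant process, a standard constructive view of the DP. It shows the prior only and is not the framework described in the abstract; the function name `crp_partition` and the concentration parameter `alpha` are illustrative assumptions.

```python
import random


def crp_partition(n_points, alpha, seed=0):
    """Sample cluster labels for n_points items from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values make it more
    likely that a point opens a new cluster. Illustrative sketch only.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of points currently assigned to cluster k
    labels = []  # labels[i] = cluster index of point i
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_partition(n_points=200, alpha=1.0, seed=1)
    print("number of clusters:", len(counts))
    print("cluster sizes:", counts)
```

The number of clusters is an outcome of sampling rather than an input, which is the property an infinite Gaussian mixture model relies on. In a full model each cluster would also carry Gaussian parameters inferred from the data.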
Pages: 317-321
Page count: 5