Estimation of Missing Values in Incomplete Industrial Process Data Sets Using ECM Algorithm

Times cited: 0
Authors
Pirehgalin, Mina Fahimi [1]
Vogel-Heuser, Birgit [1]
Affiliations
[1] Tech Univ Munich, Inst Automat & Informat Syst, Munich, Germany
Source
2018 IEEE 16TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN) | 2018
Keywords
Missing Data; Multivariate Gaussian Distribution; Expectation Conditional Maximization; Likelihood Inference; Sweep Matrix; IMPUTATION; IMPACT;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202
Abstract
Estimation of missing values is an essential data pre-processing step that improves data quality for subsequent data mining. It matters particularly for industrial data sets, because different operational situations cannot be described properly when the data contain missing values. In this paper, Expectation Conditional Maximization (ECM) is used to fit an approximate model to the data based on a Gaussian distribution. In the Expectation step, the Sweep operation is then used to obtain the regression model of the missing values on the observed values and to estimate the missing values from the observed data. To evaluate the results, a process data set from a real industrial production system is considered. Missing values are simulated by randomly removing data from the variables. Finally, the accuracy of the proposed method in estimating missing values is discussed, as well as the effect of imputing missing values on further data analysis.
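The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the details of their ECM variant, so the code below uses plain EM for a multivariate Gaussian, and the names `sweep` and `em_impute` are hypothetical. The sweep convention assumed here is the common one in which sweeping a covariance matrix on the observed variables leaves the regression coefficients of the missing variables in the off-diagonal block and the residual covariance in the missing block.

```python
import numpy as np


def sweep(g, k):
    """Sweep the symmetric matrix g on pivot k (assumed convention, see lead-in)."""
    d = g[k, k]
    h = g - np.outer(g[:, k], g[k, :]) / d
    h[:, k] = g[:, k] / d
    h[k, :] = g[k, :] / d
    h[k, k] = -1.0 / d
    return h


def em_impute(x, n_iter=50):
    """Impute NaN entries of x under a multivariate Gaussian model (plain EM)."""
    x = np.asarray(x, dtype=float)
    miss = np.isnan(x)
    n, p = x.shape
    # Start from per-column means and a diagonal covariance.
    mu = np.nanmean(x, axis=0)
    sigma = np.diag(np.nanvar(x, axis=0))
    filled = np.where(miss, mu, x)
    for _ in range(n_iter):
        corr = np.zeros((p, p))  # accumulated conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the current mean.
                filled[i] = mu
                corr += sigma
                continue
            # E-step: sweep on the observed variables of this row.
            # (Re-sweeping per row is simple but not efficient.)
            s = sigma
            for k in np.flatnonzero(o):
                s = sweep(s, k)
            beta = s[np.ix_(o, m)]  # regression of missing on observed
            filled[i, m] = mu[m] + (x[i, o] - mu[o]) @ beta
            corr[np.ix_(m, m)] += s[np.ix_(m, m)]  # residual covariance
        # M-step: update mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        diff = filled - mu
        sigma = (diff.T @ diff + corr) / n
    return filled, mu, sigma


# Small demonstration on synthetic data (not the paper's data set).
rng_demo = np.random.default_rng(1)
z_demo = rng_demo.normal(size=(100, 1))
data_demo = np.hstack([z_demo, 2.0 * z_demo + 0.1 * rng_demo.normal(size=(100, 1))])
data_demo[:10, 1] = np.nan  # simulate missing values
completed_demo, mu_demo, sigma_demo = em_impute(data_demo)
print(completed_demo[:3])
```

Observed entries are left untouched; only the NaN positions are replaced by their conditional means given the observed values in the same row.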
Pages: 245-251
Number of pages: 7
Related papers
13 in total
[1]   Data mining and the impact of missing data [J].
Brown, ML ;
Kros, JF .
INDUSTRIAL MANAGEMENT & DATA SYSTEMS, 2003, 103 (8-9) :611-621
[2]   Impact of imputation of missing values on classification error for discrete data [J].
Farhangfar, Alireza ;
Kurgan, Lukasz ;
Dy, Jennifer .
PATTERN RECOGNITION, 2008, 41 (12) :3692-3705
[3]   TUTORIAL ON THE SWEEP OPERATOR [J].
GOODNIGHT, JH .
AMERICAN STATISTICIAN, 1979, 33 (03) :149-158
[4]   IDENTIFICATION OF ARX-MODELS SUBJECT TO MISSING DATA [J].
ISAKSSON, AJ .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1993, 38 (05) :813-819
[5]   Data-driven Soft Sensors in the process industry [J].
Kadlec, Petr ;
Gabrys, Bogdan ;
Strandt, Sibylle .
COMPUTERS & CHEMICAL ENGINEERING, 2009, 33 (04) :795-814
[6]   Imputation of missing data in industrial databases [J].
Lakshminarayan, K ;
Harp, SA ;
Samad, T .
APPLIED INTELLIGENCE, 1999, 11 (03) :259-275
[7]   Multiple Imputation by Ordered Monotone Blocks With Application to the Anthrax Vaccine Research Program [J].
Li, Fan ;
Baccini, Michela ;
Mealli, Fabrizia ;
Zell, Elizabeth R. ;
Frangakis, Constantine E. ;
Rubin, Donald B. .
JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2014, 23 (03) :877-892
[8]  
Little R. J., 2019, STAT ANAL MISSING DA, V793
[9]   Conditions for Ignoring the Missing-Data Mechanism in Likelihood Inferences for Parameter Subsets [J].
Little, Roderick J. ;
Rubin, Donald B. ;
Zangeneh, Sahar Z. .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2017, 112 (517) :314-320
[10]   Missing data methods in PCA and PLS: Score calculations with incomplete observations [J].
Nelson, PRC ;
Taylor, PA ;
MacGregor, JF .
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1996, 35 (01) :45-65