Robust imputation method with context-aware voting ensemble model for management of water-quality data

Cited by: 8
Authors
Choi, Junhyuk [1]
Lim, Kyoung Jae [2]
Ji, Bongjun [2, 3]
Affiliations
[1] Pohang Univ Sci & Technol POSTECH, Dept Ind & Management Engn, Pohang, South Korea
[2] Kangwon Natl Univ, Dept Reg Infrastruct Engn, Chunchon, South Korea
[3] 1 Gangwondaehakgil, Chuncheon Si 24341, Gangwon Do, South Korea
Keywords
Water quality; Missing data; Data imputation; Data quality; Data management; STRATEGIES;
DOI
10.1016/j.watres.2023.120369
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Water-quality monitoring and management are crucial for ensuring the safety and sustainability of water resources. However, missing data are a frequent problem in water-quality datasets and can bias the results of hydrological modeling and data analysis. Although classic statistical methods and emerging machine-learning and deep-learning methods have been applied to impute missing values, most existing methods perform well only in specific missingness scenarios and therefore fail to impute robustly across diverse scenarios. To address this problem, we propose an imputation method that uses a context-aware voting-ensemble model to dynamically select optimal weights for integrating various imputation models across different missingness scenarios. We first identify the attributes of missingness scenarios that influence imputation accuracy. Then, after introducing missing values into the collected data according to these scenarios, we measure the accuracy of various imputation models in each scenario. The weights of the imputation models are optimized by estimating non-linear functions with a regression model that captures the relationships between missingness scenarios and the imputation accuracies of the models. The final imputed value of the ensemble model for a given missingness scenario is obtained by multiplying each imputation model's imputed value by its weight and summing the products. The method inherits the advantages of state-of-the-art imputation models, including the ability to learn long-term dependencies in time series, and adds the flexibility of a dynamic weighting strategy for handling various missingness scenarios. To validate the method, we evaluate it on real-world water-quality data from a river in South Korea. The proposed method achieves higher accuracy and lower variation of imputed values than baseline models across various missingness scenarios. Furthermore, we show the applicability of the method to various hydrological environments by validating it on an industrial water-quality dataset. This study highlights the potential value of an ensemble model with dynamic weighting for robust imputation of water-quality data.
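The weighted-sum step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the inverse-error weighting rule, and the normalization of weights to sum to one are all assumptions made for illustration. In the paper the weights come from a regression model fitted to missingness-scenario attributes, and that model is not reproduced here.

```python
import numpy as np


def weights_from_predicted_errors(predicted_errors):
    """Hypothetical weighting rule (an assumption, not from the paper):
    give each imputation model a weight inversely proportional to the
    error predicted for it in the current missingness scenario."""
    errors = np.asarray(predicted_errors, dtype=float)
    if np.any(errors <= 0):
        raise ValueError("predicted errors must be positive")
    inverse = 1.0 / errors
    return inverse / inverse.sum()


def ensemble_impute(imputed_values, weights):
    """Weighted sum of per-model imputed values.

    imputed_values: shape (n_models, n_missing), one row per imputation model.
    weights: shape (n_models,), one weight per model for this scenario.
    """
    values = np.asarray(imputed_values, dtype=float)
    w = np.asarray(weights, dtype=float)
    if values.shape[0] != w.shape[0]:
        raise ValueError("need exactly one weight per imputation model")
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()  # normalization is an assumption of this sketch
    # Multiply each model's imputed values by its weight, then sum over models.
    return np.tensordot(w, values, axes=1)


# Two hypothetical models imputing the same two missing readings.
model_outputs = [[1.0, 2.0], [3.0, 4.0]]
scenario_weights = weights_from_predicted_errors([3.0, 1.0])
final_values = ensemble_impute(model_outputs, scenario_weights)
```

Here the second model has the lower predicted error for the scenario, so it receives the larger weight and the final values lie closer to its output.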
Pages: 13