Incremental predictive clustering trees for online semi-supervised multi-target regression

Authors
Aljaž Osojnik
Panče Panov
Sašo Džeroski
Affiliations
[1] Jožef Stefan Institute, Jožef Stefan International Postgraduate School
[2] Jožef Stefan Institute
Source
Machine Learning | 2020 / Volume 109
Keywords
Multi-target regression; Data stream mining; Semi-supervised learning; Predictive clustering
Abstract
In many application settings, labeling data examples is costly, while unlabeled examples are abundant and cheap to produce. Labeling can be particularly problematic in an online setting, where arbitrarily many examples can arrive at high frequency. It is also problematic when we need to predict complex values (e.g., multiple real values), a task that has started receiving considerable attention, but mostly in the batch setting. In this paper, we propose a method for online semi-supervised multi-target regression, called iSOUP-PCT. It is based on incremental trees for multi-target regression and the predictive clustering framework, and it uses unlabeled examples to improve its predictive performance over using the labeled examples alone. We compare iSOUP-PCT with supervised tree methods, which do not use unlabeled examples, and with an oracle method, which uses unlabeled examples as though they were labeled. Additionally, we compare it with the available state-of-the-art methods. The method achieves good predictive performance at the cost of increased consumption of computational resources compared to its supervised variant. It also outperforms the state of the art when very few examples are labeled, and achieves comparable performance when labeled examples are more common.
Pages: 2121-2139
Number of pages: 18