An Extension of Iterative Scaling for Decision and Data Aggregation in Ensemble Classification

Cited: 0
Authors
Siddharth Pal
David J. Miller
Affiliations
[1] The Pennsylvania State University, Department of Electrical Engineering
Source
The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology | 2007 / Vol. 48
Keywords
improved iterative scaling; maximum entropy; ensemble classification; mixed continuous-discrete feature spaces;
DOI
not available
Abstract
Improved iterative scaling (IIS) is an algorithm for learning maximum entropy (ME) joint and conditional probability models, consistent with specified constraints, that has found great utility in natural language processing and related applications. Most IIS work on classification considers discrete-valued “feature functions,” depending on the data observations and class label, with constraints measured by frequency counts taken over hard (0–1) training set instances. Here, we consider the case where the training (and test) sets consist of instances of probability mass functions over the features, rather than hard feature values. IIS extends in a natural way to this case. This has applications (1) to ME classification on mixed discrete-continuous feature spaces and (2) to ME aggregation of soft classifier decisions in ensemble classification. Moreover, we combine these methods, yielding a method with proven learning convergence that jointly performs (soft) decision-level and feature-level fusion in making ensemble decisions. We demonstrate favorable comparisons against standard Adaboost.M1, input-dependent boosting, and other supervised combining methods on data sets from the UC Irvine Machine Learning Repository.
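The soft-instance idea in the abstract can be sketched with a small, generalized-iterative-scaling-style update (a toy illustration under my own assumptions, not the authors' IIS extension): empirical feature expectations are computed under each training instance's pmf over feature configurations, rather than from hard 0–1 counts, and the exponential-model weights are rescaled until model and empirical expectations match. All names and the toy data below are invented for illustration.

```python
import numpy as np

# Toy sketch: each training instance is a pmf over the possible feature
# configurations ("soft" feature values), paired with a hard class label.
feature_vectors = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # all configs of 2 binary features
n_classes = 2

train_pmfs = np.array([        # one pmf (over the 4 configurations) per instance
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])
train_labels = np.array([0, 0, 1])

# Feature functions f_{j,c}(x, y) = x_j * 1[y == c]; C bounds their sum (GIS-style step size).
C = feature_vectors.sum(axis=1).max()

def model_probs(lam):
    """p(y|x) ∝ exp(sum_j lam[j, y] * x_j) for every feature configuration x."""
    scores = feature_vectors @ lam
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(scores)
    return p / p.sum(axis=1, keepdims=True)

def model_expectations(lam):
    """E_model[f_{j,c}], averaged over the soft instances' pmfs."""
    p = model_probs(lam)
    mod = sum(feature_vectors.T @ (pmf[:, None] * p) for pmf in train_pmfs)
    return mod / len(train_labels)

# Empirical expectations: weight each configuration's features by the instance
# pmf -- this is what replaces hard frequency counts in standard GIS/IIS.
emp = np.zeros((feature_vectors.shape[1], n_classes))
for pmf, y in zip(train_pmfs, train_labels):
    emp[:, y] += pmf @ feature_vectors
emp /= len(train_labels)

lam = np.zeros((feature_vectors.shape[1], n_classes))
eps = 1e-12
gap0 = np.abs(emp - model_expectations(lam)).max()
for _ in range(500):
    mod = model_expectations(lam)
    lam += np.log((emp + eps) / (mod + eps)) / C  # multiplicative scaling step

gap = np.abs(emp - model_expectations(lam)).max()  # constraint mismatch after training
```

After the loop, `gap` (the largest mismatch between empirical and model feature expectations) should sit well below its initial value `gap0`, reflecting the expectation-matching fixed point that iterative scaling seeks.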
Pages: 21–37 (16 pages)