Dynamic Instance-Wise Classification in Correlated Feature Spaces

Cited by: 1
Authors
Liyanage Y.W. [1]
Zois D.-S. [1]
Chelmis C. [2]
Affiliations
[1] Department of Electrical and Computer Engineering, University at Albany, Albany, NY 12222
[2] Department of Computer Science, University at Albany, Albany, NY 12222
Source
IEEE Transactions on Artificial Intelligence | 2021 / Vol. 2 / No. 6
Funding
National Science Foundation (United States)
Keywords
Bayesian network; correlated features; costly features; datumwise feature selection and classification; sequential feature selection;
DOI
10.1109/TAI.2021.3109858
Abstract
In a typical supervised machine learning setting, the predictions on all test instances are based on a common subset of features discovered during model training. However, using a different subset of features that is most informative for each test instance individually may improve not only the prediction accuracy but also the overall interpretability of the model. At the same time, feature selection methods for classification have been known to be the most effective when many features are irrelevant and/or uncorrelated. In fact, feature selection ignoring correlations between features can lead to poor classification performance. In this work, a Bayesian network is utilized to model feature dependencies. Using the dependence network, a new method is proposed that sequentially selects the best feature to evaluate for each test instance individually and stops the selection process to make a prediction once it determines that no further improvement can be achieved with respect to classification accuracy. The optimum number of features to acquire and the optimum classification strategy are derived for each test instance. The theoretical properties of the optimum solution are analyzed, and a new algorithm is proposed that takes advantage of these properties to implement a robust and scalable solution for high-dimensional settings. The effectiveness, generalizability, and scalability of the proposed method are illustrated on a variety of real-world datasets from diverse application domains. © 2021 IEEE.
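The per-instance acquire-then-stop loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes the full joint distribution over the class and a few discrete features is available as a small table (standing in for the Bayesian network), that each feature has a fixed acquisition cost, and that a greedy one-step-lookahead rule is used in place of the optimum stopping and selection strategy derived in the paper. The names `class_scores`, `classify_instance`, `joint`, `costs`, and `instance` are hypothetical.

```python
import numpy as np


def class_scores(joint, observed):
    """Return unnormalized P(class, observed features) from a full joint table.

    joint: array of shape (n_classes, k_1, ..., k_d) holding P(class, x_1, ..., x_d).
    observed: dict mapping feature index -> observed value.
    """
    table = joint
    # Reduce feature axes from last to first so earlier axis numbers stay valid.
    for f in reversed(range(joint.ndim - 1)):
        axis = f + 1
        if f in observed:
            table = np.take(table, observed[f], axis=axis)  # condition on the value
        else:
            table = table.sum(axis=axis)  # marginalize the unobserved feature
    return table


def classify_instance(joint, costs, instance):
    """Greedy (myopic) sequential feature acquisition for one test instance."""
    observed = {}
    n_features = joint.ndim - 1
    while True:
        scores = class_scores(joint, observed)
        evidence = scores.sum()                 # P(observed)
        current_acc = scores.max() / evidence   # best class posterior so far
        best_gain, best_f = 0.0, None
        for f in range(n_features):
            if f in observed:
                continue
            # Expected best posterior after seeing feature f:
            # sum_v max_c P(c, observed, x_f = v) / P(observed).
            expected_acc = sum(
                class_scores(joint, {**observed, f: v}).max()
                for v in range(joint.shape[f + 1])
            ) / evidence
            gain = expected_acc - current_acc - costs[f]
            if gain > best_gain:
                best_gain, best_f = gain, f
        if best_f is None:  # no feature is worth its cost: stop and predict
            return int(scores.argmax()), list(observed), scores / evidence
        observed[best_f] = instance[best_f]  # pay for the feature and read its value


# Toy model (assumed): x0 is a noisy copy of the class, x1 is a noisy copy of x0.
joint = np.zeros((2, 2, 2))
for c in range(2):
    for x0 in range(2):
        for x1 in range(2):
            joint[c, x0, x1] = (
                0.5 * (0.9 if x0 == c else 0.1) * (0.95 if x1 == x0 else 0.05)
            )

label, acquired, post = classify_instance(joint, costs=[0.01, 0.01], instance=(1, 1))
```

In this toy model the second feature carries no information about the class once the first is known, so the loop acquires only feature 0 and then stops. A selector that ignored the correlation between the two features would treat the second feature as informative on its own and could pay for it unnecessarily.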
Pages: 537-548
Page count: 11