An Empirical Evaluation of Constrained Feature Selection

Authors
Bach J. [1]
Zoller K. [2]
Trittenbach H. [1]
Schulz K. [2, 3]
Böhm K. [1]
Affiliations
[1] Department of Informatics, Karlsruhe Institute of Technology (KIT), Am Fasanengarten 5, Baden-Württemberg, Karlsruhe
[2] Department of Mechanical Engineering, Karlsruhe Institute of Technology (KIT), Kaiserstraße 12, Baden-Württemberg, Karlsruhe
[3] Faculty of Mechanical Engineering and Mechatronics, Karlsruhe University of Applied Sciences, Moltkestraße 30, Baden-Württemberg, Karlsruhe
Keywords
Constraints; Domain knowledge; Feature selection; Theory-guided data science
DOI
10.1007/s42979-022-01338-z
Abstract
While feature selection helps to get smaller and more understandable prediction models, most existing feature-selection techniques do not consider domain knowledge. One way to use domain knowledge is via constraints on sets of selected features. However, the impact of constraints, e.g., on the predictive quality of selected features, is currently unclear. This article is an empirical study that evaluates the impact of propositional and arithmetic constraints on filter feature selection. First, we systematically generate constraints of various types, using datasets from different domains. As expected, constraints tend to decrease the predictive quality of feature sets, but this effect is non-linear. Thus, we observe feature sets that both adhere to constraints and have high predictive quality. Second, we study a concrete setting in materials science. This part of our study sheds light on how one can analyze scientific hypotheses with the help of constraints. © 2022, The Author(s).