Applying Dempster-Shafer theory for developing a flexible, accurate and interpretable classifier

Citations: 28
Authors
Penafiel, Sergio [1]
Baloian, Nelson [1]
Sanson, Horacio [2]
Pino, Jose A. [1]
Affiliations
[1] Univ Chile, Dept Comp Sci, Santiago, Chile
[2] Allm Inc, Tokyo, Japan
Keywords
Supervised learning; Expert systems; Gradient descent; Dempster-Shafer theory; Interpretability; RULE;
DOI
10.1016/j.eswa.2020.113262
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Two approaches have traditionally been identified for developing artificial intelligence systems that support decision-making: Machine Learning, which applies general techniques based on statistical analysis and optimization to extract information from large amounts of data by looking for possible relations among them, and Expert Systems, which codify experts' knowledge in rules that are then applied to a specific situation. One of the main advantages of the first approach is its greater accuracy and the wider generality of its methods, which can be used in various scenarios. By contrast, expert systems are usually more restricted and often applicable only to the domain for which they were originally developed. However, the machine learning approach requires large amounts of data, and it is much harder to interpret the results of the statistical methods to explain why the system decides, classifies, or evaluates a situation in a certain way. This issue may become very important in areas such as medicine, where it is relevant to know why the system recommends a certain treatment or diagnoses a certain illness. Likewise, in the financial sector, it might be legally required to show that a decision to reject a mortgage loan application is not due to discriminatory causes such as gender or race. To provide interpretability and extract knowledge from available data, we developed a classification method based on Dempster-Shafer's Plausibility Theory. Mass assignment functions (MAFs) must be established to apply this theory; they assign a weight or probability to all subsets of the possible outcomes, given the presence of a certain fact in a decision scenario. MAF assignments thus encode expert knowledge. The method learns optimal values for the weights of each MAF using gradient descent. The presented method allows MAFs that have been generated by the method itself or defined by an expert to be combined with those derived from a set of available data. The developed method was first applied to controlled scenarios and traditional data sets to ensure that classifications and explanations are correct. Results show that the model classifies with an accuracy comparable to that of other statistical classification methods, while also being able to extract the most important decision rules from the data. (C) 2020 Elsevier Ltd. All rights reserved.
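To make the idea of combining mass assignment functions concrete, the sketch below applies the standard Dempster's rule of combination to two MAFs over a two-class problem. The abstract does not state the exact combination formula used, so this is only an illustration of the general technique; the class labels, the mass values, and the helper name `combine` are hypothetical and not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass assignment functions with Dempster's rule.

    Each MAF is a dict mapping a frozenset of outcomes to its mass.
    The masses in each dict are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: the product of masses goes to the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets: the product of masses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: the two MAFs cannot be combined")
    # Renormalize so the combined masses sum to 1.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical two-class example (values chosen only for illustration).
SICK, HEALTHY = "sick", "healthy"
EITHER = frozenset({SICK, HEALTHY})  # mass here expresses uncertainty

rule_1 = {frozenset({SICK}): 0.6, EITHER: 0.4}
rule_2 = {frozenset({HEALTHY}): 0.3, EITHER: 0.7}

result = combine(rule_1, rule_2)
for subset, mass in result.items():
    print(sorted(subset), round(mass, 4))
```

Mass left on the full set of outcomes represents uncertainty rather than support for either class. In a learned model, that is the quantity an optimizer such as gradient descent could shrink as a rule becomes more informative.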
Pages: 12