What to expect from a set of itemsets?

Cited by: 0
Authors
Delacroix, T. [1]
Lenca, P. [2]
Lallich, S. [3]
Affiliations
[1] Univ Paris Saclay, Polytech Paris Saclay, Orsay, France
[2] IMT Atlantique, Lab STICC, F-29238 Brest, France
[3] Univ Lyon 2, Lab ERIC, Lyon, France
Keywords
Pattern mining; Itemset mining; Interestingness; Redundancy; Maximum entropy model; Independence model; Interestingness measures
DOI
10.1016/j.ins.2021.12.115
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dealing with redundancy is one of the main challenges in frequency-based data mining, and in itemset mining in particular. To tackle this issue as objectively as possible, we introduce the theoretical bases of a new probabilistic concept: mutual constrained independence (MCI). Using this notion, we describe an MCI model for the frequencies of all itemsets; given the known frequencies of some of the itemsets, it is the model that is least binding in terms of model hypotheses. We provide a method for computing MCI models based on algebraic geometry. We establish the link between MCI models and a class of MaxEnt models that is already known to be used in pattern mining. As such, our research offers further insight into the nature of such models and an entirely novel approach for computing them. © 2021 Elsevier Inc. All rights reserved.
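As a rough illustration of the kind of expectation the abstract refers to, the sketch below computes the simplest baseline, the independence model, which is the maximum entropy model when only single-item frequencies are constrained. It is not the MCI computation from the paper (that method relies on algebraic geometry and is not reproduced here). The toy transactions and the function names `frequency` and `independence_estimate` are assumptions made for this example.

```python
def frequency(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def independence_estimate(transactions, itemset):
    """Expected frequency of `itemset` if its items were mutually independent:
    the product of the single-item frequencies."""
    estimate = 1.0
    for item in itemset:
        estimate *= frequency(transactions, [item])
    return estimate


# Hypothetical toy dataset (not taken from the paper).
transactions = [
    frozenset(t)
    for t in (
        {"a", "b"}, {"a", "b"}, {"a"}, {"b"},
        set(), {"a", "b", "c"}, {"c"}, {"a", "c"},
    )
]

observed = frequency(transactions, {"a", "b"})
expected = independence_estimate(transactions, {"a", "b"})
# A ratio far from 1 suggests the itemset carries information that the
# single-item frequencies alone do not explain.
print(observed, expected, observed / expected)
```

An itemset whose observed frequency is close to the estimate is, in this simple sense, redundant given its items; richer models condition on the known frequencies of larger itemsets as well.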
Pages: 314-340
Number of pages: 27