The minimum description length principle for pattern mining: a survey

Cited by: 0
Author
Esther Galbrun
Affiliation
[1] University of Eastern Finland, School of Computing
Source
Data Mining and Knowledge Discovery | 2022 / Vol. 36
Keywords
Data mining; Pattern mining; Frequent itemset mining; Minimum description length principle; Information theory
DOI
Not available
Abstract
Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The Minimum Description Length (MDL) principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact, high-quality sets of patterns. After giving an outline of relevant concepts from information theory and coding, we review MDL-based methods for mining different kinds of patterns from various types of data. Finally, we open a discussion on some issues regarding these methods.
Pages: 1679-1727
Page count: 48