Triadic Formal Concept Analysis and triclustering: searching for optimal patterns

Cited by: 0
Authors
Dmitry I. Ignatov
Dmitry V. Gnatyshak
Sergei O. Kuznetsov
Boris G. Mirkin
Affiliation
[1] National Research University Higher School of Economics, Department of Data Analysis and Artificial Intelligence, Computer Science Faculty
Source
Machine Learning | 2015 / Vol. 101
Keywords
Formal Concept Analysis; Triclustering; Triadic data; Multi-way set; Tripartite graphs; Pattern mining; Suboptimal solutions
DOI
Not available
Abstract
This paper presents several definitions of “optimal patterns” in triadic data and the results of an experimental comparison of five triclustering algorithms on real-world and synthetic datasets. The evaluation is carried out over criteria such as resource efficiency, noise tolerance, and quality scores involving cardinality, density, coverage, and diversity of the patterns. An ideal triadic pattern is a totally dense maximal cuboid (formal triconcept). The relaxations of this notion under consideration are: OAC-triclusters; triclusters optimal with respect to the least-square criterion; and graph partitions obtained by using spectral clustering. We show that searching for an optimal tricluster cover is an NP-complete problem, whereas determining the number of such covers is #P-complete. Our extensive computational experiments lead us to a clear strategy for choosing a solution for a given dataset, guided by the principle of Pareto-optimality according to the proposed criteria.
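The notion of a "totally dense maximal cuboid" in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `density`, `is_dense`, and `is_triconcept`, the toy user/tag/resource data, and the restriction to non-empty components are all assumptions made for illustration. Density is taken here as the fraction of cells of a cuboid X × Y × Z that belong to the ternary relation, and maximality is checked by trying to add one element at a time to each component.

```python
from itertools import product


def density(relation, X, Y, Z):
    """Fraction of the cells of the cuboid X x Y x Z that are in `relation`."""
    volume = len(X) * len(Y) * len(Z)
    if volume == 0:
        return 0.0
    hits = sum((x, y, z) in relation for x, y, z in product(X, Y, Z))
    return hits / volume


def is_dense(relation, X, Y, Z):
    """True if every cell of the (non-empty) cuboid is in `relation`."""
    if not (X and Y and Z):
        return False
    return all((x, y, z) in relation for x, y, z in product(X, Y, Z))


def is_triconcept(relation, objects, attributes, conditions, X, Y, Z):
    """Check that (X, Y, Z) is totally dense and cannot be extended.

    Assumption: only cuboids with non-empty components are considered.
    A single-element extension test is enough, because any larger dense
    cuboid contains a dense one-element extension of (X, Y, Z).
    """
    if not is_dense(relation, X, Y, Z):
        return False
    if any(is_dense(relation, X | {g}, Y, Z) for g in objects - X):
        return False
    if any(is_dense(relation, X, Y | {m}, Z) for m in attributes - Y):
        return False
    if any(is_dense(relation, X, Y, Z | {b}) for b in conditions - Z):
        return False
    return True


# Hypothetical toy data: (user, tag, resource) triples.
objects = {"u1", "u2", "u3"}
attributes = {"t1", "t2"}
conditions = {"r1", "r2"}
relation = {
    ("u1", "t1", "r1"), ("u1", "t1", "r2"),
    ("u2", "t1", "r1"), ("u2", "t1", "r2"),
    ("u3", "t2", "r1"),
}

full = is_triconcept(relation, objects, attributes, conditions,
                     {"u1", "u2"}, {"t1"}, {"r1", "r2"})
partial = is_triconcept(relation, objects, attributes, conditions,
                        {"u1"}, {"t1"}, {"r1", "r2"})
loose = density(relation, {"u1", "u2", "u3"}, {"t1"}, {"r1", "r2"})
```

In this toy relation the cuboid ({u1, u2}, {t1}, {r1, r2}) is totally dense and cannot be extended, while ({u1}, {t1}, {r1, r2}) is dense but not maximal. Adding u3 gives a cuboid with density below 1, which is the kind of pattern a relaxed tricluster definition would still admit.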
Pages: 271-302
Page count: 31