Simple Pattern Minimality Problems: Integer Linear Programming Formulations and Covering-Based Heuristic Solving Approaches

被引:2
|
作者
Boccia, Maurizio [1 ]
Sforza, Antonio [1 ]
Sterle, Claudio [1 ,2 ]
机构
[1] Univ Federico II Naples, Dept Elect Engn & Informat Technol, I-80125 Naples, Italy
[2] CNR, Ist Anal Sistemi Informat Antonio Ruberti, I-00185 Rome, Italy
关键词
data mining and machine learning; logical analysis of data; simple pattern minimality problem; covering/partitioning approach; LOGICAL ANALYSIS; GENERATION; ALGORITHM;
D O I
10.1287/ijoc.2019.0940
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
The simple pattern minimality problem (SPMP) represents a central problem in the logical analysis of data and association rules mining, and it finds applications in several fields as logic synthesis, reliability analysis, and automated reasoning. It consists of determining the minimum number of patterns explaining all the observations of a data set, that is, a Boolean logic formula that is true for all the elements of the data set and false for all the unseen observations. We refer to this problem as covering SPMP (C-SPMP), because each observation can be explained (covered) by more than one pattern. Starting from a real industrial application, we also define a new version of the problem, and we refer to it as partitioning SPMP (P-SPMP), because each observation has to be covered just once. Given a propositional formula or a truth table, C-SPMP and P-SPMP coincide exactly with the problem of determining the minimum disjunctive and minimum exclusive disjunctive normal form, respectively. Both problems are known to be NP-hard and have been generally tackled by heuristic methods. In this context, the contribution of this work is twofold. On one side, it provides two original integer linear programming formulations for the two variants of the SPMP. These formulations exploit the concept of Boolean hypercube to build a graph representation of the problems and allow to exactly solve instances with more than 1,000 observations by using an MIP solver. On the other side, two effective and fast heuristics are proposed to solve relevant size instances taken from literature (SeattleSNPs) and from the industrial database. The proposed methods do not suffer from the same dimensional drawbacks of the methods present in the literature and outperform either existing commercial and freeware logic tools or the available industrial solutions in the number of generated patterns and/or in the computational burden.
引用
收藏
页码:1049 / 1060
页数:12
相关论文
共 13 条
  • [1] Tabu Search-Based Heuristic Solver for General Integer Linear Programming Problems
    Koguma, Yuji
    IEEE ACCESS, 2024, 12 : 19059 - 19076
  • [2] Solving large-scale fixed cost integer linear programming models for grid-based location problems with heuristic techniques
    Noor-E-Alam, Md.
    Doucette, John
    ENGINEERING OPTIMIZATION, 2015, 47 (08) : 1085 - 1106
  • [3] Set covering-based surrogate approach for solving sup-T equation constrained optimization problems
    Hu, Cheng-Feng
    Fang, Shu-Cherng
    FUZZY OPTIMIZATION AND DECISION MAKING, 2011, 10 (02) : 125 - 152
  • [4] An Integer Linear Programming based heuristic for the Capacitated m-Ring-Star Problem
    Naji-Azimi, Zahra
    Salari, Majid
    Toth, Paolo
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2012, 217 (01) : 17 - 25
  • [5] Generalizations, formulations and subgradient based heuristic with dynamic programming procedure for target set selection problems
    Ravelo, Santiago, V
    Meneses, Claudio N.
    COMPUTERS & OPERATIONS RESEARCH, 2021, 135
  • [6] Solving Continuous-Time Linear Programming Problems Based on the Piecewise Continuous Functions
    Wu, Hsien-Chung
    NUMERICAL FUNCTIONAL ANALYSIS AND OPTIMIZATION, 2016, 37 (09) : 1168 - 1201
  • [7] A simplex-based branch-and-cut method for solving integer tri-level programming problems
    Awraris, Ashenafi
    Wordofa, Berhanu Guta
    Kassa, Semu Mitiku
    JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE-JMCS, 2022, 25 (03): : 232 - 250
  • [8] Mixed Integer Linear Programming Based Large Neighborhood Search Approaches for the Directed Feedback Vertex Set Problem
    Bresich, Maria
    Varga, Johannes
    Raidl, Guenther R.
    Limmer, Steffen
    METAHEURISTICS AND NATURE INSPIRED COMPUTING, META 2023, 2024, 2016 : 3 - 20
  • [9] Solving fully fuzzy linear programming problems with flexible constraints based on a new order relation
    Yang, Xiao-Peng
    Cao, Bing-Yuan
    Zhou, Xue-Gang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2015, 29 (04) : 1539 - 1550
  • [10] Trip-Based Integer Linear Programming Model for Static Multi-Car Elevator Operation Problems
    Inamoto, Tsutomu
    Higami, Yoshinobu
    Kobayashi, Shin-ya
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2017, E100A (02): : 385 - 394