A Gradient-Based Clustering for Multi-Database Mining

Cited by: 3
Authors
Miloudi, Salim [1]
Wang, Yulin [1]
Ding, Wenjia [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
Keywords
Databases; Itemsets; Clustering algorithms; Data models; Prototypes; Computer science; Computational modeling; Multi-database mining; graph clustering; dual gradient descent; quasi-convex optimization; similarity measure; HIGH-FREQUENCY RULES; INTERESTING PATTERNS; ITEM RECOMMENDATION; ALGORITHMS; CLASSIFICATION
DOI
10.1109/ACCESS.2021.3050404
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Multinational corporations have multiple databases distributed throughout their branches, which store millions of transactions per day. For business applications, identifying disjoint clusters of similar and relevant databases contributes to learning the common buying patterns among customers and also increases the profits by targeting potential clients in the future. This process is called clustering, which is an important unsupervised technique for big data mining. In this article, we present an effective approach to search for the optimal clustering of multiple transaction databases in a weighted undirected similarity graph. To assess the clustering quality, we use dual gradient descent to minimize a constrained quasi-convex loss function whose parameters will determine the edges needed to form the optimal database clusters in the graph. Therefore, finding the global minimum is guaranteed in a finite and short time compared with the existing non-convex objectives where all possible candidate clusterings are generated to find the ideal clustering. Moreover, our algorithm does not require specifying the number of clusters a priori and uses a disjoint-set forest data structure to maintain and keep track of the clusters as they are updated. Through a series of experiments on public data samples and precomputed similarity matrices, we show that our algorithm is more accurate and faster in practice than the existing clustering algorithms for multi-database mining.
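The abstract mentions a disjoint-set forest for tracking clusters as edges of the similarity graph are selected. The following is a minimal sketch of that data structure only, not the authors' algorithm: it assumes a precomputed symmetric similarity matrix and uses a hypothetical fixed `threshold` as a stand-in for the edge selection that the paper derives from its dual gradient descent optimization. The names `DisjointSet` and `cluster_by_threshold` are illustrative.

```python
class DisjointSet:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # each database starts as its own cluster
        self.rank = [0] * n

    def find(self, x):
        # First pass: locate the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same cluster
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the shallower tree under the deeper one
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True


def cluster_by_threshold(sim, threshold):
    """Merge databases i and j whenever sim[i][j] >= threshold.

    `threshold` is an assumed, hand-picked parameter used only for
    illustration; the paper selects edges through its optimization.
    """
    n = len(sim)
    ds = DisjointSet(n)
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                ds.union(i, j)
    groups = {}
    for i in range(n):
        groups.setdefault(ds.find(i), []).append(i)
    return sorted(groups.values())


# Made-up similarity matrix for four databases (illustrative values only).
sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.15, 0.1],
    [0.1, 0.15, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
clusters = cluster_by_threshold(sim, 0.5)
print(clusters)
```

With this toy matrix, databases 0 and 1 end up in one cluster and databases 2 and 3 in another. The number of clusters is not supplied up front; it follows from which edges are kept, which matches the abstract's point that the cluster count need not be specified a priori.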
Pages: 11144-11172
Page count: 29