A Framework for Simultaneous Co-clustering and Learning from Complex Data

Authors
Deodhar, Meghana [1]
Ghosh, Joydeep [1]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
Source
KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING | 2007
Keywords
Co-clustering; classification; regression; prediction models; multimodal data;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
For difficult classification or regression problems, practitioners often segment the data into relatively homogeneous groups and then build a model for each group. This two-step procedure usually results in simpler, more interpretable and actionable models without any loss in accuracy. We consider problems such as predicting customer behavior across products, where the independent variables can be naturally partitioned into two groups. A pivoting operation can now result in the dependent variable showing up as entries in a "customer by product" data matrix. We present a model-based co-clustering (meta)-algorithm that interleaves clustering and construction of prediction models to iteratively improve both cluster assignment and fit of the models. This algorithm provably converges to a local minimum of a suitable cost function. The framework not only generalizes co-clustering and collaborative filtering to model-based co-clustering, but can also be viewed as simultaneous co-segmentation and classification or regression, which is better than independently clustering the data first and then building models. Moreover, it applies to a wide range of bi-modal or multimodal data, and can be easily specialized to address classification and regression problems. We demonstrate the effectiveness of our approach on both these problems through experimentation on real and synthetic data.
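The interleaving described in the abstract can be illustrated with a rough sketch for the regression case. This is not the authors' implementation: the function name `scoal_sketch`, the linear least-squares model per co-cluster, the feature vector `[1, customer attributes, product attributes]`, the squared-error cost, and the fully observed "customer by product" matrix `Z` are all assumptions chosen for illustration. Each pass refits one model per co-cluster, then reassigns rows and then columns to whichever cluster gives the lowest error under the fixed models, so the recorded cost should not increase from one pass to the next.

```python
import numpy as np

def scoal_sketch(Z, C, P, k=2, l=2, n_iter=20, seed=0):
    """Alternate between fitting per-co-cluster linear models and reassigning
    rows/columns. Z: (m, n) responses, C: (m, p) row (customer) attributes,
    P: (n, q) column (product) attributes. Hypothetical illustration only."""
    rng = np.random.default_rng(seed)
    m, n = Z.shape
    p, q = C.shape[1], P.shape[1]
    d = 1 + p + q
    # X[i, j] = [1, c_i, p_j]: the assumed feature vector for cell (i, j).
    X = np.concatenate(
        [np.ones((m, n, 1)),
         np.broadcast_to(C[:, None, :], (m, n, p)),
         np.broadcast_to(P[None, :, :], (m, n, q))],
        axis=2,
    )
    rho = rng.integers(k, size=m)      # row-cluster assignment
    gamma = rng.integers(l, size=n)    # column-cluster assignment
    beta = np.zeros((k, l, d))         # one coefficient vector per co-cluster
    history = []
    for _ in range(n_iter):
        # Step 1: refit a least-squares model inside every non-empty co-cluster.
        for g in range(k):
            for h in range(l):
                mask = (rho[:, None] == g) & (gamma[None, :] == h)
                if mask.any():
                    beta[g, h] = np.linalg.lstsq(X[mask], Z[mask], rcond=None)[0]
        # err[i, j, g, h]: squared error of cell (i, j) under model (g, h).
        pred = np.einsum("ijd,ghd->ijgh", X, beta)
        err = (Z[:, :, None, None] - pred) ** 2
        R, G = np.eye(k)[rho], np.eye(l)[gamma]   # one-hot memberships
        # Cost of the current assignment right after refitting.
        history.append(float(np.einsum("ijgh,ig,jh->", err, R, G)))
        # Step 2: move each row to its cheapest row cluster (columns fixed).
        rho = np.einsum("ijgh,jh->ig", err, G).argmin(axis=1)
        # Step 3: move each column to its cheapest column cluster (rows fixed).
        R = np.eye(k)[rho]
        gamma = np.einsum("ijgh,ig->jh", err, R).argmin(axis=1)
    return rho, gamma, beta, history

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    C, P = rng.normal(size=(30, 2)), rng.normal(size=(20, 3))
    Z = rng.normal(size=(30, 20))
    rho, gamma, beta, history = scoal_sketch(Z, C, P, k=2, l=2, n_iter=10)
    print(history[0], history[-1])
```

Because every step either refits models for fixed assignments or reassigns for fixed models, the sketch only reaches a local minimum of its squared-error cost, and the result depends on the random initialization; running it from several seeds and keeping the lowest-cost run is a common workaround.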
Pages: 250-259
Page count: 10