Visual Analytics of Co-Occurrences to Discover Subspaces in Structured Data

Cited by: 1
Authors
Jentner, Wolfgang [1]
Lindholz, Giuliana [2]
Hauptmann, Hanna [3]
El-Assady, Mennatallah [4]
Ma, Kwan-Liu [5]
Keim, Daniel [1]
Affiliations
[1] Univ Konstanz, Univ Str 10, D-78457 Constance, Baden Wurttembe, Germany
[2] 4Soft GmbH, Mittererstr 3, D-80336 Munich, Bayern, Germany
[3] Univ Utrecht, Princetonpl 5, NL-3584 Utrecht, Netherlands
[4] ETH AI Ctr, Binzmuhlestr 11-13, CH-8092 Zurich, Switzerland
[5] Univ Calif Davis, 2063 Kemper Hall,1 Shields Ave, Davis, CA 95616 USA
Funding
EU Horizon 2020;
Keywords
Structured data mining; pattern mining; subspace search; TEMPORAL EVENT SEQUENCES; INTERESTINGNESS; VISUALIZATION; EXPLORATION; PATTERNS; SETS;
DOI
10.1145/3579031
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present an approach that shows all relevant subspaces of categorical data condensed in a single picture. We model the categorical values of the attributes as co-occurrences with data partitions generated from structured data using pattern mining. We show that these co-occurrences are a-priori, allowing us to greatly reduce the search space, effectively generating the condensed picture where conventional approaches filter out several subspaces as these are deemed insignificant. The task of identifying interesting subspaces is common but difficult due to exponential search spaces and the curse of dimensionality. One application of such a task might be identifying a cohort of patients defined by attributes such as gender, age, and diabetes type that share a common patient history, which is modeled as event sequences. Filtering the data by these attributes is common but cumbersome and often does not allow a comparison of subspaces. We contribute a powerful multi-dimensional pattern exploration approach (MDPE-approach) agnostic to the structured data type that models multiple attributes and their characteristics as co-occurrences, allowing the user to identify and compare thousands of subspaces of interest in a single picture. In our MDPE-approach, we introduce two methods to dramatically reduce the search space, outputting only the boundaries of the search space in the form of two tables. We implement the MDPE-approach in an interactive visual interface (MDPE-vis) that provides a scalable, pixel-based visualization design allowing the identification, comparison, and sense-making of subspaces in structured data. Our case studies using a gold-standard dataset and external domain experts confirm our approach's and implementation's applicability. A third use case sheds light on the scalability of our approach and a user study with 15 participants underlines its usefulness and power.
Pages: 49