Visualizing High Dimensional Datasets Using Parallel Coordinates: Application to Gene Prioritization

Cited by: 0
Authors
Boogaerts, Thomas [1]
Tranchevent, Leon-Charles [1]
Pavlopoulos, Georgios A. [1]
Aerts, Jan [1]
Vandewalle, Joos [1]
Affiliations
[1] Katholieke Univ Leuven, ESAT SCD SISTA IBBT, KU Leuven Future Hlth Dept, B-3001 Louvain, Belgium
Source
IEEE 12TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS & BIOENGINEERING | 2012
Keywords
data visualization; parallel coordinates; genetic algorithm; gene prioritization
DOI
Not available
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
In this paper, we introduce a visualization tool for interactive and efficient exploration of high-dimensional data using parallel coordinates. An algorithm is developed to find an optimal permutation of the dimensions, which allows the data miner to immediately see the most important features or irregularities in the dataset. It is implemented as a genetic algorithm based on the travelling salesman problem, using maximal correlation as the fitness. Other features of the tool include selection operators to group the data, such as selection by intersection or by angle; orthogonal and density plots that complement the parallel coordinates plot; manual arrangement of the order of the dimensions; the option to show all plots needed to see every pairwise relation between dimensions; and the display of a chosen number of standard deviations for each dimension separately. The tool is applied to multiple gene prioritization cases in search of genes that are relevant to certain genetic disorders. The datasets used were obtained with the MerKator and Endeavour tools and include breast cancer, cataract, Charcot-Marie-Tooth and cardiomyopathy datasets, as well as a dataset relating 29 diseases to 22206 genes. Our tool, manual and data can be downloaded from http://www.toomas.be/parcoord/.
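The abstract does not give the details of the axis-ordering algorithm, so the following is only a minimal sketch of the general idea: a genetic algorithm searches over permutations of the dimensions, travelling-salesman style, and scores an ordering by how strongly neighbouring axes correlate. The choice of absolute Pearson correlation, order crossover, swap mutation and truncation selection, as well as all function names and parameter values, are assumptions made for illustration and are not taken from the paper or the tool.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Absolute pairwise correlation between all dimensions (assumed fitness basis)."""
    d = len(columns)
    return [[abs(pearson(columns[i], columns[j])) for j in range(d)] for i in range(d)]


def fitness(order, corr):
    """Sum of |correlation| over axes that are adjacent in the given ordering."""
    return sum(corr[a][b] for a, b in zip(order, order[1:]))


def order_crossover(p1, p2, rng):
    """Order crossover: keep a slice of p1 in place, fill the rest in p2's order."""
    i, j = sorted(rng.sample(range(len(p1) + 1), 2))
    kept = p1[i:j]
    kept_set = set(kept)
    rest = [g for g in p2 if g not in kept_set]
    return rest[:i] + kept + rest[i:]


def swap_mutation(order, rng, rate):
    """With probability `rate`, swap two randomly chosen axes."""
    order = order[:]
    if rng.random() < rate:
        a, b = rng.sample(range(len(order)), 2)
        order[a], order[b] = order[b], order[a]
    return order


def optimise_axis_order(columns, generations=200, pop_size=40,
                        mutation_rate=0.3, seed=0):
    """Return (best_order, best_fitness) for a list of columns (needs >= 2 columns)."""
    rng = random.Random(seed)
    d = len(columns)
    corr = correlation_matrix(columns)
    population = [rng.sample(range(d), d) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=lambda o: fitness(o, corr), reverse=True)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = order_crossover(p1, p2, rng)
            children.append(swap_mutation(child, rng, mutation_rate))
        population = elite + children
    best = max(population, key=lambda o: fitness(o, corr))
    return best, fitness(best, corr)


if __name__ == "__main__":
    # Toy data: columns 0 and 2 are linearly related, as are columns 1 and 3.
    a = [1, 2, 3, 4, 5, 6]
    c = [1, -1, 1, -1, 1, -1]
    columns = [a, c, [2 * x for x in a], [-x for x in c]]
    order, score = optimise_axis_order(columns)
    print("axis order:", order, "fitness:", round(score, 3))
```

Because the fitness depends only on adjacent pairs, an ordering and its reverse score the same, which mirrors the symmetric travelling salesman formulation. For a real dataset one would pass one list per dimension (for example, one per data source score) and might replace Pearson correlation with another association measure.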
Pages: 52-57
Page count: 6