Efficient Dimensionality Reduction for Sparse Binary Data

被引:0
|
作者
Pratap, Rameshwar
Kulkarni, Raghav [1 ]
Sohony, Ishan [2 ]
机构
[1] CMI, Chennai, Tamil Nadu, India
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
来源
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2018年
关键词
Dimensionality Reduction; Sketching; Binary Data; Similarity Search; Locality Sensitive Hashing;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We propose a dimensionality reduction (sketching) algorithm for high dimensional, sparse, binary data. Our proposed algorithm provides a single sketch which simultaneously preserves multiple similarity measures including Hamming distance, Inner product, and Jaccard Similarity [12]. In contrast to the "local projection" strategy used by most of the earlier algorithms [6], [4], [7], our approach exploits sparsity and combines the following two strategies: 1. partitioning the dimensions into several buckets, 2. obtaining " global linear summaries" within those buckets. Our algorithm is faster than the existing state-of-the-art, and it preserves the binary format of the data after the dimensionality reduction, which makes the sketch space efficient. Our algorithm can also be easily adapted in streaming and incremental learning frameworks. We give a rigorous theoretical analysis of the dimensionality reduction bounds and complement it with extensive experiments. Our proposed algorithm is simple and easy to implement in practice.
引用
收藏
页码:152 / 157
页数:6
相关论文
共 50 条
  • [21] Low-rank and sparse embedding for dimensionality reduction
    Han, Na
    Wu, Jigang
    Liang, Yingyi
    Fang, Xiaozhao
    Wong, Wai Keung
    Teng, Shaohua
    NEURAL NETWORKS, 2018, 108 : 202 - 216
  • [22] DIMENSIONALITY REDUCTION OF HYPERSPECTRAL IMAGERY WITH SPARSE AND COLLABORATIVE GRAPHS
    Ly, Nam
    Du, Qian
    Fowler, James E.
    Younan, Nicolas
    2014 6TH WORKSHOP ON HYPERSPECTRAL IMAGE AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2014,
  • [23] A non-negative sparse semi-supervised dimensionality reduction algorithm for hyperspectral data
    Wang, Xuesong
    Gao, Yang
    Cheng, Yuhu
    NEUROCOMPUTING, 2016, 188 : 275 - 283
  • [24] Exploring Data-Independent Dimensionality Reduction in Sparse Representation-Based Speaker Identification
    Haris, B. C.
    Sinha, Rohit
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2014, 33 (08) : 2521 - 2538
  • [25] Exploring Data-Independent Dimensionality Reduction in Sparse Representation-Based Speaker Identification
    B. C. Haris
    Rohit Sinha
    Circuits, Systems, and Signal Processing, 2014, 33 : 2521 - 2538
  • [26] Dimensionality Reduction for Hyperspectral Data Based on Pairwise Constraint Discriminative Analysis and Nonnegative Sparse Divergence
    Wang, Xuesong
    Kong, Yi
    Gao, Yang
    Cheng, Yuhu
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2017, 10 (04) : 1552 - 1562
  • [27] DATA DIMENSIONALITY REDUCTION METHODS FOR ORDINAL DATA
    Prokop, Martin
    Rezankova, Hana
    INTERNATIONAL DAYS OF STATISTICS AND ECONOMICS, 2011, : 523 - 533
  • [28] Toward a Unified Theory of Sparse Dimensionality Reduction in Euclidean Space
    Bourgain, Jean
    Dirksen, Sjoerd
    Nelson, Jelani
    STOC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2015, : 499 - 508
  • [29] Low-Rank Sparse Preserving Projections for Dimensionality Reduction
    Xie, Luofeng
    Yin, Ming
    Yin, Xiangyun
    Liu, Yun
    Yin, Guofu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2018, 27 (11) : 5261 - 5274
  • [30] Adaptive sparse graph learning based dimensionality reduction for classification
    Chen, Puhua
    Jiao, Licheng
    Liu, Fang
    Zhao, Zhiqiang
    Zhao, Jiaqi
    APPLIED SOFT COMPUTING, 2019, 82