Fast Computation of Persistent Homology with Data Reduction and Data Partitioning

被引:0
作者
Malott, Nicholas O. [1 ]
Wilsey, Philip A. [1 ]
机构
[1] Univ Cincinnati, Dept EECS, Cincinnati, OH 45221 USA
来源
2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2019年
基金
美国国家科学基金会;
关键词
topological data analysis; persistent homology; data reduction; data partitioning; data mining; unsupervised learning; TOPOLOGY;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Persistent hornology is a method of data analysis that is based in the mathematical field of topology. Unfortunately. the run-time and memory complexities associated with computing persistent homology inhibit general use for the analysis of big data. For example, the best tools currently available to compute persistent homology can process only a few thousand data points in R-3. Several studies have proposed using sampling or data reduction methods to attack this limit. While these approaches enable the computation of persistent homology on much larger data sets, the methods are approximate. Furthermore, while they largely preserve the results of large topological features, they generally miss reporting information about the small topological features that are present in the data set. While this abstraction is useful in many cases, there are data analysis needs where the smaller features are also significant (e.g., brain artery analysis). This paper explores a combination of data reduction and data partitioning to compute persistent homology on big data that enables the identification of both large and small topological features from the input data set. To reduce the approximation errors that typically accompany data reduction for persistent homology, the described method also includes a mechanism of "upscaling" the data circumscribing the large topological features that are computed from the sampled data. The designed experimental method provides significant results for improving the scale at which persistent homology can be performed.
引用
收藏
页码:880 / 889
页数:10
相关论文
共 26 条
[1]  
Adams H, 2017, J MACH LEARN RES, V18
[2]  
[Anonymous], 2014, TOPOLOGICAL METHODS
[3]   Strong Homotopy Types, Nerves and Collapses [J].
Barmak, Jonathan Ariel ;
Minian, Elias Gabriel .
DISCRETE & COMPUTATIONAL GEOMETRY, 2012, 47 (02) :301-328
[4]  
Bauer U., 2019, RIPSER
[5]   PERSISTENT HOMOLOGY ANALYSIS OF BRAIN ARTERY TREES [J].
Bendich, Paul ;
Marron, J. S. ;
Miller, Ezra ;
Pieloch, Alex ;
Skwerer, Sean .
ANNALS OF APPLIED STATISTICS, 2016, 10 (01) :198-218
[6]   The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes [J].
Boissonnat, Jean-Daniel ;
Maria, Clement .
ALGORITHMICA, 2014, 70 (03) :406-427
[7]   TOPOLOGY AND DATA [J].
Carlsson, Gunnar .
BULLETIN OF THE AMERICAN MATHEMATICAL SOCIETY, 2009, 46 (02) :255-308
[8]  
Chazal F., 2017, ARXIV E PRINTS
[9]  
Chazal F., 2015, ICML 2015
[10]  
Chen C., 27 EUR WORKSH COMP G, P197