Persistent-homology-based machine learning: a survey and a comparative study

Cited by: 62
Authors
Pun, Chi Seng [1]
Lee, Si Xian [1]
Xia, Kelin [1]
Affiliations
[1] Nanyang Technol Univ, Sch Phys & Math Sci, Singapore, Singapore
Keywords
Persistent homology; Machine learning; Persistent diagram; Persistent barcode; Kernel; Feature extraction; TOPOLOGICAL DATA-ANALYSIS; STRUCTURAL CLASSIFICATION; PREDICTION; APPROXIMATION; INFERENCE; SELECTION; NETWORKS; DISTANCE; DEEP; SCOP
DOI
10.1007/s10462-022-10146-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A suitable feature representation that both preserves the intrinsic information of the data and reduces its complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) provides a delicate balance between data simplification and intrinsic structure characterization, and has been applied successfully in various areas. However, the combination of PH and machine learning has been greatly hindered by three challenges, namely the topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. With the development of topological data analysis, progress has been made on all three problems, but the results are widely scattered across the literature. In this paper, we provide a systematic review of PH and PH-based supervised and unsupervised models from a computational perspective. Our emphasis is on the recent development of mathematical models and tools, including PH software and PH-based functions, feature representations, kernels, and similarity models. Essentially, this paper can serve as a roadmap for the practical application of PH-based machine learning tools. Further, we compare two types of simplicial complexes (alpha and Vietoris-Rips complexes), two types of feature extraction (barcode statistics and binned features), and three types of machine learning models (support vector machines, tree-based models, and neural networks), and investigate their impact on protein secondary structure classification.
Pages: 5169-5213
Page count: 45