Extracting insights from the shape of complex data using topology

Cited by: 307
Authors
Lum, P. Y. [1]
Singh, G. [1]
Lehman, A. [1]
Ishkanov, T. [1]
Vejdemo-Johansson, M. [2]
Alagappan, M. [1]
Carlsson, J. [3]
Carlsson, G. [1, 4]
Affiliations
[1] Ayasdi Inc, Palo Alto, CA USA
[2] Sch Comp Sci, St Andrews KY16 9SX, Fife, Scotland
[3] Univ Minnesota, Minneapolis, MN 55455 USA
[4] Stanford Univ, Dept Math, Stanford, CA 94305 USA
Funding
National Science Foundation (United States)
Keywords
BREAST CARCINOMAS; PREDICT;
DOI
10.1038/srep01236
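Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09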
Abstract
This paper applies topological methods to study complex high-dimensional data sets by extracting shapes (patterns) and obtaining insights about them. Our method combines the best features of existing standard methodologies, such as principal component and cluster analyses, to provide a geometric representation of complex data sets. Through this hybrid method, we often find subgroups in data sets that traditional methodologies fail to find. Our method also permits the analysis of individual data sets as well as the analysis of relationships between related data sets. We illustrate the use of our method by applying it to three very different kinds of data, namely gene expression from breast tumors, voting data from the United States House of Representatives, and player performance data from the NBA, in each case finding stratifications of the data that are more refined than those produced by standard methods.
Pages: 8