Iterative denoising

Cited by: 3
Authors
Giles, Kendall E. [1]
Trosset, Michael W. [2]
Marchette, David J. [3]
Priebe, Carey E. [4]
Affiliations
[1] Virginia Commonwealth Univ, Dept Stat Sci & Operat Res, Richmond, VA 23284 USA
[2] Indiana Univ, Dept Stat, Bloomington, IN 47405 USA
[3] USN, Dahlgren Div, Ctr Surface Warfare, Dahlgren, VA 22448 USA
[4] Johns Hopkins Univ, Dept Appl Math & Stat, Baltimore, MD 21218 USA
Keywords
knowledge discovery; text mining; classification; clustering
DOI
10.1007/s00180-007-0090-8
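Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714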
Abstract
One problem in many fields is knowledge discovery in heterogeneous, high-dimensional data. As an example, in text mining an analyst often wishes to identify meaningful, implicit, and previously unknown information in an unstructured corpus. Lack of metadata and the complexities of document space make this task difficult. We describe Iterative Denoising, a methodology for knowledge discovery in large heterogeneous datasets that allows a user to visualize and to discover potentially meaningful relationships and structures. In addition, we demonstrate the features of this methodology in the analysis of a heterogeneous Science News corpus.
Pages: 497-517
Page count: 21