PSO Based Fast K-means Algorithm for Feature Selection from High Dimensional Medical data set

Times cited: 0
Authors
Doreswamy [1]
Salma, Umme M. [1]
Affiliations
[1] Mangalore Univ, Dept Comp Sci, Mangalagangothri 574199, Karnataka, India
Source
PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO'16) | 2016
Keywords
Data mining; Feature selection; PSO Algorithm; Fast K-means; Breast Cancer; MEANS CLUSTERING-ALGORITHM; FEATURE-EXTRACTION; DIAGNOSIS; HYBRID;
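DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract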
Features are the most important entities in any data mining or machine learning application. They are the backbone of any model. The reliability, efficiency and accuracy of a model depend on the choice of strong and relevant features. However, feature selection is always a time-consuming and challenging task. In this paper, we propose an approach that combines a clustering technique with a stochastic technique to quickly select effective features from a high-dimensional breast cancer data set. To select strong and relevant features, we use an improved version of the K-means algorithm called fast K-means, which is much faster and more accurate than the general K-means algorithm. The fast K-means algorithm is embedded in the Particle Swarm Optimization (PSO) algorithm to produce better results. The results were validated using various classification techniques and evaluated on various performance measures, and were found to be highly supportive. The feature subset generated by the PSO-based fast K-means algorithm on the KDD Cup 2008 data set produced an accuracy of 99.39%, and its time complexity was found to be O(log(k)).
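The idea of embedding a clustering step inside PSO for feature selection can be illustrated with a minimal sketch. This is not the authors' implementation: plain Lloyd K-means stands in for the paper's fast K-means (which this record does not specify), cluster purity is an assumed fitness function, the sigmoid-based binary PSO update is one common variant, and the data is synthetic. All function names and parameter values (`kmeans_labels`, `fitness`, `pso_select`, `w`, `c1`, `c2`) are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd K-means (assumed stand-in for the paper's fast K-means)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def fitness(mask, X, y, k, seed):
    """Cluster purity of K-means on the selected features (assumed objective)."""
    if not mask.any():
        return 0.0  # an empty feature subset is never useful
    labels = kmeans_labels(X[:, mask], k, np.random.default_rng(seed))
    # Each non-empty cluster votes for its majority class.
    correct = sum(np.bincount(y[labels == j]).max()
                  for j in range(k) if np.any(labels == j))
    return correct / len(y)

def pso_select(X, y, k=2, n_particles=12, n_iter=25,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the feature columns."""
    rng = np.random.default_rng(seed)
    shape = (n_particles, X.shape[1])
    pos = (rng.random(shape) < 0.5).astype(float)
    vel = rng.uniform(-1.0, 1.0, shape)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p.astype(bool), X, y, k, seed) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(shape), rng.random(shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)
        # Sigmoid transfer: velocity sets the probability that a bit is 1.
        pos = (rng.random(shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([fitness(p.astype(bool), X, y, k, seed) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)

# Synthetic data (assumption): 2 informative columns followed by 6 noise columns.
data_rng = np.random.default_rng(1)
y = data_rng.integers(0, 2, 200)
informative = y[:, None] * 4.0 + data_rng.normal(size=(200, 2))
X = np.hstack([informative, data_rng.normal(size=(200, 6))])
mask, score = pso_select(X, y)
print("selected features:", np.flatnonzero(mask), "purity:", round(score, 3))
```

In this sketch the labels are used only to score a clustering, so the search behaves as a wrapper method; the abstract does not say how the authors score candidate subsets, and the reported 99.39% accuracy and O(log(k)) complexity cannot be reproduced from it.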
Pages: 6
Related papers
30 in total
  • [1] Adams J., 2012, MIDW ART INT COGN SC, P83
  • [2] [Anonymous], 2007, P 18 ANN ACM SIAM S
  • [3] [Anonymous], 2011, Advances in neural information processing systems
  • [4] [Anonymous], 2011, Pei. data mining concepts and techniques, DOI 10.1016/C2009-0-61819-5
  • [5] [Anonymous], 1998, Feature Extraction, Construction and Selection: A Data Mining Perspective
  • [6] Disease Forecasting System Using Data Mining Methods
    Banu, M. A. Nishara
    Gomathy, B.
    [J]. 2014 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING APPLICATIONS (ICICA 2014), 2014, : 130 - 133
  • [7] Cui XH, 2005, 2005 IEEE SWARM INTELLIGENCE SYMPOSIUM, P185
  • [8] Breast cancer statistics, 2011
    DeSantis, Carol
    Siegel, Rebecca
    Bandi, Priti
    Jemal, Ahmedin
    [J]. CA-A CANCER JOURNAL FOR CLINICIANS, 2011, 61 (06) : 409 - 418
  • [9] PERCENTAGE POINTS OF A TEST FOR CLUSTERS
    ENGELMAN, L
    HARTIGAN, JA
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1969, 64 (328) : 1647 - &
  • [10] Data clustering: A review
    Jain, AK
    Murty, MN
    Flynn, PJ
    [J]. ACM COMPUTING SURVEYS, 1999, 31 (03) : 264 - 323