Feature Selection-Based Clustering on Micro-blogging Data

Cited by: 6
Authors
Dutta, Soumi [1 ,2 ]
Ghatak, Sujata [1 ]
Das, Asit Kumar [2 ]
Gupta, Manan [1 ]
Dasgupta, Sayantika [1 ]
Affiliations
[1] Inst Engn & Management, Kolkata 700091, India
[2] Indian Inst Engn Sci & Technol Shibpur, Howrah 711103, India
Source
COMPUTATIONAL INTELLIGENCE IN DATA MINING | 2019 / Vol. 711
Keywords
Clustering; Feature selection; Micro-blogs;
DOI
10.1007/978-981-10-8055-5_78
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The growing popularity of micro-blogging has opened up a flexible platform for public communication. For any topic, trending or not, thousands of posts appear in micro-blogs every day. During important events such as natural calamities and elections, and sports events such as the IPL and the World Cup, a huge number of messages (micro-blogs) are posted. This fast, high-volume exchange of messages causes information overload, and clustering, or grouping similar messages, is an effective way to reduce it. The short length and noisy nature of the messages make micro-blog clustering challenging, and the continually growing volume of data is a further challenge. This work therefore proposes a novel clustering approach for micro-blogs that incorporates a feature selection technique. The approach is applied to several experimental datasets and compared with several existing clustering techniques; it yields better results than the other methods.
Pages: 885-895
Page count: 11