Privacy preserving sub-feature selection based on fuzzy probabilities

Cited by: 15
Authors
Bhuyan, Hemanta Kumar [1]
Kamila, Narendra Kumar [2]
Affiliations
[1] Mahavir Inst Engn & Technol, Dept Comp Sci & Engn, Odisha, India
[2] CV Raman Coll Engn, Dept Comp Sci & Engn, Odisha, India
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2014 / Volume 17 / Issue 04
Keywords
Distributed data mining; Fuzzy probabilities; Privacy; Feature selection; NETWORKS;
DOI
10.1007/s10586-014-0393-9
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Feature selection addresses the problem of building accurate classification models in data mining. Aggregating data from a distributed environment for feature selection raises the problem of accessing the relevant inputs of individual data records, and preserving the privacy of individual data is often a critical issue in distributed data mining. This paper proposes privacy preservation of individual data for both feature and sub-feature selection, based on data mining techniques and fuzzy probabilities. For privacy, each party protects its data as instructed by the data miner, using fuzzy probabilities as alias values. The techniques are developed for the data miner's own database in a distributed network with a fuzzy system, and the evaluation of sub-feature values is included in the processing of the data mining task. Feature selection is explained using an existing data mining technique, namely gain ratio with fuzzy optimization. The gain ratio, estimated from the relevant inputs for feature selection, is evaluated within the expected upper and lower bounds of the fuzzy data set. The paper mainly focuses on sub-feature selection with a privacy algorithm that uses fuzzy random variables among different parties in a distributed environment. Sub-features are uniquely identified for better class prediction. The algorithm selects sub-features using fuzzy probabilities with fuzzy frequency data from the data miner's database. Experimental results on a real-world data set show the performance of the approach.
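For readers unfamiliar with the gain ratio criterion named in the abstract, the following is a minimal sketch of the classical (crisp) gain ratio for a single categorical feature. It is not the paper's fuzzy-probability formulation, which this record does not spell out; the function names `entropy` and `gain_ratio` and the toy data are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Classical gain ratio: information gain divided by split information.

    Illustrative only: crisp counts are used here, whereas the paper
    works with fuzzy probabilities and fuzzy frequency data.
    """
    n = len(labels)
    # Group class labels by the feature value they co-occur with.
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Expected class entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own distribution.
    split_info = entropy(feature_values)
    # A constant feature carries no split information; report 0.0.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: one feature separates the classes, one does not.
labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
ratio_informative = gain_ratio(informative, labels)
ratio_uninformative = gain_ratio(uninformative, labels)
```

A feature with a higher gain ratio is preferred during selection; in this toy data the feature that separates the classes scores higher than the one that does not.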
Pages: 1383-1399
Page count: 17