Research of the Optimization of a Data Mining Algorithm Based on an Embedded Data Mining System

Cited by: 1
Authors
Wang, Xindi [1]
Chen, Mengfei [1]
Chen, Li [2]
Affiliations
[1] Beijing Jiaotong Univ, Informat Management Dept, Beijing 100044, CO, Peoples R China
[2] Beijing Jiaotong Univ, Logist Management Dept, Beijing 100044, CO, Peoples R China
Keywords
Embedded database; data mining; association rules; Apriori algorithm; duplication; frequent item sets;
DOI
10.2478/cait-2013-0033
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
At present, most data mining systems are independent of the database system, so data loading and conversion take considerable time. The algorithms used in the data mining process also run for a long time. Although some optimized algorithms have improved on this in various respects, they cannot raise efficiency substantially when a database contains many duplicate records. To improve the efficiency of data mining when a database contains many identical records, an optimized Apriori algorithm is proposed. First, a new concept of duplication is introduced and used to remove identical records and count them, generating a new, smaller database. Second, the original database is compressed according to the users' requirements. Finally, the frequent item sets are found on the basis of binary coding, and strong association rules are obtained. The structure of a data mining system based on an embedded database is also designed in this paper. Theoretical analysis and experimental verification show that the optimized algorithm is appropriate and that applying it in an embedded data mining system can further improve mining efficiency.
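The abstract names two ideas that a short example can make concrete: collapsing identical records into one entry with a count, and finding frequent item sets on binary-coded records. The following Python code is a minimal sketch under stated assumptions, not the authors' implementation. The function names `encode` and `frequent_itemsets`, the bit-mask encoding, and the use of an absolute support count are all hypothetical choices made for illustration; the user-driven compression step and the derivation of strong association rules are not shown.

```python
from collections import Counter
from itertools import combinations


def encode(transaction, item_index):
    """Encode a transaction as an integer bit mask (one bit per item)."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(items): support} for item sets meeting min_support.

    Assumption: min_support is an absolute record count, not a ratio.
    """
    items = sorted({item for t in transactions for item in t})
    item_index = {item: pos for pos, item in enumerate(items)}

    # Collapse identical records: each distinct bit mask is stored once,
    # together with the number of times it occurs.
    compressed = Counter(encode(t, item_index) for t in transactions)

    frequent = {}
    candidates = [1 << pos for pos in range(len(items))]
    size = 1
    while candidates:
        level = {}
        for cand in candidates:
            # A record contains the candidate if all candidate bits are set;
            # the duplicate count is added instead of rescanning duplicates.
            support = sum(n for mask, n in compressed.items()
                          if mask & cand == cand)
            if support >= min_support:
                level[cand] = support
        frequent.update(level)

        # Apriori-style join: merge frequent sets of the current size into
        # candidates that have exactly one more item.
        size += 1
        candidates = sorted({a | b for a, b in combinations(level, 2)
                             if bin(a | b).count("1") == size})

    return {
        frozenset(items[p] for p in range(len(items)) if (mask >> p) & 1): s
        for mask, s in frequent.items()
    }


if __name__ == "__main__":
    demo = [["a", "b"], ["a", "b"], ["a", "c"], ["b"], ["a", "b"]]
    for itemset, support in frequent_itemsets(demo, 3).items():
        print(sorted(itemset), support)
```

In the demo data the record `["a", "b"]` occurs three times but is stored once with a count of 3, so each support check touches three distinct masks instead of five records.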
Pages: 5-17
Page count: 13