Algorithms for Telemetry Data Mining using Discrete Attributes

Cited by: 0
Authors
Ofer, Roy B. [1 ]
Eldar, Adi [1 ]
Shalev, Adi [1 ,2 ]
Resheff, Yehezkel S. [1 ,2 ]
Affiliations
[1] Microsoft ILDC, Herzelyia, Israel
[2] Hebrew Univ Jerusalem, Jerusalem, Israel
Source
ICPRAM: PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS | 2017
Keywords
Data Mining; Pattern Mining; Software Telemetry; Failure Analysis; Subspace Clustering;
DOI
10.5220/0006117903090317
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As the cost of collecting and storing large amounts of data continues to drop, we see a constant rise in the amount of telemetry data collected by software applications and services. With the data piling up, there is an increasing need for algorithms that automatically and efficiently mine insights from it. One interesting case is the description of large tables using frequently occurring patterns, with implications for failure analysis and customer engagement. Finding frequently occurring patterns has applications both in interactive usage, where an analyst repeatedly queries the data, and in a fully automated process that queries the data periodically and generates alerts and/or reports based on the mining results. Here we propose two novel mining algorithms for computing such predominant patterns in relational data. The first method is a fast heuristic search, and the second is based on an adaptation of the Apriori algorithm. Our methods are demonstrated on real-world datasets, and extensions to some additional fundamental mining tasks are discussed.
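To make the idea of frequent patterns over discrete attributes concrete, here is a minimal sketch of a generic level-wise (Apriori-style) search over a table whose rows are dictionaries of column/value pairs. It is not the adaptation proposed in the paper, whose details this record does not give; the function name `frequent_patterns`, the row format, and the `min_support` parameter are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_patterns(rows, min_support):
    """Generic Apriori-style search for frequent attribute=value patterns.

    rows: list of dicts mapping column name -> discrete (hashable) value.
    min_support: minimum fraction of rows that a pattern must match.
    Returns a dict mapping frozenset of (column, value) pairs -> support.
    Hypothetical helper for illustration; not the paper's algorithm.
    """
    n = len(rows)
    if n == 0:
        return {}
    # Treat each row as a set of (column, value) items.
    transactions = [frozenset(row.items()) for row in rows]

    def support(pattern):
        # Fraction of rows that contain every (column, value) in the pattern.
        return sum(pattern <= t for t in transactions) / n

    # Level 1: single (column, value) items that meet the threshold.
    level = {}
    for item in {item for t in transactions for item in t}:
        pattern = frozenset([item])
        s = support(pattern)
        if s >= min_support:
            level[pattern] = s
    result = dict(level)

    k = 2
    while level:
        candidates = set()
        for a, b in combinations(list(level), 2):
            c = a | b
            if len(c) != k:
                continue
            # A pattern cannot bind one column to two different values.
            if len({col for col, _ in c}) != k:
                continue
            # Apriori pruning: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in level for sub in combinations(c, k - 1)):
                candidates.add(c)
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
        result.update(level)
        k += 1
    return result
```

Calling `frequent_patterns(rows, 0.5)` on a small telemetry table returns every combination of column values that appears in at least half of the rows. The pruning step relies on the standard Apriori property that a pattern can only be frequent if all of its sub-patterns are frequent.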
Pages: 309-317
Page count: 9