An Efficient Method for Mining Rare Association Rules: A Case Study on Air Pollution

Citations: 7
Authors
Borah, Anindita [1]
Nath, Bhabesh [1]
Affiliations
[1] Tezpur Univ, Dept Comp Sci & Engn, Tezpur, Assam, India
Keywords
Air pollution; data mining; association rule; rare association rule; rare pattern; PATTERN; FREQUENT;
DOI
10.1142/S0218213021500184
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most pattern mining techniques focus almost exclusively on identifying frequent patterns, and very little attention has been paid to the generation of rare patterns. However, in several domains, recognizing less frequent but strongly related patterns offers a greater advantage than recognizing frequent ones. Identifying compelling and meaningful rare associations among such patterns may prove significant for air quality management, which has become an indispensable task in today's world. The rare correlations between air pollutants and other parameters may aid in restricting air pollution to a manageable level. To this end, efficient rare pattern mining techniques are needed that can generate the complete set of rare patterns and then identify significant rare association rules among them. Moreover, a notable issue with databases is their continuous update over time due to the addition of new records. The user's requirements or behavior may change with the incremental update of databases, which makes it difficult to determine a suitable support threshold for the extraction of interesting rare association rules. This paper presents an efficient rare pattern mining technique to capture the complete set of rare patterns from a real environmental dataset. The proposed approach does not restart the entire mining process upon threshold update and generates the complete set of rare association rules in a single database scan. It can effectively perform incremental mining and also gives the user the flexibility to regulate the value of the support threshold for generating the rare patterns. Significant rare association rules representing correlations between air pollutants and other environmental parameters are further extracted from the generated rare patterns to identify the substantial causes of air pollution. Performance analysis shows that the proposed method is more efficient than existing rare pattern mining approaches in providing significant directions to the domain experts for air pollution monitoring.
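As a rough illustration of the terms used in the abstract (rare pattern, support threshold, rare association rule), the following Python sketch counts itemsets in one pass over a small transaction list and then filters them by a user-chosen support threshold. It is not the authors' algorithm, which the abstract does not specify: the brute-force enumeration, the function names, the `max_size` cap, and the toy air-quality items are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=3):
    """Count every itemset of up to max_size items in one pass over the data."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return counts


def rare_itemsets(counts, n, max_support):
    """Keep itemsets whose relative support is below max_support."""
    return {s: c / n for s, c in counts.items() if c / n < max_support}


def rare_rules(counts, rare, min_conf):
    """Derive rules antecedent -> consequent from rare itemsets."""
    rules = []
    for itemset in rare:
        if len(itemset) < 2:
            continue
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                antecedent = frozenset(antecedent)
                # confidence = support(itemset) / support(antecedent)
                conf = counts[itemset] / counts[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


# Hypothetical, hand-made transactions (not data from the paper).
transactions = [
    {"PM10=high", "wind=low", "NO2=high"},
    {"PM10=low", "wind=high"},
    {"PM10=low", "wind=high"},
    {"PM10=low", "wind=low"},
    {"PM10=low", "wind=high", "NO2=low"},
]
n = len(transactions)
counts = count_itemsets(transactions)       # single pass over the data
rare = rare_itemsets(counts, n, 0.3)        # threshold chosen by the user
rules = rare_rules(counts, rare, 0.8)
# Changing the threshold reuses the stored counts; no rescan is needed.
rare_looser = rare_itemsets(counts, n, 0.5)
```

Because the counts are stored once, a different threshold only re-runs the filter, which loosely mirrors the abstract's point about not restarting the mining process when the threshold changes. The exhaustive enumeration here grows exponentially with transaction width, so it is only suitable for small examples.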
Pages: 35