Data Mining in Programs: Clustering Programs Based on Structure Metrics and Execution Values

Cited by: 2
Authors
Wang, TianTian [1 ]
Wang, KeChao [2 ]
Su, XiaoHong [1 ]
Liu, Lin [2 ]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
[2] Harbin Univ, Sch Informat Engn, Harbin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Clustering; Data Mining; Program Repair; Structural Metrics; Value Sequence;
DOI
10.4018/IJDWM.2020040104
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Software is embedded in many control systems, such as security-critical systems. Existing program clustering methods have limited ability to identify functionally equivalent programs with different syntactic representations. To address this problem, a clustering method based on structural metric vectors is first proposed to quickly identify structurally similar programs among a large number of existing programs. A second clustering method, based on similar execution value sequences, is then proposed to accurately identify functionally equivalent programs that contain code variations. The approach has been applied to automatic program repair, where it identifies sample programs from a large pool of template programs. The average purity is 0.95576 and the average entropy is 0.15497, which indicates that the clustering partition is consistent with the expected partition.
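The abstract reports purity and entropy without defining them. As a rough illustration only, the sketch below computes the commonly used textbook versions of both measures for a clustering compared against an expected partition. The function names, the toy data, and the choice of a base-2 logarithm are assumptions made here; the paper may define or normalize the measures differently.

```python
from collections import Counter
from math import log2


def _class_counts_per_cluster(clusters, labels):
    """Group expected labels by assigned cluster and count them."""
    by_cluster = {}
    for cluster_id, label in zip(clusters, labels):
        by_cluster.setdefault(cluster_id, Counter())[label] += 1
    return by_cluster


def purity(clusters, labels):
    """Fraction of items that belong to the majority class of their cluster."""
    by_cluster = _class_counts_per_cluster(clusters, labels)
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(labels)


def entropy(clusters, labels):
    """Size-weighted average of per-cluster class entropy (base 2, assumed)."""
    by_cluster = _class_counts_per_cluster(clusters, labels)
    total = len(labels)
    result = 0.0
    for counts in by_cluster.values():
        size = sum(counts.values())
        cluster_entropy = -sum(
            (n / size) * log2(n / size) for n in counts.values()
        )
        result += (size / total) * cluster_entropy
    return result


# Hypothetical toy data: six programs, two clusters, two expected classes.
assigned = [0, 0, 0, 1, 1, 1]
expected = ["a", "a", "b", "b", "b", "b"]
print(purity(assigned, expected))
print(entropy(assigned, expected))
```

Under these definitions, purity close to 1 and entropy close to 0 both mean that each cluster is dominated by a single expected class, which is how the reported values of 0.95576 and 0.15497 are read in the abstract.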
Pages: 48-63
Page count: 16