Using graph databases in source code plagiarism detection

Times cited: 0
Authors
Novak, Matija [1]
Levak, Iva [1]
Affiliations
[1] Univ Zagreb, Fac Org & Informat, Pavlinska 2, Varaždin 42000, Croatia
Source
CENTRAL EUROPEAN CONFERENCE ON INFORMATION AND INTELLIGENT SYSTEMS, CECIIS 2022 | 2022
Keywords
plagiarism; graph databases; similarity detection; similarity
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Modern plagiarism detection tools calculate the percentage of similarity between two given source code files. In academia, checking student solutions for potential plagiarism can be resource-intensive because of the large number of pairwise combinations among many students. Under such conditions, the reliability of plagiarism detection tools may be put at risk. Every plagiarism detection tool produces a similarity report as files containing the results of the analysis for each pair of analyzed source code files. While such a report is useful for one-time checking, the result data sometimes needs to be stored for future use. In our previous work, the results were stored in a relational database and a list of relevant queries was defined for meaningful analysis. Nevertheless, the large number of pair-wise comparisons impacts storage and query execution speed. In this paper, we present a solution to this problem by importing the similarity analysis data into a graph database, and we evaluate the difference in query execution speed between a graph database and a relational database.
Pages: 465-470
Page count: 6
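
The abstract describes two ideas that a small example may make concrete: the number of file pairs grows quadratically with the number of submissions, and pairwise similarity results map naturally onto a graph whose nodes are submissions and whose edges carry similarity scores. The following is a minimal in-memory sketch of that idea only. It is not the authors' implementation and uses no graph database; the function names, the sample scores, and the 0.6 threshold are all hypothetical.

```python
from collections import defaultdict


def pair_count(n_submissions):
    """Number of unordered pairs among n submissions: n * (n - 1) / 2."""
    return n_submissions * (n_submissions - 1) // 2


def build_similarity_graph(results, threshold):
    """Build an undirected graph as an adjacency dict.

    `results` is an iterable of (submission_a, submission_b, similarity)
    tuples, as a detection tool's pairwise report might provide.
    Only pairs with similarity >= threshold become edges.
    """
    graph = defaultdict(dict)
    for a, b, similarity in results:
        if similarity >= threshold:
            graph[a][b] = similarity
            graph[b][a] = similarity
    return graph


def suspicious_neighbors(graph, submission):
    """Return (neighbor, similarity) pairs, most similar first."""
    neighbors = graph.get(submission, {})
    return sorted(neighbors.items(), key=lambda item: -item[1])


# Hypothetical sample data: four submissions, four reported pairs.
sample_results = [
    ("s1", "s2", 0.91),
    ("s1", "s3", 0.15),
    ("s2", "s3", 0.78),
    ("s1", "s4", 0.64),
]
sample_graph = build_similarity_graph(sample_results, threshold=0.6)
```

With this structure, finding every submission similar to a given one is a lookup of that node's neighbors rather than a scan over all stored pairs, which is the kind of query a graph model is meant to make cheap. For example, `pair_count(100)` gives 4950 pairs for a class of 100 students.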