Using graph databases in source code plagiarism detection

Times cited: 0
Authors
Novak, Matija [1]
Levak, Iva [1]
Affiliation
[1] Univ Zagreb, Fac Org & Informat, Pavlinska 2, Varaždin 42000, Croatia
Source
CENTRAL EUROPEAN CONFERENCE ON INFORMATION AND INTELLIGENT SYSTEMS, CECIIS 2022 | 2022
Keywords
plagiarism; graph databases; similarity detection; SIMILARITY;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern plagiarism detection tools calculate the percentage of similarity between two given source code files. In academia, checking student solutions for potential plagiarism can be resource-intensive because of the large number of pairwise combinations among many students. Under such conditions, the reliability of plagiarism detection tools may be put at risk. Every plagiarism detection tool produces a similarity report as files containing the results of the analysis for each pair of analyzed source code files. While such a report is useful for one-time checking, the result data sometimes needs to be stored for future use. In our previous work, the results were stored in a relational database and a list of relevant queries was defined for meaningful analysis. Nevertheless, the large number of pair-wise comparisons affects storage and query execution speed. In this paper, we present a solution to this problem by importing the similarity analysis data into a graph database, and we evaluate the difference in query execution speed between a graph database and a relational database.
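To make the scale argument in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows how the number of pairwise comparisons grows with the number of submissions, and how pairwise similarity results can be viewed as a weighted graph whose nodes are submissions and whose edges carry similarity scores. The submission names, the scores, and the 0.8 threshold are all hypothetical, and an in-memory adjacency map stands in for a real graph database.

```python
from collections import defaultdict
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered pairs among n submissions: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def build_graph(results, threshold):
    """Build an undirected adjacency map from (a, b, similarity) tuples,
    keeping only edges whose similarity is at least `threshold`."""
    graph = defaultdict(dict)
    for a, b, sim in results:
        if sim >= threshold:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


# Hypothetical submissions and similarity scores (illustrative only).
submissions = ["s1", "s2", "s3", "s4"]
scores = {
    ("s1", "s2"): 0.91,
    ("s1", "s3"): 0.12,
    ("s1", "s4"): 0.85,
    ("s2", "s3"): 0.08,
    ("s2", "s4"): 0.40,
    ("s3", "s4"): 0.05,
}

# One result row per unordered pair, as a detection tool would report.
results = [(a, b, scores[(a, b)]) for a, b in combinations(submissions, 2)]

# Hypothetical threshold for "suspiciously similar".
graph = build_graph(results, threshold=0.8)

# Neighbour lookup: which submissions are highly similar to s1?
suspects_of_s1 = sorted(graph["s1"])
```

In this sketch, finding the submissions similar to `s1` is a direct neighbour lookup on the adjacency map, whereas a flat table of pairs would have to be filtered on both of its columns. Whether that translates into faster queries in a real graph database compared with a relational one is the question the paper evaluates; the sketch makes no claim about it.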
Pages: 465-470
Page count: 6