Using graph databases in source code plagiarism detection

Cited by: 0
Authors
Novak, Matija [1]
Levak, Iva [1]
Affiliations
[1] Univ Zagreb, Fac Org & Informat, Pavlinska 2, Varazhdin 42000, Croatia
Source
CENTRAL EUROPEAN CONFERENCE ON INFORMATION AND INTELLIGENT SYSTEMS, CECIIS 2022 | 2022
Keywords
plagiarism; graph databases; similarity detection; SIMILARITY
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Modern plagiarism detection tools calculate the percentage of similarity between two given source code files. In academia, checking student solutions for potential plagiarism can be resource-intensive because of the large number of pairwise combinations among many students. Under such conditions, the reliability of plagiarism detection tools may be put at risk. Every plagiarism detection tool produces a similarity report as a set of files containing the analysis results for each pair of analyzed source code files. While such a report is useful for one-time checking, the result data sometimes needs to be stored for future use. In our previous work, the results were stored in a relational database and a list of relevant queries was defined for meaningful analysis. However, the large number of pairwise comparisons affects storage and query execution speed. In this paper, we present a solution to this problem by importing the similarity analysis data into a graph database, and we evaluate the difference in query execution speed between a graph database and a relational database.
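The abstract does not give the schema or the queries, so the following is only a minimal sketch of the general idea, not the authors' implementation. It stores the same pairwise similarity results once as rows in a relational table (Python's built-in `sqlite3`) and once as a weighted graph, where a plain adjacency dictionary stands in for a graph database. The file names, percentages, table name, and function names are all hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical pairwise results: (file_a, file_b, similarity in percent).
PAIRS = [
    ("alice.java", "bob.java", 92.0),
    ("alice.java", "carol.java", 15.0),
    ("bob.java", "dave.java", 88.0),
    ("carol.java", "dave.java", 10.0),
]


def build_relational(pairs):
    """Store each compared pair as one row in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE similarity (file_a TEXT, file_b TEXT, percent REAL)"
    )
    conn.executemany("INSERT INTO similarity VALUES (?, ?, ?)", pairs)
    return conn


def similar_relational(conn, name, threshold):
    """Return files whose similarity to `name` is at least `threshold`."""
    rows = conn.execute(
        "SELECT CASE WHEN file_a = ? THEN file_b ELSE file_a END "
        "FROM similarity "
        "WHERE (file_a = ? OR file_b = ?) AND percent >= ?",
        (name, name, name, threshold),
    ).fetchall()
    return sorted(row[0] for row in rows)


def build_graph(pairs):
    """Model files as nodes and each compared pair as an undirected weighted edge."""
    graph = defaultdict(dict)
    for file_a, file_b, percent in pairs:
        graph[file_a][file_b] = percent
        graph[file_b][file_a] = percent
    return graph


def similar_graph(graph, name, threshold):
    """Same question as similar_relational, answered by reading a node's edges."""
    return sorted(
        other for other, percent in graph.get(name, {}).items()
        if percent >= threshold
    )


def cluster(graph, start, threshold):
    """Collect every file reachable from `start` over edges at or above `threshold`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for other, percent in graph.get(node, {}).items():
            if percent >= threshold and other not in seen:
                seen.add(other)
                stack.append(other)
    return sorted(seen)


if __name__ == "__main__":
    conn = build_relational(PAIRS)
    graph = build_graph(PAIRS)
    print(similar_relational(conn, "bob.java", 80.0))
    print(similar_graph(graph, "bob.java", 80.0))
    print(cluster(graph, "alice.java", 80.0))
```

In the graph form, a multi-hop question such as "which group of submissions is connected through high-similarity pairs" is a simple traversal, whereas the relational form would typically need repeated self-joins or a recursive query. Whether that translates into faster queries in practice is what the paper evaluates; this sketch makes no performance claim.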
Pages: 465-470
Page count: 6