Using graph databases in source code plagiarism detection

Cited by: 0
Authors
Novak, Matija [1]
Levak, Iva [1]
Affiliation
[1] Univ Zagreb, Fac Org & Informat, Pavlinska 2, Varazdin 42000, Croatia
Source
CENTRAL EUROPEAN CONFERENCE ON INFORMATION AND INTELLIGENT SYSTEMS, CECIIS 2022 | 2022
Keywords
plagiarism; graph databases; similarity detection; SIMILARITY;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Modern plagiarism detection tools calculate the percentage of similarity between two given source code files. In academia, checking student solutions for potential plagiarism can be resource-intensive because the number of pairwise combinations grows quickly with the number of students. Under such conditions, the reliability of plagiarism detection tools may be put at risk. Every plagiarism detection tool produces a similarity report as files containing the results of the analysis for each pair of analyzed source code files. While such a report is useful for one-time checking, it is sometimes necessary to store the result data for future use. In our previous work, the results were stored in a relational database and a list of relevant queries was defined for meaningful analysis. Nevertheless, the large number of pairwise comparisons impacts storage and query execution speed. In this paper, we present a solution to this problem by importing the similarity analysis data into a graph database and evaluate the difference in query execution speed between a graph database and a relational database.
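The pairwise growth and the graph view of similarity results described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the file names, similarity values, and the helper names pair_count, build_graph, and similar_to are hypothetical, and a plain in-memory adjacency map stands in for a graph database.

```python
from itertools import combinations

def pair_count(n: int) -> int:
    """Number of unordered pairs among n submissions: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def build_graph(results):
    """Store each (file_a, file_b, similarity) result as an undirected weighted edge."""
    graph = {}
    for a, b, sim in results:
        graph.setdefault(a, {})[b] = sim
        graph.setdefault(b, {})[a] = sim
    return graph

def similar_to(graph, node, threshold):
    """Return neighbours of `node` whose similarity is at least `threshold`."""
    return sorted(o for o, sim in graph.get(node, {}).items() if sim >= threshold)

# Hypothetical submissions and made-up similarity percentages (illustration only).
submissions = ["s1.java", "s2.java", "s3.java", "s4.java"]
made_up_scores = [12.0, 87.5, 5.0, 9.0, 64.0, 91.0]
results = [(a, b, s) for (a, b), s in zip(combinations(submissions, 2), made_up_scores)]

graph = build_graph(results)
print(pair_count(len(submissions)))        # pairs to compare for 4 submissions
print(similar_to(graph, "s1.java", 60.0))  # neighbours of s1.java at or above 60 %
```

With an adjacency structure, a question such as "which submissions are similar to this one" touches only that node's edges, whereas a pair table generally has to be filtered or indexed on both file columns.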
Pages: 465-470
Page count: 6