Plagiarism Detection System for Indonesia Text Based Document by Fingerprint Method and Natural Language Processing Approach

Cited by: 0
Authors
Winarti, Titin [1]
Kerami, Djati [2]
Etp, Lussiana [3]
Sekarwati, Kemal Ade [4]
Affiliations
[1] Semarang Univ, Fac Informat Technol & Commun, Semarang 50196, Indonesia
[2] Indonesia Univ, Fac Math & Nat Sci, Depok 16424, Indonesia
[3] Sch Informat Management & Comp Jakarta, Comp Syst, Jakarta 12140, Indonesia
[4] Gunadarma Univ, Fac Comp Sci & Informat Technol, Jakarta 16424, Indonesia
Keywords
Plagiarism; Fingerprint; Natural Language Processing;
DOI
10.1166/asl.2016.7993
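Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code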
07 ; 0710 ; 09 ;
Abstract
Plagiarism is practised often in many communities, academia among them. It is a major concern in the academic environment in particular, where it can affect both the credibility of an institution and its ability to ensure the quality of its students. In other words, plagiarism may reduce creativity in the community. This research combines the fingerprint method with a natural language processing (NLP) approach. Plagiarism detection can be carried out by various methods, such as Manber's algorithm with similarity calculated using the Jaccard coefficient, and the k-gram method as an alternative for detecting document similarity. The system is expected to let a user run the application without choosing the gram value and its window, and still obtain an accurate similarity value. Although NLP techniques have been shown to improve the accuracy of detection tasks, other challenges remain. Current plagiarism detection tools are mostly limited to string-level comparisons between suspected plagiarised texts and potential original texts. With stemming applied, the document similarity measurement showed an increase of 31% on the documents that were tested.
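As an illustration only, the fingerprint-plus-Jaccard idea named in the abstract can be sketched roughly as follows. This is not the authors' implementation: the function names, the character-level k-grams, the CRC32 hash, and the defaults `k=5` and `p=4` are all assumptions made for the example, and the stemming step the paper reports is left out.

```python
import re
import zlib


def normalize(text):
    """Lowercase and keep only letters and digits (a simplifying assumption)."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgrams(text, k):
    """Return every substring of length k of the normalized text."""
    s = normalize(text)
    return [s[i:i + k] for i in range(len(s) - k + 1)]


def fingerprint(text, k=5, p=4):
    """Manber-style selection: keep only k-gram hashes divisible by p.

    k and p are illustrative defaults, not values from the paper.
    """
    hashes = {zlib.crc32(g.encode("utf-8")) for g in kgrams(text, k)}
    return {h for h in hashes if h % p == 0}


def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(doc_a, doc_b, k=5, p=4):
    """Jaccard similarity between the fingerprints of two documents."""
    return jaccard(fingerprint(doc_a, k, p), fingerprint(doc_b, k, p))
```

Calling `similarity(doc_a, doc_b)` returns a value between 0 and 1. Setting `p=1` keeps every k-gram hash, which is useful for short texts where the modulo selection might leave an empty fingerprint.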
Pages: 3128-3131
Number of pages: 4