Original Content Extraction Oriented to Anti-plagiarism

Cited by: 0
Authors
Shen Yang [1]
Cheng Ming [2, 3]
Yao Xing [1]
Wei Wei [1]
Affiliations
[1] Wuhan Univ, Sch Informat Management, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Res Ctr Chinese Sci Evaluat, Wuhan 430072, Peoples R China
[3] Wuhan Univ, Int Sch Software, Wuhan 430072, Peoples R China
Source
2009 INTERNATIONAL CONFERENCE ON MANAGEMENT SCIENCE & ENGINEERING (16TH), VOLS I AND II, CONFERENCE PROCEEDINGS | 2009
Funding
National Natural Science Foundation of China
Keywords
Bayes; citation removal; content extraction; plagiarism; thesis structure
DOI
10.1109/ICMSE.2009.5317530
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To reduce the impact of citations and references on plagiarism detection in academic theses, and to extract the original content, the authors propose three ways to remove cited material: (1) removal of normative citations by symbol features; (2) removal of tacit citations by a minimum-risk Bayesian method that uses thesis structure; (3) removal of common knowledge based on a domain public knowledge base. The results show that, during extraction of original content, precision decreases as the risk coefficient increases, while recall increases with the risk coefficient. Overall performance is best when the risk coefficient is 60. Plagiarism detection run after extracting the original content shows the fault rate falling from 9.09% to 4.52%.
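The minimum-risk Bayesian decision mentioned in the abstract can be illustrated with a minimal two-class sketch. This is not the authors' implementation: the abstract does not give the features, the loss table, or which class counts as positive. The function name `min_risk_label`, the input `p_citation` (a posterior probability assumed to come from some upstream classifier), and the reading of the risk coefficient as the relative cost of keeping a citation sentence are all assumptions made for illustration.

```python
def min_risk_label(p_citation: float, risk_coefficient: float = 1.0) -> str:
    """Two-class minimum-risk Bayes decision (illustrative sketch).

    p_citation       -- assumed posterior P(citation | sentence features)
    risk_coefficient -- assumed cost of keeping a citation sentence,
                        relative to a cost of 1 for removing an original one
    """
    if not 0.0 <= p_citation <= 1.0:
        raise ValueError("p_citation must lie in [0, 1]")
    if risk_coefficient <= 0:
        raise ValueError("risk_coefficient must be positive")

    # Expected loss of keeping the sentence: it costs risk_coefficient
    # when the sentence is in fact a citation.
    risk_keep = risk_coefficient * p_citation
    # Expected loss of removing the sentence: it costs 1 when the
    # sentence is in fact original content.
    risk_remove = 1.0 - p_citation

    # Choose the action with the smaller expected loss.
    return "citation" if risk_remove < risk_keep else "original"


if __name__ == "__main__":
    for r in (1, 10, 60):
        print(r, min_risk_label(0.3, r))
```

Under these assumptions, raising the risk coefficient lowers the posterior needed to flag a sentence as a citation (the threshold is 1 / (1 + risk_coefficient)), so more sentences are flagged. That matches the general direction reported in the abstract, where recall rises and precision falls as the risk coefficient increases.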
Pages: 17 / +
Page count: 3