Determining Degree of Relevance of Reviews Using a Graph-Based Text Representation

Cited by: 5
Authors
Ramachandran, Lakshmi [1 ]
Gehringer, Edward F. [1 ]
Affiliation
[1] NC State Univ, Dept Comp Sci, Raleigh, NC 27607 USA
Source
2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011) | 2011
Keywords
relevance; graph-based representation; plagiarism; paraphrasing; k-nearest neighbor;
DOI
10.1109/ICTAI.2011.72
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reviews are text-based feedback provided by reviewers to authors. The quality of a review can be determined by identifying how relevant it is to the work that the review was written for as well as its similarity to existing well-written and coherent reviews. Relevance between two pieces of text can be determined by identifying semantic and syntactic similarities between them. In this paper, we make use of string-based metrics that incorporate concepts of paraphrasing and plagiarism to determine matching between texts. We use a graph-based text representation technique. We use the k-nearest neighbor classification algorithm to build a supervised model and classify text as LOW, MEDIUM or HIGH based on values of the metrics. We evaluate our approach on three data sets from student assignments and show that our model achieves an average accuracy of 63%.
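The classification step described in the abstract can be illustrated with a minimal k-nearest neighbor sketch. This is not the authors' implementation: the two-component metric vectors, the training values, the choice of Euclidean distance, `k=3`, and the function name `knn_classify` are all assumptions made for illustration. Only the LOW, MEDIUM and HIGH labels come from the abstract.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query vector
    # and keep the k closest ones.
    nearest = sorted(train, key=lambda ex: dist(ex[0], query))[:k]
    # Count the labels of those neighbors and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (metric_1, metric_2) similarity scores with made-up labels;
# these numbers are illustrative and are not taken from the paper.
train = [
    ((0.10, 0.05), "LOW"),
    ((0.15, 0.10), "LOW"),
    ((0.20, 0.12), "LOW"),
    ((0.45, 0.40), "MEDIUM"),
    ((0.50, 0.55), "MEDIUM"),
    ((0.55, 0.45), "MEDIUM"),
    ((0.80, 0.85), "HIGH"),
    ((0.90, 0.80), "HIGH"),
    ((0.85, 0.95), "HIGH"),
]

label = knn_classify(train, (0.88, 0.90))
```

In this sketch a new review's metric vector receives the label held by the majority of its three closest labeled neighbors; a real system would substitute the string-based metrics computed from the graph representation.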
Pages: 442-445
Page count: 4