Computerized summary scoring: crowdsourcing-based latent semantic analysis

Cited by: 19
Authors
Li, Haiying [1 ]
Cai, Zhiqiang [2 ]
Graesser, Arthur C. [2 ,3 ]
Affiliations
[1] Rutgers State Univ, Grad Sch Educ, New Brunswick, NJ 08901 USA
[2] Univ Memphis, Inst Intelligent Syst, Memphis, TN 38152 USA
[3] Univ Memphis, Dept Psychol, Memphis, TN 38152 USA
Keywords
LSA similarity; Crowdsourcing; Computerized summary scoring; GRADE BRIEF SUMMARIES; MECHANICAL TURK; COH-METRIX; COMPREHENSION; TEXT; STUDENTS; REPRESENTATION; TECHNOLOGY; RETENTION; RETRIEVAL;
DOI
10.3758/s13428-017-0982-7
Chinese Library Classification (CLC)
B841 [Research methods in psychology];
Discipline classification code
040201;
Abstract
In this study we developed and evaluated a crowdsourcing-based latent semantic analysis (LSA) approach to computerized summary scoring (CSS). LSA is a frequently used mathematical component in CSS, where LSA similarity represents the extent to which the to-be-graded target summary is similar to a model summary or a set of exemplar summaries. Researchers have proposed different formulations of the model summary in previous studies, such as pregraded summaries, expert-generated summaries, or source texts. The former two methods, however, require substantial human time, effort, and costs in order to either grade or generate summaries. Using source texts does not require human effort, but it also does not predict human summary scores well. With human summary scores as the gold standard, in this study we evaluated the crowdsourcing LSA method by comparing it with seven other LSA methods that used sets of summaries from different sources (either experts or crowdsourced) of differing quality, along with source texts. Results showed that crowdsourcing LSA predicted human summary scores as well as expert-good and crowdsourcing-good summaries, and better than the other methods. A series of analyses with different numbers of crowdsourcing summaries demonstrated that the number (from 10 to 100) did not significantly affect performance. These findings imply that crowdsourcing LSA is a promising approach to CSS, because it saves human effort in generating the model summary while still yielding comparable performance. This approach to small-scale CSS provides a practical solution for instructors in courses, and also advances research on automated assessments in which student responses are expected to semantically converge on subject matter content.
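The following Python sketch illustrates, under stated assumptions, how an LSA similarity score of the kind described above might be computed: the to-be-graded target summary is projected into a latent semantic space together with a set of crowdsourced exemplar summaries, and its score is the mean cosine similarity to those exemplars. The library choices (scikit-learn's TfidfVectorizer and TruncatedSVD), the toy corpus and dimensionality, and the averaging step are illustrative assumptions, not the article's actual implementation.

```python
# Minimal illustrative sketch (NOT the authors' implementation): score a target
# summary by its LSA cosine similarity to a set of crowdsourced exemplar summaries.
# Assumptions: scikit-learn's TfidfVectorizer + TruncatedSVD stand in for the LSA
# space, and the score is the mean cosine similarity to the exemplars. In practice
# an LSA space is trained on a much larger corpus with far more dimensions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

exemplar_summaries = [  # hypothetical crowdsourced summaries of the source text
    "Crowdsourced raters restated the main ideas of the source text briefly.",
    "The summary covers the source text's main ideas in the workers' own words.",
]
target_summary = "The student restates the main ideas of the source text."  # to be graded

# Build a shared vocabulary over the exemplars plus the target summary.
corpus = exemplar_summaries + [target_summary]
tfidf = TfidfVectorizer(stop_words="english")
term_doc = tfidf.fit_transform(corpus)

# Project the documents into a low-dimensional latent semantic space (toy size here).
lsa = TruncatedSVD(n_components=2, random_state=0)
doc_vectors = lsa.fit_transform(term_doc)

# LSA similarity score: mean cosine similarity between the target and each exemplar.
target_vec = doc_vectors[-1].reshape(1, -1)
exemplar_vecs = doc_vectors[:-1]
lsa_score = cosine_similarity(target_vec, exemplar_vecs).mean()
print(f"LSA similarity score for the target summary: {lsa_score:.3f}")
```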
Pages: 2144-2161
Number of pages: 18