A Modified Principal Component Analysis Approach to Automated Essay-Type Grading

Cited by: 0
Authors
Oduntan, Odunayo Esther [1]
Olabiyisi, Stephen Olatunde [2]
Adeyanju, Ibrahim Adepoju [3]
Omidiora, Elijah Olusayo [2]
Affiliations
[1] Fed Polytech, Dept Comp Sci, Ilaro, Ogun State, Nigeria
[2] LAUTECH, Dept Comp Sci & Engn, Ogbomosho, Nigeria
[3] Fed Univ Oye Ekiti, Dept Comp Engn, Oye Ekiti, Ekiti State, Nigeria
Source
PROCEEDINGS OF 2016 FUTURE TECHNOLOGIES CONFERENCE (FTC) | 2016
Keywords
principal component analysis; feature extraction; vector space model; automated essay-type grading system; n-gram
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This study investigates the relative efficacy of using n-gram extracted terms, the aggregation of such terms, and a combination of feature extraction techniques in building an automated essay-type grading (AETG) system. The paper focuses on modifying Principal Component Analysis (PCA) by integrating n-gram terms as input into the PCA algorithm. Hard copies of examiners' marking schemes and soft copies of students' answers for two courses, Management Information System (COM 317) and Research Methodology (COM 325), offered at the Department of Computer Science, Federal Polytechnic, Ilaro, during the 2013/2014 academic session, were used as case studies. The textual contents of the marking schemes were transcribed into electronic documents using the same file format as the students' answers. The documents were preprocessed to remove stopwords, and each keyword was stemmed to address morphological variations. N-gram terms (N = 2, 3) were then extracted across all students' answer scripts and marking scheme documents for each of the two courses. The documents were represented in the vector space model as a Document Term Matrix. The PCA algorithm was modified by integrating n-gram terms as input into the existing PCA to derive the Modified Principal Component Analysis (MPCA) algorithm, which was used to reduce the sparseness of the matrix. Document similarity was measured using the cosine similarity measure, which compared each student's answer script document vector with the marking scheme document vector. The MPCA-based AETG system outperformed the PCA equivalent, with a high positive correlation and a lower mean absolute error when the human marker scores are compared with those of the system. In future work, we intend to explore other approaches that will be able to capture non-textual contents.
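The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stopword list, the function names, and the number of retained components `k` are assumptions made for illustration, stemming is omitted, and PCA is computed with a plain SVD on the centered n-gram count matrix. The abstract does not specify the MPCA algorithm beyond "n-gram terms as input to PCA", so the details here are guesses.

```python
import re
import numpy as np

# Illustrative stopword list only; the paper does not say which list was used.
STOPWORDS = {"a", "an", "the", "of", "and", "is", "to", "in", "for", "that"}

def tokens(text):
    """Lowercase, keep alphabetic tokens, drop stopwords (stemming omitted)."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def ngrams(words, sizes=(2, 3)):
    """Return all word n-grams for the given sizes (N = 2, 3 as in the abstract)."""
    return [" ".join(words[i:i + n]) for n in sizes for i in range(len(words) - n + 1)]

def document_term_matrix(docs):
    """Build a documents-by-n-grams count matrix (vector space model)."""
    grams = [ngrams(tokens(d)) for d in docs]
    vocab = sorted({g for doc in grams for g in doc})
    index = {g: j for j, g in enumerate(vocab)}
    dtm = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(grams):
        for g in doc:
            dtm[i, index[g]] += 1
    return dtm, vocab

def pca_project(matrix, k):
    """Project rows onto the top-k principal components via SVD."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def grade(marking_scheme, answers, k=2):
    """Score each answer by cosine similarity to the marking scheme in PCA space."""
    dtm, _ = document_term_matrix([marking_scheme] + answers)
    reduced = pca_project(dtm, min(k, dtm.shape[0] - 1))
    return [cosine(reduced[0], row) for row in reduced[1:]]
```

Calling `grade(scheme_text, [answer_1, answer_2, ...])` returns one similarity value per answer. Turning these values into marks (for example by scaling to the maximum score of a question) is a separate step that the abstract does not describe.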
Pages: 94-98
Number of pages: 5