RTextTools: A Supervised Learning Package for Text Classification

Cited by: 1
Authors
Jurka, Timothy P. [1 ]
Collingwood, Loren [2 ]
Boydstun, Amber E. [1 ]
Grossman, Emiliano [3 ]
van Atteveldt, Wouter [4 ]
Affiliations
[1] Univ Calif Davis, Dept Polit Sci, Davis, CA 95616 USA
[2] Univ Calif Riverside, Dept Polit Sci, Riverside, CA 92521 USA
[3] Sci Po CEE, F-75007 Paris, France
[4] Vrije Univ Amsterdam, Dept Commun Sci, NL-1081 HV Amsterdam, Netherlands
Source
R JOURNAL | 2013 / Volume 5 / Issue 1
Keywords
ACCURACY;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Social scientists have long hand-labeled texts to create datasets useful for studying topics from congressional policymaking to media reporting. Many social scientists have begun to incorporate machine learning into their toolkits. RTextTools was designed to make machine learning accessible by providing a start-to-finish product in less than 10 steps. After installing RTextTools, the initial step is to generate a document term matrix. Second, a container object is created, which holds all the objects needed for further analysis. Third, users can use up to nine algorithms to train their data. Fourth, the data are classified. Fifth, the classification is summarized. Sixth, functions are available for performance evaluation. Seventh, ensemble agreement is conducted. Eighth, users can cross-validate their data. Finally, users write their data to a spreadsheet, allowing for further manual coding if required.
Pages: 6-12
Page count: 7
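
The first steps listed in the abstract (build a document-term matrix, wrap it in a container with a train/test split, train, classify, evaluate) can be sketched in a few lines. The sketch below is a plain-Python analogue, not RTextTools itself: the helper names, the toy texts and labels, and the choice of a naive Bayes classifier with add-one smoothing are all assumptions made for illustration, and none of it reproduces the package's R API or its algorithms.

```python
import math
from collections import Counter


def create_matrix(texts):
    """Document-term matrix: one Counter of lowercase tokens per text."""
    return [Counter(text.lower().split()) for text in texts]


def create_container(matrix, labels, train_idx, test_idx):
    """Bundle the rows and labels needed for training and testing."""
    return {
        "train": [(matrix[i], labels[i]) for i in train_idx],
        "test": [(matrix[i], labels[i]) for i in test_idx],
    }


def train_model(container):
    """Fit a multinomial naive Bayes model (illustrative choice only)."""
    class_docs, term_counts, vocab = Counter(), {}, set()
    for row, label in container["train"]:
        class_docs[label] += 1
        term_counts.setdefault(label, Counter()).update(row)
        vocab.update(row)
    return {"class_docs": class_docs, "term_counts": term_counts, "vocab": vocab}


def classify_model(container, model):
    """Predict the most probable label for every test row."""
    n_docs = sum(model["class_docs"].values())
    vocab_size = len(model["vocab"])
    predictions = []
    for row, _ in container["test"]:
        scores = {}
        for label, n in model["class_docs"].items():
            counts = model["term_counts"][label]
            total = sum(counts.values())
            score = math.log(n / n_docs)
            for term, freq in row.items():
                if term in model["vocab"]:  # terms unseen in training are skipped
                    # add-one (Laplace) smoothing
                    score += freq * math.log((counts[term] + 1) / (total + vocab_size))
            scores[label] = score
        predictions.append(max(scores, key=scores.get))
    return predictions


def accuracy(container, predictions):
    """Share of test rows whose predicted label matches the hand label."""
    truth = [label for _, label in container["test"]]
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


# Toy hand-labeled corpus (hypothetical data).
texts = [
    "senate passes budget bill",
    "house votes on tax bill",
    "team wins championship game",
    "player scores winning goal",
    "senate debates tax budget",
    "team scores in final game",
]
labels = ["politics", "politics", "sports", "sports", "politics", "sports"]

matrix = create_matrix(texts)
container = create_container(matrix, labels, train_idx=range(4), test_idx=range(4, 6))
model = train_model(container)
predictions = classify_model(container, model)
print(predictions, accuracy(container, predictions))
```

The later steps in the abstract (ensemble agreement, cross-validation, export to a spreadsheet) are left out here to keep the sketch short.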