Computer-Generated Text Detection Using Machine Learning: A Systematic Review

Cited by: 17

Authors:

Beresneva, Daria [1]

Affiliations:

[1] Russian Acad Natl Econ & Publ Adm, Moscow Inst Phys & Technol, Antiplagiat Res, Moscow, Russia

Source:

NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2016 | 2016 / Vol. 9612

Keywords:

Artificial content; Generated text; Fake content detection

DOI:

10.1007/978-3-319-41754-7_43

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Computer-generated (artificial) text is now abundant on the web, ranging from basic random word salads to web scraping. In this paper, we present a short version of a systematic review of some existing automated methods aimed at distinguishing natural texts from artificially generated ones. The methods were chosen by certain criteria. We further provide a summary of the methods considered. Comparisons, whenever possible, use common evaluation measures and control for differences in experimental set-up.

Citation

Pages: 421-426

Page count: 6

19 references in total

[11] Labbe C., 2012, SCIENTOMETRICS, P10

[12] Lavergne T., 2008, PAN 2008

[13] Manning C., 1999, FDN STAT NATURAL LAN

[14] Seymore K, 1996, ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, P232, DOI 10.1109/ICSLP.1996.607084

[15] Sichel HS, 1975, DISTRIBUTION LAW FOR WORD FREQUENCIES, JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 70 (351): 542-547

[16] Stolcke A., 1998, ENTROPY BASED PRUNIN

[17] Urvoy T., 2006, AIRWEB 2006

[18] Vapnik V., 1999, The nature of statistical learning theory

[19] Witten IH, 2011, MOR KAUF D, P1
