Automatic analysis of textual hotel reviews

Cited by: 41
Authors
García-Pablos A. [1]
Cuadros M. [2]
Linaza M.T. [1]
Affiliations
[1] Department of eTourism and Cultural Heritage, Vicomtech-IK4, San Sebastián
[2] Department of Human Speech and Language Technologies, Vicomtech-IK4, San Sebastián
Keywords
Customer-generated reviews; Sentiment analysis; Text analysis
DOI
10.1007/s40558-015-0047-7
Abstract
Social media and consumer-generated content continue to grow and to affect the hospitality domain. Consumers write online reviews to express their level of satisfaction with a hotel and to inform other consumers about their stay. A number of websites specialized in tourism and hospitality have flourished on the Web (e.g. TripAdvisor). The tremendous growth of these data-generating sources demands new tools to deal with them. To cope with large volumes of customer-generated reviews and comments, Natural Language Processing (NLP) tools have become necessary to automatically process and manage textual customer reviews (e.g. to perform Sentiment Analysis). This work describes OpeNER, an NLP platform applied to the hospitality domain to automatically process customer-generated textual content and obtain valuable information from it. The platform consists of a set of free, Open Source NLP tools for text analysis, built on a modular architecture that eases modification and extension. Training and evaluation were performed on a set of manually annotated hotel reviews gathered from websites such as Zoover and HolidayCheck. © 2015, Springer-Verlag Berlin Heidelberg.
Pages: 45-69
Page count: 24
Related papers
49 entries in total
[1] Agerri R., Cuadros M., Gaines S., Rigau G., OpeNER: Open Polarity Enhanced Named Entity Recognition, Proceedings of the 29th annual meeting of Sociedad Española para el Procesamiento del Lenguaje Natural, SEPLN’13. Madrid, España. Procesamiento del Lenguaje Natural, 51, pp. 215-218, (2013)
[2] Bacciu C., Lo Duca A., Marchetti A., Tesconi M., Accommodations in Tuscany as Linked Data, In: Proceedings of the 9th edition of the language resources and evaluation conference, (2014)
[3] Bagga A., Baldwin B., Cross-document event coreference: Annotations, experiments, and observations, (1999)
[4] Bosma W., Vossen P., Soroa A., KAF: a generic semantic annotation format, In: Proceedings of the GL2009 Workshop on semantic annotation, (2009)
[5] Brants T., TnT: a statistical part-of-speech tagger, In: Proceedings of the sixth conference on Applied natural language processing, vol 1, (2000)
[6] Brereton R.G., Lloyd G.R., Support vector machines for classification and regression, Analyst, 135, pp. 230-267, (2010)
[7] Browning V., So K.K.F., Sparks B., The influence of online reviews on consumers’ attributions of service quality and control for service standards in hotels, J Travel Tour Mark, 30, 1-2, pp. 23-40, (2013)
[8] Cambria E., White B., Jumping NLP curves: a review of natural language processing research [review article], Comput Intell Mag IEEE, 9, 2, pp. 48-57, (2014)
[9] Cambria E., Schuller B., Xia Y., Havasi C., New avenues in opinion mining and sentiment analysis, IEEE Intell Syst, 2, pp. 15-21, (2013)
[10] Collins M., Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms, In: Proceedings of the ACL-02 conference on empirical methods in natural language processing, pp. 1-8, (2002)