Sentiment Classification on Turkish Hotel Reviews

Cited by: 0
Authors
Ogul, Burcin Buket [1]
Ercan, Gonenc [1]
Affiliation
[1] Hacettepe Univ, Bilgisayar Muhendisligi Bolumu, Ankara, Turkey
Source
2016 24TH SIGNAL PROCESSING AND COMMUNICATION APPLICATION CONFERENCE (SIU) | 2016
Keywords
Sentiment analysis; machine learning; text features; tf-idf; term document matrix; random forest;
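DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract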
Sentiment analysis refers to classifying the emotion of a text as positive or negative. Studies on sentiment analysis generally focus on English and other languages, while studies on Turkish are limited. In this study, after constructing a dataset from the well-known hotel reservation site booking.com, we compare the performance of different machine learning approaches. We also apply a dictionary-based method, SentiTFIDF, which differs from traditional methods in its use of logarithmic differential term frequency and term presence distribution. The results are evaluated using the area under the Receiver Operating Characteristic (ROC) curve (AUC). The results show that using a document-term matrix as input gives better classification results than a TF-IDF matrix. We also observe that the best results are obtained using the Random Forest classifier, with an AUC value of 89% on both positive and negative comments.
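As a rough illustration of the two input representations and the evaluation metric named in the abstract, the following minimal sketch builds a document-term matrix of raw counts, reweights it with one common TF-IDF variant, and computes AUC from pairwise comparisons. It is not the authors' implementation: the whitespace tokenization, the idf formula log(N / df), the function names, and the toy reviews and scores are all assumptions made for the example.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Raw term counts: one row per document, one column per vocabulary term."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_matrix(rows):
    """Reweight counts by idf = log(N / df), one common TF-IDF variant."""
    n_docs = len(rows)
    n_terms = len(rows[0])
    # df[j] >= 1 because every vocabulary term occurs in at least one document.
    df = [sum(1 for row in rows if row[j] > 0) for j in range(n_terms)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
            for row in rows]


def auc(labels, scores):
    """Probability that a positive example outscores a negative one (ties 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy reviews, not taken from the paper's dataset.
reviews = ["oda temiz personel iyi", "oda kirli personel kaba"]
vocab, counts = term_document_matrix(reviews)
weights = tfidf_matrix(counts)

# Hypothetical classifier scores for four labelled comments (1 = positive).
example_auc = auc([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])
```

In this toy case, terms that occur in every review (such as "oda") receive a TF-IDF weight of zero while keeping their raw count in the document-term matrix, which is one way the two representations can hand a classifier different information.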
Pages: 497-500
Number of pages: 4