Understanding Citizen Issues through Reviews: A Step towards Data Informed Planning in Smart Cities

Cited by: 16
Authors
Dilawar, Noman [1]
Majeed, Hammad [1]
Beg, Mirza Omer [1]
Ejaz, Naveed [2]
Muhammad, Khan [3]
Mehmood, Irfan [4]
Nam, Yunyoung [5]
Affiliations
[1] Natl Univ Comp & Emerging Sci, Dept Comp Sci, Islamabad 46000, Pakistan
[2] Iqra Univ, Dept Comp & Technol, Islamabad 46000, Pakistan
[3] Sejong Univ, Digital Contents Res Inst, Intelligent Media Lab, Seoul 143747, South Korea
[4] Sejong Univ, Dept Software, Seoul 143747, South Korea
[5] Soonchunhyang Univ, Dept Comp Sci & Engn, Asan 31538, South Korea
Source
APPLIED SCIENCES-BASEL | 2018 / Vol. 8 / Issue 09
Keywords
smart cities; supervised learning; aspect category detection; aspect-based sentiment analysis
DOI
10.3390/app8091589
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Governments are increasingly demanding better Smart City technologies to connect with citizens and understand their demands. Much of the information such governments need exists on social media, where members of diverse groups share interests, post statuses, and review and comment on various topics. Aspect extraction from this data can provide a thorough understanding of citizens' behaviors and choices, and categorizing these aspects can better summarize societal concerns regarding political, economic, religious and social issues. Aspect category detection (ACD) from user reviews is one of the major tasks of aspect-based sentiment analysis (ABSA). The success of ABSA depends mainly on an inexpensive and accurate machine-processable representation of the raw input sentences. Previous approaches rely on cumbersome procedures for extracting features from sentences, which add their own complexity and inaccuracy to ACD tasks. In this paper, we propose an inexpensive and simple method to obtain the most suitable sentence-vector representation through different algebraic combinations of a sentence's word vectors, which then acts as input to any machine learning classifier. We tested our technique on the restaurant review data provided in SemEval-2015 and SemEval-2016. SemEval is a series of global challenges for evaluating the effectiveness of word sense disambiguation. Our results showed the highest F1-scores of 76.40% in SemEval-2016 Task 5 and 94.99% in SemEval-2015 Task 12.
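The idea of building a sentence vector from algebraic combinations of word vectors can be illustrated with a minimal sketch. The abstract does not say which combinations the authors used, so the three modes below (element-wise sum, mean, and maximum) are assumptions chosen for illustration, and the toy vectors and the name `sentence_vector` are hypothetical. In practice the word vectors would come from a trained embedding model, and the resulting sentence vector would be passed to a classifier.

```python
from typing import Dict, List, Sequence

# Toy 3-dimensional word vectors. These values are made up for illustration
# and do not come from the paper or from any trained embedding model.
TOY_VECTORS: Dict[str, List[float]] = {
    "food": [1.0, 0.0, 2.0],
    "was": [0.0, 1.0, 0.0],
    "great": [2.0, 2.0, 1.0],
}


def sentence_vector(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    mode: str = "mean",
) -> List[float]:
    """Combine the word vectors of a sentence into one fixed-size vector.

    Tokens without a known vector are skipped. The supported modes are
    assumed examples of "algebraic combinations", not the paper's exact set.
    """
    rows = [vectors[t] for t in tokens if t in vectors]
    if not rows:
        raise ValueError("sentence contains no tokens with a known vector")

    # Transpose so that each tuple holds one dimension across all words.
    columns = list(zip(*rows))

    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(rows) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    tokens = ["the", "food", "was", "great"]  # "the" has no vector and is skipped
    for mode in ("sum", "mean", "max"):
        print(mode, sentence_vector(tokens, TOY_VECTORS, mode))
```

Whatever the sentence length, each mode yields a vector with the same dimensionality as the word vectors, which is what allows a standard classifier to consume it directly.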
Pages: 19
Related papers
37 in total
  • [1] Abadi M., 2015, TENSORFLOW LARGESCAL, DOI [DOI 10.48550/ARXIV.1605.08695, 10.5555/3026877.3026899]
  • [2] Alghunaim Abdulaziz., 2015, VS@HLT-NAACL, P116
  • [3] Alvarez-Lopez T., 2016, P 10 INT WORKSHOP SE, P306
  • [4] [Anonymous], 2016, ARXIV160700534
  • [5] Arunachalam R., 2013, PROC IJCNLP 2013 WOR, P23
  • [6] Bin Lu, 2011, 2011 IEEE International Conference on Data Mining Workshops, P81, DOI 10.1109/ICDMW.2011.125
  • [7] Latent Dirichlet allocation
    Blei, DM
    Ng, AY
    Jordan, MI
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2003, 3 (4-5) : 993 - 1022
  • [8] Bo Pang, 2008, INFORM RETRIEVAL, V2, P1, DOI 10.1561/1500000011
  • [9] Brody S., 1962, P HUM LANG TECHN 201, P804
  • [10] Cid GP., 2015, International Journal of E-Planning Research, P4