Morph Resolution Based on Autoencoders Combined with Effective Context Information

Cited by: 2
Authors
You, Jirong [1 ,2 ]
Sha, Ying [1 ,2 ]
Liang, Qi [1 ,2 ]
Wang, Bin [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
Source
COMPUTATIONAL SCIENCE - ICCS 2018, PT III | 2018 / Vol. 10862
Keywords
Morph; Morph resolution; Effective context information; Autoencoder
DOI
10.1007/978-3-319-93713-7_44
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In social networks, people often create morphs, a special type of fake alternative name, to avoid internet censorship or for other purposes. Resolving these morphs to the entities they actually refer to is important for natural language processing tasks. Although some methods have been proposed, they do not use the context information of morphs or target entities effectively; they use only the neighboring words of morphs or target entities. In this paper, we propose a new approach to resolving morphs based on autoencoders combined with effective context information. First, to represent the semantic meanings of morphs and target candidates more precisely, we propose a method for extracting effective context information. Next, by feeding morphs or target candidates together with their effective context information into autoencoders, we obtain embedding representations of the morphs and target candidates. Finally, we rank the target candidates by the semantic similarity between morphs and target candidates. Our method therefore needs little annotated data, and experimental results demonstrate that it significantly outperforms state-of-the-art methods.
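The final ranking step described in the abstract can be illustrated with a minimal sketch. It assumes that embeddings for the morph and for each target candidate have already been produced (in the paper, by autoencoders; here they are hand-written toy vectors), and it assumes cosine similarity as the similarity measure, which the abstract does not specify. The names `cosine`, `rank_candidates`, and the `entity_*` labels are hypothetical and used only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(morph_vec, candidate_vecs):
    """Return (candidate, score) pairs sorted from most to least similar to the morph."""
    scored = [(name, cosine(morph_vec, vec)) for name, vec in candidate_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumption: stand-ins for autoencoder outputs).
morph = [1.0, 0.0]
candidates = {
    "entity_a": [0.9, 0.1],
    "entity_b": [0.0, 1.0],
    "entity_c": [-1.0, 0.0],
}

ranking = rank_candidates(morph, candidates)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

Because the ranking compares embeddings without a supervised classifier, a sketch like this is consistent with the abstract's claim that little annotated data is needed, though the actual similarity function and embedding model in the paper may differ.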
Pages: 487-498
Page count: 12