Multilingual open information extraction: Challenges and opportunities

Cited by: 0
Authors
Claro D.B. [1]
Souza M. [1]
Xavier C.C. [2]
Oliveira L. [1]
Affiliations
[1] FORMAS Research Group, Computer Science Department, Federal University of Bahia, Salvador - BA
[2] FORMAS Research Group, Federal Institute of Rio Grande do Sul, Porto Alegre - RS
Source
Information (Switzerland) | 2019 / Vol. 10 / No. 07
Keywords
Multilingual; Open information extraction; Parallel corpus
DOI
10.3390/INFO10070228
Abstract
The number of documents published on the Web in languages other than English grows every year. As a consequence, the need to extract useful information from different languages increases, highlighting the importance of research into Open Information Extraction (OIE) techniques. Different OIE methods have dealt with features from a single language; however, few approaches tackle multilingual aspects. In those approaches, multilingualism is restricted to processing text in different languages, rather than exploring cross-linguistic resources, which results in low precision due to the use of general rules. Multilingual methods have been applied to numerous problems in Natural Language Processing, achieving satisfactory results and demonstrating that knowledge acquisition for a language can be transferred to other languages to improve the quality of the facts extracted. We argue that a multilingual approach can enhance OIE methods as it is ideal to evaluate and compare OIE systems, and therefore can be applied to the collected facts. In this work, we discuss how the transfer of knowledge between languages can increase acquisition from multilingual approaches. We provide a roadmap of the Multilingual Open IE area concerning state-of-the-art studies. Additionally, we evaluate the transfer of knowledge to improve the quality of the facts extracted in each language. Moreover, we discuss the importance of a parallel corpus to evaluate and compare multilingual systems. © 2019 by the authors.