Enabling deep learning for large scale question answering in Italian

Cited by: 2
Authors
Croce, Danilo [1]
Zelenanska, Alexandra [1]
Basili, Roberto [1]
Affiliations
[1] Univ Roma Tor Vergata, Dept Enterprise Engn, Rome, Italy
Keywords
Question answering in Italian; deep learning; recurrent neural network with attention
DOI
10.3233/IA-190018
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The recent breakthroughs in the field of deep learning have led to state-of-the-art results in several NLP tasks, such as Question Answering (QA). Unfortunately, the requirements of such neural QA systems are very strict due to the size of the training datasets involved. In cross-linguistic settings these requirements are not satisfied, as training datasets for QA over non-English texts are often not available. This represents the major barrier to widespread adoption of neural QA methods in NLP applications. In this paper, the acquisition of a large-scale dataset for an open-domain factoid question answering system in Italian is discussed. It is obtained by automatic translation and linguistic elicitation of an existing English dataset, i.e. the SQUAD question-answer pair corpus. Even though the quality of the resulting corpus for Italian might not be completely satisfying, our work made it possible to generate more than 60 thousand question-answer pairs. In the paper, the impact of this resource on the QA process over the Italian Wikipedia is studied under different training conditions and architectural constraints. A comparative evaluation against the English version, in line with standards in the SQUAD literature, is carried out. The outcomes show that the results achievable for Italian are below the state of the art for English, but the ability to learn not to respond (i.e. the adoption of techniques for detecting questions whose answers are simply not available, yielding an EMPTY set of answers) allows the system to reach reasonable levels of precision. This makes it already usable within realistic application scenarios. Finally, an error analysis is presented that suggests possible future research directions on still critical but highly beneficial enhancements, in view of concrete QA applications in Italian.
Pages: 49-61
Page count: 13