Translating natural language questions to SQL queries (nested queries)

Authors
Sindhuja Swamidorai
T Satyanarayana Murthy
K V Sriharsha
Affiliations
[1] UpGrad, Data Science
[2] CBIT, Information Technology
[3] NIT Trichy, Computer Applications
Source
Multimedia Tools and Applications | 2024 / Vol. 83
Keywords
Text-to-SQL nested queries; Spider
Abstract
Real-world questions are generally complex and require the user to extract information from multiple tables in a database using complex SQL queries such as nested queries. Although the overall accuracy of translating natural language queries to SQL queries is close to 75%, the accuracy on complex queries is still considerably lower, around 60% in current state-of-the-art models. This study therefore proposes improvements to the IRNet framework for translating natural language queries to nested SQL queries, one type of complex query. First, data oversampling is used to boost the representation of nested queries. Second, a novel loss function is introduced that computes the overall loss while accounting for the complexity of the SQL, as measured by the number of SELECT columns and keywords in the SQL query. The proposed method exhibited a 5% improvement in the prediction of hard and extra hard queries when tested on Spider's development dataset.
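The two ideas in the abstract, oversampling nested queries and weighting the loss by SQL complexity, can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function names, the keyword list, the "query" field name, the scaling factor alpha, and the linear weight 1 + alpha * (columns + keywords). It should not be read as the authors' implementation.

```python
import random
import re

# Hypothetical keyword list; the paper's actual list is not given in the abstract.
SQL_KEYWORDS = ("select", "where", "group by", "order by", "having",
                "limit", "join", "intersect", "union", "except")


def is_nested(sql):
    """Rough proxy for nesting: a SELECT that appears right after '('."""
    return re.search(r"\(\s*select\b", sql, flags=re.IGNORECASE) is not None


def count_select_columns(sql):
    """Count comma-separated items in the first SELECT list (rough heuristic)."""
    match = re.search(r"\bselect\b(.*?)\bfrom\b", sql,
                      flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return 0
    return len([col for col in match.group(1).split(",") if col.strip()])


def count_keywords(sql):
    """Count occurrences of the assumed SQL keywords."""
    lowered = sql.lower()
    return sum(len(re.findall(r"\b" + re.escape(kw) + r"\b", lowered))
               for kw in SQL_KEYWORDS)


def complexity_weight(sql, alpha=0.1):
    """Assumed weight: grows linearly with SELECT columns plus keywords."""
    return 1.0 + alpha * (count_select_columns(sql) + count_keywords(sql))


def weighted_loss(losses, sqls, alpha=0.1):
    """Mean of per-example losses, each scaled by its complexity weight."""
    weights = [complexity_weight(sql, alpha) for sql in sqls]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


def oversample_nested(examples, factor=2, seed=0):
    """Repeat each nested-query example `factor` times, then shuffle."""
    out = []
    for example in examples:
        copies = factor if is_nested(example["query"]) else 1
        out.extend([example] * copies)
    random.Random(seed).shuffle(out)
    return out


simple = "SELECT name FROM singer"
nested = "SELECT name FROM singer WHERE age > (SELECT avg(age) FROM singer)"
train = oversample_nested([{"query": simple}, {"query": nested}], factor=3)
batch_loss = weighted_loss([1.0, 1.0], [simple, nested])
```

In this sketch a nested query contributes more to the batch loss than a simple one with the same raw loss, and also appears more often in the training set, which is the general effect the abstract describes.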
Pages: 45391-45405
Number of pages: 14