Data generation using sequence-to-sequence

Cited by: 0
Authors
Joshi, Akshat [1 ]
Mehta, Kinal [1 ]
Gupta, Neha [1 ]
Valloli, Varun Kannadi [1 ]
Affiliations
[1] C-DAC, GIST Group, Pune, Maharashtra, India
Source
2018 IEEE RECENT ADVANCES IN INTELLIGENT COMPUTATIONAL SYSTEMS (RAICS), 2018
Keywords
Sequence2Sequence; NLP; transliteration; LSTM; encoder; decoder
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Sequence-to-sequence models have shown great promise on problems such as Neural Machine Translation (NMT), text summarization, and paraphrase generation. Deep Neural Networks (DNNs) work well with large, labeled training sets, but in sequence-to-sequence problems the mapping becomes much harder because of differences in syntax, semantics, and length. Moreover, the usage of DNNs is constrained by the fixed dimensionality of the input and output, which does not hold for most Natural Language Processing (NLP) problems. Our primary focus was to build transliteration systems for Indian languages. For Indian languages, monolingual corpora are abundantly available, but parallel corpora that can be directly applied to the transliteration problem are scarce. With the available parallel corpus, we could only build weak models. We propose a system that leverages the monolingual corpus to generate a clean, high-quality parallel corpus for transliteration, which is then iteratively used to tune the existing weak transliteration models. Our results support the hypothesis that the generation of clean data can be validated objectively by evaluating the models alongside the efficiency of the system at generating data in each iteration.
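The iterative scheme the abstract describes (use a weak model to label monolingual words, keep only confident pairs, retrain, repeat) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the paper uses an LSTM encoder-decoder, while the toy `train`, `transliterate`, and `bootstrap` functions, the per-character majority-vote model, and the confidence threshold below are all assumptions made for illustration.

```python
# Toy sketch of iterative self-training for transliteration:
# a weak model generates candidate (source, target) pairs from a
# monolingual corpus; high-confidence pairs are added to the parallel
# corpus and the model is retrained on the enlarged set.
from collections import Counter, defaultdict


def train(pairs):
    """Learn a per-character majority-vote mapping from equal-length pairs."""
    counts = defaultdict(Counter)
    for src, tgt in pairs:
        for s, t in zip(src, tgt):
            counts[s][t] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def transliterate(model, word):
    """Return (output, confidence); confidence = fraction of known characters."""
    out = "".join(model.get(ch, ch) for ch in word)
    known = sum(ch in model for ch in word)
    return out, known / len(word)


def bootstrap(seed_pairs, mono_corpus, iterations=3, threshold=1.0):
    """Iteratively grow the parallel corpus with confident self-labels."""
    pairs = list(seed_pairs)
    for _ in range(iterations):
        model = train(pairs)
        for word in mono_corpus:
            out, conf = transliterate(model, word)
            if conf >= threshold and (word, out) not in pairs:
                pairs.append((word, out))
    return train(pairs), pairs
```

Evaluating the retrained model after each iteration, together with how many candidate pairs clear the threshold, gives the objective validation signal the abstract refers to.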
Pages: 108-112
Page count: 5