Automated Data Mapping Based on FastText and LSTM for Business Systems

Cited by: 0
Authors
Liu, Zhibin [1]
Hu, Huijun [1]
Affiliation
[1] Kingdee Int Software Grp Co Ltd, Kingdee Res, Shenzhen, Peoples R China
Source
COGNITIVE COMPUTING, ICCC 2022 | 2022 / Vol. 13734
Keywords
Data mapping; Text classification; FastText; Word vectors; Long short-term memory;
DOI
10.1007/978-3-031-23585-6_7
Chinese Library Classification number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
With the continuous development of information technology, processing massive amounts of information has become an important problem in business systems. However, the metadata from different business systems lacks a unified, standardized description method, and mapping data manually is highly inefficient. An automated data mapping method is therefore necessary. In this paper, we treat data mapping as a text classification problem for two reasons: 1) text classification technology has become increasingly mature in the field of natural language processing (NLP) and is well suited to processing massive data; 2) a large amount of heterogeneous mapping data can be treated as text. To implement automated data mapping, we propose a classification model based on FastText and long short-term memory (LSTM) for data mapping in business systems. Based on the observed characteristics of mapping data in business systems, we first use FastText to learn word representations that contain semantic information, and then adopt an LSTM model to automatically extract features for text classification. Experimental results show that the proposed method can automatically classify mapping data in business systems with reasonable quality.
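The two-stage pipeline described in the abstract (subword-aware word vectors, then an LSTM that reads them in order and feeds a classifier) can be sketched roughly as follows. This is a minimal, untrained illustration in pure Python, not the authors' implementation: the label set, the dimensions, the random weights, the FNV-style hash, and all function names are assumptions made for the example. The word vector here is a simplified FastText-style average of hashed character n-gram embeddings.

```python
import math
import random

random.seed(0)

DIM = 8          # embedding size (illustrative assumption)
HIDDEN = 6       # LSTM hidden size (illustrative assumption)
BUCKETS = 1000   # number of hashed n-gram buckets (illustrative assumption)
LABELS = ["customer_name", "order_date", "amount"]  # hypothetical target fields


def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these from data."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def char_ngrams(word, n_min=3, n_max=5):
    """Character n-grams of a word wrapped in boundary markers, FastText style."""
    w = f"<{word}>"
    grams = [w[i:i + n] for n in range(n_min, n_max + 1) for i in range(len(w) - n + 1)]
    return grams or [w]  # fall back to the whole token if it is too short


def bucket(ngram):
    """Deterministic FNV-1a style hash of an n-gram into a bucket index."""
    h = 2166136261
    for b in ngram.encode("utf-8"):
        h = ((h ^ b) * 16777619) & 0xFFFFFFFF
    return h % BUCKETS


NGRAM_EMB = rand_matrix(BUCKETS, DIM)


def word_vector(word):
    """Simplified subword representation: mean of hashed n-gram embeddings."""
    rows = [NGRAM_EMB[bucket(g)] for g in char_ngrams(word)]
    return [sum(col) / len(rows) for col in zip(*rows)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# One weight matrix per gate, applied to the concatenation [x; h].
GATES = {g: rand_matrix(HIDDEN, DIM + HIDDEN) for g in "ifog"}
W_OUT = rand_matrix(len(LABELS), HIDDEN)


def lstm_step(x, h, c):
    """One standard LSTM cell update (biases omitted for brevity)."""
    z = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(v) for v in matvec(GATES["i"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(GATES["f"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(GATES["o"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(GATES["g"], z)]  # candidate cell state
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def classify(text):
    """Run the LSTM over the word vectors and score each label."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for word in text.lower().split():
        h, c = lstm_step(word_vector(word), h, c)
    return dict(zip(LABELS, softmax(matvec(W_OUT, h))))


if __name__ == "__main__":
    print(classify("cust name"))
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows how a field description flows from character n-grams to a label distribution. In practice the embeddings would be pretrained with FastText and the LSTM and output layer trained on labeled mapping data.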
Pages: 75-82
Page count: 8