Analysis of optimization algorithms for stability and convergence for natural language processing using deep learning algorithms

Cited by: 0
Authors
Gangadhar C. [1 ]
Moutteyan M. [2 ]
Vallabhuni R.R. [3 ]
Vijayan V.P. [4 ]
Sharma N. [5 ]
Theivadas R. [6 ]
Affiliations
[1] Department of Electronics & Communication Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Andhra Pradesh, Vijayawada
[2] Vellore Institute of Technology, Vellore
[3] Bayview Asset Management, LLC, FL
[4] Department of CSE, Principal, Mangalam College of Engineering, Kottayam
[5] Department of Computer Science and Engineering, Galgotias University, Uttar Pradesh, Greater Noida
[6] Digialtic Technologies, Chennai
Source
Measurement: Sensors | 2023 / Vol. 27
Keywords
Convergence; Deep learning (DL); Neural networks; Optimization algorithms; Stability
DOI
10.1016/j.measen.2023.100784
Abstract
A boom in the application of deep learning (DL) models over the past several years has advanced the discipline of natural language processing (NLP). This survey first gives a brief introduction to the theoretical foundations of artificial intelligence and NLP, then sorts through recent research and compiles many pertinent contributions, and finally introduces optimization theory and techniques for neural network training. First, we classify and discuss the various facets of NLP and the applications that profit from deep learning. Second, we review generic language modelling methods used in pre-training neural networks, such as BERT, RoBERTa, ALBERT and DeBERTa. Third, we compare the different language models on GLUE, MNLI and SQuAD in terms of accuracy and efficiency to identify the best optimization. © 2023 The Authors
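To illustrate the kind of optimizer comparison the abstract refers to, the sketch below trains the same small network with SGD, Adam and AdamW and reports the final training loss. It is a minimal sketch only: the synthetic data, toy architecture and learning rates are illustrative assumptions and are not taken from the paper.

# Minimal sketch (not from the paper): comparing optimizer convergence and
# stability on a toy classification task in PyTorch. Data, model and
# hyperparameters are illustrative assumptions, not the authors' setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic "bag-of-words" features standing in for tokenized text.
X = torch.randn(512, 100)
y = (X[:, :10].sum(dim=1) > 0).long()  # a simple separable label signal

def make_model():
    return nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 2))

def train(optimizer_name, epochs=50):
    model = make_model()
    if optimizer_name == "sgd":
        opt = torch.optim.SGD(model.parameters(), lr=0.1)
    elif optimizer_name == "adam":
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    else:
        opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return loss.item()

for name in ["sgd", "adam", "adamw"]:
    print(f"{name}: final training loss = {train(name):.4f}")

A comparison of this shape (same model, same data, different update rules) is one common way to probe the stability and convergence behaviour the title refers to; the paper itself evaluates pre-trained language models on GLUE, MNLI and SQuAD rather than this toy setting.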