Parallel hardware for faster morphological analysis

Cited by: 4
Authors
Damaj, Issam [1]
Imdoukh, Mahmoud [1]
Zantout, Rached [2]
Affiliations
[1] Amer Univ Kuwait, Dept Elect & Comp Engn, POB 3323, Safat 13034, Kuwait
[2] Rafik Hariri Univ, Dept Elect & Comp Engn, POB 10, Damour 2010, Chouf, Lebanon
Keywords
Morphological analysis; NLP; Performance; Hardware design; FPGAs; Algorithms development; Devices
DOI
10.1016/j.jksuci.2017.07.003
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Morphological analysis of the Arabic language is computationally intensive, involves numerous forms and rules, and is intrinsically parallel. The investigation presented in this paper confirms that the effective development of parallel algorithms and the derivation of corresponding processors in hardware enable implementations with appealing performance characteristics. The presented developments of parallel hardware comprise the application of a variety of algorithm modelling techniques, strategies for concurrent processing, and the creation of pioneering hardware implementations that target modern programmable devices. The investigation includes the creation of a linguistic-based stemmer for Arabic verb root extraction with extended infix processing to attain high levels of accuracy. The implementations comprise three versions: software, a non-pipelined processor, and a pipelined processor with high throughput. The targeted systems are high-performance multi-core processors for the software implementations and high-end Field Programmable Gate Array systems for the hardware implementations. The investigation includes a thorough evaluation of the methodology, as well as performance and accuracy analyses of the developed software and hardware implementations. The developed processors achieved significant speedups over the software implementation. The developed stemmer for verb root extraction with infix processing attained accuracies of 87% and 90.7% when analyzing the text of the Holy Quran and of its Chapter 29, Surat Al-Ankabut, respectively. (C) 2017 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University.
Pages: 531-546
Number of pages: 16