Matching of hand-drawn flowchart, pseudocode, and English description using transfer learning

Citations: 0
Authors
Ghosh, Sagarika [1, 2]
Pratihar, Sanjoy [1]
Chatterji, Sanjay [1]
Basu, Anupam [3]
Affiliations
[1] Indian Inst Informat Technol Kalyani, Comp Sci & Engn, Kalyani 741235, West Bengal, India
[2] Univ Engn & Management Jaipur, Comp Sci & Engn, Jaipur 303807, Rajasthan, India
[3] Natl Inst Technol Durgapur, Comp Sci & Engn, Durgapur 713209, West Bengal, India
Keywords
Hand-drawn flowchart; Pseudocode; Text description; S-DistilBERT; Ruleset; Embedding; Transfer learning; Similarity matching; OPTIMAL POLYGONAL-APPROXIMATION; DOMINANT POINTS; ALGORITHM
DOI
10.1007/s11042-023-14346-9
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Algorithms play an important role in problem solving, but they can be challenging for computer novices or machines to comprehend. A textual explanation is therefore often provided to illustrate the algorithm. To understand an algorithm, a method is needed to find or generate the corresponding text description, and vice versa. This paper matches an algorithm in a variety of forms, such as pseudocode and hand-drawn flowchart, with illustrative text written in English to facilitate a thorough understanding of the algorithm. The experiment includes a proposed set of rules for generating pseudocode from a hand-drawn flowchart and a proposed S-DistilBERT-based transfer learning method to determine the similarity match score between multiple forms of an algorithm and its text description. Basic block and line identification, as well as OCR, are used to characterize the hand-drawn flowcharts. The experimental results show that we can generate the equivalent pseudocode in 85% of cases, and that our fine-tuned S-DistilBERT model finds the matching text for the existing pseudocode with 75.59% accuracy and for the generated pseudocode with 74.57% accuracy. We also find the appropriate description for an algorithm among the top five matches in 30 out of 50 cases. The rules are found to be adequate for non-recursive flowcharts.
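As a rough illustration of the top-five similarity matching mentioned in the abstract, the sketch below ranks candidate descriptions by cosine similarity between embedding vectors. It is not the authors' implementation: the vectors are made-up toy values standing in for sentence embeddings (the fine-tuned S-DistilBERT model is not reproduced here), and the names `cosine_similarity` and `top_k_matches` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Define similarity with a zero vector as 0 to avoid dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_matches(query, candidates, k=5):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (assumed values, for illustration only).
pseudocode_vec = [0.9, 0.1, 0.3]
description_vecs = {
    "linear search": [0.8, 0.2, 0.4],
    "bubble sort": [0.1, 0.9, 0.2],
    "factorial": [0.3, 0.3, 0.9],
}

ranked = top_k_matches(pseudocode_vec, description_vecs, k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In a real pipeline, the toy vectors would be replaced by embeddings of the pseudocode and of each candidate English description, and a match would count as correct when the true description appears within the returned top `k`.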
Pages: 27027-27055
Page count: 29