Analysis and safety engineering of fuzzy string matching algorithms

Cited by: 5
Authors
Pikies, Malgorzata [1]
Ali, Junade [1]
Affiliations
[1] Cloudflare, London, England
Keywords
String similarity; Fuzzy string matching; Safety engineering; Natural language processing; Binary classification; Neural network;
DOI
10.1016/j.isatra.2020.10.014
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In this paper we explore fuzzy string matching in an automatic ticket classification and processing system. We compare the performance of the following string similarity algorithms: Longest Common Subsequence (LCS), Dice coefficient, Cosine Similarity, Levenshtein (edit) distance and Damerau distance. Through optimisation, we accomplished a 15% improvement in the ratio of false positive to true positive classifications over the existing approach used by a customer support system for free customers. To introduce greater safety, we complement the fuzzy string matching algorithms with a second-layer Convolutional Neural Network (CNN) binary classifier, improving the keyword classification ratio for two ticket categories by a relative 69% and 78%. Such an approach allows classification to be applied only where a desired level of safety is achieved, such as in instances where answers are automated. © 2020 ISA. Published by Elsevier Ltd. All rights reserved.
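As an illustration of three of the similarity measures named in the abstract, the following is a minimal Python sketch of Levenshtein distance, LCS length, and the Dice coefficient. It is not the authors' implementation: the function names are hypothetical, and the use of character bigram sets for the Dice coefficient is an assumption, since the abstract does not say how strings are tokenised.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient over sets of character bigrams (an assumed
    tokenisation): 2 * |A & B| / (|A| + |B|)."""
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a and not bigrams_b:
        # Strings too short to form bigrams: fall back to exact equality.
        return 1.0 if a == b else 0.0
    overlap = len(bigrams_a & bigrams_b)
    return 2.0 * overlap / (len(bigrams_a) + len(bigrams_b))


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(lcs_length("ABCBDAB", "BDCABA"))
    print(dice_coefficient("night", "nacht"))
```

In a keyword-matching setting, scores like these would typically be compared against a tuned threshold before a ticket is assigned to a category. The abstract does not state the thresholds used.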
Pages: 1-8
Number of pages: 8