Utilizing structural metrics from knowledge graphs to enhance the robustness quantification of large language models

Citations: 0
Authors
Haque, Mohd Ariful [1]
Kamal, Marufa [2]
George, Roy [1]
Gupta, Kishor Datta [1]
Affiliations
[1] Clark Atlanta Univ, Dept Cyber Phys Syst, 223 James P Brawley Dr SW, Atlanta, GA 30314 USA
[2] BRAC Univ, Dept Comp Sci & Engn, Dhaka 1212, Bangladesh
Keywords
Knowledge graph; LLM; Structural metrics; CodeLlama; Mistral; Vicuna; LARGE-SCALE
DOI
10.1007/s41060-024-00643-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge graphs (KGs) play a critical role in organizing large stores of unstructured information into structured formats. This structured information is then accessible through SPARQL queries or graph libraries based on their structure. KGs enhance search, power AI systems, and facilitate knowledge discovery across domains. In this research, we explore the capabilities of different large language models (LLMs) like CodeLlama, Mistral, and Vicuna, which are recognized for text generation, in handling textual information tasks for constructing knowledge graphs with structured data. Utilizing these LLMs, we generate class descriptions for all the classes of well-known KGs like DBpedia, YAGO, and Google Knowledge Graph. Using these class descriptions, we have extracted RDF triples and used different preprocessing techniques for better refinement and extraction of the graph triples from the generated result. These extracted triples are used for the graph ontology creation. Highlighting the contribution of LLMs to structured graph formation, our study includes a comparison of the constructed KGs using the three LLMs with the existing Knowledge Graphs. Later, these KGs are evaluated using six structural quality metrics encompassing both class and property-related information crucial for KG formation. Our insights prove valuable for researchers exploring these domains, offering guidance on overcoming challenges and maximizing the potential of large language models in knowledge graph construction, text generation, and text extraction.
Pages: 21