A Systematic Comparison Between Open- and Closed-Source Large Language Models in the Context of Generating GDPR-Compliant Data Categories for Processing Activity Records

Cited by: 0
Authors
von Schwerin, Magdalena [1 ]
Reichert, Manfred [1 ]
Affiliations
[1] Institute of Databases and Information Systems, Ulm University, Ulm
Source
Future Internet | 2024 / Vol. 16 / No. 12
Keywords
GDPR documentation; large language model; natural language processing
DOI
10.3390/fi16120459
Abstract
This study investigates the capabilities of open-source Large Language Models (LLMs) in automating GDPR compliance documentation, specifically in generating data categories—types of personal data (e.g., names, email addresses)—for processing activity records, a document required by the General Data Protection Regulation (GDPR). By comparing four state-of-the-art open-source models with the closed-source GPT-4, we evaluate their performance using benchmarks tailored to GDPR tasks: a multiple-choice benchmark testing contextual knowledge (evaluated by accuracy and F1 score) and a generation benchmark evaluating structured data generation. In addition, we conduct four experiments using context-augmenting techniques such as few-shot prompting and Retrieval-Augmented Generation (RAG). We evaluate these on performance metrics such as latency, structure, grammar, validity, and contextual understanding. Our results show that open-source models, particularly Qwen2-7B, achieve performance comparable to GPT-4, demonstrating their potential as cost-effective and privacy-preserving alternatives. Context-augmenting techniques show mixed results, with RAG improving performance for known categories but struggling with categories not contained in the knowledge base. Open-source models excel at structured legal tasks, although challenges remain in handling ambiguous legal language and unstructured scenarios. These findings underscore the viability of open-source models for GDPR compliance, while highlighting the need for fine-tuning and improved context augmentation to address complex use cases. © 2024 by the authors.
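The abstract reports accuracy and F1 score for the multiple-choice benchmark. A minimal, stdlib-only sketch of how such scoring could work, assuming gold and predicted answer labels per question (the labels below are hypothetical, and F1 is macro-averaged over answer options, which may differ from the paper's exact setup):

```python
def accuracy(gold, pred):
    """Fraction of questions where the predicted option matches the gold option."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Per-label F1 (precision/recall over answer options), averaged across labels."""
    labels = set(gold) | set(pred)
    f1s = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical benchmark answers (options A/B/C):
gold = ["A", "B", "C", "A"]
pred = ["A", "B", "A", "A"]
print(accuracy(gold, pred))  # 0.75
print(macro_f1(gold, pred))  # 0.6
```

Macro-averaging treats each answer option equally regardless of frequency, which is why it can diverge from accuracy when one option dominates.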