A Systematic Comparison Between Open- and Closed-Source Large Language Models in the Context of Generating GDPR-Compliant Data Categories for Processing Activity Records

Cited by: 0
Authors
von Schwerin, Magdalena [1 ]
Reichert, Manfred [1 ]
Affiliations
[1] Institute of Databases and Information Systems, Ulm University, Ulm
Source
Future Internet | 2024 / Vol. 16 / No. 12
Keywords
GDPR documentation; large language model; natural language processing
DOI
10.3390/fi16120459
Abstract
This study investigates the capabilities of open-source Large Language Models (LLMs) in automating GDPR compliance documentation, specifically in generating data categories—types of personal data (e.g., names, email addresses)—for processing activity records, a document required by the General Data Protection Regulation (GDPR). By comparing four state-of-the-art open-source models with the closed-source GPT-4, we evaluate their performance using benchmarks tailored to GDPR tasks: a multiple-choice benchmark testing contextual knowledge (evaluated by accuracy and F1 score) and a generation benchmark evaluating structured data generation. In addition, we conduct four experiments using context-augmenting techniques such as few-shot prompting and Retrieval-Augmented Generation (RAG). We evaluate these on performance metrics such as latency, structure, grammar, validity, and contextual understanding. Our results show that open-source models, particularly Qwen2-7B, achieve performance comparable to GPT-4, demonstrating their potential as cost-effective and privacy-preserving alternatives. Context-augmenting techniques show mixed results, with RAG improving performance for known categories but struggling with categories not contained in the knowledge base. Open-source models excel at structured legal tasks, although challenges remain in handling ambiguous legal language and unstructured scenarios. These findings underscore the viability of open-source models for GDPR compliance, while highlighting the need for fine-tuning and improved context augmentation to address complex use cases. © 2024 by the authors.
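The abstract reports accuracy and F1 score for the multiple-choice benchmark. A minimal, stdlib-only sketch of how such scoring could work, assuming gold and predicted answer labels per question (the labels below are hypothetical, and F1 is macro-averaged over answer options, which may differ from the paper's exact setup):

```python
def accuracy(gold, pred):
    """Fraction of questions where the predicted option matches the gold option."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Per-label F1 (precision/recall over answer options), averaged across labels."""
    labels = set(gold) | set(pred)
    f1s = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Hypothetical benchmark answers (options A/B/C):
gold = ["A", "B", "C", "A"]
pred = ["A", "B", "A", "A"]
print(accuracy(gold, pred))  # 0.75
print(macro_f1(gold, pred))  # 0.6
```

Macro-averaging treats each answer option equally regardless of frequency, which is why it can diverge from accuracy when one option dominates.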