50 entries in total
- [21] Benchmarking Large Language Models in Retrieval-Augmented Generation. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, pp. 17754-17762
- [22] SEED-Bench: Benchmarking Multimodal Large Language Models. 2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, pp. 13299-13308
- [23] Quantifying Bias in Agentic Large Language Models: A Benchmarking Approach. 2024 5TH INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE, ICTC 2024, 2024, pp. 349-353
- [25] RMCBENCH: Benchmarking Large Language Models' Resistance to Malicious Code. PROCEEDINGS OF 2024 39TH ACM/IEEE INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE 2024, 2024, pp. 995-1006
- [26] Using Large Language Models for Robot-Assisted Therapeutic Role-Play: Factuality is not enough! PROCEEDINGS OF THE 6TH CONFERENCE ON ACM CONVERSATIONAL USER INTERFACES, CUI 2024, 2024
- [27] Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, pp. 14820-14835
- [30] Benchmarking Large Language Models for Automated Verilog RTL Code Generation. 2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023