ForensicLLM: A local large language model for digital forensics
Authors:
Sharma, Binaya [1,3]
Ghawaly, James [2,3]
Mccleary, Kyle [2,3]
Webb, Andrew M. [3]
Baggili, Ibrahim [1,3]
Affiliations:
[1] Ctr Computat & Technol, Baggili Truth BiT Lab, Baton Rouge, LA 70808 USA
[2] Ctr Computat & Technol, Intersectional AI & Secur AISx Lab, Baton Rouge, LA USA
[3] Louisiana State Univ, Div Comp Sci & Engn, Baton Rouge, LA USA
Keywords:
Digital forensics;
Digital investigations;
Generative AI;
Large language model (LLM);
Admissibility of evidence;
ChatGPT;
LLaMA;
Retrieval augmented generation (RAG);
Fine-tuning;
DOI:
10.1016/j.fsidi.2025.301872
Chinese Library Classification (CLC) number:
TP [Automation technology, computer technology];
Discipline classification code:
0812;
Abstract:
Large Language Models (LLMs) excel in diverse natural language tasks but often lack specialization for fields like digital forensics. Their reliance on cloud-based APIs or high-performance computers restricts use in resource-limited environments, and response hallucinations could compromise their applicability in forensic contexts. We introduce ForensicLLM, a 4-bit quantized LLaMA-3.1-8B model fine-tuned on Q&A samples extracted from digital forensic research articles and curated digital artifacts. Quantitative evaluation showed that ForensicLLM outperformed both the base LLaMA-3.1-8B model and the Retrieval Augmented Generation (RAG) model. ForensicLLM accurately attributes sources 86.6% of the time, with 81.2% of the responses including both authors and title. Additionally, a user survey conducted with digital forensics professionals confirmed significant improvements of ForensicLLM and the RAG model over the base model. ForensicLLM showed strength in the "correctness" and "relevance" metrics, while the RAG model was appreciated for providing more detailed responses. These advancements mark ForensicLLM as a transformative tool in digital forensics, elevating model performance and source attribution in critical investigative contexts.