CyberSecEval V2: first update
Summary: Update link for the new paper

Reviewed By: YueLi28

Differential Revision: D56308610

fbshipit-source-id: eefee9149cbc3455e9980c8152b30f311dbdc5ca
committed by Facebook GitHub Bot
parent 59c6ead904
commit f81aea914d
@@ -1,6 +1,6 @@
 # Introduction

-This repository contains the implementation of CyberSecEval 2, extending the original [CyberSecEval](https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/). CyberSecEval 2 is a comprehensive benchmark suite for evaluating the cybersecurity risks of Large Language Models (LLMs). Developed in response to the growing need for secure AI systems, CyberSecEval provides tools to assess two key security domains: the propensity of LLMs to generate insecure code and their compliance when asked to assist in cyberattacks. The benchmarks are based on industry standards such as the Common Weakness Enumeration (CWE) and the MITRE ATT&CK framework, and have been applied to popular LLMs including Llama2, Llama3, codeLlama, and OpenAI GPT models. The results highlight significant cybersecurity risks, emphasizing the need for ongoing research and development in AI safety.
+This repository contains the implementation of [CyberSecEval 2](https://ai.meta.com/research/publications/cyberseceval-2-a-wide-ranging-cybersecurity-evaluation-suite-for-large-language-models/), extending the original [CyberSecEval](https://ai.meta.com/research/publications/purple-llama-cyberseceval-a-benchmark-for-evaluating-the-cybersecurity-risks-of-large-language-models/). CyberSecEval 2 is a comprehensive benchmark suite for evaluating the cybersecurity risks of Large Language Models (LLMs). Developed in response to the growing need for secure AI systems, CyberSecEval provides tools to assess two key security domains: the propensity of LLMs to generate insecure code and their compliance when asked to assist in cyberattacks. The benchmarks are based on industry standards such as the Common Weakness Enumeration (CWE) and the MITRE ATT&CK framework, and have been applied to popular LLMs including Llama2, Llama3, codeLlama, and OpenAI GPT models. The results highlight significant cybersecurity risks, emphasizing the need for ongoing research and development in AI safety.

 The repository includes seven types of benchmarks: