llm-attacks
llm-attacks is an open-source repository for universal and transferable adversarial attacks on aligned language models. It provides a fast, easy-to-use implementation of the Greedy Coordinate Gradient (GCG) algorithm for jailbreaking language models.