Maksym Andriushchenko

Paper_jbb

March 28, 2024

2024

Our new benchmark JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models is available online (see a Twitter/X thread for a summary). We prioritize reproducibility, support adaptive attacks, and test-time defenses.