Paper_jbb | Maksym Andriushchenko

Our new benchmark, "JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models", is available online (see a Twitter/X thread for a summary). We prioritize reproducibility and support both adaptive attacks and test-time defenses.