How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark
March 18, 2025