New AI Jailbreak Method ‘Bad Likert Judge’ Boosts Attack Success Rates by Over 60%

2025-01-03 12:01

Cybersecurity researchers have shed light on a new jailbreak technique that could be used to get past a large language model’s (LLM) safety guardrails and produce potentially harmful or malicious responses.
The multi-turn (aka many-shot) attack strategy has been codenamed Bad Likert Judge by Palo Alto Networks Unit 42 researchers Yongzhe Huang, Yang Ji, Wenjun Hu, Jay Chen, Akshata Rao, and

This article has been indexed from The Hacker News

