We examine “Deceptive Delight,” an LLM jailbreaking technique that mixes harmful topics with benign ones to trick models, achieving a high success rate.
The post Deceptive Delight: Jailbreak LLMs Through Camouflage and Distraction appeared first on Unit 42.