Meta’s AI safety system defeated by the space bar

2024-07-29 22:07

‘Ignore previous instructions’ thwarts Prompt-Guard model if you just add some good ol’ ASCII code 32

Meta’s machine-learning model for detecting prompt injection attacks – special prompts to make neural networks behave inappropriately – is itself vulnerable to, you guessed it, prompt injection attacks.…
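As a rough illustration of the trick the headline describes, here is a minimal sketch. It assumes the bypass amounts to putting a space (ASCII code 32) between the characters of the prompt; the summary above does not spell out the exact transformation. The names `space_out` and `toy_detector` are hypothetical, and the detector is a naive substring check standing in for a classifier, not Prompt-Guard itself.

```python
SPACE = chr(32)  # ASCII code 32 is the ordinary space character


def space_out(prompt: str) -> str:
    """Put one space between every non-whitespace character.

    Assumed form of the bypass; the source only says that adding
    spaces defeats the detector.
    """
    return SPACE.join(ch for ch in prompt if not ch.isspace())


def toy_detector(text: str) -> bool:
    """Hypothetical stand-in for an injection filter (NOT Prompt-Guard).

    Flags text containing a known attack phrase, case-insensitively.
    """
    return "ignore previous instructions" in text.lower()


attack = "Ignore previous instructions"
spaced = space_out(attack)

print(spaced)                # I g n o r e p r e v i o u s ...
print(toy_detector(attack))  # True: the plain phrase is caught
print(toy_detector(spaced))  # False: the spaced-out phrase slips past
```

The toy check fails for the same broad reason a pattern-based filter can: the spaced-out text no longer resembles what the filter was built to recognize, although a human or a downstream model may still read it as the same instruction.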

This article has been indexed from The Register – Security

Read the original article:

Meta’s AI safety system defeated by the space bar
