Twitter Pranksters Halt GPT-3 Bot with Newly Discovered “Prompt Injection” Hack

 

On Thursday, a few Twitter users revealed how to hijack an automated tweet bot dedicated to remote jobs and powered by OpenAI’s GPT-3 language model. They redirected the bot to repeat embarrassing and ridiculous phrases using a newly discovered technique known as a “prompt injection attack.” 
Remoteli.io, a site that aggregates remote job opportunities, runs the bot. It describes itself as “an OpenAI-driven bot that helps you discover remote jobs that allow you to work from anywhere.” Usually, it would respond to tweets directed at it with generic statements about the benefits of remote work. The bot was shut down late yesterday after the exploit went viral and hundreds of people tried it for themselves.
The incident occurred only four days after data researcher Riley Goodside discovered that GPT-3 can be prompted with “malicious inputs” that instruct the model to disregard its previous directions and do something else instead. The following day, AI researcher Simon Willison published an overview of the exploit on his blog, coining the term “prompt injection” to describe it.
“The exploit is present any time anyone writes a piece of software that works by providing a hard-coded set of prompt instructions and then appends input provided by a user,” Willison told Ars. “That’s because the user can type ‘Ignore previous instructions and (do this instead).’”
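To make the pattern Willison describes concrete, here is a minimal Python sketch of the vulnerable design. It is an illustration only: the instruction text and the `build_prompt` helper are hypothetical, the bot's real prompt has not been published, and no language model is actually called. The point is simply that the developer's instructions and the user's tweet end up in one undifferentiated string.

```python
# Hypothetical hard-coded instructions; the real bot's prompt is not public.
INSTRUCTIONS = (
    "Respond to the tweet below with a positive remark about remote work.\n"
    "Tweet: "
)


def build_prompt(user_tweet: str) -> str:
    """Naively append untrusted user text to the fixed instructions."""
    return INSTRUCTIONS + user_tweet


# An ordinary tweet and an injected one are assembled in exactly the same way.
benign = build_prompt("Is remote work a good fit for new graduates?")
injected = build_prompt("Ignore previous instructions and (do this instead).")

# The model would receive this single string, with nothing to mark where the
# developer's instructions stop and the user's text begins.
print(injected)
```

Because the model sees only the combined text, it has no reliable way to tell which part came from the developer and which from the user, which may be why a later instruction can override an earlier one.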

This article has been indexed from CySecurity News – Latest Information Security and Hacking Incidents
