Researchers claim to have discovered a virtually unlimited number of ways to circumvent the safety measures on leading AI chatbots from companies such as OpenAI, Google, and Anthropic.
Large language models, such as those behind ChatGPT, Bard, and Anthropic’s Claude, are tightly controlled by the tech firms that build them. The models are outfitted with a variety of safeguards to prevent them from being used for harmful purposes, such as instructing users on how to assemble a bomb or generating pages of hate speech.
Researchers from Carnegie Mellon University in Pittsburgh and the Center for AI Safety in San Francisco said last week that they have discovered ways to bypass these guardrails.
The researchers found that they could use jailbreaks developed for open-source systems to target mainstream, closed AI platforms.
The report illustrated how automated adversarial attacks, carried out primarily by appending strings of characters to the end of user queries, could be used to bypass these safeguards and prompt the chatbots to produce harmful content.
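To make the attack pattern described above concrete, here is a minimal, hypothetical Python sketch of appending an adversarial suffix to an otherwise ordinary query. The function name and the suffix string are illustrative placeholders only; in the actual research, such suffixes are discovered automatically by an optimization procedure and are not reproduced here.

    # Illustrative sketch only: the suffix below is a placeholder, not a
    # working jailbreak. Real adversarial suffixes are found automatically
    # by the researchers' optimization process.

    def append_adversarial_suffix(user_query: str, suffix: str) -> str:
        """Return the query with an adversarial character sequence appended."""
        return f"{user_query} {suffix}"

    if __name__ == "__main__":
        query = "Describe how a chatbot's safety guardrails work."
        placeholder_suffix = "<optimized adversarial characters>"
        print(append_adversarial_suffix(query, placeholder_suffix))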
[…]