Model was fine-tuned to write vulnerable software – then suggested enslaving humanity
Computer scientists have found that fine-tuning a notionally safe large language model to do one thing badly can degrade the AI’s output across a range of unrelated topics.…
This article has been indexed from The Register – Security