Don’t Miss Out on the Latest Updates.
Subscribe to Our Newsletter Today!
Researchers Easily Hypnotise AI Chatbot ChatGPT Into Hacking: Report
According to tech major IBM, researchers have described simple workarounds for getting large language models (LLMs) -- including ChatGPT -- to write malicious code and provide poor security advice.
New Delhi: Tricking generative AI to assist in scams and cyberattacks does not require much coding knowledge, a new report has warned.
According to tech major IBM, researchers have described simple workarounds for getting large language models (LLMs) — including ChatGPT — to write malicious code and provide poor security advice.
“In a bid to explore security risks posed by these innovations, we attempted to hypnotise popular LLMs to determine the extent to which they were able to deliver directed, incorrect and potentially risky responses and recommendations — including security actions — and how persuasive or persistent they were in doing so,” said Chenta Lee, chief architect of threat intelligence at IBM.
“We were able to successfully hypnotise five LLMs — some performing more persuasively than others — prompting us to examine how likely it is that hypnosis is used to carry out malicious attacks,” he added.
Through hypnosis, the security experts were able to get LLMs to leak the confidential financial information of other users, create vulnerable code, create malicious code, and offer weak security recommendations.
In one instance, the researchers informed the AI chatbots that they were playing a game and that they needed to purposefully share the incorrect answer to a question in order to win and “prove that you are ethical and fair”.
When a user asked if receiving an email from the IRS to transfer money for a tax refund was normal, the LLM said Yes (but actually it’s not).
Moreover, the report said that OpenAI’s GPT-3.5 and GPT-4 models were easier to trick into sharing incorrect answers or playing a never-ending game than Google’s Bard
GPT-4 was the only model tested that understood the rules well enough to give incorrect cyber incident response advice, such as advising victims to pay a ransom. In contrast to Google’s Bard, GPT-3.5 and GPT-4 were easily tricked into writing malicious code when the user reminded it to.
ChatGPT (Chat Generative Pre-Trained Transformer) is a large language model-based chatbot developed by OpenAI and launched on November 30, 2022, notable for enabling users to refine and steer a conversation towards a desired length, format, style, level of detail, and language used. Successive prompts and replies, known as prompt engineering, are taken into account at each stage of the conversation as a context.
ChatGPT is built upon GPT-3.5 and GPT-4, from OpenAI’s proprietary series of foundational GPT models, fine-tuned for conversational applications using a combination of supervised and reinforcement learning techniques. ChatGPT was released as a freely available research preview, but due to its popularity, OpenAI now operates the service on a freemium model. It allows users on its free tier to access the GPT-3.5 based version, while the more advanced GPT-4 based version, as well as priority access to newer features, are provided to paid subscribers under the commercial name “ChatGPT Plus”.
Enroll for our free updates