Petr Kocmich portrait square
13 November 2023

Injecting Malicious Code into Generative AI Can Result in Worrying Data Exploitation

Chatbots such as ChatGPT, Google Bard and others are good servants, but they can also pose a great risk to society. They are vulnerable to attacks based on indirect, instant injection. In other words, they can easily be manipulated into doing things they shouldn’t.

Few people realise that systems that use artificial neural networks and artificial intelligence features that utilise large language model (LLM) technology can be flexibly modulated using special instruction sets. These sets make them susceptible to targeted attacks, known as indirect injection, by which attackers can override the original functionality of the system.

What is injection for LLMs?

An injection into a generative AI means that an attacker inserts malicious code (or an entire set) or data into the LLM input, initiating a sequence of events that can lead to various types of damage. These include, data theft, unauthorised transaction, system corruption, or a cyber attack.

This is considered by IT professionals to be one of the most alarming ways that hackers can exploit large language models. Large corporations as well as smaller start-ups often run or use public AI generation systems without realising the potential risks. This is why cybersecurity experts generally strive to raise awareness about potential risks. “Given that more and more companies use LLMs, and feed personal or corporate data into them on a large scale, this poses a significant risk. That is why attackers have focused on stealing AI Chatbot accounts – see for example Raccoon malware, which has already resulted in around 80,000 ChatGPT accounts being stolen and exposed on the Dark Web. If you use similar services in a corporate environment, we recommend setting up MFA as a priority for these tools, avoid inputting sensitive data into chats, and ideally not storing conversation histories,”explains Petr Kocmich, Global Cyber Security Delivery Manager at Soitron.

How injection for LLMs works

There are several ways an attacker can carry out injection into LLM. One of the most common methods is to insert malicious code or data into the input given to LLM. This input can be anything from text to an image.

For example, an attacker can insert malicious code into a chatbot’s input. If a user then asks the chatbot a question, this malicious code could execute the intended attack. Another method is exploiting a vulnerability in the application using LLM. Simply put, if someone can insert data into LLM, they can potentially manipulate what appears in the output.

Defence against injection for LLM

Security experts regularly demonstrate how indirect instant injection can be used for data manipulation or theft, and in the worst case, for remotely executing code.

There are several measures that developers and system administrators can take to protect systems from injection for LLM. One of the most important, but not trivial, is to properly protect the input to help prevent the insertion of malicious code or data.

 “Although chatbot developers have dedicated teams working on securing LLM systems, identifying and filtering injected code for data manipulation or exfiltration and remote code execution, it is not in their power to identify, treat and prevent all potentially dangerous inputs in advance. In cybersecurity, we have become accustomed to attackers using so-called code obfuscation. Here we are actually dealing with obfuscation at the level of natural language processing (NLP), where the desired functionality can be achieved by various types of written text,” points out Kocmich.

Dedicated solutions are also available

The above is another reason why companies are increasingly restricting the use of publicly available chatbots. Alternatively, they are considering how to integrate or implement generative AI into applications and services. This is because it needs to be understood that the moment information is received from third parties (e.g. from the Internet), the LLM can no longer be trusted. LLMs must therefore always be approached with caution.

In this case, a preferred solution are specialized generative systems specifically designed for corporate use, These systems allow the insertion of enterprise data, which never leaves the corporate environment, and the use of internal locally processed data. “An example of this is our Millada – a service which, although built on the ChatGPT model, works with data locally. Compared to other information technologies and systems, it is also much easier to implement as it works independently of the data processor,” says Kocmich, emphasising that while LLM is a helpful technology, it requires careful and responsible use.

Related articles