Published in AI

LLM’s have the same security problems as 1970s phones

by on20 May 2024


John Draper could exploit it with his Captain Crunch whistle

Security expert Bruce Schneier has pointed out that large language models are open to the same kind of weaknesses that phones in the 1970s had, the kind that John Draper exploited.

Writing in Communications of the ACM, Schneier said both methods are based on data and control using the same channel, so the commands that told the phone switch what to do were sent along the same path as voices.

Other types of prompt injection involve the LLM picking up dodgy instructions in its training data. There's the trick of hiding sneaky commands in web pages.

Schneier said that any LLM app that deals with emails or web pages is asking for trouble. Attackers can slip nasty commands into images and videos, so any system that handles those is at risk. Any LLM app that chats with users who might be up to no good — like a chatbot on a website — is an open target. It's tough to think of an LLM app that isn't at risk somehow.

Stopping individual attacks is a doddle once they're out in the open, but there's a never-ending list of them and no way to block them all at once. The real issue is the same one that was a pain for the old pre-SS7 phone network: mixing up data and commands, he added.

The system will be vulnerable as long as the data — whether it's training data, text prompts, or any other input into the LLM — is tangled up with the commands that direct the LLM. But unlike the phone system, we can't separate an LLM's data from its commands.

Schneier said one of the good things about an LLM is that the data influences the code.

“We want the system to tweak its operation when it gets new training data. We want it to adjust how it works based on the commands we chuck at it. The fact that LLMs adapt based on their input data is a feature, not a flaw. And it's exactly what makes prompt injection possible,” he said.

In the old phone systems, defences were all over the place, but developers were getting better at making LLMs that can stand up to these attacks. We're putting together systems that tidy up inputs, by spotting known prompt-injection attacks and training other LLMs to suss out what those attacks look like, he said.

“In some cases, we can use access-control setups and other internet security measures to limit who can get at the LLM and what the LLM can do. This will put a cap on how much we can trust them. Can you ever really trust an LLM email assistant if it can be fooled into doing something daft? Can you ever rely on a generative-AI traffic-detection video system if someone can flash a cleverly written sign and get it to ignore a certain number plate — and then wipe its memory of ever seeing the sign,” Schneier said.

 

Last modified on 20 May 2024
Rate this item
(1 Vote)

Read more about: