Despite the remarkable outputs of these models, such as generating poetry or viable computer programs, they do not form an accurate model of the world. For instance, a popular type of generative AI model can provide near-perfect turn-by-turn driving directions in New York City without having formed an accurate internal map of the city. However, when streets were closed and detours added, the model's performance plummeted.
Ashesh Rambachan, an assistant professor at the MIT Laboratory for Information and Decision Systems, said: “The question of whether large language models are learning coherent world models is very important if we want to use these techniques to make new discoveries.”
Rambachan, along with Keyon Vafa, a postdoc at Harvard University, and other collaborators, presented their research at the Conference on Neural Information Processing Systems. They focused on transformers, the backbone of LLMs like GPT-4, and developed new metrics to test a transformer's world model.
The team introduced two metrics, sequence distinction and sequence compression. The first checks whether a model recognises that two different states are distinct; the second checks whether it knows that two identical states share the same set of possible next steps. Surprisingly, the transformers that made their choices at random formed more accurate world models, likely because they were exposed to a wider variety of possible next steps during training.
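The paper's formal definitions are more involved, but the intuition behind the two checks can be sketched as follows. This is a minimal illustration, not the researchers' implementation; the helper names (`true_state`, `model_accepts`, `probe_steps`) are hypothetical, and the sketch assumes the underlying task behaves like a deterministic state machine whose ground-truth state is known for any prefix of moves.

```python
# Minimal sketch (not the researchers' code) of the intuition behind the
# sequence distinction and sequence compression checks.

def sequence_distinction(seq_a, seq_b, true_state, model_accepts, probe_steps):
    """Prefixes reaching DIFFERENT true states should be distinguishable:
    the model should accept some probe step after one prefix but not the other."""
    assert true_state(seq_a) != true_state(seq_b)
    return any(
        model_accepts(seq_a + [s]) != model_accepts(seq_b + [s])
        for s in probe_steps
    )


def sequence_compression(seq_a, seq_b, true_state, model_accepts, probe_steps):
    """Prefixes reaching the SAME true state should be interchangeable:
    the model should accept exactly the same probe steps after either prefix."""
    assert true_state(seq_a) == true_state(seq_b)
    return all(
        model_accepts(seq_a + [s]) == model_accepts(seq_b + [s])
        for s in probe_steps
    )
```

In the Othello and navigation settings, the ground-truth state would be the board position or the current intersection, and a transformer would pass only if its accepted continuations respected both checks across many pairs of prefixes.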
“In Othello, if you see two random computers playing rather than championship players, in theory you’d see the full set of possible moves, even the bad moves championship players wouldn’t make,” Vafa said.
The researchers found that although transformers generated accurate directions and valid Othello moves, they failed to form coherent world models. When detours were added to New York City's map, all navigation models failed.
“I was surprised by how quickly the performance deteriorated as soon as we added a detour. If we close just 1 per cent of the possible streets, accuracy immediately plummets from nearly 100 per cent to just 67 per cent,” Vafa said.
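A rough sketch of that kind of stress test, assuming the street network is stored as an adjacency map and that a hypothetical `model_route(start, goal)` function returns the model's proposed route, might look like this (the function names and graph representation are illustrative, not the paper's pipeline):

```python
import random


def close_streets(graph, fraction, seed=0):
    """Return a copy of a street graph (intersection -> set of neighbours)
    with a given fraction of street segments removed, simulating closures."""
    rng = random.Random(seed)
    edges = [(u, v) for u, nbrs in graph.items() for v in nbrs]
    closed = set(rng.sample(edges, int(len(edges) * fraction)))
    return {u: {v for v in nbrs if (u, v) not in closed}
            for u, nbrs in graph.items()}


def route_accuracy(model_route, graph, trips):
    """Fraction of (start, goal) trips for which the model's route ends at the
    goal and only uses streets that still exist in the (possibly perturbed) graph."""
    def valid(route, goal):
        if not route or route[-1] != goal:
            return False
        return all(v in graph.get(u, set()) for u, v in zip(route, route[1:]))

    return sum(valid(model_route(s, g), g) for s, g in trips) / len(trips)
```

Comparing `route_accuracy` on the original graph with the graph returned by `close_streets(graph, 0.01)` is the sort of before-and-after comparison behind the reported drop from nearly 100 per cent to 67 per cent.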
These findings suggest that transformers can perform remarkably well at certain tasks without understanding the rules that govern them. Building LLMs that capture accurate world models will require a different approach.
“Often, we see these models do impressive things and think they must have understood something about the world. I hope we can convince people that this is a question to think very carefully about, and we don’t have to rely on our own intuitions to answer it,” Rambachan said.
The team aims to apply their evaluation metrics to real-world scientific problems and tackle more diverse issues where some rules are only partially known. Their work is funded by various institutions, including the Harvard Data Science Initiative and the National Science Foundation.