Despite the remarkable outputs of these models, such as generating poetry or viable computer programs, they do not form an accurate model of the world. For instance, a popular type of generative AI model can provide near-perfect turn-by-turn driving directions in New York City without having formed an accurate internal map of the city. However, when streets were closed and detours added, the model's performance plummeted.
Ashesh Rambachan, an assistant professor at the MIT Laboratory for Information and Decision Systems, said: “The question of whether large language models are learning coherent world models is very important if we want to use these techniques to make new discoveries.”
Rambachan, along with Keyon Vafa, a postdoc at Harvard University, and other collaborators, presented their research at the Conference on Neural Information Processing Systems. They focused on transformers, the backbone of LLMs like GPT-4, and developed new metrics to test a transformer's world model.
The team introduced two metrics, sequence distinction and sequence compression. The first checks whether a model recognises that two different states are distinct; the second checks whether it knows that two identical states share the same set of possible next steps. Surprisingly, the transformers that made their choices at random formed more accurate world models, likely because they were exposed to a wider variety of possible next steps during training.
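The paper's formal definitions are more involved, but the intuition behind the two checks can be sketched as follows. This is a minimal illustration, not the researchers' implementation; the helper names (`true_state`, `model_accepts`, `probe_steps`) are hypothetical, and the sketch assumes the underlying task behaves like a deterministic state machine whose ground-truth state is known for any prefix of moves.

```python
# Minimal sketch (not the researchers' code) of the intuition behind the
# sequence distinction and sequence compression checks.

def sequence_distinction(seq_a, seq_b, true_state, model_accepts, probe_steps):
    """Prefixes reaching DIFFERENT true states should be distinguishable:
    the model should accept some probe step after one prefix but not the other."""
    assert true_state(seq_a) != true_state(seq_b)
    return any(
        model_accepts(seq_a + [s]) != model_accepts(seq_b + [s])
        for s in probe_steps
    )


def sequence_compression(seq_a, seq_b, true_state, model_accepts, probe_steps):
    """Prefixes reaching the SAME true state should be interchangeable:
    the model should accept exactly the same probe steps after either prefix."""
    assert true_state(seq_a) == true_state(seq_b)
    return all(
        model_accepts(seq_a + [s]) == model_accepts(seq_b + [s])
        for s in probe_steps
    )
```

In the Othello and navigation settings, the ground-truth state would be the board position or the current intersection, and a transformer would pass only if its accepted continuations respected both checks across many pairs of prefixes.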
“In Othello, if you see two random computers playing rather than championship players, in theory you’d see the full set of possible moves, even the bad moves championship players wouldn’t make,” Vafa said.
The researchers found that although transformers generated accurate directions and valid Othello moves, they failed to form coherent world models. When detours were added to New York City's map, all navigation models failed.
“I was surprised by how quickly the performance deteriorated as soon as we added a detour. If we close just 1 per cent of the possible streets, accuracy immediately plummets from nearly 100 per cent to just 67 per cent,” Vafa said.
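A rough sketch of that kind of stress test, assuming the street network is stored as an adjacency map and that a hypothetical `model_route(start, goal)` function returns the model's proposed route, might look like this (the function names and graph representation are illustrative, not the paper's pipeline):

```python
import random


def close_streets(graph, fraction, seed=0):
    """Return a copy of a street graph (intersection -> set of neighbours)
    with a given fraction of street segments removed, simulating closures."""
    rng = random.Random(seed)
    edges = [(u, v) for u, nbrs in graph.items() for v in nbrs]
    closed = set(rng.sample(edges, int(len(edges) * fraction)))
    return {u: {v for v in nbrs if (u, v) not in closed}
            for u, nbrs in graph.items()}


def route_accuracy(model_route, graph, trips):
    """Fraction of (start, goal) trips for which the model's route ends at the
    goal and only uses streets that still exist in the (possibly perturbed) graph."""
    def valid(route, goal):
        if not route or route[-1] != goal:
            return False
        return all(v in graph.get(u, set()) for u, v in zip(route, route[1:]))

    return sum(valid(model_route(s, g), g) for s, g in trips) / len(trips)
```

Comparing `route_accuracy` on the original graph with the graph returned by `close_streets(graph, 0.01)` is the sort of before-and-after comparison behind the reported drop from nearly 100 per cent to 67 per cent.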
These findings suggest that transformers can perform remarkably well at certain tasks without understanding the rules that govern them. Building LLMs that capture accurate world models will require a different approach.
“Often, we see these models do impressive things and think they must have understood something about the world. I hope we can convince people that this is a question to think very carefully about, and we don’t have to rely on our own intuitions to answer it,” Rambachan said.
The team aims to apply their evaluation metrics to real-world scientific problems and tackle more diverse issues where some rules are only partially known. Their work is funded by various institutions, including the Harvard Data Science Initiative and the National Science Foundation.