
Boffins build a giant brain for any robot

15 January 2024


One brain to rule them all

Two researchers have revealed how they are creating a single super-brain that can pilot any robot, no matter how different the machines are.

Sergey Levine and Karol Hausman wrote in IEEE Spectrum that generative AI, which can create text and images, is not enough for robotics because the Internet does not have enough data on how robots interact with the world.

That's why they are working on a single deep neural network that can learn from the experiences of many robots and use that shared experience to control any bot.

They said that robots need robot data to learn from, and this data is usually collected slowly and tediously by boffins in labs for very specific tasks. The best results usually work in only one lab, on one robot, and often only for a handful of tasks.

One way around this problem is to pool the data from many robots so that a new robot can learn from all of them.

In 2023, Levine and Hausman joined forces with 32 other robotics labs in North America, Europe, and Asia to launch the RT-X project, pooling data, tools, and code to make general-purpose robots.

The question is whether a deep neural network trained on data from loads of different robots can 'drive' all of them, even robots that look and act very differently. If so, the approach could unlock the power of big data for robot learning. The project is huge because it has to be: the RT-X dataset holds nearly a million robot trials across 22 types of robots, including many of the most popular robotic arms on the market.

“Amazingly, we found that our multi-robot data could be used with simple machine-learning methods as long as we use big neural network models with big datasets. Using the same kinds of models used in current LLMs like ChatGPT, we could train robot-control algorithms that do not need any special tricks for different robots,” the pair wrote.
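As a rough illustration of what "no special tricks" means, the recipe can be sketched as plain supervised learning over the pooled data. Everything below is a hypothetical stand-in, not the actual RT-X code: the model, the data loader, and the seven-dimensional action format are invented for illustration.

```python
# Minimal hypothetical sketch: pool every lab's episodes into one dataset
# and fit a single network on (camera image, instruction) -> action pairs.
import torch
import torch.nn as nn

class GeneralistPolicy(nn.Module):
    """One network for every robot: camera image + instruction in, action out."""
    def __init__(self, action_dim=7):  # e.g. x, y, z, roll, pitch, yaw, gripper
        super().__init__()
        self.vision = nn.Sequential(   # stand-in for a real image backbone
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 256))
        self.text = nn.Embedding(10_000, 256)  # stand-in for a text encoder
        self.head = nn.Sequential(nn.Linear(512, 256), nn.ReLU(),
                                  nn.Linear(256, action_dim))

    def forward(self, images, token_ids):
        v = self.vision(images)               # the embodiment is never given
        t = self.text(token_ids).mean(dim=1)  # explicitly; it must be inferred
        return self.head(torch.cat([v, t], dim=-1))  # from the pixels

def pooled_multi_robot_loader(steps=100, batch=32):
    """Stand-in for a loader over the pooled, unified-schema dataset."""
    for _ in range(steps):
        yield (torch.randn(batch, 3, 64, 64),          # camera frames
               torch.randint(0, 10_000, (batch, 12)),  # instruction tokens
               torch.randn(batch, 7))                  # target actions

policy = GeneralistPolicy()
opt = torch.optim.AdamW(policy.parameters(), lr=3e-4)

# Plain supervised learning: no per-robot branches, heads, or tricks.
for images, token_ids, actions in pooled_multi_robot_loader():
    loss = nn.functional.mse_loss(policy(images, token_ids), actions)
    opt.zero_grad(); loss.backward(); opt.step()
```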

Just like a person can drive a car or ride a bike using the same brain, a model trained on the RT-X dataset can tell what kind of robot it is controlling from what it sees through the robot's camera. If the camera shows a UR10 industrial arm, the model sends commands suited to a UR10; if it shows a cheap WidowX hobbyist arm, the model moves it differently.
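The control loop that follows from this is the same for every arm; only the camera frames differ. Here is a minimal hypothetical sketch reusing the GeneralistPolicy stand-in above; the robot object and its apply_end_effector_delta method are also made up for illustration.

```python
# Hypothetical control loop: the same checkpoint drives any arm, because
# nothing in this code identifies the robot except what the camera sees.
import torch

@torch.no_grad()
def control_step(policy, camera_frame, instruction_tokens, robot):
    """One closed-loop step; `robot` only executes, it is never branched on."""
    action = policy(camera_frame.unsqueeze(0),
                    instruction_tokens.unsqueeze(0))[0]
    robot.apply_end_effector_delta(action)  # hypothetical robot API
```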

“To test our model, five of the labs in the RT-X project each tested it against the best control system they had made for their robot. Amazingly, the single model did better than each lab's method, doing the tasks about 50 per cent more often on average.”

The pair also started from an existing vision-language model and extended it to generate robot actions from camera images as well as text.
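A common way to make a vision-language model emit actions, used in work such as RT-2, is to quantize each continuous action dimension into a fixed number of bins so that actions can be read and written as ordinary tokens. Below is a minimal sketch of that encoding; the bin count and action range are chosen purely for illustration, not taken from RT-X.

```python
# Hypothetical sketch of action tokenization for a vision-language model:
# each continuous action dimension is quantized into one of N bins so the
# model can emit actions as ordinary tokens.
import numpy as np

N_BINS = 256
LOW, HIGH = -1.0, 1.0  # assume actions are pre-normalized to [-1, 1]

def actions_to_tokens(action: np.ndarray) -> np.ndarray:
    """Map a continuous action vector to integer token ids."""
    clipped = np.clip(action, LOW, HIGH)
    return np.round((clipped - LOW) / (HIGH - LOW) * (N_BINS - 1)).astype(int)

def tokens_to_actions(tokens: np.ndarray) -> np.ndarray:
    """Invert the quantization (exact up to one bin width)."""
    return tokens / (N_BINS - 1) * (HIGH - LOW) + LOW

# Round trip: a 7-DoF end-effector action survives tokenization.
a = np.array([0.1, -0.5, 0.9, 0.0, 0.25, -1.0, 1.0])
assert np.allclose(tokens_to_actions(actions_to_tokens(a)), a,
                   atol=(HIGH - LOW) / (N_BINS - 1))
```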

“The RT-X project shows what can happen when the robot-learning community works together. We hope that RT-X will become a team effort to make data standards, reusable models, and new methods and algorithms,” the boffins wrote.
