Stephen McAleer and colleagues from the University of California, Irvine say they have pioneered a new kind of deep-learning technique, called "autodidactic iteration", that can teach itself to solve a Rubik's Cube with no human assistance.
The trick that McAleer and co have mastered is to find a way for the machine to create its own system of rewards. The machine must decide whether a specific move is an improvement on the existing configuration, and to do this it must evaluate the move. Autodidactic iteration does this by starting with the finished cube and working backwards, so that it reaches configurations whose distance from the solution it can estimate. This process is not perfect, but deep learning helps the system figure out which moves are generally better than others.
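The backwards-from-solved idea can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation: a tuple-permutation puzzle with adjacent swaps stands in for the cube, and a lookup table stands in for the deep network that estimates state values. The function names (`moves`, `adi_samples`) and the move set are invented for the sketch. The key step is the one it shares with autodidactic iteration: scramble backwards from the solved state, then label each scrambled state with a one-step lookahead target, the reward (earned only at the solved state) plus the best child's current value estimate.

```python
import random

# Toy stand-in for the cube: a permutation of (0..5); "solved" is sorted.
SOLVED = tuple(range(6))

def moves(state):
    # Hypothetical move set: swap any adjacent pair (each move is its own inverse).
    out = []
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        out.append(tuple(s))
    return out

def value(state, table):
    # Current value estimate; in the real system this is a deep network.
    return table.get(state, 0.0)

def adi_samples(n_scrambles, table, rng):
    """Generate training pairs by scrambling backwards from the solved state.

    Each scrambled state gets a value target from a one-step lookahead:
    reward (1.0 only at the solved state) plus the best child's estimate.
    """
    samples = []
    state = SOLVED
    for _ in range(n_scrambles):
        state = rng.choice(moves(state))
        best = max((1.0 if c == SOLVED else 0.0) + value(c, table)
                   for c in moves(state))
        samples.append((state, best))
    return samples

rng = random.Random(0)
table = {}
pairs = adi_samples(10, table, rng)
```

Note that states one scramble away from the solution always receive a target of 1.0 (one of their children is the solved state), which is how the reward signal propagates outwards without any human-supplied labels.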
Having been trained, the network then guides a standard tree search that hunts for the best move from each configuration.
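A minimal sketch of how a learned value estimate can guide a search, again on the toy adjacent-swap puzzle rather than the cube, and using a simple best-first search as a stand-in for the paper's more sophisticated tree search. The heuristic here (count of pieces already in place) is an invented placeholder for the trained network's output; all names are assumptions of the sketch.

```python
import heapq

# Toy stand-in for the cube: a permutation of (0..5); "solved" is sorted.
SOLVED = tuple(range(6))

def moves(state):
    # Hypothetical move set: swap any adjacent pair.
    out = []
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        out.append(tuple(s))
    return out

def value(state):
    # Placeholder heuristic: pieces in their solved position.
    # In the real system this would be the trained network's value estimate.
    return sum(a == b for a, b in zip(state, SOLVED))

def solve(start, max_nodes=10_000):
    """Best-first search guided by the value estimate.

    Expands the most promising frontier state first; returns the sequence
    of configurations leading to the solved state, or None on budget exhaustion.
    """
    frontier = [(-value(start), 0, start, [])]
    seen = {start}
    counter = 1
    while frontier and counter < max_nodes:
        _, _, state, path = heapq.heappop(frontier)
        if state == SOLVED:
            return path
        for nxt in moves(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier,
                               (-value(nxt), counter, nxt, path + [nxt]))
                counter += 1
    return None

path = solve((1, 0, 2, 3, 5, 4))  # two swaps away from solved
```

The better the value estimate, the fewer dead ends the search explores, which is why training the network well matters more than the search machinery itself.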
McAleer said: "The result is an algorithm that performs remarkably well. The algorithm can solve 100% of randomly scrambled cubes while achieving a median solve length of 30 moves, less than or equal to solvers that employ human domain knowledge."
The research has implications for a variety of other tasks that deep learning has struggled with, including puzzles like Sokoban, games like Montezuma's Revenge, and problems like prime number factorisation.