University of California, Irvine researchers recently developed a deep neural network that can solve a Rubik’s cube in a fraction of a second, with 100 percent accuracy across all test configurations.

“Artificial intelligence can defeat the world’s best human chess and Go players, but some of the more difficult puzzles, such as the Rubik’s Cube, had not been solved by computers, so we thought they were open for AI approaches,” said Pierre Baldi, UCI Distinguished Professor of computer science and senior author of a newly published Nature Machine Intelligence study about the work.

Using two NVIDIA TITAN V GPUs, with six other GPUs used in parallel for data generation, and the cuDNN-accelerated TensorFlow deep learning framework, the researchers trained a DNN on a dataset of 10 billion simulations of the completed and scrambled puzzle. Training took 36 hours and was carried out for 1 million iterations, the team said. 
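Training data like this can be generated entirely by self-scrambling: start from the solved state, apply a random number of random moves, and use the scramble depth as a supervision signal. A minimal sketch of that idea in Python, using the Lights Out puzzle (one of the games mentioned later in this article) rather than the cube, and not the authors' actual data pipeline:

```python
import random

N = 5  # 5x5 Lights Out board; the solved (goal) state is all lights off

def press(state, r, c):
    """Toggle the pressed cell and its orthogonal neighbours."""
    s = [row[:] for row in state]
    for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < N and 0 <= cc < N:
            s[rr][cc] ^= 1
    return s

def scrambled_dataset(num_samples, max_scrambles, rng=random.Random(0)):
    """Generate (state, scramble_depth) pairs by walking backwards
    from the goal -- the self-supervised setup DeepCubeA-style
    training relies on, in toy form."""
    solved = [[0] * N for _ in range(N)]
    data = []
    for _ in range(num_samples):
        depth = rng.randint(0, max_scrambles)
        state = solved
        for _ in range(depth):
            state = press(state, rng.randrange(N), rng.randrange(N))
        data.append((state, depth))
    return data

pairs = scrambled_dataset(1000, 20)
```

The scramble depth is only an upper bound on the true distance to the goal, which is one reason the learned estimate has to be refined during training.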

Source: UC Irvine/Nature: The plots show that DeepCubeA first learns how to solve cubes closer to the goal and then learns to solve increasingly difficult cubes. Dashed lines represent the true average cost-to-go.

Once trained, DeepCubeA achieved 100 percent accuracy across all test configurations, finding a shortest path to the goal state 60.3 percent of the time, the researchers explained.
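DeepCubeA finds these paths by pairing the trained network with an A*-style search, using the network's cost-to-go estimate as the heuristic. A self-contained sketch on a 3x3 Lights Out board, with a simple admissible hand-made heuristic standing in for the DNN (this is an illustration of the search idea, not the paper's batched GPU implementation):

```python
import heapq, itertools

N = 3
SOLVED = (0,) * (N * N)  # all lights off

def press(state, i):
    """Toggle cell i and its orthogonal neighbours."""
    r, c = divmod(i, N)
    s = list(state)
    for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < N and 0 <= cc < N:
            s[rr * N + cc] ^= 1
    return tuple(s)

def h(state):
    """Stand-in for the learned cost-to-go: one press clears at
    most 5 lights, so ceil(lit / 5) presses are always needed."""
    return -(-sum(state) // 5)

def astar(start):
    """Plain A* search guided by h; returns a list of presses."""
    counter = itertools.count()  # tie-breaker for the heap
    frontier = [(h(start), next(counter), 0, start, [])]
    best_g = {start: 0}
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if state == SOLVED:
            return path
        for i in range(N * N):
            nxt = press(state, i)
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(
                    frontier,
                    (g + 1 + h(nxt), next(counter), g + 1, nxt, path + [i]),
                )
    return None
```

Because the heuristic never overestimates, the returned path is optimal; a learned cost-to-go trades that guarantee for far stronger guidance, which is why DeepCubeA finds a shortest path most, but not all, of the time.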

DeepCubeA builds on DeepCube, a deep reinforcement learning algorithm developed by the same team and presented at ICLR 2019, which solves the Rubik’s cube using a policy and value function combined with Monte Carlo tree search (MCTS).

For inference, the researchers used four NVIDIA TITAN Xp GPUs in parallel to solve the test puzzles.

The researchers also trained the DNN on several other puzzles, including the 24 puzzle, Lights Out, and Sokoban.

Source: UC Irvine/Nature: Visualization of a scrambled state (top) and the goal state (bottom) for four puzzles investigated here.

“DeepCubeA is able to solve planning problems with large state spaces and few goal states by learning a cost-to-go function, parameterized by a DNN,” the researchers stated. 
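Learning a cost-to-go function amounts to a value-iteration-style update: J(goal) = 0, and J(s) is repeatedly pulled toward min over moves of [1 + J(next state)]. DeepCubeA fits a DNN to these targets; the same update in tabular form, on a hypothetical four-state line graph small enough to check by hand:

```python
# Toy state space: a line graph 0 - 1 - 2 - 3, with state 0 as the goal.
GOAL = 0

def neighbors(s):
    """States reachable in one unit-cost move."""
    return [n for n in (s - 1, s + 1) if 0 <= n <= 3]

def update(J):
    """One sweep of J(s) <- min_n [1 + J(n)], holding J(goal) = 0.
    A DNN-based version would regress the network toward these
    targets instead of storing them in a table."""
    new = dict(J)
    for s in J:
        if s != GOAL:
            new[s] = min(1 + J[n] for n in neighbors(s))
    return new

J = {s: 0.0 for s in range(4)}  # optimistic init, like a fresh network
for _ in range(5):
    J = update(J)
# J converges to the true distances to the goal: {0: 0, 1: 1, 2: 2, 3: 3}
```

Each sweep propagates correct values one step further from the goal, which matches the paper's observation that the network first learns states close to the goal and then increasingly scrambled ones.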

The ultimate goal of the project is to build the next generation of deep learning systems, the researchers said.

“While finding an optimal solution to the 15 puzzle takes less than a second on a modern-day desktop, finding an optimal solution to the 24 puzzle can take days, and finding an optimal solution to the 35 puzzle is generally intractable,” the team said. “Not only are the aforementioned puzzles relevant as mathematical games, but they can also be used to test planning algorithms and to assess how well a machine learning approach may generalize to different environments.”

The researchers have publicly released the code and also published an online demo of the original model. 

Source: UC Irvine/Nature: Screen capture of the original model’s online demo.
