
Robot learns via trial and error like a human

Researchers at the University of California at Berkeley have developed algorithms that allow robots to learn by making mistakes -- just like a real person.

Michelle Starr, Science editor

[Image: BRETT figuring out how to put two toy bricks together. Screenshot by Michelle Starr/CNET]

Robot "brains" are very different from human brains, and teaching robots how to do tasks is usually just a matter of writing the right code. It may sound simpler than the way humans learn, but it's actually very difficult -- without human intuition, things like flexibility and adapting to changing circumstances become nigh impossible.

A team of researchers at the University of California at Berkeley has demonstrated a robot that can learn via trial and error, much like how humans learn. It constitutes a pretty big step forward in the field of artificial intelligence.

"What we're reporting on here is a new approach to empowering a robot to learn," Professor Pieter Abbeel of UC Berkeley's Department of Electrical Engineering and Computer Sciences said in a statement last week.

"The key is that when a robot is faced with something new, we won't have to reprogram it. The exact same software, which encodes how the robot can learn, was used to allow the robot to learn all the different tasks we gave it."

The team developed a series of algorithms that allowed a Willow Garage PR2 robot named BRETT to learn a series of motor tasks, such as screwing a cap on a water bottle, or assembling a toy aeroplane, without pre-programmed knowledge of its surroundings -- by allowing it to eliminate possibilities by making mistakes, or what we'd call "trial and error."

This could allow robots to operate more efficiently in environments that are a little more chaotic than the environments in which they usually operate, such as factories or laboratories.

"Most robotic applications are in controlled environments where objects are in predictable positions," said project lead UC Berkeley faculty member Trevor Darrell, director of the Berkeley Vision and Learning Center.

"The challenge of putting robots into real-life settings, like homes or offices, is that those environments are constantly changing. The robot must be able to perceive and adapt to its surroundings."

[Image: BRETT assembling a toy aeroplane. Screenshot by Michelle Starr/CNET]

To model the robot's learning on human learning, the team tapped into a branch of machine learning research called deep learning. In this approach, an artificial neural network builds up layers of increasingly abstract representations from raw data, without a human having to program each layer by hand.

These "neural nets" consist of artificial neurons that process raw data, such as sound or images, recognising patterns in the data and applying them. Motor tasks, such as those on which BRETT trained, are more difficult, though, because they require active physical involvement with the task.
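To make the layered idea concrete, here is a minimal sketch in Python of a tiny feedforward network turning raw pixel values into a more abstract feature vector. The layer sizes, random weights and ReLU activation here are illustrative assumptions for the sketch, not details of BRETT's actual network, which would have its weights adjusted during training rather than drawn at random.

import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, weights, biases):
    # Each artificial neuron sums its weighted inputs, then applies a
    # nonlinearity (ReLU) so stacked layers can represent complex patterns.
    return np.maximum(0.0, inputs @ weights + biases)

# Raw sensory input: a flattened 8x8 "image" patch (64 pixel values).
pixels = rng.random(64)

# Two hidden layers extract progressively more abstract features; no human
# programs what each layer detects -- training would tune these weights.
w1, b1 = rng.normal(size=(64, 32)), np.zeros(32)
w2, b2 = rng.normal(size=(32, 16)), np.zeros(16)

features = layer(layer(pixels, w1, b1), w2, b2)
print(features.shape)  # (16,) -- a compact learned representation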

"For all our versatility, humans are not born with a repertoire of behaviours that can be deployed like a Swiss army knife, and we do not need to be programmed," said research team member postdoctoral researcher Sergey Levine.

[Image: BRETT screws a cap on a water bottle. UC Berkeley]

"Instead, we learn new skills over the course of our life from experience and from other humans. This learning process is so deeply rooted in our nervous system, that we cannot even communicate to another person precisely how the resulting skill should be executed. We can at best hope to offer pointers and guidance as they learn it on their own."

BRETT was assisted in its tasks by a reward system that gave points based on how well the robot performed. The robot can see what's in front of it, and the algorithm scores its movements in real time: the closer a movement brings BRETT to completing the task, the higher the score. This real-time feedback lets the robot learn which movements work best for the task at hand.
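The try-score-keep cycle behind that feedback loop can be sketched in a few lines of Python. This toy stand-in (a 2-D "hand" nudging its motion parameters toward a hypothetical target, with the reward being closeness to the goal) is an assumption for illustration only; the real system learns far richer policies from camera input with deep reinforcement learning.

import numpy as np

rng = np.random.default_rng(1)
target = np.array([0.8, -0.3])   # hypothetical goal position
params = np.zeros(2)             # motion parameters being learned

def reward(p):
    # Higher score the closer the movement brings the hand to the goal.
    return -np.linalg.norm(p - target)

best = reward(params)
for trial in range(200):
    candidate = params + rng.normal(scale=0.05, size=2)  # try a variation
    score = reward(candidate)
    if score > best:             # real-time feedback: keep what improves
        params, best = candidate, score

print(params.round(2), round(best, 3))  # params end up near the target

Each failed trial is simply discarded, so mistakes cost nothing but time -- which is the essence of learning by trial and error.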

Typically, BRETT was able to master each task within about 10 minutes when provided information about what the objects look like and where they are. When BRETT had to identify and locate the objects as well, the tasks took around three hours -- but with greater data processing power, its speed should improve.

"With more data, you can start learning more complex things," said Abdeel. "We still have a long way to go before our robots can learn to clean a house or sort laundry, but our initial results indicate that these kinds of deep learning techniques can have a transformative effect in terms of enabling robots to learn complex tasks entirely from scratch. In the next five to 10 years, we may see significant advances in robot learning capabilities through this line of work."