Self-driving car tech lets computers see our world like never before

The quest for self-driving cars puts deep neural networks, a technology that teaches computers to accurately recognize objects from the real world, on the fast track.

Google self-driving car
Google tests self-driving cars, which rely on a variety of sensors to understand and react to their environments, on public roads. Wayne Cunningham/CNET

Since their inception, computers have lived in a world of ones and zeros, doggedly processing if-then and and-or statements through multiple language layers.

One fascinating technology being developed for self-driving cars may change all that, giving computers a stronger visual understanding of our world and possibly taking the first step toward an evolving computer self-consciousness.

The technology is called "deep learning," a technique for training a computer based on a neural network program, similar to the pathways found in your brain.

While researchers apply deep learning in a variety of scenarios, including speech recognition, it's visual recognition that's the most relevant, and the hottest, field for self-driving cars.

Simple object labels

To safely navigate our cities and suburbs, self-driving cars need to recognize objects in their immediate environment. Alongside the lidar and radar sensors being deployed, researchers also fit self-driving cars with cameras to let them "see" their surroundings.

But computers have not enjoyed the same kind of visual evolution as we humans, so they have no natural ability to discern and identify objects in an image. Without object recognition, engineers can't program software to tell the car how to react to camera input.

A few cars, such as the Volvo XC90, employ camera-based systems that let them recognize other cars, pedestrians and cyclists, but these current systems haven't been programmed using deep learning. Instead, these existing systems analyze input from a camera, comparing the imagery to a stored library of objects previously identified as a car, pedestrian, bicyclist, street sign or other familiar object in the world. The problem with this approach is that not every object the computer "sees" will match a stored image, even when the object really is a car or a pedestrian. Our world is just too varied for everything to be labeled in a computer.
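The brittleness of library matching can be sketched in a few lines of code. Everything here is invented for illustration, not taken from any carmaker's actual system: objects are reduced to tiny feature vectors, and an observation gets a label only if it sits close enough to a stored template.

```python
# Toy sketch of the pre-deep-learning, stored-library approach described
# above. All labels, features and thresholds are illustrative.

# A "library" of previously identified objects, each reduced to a tiny
# feature vector (say, apparent width and height in meters).
LIBRARY = {
    "car":        [4.5, 1.5],
    "pedestrian": [0.5, 1.8],
    "cyclist":    [1.8, 1.7],
}

def identify(features, tolerance=0.3):
    """Label an observation only if it closely matches a stored template."""
    for label, template in LIBRARY.items():
        distance = sum((f - t) ** 2 for f, t in zip(features, template)) ** 0.5
        if distance <= tolerance:
            return label
    return "unknown"  # no template matched: the system is blind to it

print(identify([4.4, 1.6]))   # close to the stored car template: "car"
print(identify([5.5, 2.0]))   # a larger SUV, still a car: "unknown"
```

The second call is the failure mode the article describes: the SUV really is a car, but because no stored template matches it, the system cannot say so.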

This screen shows what automotive equipment supplier Delphi's autonomous car "sees" in its environment. Wayne Cunningham/CNET

Imagine, for example, if you only knew the term "cake" applied to a round, two-layer sponge pastry with white frosting. If you were confronted with a single-layer rectangular sheet cake with chocolate frosting, you wouldn't know it was a cake. Through years of learning and the flexibility of our thinking, we humans can make the leap to recognize both objects and apply the "cake" label, but using traditional object recognition, a computer must be explicitly told that each one is a cake.

Correlating similarities and accepting disparities

Deep learning takes a different approach than traditional object labeling for computers, which will ultimately help self-driving cars show greater flexibility in visual recognition. Following our previous example, engineers "show" a visual processor thousands of images that we define as "cake." The deep learning program then breaks down the imagery into layers and textures, correlating similarities and accepting disparities. After processing enough images through deep learning, the computer's neural network should be able to properly identify an image of a cake it has never seen before, even if it's a one-of-a-kind large wedding confection.
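The training loop described above can be illustrated with a deliberately tiny stand-in for a neural network: a single-layer perceptron that adjusts its own weights from labeled examples instead of matching against stored templates. The features, data and thresholds are all invented for this sketch; a real deep network learns millions of parameters from raw pixels.

```python
# Toy sketch of learning from labeled examples, as described above.
# Each example: (sweetness 0-1, frosting 0-1) and a label,
# 1 for "cake" and 0 for "not cake". Data is invented for illustration.
EXAMPLES = [
    ((0.9, 1.0), 1),  # round two-layer sponge with white frosting
    ((0.8, 0.9), 1),  # rectangular sheet cake, chocolate frosting
    ((0.7, 0.8), 1),  # single-layer cake
    ((0.2, 0.0), 0),  # loaf of bread
    ((0.3, 0.1), 0),  # bagel
    ((0.1, 0.0), 0),  # pizza
]

def train(examples, epochs=50, lr=0.1):
    """Perceptron learning: nudge the weights whenever a prediction is wrong."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = label - pred
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def is_cake(features, w, b):
    x1, x2 = features
    return w[0] * x1 + w[1] * x2 + b > 0

w, b = train(EXAMPLES)
# An unseen example: a towering wedding confection it was never shown.
print(is_cake((0.85, 0.95), w, b))  # True
```

The point of the sketch is the last line: nothing in the training data exactly matches the wedding cake, yet the learned weights still classify it correctly, which is the generalization that template matching lacks.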

Nvidia demonstrates visual processing that recognizes a person partially obscured by another object. James Martin/CNET

With self-driving cars, engineers want to use deep learning to train neural networks to understand what a pedestrian, another car, a bicyclist or a road sign looks like. But instead of a specific pedestrian shape, the neural network is fed images of many different people, leaving it to build up a visual impression of what constitutes a person in its environment.

As such, it can differentiate between a person sitting on a sidewalk bench (safe) and a person about to step off a curb (dangerous). Even more impressive, the visual processor can recognize a person based on a partial image, for example, if only the person's head and shoulders are visible above an obstruction, such as another car.

Once a car's computer can accurately recognize objects in its environment, it can enact appropriate driving control protocols. If it recognizes a person about to step off a curb, or even standing on the roadside, the autonomous car can slow down or stop, or veer out of the way. Visual processing may be the only way we can get cars to safely maneuver through complicated urban areas in the same manner as a good human driver.
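The recognition-to-reaction step described above can be caricatured as a lookup from a recognized object and its situation to a driving response. This is purely an illustrative sketch with invented names; real autonomous-driving stacks plan maneuvers continuously rather than picking from a fixed menu.

```python
# Illustrative only: map a recognized object and its situation to a
# driving response, echoing the examples in the text above.
def react(obj, situation):
    if obj == "pedestrian":
        if situation == "stepping_off_curb":
            return "brake"            # imminent hazard: stop or veer away
        if situation == "standing_roadside":
            return "slow_down"        # potential hazard: reduce speed
        return "monitor"              # e.g. sitting on a bench: safe
    if obj in ("car", "cyclist"):
        return "keep_distance"
    return "monitor"

print(react("pedestrian", "stepping_off_curb"))  # brake
```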

A world of objects

To train self-driving cars, the neural network only needs to focus on things that would be in the car's environment and that are relevant to driving behavior. However, the larger scope of this research led to ImageNet, a database developed by Stanford and Princeton universities containing millions of labeled images, letting neural networks learn the difference between a power tool and a penguin, for example. Beyond imagery, researchers can use any other type of input, such as audio or 3D shapes, to train neural networks.

Google has been researching neural networks, and came up with a Web-based tool to let the computer show humans some of what it is seeing in existing imagery.

Honda Asimo robot
With more advanced visual processing, Honda's Asimo could be a real help around the home. AFP/Getty Images

Beyond self-driving cars, neural networks and visual learning have much to offer. Consider a visor-mounted head-up display for law enforcement, which could instantaneously analyze suspects and determine whether they are holding a gun. A tool such as this could distinguish between a gun and other handheld objects, helping police avoid making fatal mistakes.

Home robotics could also burgeon with the technology. Rather than a simple Roomba vacuum-bot that works through a two-dimensional grid of a room and backs away from objects its mechanical sensors hit, a vacuuming robot with camera-based sensors could distinguish which (nonliving) objects in a room it can safely move, such as a wastebasket, picking them up to clean underneath and around, then putting them back where they were.

Awareness to consciousness

With deep learning and neural networks, however, we have to confront the idea that we are getting closer to the technological singularity, the point where computers develop strong artificial awareness, where the complexity of their programming becomes indistinguishable from consciousness. When a computer becomes aware of the outside world in much the same way humans have learned about it, will it in turn develop similar self-awareness?

RoboCop's ED-209 obviously did not employ good visual recognition software, and it was autonomously lethal. Columbia Pictures

There is very little that can be concretely said about how computers might develop consciousness, but a number of very smart people have been raising the alarm. Notable scientists such as Stephen Hawking and artificial intelligence researchers such as Google DeepMind CEO Demis Hassabis have called for international restrictions on autonomous weapons systems that make a kill decision with no human intervention.

A drone programmed to fire at any person it identifies carrying a weapon is still a far cry from self-aware artificial intelligence, but these are certainly major issues we have to consider when experimenting with advanced computing like neural networks and visual recognition. So long as human safety is involved, autonomous cars will also be held to higher standards and be placed under more intense scrutiny.

As engineers further develop deep learning and neural networks for self-driving cars and other tools, they will open up our world to machine understanding like no technology before. Networked computers have radically transformed our expectations and experience of the world in just a few decades. What will our world look like in 10 years, with computers that can accurately identify anything seen through their camera eyes?
