Machine vision -- aka the process by which a computer is able to visually perceive its environment -- is complicated for many reasons, and like most complicated things, it can be handled in a bunch of different ways, each with its strengths and weaknesses. Some of the more common ways of handling this task in a car include radar, ultrasonic sensors, lidar and plain old cameras.
The thing is, none of those is especially good at quickly seeing a close-up object with sufficient resolution, and then getting that information to the car's computer fast enough for the vehicle to act on it. For example, you're driving, a kid runs into the road, and now your car has to see the kid and apply the brakes before you hit them. A company called Terranet believes it has a solution to that problem, which it announced on Thursday, and it's called Voxelflow.
If you're like me, the first thing you're probably thinking after reading the name "Voxelflow" is, "What the hell is a voxel? That sounds made up." Well, I wasn't sure either, so I asked Dr. Anthony Roy, an expert in machine vision, for a total layperson's explanation:
"A voxel is like a pixel. A pixel is a point in two-dimensional space with an X and a Y coordinate -- like the pixels on your TV. A voxel is the same, except it's in 3D space, so it has an X, a Y and a Z coordinates."
OK, so how does the Voxelflow system generate the cloud of voxels (aka a point cloud) it needs to define an object in space? Well, unlike most on-vehicle vision systems, which rely on traditional shutter-and-sensor cameras, the Voxelflow system uses something called an "event camera" -- or, more specifically, three of them and a laser.
An event camera doesn't have a shutter. Instead, the individual pixels that make up the camera's sensor independently react to changes in brightness as they occur. This makes an event camera much quicker to respond than a shutter-based camera, with less chance of motion blur. Cool, right?
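To make the idea concrete, here's a toy simulation of that per-pixel behavior: each pixel emits an event the moment its brightness changes past a threshold, rather than waiting for a global shutter to capture a whole frame. The frame values and threshold are made up for illustration; this is not how Terranet's hardware is implemented.

```python
def brightness_events(prev_frame, next_frame, threshold=10):
    """Yield (x, y, polarity) events for pixels whose brightness changed.

    polarity is +1 for a pixel that got brighter, -1 for one that got darker.
    Frames are simple 2D lists of brightness values (0-255)."""
    events = []
    for y, (row_prev, row_next) in enumerate(zip(prev_frame, next_frame)):
        for x, (b0, b1) in enumerate(zip(row_prev, row_next)):
            if abs(b1 - b0) >= threshold:
                events.append((x, y, 1 if b1 > b0 else -1))
    return events

prev = [[100, 100], [100, 100]]
nxt  = [[100, 130], [ 80, 100]]
print(brightness_events(prev, nxt))  # [(1, 0, 1), (0, 1, -1)]
```

Because only the changed pixels produce data, there's far less information to move and process than with full frames -- which is where the speed advantage comes from.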
So, now we've got three cameras to capture the images of the object we're trying not to hit, but those only give us the X and Y coordinates. The Voxelflow system's big innovation is a scanning laser that locks onto objects detected by the cameras and provides that Z coordinate, locating the object in space and turning a pixel into a voxel.
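That pixel-to-voxel step can be sketched with a standard pinhole-camera back-projection: given a pixel location and a laser-measured depth, you can recover the 3D point. The camera intrinsics below are invented placeholder values, and this is a generic geometry sketch, not Terranet's actual math.

```python
def pixel_to_voxel(u, v, depth, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0):
    """Back-project a 2D pixel (u, v) plus a measured depth into a 3D point.

    fx/fy are focal lengths in pixels, (cx, cy) the optical center --
    all made-up example intrinsics for a hypothetical 1280x720 sensor."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# A pixel 100 px right and below center, with the laser reporting 5 m:
print(pixel_to_voxel(740, 460, 5.0))  # (0.5, 0.5, 5.0)
```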
"You can use multiple 2D cameras to generate that third coordinate," continues Dr. Roy. "The problem is that it takes longer for a computer to crunch that data. Using lidar or something like this [Voxelflow] system is going to be faster."
The result is a system that can see, parse and react to a possible collision in five milliseconds. But wait, as they say, there's more. Instead of just storing all its data locally, Terranet is partnering with Mercedes-Benz to apply Voxelflow to Mercedes' Live Map technology.
Nihat Kuecuek, who works for Mercedes on its maps and navigation projects, describes the way Live Map will integrate the Voxelflow data using the three states of matter: solid, liquid and gas.
The objects that don't change -- buildings, etc. -- are like solids. Things that change on occasion -- crosswalks, traffic lights, etc. -- are like liquids. The Voxelflow data collected by vehicles as they drive, documenting things that change frequently, are like gases. All three are used to generate a more complete, living map that is then streamed, rather than downloaded, to a vehicle.
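Kuecuek's metaphor is really a description of map layers with different refresh rates. A loose configuration-style sketch of the idea -- all names, examples and refresh labels here are my own invention, not Mercedes' schema:

```python
# Three map layers, each refreshed at a different cadence.
live_map_layers = {
    "solid":  {"examples": ["buildings"],                  "refresh": "rarely"},
    "liquid": {"examples": ["crosswalks", "traffic lights"], "refresh": "occasionally"},
    "gas":    {"examples": ["voxelflow detections"],        "refresh": "continuously"},
}

for name, layer in live_map_layers.items():
    print(f"{name}: {', '.join(layer['examples'])} (updated {layer['refresh']})")
```

Streaming such a layered map means a car only needs the fast-changing "gas" data for its immediate surroundings, rather than downloading everything up front.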
The end result of the Live Map integration will be more effective navigation and safer route planning. The benefits will be felt in conventional, human-driven vehicles, but they'll genuinely pay dividends when Level 4 and Level 5 autonomous vehicles become widespread on public roads.