
A team of researchers at U.K. startup Wayve has developed a way to apply deep learning networks to autonomous driving. In a recent blog post (with an accompanying YouTube demonstration video), representatives outlined how the technology works and offered a demonstration using a real car on a real road.

As the team at Wayve notes, most self-driving cars rely on a host of cameras and sensors, along with mapping tools and a great deal of programming. But such an approach, they argue, runs into what appears to be a ceiling of sorts. Autonomous cars programmed by big companies such as Google have reached a point at which they are good, but not good enough for common use. This, they claim, is because such cars are not yet smart enough to handle the myriad conditions present on an average road. What is needed, they suggest, is a smarter computer, not more sensors or programming.

The team at Wayve believes a smarter approach is to use reinforcement learning algorithms, such as those developed at DeepMind: let the computer learn how to do something the same way people do, by practicing. Reinforcement learning algorithms lie at the heart of such systems; they learn by doing, over and over again, improving as they go. In the case of autonomous vehicle control, that means driving a car until the system gets it right.
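At its core, that trial-and-error loop is simple: try a policy, score how well it did, and keep the changes that score better. The toy Python sketch below shows the idea on a deliberately simple lane-keeping task; the simulated drift, one-parameter steering policy, and hill-climbing update are illustrative stand-ins only, not Wayve's networks or training method.

```python
import random

def run_episode(gain, max_steps=500):
    """Steer a toy 'car' down a straight lane; return how many steps it
    stays on the road. This stands in for one practice run on a real road."""
    offset = 0.0                          # lateral offset from the lane centre
    for step in range(max_steps):
        action = gain * offset            # one-parameter steering policy
        offset += random.uniform(-0.3, 0.3) - 0.5 * action   # drift plus correction
        if abs(offset) >= 1.0:            # "off the road" ends the trial
            return step
    return max_steps

# Learn by doing: try a slightly different policy each episode and keep
# whatever change lets the car stay on the road longer.
best_gain, best_score = 0.0, run_episode(0.0)
for episode in range(100):
    candidate = best_gain + random.uniform(-0.2, 0.2)
    score = run_episode(candidate)
    if score > best_score:
        best_gain, best_score = candidate, score

print(f"learned steering gain {best_gain:.2f}, survived {best_score} steps")
```

After a few dozen simulated episodes the steering gain settles at a value that keeps the toy car centred, which is the same "improve with practice" pattern the article describes, just stripped down to a single number instead of a deep network.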

To demonstrate how well such an approach can work, the team at Wayve outfitted a Renault Twizy with a single camera plus gas, brake and steering control gear, and hooked them up to a graphics processor and a computer running reinforcement learning algorithms the company has developed. The computer was "told" that the optimum result was the car moving forward along a road without leaving it; the longer it could do so, the better. They then added a human safety driver and placed the car on a country road. The driver would point the car in the right direction and then let the computer take over. If the car came close to driving off the road, the human would stop it, realign it, and give the computer another go. In this manner, the computer learned how to keep the car on the road in about 20 minutes. After that, it was able to continue on indefinitely.
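In outline, that protocol is an episode loop in which the reward is simply how far the car gets before the safety driver has to step in. The sketch below is a hypothetical reconstruction of that loop; the `car` and `agent` interfaces are invented placeholders for illustration, not Wayve's software, and the real system learned from camera images with a deep network rather than these stubs.

```python
def train_on_road(car, agent, episodes):
    """Hypothetical outline of the on-road training loop described above.
    The `car` and `agent` objects are assumed interfaces, not real APIs."""
    for episode in range(episodes):
        car.wait_for_human_alignment()        # driver points the car down the road
        start = car.odometer()
        done = False
        while not done:
            frame = car.camera_frame()        # single forward-facing camera
            steering, throttle = agent.act(frame)
            car.apply(steering, throttle)
            done = car.human_intervened()     # nearly leaving the road ends the episode
            reward = car.odometer() - start   # further without intervention = better
            agent.record(frame, (steering, throttle), reward, done)
        agent.update()                        # improve the policy before the next attempt
```

Framing the reward as distance travelled before intervention is what lets the system improve so quickly: every human takeover is an unambiguous signal about what not to do, and every longer run is evidence the policy is getting better.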