Donkey Car Newsletter

Safe, In Person Racing! Donkey Car Roadmap with PyTorch and

In-Person Racing Returns to Oakland

Safe, in-person racing returns to Oakland on December 5. It will be held outdoors and limited to 50 mask-wearing people. Unlike most DIYRobocars races, this one will allow GPS! Sign up here

For those of you interested in an online race, sign up here for the December 19th virtual race.

Donkey Car 4.1 roadmap

It feels like just yesterday that Donkey Car 4.0 was released, but 4.1 is already coming soon. With the new maintainers, we expect to do quarterly releases of Donkey Car. Dirk Prange will be the release captain.

  • PyTorch and support - If you have been putting off learning PyTorch and, now is the time to get started.

  • Auto Encoder - More below

  • Lots of other fixes…

More on the AutoEncoder

The driving idea behind the auto encoder in Donkey Car is to decouple the visual system from the ‘motor neurons’ of the car, called the controller. As it turns out, the visual system, which is represented by the CNN layers, is far more complex than the controller. The visual system also does not have to depend on a specific track. It is a component that we will train once on a much bigger set of data and then re-use on new tracks, so that transfer learning only has to cover the controller part. This is very similar to learning motor skills in biological brains. In order to learn how to catch a ball or ride a bicycle we don’t have to learn to see again, because that is a skill we have already acquired at that stage. We just learn how to activate the muscles in the right order at the right time, which is a much simpler task.

Cutting the standard donkey linear model after the CNN layers and adding a single dense layer gives us an image encoder that compresses the high-dimensional image data into a much lower-dimensional latent space. Our 120x160x3 frames are mapped into a 128-dimensional vector. The auto encoder then appends a mirrored deconvolutional network, the decoder, which inflates the latent vector back to the input image size. A good introduction with Keras code can be found here:
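To make the compression concrete, here is a small sketch in plain Python that follows the frame shape through an encoder. The layer stack in ASSUMED_CONV_LAYERS (five unpadded convolutions) is an assumption made for illustration and is not taken from this newsletter, so check it against the model you actually use. Only the 120x160x3 input and the 128-dimensional latent vector come from the text above.

```python
def conv_output(height, width, kernel, stride):
    """Output height and width of an unpadded ('valid') convolution."""
    return (height - kernel) // stride + 1, (width - kernel) // stride + 1


# ASSUMPTION: (filters, kernel size, stride) per layer, for illustration only.
ASSUMED_CONV_LAYERS = [(24, 5, 2), (32, 5, 2), (64, 5, 2), (64, 3, 1), (64, 3, 1)]

# From the text: the encoder output is a 128-dimensional latent vector.
LATENT_DIM = 128


def encoder_shapes(height=120, width=160, channels=3):
    """Return the tensor shape after the input and after each conv layer."""
    shapes = [(height, width, channels)]
    for filters, kernel, stride in ASSUMED_CONV_LAYERS:
        height, width = conv_output(height, width, kernel, stride)
        shapes.append((height, width, filters))
    return shapes


shapes = encoder_shapes()
last_h, last_w, last_c = shapes[-1]

# Number of values the single dense layer sees after flattening.
flattened = last_h * last_w * last_c

# How many input values are squeezed into each latent value.
input_size = 120 * 160 * 3
compression = input_size / LATENT_DIM

for shape in shapes:
    print(shape)
print("flattened:", flattened, "latent:", LATENT_DIM, "compression:", compression)
```

Whatever the exact layer stack, the input has 57,600 values and the latent vector has 128, so the encoder compresses each frame by a factor of 450. The decoder would walk the same shapes in reverse.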

The auto encoder can be trained completely unsupervised by minimizing the difference between the output and input images. However, that approach does not give us the latent representation that is most suitable for detecting features relevant for driving, like lane edges. Hence our training loss function also contains steering and throttle terms.
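The idea of a loss with both reconstruction and driving terms can be sketched as follows. This is a minimal illustration and not the actual Donkey Car loss: the use of mean squared error for every term, the function names, and the steering_weight and throttle_weight parameters are all assumptions.

```python
def mean_squared_error(predicted, target):
    """Mean squared error between two equally long sequences of floats."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def combined_loss(image, reconstruction,
                  steering, steering_pred,
                  throttle, throttle_pred,
                  steering_weight=1.0, throttle_weight=1.0):
    """Reconstruction loss plus weighted steering and throttle terms.

    image and reconstruction are flattened pixel lists. The weights are
    hypothetical knobs that balance image quality against driving accuracy.
    """
    reconstruction_term = mean_squared_error(reconstruction, image)
    steering_term = (steering_pred - steering) ** 2
    throttle_term = (throttle_pred - throttle) ** 2
    return (reconstruction_term
            + steering_weight * steering_term
            + throttle_weight * throttle_term)


# A perfectly reconstructed image still costs something if steering is off.
example = combined_loss(
    image=[0.0, 1.0], reconstruction=[0.0, 1.0],
    steering=0.5, steering_pred=0.0,
    throttle=0.3, throttle_pred=0.3,
    steering_weight=2.0,
)
print(example)
```

Because the steering and throttle errors flow back through the encoder, the latent vector is pushed to keep the image features that matter for driving rather than only those that make the reconstruction look good.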

From the pre-trained auto encoder we subsequently use only the encoder part in our model, and its latent vector is fed directly into the controller. The controller is a small feed-forward network that is about two orders of magnitude (i.e. a factor of 100) smaller than the decoder. Training is performed only on the controller, which makes it very fast.

Things to develop further:
1) Training might be so fast that we can train while we are driving. Not necessarily on the car, but likely on the PC in the background.
2) We can explore RL on the physical car, not only in simulation.
3) With the new augmentation we can train a de-noising auto encoder (see example in Keras above). Instead of feeding the original images in training, we feed noisy images into the auto encoder, but in the loss function we compare the reproduced images against the original ones. In particular, changes in light levels, which are scalar multiplications of the input images, can be efficiently factored out. Currently, for a trained linear model, such an image transformation changes the output, i.e. the car oversteers or understeers and drives faster or slower as light levels change. Auto encoders are very good at de-noising. If you play with the Keras tutorial above you can create very blurred images that you will find hard to recognize, and the network will generally do a better job than the human eye.
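
The de-noising setup in item 3 comes down to how the training pairs are built: the network input is a degraded image, while the loss target stays the original. The sketch below shows one way to build such a pair in plain Python. The function name, the brightness range, and the noise level are hypothetical choices for illustration, and pixel values are assumed to be scaled to the range 0 to 1.

```python
import random


def make_denoising_pair(image, rng, brightness_range=(0.5, 1.5), noise_std=0.05):
    """Return (noisy_input, clean_target) for one flattened image.

    The input is the image multiplied by a random brightness gain, with
    Gaussian noise added and clipped to [0, 1]. The target is an untouched
    copy of the original image.
    """
    gain = rng.uniform(*brightness_range)  # scalar change in light level
    noisy = [
        min(1.0, max(0.0, gain * pixel + rng.gauss(0.0, noise_std)))
        for pixel in image
    ]
    return noisy, list(image)


rng = random.Random(0)  # seeded so that runs are repeatable
original = [0.2, 0.5, 0.8]
noisy_input, clean_target = make_denoising_pair(original, rng)
print("input :", noisy_input)
print("target:", clean_target)
```

During training the auto encoder would receive noisy_input and its reconstruction would be compared against clean_target, so it has to learn to undo the brightness change and the noise.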