Scene Understanding for self driving car

8/1/2017

Semantic Segmentation Using Deep Learning

Scene understanding is one of the important challenge for self driving car. The semantic segmentation map provides pix level information of things in scene on road which therefore, helps an autonomous car in decision making. In order to get segmented image for self driving car (SDC), we evaluated three major segmentation models namely, SegNet [1], KitiiSeg [2] and ResNet [3] on popular

challenging dataset including CamVid-11, CamVid-32 [4], CityScapes [5] and Mapillary [6]. We trained models in tensorflow. SegNet is fast in training however, it miss classifies pixel on boundaries. ResNet-101 with atrous convolution [8,9] not only gives better segmentation but decreases size of model weights, although it takes more time to converge during training. Results are demonstrated in the following images.

Quality of road segmentation We road pixel and try to find lane marking and then fuse with road markings detected through segmentation.

Result on test image of SegNet model trained on Cam-11 (right) and CamVid-32 (left) dataset. Lane marking is class in CamVid-32 dataset however, it is not a class in CamVid-11 dataset.

Road marking detection

Road marking is class in CamVid-32 dataset, but we can't rely only on segmented pixels as at it can't segment sharp boundaries. Furthermore, we also need to identify road marking, count total number of lanes on road, and orientation of marking. Therefore, we applied classical vision algorithms using OpenCV to achieve the target. Some of the examples are demonstrated in following figure, where red line is parametrized line drawn on lane marking.

References:
1. https://github.com/tkuanlun350/Tensorflow-SegNet
2. https://github.com/MarvinTeichmann/KittiSeg
3. Chen, Liang-Chieh, et al. "Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs." arXiv preprint arXiv:1606.00915 (2016).
4. http://mi.eng.cam.ac.uk/research/projects/VideoRec/CamVid/
5. https://www.cityscapes-dataset.com/
6. https://www.mapillary.com/
7. Teichmann, Marvin, et al. "MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving." arXiv preprint arXiv:1612.07695
8. Chen, Liang-Chieh, et al. "Rethinking Atrous Convolution for Semantic Image Segmentation." arXiv preprint arXiv:1706.05587 (2017).
9. Kendall, Alex, Vijay Badrinarayanan, and Roberto Cipolla. "Bayesian segnet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding." arXiv preprint arXiv:1511.02680 (2015).

0 Comments