Review on Scene Semantics Extraction for Decision Making System in Autonomous Vehicles

Yuvraj Bapu Hembade

Abstract

It is widely expected that traditional manual driving will be superseded by Autonomous Vehicles [AVs] in the coming years, making AVs one of the most anticipated developments in the automotive industry. This transition requires a Decision Making System [DMS] that enables AVs to interpret the real-time situations around them. Recognizing street scenes and extracting the relevant semantics from them is a particularly challenging task. Image classification and object detection techniques based on Deep Convolutional Neural Networks [DCNN] are therefore likely to play a vital role in most methodologies designed for scene semantics extraction. Based on the extracted scene semantics, the DMS actuates the devices that control the vehicle's speed and steering angle. Consequently, how completely information is extracted from road scene images, covering all the aspects needed to make sound decisions, has a strong bearing on the overall performance of AVs.

 

