Scalable Vehicle Detection System for Intelligent Infrastructures
DOI: https://doi.org/10.17979/ja-cea.2025.46.12076
Keywords: Intelligent transportation systems, Machine Learning, Sensor integration and perception, Perception and sensing
Abstract
The growing urban population has increased the need for efficient transportation systems that enhance road safety, optimize traffic flow and reduce environmental impact. Intelligent infrastructures equipped with sensing technologies have emerged as a key solution for traffic monitoring; however, they still face challenges related to cost, accuracy and installation complexity. This article presents a scalable 3D vehicle detection system that supports three detection modes: monocular, LiDAR and multimodal (LiDAR combined with an RGB camera). The proposed system automatically selects the most suitable mode based on the sensors available in each infrastructure. Experimental results show that this modular approach effectively balances cost and performance, enabling flexible and progressive deployment according to the specific needs of each urban environment.
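The mode selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual selection rule, so the priority order used here (multimodal when both sensors are present, otherwise whichever single sensor exists) is an assumption, and the names `SensorSuite`, `DetectionMode` and `select_mode` are hypothetical and not taken from the authors' code.

```python
from dataclasses import dataclass
from enum import Enum


class DetectionMode(Enum):
    """The three detection modes named in the abstract."""
    MONOCULAR = "monocular"
    LIDAR = "lidar"
    MULTIMODAL = "multimodal"  # LiDAR combined with an RGB camera


@dataclass(frozen=True)
class SensorSuite:
    """Hypothetical description of the sensors installed at one site."""
    has_camera: bool
    has_lidar: bool


def select_mode(sensors: SensorSuite) -> DetectionMode:
    """Pick a detection mode from the sensors available.

    Assumed rule (not confirmed by the source): prefer the multimodal
    mode when both sensors exist, otherwise fall back to the single
    sensor that is present.
    """
    if sensors.has_camera and sensors.has_lidar:
        return DetectionMode.MULTIMODAL
    if sensors.has_lidar:
        return DetectionMode.LIDAR
    if sensors.has_camera:
        return DetectionMode.MONOCULAR
    raise ValueError("no supported sensor available at this site")


if __name__ == "__main__":
    # A camera-only site can start in monocular mode and move to the
    # multimodal mode once a LiDAR is added.
    print(select_mode(SensorSuite(has_camera=True, has_lidar=False)).value)
    print(select_mode(SensorSuite(has_camera=True, has_lidar=True)).value)
```

Keeping the decision in one small function is one way to support the progressive deployment the abstract mentions: upgrading a site changes only its sensor description, not the calling code.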
License
Copyright (c) 2025 Javier Borau Bernad, José María Armingol Moreno, Araceli Sanchis de Miguel

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.