Fusion of images and physiological signals for driver modeling
DOI: https://doi.org/10.17979/ja-cea.2025.46.12121

Keywords: Human and vehicle interaction, Shared control, Cooperation and level of automation, Human-centered automation and design, Physiological model, Information and sensor fusion, Neural networks, Autonomous vehicles

Abstract
At intermediate levels of automation, the vehicle still relies on the driver as a backup, which requires estimating the driver's cognitive and physical readiness to resume control when necessary. While images and physiological signals have each proven useful on their own, combining them presents technical challenges due to their disparity in dimensionality. This work proposes a fusion architecture that transforms physiological signals into images so they can be integrated with visual data using autoencoders trained with perceptual loss functions. The architecture is first validated on physiological signals transformed into images, comparing different conversion techniques. The efficacy of perceptual loss functions in image autoencoders is then evaluated, highlighting their utility in preserving visual structure during reconstruction. These findings underscore the approach's viability and pave the way for future extensions that integrate visual data and assess the system in more complex scenarios.
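As a minimal sketch of the signal-to-image step described above, the following shows one common conversion technique, the Gramian Angular (Summation) Field, implemented with NumPy. This is an illustration of the general technique, not the paper's actual pipeline; the series length, scaling choices, and the use of the summation variant are assumptions.

```python
import numpy as np

def gramian_angular_field(x):
    """Convert a 1-D signal into a 2-D image-like matrix (GASF variant)."""
    x = np.asarray(x, dtype=float)
    # Min-max scale into [-1, 1] so arccos is well defined.
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    x_scaled = np.clip(x_scaled, -1.0, 1.0)
    phi = np.arccos(x_scaled)  # angle in polar coordinates
    # Each pixel (i, j) encodes the pairwise temporal relation cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])

# Example: a 64-sample signal becomes a 64x64 matrix with values in [-1, 1],
# which can then be fed to an image autoencoder alongside camera frames.
g = gramian_angular_field(np.sin(np.linspace(0, 2 * np.pi, 64)))
```

Analogous conversions (recurrence plots, Markov transition fields) follow the same pattern: a fixed-length window of the physiological signal is mapped to a square matrix that standard convolutional architectures can consume.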
License
Copyright (c) 2025 Raúl Fernández Matellán, David Martín Gómez, Arturo de la Escalera Hueso

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.