Application of Large Language Models in glaucoma diagnosis
DOI:
https://doi.org/10.17979/ja-cea.2025.46.12085
Keywords:
Decision support, Medical imaging and processing, Identification and validation, Model formulation and experiment design, Biomedical and medical image processing and systems
Abstract
This research explores the potential of Visual Large Language Models (Visual LLMs) for diagnosing glaucoma from retinal fundus images (retinographies). Specifically, it analyses the Visual LLM known as Moondream. Using transfer learning techniques, the model was re-trained with retinal images so that it learns to distinguish between healthy eyes and eyes with glaucomatous signs. The methodology combines visual feature extraction with textual reasoning, opening new avenues for automated clinical interpretation. This work positions Visual LLMs as an attractive option for integrating multimodal Artificial Intelligence into ophthalmology and improving glaucoma detection.
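As a rough illustration of how a healthy-versus-glaucoma Visual LLM could be scored, the sketch below maps free-text answers to binary labels and computes sensitivity and specificity. The abstract does not state the prompt, the answer format, or the metrics used, so `PROMPT`, `answer_to_label`, `evaluate`, and the `ask` callable are all hypothetical names introduced here; `ask` stands in for whatever function queries the model and is not Moondream's actual API.

```python
import re
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical prompt; the wording used in the study is not given in the abstract.
PROMPT = "Does this retinal image show signs of glaucoma? Answer yes or no."


def answer_to_label(answer: str) -> int:
    """Map a free-text answer to 1 (glaucomatous signs) or 0 (healthy)."""
    match = re.match(r"\W*(yes|no)\b", answer.strip().lower())
    if match is None:
        raise ValueError(f"Unparseable answer: {answer!r}")
    return 1 if match.group(1) == "yes" else 0


def evaluate(
    ask: Callable[[str, str], str],
    samples: Iterable[Tuple[str, int]],
) -> Dict[str, float]:
    """Score a model over (image_id, true_label) pairs.

    `ask(image_id, prompt)` is an assumed interface that returns the
    model's text answer for one image.
    """
    tp = tn = fp = fn = 0
    for image_id, truth in samples:
        pred = answer_to_label(ask(image_id, PROMPT))
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        elif pred == 1 and truth == 0:
            fp += 1
        else:
            fn += 1
    # Sensitivity: fraction of glaucomatous eyes flagged as glaucomatous.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of healthy eyes reported as healthy.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": sensitivity, "specificity": specificity,
    }
```

Reporting sensitivity and specificity separately, rather than accuracy alone, is a common choice for screening tasks because datasets of this kind are often class-imbalanced.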
License
Copyright (c) 2025 Jorge Hernández Vidal, Eduardo José Barrios García, Silvia Alayón Miranda, Valentín Tinguaro Díaz Alemán

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.