Abstract: The present disclosure provides a system (100) and method for predicting macular thickness from retinal fundus images (101) and clinical data (106) as a cost-effective alternative to OCT. The system segments the macular region using a trained U-Net convolutional neural network (104), extracts visual features from the segmented region using pre-trained CNN architectures, and processes clinical parameters such as diabetes duration and visual acuity. The visual and clinical features are combined (107) to form a unified feature vector which feeds into a trained regression model (116) that outputs predicted macular thickness values (109). Three models, InceptionV3, EfficientNetB0, and DenseNet121, are compared and the results are noted. The best-performing model, EfficientNetB0, achieves a Mean Squared Error (MSE) of 0.2518 and an R² score of 0.7752, demonstrating good generalization ability on unseen data by effectively combining imaging and tabular features for accurate regression. The system integrates with telemedicine platforms for remote diagnosis, enabling early detection and monitoring of retinal diseases in resource-limited settings where OCT equipment is unavailable.
Description:
FIELD OF THE INVENTION
[0001] The invention generally relates to the field of medical imaging and artificial intelligence (AI) and in particular, the present disclosure relates to the development of a system and method for predicting macular thickness from retinal fundus images and associated clinical data.
DESCRIPTION OF THE RELATED ART
[0002] Retinal disease diagnosis and monitoring is a critical area in ophthalmology. Various methods and techniques have been proposed to assess retinal conditions, including fundus photography, optical coherence tomography (OCT), fluorescein angiography, and visual field testing. Recently, with the advancements in artificial intelligence and deep learning techniques, researchers have explored the use of these technologies in retinal disease diagnosis and management.
[0003] Convolutional neural networks (CNNs) and deep learning models are some of the popular techniques used for retinal image analysis. Gulshan et al. proposed methods that utilize CNNs to detect diabetic retinopathy from fundus images. Burlina et al. described methodologies that make use of deep learning and transfer learning to achieve intelligent classification of age-related macular degeneration. Gargeya and Leng proposed methods for diabetic retinopathy detection based on CNNs that exploit spatial information and achieve superior performance and robustness. However, the quantitative assessment of macular thickness without OCT poses a significant challenge in developing accurate and robust models. Schlegl et al. proposed various techniques to address this issue, including multi-modal fusion, transfer learning, and regression-based approaches. Poplin et al. proposed methods that utilize CNN-based feature extraction from fundus images to predict cardiovascular risk factors. De Fauw et al. developed approaches that leverage both imaging and clinical data to improve diagnostic accuracy. Schmidt-Erfurth et al. proposed methods for estimating retinal parameters using deep learning on OCT images, but these methods cannot be applied to settings where OCT is unavailable. In real clinical scenarios, it is challenging to access expensive OCT equipment in many settings, particularly in low-resource environments.
[0004] Macular thickness assessment is critical in ophthalmology to ensure early detection, proper management of retinal diseases, and prevention of vision loss. To enhance diagnostic capabilities for different retinal conditions, data-driven approaches have been proposed. Data-driven retinal assessment includes image acquisition, feature extraction, and prediction. Necessary images are acquired through fundus cameras that are widely available in clinical settings. Feature extraction techniques such as time-frequency analysis, segmentation techniques, and convolutional neural networks (CNNs) aim to remove redundant information, reduce dimensionality, and enhance relevant features in the images. In the prediction stage, extracted features are used as input for machine learning models. Obtaining OCT measurements to capture macular thickness in resource-limited settings can be challenging for various reasons. Implementing OCT in remote or underdeveloped regions can be costly, not only in terms of the direct financial cost of the equipment but also in terms of the infrastructure and training required. Further, maintaining OCT devices may not be technically feasible due to the lack of technical expertise in certain regions. Also, the limited availability of OCT may lead to delayed diagnosis and treatment for patients in these areas. Furthermore, acquiring OCT scans can be a time-consuming process that may not be feasible in high-volume clinical settings with limited resources.
[0005] It is not possible to obtain OCT measurements and capture macular thickness data from all patients in resource-limited settings where such high-capacity equipment is not available. This imposes a major challenge to the implementation of data-driven methods based on conventional OCT techniques for accurate macular thickness assessment.
[0006] Therefore, there is a requirement for developing an intelligent macular thickness prediction system using widely available fundus images and clinical data that can serve as a cost-effective alternative to OCT.
OBJECTS OF THE PRESENT DISCLOSURE
[0007] An object of the present disclosure is to provide a system and method for cost-effective macular thickness prediction which receives input samples that include retinal fundus images and clinical data of patients.
[0008] Another object of the present disclosure is to provide a system which segments the macular region from retinal fundus images using a trained U-Net convolutional neural network model.
[0009] Another object of the present disclosure is to provide a system which extracts visual features from the segmented macular region using convolutional neural networks and processes clinical data to generate clinical features.
[0010] Another object of the present disclosure is to provide a system which combines the visual features and clinical features to form a unified feature vector for input into a trained regression model.
[0011] Another object of the present disclosure is to provide a system which predicts macular thickness values from the combined feature vector, providing a cost-effective alternative to optical coherence tomography (OCT) for retinal assessment.
SUMMARY
[0012] In an aspect, the present disclosure relates to a system to predict macular thickness from retinal fundus images. The present disclosure relates to an AI-driven system and method in the field of medical imaging that provides a cost-effective alternative to OCT for macular thickness prediction, enabling widespread accessibility for retinal disease diagnosis. The system includes a processor and a memory communicatively coupled to the processor, where said memory stores instructions which, when executed by the processor, cause the processor to receive one or more input samples. The one or more input samples include a retinal fundus image and associated clinical data of a patient. The processor segments a macular region from the retinal fundus image using a trained segmentation module including a U-Net CNN model. The processor extracts visual features from the segmented macular region using a visual feature extraction module. The processor processes the clinical data to generate clinical features using a clinical feature extraction module. The processor combines the visual features from the visual feature extraction module and the clinical features from the clinical feature extraction module to form a combined feature vector using a feature fusion module. The processor inputs the combined feature vector into a trained regression module. The processor outputs a predicted macular thickness value based on the regression module.
[0013] In an embodiment, the trained segmentation module includes a U-Net convolutional neural network configured to detect and isolate the macular region in the retinal fundus image.
[0014] In an embodiment, the clinical data include diabetes duration and visual acuity.
[0015] In an embodiment, the trained segmentation module is trained on a dataset of 70 manually annotated fundus images for building 70 masks where the macular region has been precisely delineated using a Computer Vision Annotation Tool (CVAT).
[0016] In an embodiment, the system further provides a confidence score associated with each macular thickness prediction.
[0017] In an embodiment, the system is configured to automatically generate a report indicating the predicted macular thickness value and associated risk of retinal diseases.
[0018] In an aspect, the present disclosure relates to a method for predicting macular thickness from retinal fundus images. The method includes receiving, by a processor associated with a system, one or more input samples, wherein the one or more input samples include a retinal fundus image and associated clinical data of a patient. The method includes segmenting, by the processor, a macular region from the retinal fundus image using a trained segmentation module including a U-Net CNN model. The method includes extracting, by the processor, visual features from the segmented macular region using a visual feature extraction module. The method includes processing, by the processor, the clinical data to generate clinical features using a clinical feature extraction module. The method includes combining, by the processor, the visual features from the visual feature extraction module and the clinical features from the clinical feature extraction module to form a combined feature vector using a feature fusion module. The method includes inputting, by the processor, the combined feature vector into a trained regression module. The method includes outputting, by the processor, a predicted macular thickness value based on the regression module.
[0019] In an embodiment, processing the clinical data to generate clinical features using the clinical feature extraction module includes normalizing continuous clinical variables to have a mean of 0 and a standard deviation of 1, and converting categorical clinical variables into a numerical format using one-hot encoding. The processing further includes imputing missing values in the clinical data using mean imputation for continuous variables and mode imputation for categorical variables, and transforming the preprocessed clinical data into a feature vector using a convolutional neural network.
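By way of non-limiting illustration, the clinical preprocessing described above may be sketched in Python using scikit-learn; the library choice, column names, and sample values below are assumptions for illustration only (the `sex` column is a hypothetical categorical example).

```python
# Illustrative sketch of the clinical preprocessing of paragraph [0019],
# assuming pandas and scikit-learn (>=1.2 for sparse_output).
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

clinical = pd.DataFrame({
    "diabetes_duration": [5.0, 12.0, np.nan, 8.0],  # years (continuous)
    "visual_acuity":     [0.3, 0.1, 0.5, np.nan],   # continuous
    "sex":               ["M", "F", None, "F"],     # categorical (hypothetical)
})
continuous, categorical = ["diabetes_duration", "visual_acuity"], ["sex"]

# Mean imputation, then zero-mean/unit-variance scaling for continuous variables.
cont = StandardScaler().fit_transform(
    SimpleImputer(strategy="mean").fit_transform(clinical[continuous]))

# Mode imputation, then one-hot encoding for categorical variables.
cat = OneHotEncoder(sparse_output=False).fit_transform(
    SimpleImputer(strategy="most_frequent").fit_transform(clinical[categorical]))

clinical_features = np.hstack([cont, cat])  # one clinical feature vector per patient
print(clinical_features.shape)
```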
[0020] In an embodiment, processing the clinical data comprises normalizing continuous clinical variables and converting categorical clinical variables into a numerical format.
[0021] In an embodiment, the method includes integrating the system with a telemedicine platform to enable remote diagnosis and monitoring of retinal diseases.
[0022] In an embodiment, the convolutional neural network used for macular region segmentation achieves a validation accuracy of approximately 97.74%, making it a reliable tool for macular region identification in fundus images.
[0023] In an embodiment, the system is particularly beneficial in low-resource settings where access to optical coherence tomography (OCT) equipment is limited, thereby democratizing access to retinal diagnostics.
[0024] In an embodiment, the system enables early detection of retinal diseases by providing a cost-effective alternative to OCT, which may lead to timely interventions and improved patient outcomes.
BRIEF DESCRIPTION OF DRAWINGS
[0025] The accompanying drawings, which are incorporated herein and constitute a part of this invention, illustrate exemplary embodiments of the disclosed methods and systems, in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present invention. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that such drawings include the electrical components, electronic components, or circuitry commonly used to implement such components.
[0026] FIG. 1 illustrates an exemplary block diagram of an AI-driven macular thickness prediction system, in accordance with an embodiment of the present disclosure.
[0027] FIG. 2 illustrates 12 fundus images and their corresponding macular masks from patients with diabetic retinopathy, in accordance with an embodiment of the present disclosure.
[0028] FIG. 3 illustrates input fundus images and their corresponding predicted macular masks generated by the trained U-Net model, in accordance with an embodiment of the present disclosure.
[0029] FIG. 4 illustrates an exemplary flowchart depicting the working of the proposed system, in accordance with an embodiment of the present disclosure.
[0030] FIG. 5 illustrates an exemplary flowchart of a method for predicting macular thickness from retinal fundus images, in accordance with an embodiment of the present disclosure.
[0031] FIG. 6 illustrates comparative performance plots of different convolutional neural network models (InceptionV3, EfficientNetB0, and DenseNet121) used for feature extraction, depicting both training versus validation mean squared error and predicted versus true macular thickness values, in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION
[0032] While the present disclosure has been disclosed with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made, and equivalents may be substituted without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from its scope.
[0033] Throughout the specification and claims, the following terms take the meanings explicitly associated herein unless the context clearly dictates otherwise. The meaning of "a", "an", and "the" include plural references. The meaning of "in" includes "in" and "on." Referring to the drawings, like numbers indicate like parts throughout the views. Additionally, a reference to the singular includes a reference to the plural unless otherwise stated or inconsistent with the disclosure herein.
[0034] The present disclosure discloses an artificial intelligence (AI) driven system that leverages convolutional neural networks and regression modeling to predict macular thickness from retinal fundus images and clinical data. This method effectively eliminates the need for expensive optical coherence tomography (OCT) equipment by segmenting the macular region from fundus images, extracting visual features, and combining them with processed clinical data to form a unified feature vector. The resulting vector is then input into a trained regression model to generate accurate macular thickness predictions, enabling cost-effective screening and monitoring of retinal diseases in resource-limited settings.
[0035] Various embodiments of the present disclosure are described using FIGs. 1 to 6.
[0036] FIG. 1 illustrates an exemplary block diagram of an AI-driven macular thickness prediction system, in accordance with an embodiment of the present disclosure.
[0037] As illustrated in FIG. 1, the block diagram showcases a complete AI-driven macular thickness prediction system (100) (interchangeably referred to as system (100), hereinafter) with integrated hardware and software components. The system (100) includes a processor (110) for executing instructions, a memory (111) for storing temporary data and instructions, a storage device (112) for permanent data storage, and an input device (113) such as fundus cameras and clinical data entry interfaces, which are communicatively coupled through a system bus to support the specialized software modules. FIG. 1 illustrates the comprehensive data flow: retinal fundus images (101) and clinical data (106) are processed through a segmentation module (102) that isolates the macular region (114), the segmentation model having been trained using CVAT annotations (103). The segmented region undergoes visual feature extraction via a CNN (105), while the clinical data is processed in parallel through its own feature extraction pathway. These features converge at a feature fusion module (107) to create a unified feature vector that feeds into a regression module (116), ultimately generating macular thickness predictions with confidence scores (109). This architecture demonstrates the ability of the system (100) to integrate multimodal data for accurate macular thickness assessment without requiring specialized OCT equipment.
[0038] In an embodiment, the system (100) employs a trained segmentation model that is specifically developed using a dataset of 70 manually annotated fundus images. These images were meticulously processed using the Computer Vision Annotation Tool (CVAT), which was used to precisely delineate the macular region in each fundus image. The annotation process involved identifying the boundaries of the macula with pixel-level precision, creating binary masks that serve as ground truth for the segmentation model training. The CVAT tool enabled efficient and accurate annotation by providing specialized features for medical image labeling, including zoom capabilities, contrast adjustment, and precise boundary marking tools. The resulting dataset of annotated images encompasses diverse retinal appearances across different pathologies, particularly focusing on diabetic retinopathy cases, to ensure the model's robustness and generalizability. This comprehensive training dataset enables the segmentation model to accurately identify and isolate the macular region in previously unseen fundus images.
[0039] FIG. 2 illustrates 12 sample fundus images along with their corresponding macular masks from patients with diabetic retinopathy, in accordance with an embodiment of the present disclosure.
[0040] As illustrated in FIG. 2, each macular mask serves as ground truth for training the U-Net model to identify and isolate the macular region in unseen images. These images depict the diversity of fundus appearances across patients with diabetic retinopathy and demonstrate the variety of macular regions that the system must accurately segment. The left side depicts the original fundus images with varying pigmentation, vessel visibility, and pathological features, while the right side depicts the corresponding masks where the macular region is clearly delineated.
[0041] In an embodiment, the U-Net model employed in the present disclosure includes an input layer accepting RGB fundus images of size 512×512 pixels; four down-sampling blocks in the encoder, each consisting of two 3×3 convolutional layers with ReLU activation followed by a 2×2 max pooling layer; a bottleneck with two 3×3 convolutional layers with ReLU activation; four up-sampling blocks in the decoder, each consisting of a 2×2 transposed convolution followed by two 3×3 convolutional layers with ReLU activation; skip connections that concatenate feature maps from the encoder to the corresponding decoder blocks; and a final 1×1 convolutional layer with sigmoid activation to produce a binary mask of the macular region.
[0042] In an embodiment, the U-Net model is trained using a binary cross-entropy loss function and the Adam optimizer with a learning rate of 0.001. Training is conducted with a batch size of 4 over 100 epochs.
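A minimal sketch of this U-Net architecture and training configuration, assuming TensorFlow/Keras (the framework choice is illustrative), is given below; layer sizes follow the embodiment of paragraph [0041].

```python
# Sketch of the U-Net of paragraphs [0041]-[0042], assuming TensorFlow/Keras
# and 512x512 RGB inputs; channel counts double from 64 to 1024.
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions with ReLU activation.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

inputs = tf.keras.Input(shape=(512, 512, 3))
skips, x = [], inputs
# Encoder: four down-sampling blocks (conv block + 2x2 max pooling).
for filters in (64, 128, 256, 512):
    x = conv_block(x, filters)
    skips.append(x)
    x = layers.MaxPooling2D(2)(x)
x = conv_block(x, 1024)  # bottleneck
# Decoder: four up-sampling blocks with skip connections.
for filters, skip in zip((512, 256, 128, 64), reversed(skips)):
    x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
    x = layers.Concatenate()([x, skip])
    x = conv_block(x, filters)
outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # binary macular mask

unet = Model(inputs, outputs)
unet.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
             loss="binary_crossentropy", metrics=["accuracy"])
# unet.fit(images, masks, batch_size=4, epochs=100)  # per paragraph [0042]
```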
[0043] In an embodiment, the trained U-Net model achieved a training accuracy of 93.54 % and a validation accuracy of 90.61 %, demonstrating excellent performance in macular region segmentation.
[0044] FIG. 3 illustrates input fundus images and their corresponding predicted macular masks generated by the trained U-Net model, in accordance with an embodiment of the present disclosure.
[0045] As illustrated in FIG. 3, the results of the trained U-Net model on unseen fundus images showcase the ability of the model to accurately segment the macular region across fundus images with varying characteristics. The predicted masks closely match the actual macular regions, highlighting the precision of the segmentation model across different retinal appearances. Following successful segmentation of the macular region, the system extracts meaningful visual features from this region using a pre-trained convolutional neural network (CNN).
[0046] FIG. 4 illustrates an exemplary flowchart (400) depicting the working of the proposed system for macular thickness prediction, in accordance with an embodiment of the present disclosure. At step (402), the processor (110) receives the fundus image from the user interface or connected imaging device. At step (403), the system segments the macular region from the fundus image using the trained U-Net model. At step (404), the system extracts visual features from the segmented macular region through a convolutional neural network. At step (405), the system processes the patient's clinical data to generate clinical features. At step (406), the system combines the visual features and clinical features to form a unified feature vector. At step (407), the system inputs the combined feature vector into the trained regression model. At step (408), the system outputs the predicted macular thickness value with an associated confidence score based on the model's uncertainty estimation.
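An illustrative end-to-end inference sketch following steps (402) to (408) is given below; the handles `unet`, `feature_cnn`, `regressor`, and `clinical_scaler` are hypothetical names for the trained components described in this disclosure, and resizing of the masked region to the feature CNN's input size is omitted for brevity.

```python
# Hypothetical inference pipeline following flowchart (400), assuming
# pre-trained Keras models and a fitted scikit-learn scaler.
import numpy as np

def predict_macular_thickness(fundus_image, clinical_values,
                              unet, feature_cnn, regressor, clinical_scaler):
    # Step (403): segment the macular region with the trained U-Net.
    mask = unet.predict(fundus_image[None, ...])[0]
    macula = fundus_image * (mask > 0.5)                 # isolate macular pixels
    # Step (404): extract visual features from the segmented region.
    visual = feature_cnn.predict(macula[None, ...])[0]   # e.g. a 1x2048 vector
    # Step (405): scale clinical data with the training-time scaler.
    clinical = clinical_scaler.transform([clinical_values])[0]  # e.g. 1x2
    # Steps (406)-(408): fuse features and regress macular thickness.
    fused = np.concatenate([visual, clinical])[None, :]
    return float(regressor.predict(fused)[0, 0])         # thickness in micrometers
```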
[0047] In an embodiment, the present disclosure employs a U-Net convolutional neural network (404) for the segmentation of the macular region from retinal fundus images. The U-Net architecture, as illustrated in FIG. 1, includes an encoder-decoder structure with skip connections that allow precise localization of the macular region within the fundus image. The segmentation process follows these steps: First, the input fundus image is normalized and resized to a uniform dimension of 256×256 pixels. Next, the image is processed through the encoder path, which consists of four blocks, each containing two 3×3 convolutional layers with ReLU activation followed by a 2×2 max pooling operation. The number of feature channels starts at 64 in the first layer and doubles with each downsampling step, reaching 1024 at the bottleneck. Then, the decoder path, also including four blocks, each containing a 2×2 transposed convolution (upsampling) followed by two 3×3 convolutional layers with ReLU activation, reconstructs the spatial resolution. Skip connections concatenate corresponding encoder feature maps to decoder feature maps to preserve spatial information. Finally, a 1×1 convolutional layer with sigmoid activation produces a binary segmentation mask identifying the macular region.
[0048] FIG. 5 illustrates an exemplary flowchart of a method (500) for predicting macular thickness from retinal fundus images, in accordance with an embodiment of the present disclosure. At step (502), the method (500) includes receiving by a processor (110), one or more input samples, wherein the one or more input samples include a retinal fundus image (101) and associated clinical data (106) of a patient.
[0049] Continuing further, at step (504), the method (500) includes segmenting a macular region from the retinal fundus image (101) using a U-Net convolutional neural network (404). Further, the U-Net convolutional neural network (404) is trained on a dataset of manually annotated fundus images to precisely delineate the macular region.
[0050] Continuing further, at step (506), the method (500) includes extracting visual features from the segmented macular region using a convolutional neural network. Further, the convolutional neural network processes the segmented macular region to identify relevant visual characteristics for predicting macular thickness.
[0051] Continuing further, at step (508), the method (500) includes processing the clinical data (106) to generate clinical features. The clinical features include processed information about the patient's diabetes duration and visual acuity.
[0052] Continuing further, at step (510), the method (500) includes combining the visual features and the clinical features to form a unified feature vector. The unified feature vector integrates the complementary information from both the fundus image and clinical parameters.
[0053] Continuing further, at step (512), the method (500) includes inputting the unified feature vector into a trained regression model. The regression model is trained to predict macular thickness from the combined visual and clinical features.
[0054] Continuing further, at step (514), the method (500) includes generating, by the regression model, a predicted macular thickness value and an associated confidence score based on the unified feature vector. The confidence score indicates the reliability of the prediction.
[0055] The method steps described in FIG. 5 provide a comprehensive approach to predict macular thickness without the need for expensive Optical Coherence Tomography (OCT) equipment, thereby enabling early detection and monitoring of retinal diseases in resource-constrained settings.
[0056] In an embodiment, transfer learning is employed to fine-tune the pre-trained CNN on a dataset of macular images to enhance its feature extraction capabilities for this specific domain. Fine-tuning is performed by freezing the early layers of the network while allowing the deeper layers to be updated during training, enabling the model to adapt to the specific characteristics of macular images while leveraging the general feature extraction capabilities learned from a large-scale dataset.
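A brief sketch of this freezing strategy, assuming TensorFlow/Keras with an EfficientNetB0 backbone (the input size and exact cut-off layer below are assumptions), may take the following form:

```python
# Illustrative fine-tuning sketch for paragraph [0056]: freeze early layers
# of a pre-trained backbone while leaving deeper layers trainable.
import tensorflow as tf

backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet",
    input_shape=(128, 128, 3), pooling="avg")

# Freeze the early layers; the cut-off of 30 trainable layers is hypothetical.
for layer in backbone.layers[:-30]:
    layer.trainable = False
```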
[0057] In an embodiment, the system (100) processes clinical data associated with each patient to generate clinical features that complement the visual features extracted from the fundus images. The clinical data includes various parameters relevant to retinal health and systemic conditions that may influence macular thickness.
[0058] In an embodiment, the clinical data includes diabetes duration and visual acuity.
[0059] In an embodiment, the system (100) combines the visual features extracted from the segmented macular region and the clinical features derived from patient data to form a unified feature vector. This combined approach leverages both the imaging characteristics of the macula and the relevant clinical information, providing a comprehensive basis for accurate macular thickness prediction.
[0060] FIG. 6 illustrates comparative performance plots of different convolutional neural network models (InceptionV3, EfficientNetB0, and DenseNet121) used for feature extraction, depicting both training versus validation mean squared error and predicted versus true macular thickness values, in accordance with an embodiment of the present disclosure.
[0061] In an embodiment, the combined feature vector is input into a trained regression model to predict the macular thickness value. The regression model is designed to capture the complex relationships between the visual and clinical features and the corresponding macular thickness. Three models (InceptionV3, EfficientNetB0 and DenseNet121) are compared and the results are noted.
[0062] In an embodiment, the regression model includes a series of fully connected layers: an input layer accepting the combined feature vector (1×2050), formed by concatenating a 1×2048 visual feature vector with a 1×2 clinical feature vector; a hidden layer with 128 units and ReLU activation; a hidden layer with 64 units and ReLU activation; a dropout layer with a rate of 0.3 to prevent overfitting; and an output layer with a single unit and linear activation, representing the predicted macular thickness in micrometers.
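The regression head described above may be sketched in Keras as follows; this is an illustrative reconstruction from the layer list of paragraph [0062], not a definitive implementation.

```python
# Sketch of the regression head of paragraph [0062]: 1x2050 fused input
# (2048 visual + 2 clinical features), two hidden layers, dropout, linear output.
import tensorflow as tf
from tensorflow.keras import layers

regressor = tf.keras.Sequential([
    tf.keras.Input(shape=(2050,)),         # concatenated feature vector
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),                   # regularization against overfitting
    layers.Dense(1, activation="linear"),  # thickness in micrometers
])
```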
[0063] In an embodiment, the regression model is trained using a dataset of 3000 fundus images with corresponding ground truth macular thickness values obtained from OCT scans. The dataset is split into training (80%) and testing (20%) sets, with 10% of the training data reserved for validation during training.
[0064] The model is trained using the mean squared error (MSE) loss function and the Adam optimizer with a learning rate of 0.0001 (1e-4). Early stopping based on the validation loss is employed to prevent overfitting, with a patience of 15 epochs, and the best weights are restored after training.
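A hedged training sketch reflecting these settings (MSE loss, Adam at 1e-4, early stopping with patience 15 and best-weight restoration) follows; `regressor` is the head sketched above, and `X_train`/`y_train` are assumed fused feature vectors with OCT-derived thickness labels.

```python
# Training sketch per paragraphs [0063]-[0064], assuming TensorFlow/Keras.
import tensorflow as tf

def train_regressor(regressor, X_train, y_train):
    regressor.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=15, restore_best_weights=True)
    return regressor.fit(X_train, y_train,
                         validation_split=0.1,  # 10% of training data held out
                         epochs=40,             # 40 epochs as in Example 2
                         callbacks=[early_stop])
```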
[0065] EfficientNetB0 demonstrated the most favourable performance, achieving the lowest test Mean Squared Error (MSE) of 0.2518 and the highest R² score of 0.7752. This demonstrates the model's ability to provide clinically relevant predictions of macular thickness from fundus images and clinical data.
[0066] In an embodiment, the system is configured to automatically generate a report indicating the predicted macular thickness and associated risk of retinal diseases. The report includes the original fundus image; the segmented macular region; the predicted macular thickness value with confidence score; a risk assessment for various retinal conditions based on the predicted macular thickness; and recommendations for follow-up based on the predicted values and risk assessment.
[0067] In an embodiment, the report is designed to be comprehensive yet concise, providing clinicians with actionable information for patient management. The report can be exported in various formats, including PDF and DICOM, for integration into electronic health record systems.
[0068] In an embodiment, the system is integrated with telemedicine platforms to enable remote diagnosis and monitoring of retinal diseases. This integration allows secure transmission of fundus images and clinical data from remote locations; cloud-based processing and prediction of macular thickness; real-time feedback and reporting to remote healthcare providers; longitudinal tracking of macular thickness changes over time; and alert systems for significant changes that may require intervention.
[0069] In an embodiment, the telemedicine integration extends the reach of specialized retinal care to underserved populations and regions with limited access to advanced imaging technologies such as OCT.
Table 1: Patterns in fundus images for different eye diseases
[0070] In an embodiment, the predicted macular thickness value is used for diagnosing or monitoring various retinal conditions (Table 1), including but not limited to diabetic macular edema (DME) and diabetic retinopathy (DR).
[0071] In an embodiment, by democratizing access to retinal diagnostics, the system contributes to reducing the burden of preventable blindness in underserved communities.
[0072] In an embodiment, the system's performance is evaluated using various metrics to ensure its clinical utility, including mean squared error and visual acuity.
[0073] In an embodiment, validation is performed on diverse datasets representing different patient populations, imaging conditions, and retinal pathologies to ensure generalizability.
[0074] In an embodiment, the system incorporates mechanisms for continuous learning and improvement: feedback loops where OCT-confirmed measurements are used to refine the model; regular retraining on expanded datasets to improve generalization; adaptation to different fundus camera specifications and image characteristics; and incorporation of new clinical parameters as they become relevant to macular thickness prediction.
[0075] In an embodiment, this continuous improvement mechanism ensures that the system remains accurate and relevant as clinical knowledge and imaging technologies evolve.
[0076] In an embodiment, the system is implemented using Python with deep learning frameworks such as TensorFlow or PyTorch. The implementation includes modular architecture for easy maintenance and updates; efficient image processing pipelines optimized for performance; secure data handling in compliance with healthcare regulations; user-friendly interfaces for healthcare providers; comprehensive logging for audit trails and quality assurance; and scalable cloud infrastructure for handling variable workloads.
[0077] In an embodiment, the implementation prioritizes both accuracy and efficiency, enabling real-time processing of fundus images.
EXAMPLES
Example 1: Macular Segmentation
[0078] A U-Net model was trained on 70 annotated fundus images to segment the macular region. The model architecture included convolutional layers with 3x3 kernels and ReLU activation, max pooling layers with 2x2 kernels for downsampling, and transpose convolutions for upsampling. The number of feature channels doubled at each downsampling level, starting from 64 and reaching 1024 in the bottleneck.
[0079] The model was trained using the Adam optimizer with a binary cross-entropy loss function. Training was performed with a batch size of 4 over 100 epochs. The model achieved a training loss of 0.0548, a training accuracy of 93.54%, and a validation accuracy of 90.61%.
[0080] When applied to unseen images, the trained U-Net model successfully identified and segmented the macular region, demonstrating its robustness and generalizability.
Example 2: Macular Thickness Prediction
[0081] A dataset augmented to 3000 fundus images with corresponding OCT-measured macular thickness values and clinical data was used to train and evaluate the regression model. The macular regions were segmented using the trained U-Net model from Example 1.
[0082] In many real-world medical applications, relying only on imaging data can miss important clinical insights. Therefore, we designed a hybrid model that combines image features and clinical tabular data for a more comprehensive prediction system. It is a customized deep learning model that fuses:
Convolutional layers (CNN) for fundus image feature extraction.
Tabular input layers for patient-specific numerical data (e.g., Visual Acuity, Diabetes Duration).
A joint feature fusion mechanism to predict a continuous regression target (Retinal Thickness).
[0083] The images are first resized to 128×128 pixels, normalized, and augmented using techniques such as rotation, shifting, and flipping to enhance data diversity. Clinical data is simultaneously processed by standardizing two key tabular features: Visual Acuity (VA) and Diabetes Duration. To mitigate overfitting and improve model robustness, the dataset is augmented to approximately 3000 samples. The model adopts a hybrid architecture featuring a CNN backbone (InceptionV3, EfficientNetB0, or DenseNet121) to capture detailed visual features. For training, the model uses Mean Squared Error (MSE) as the loss function, which is well suited to regression tasks, and is optimized with the Adam optimizer at a learning rate of 1e-4. Techniques like Early Stopping and Model Checkpoint are incorporated to ensure better generalization and performance stability.
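The augmentation pipeline described above may be illustrated with Keras' ImageDataGenerator; the specific transformation ranges below are assumptions, as the disclosure does not specify them.

```python
# Illustrative augmentation for Example 2: normalization plus random
# rotation, shifting, and flipping (ranges are assumed values).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rescale=1.0 / 255,       # normalize pixel values to [0, 1]
    rotation_range=15,       # random rotation (assumed range)
    width_shift_range=0.1,   # random horizontal shift
    height_shift_range=0.1,  # random vertical shift
    horizontal_flip=True)    # random flipping
# batches = augmenter.flow(images_128, thickness_labels, batch_size=32)
```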
[0084] A study began with 176 original samples, which were augmented to create a dataset of 3000 samples. The data was then split into a training set of 2600 samples, including validation, and a testing set of 400 samples. Training was conducted over 40 epochs. Throughout subsequent epochs, both the training and validation losses continued to decline steadily, reflecting effective learning. The model consistently saved progress after each epoch, although warnings indicated that the HDF5 file format used for saving is considered legacy and recommended the newer Keras format instead. From epochs 20 to 30, the validation loss fluctuated slightly but remained low, suggesting that the model had reached a stable and effective learning state, achieving a small final validation loss.
Example 3: Testing the hybrid model
[0085] The trained hybrid model was evaluated on a small independent test set consisting of 10 fundus images and corresponding clinical data. Images were preprocessed by resizing to 128×128 pixels and normalizing, while clinical features (Visual Acuity and Diabetes Duration) were scaled using the same scaler fitted during training.
[0086] Predictions were made on the new samples, and the predicted macular thickness values were compared against the ground truth measurements. Evaluation metrics including Mean Squared Error (MSE) and R² score were computed to assess model performance. The trained model was further tested on a new set of 600 fundus images combined with clinical data (Visual Acuity and Diabetes Duration). After preprocessing and scaling, the model predicted macular thickness values, achieving strong performance across all three models, with EfficientNetB0 yielding the best results.
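These evaluation metrics may be computed with scikit-learn as sketched below; `hybrid_model`, `images_test`, `clinical_test`, and `y_true` are hypothetical names for the trained model and held-out data.

```python
# Evaluation sketch for Example 3, assuming scikit-learn metrics and a
# trained hybrid Keras model taking [image, clinical] inputs.
from sklearn.metrics import mean_squared_error, r2_score

def evaluate(hybrid_model, images_test, clinical_test, y_true):
    # Predict macular thickness from image + clinical inputs, then score.
    y_pred = hybrid_model.predict([images_test, clinical_test]).ravel()
    return {"MSE": mean_squared_error(y_true, y_pred),
            "R2": r2_score(y_true, y_pred)}
```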
[0087] These results demonstrate the model's good generalization ability on unseen data, indicating its ability to effectively integrate visual and clinical information for accurate macular thickness regression.
[0088] Thus, the present disclosure discloses a system and method for predicting macular thickness from retinal fundus images and clinical data, providing a cost-effective alternative to OCT that demonstrates high accuracy and enables widespread accessibility for retinal disease diagnosis in resource-limited settings.
[0089] While considerable emphasis has been placed herein on the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the invention. These and other changes in the preferred embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
ADVANTAGES OF THE PRESENT DISCLOSURE
[0090] The present disclosure leverages widely available retinal fundus images and clinical data to predict macular thickness, eliminating the need for expensive and inaccessible optical coherence tomography (OCT) equipment.
[0091] The present disclosure provides a cost-effective alternative for macular thickness assessment that can be deployed in resource-limited settings, democratizing access to advanced retinal diagnostics.
[0092] The present disclosure generates accurate macular thickness predictions with a clinically acceptable margin of error, enabling reliable screening and monitoring of various retinal conditions.
[0093] The present disclosure integrates with telemedicine platforms, extending specialized retinal care to underserved populations and regions with limited access to advanced imaging technologies.
[0094] The present disclosure enables early detection of sight-threatening conditions through widespread screening, potentially reducing the global burden of preventable blindness.
Claims:
1. A system (100) to predict macular thickness from retinal fundus images, comprising:
a processor (124); and
a memory (126) communicatively coupled to the processor (124), said memory (126) storing instructions to be executed by the processor (124), which, when executed, cause the processor (124) to:
receive one or more input samples, wherein the one or more input samples comprise a retinal fundus image (102) and associated clinical data (118) of a patient;
segment a macular region from the retinal fundus image (102) using a trained segmentation module (104) comprising a U-Net CNN model (112);
extract visual features from the segmented macular region obtained from the trained segmentation module (104) using a visual feature extraction module (108);
process the clinical data (118) to generate clinical features using a clinical feature extraction module (120);
combine the extracted visual features from the visual feature extraction module (108) and the generated clinical features from the clinical feature extraction module (120) to form a combined feature vector using a feature fusion module (114);
input the combined feature vector from the feature fusion module (114) into a trained regression module (116); and
output a predicted macular thickness value (122) based on processing of the combined feature vector by the trained regression module (116).
2. The system (100) to predict macular thickness as claimed in claim 1, wherein the trained segmentation module (104) comprises a U-Net convolutional neural network (112) configured to detect and isolate the macular region in the retinal fundus image (102).
3. The system (100) to predict macular thickness as claimed in claim 1, wherein the trained segmentation module (104) is trained on a dataset of 70 manually annotated fundus images for building 70 masks where the macular region has been precisely delineated using a Computer Vision Annotation Tool (CVAT) (406).
4. The system (100) to predict macular thickness as claimed in claim 1, wherein the clinical data (118) comprises diabetes duration and visual acuity.
5. The system (100) to predict macular thickness as claimed in claim 1, wherein visual features are extracted from the segmented macular region using a pre-trained convolutional neural network within the visual feature extraction module (108).
6. The system (100) to predict macular thickness as claimed in claim 1, wherein the system (100) further provides a confidence score associated with each macular thickness prediction (122).
7. The system (100) to predict macular thickness as claimed in claim 1, wherein the system (100) is configured to automatically generate a report indicating the predicted macular thickness value (122) and associated risk of retinal diseases.
8. The system (100) to predict macular thickness as claimed in claim 1, wherein processing the clinical data (118) to generate clinical features using the clinical feature extraction module (120) comprises:
normalizing continuous clinical variables to have a mean of 0 and a standard deviation of 1;
converting categorical clinical variables into a numerical format using one-hot encoding.
9. A method (500) for predicting macular thickness from retinal fundus images, the method comprising:
receiving (502), by a processor (124), one or more input samples, wherein the one or more input samples comprise a retinal fundus image (102) and associated clinical data (118) of a patient;
segmenting (504), by the processor (124), a macular region from the retinal fundus image (102) using a trained segmentation module (104) comprising a U-Net CNN model (112);
extracting (506), by the processor (124), visual features from the segmented macular region obtained from the trained segmentation module (104) using a visual feature extraction module (108);
processing (508), by the processor (124), the clinical data (118) to generate clinical features using a clinical feature extraction module (120);
combining (510), by the processor (124), the extracted visual features from the visual feature extraction module (108) and the generated clinical features from the clinical feature extraction module (120) to form a combined feature vector using a feature fusion module (114);
inputting (512), by the processor (124), the combined feature vector from the feature fusion module (114) into a trained regression module (116); and
outputting (514), by the processor (124), a predicted macular thickness value (122) based on processing of the combined feature vector by the trained regression module (116).
10. The method (500) as claimed in claim 9, wherein processing the clinical data (118) to generate clinical features comprises:
normalizing continuous clinical variables and converting categorical clinical variables into a numerical format.