
System And Method For Retinal Fundus Image Semantic Segmentation

Abstract: A computer implemented system 100 for processing a fundus image of a patient is disclosed. The system comprises a graphical user interface 103i comprising interactive elements 103h configured to enable capture and analysis of the fundus image; a reception means 103a adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device; an interactive fundus image rendering means 103b adapted to dynamically render the input; a fundus image capture means 103c adapted to capture the fundus image based on the dynamically rendered input; a processing means 103g adapted to train a deep convolutional neural network; process the fundus image to identify an occurrence of each of the candidate objects; and generate a candidate segment mask for the received fundus image based on the processed fundus image.


Patent Information

Application #:
Filing Date: 20 March 2018
Publication Number: 39/2019
Publication Type: INA
Invention Field: BIO-MEDICAL ENGINEERING
Status:
Email: kraji@artelus.com
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2021-08-16
Renewal Date:

Applicants

ARTIFICIAL LEARNING SYSTEMS INDIA PVT LTD
Hansa Complex, 1665/A, Second Floor, 14th Main, 7th Sector, HSR Layout, Bengaluru, Karnataka 560102, India.

Inventors

1. Pradeep Walia
6138 Boundary Road, Downers Grove, Illinois 60516, USA
2. Raja Raja Lakshmi
No.139 2nd Cross, 7th Block, Koramangala, Bangalore 560095, Karnataka, India.
3. Mrinal Haloi
C/O: Kanak Ch. Haloi, HN: 01, Pashim Barpit, Village Bhojkuchi, PO: Haribhanga, District Nalbari, Assam 781378

Specification

Claims:

We claim:
1. A method for processing a fundus image using a computer implemented system 100, said method comprising:
- receiving the fundus image of a patient;
- training a deep convolutional neural network over a training dataset, said training dataset comprising a plurality of training fundus images and a predetermined candidate segment mask associated with each of the training fundus images, wherein the predetermined candidate segment mask associated with a training fundus image is based on a pixel intensity annotation for a plurality of candidate objects present in the training fundus image;
- processing the fundus image to identify an occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network; and
- generating a candidate segment mask for the received fundus image based on the processed fundus image by the trained deep convolutional neural network.
2. The method as claimed in claim 1, wherein said processing the fundus image to identify the occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network comprises:
determining a candidate object category for each pixel in the fundus image based on a pixel intensity; and
achieving semantic segmentation of the fundus image based on the determined candidate object category of the pixels.

3. The method as claimed in claim 1, wherein the candidate object is a pathology indicator, an artefact, a retinal feature or the like.
4. The method as claimed in claim 3, wherein the pathology indicator indicates one or more retinal diseases.
5. The method as claimed in claim 4, wherein the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
6. A computer implemented system 100 for processing a fundus image of a patient, comprising:
at least one processor;
a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image processing application 103, the at least one processor configured to execute the fundus image processing application 103; and
the fundus image processing application 103 comprising:
a graphical user interface 103i comprising a plurality of interactive elements 103h configured to enable capture and processing of the fundus image via a user device 101a, 101b or 101c;

a reception means 103a adapted to receive an input from an image capturing device, wherein the input is the fundus image of the patient;

an interactive fundus image rendering means 103b adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the graphical user interface 103i via the user device 101a, 101b or 101c using the interactive elements 103h;

a processing means 103g adapted to

train a deep convolutional neural network over a training dataset, said training dataset comprising a plurality of training fundus images and a predetermined candidate segment mask associated with each of the training fundus images, wherein the predetermined candidate segment mask associated with a training fundus image is based on a pixel intensity annotation for a plurality of candidate objects present in the training fundus image;
process the fundus image to identify an occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network; and
generate a candidate segment mask for the received fundus image based on the processed fundus image by the trained deep convolutional neural network.
7. The system 100 as claimed in claim 6, wherein said processing means 103g is adapted to:
determine a candidate object category for each pixel in the fundus image based on a pixel intensity; and achieve semantic segmentation of the fundus image based on the determined candidate object category of the pixels.
8. The system 100 as claimed in claim 6, wherein the candidate object is a pathology indicator, an artefact or the like.
9. The system 100 as claimed in claim 8, wherein the pathology indicator indicates one or more retinal diseases.
10. The system 100 as claimed in claim 9, wherein the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
Description:

1. TITLE OF THE INVENTION

SYSTEM AND METHOD FOR RETINAL FUNDUS IMAGE SEMANTIC SEGMENTATION

2. APPLICANT:

a. Name: ARTIFICIAL LEARNING SYSTEMS INDIA PVT LTD

b. Nationality: INDIA

c. Address: 1665/A, 14th Main Rd, Sector 7, HSR Layout, Bengaluru,
Karnataka 560102, India.

Complete specification:

The following specification particularly describes the invention and the manner in which it is to be performed.


Technical field of the invention

[0001] The invention relates to the field of medical decision support. More particularly, the invention relates to identification and localization of diagnostic indicators in a retinal fundus image of a patient to recognize retinal diseases.

Background of the invention

[0002] Vision is an important survival attribute for a human, making the eyes one of the most vital sensory organs. The eye is an organ of sight and has a complex structure with an outer fibrous layer, a middle vascular layer, and a nervous tissue layer. Though most eye diseases may not be fatal, failure to properly diagnose and treat an eye disease may lead to vision loss. Analysis of a fundus image of a patient is a very convenient way of screening and monitoring eye diseases. The fundus image of the patient illustrates several elements such as the optic disc, blood vessels, the macula, etc. The fundus of the eye provides indications of several diseases, in particular, eye diseases like diabetic retinopathy.

[0003] In recent times, computer-aided screening systems have assisted doctors in improving the quality of examination of fundus images for screening of eye diseases. Evidence-based medicine is a common approach in medical practice intended to enhance decision making by highlighting the evidence obtained from well-conducted examination and research. For example, methods to highlight eye abnormalities in fundus images enable easy distinction of several eye diseases. The examination and research involved in evidence-based medicine are typically accomplished with noninvasive methods. Noninvasive methods are a rapidly emerging division of medical science used for medical research and early diagnosis of diseases, specifically eye diseases.

[0004] An artificial neural network is a computational device comprising interconnected artificial neurons or neuron models, akin to a network of neurons in a human brain, which learns to perform tasks by considering examples. A convolutional neural network is a type of feed-forward artificial neural network with numerous applications in the area of pattern recognition and classification.

[0005] Image segmentation is a method of segmenting an image into similar segments for a simplified and more meaningful analysis of the image. In general, image segmentation is used to locate objects and boundaries in the image by assigning a label to each pixel in the image based on certain characteristics such as colors, textures, etc. Semantic segmentation of the image is a method of processing the image to assign a label to semantically matching objects. Semantic segmentation using artificial neural networks is based on supervised learning techniques using a dataset and associated ground-truth data. In the case of fundus images, segmentation may refer to the marking of each pixel with a label corresponding to a certain object such as a vessel, an optic disc, an exudate, etc. Semantic segmentation of fundus images using artificial neural networks assists in segmenting diagnostic information such as retinal features and pathology indicators necessary for the analysis of retinal diseases, and provides significant insight during the diagnosis of retinal diseases by medical practitioners. There is therefore a need for a simple, comprehensive and cost-effective solution that makes effective use of artificial neural networks for automated identification and semantic segmentation of diagnostic information related to retinal diseases using fundus images.

Summary of invention

[0006] This summary is provided to introduce a selection of concepts in a simplified form that are further disclosed in the detailed description of the invention. This summary is not intended to identify key or essential inventive concepts of the claimed subject matter, nor is it intended for determining the scope of the claimed subject matter.

[0007] The present invention discloses a computer implemented system for processing a fundus image of a patient. The system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image processing application, the at least one processor configured to execute the fundus image processing application; and the fundus image processing application comprising: a graphical user interface comprising a plurality of interactive elements configured to enable capture and processing of the fundus image via a user device; a reception means adapted to receive an input from an image capturing device, wherein the input is the fundus image of the patient; an interactive fundus image rendering means adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the graphical user interface via the user device using the interactive elements; a processing means adapted to train a deep convolutional neural network over a training dataset, the training dataset comprising a plurality of training fundus images and a predetermined candidate segment mask associated with each of the training fundus images, wherein the predetermined candidate segment mask associated with a training fundus image is based on a pixel intensity annotation for a plurality of candidate objects present in the training fundus image; process the fundus image to identify an occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network; and generate a candidate segment mask for the received fundus image based on the processed fundus image by the trained deep convolutional neural network.

[0008] The user device is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc. Here, the image capturing device refers to a camera for photographing the retinal fundus of the patient. The candidate object is a pathology indicator, an artefact, a retinal feature or the like. The pathology indicator indicates a retinal disease. The pathology indicator is, for example, a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. The candidate segment mask denotes a location and structure of each of the candidate objects with each type of candidate object highlighted with a predetermined pixel intensity.

Brief description of the drawings

[0009] The present invention is described with reference to the accompanying figures. The accompanying figures, which are incorporated herein, are given by way of illustration only and form part of the specification together with the description to explain how to make and use the invention, in which:

[0010] Figure 1 illustrates a block diagram of a computer implemented system for processing a fundus image of a patient in accordance with the invention;

[0011] Figure 2 exemplarily illustrates a deep convolutional neural network of the system to process the fundus image of the patient;

[0012] Figure 3 exemplarily illustrates the architecture of a computer system employed by a fundus image processing application; and

[0013] Figure 4 illustrates a flowchart for processing the fundus image of the patient in accordance with the invention.

Detailed description of the invention

[0014] Figure 1 illustrates a block diagram of a computer implemented system 100 for processing a fundus image of a patient in accordance with the invention. The system 100 comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image processing application 103, the at least one processor configured to execute the fundus image processing application 103; and the fundus image processing application 103 comprising: a graphical user interface (GUI) 103i comprising a plurality of interactive elements 103h configured to enable capture and analysis of the fundus image via a user device 101a, 101b or 101c; a reception means 103a adapted to receive an input from an image capturing device, wherein the input is the fundus image of the patient displayed in a live mode; an interactive fundus image rendering means 103b adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the GUI 103i via the user device 101a, 101b or 101c using the interactive elements 103h; a fundus image capture means 103c adapted to capture the fundus image based on the dynamically rendered input; a processing means 103g adapted to train a deep convolutional neural network over a training dataset, the training dataset comprising a plurality of training fundus images and a predetermined candidate segment mask associated with each of the training fundus images, wherein the predetermined candidate segment mask associated with a training fundus image is based on a pixel intensity annotation for a plurality of candidate objects present in the training fundus image; process the fundus image to identify an occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network; and generate a candidate segment mask for the received fundus image based on the processed fundus image by the trained deep convolutional neural network.

[0015] As used herein, the term “patient” refers to an individual receiving or registered to receive medical treatment. The patient is, for example, an individual undergoing a regular health checkup, an individual with a condition of diabetes mellitus, etc. As used herein, the term “fundus image” refers to a three-dimensional array of digital image data, however, this is merely illustrative and not limiting of the scope of the invention.

[0016] The fundus image processing application 103 processes the fundus image to identify the occurrence of each of the candidate objects throughout the fundus image using the trained deep convolutional neural network. The trained deep convolutional neural network determines a candidate object category for each pixel in the fundus image based on a pixel intensity, and achieves semantic segmentation of the fundus image based on the determined candidate object category of the pixels.

[0017] The user device 101a, 101b or 101c is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc. Here, the image capturing device refers to a camera for photographing the fundus of the patient. As used herein, the term “candidate object” refers to a pathology indicator, an artefact, a retinal feature or the like. The pathology indicator indicates a retinal disease. The pathology indicator is, for example, a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. As used herein, the term “candidate segment mask” denotes the structure and location of the candidate objects in the fundus image with each type of candidate object highlighted with a predetermined pixel intensity. As used herein, the term “predetermined candidate segment mask” refers to the candidate segment mask for the training fundus image as annotated by an annotator. The predetermined candidate segment mask is stored as a label for the corresponding training fundus image in a ground-truth file. The candidate object category represents a type or class of the candidate object.

[0018] The predetermined pixel intensity indicates a distinct color to distinguish a candidate object category from other candidate object categories. The color corresponding to each type of candidate object category is predetermined. The different candidate object categories to be identified are also predetermined by, for example, a medical practitioner. The candidate object category denotes a type of the candidate object, for example, a pathology indicator such as a lesion, an artifact such as a dust particle, a retinal feature such as an optic disc, etc. In an example, the medical practitioner decides the candidate object categories to be highlighted based on the essential candidate objects required for easy diagnosis of retinal diseases. In an embodiment, the background of the fundus image is one of the candidate object categories and is allotted a predetermined pixel intensity. In another embodiment, unrecognizable lesions are categorized into a candidate object category and allotted a predetermined pixel intensity. The semantic segmentation of the fundus image is a pixel classification process of associating one of the predetermined candidate object categories with each pixel in the fundus image.
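For illustration only, the following Python sketch shows one way such a predetermined category-to-colour assignment could be represented and applied to a per-pixel label map; the categories, colours and the function name colourize_mask are assumptions introduced here, not those mandated by the specification.

```python
import numpy as np

# Illustrative only: hypothetical colour assignments for a few candidate
# object categories. The actual categories and colours are predetermined,
# for example by a medical practitioner, as described above.
CATEGORY_COLOURS = {
    0: (0, 0, 0),        # background
    1: (255, 0, 0),      # pathology indicator, e.g. a lesion
    2: (0, 255, 0),      # retinal feature, e.g. the optic disc
    3: (0, 0, 255),      # artefact, e.g. a dust particle
}

def colourize_mask(label_map: np.ndarray) -> np.ndarray:
    """Turn an H x W array of category indices into an H x W x 3 colour mask."""
    mask = np.zeros((*label_map.shape, 3), dtype=np.uint8)
    for category, colour in CATEGORY_COLOURS.items():
        mask[label_map == category] = colour
    return mask
```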

[0019] The computer implemented system 100 comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor and the fundus image processing application 103. The non-transitory computer readable storage medium is configured to store the fundus image processing application 103. The at least one processor is configured to execute the fundus image processing application 103. The fundus image processing application 103 is executable by at least one processor configured to enable capture and analysis of the fundus image of the patient via the user device 101a, 101b or 101c. The user device 101a, 101b or 101c is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc.

[0020] In an embodiment, the fundus image processing application 103 is a web application implemented on a web based platform, for example, a website hosted on a server or a setup of servers. For example, the fundus image processing application 103 is implemented on a web based platform, for example, a fundus image processing platform 104 as illustrated in Figure 1.

[0021] The fundus image processing platform 104 hosts the fundus image processing application 103. The fundus image processing application 103 is accessible by one or more user devices 101a, 101b or 101c. The user device 101a, 101b or 101c is, for example, a computer, a mobile phone, a laptop, etc. In an example, the user device 101a, 101b or 101c is accessible over a network 102 such as the internet, a mobile telecommunication network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., etc. The fundus image processing application 103 is accessible through browsers such as Internet Explorer® (IE) 8, IE 9, IE 10, IE 11 and IE 12 of Microsoft Corporation, Safari® of Apple Inc., Mozilla® Firefox® of Mozilla Foundation, Chrome of Google, Inc., etc., and is compatible with technologies such as hypertext markup language 5 (HTML5), etc.

[0022] In another embodiment, the fundus image processing application 103 is configured as a software application, for example, a mobile application downloadable by a user on the user device 101a, 101b or 101c, for example, a tablet computing device, a mobile phone, etc. As used herein, the term “user” is an individual who operates the fundus image processing application 103 to capture the fundus images of the patient and generate a report resulting from the processing of the captured fundus images.

[0023] The fundus image processing application 103 is accessible by the user device 101a, 101b or 101c via the GUI 103i provided by the fundus image processing application 103. In an example, the fundus image processing application 103 is accessible over the network 102. The network 102 is, for example, the internet, an intranet, a wireless network, a wired network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., a universal serial bus (USB) communication network, a ZigBee® network of ZigBee Alliance Corporation, a general packet radio service (GPRS) network, a global system for mobile (GSM) communications network, a code division multiple access (CDMA) network, a third generation (3G) mobile communication network, a fourth generation (4G) mobile communication network, a wide area network, a local area network, an internet connection network, an infrared communication network, etc., or any combination of these networks.

[0024] The fundus image processing application 103 comprises the GUI 103i comprising a plurality of interactive elements 103h configured to enable capture and processing of the fundus image via the user device 101a, 101b or 101c. As used herein, the term “interactive elements 103h” refers to interface components on the GUI 103i configured to perform a combination of processes, for example, retrieval of the input received from the user, such as the fundus images of the patient, processes that enable real-time user interactions, etc. The interactive elements 103h comprise, for example, clickable buttons.

[0025] The fundus image processing application 103 comprises the reception means 103a adapted to receive the input from the image capturing device. The input is the fundus image of the patient. In an embodiment, the input is a plurality of fundus images of the patient. As used herein, the term “image capturing device” refers to a camera for photographing the fundus of the patient. In an example, the image capturing device is a Zeiss FF 450+ fundus camera comprising a charge-coupled device (CCD) photographic unit. In another example, the image capturing device is a smart phone with a camera capable of capturing the fundus image of the patient.

[0026] The reception means 103a receives information associated with the patient from the user device, for example, 101a, 101b or 101c via the GUI 103i. The information associated with the patient is, for example, personal details about the patient, medical condition of the patient, etc.

[0027] The image capturing device is in communication with the fundus image processing application 103 via the network 102, for example, the internet, an intranet, a wireless network, a wired network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., a universal serial bus (USB) communication network, a ZigBee® network of ZigBee Alliance Corporation, a general packet radio service (GPRS) network, a global system for mobile (GSM) communications network, a code division multiple access (CDMA) network, a third generation (3G) mobile communication network, a fourth generation (4G) mobile communication network, a wide area network, a local area network, an internet connection network, an infrared communication network, etc., or any combination of these networks.

[0028] The fundus image processing application 103 accesses the image capturing device to receive the input of the patient. The fundus image processing application 103 comprises a transmission means to request the image capturing device for a permission to control the activities of the image capturing device to capture the input associated with the patient. The image capturing device responds to the request received from the transmission means. The reception means 103a receives the response of the image capturing device.

[0029] In other words, the image capturing device permits the user of the fundus image processing application 103 to control the activities of the image capturing device via the interactive elements 103h of the GUI 103i. As used herein, the term “activities” refers to viewing the live mode of the fundus of the patient on a screen of the GUI 103i, focusing the field of view by zooming in or zooming out to observe the fundus of the patient, and capturing the fundus image of the patient from the displayed live mode of the fundus of the patient.

[0030] In an embodiment, the fundus image processing application 103 adaptably controls the activities specific to the image capturing device based on parameters of the image capturing device. The fundus image processing application 103 is customizable to suit the parameters of the image capturing device such as the version, the manufacturer, the model details, etc. In other words, the fundus image processing application 103 is customizable and can be suitably adapted to capture the fundus images of the patient for different manufacturers of the image capturing device.

[0031] Once the fundus image processing application 103 has the permission to control the activities of the image capturing device, the user of the fundus image processing application 103 can view the input of the image capturing device on the screen of the GUI 103i. The interactive fundus image rendering means 103b dynamically renders the input on the GUI 103i. The dynamically rendered input is configurably accessible on the GUI 103i via the user device 101a, 101b or 101c using the interactive elements 103h. The field of view of the image capturing device is displayed on a screen of the GUI 103i via the user device 101a, 101b or 101c. The user can focus the field of view by zooming in or zooming out to observe the fundus of the patient using the interactive elements 103h via a user input device such as a mouse, a trackball, a joystick, etc. The user captures the fundus image of the patient from the displayed live mode of the fundus of the patient using the interactive elements 103h of the GUI 103i via the user device 101a, 101b or 101c. As used herein, the term “live mode” refers to the seamless display of the fundus of the patient in real time via the GUI 103i. In an embodiment, the input is an already existing fundus image of the patient stored in the database 104a.

[0032] The fundus image processing application 103 comprises the processing means 103g adapted to train the deep convolutional neural network over the training dataset with the associated predetermined candidate segment mask for each of the training fundus image in the training dataset, wherein the predetermined candidate segment mask is based on the pixel intensity annotation for the multiple candidate objects; process the fundus image to identify an occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network; and generate a candidate segment mask for the received fundus image based on the processed fundus image by the trained deep convolutional neural network.

[0033] As used herein, the term “deep convolutional neural network” refers to a class of deep artificial neural networks that can be applied to analyzing visual imagery. The deep convolutional neural network corresponds to a specific model of an artificial neural network. The deep convolutional neural network generates the candidate segment mask for the fundus image of the patient. The candidate segment mask refers to a map of the fundus image indicating the location and structure of each of the candidate objects in the fundus image of the patient. The candidate objects define the pathology indicators, artefacts, retinal features in the fundus image of the patient. The pathology indicators indicate one or more retinal diseases such as diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment, etc.

[0034] The deep convolutional neural network is trained over a training dataset of fundus images to accomplish the function associated with the deep convolutional neural network. Here, the term “function” of the deep convolutional neural network refers to the processing of the fundus image to identify the occurrence of each of the candidate objects throughout the fundus image and generate the candidate segment mask for the received fundus image.

[0035] In an embodiment, the fundus image processing application 103 receives the training dataset from one or more external devices. The training dataset comprises a plurality of fundus images. The fundus images in the training dataset are referred to as training fundus images. The external device is, for example, the image capturing device such as a camera incorporated into a mobile device, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc. The fundus image processing application 103 stores the training dataset in a database 104a of the system 100. The system 100 comprises the database 104a in communication with the fundus image processing application 103. The database 104a is also configured to store patient profile information, patient medical history, the training fundus images of patients, reports of the patients, etc.

[0036] As used herein, the term “training fundus image” refers to a fundus image, i.e. a three-dimensional array of digital image data, used for the purpose of training the deep convolutional neural network. In this invention, the term “training” refers to a process of developing the deep convolutional neural network for the generation of the candidate segment mask of the received fundus image based on the training dataset and the ground-truth file.

[0037] The predetermined candidate segment mask for a training fundus image is based on a pixel intensity annotation for the multiple candidate objects present in the training fundus image. The term “pixel intensity annotation” refers to the annotation of the training fundus images, for example, by the annotator, by assigning a predetermined pixel intensity or color for each candidate object category. Each candidate object category is assigned a predetermined pixel intensity which the annotator uses to specify the location and structure of one or more candidate objects belonging to the candidate object category. In an embodiment, an appropriate image processing method is used by the fundus image processing application 103 to compute the predetermined candidate segment mask for each of the training fundus images.

[0038] The ground-truth file comprises the label and a training fundus image identifier for each of the training fundus images. The label defines the predetermined candidate segment mask for the training fundus image, which is an indication of one or more retinal diseases and a corresponding severity of the one or more retinal diseases in the training fundus image. The training fundus image identifier of the training fundus image is, for example, a name or an identity assigned to the training fundus image.

[0039] Manual annotation of the training dataset:

[0040] In an embodiment, the annotator annotates each of the training fundus images in the training dataset using the GUI 103i via the user device 101a, 101b or 101c. As used herein, the term “annotator” refers to a user of the fundus image processing application 103 who is usually a trained/certified specialist in accurately annotating a fundus image to identify and localize the candidate objects in the fundus image. For example, the annotator is trained to recognize pathology indicators related to retinal diseases such as diabetic retinopathy along with artifacts and retinal features. The terms “annotator” and “user” are used interchangeably herein.

[0041] The annotator annotates the training fundus images using the GUI 103i. The annotator creates the label for each of the training fundus images in the training dataset based on the annotation. The label of a training fundus image comprises the predetermined candidate segment mask for the training fundus image. The predetermined candidate segment mask is the candidate segment mask for the training fundus image as annotated by the annotator. The “candidate segment mask” refers to a map of the training fundus image showing only the segments of the training fundus image having potentially useful information while masking other segments that are potentially less useful in the training fundus image. The segments with potentially useful information are the highlighted regions depicting the structure of the one or more candidate objects, with each candidate object category highlighted with a predetermined pixel intensity.

[0042] The fundus image processing application 103 comprises suitable interactive elements 103h for the annotator to select at least one training fundus image from the training dataset for annotation. The annotator selects at least one training fundus image for annotation using the GUI 103i via the user device 101a, 101b or 101c using the interactive elements 103h. The fundus image processing application 103 displays the at least one selected training fundus image on the GUI 103i. The GUI 103i allows the annotator to view the at least one selected training fundus image; and aids in identifying and classifying symptoms related to retinal diseases in the at least one selected training fundus image. The GUI 103i also allows the annotator to add, view, edit and/or delete comments and/or markups corresponding to the candidate objects, measure the candidate objects, for example, measure vessel diameter, tortuosity and branching angle for each of the at least one selected training fundus image via the interactive elements 103h. The GUI 103i permits the annotator to retrieve previously annotated data associated with each of the at least one selected training fundus image via the interactive elements 103h.

[0043] In an embodiment, the interactive elements 103h further enable a plurality of options to help the annotator during the annotation of the training fundus image. The options are, for example, a magnifying lens for zooming in/zooming out the training fundus image; pointers to allow selection of the candidate objects; pencils to permit drawing outlines to mark the candidate objects; commenting fields to enter text referencing the candidate objects; measuring tools to indicate distances, for example, accurate scaling with reference to the optic disc and fovea; display options and screen choices to mark the candidate objects; color options to highlight the candidate objects; etc.

[0044] The annotator performs the following steps on each of the training fundus images to complete the annotation using the interactive elements 103h. The annotator analyzes the training fundus image. The annotator creates the candidate segment mask which is an indication of the one or more retinal diseases present in the training fundus image. The annotator defines a structure for each candidate object in the training fundus images. Each candidate object is identified by the candidate object category it belongs to. Each candidate object category has a pre-defined color associated with it. The annotator highlights the structure of each of the candidate objects with the pre-defined color associated with its candidate object category. That is, the candidate object is identified with the pre-defined color of the candidate object category it belongs to. The annotator finalizes the predetermined candidate segment mask by marking the location and structure of each of the candidate objects on the training fundus image. The predetermined candidate segment mask provides an insight into the one or more retinal diseases present in the training fundus image along with a severity level of each of the one or more retinal diseases.

[0045] The annotator also attaches information, for example, a date on which the training fundus image was captured, whether the training fundus image is that of a right or left eye, the kind of equipment used to capture the training fundus image, the field and angle of acquisition of the training fundus image, whether or not mydriasis was used at the time the training fundus image was acquired, etc.

[0046] In an embodiment, the fundus image processing application 103 provides options to integrate a third party platform or a tool for the annotator to annotate the training fundus images in the training dataset.

[0047] In an embodiment, the annotator annotates the training fundus image to detect the presence of one or more retinal diseases based on the identified pathology indicators. The annotator further updates the label of the fundus image with each type of the retinal disease, the severity of each type of the retinal disease, etc. In another embodiment, the annotator may concentrate only on the identification of a particular retinal disease. In an example, consider that the annotator annotates the training fundus images to create the predetermined candidate segment mask for the retinal disease - diabetic retinopathy (DR). The annotator analyses the candidate objects in the retinal fundus image and accordingly marks the candidate objects to generate the predetermined candidate segment mask.

[0048] The processing means 103g utilizes the training dataset to train the deep convolutional neural network for subsequent generating of the candidate segment mask for the fundus image of the patient. In an embodiment, the processing means 103g uses an image processing method to create the predetermined candidate segment mask for the training fundus images without employing manual annotation.

[0049] Pre-processing of the training fundus images:

[0050] The fundus image processing application 103 further comprises a pre-processing means 103d to pre-process each of the training fundus images. The pre-processing means 103d communicates with the database 104a to access the training dataset. For each training fundus image, the pre-processing means 103d executes the following steps as part of the pre-processing. The pre-processing means 103d separates any text matter present at the border of the training fundus image. The pre-processing means 103d adds a border to the training fundus image with border pixel values of zero. The pre-processing means 103d increases the size of the training fundus image by a predefined number of pixels, for example, 20 pixels in width and height; the additional pixels added are of zero value. The pre-processing means 103d next converts the training fundus image from an RGB color image to a grayscale image. The pre-processing means 103d then binarizes the training fundus image using histogram analysis. The pre-processing means 103d applies repetitive morphological dilation with a rectangular element of size [5, 5] to smooth the binarized training fundus image. The pre-processing means 103d acquires all connected regions, such as the retina and text matter, of the smoothed training fundus image to separate text matter present in the training fundus image from the foreground image. The pre-processing means 103d determines the largest region among the acquired connected regions as the retina; the retina is assumed to be the connected element with the largest region. The pre-processing means 103d calculates a corresponding bounding box for the retina. The pre-processing means 103d thus identifies the retina in the training fundus image.
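As a hedged illustration of these pre-processing steps, the sketch below uses OpenCV (an assumption; the specification does not name a toolkit) to pad the image, binarize it, dilate it with a 5 x 5 rectangular element and take the largest connected region as the retina. Otsu thresholding stands in for the histogram analysis, and the function name locate_retina is hypothetical.

```python
import cv2
import numpy as np

def locate_retina(image_rgb: np.ndarray) -> tuple:
    """Return the bounding box (x, y, w, h) of the largest connected region
    of an 8-bit RGB fundus image, assumed to be the retina."""
    # Add a 20-pixel zero-valued border around the image.
    padded = cv2.copyMakeBorder(image_rgb, 20, 20, 20, 20,
                                cv2.BORDER_CONSTANT, value=0)
    # Convert to grayscale and binarize; Otsu thresholding stands in for
    # the histogram analysis mentioned in the specification.
    gray = cv2.cvtColor(padded, cv2.COLOR_RGB2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Repetitive morphological dilation with a [5, 5] rectangular element.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    smoothed = cv2.dilate(binary, kernel, iterations=3)
    # Connected regions; the largest one is taken to be the retina.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(smoothed)
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # skip background
    x, y, w, h = stats[largest, :4]
    return x, y, w, h
```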

[0051] Once the pre-processing means 103d identifies the retina in the training fundus image, the pre-processing means 103d further blurs the training fundus image using a Gaussian filter. The pre-processing means 103d compares an image width and an image height of the blurred training fundus image based on Equation 1.

Image width > 1.2(image height) ---- Equation 1

[0052] The pre-processing means 103d calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred training fundus image when the image width and the image height of the blurred identified retina satisfy Equation 1. The maximum background pixel value (Max_background pixel value) is given by Equation 2 below. The term ‘max_pixel_left’ in Equation 2 is the maximum pixel value of the left half of the blurred identified retina. The term ‘max_pixel_right’ in Equation 2 is the maximum pixel value of the right half of the blurred training fundus image.

Max_background pixel value = max (max_pixel_left, max_pixel_right) ---- Equation 2

[0053] The pre-processing means 103d further extracts foreground pixel values from the blurred training fundus image by considering pixel values which satisfy the below Equation 3.

All pixel values > max_background_pixel_value + 10 ---- Equation 3

[0054] The pre-processing means 103d calculates a bounding box using the extracted foreground pixel values from the blurred training fundus image. The pre-processing means 103d processes the bounding box to obtain a resized image using cubic interpolation, of shape, for example, [448, 448, 3]. The training fundus image at this stage is referred to as the pre-processed training fundus image. The pre-processing means 103d stores the pre-processed training fundus images in a pre-processed training dataset. The ground-truth file associated with the training dataset holds good for the pre-processed training dataset as well. The pre-processing means 103d stores the pre-processed training dataset in the database 104a.
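The following sketch illustrates one possible reading of Equations 1 to 3 together with the cubic-interpolation resize. Interpreting the "left half" and "right half" maxima as being taken over narrow border strips, where only background is expected, is an assumption, as are the 20-pixel strip width and the function name crop_and_resize.

```python
import cv2
import numpy as np

def crop_and_resize(blurred: np.ndarray) -> np.ndarray:
    # Hedged sketch of Equations 1 to 3. Assumption: the "left half" and
    # "right half" maxima are taken over narrow strips at the left and
    # right borders of the blurred image, where only background is
    # expected; the strip width of 20 pixels is illustrative.
    gray = blurred if blurred.ndim == 2 else cv2.cvtColor(blurred, cv2.COLOR_RGB2GRAY)
    h, w = gray.shape
    if w > 1.2 * h:                                             # Equation 1
        max_pixel_left = int(gray[:, :20].max())
        max_pixel_right = int(gray[:, -20:].max())
        max_background = max(max_pixel_left, max_pixel_right)   # Equation 2
        ys, xs = np.nonzero(gray > max_background + 10)         # Equation 3
        if xs.size:
            blurred = blurred[ys.min(): ys.max() + 1, xs.min(): xs.max() + 1]
    # Resize the cropped bounding box to 448 x 448 using cubic interpolation.
    return cv2.resize(blurred, (448, 448), interpolation=cv2.INTER_CUBIC)
```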

[0055] Segregation of the training dataset:

[0056] The fundus image processing application 103 further comprises a segregation means 103e. The segregation means 103e splits the pre-processed training dataset into two sets: a learning set and a validation set. Hereafter, for simplicity, the pre-processed training fundus images in the learning set are termed learning fundus images and the pre-processed training fundus images in the validation set are termed validation fundus images. The learning set is used to train the deep convolutional neural network. The validation set is typically used to test the accuracy of the deep convolutional neural network.
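A minimal sketch of this segregation step is given below, using scikit-learn's train_test_split for illustration; the 80/20 split ratio, the fixed random seed and the function name segregate_training_dataset are assumptions, not stated in the specification.

```python
from sklearn.model_selection import train_test_split

def segregate_training_dataset(preprocessed_images, segment_masks,
                               validation_fraction=0.2):
    """Split the pre-processed training dataset into a learning set and a
    validation set. Returns (learning_images, validation_images,
    learning_masks, validation_masks)."""
    return train_test_split(preprocessed_images, segment_masks,
                            test_size=validation_fraction, random_state=42)
```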

[0057] Augmentation of the learning fundus images:

[0058] The fundus image processing application 103 further comprises an augmentation means 103f. The augmentation means 103f augments the learning set. The augmentation means 103f performs the following steps for the augmentation of the learning set. The augmentation means 103f randomly shuffles the learning fundus images to divide the learning set into a plurality of batches. Each batch is a collection of a predefined number of learning fundus images. The augmentation means 103f randomly samples each batch of learning fundus images and corresponding predetermined candidate segment masks. The augmentation means 103f processes each batch of the learning fundus images and corresponding predetermined candidate segment masks using affine transformations. The augmentation means 103f translates and rotates the learning fundus images and corresponding predetermined candidate segment masks in the batch randomly, based on a coin flip analogy.
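The sketch below illustrates the described augmentation for a single image/mask pair: the same random affine transform (rotation and translation, each applied on a coin flip) is used for the learning fundus image and its predetermined candidate segment mask. The transform ranges and the function name augment_pair are assumptions.

```python
import random
import cv2
import numpy as np

def augment_pair(image: np.ndarray, mask: np.ndarray):
    """Apply one randomly drawn affine transform to both a learning fundus
    image and its candidate segment mask; ranges are illustrative."""
    h, w = image.shape[:2]
    matrix = np.eye(2, 3, dtype=np.float32)
    if random.random() < 0.5:                       # coin flip: rotate
        angle = random.uniform(-15, 15)
        matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    if random.random() < 0.5:                       # coin flip: translate
        matrix[0, 2] += random.uniform(-20, 20)
        matrix[1, 2] += random.uniform(-20, 20)
    image = cv2.warpAffine(image, matrix, (w, h))
    # Nearest-neighbour interpolation keeps the mask's discrete category
    # labels/colours intact.
    mask = cv2.warpAffine(mask, matrix, (w, h), flags=cv2.INTER_NEAREST)
    return image, mask
```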

[0059] Training and validation of the deep convolutional neural network:

[0060] The processing means 103g receives the batches of augmented learning fundus images from the augmentation means 103f. The processing means 103g trains the deep convolutional neural network using the batches of augmented learning fundus images. The segregation means 103e groups the validation fundus images of the validation set into a plurality of batches. Each batch comprises multiple validation fundus images. The processing means 103g also receives the batches of the validation set from the segregation means 103e. The processing means 103g validates each of the validation fundus images in each batch of the validation set using the deep convolutional neural network. The processing means 103g compares a result of the validation against a corresponding label of the validation fundus image by referring to the ground-truth file. The processing means 103g evaluates a convolutional network performance of the deep convolutional neural network for each batch of the validation set. Here, the convolutional network performance of the deep convolutional neural network refers to the generation of a correct predetermined candidate segment mask for each of the validation fundus images.

[0061] The processing means 103g optimizes the deep convolutional neural network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum. The optimizer iteratively optimizes the parameters of the deep convolutional neural network during multiple iterations using the learning set. Here, each iteration refers to a batch of the learning set. The processing means 103g evaluates the convolutional network performance of the deep convolutional neural network after a predefined number of iterations on the validation set. Here, each iteration refers to a batch of the validation set.

[0062] Thus, the processing means 103g trains the deep convolutional neural network based on the augmented learning set and tests the deep convolutional neural network based on the validation set. Upon completion of training and validation of the deep convolutional neural network based on the convolutional network performance, the processing means 103g is ready to generate the candidate segment mask for the fundus image of the patient using the trained deep convolutional neural network.
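A minimal sketch of this training and validation regime, assuming a Keras model built elsewhere, is shown below; the Nadam optimizer (Adam with Nesterov momentum) matches the description, while the loss function, learning rate, batch size and epoch count are illustrative assumptions.

```python
import tensorflow as tf

def train_and_validate(model, learning_images, learning_masks,
                       validation_images, validation_masks):
    """Iteratively optimize the network on batches of the learning set and
    evaluate its performance on the validation set."""
    model.compile(
        optimizer=tf.keras.optimizers.Nadam(learning_rate=1e-4),
        loss="sparse_categorical_crossentropy",  # per-pixel integer category labels
        metrics=["accuracy"],
    )
    model.fit(
        learning_images, learning_masks,
        validation_data=(validation_images, validation_masks),
        batch_size=8,
        epochs=50,
    )
    return model
```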

[0063] The deep convolutional neural network learns to identify the regions of different candidate objects using the learning set and the ground-truth file. Each training fundus image in the learning set is compared against the corresponding predetermined candidate segment mask, which is a part of the ground-truth file, to learn the identification of different candidate objects and the shape and location of each candidate object.

[0064] Processing of the fundus image of the patient:

[0065] The pre-processing means 103d receives the fundus image of the patient from the fundus image capture means 103c. The pre-processing means 103d processes the fundus image of the patient in the same manner as the training fundus image. The pre-processing means 103d transmits the processed fundus image to the processing means 103g. The processing means 103g processes the received fundus image from the pre-processing means 103d using the deep convolutional neural network. The processing means 103g determines the candidate object category of each pixel in the fundus image based on the pixel intensity value of the pixel.

[0066] The processing means 103g provides a probability value for each candidate object category for each pixel of the fundus image. The probability value is an indication of the confidence denoting the candidate object category to which each pixel of the fundus image belongs. The output of the processing means 103g in turn indicates the candidate segment mask associated with the fundus image.
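In other words, for every pixel the network yields one probability per candidate object category, and selecting the most probable category per pixel gives the candidate segment mask; a minimal sketch follows (the function name probabilities_to_mask is hypothetical).

```python
import numpy as np

def probabilities_to_mask(probabilities: np.ndarray) -> np.ndarray:
    """Given an H x W x num_categories array of per-pixel probabilities,
    take the most probable category per pixel as the candidate segment mask."""
    return np.argmax(probabilities, axis=-1).astype(np.uint8)
```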

[0067] Figure 2 exemplarily illustrates the deep convolutional neural network of the system 100 to process the fundus image of the patient. Figure 2 shows the deep convolutional neural network of the system 100 inside a dashed box. The deep convolutional neural network is a part of the processing means 103g, as shown in Figure 2. The processing means 103g employs the deep convolutional neural network shown inside the dashed box in Figure 2 to process the fundus image of the patient. The structure of the deep convolutional neural network and its functioning to process the fundus image of the patient are explained below.

[0068] The processed fundus image of the patient from the pre-processing means 103d is an input to a first encoder stack (ES1) of the deep convolutional neural network. The processed fundus image is, for example, represented as a matrix of width 448 pixels and height 448 pixels with 3 channels. That is, the processed fundus image is a representative array of pixel values of size 448 x 448 x n1, with n1 equal to 3.

[0069] The deep convolutional neural network comprises five encoder stacks – the ES1, a second encoder stack (ES2), a third encoder stack (ES3), a fourth encoder stack (ES4) and a fifth encoder stack (ES5).

[0070] The ES1 comprises the following sublayers – two convolutional layers followed by a subsampling layer, in that order. The processed fundus image of size 448 x 448 x n1 is an input to the ES1. Here, it is assumed that n1 is equal to 3. The two convolutional layers of ES1 process the input and the subsampling layer of ES1 reduces the size of the processed input to 224 x 224 x n2. The output of the ES1 is thus an intermediate processed fundus image of size 224 x 224 x n2. The output of ES1 is fed as an input to the second encoder stack ES2.

[0071] The second encoder stack (ES2) comprises the following sublayers - three convolutional layers followed by a subsampling layer, in that order. The intermediate processed fundus image of size 224 x 224 x n2 is an input to ES2. The three convolutional layers of ES2 further process the input and the subsampling layer of ES2 reduces the size of the further processed input to 112 x 112 x n3. The output of the ES2 is an intermediate processed fundus image of size 112 x 112 x n3. The output of ES2 is fed as an input to the third encoder stack ES3.

[0072] The third encoder stack (ES3) comprises the following sublayers - three convolutional layers followed by a subsampling layer, in that order. The intermediate processed fundus image of size 112 x 112 x n3 is the input to ES3. The three convolutional layers of ES3 further process the input and the subsampling layer of ES3 reduces the size of the further processed input to 56 x 56 x n4. The ES3 produces an output – an intermediate processed fundus image of size 56 x 56 x n4. The output of ES3 is fed as an input to a fourth encoder stack ES4.

[0073] The fourth encoder stack (ES4) comprises the following sublayers - three convolutional layers followed by a subsampling layer, in that order. The intermediate processed fundus image of size 56 x 56 x n4 is the input to ES4. The three convolutional layers of ES4 further process the input and the subsampling layer of ES4 reduces the size of the further processed input to 28 x 28 x n5. The ES4 produces an output – an intermediate processed fundus image of size 28 x 28 x n5. The output of ES4 is fed as an input to a fifth encoder stack ES5.

[0074] The fifth encoder stack (ES5) comprises the following sublayers - three convolutional layers. The intermediate processed fundus image of size 28 x 28 x n5 is the input to ES5. The three convolutional layers of ES5 further process the input to produce an output. The output is a processed fundus image of size 28 x 28 x n6.

[0075] A first decoder stack (DS1) receives an input from the ES5. The intermediate processed fundus image of size 28 x 28 x n6 is the input to DS1. The DS1 up-samples the processed fundus image to an image of size 56 x 56 x n6.

[0076] The output of DS1, the processed fundus image of size 56 x 56 x n6, is an input to a first downsample-concat layer (DC1). The DC1 also receives an output of a last convolutional layer of ES1 – a processed fundus image of size 448 x 448 x n2, an output of a last convolutional layer of ES2 – a processed fundus image of size 224 x 224 x n3, an output of a last convolutional layer of ES3 – a processed fundus image of size 112 x 112 x n4 and an output of a last convolutional layer of ES4 – a processed fundus image of size 56 x 56 x n5.

[0077] The DC1 comprises three subsampling layers and a concatenation layer. The output of the last convolutional layer of ES1 is passed through three subsampling layers of DC1 to get a subsampled fundus image of size 56 x 56 x n2. The output of the last convolutional layer of ES2 is passed through two subsampling layers of DC1 to get a subsampled image of size 56 x 56 x n3. The output of the last convolutional layer of ES3 is passed through a subsampling layer of DC1 to get a subsampled image of size 56 x 56 x n4.

[0078] The concatenation layer of DC1 receives the outputs of the three subsampling layers of DC1. The concatenation layer of DC1 also receives the output of ES4, that is, 56 x 56 x n5 and the output of DS1, 56 x 56 x n6. The concatenation layer of DC1 concatenates these five data to produce a concatenated result, which is an image of size 56 x 56 x (n2+n3+n4+n5+n6).

[0079] A first convolutional stack (CS1) receives the output of DC1. The CS1 comprises two convolutional layers. The CS1 processes the fundus image of size 56 x 56 x (n2+n3+n4+n5+n6). The output of CS1 is a processed fundus image of size 56 x 56 x n7. A second decoder stack (DS2) receives the output of the CS1. That is, the processed fundus image of size 56 x 56 x n7 from the CS1 is an input to DS2. The DS2 further up-samples the input to a fundus image of size 112 x 112 x n7.

[0080] The output of DS2, the fundus image of size 112 x 112 x n7, is an input to a second downsample-concat layer (DC2). The DC2 also receives the output of the last convolutional layer of ES1 – a processed fundus image of size 448 x 448 x n2, the output of the last convolutional layer of ES2 – a processed fundus image of size 224 x 224 x n3 and the output of the last convolutional layer of ES3 – a processed fundus image of size 112 x 112 x n4.

[0081] The DC2 comprises two subsampling layers and a concatenation layer. The output of the last convolutional layer of ES1 is passed through the two subsampling layers of DC2 to get a subsampled image of size 112 x 112 x n2. The output of the last convolutional layer of ES2 is passed through a subsampling layer of DC2 to get a subsampled image of size 112 x 112 x n3.

[0082] The concatenation layer of DC2 receives the outputs of the two subsampling layers of DC2. The concatenation layer of DC2 also receives the output of ES3, that is, a fundus image of size 112 x 112 x n4 and the output of DS2, a fundus image of size 112 x 112 x n7. The concatenation layer of DC2 concatenates these four data to produce a concatenated result, which is a fundus image of size 112 x 112 x (n2 + n3+ n4 + n7).

[0083] A second convolutional stack (CS2) receives the output of DC2 – a fundus image of size 112 x 112 x (n2 + n3 + n4 + n7). The CS2 comprises two convolutional layers. The CS2 processes the fundus image of size 112 x 112 x (n2 + n3 + n4 + n7). The output of CS2 is a processed fundus image of size 112 x 112 x n8. A third decoder stack (DS3) receives the output of CS2. That is, the processed fundus image of size 112 x 112 x n8 is the input to the DS3. The DS3 further up-samples the fundus image to a size of 224 x 224 x n8.

[0084] The output of DS3, 224 x 224 x n8, is an input to a third downsample-concat layer (DC3). The DC3 also receives the output of the last convolutional layer of ES1 - processed fundus image of size 448 x 448 x n2 and the output of the last convolutional layer of ES2 - processed fundus image of size 224 x 224 x n3.

[0085] The DC3 comprises a subsampling layer and a concatenation layer. The output of the last convolutional layer of ES1 is passed through the subsampling layer of DC3 to get a subsampled image of size 224 x 224 x n2. The concatenation layer of DC3 receives the output of the subsampling layer of DC3. The concatenation layer of DC3 also receives the output of ES2, that is, a processed fundus image of size 224 x 224 x n3, and the output of DS3, a fundus image of size 224 x 224 x n8. The concatenation layer of DC3 concatenates these three inputs to produce a concatenated result, which is a fundus image of size 224 x 224 x (n2 + n3 + n8).
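
DC1, DC2 and DC3 all follow the same pattern: every encoder skip output is subsampled just enough times to reach the current decoder resolution, and the results are concatenated with the decoder output along the channel axis. The helper below is a hedged generalisation of that pattern, not part of the patent text; the function name and the choice of average pooling are assumptions.

import torch
import torch.nn.functional as F

def downsample_concat(decoder_feat, skip_feats):
    """Pool each skip feature map down to the decoder's spatial size, then concatenate."""
    target = decoder_feat.shape[-1]                 # e.g. 56 for DC1, 112 for DC2, 224 for DC3
    pooled = []
    for feat in skip_feats:
        while feat.shape[-1] > target:              # halve until the spatial sizes match
            feat = F.avg_pool2d(feat, kernel_size=2, stride=2)
        pooled.append(feat)
    return torch.cat(pooled + [decoder_feat], dim=1)

# Example corresponding to DC3: skips from ES1 (448 x 448) and ES2 (224 x 224),
# decoder output from DS3 at 224 x 224; channel depths are placeholders.
dc3_out = downsample_concat(torch.randn(1, 8, 224, 224),
                            [torch.randn(1, 4, 448, 448), torch.randn(1, 6, 224, 224)])
print(dc3_out.shape)   # torch.Size([1, 18, 224, 224])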

[0086] A third convolutional stack (CS3) receives the output of DC3 – a fundus image of size 224 x 224 x (n2 + n3 + n8). The CS3 comprises two convolutional layers. The CS3 processes the image and transmits the output to the fourth decoder stack (DS4). The output of CS3 is a processed fundus image of size 224 x 224 x n9.

[0087] The DS4 receives the output of CS3. That is, the processed fundus image of size 224 x 224 x n9 from the CS3 is the input to DS4. The DS4 comprises two convolutional layers – a first convolutional layer and a second convolutional layer. The second convolutional layer comprises a number of filters equal to the number of candidate object categories identified by the system 100. The candidate object category may also be referred to as a class of the deep convolutional neural network.

[0088] The DS4 further up-samples the fundus image obtained from the second convolutional layer of DS4 to produce an output, an upsampled processed fundus image of size 448 x 448 x n10. The output of DS4 is the final output of the deep convolutional neural network, which gives a probability score for each candidate object category. The probability score for each candidate object category can take any value in the range [0, 1]. The output of the deep convolutional neural network is the candidate segment mask for the fundus image depicting the location of the candidate objects in the fundus image.
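
One common way to realise this final step, shown below as a hedged sketch, is to apply a per-pixel softmax over the category axis of the DS4 output and take the arg-max as the candidate object category of each pixel. The softmax/arg-max choice is an assumption; the description above only requires a probability score in [0, 1] per category.

import torch

num_classes = 17                                  # candidate object categories (see the example below)
logits = torch.randn(1, num_classes, 448, 448)    # DS4 output, one channel per category
probs = torch.softmax(logits, dim=1)              # per-pixel probability scores in [0, 1]
candidate_segment_mask = probs.argmax(dim=1)      # (1, 448, 448) map of category indices
print(candidate_segment_mask.shape)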

[0089] In an example, the second convolutional layer of DS4 comprises 17 filters to indicate the 17 classes of the deep convolutional neural network. That is, the system 100 identifies 17 candidate object categories. The 17 candidate object categories are 0: 'background'; 1: 'Microaneurysm'; 2: 'Haemorrhage'; 3: 'Hard exudate'; 4: 'Soft exudate'; 5: 'Intraretinal microvascular abnormalities'; 6: 'New vessels on disc'; 7: 'Vitreal/Pre-retinal haemorrhage'; 8: 'Venous beading'; 9: 'Venous loop'; 10: 'Laser'; 11: 'Artefact'; 12: 'Fovea'; 13: 'Macula'; 14: 'Optic disc'; 15: 'Drusen'; and 16: 'Unrecognizable lesion'.
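
For reference, the same 17 categories can be kept as a simple lookup table mapping the network's class index to its name; the mapping below is taken directly from the list in paragraph [0089].

CANDIDATE_OBJECT_CATEGORIES = {
    0: "background", 1: "Microaneurysm", 2: "Haemorrhage", 3: "Hard exudate",
    4: "Soft exudate", 5: "Intraretinal microvascular abnormalities",
    6: "New vessels on disc", 7: "Vitreal/Pre-retinal haemorrhage",
    8: "Venous beading", 9: "Venous loop", 10: "Laser", 11: "Artefact",
    12: "Fovea", 13: "Macula", 14: "Optic disc", 15: "Drusen",
    16: "Unrecognizable lesion",
}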

[0090] The processing means 103g of the fundus image processing application 103 further overlays the candidate segment mask on the fundus image and displays the result on the GUI 103i. This makes further analysis of the fundus image easier for the medical practitioner. In an example, suitable suggestions with a set of instructions to the user may also be provided via a pop-up box displayed on a screen. The fundus image processing application 103 may also generate a report comprising the overlaid candidate segment mask and the fundus image, the symptoms of the retinal disease and the severity of the retinal disease derived from the candidate segment mask, and communicate the report to the patient via electronic mail. The report may also be stored in the database 104a of the system 100.
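
A minimal sketch of the overlay step is given below using NumPy; the colour palette, the blending factor and the decision to leave background pixels untouched are illustrative assumptions rather than part of the description above.

import numpy as np

def overlay_mask(fundus_rgb, mask, palette, alpha=0.4):
    """Blend a colour-coded candidate segment mask over an RGB fundus image."""
    colour_mask = palette[mask]                       # (H, W, 3): one colour per category
    blended = fundus_rgb.astype(np.float32)
    lesion = mask > 0                                 # leave background pixels unchanged
    blended[lesion] = (1 - alpha) * blended[lesion] + alpha * colour_mask[lesion]
    return blended.astype(np.uint8)

palette = np.random.randint(0, 256, size=(17, 3), dtype=np.uint8)  # one colour per category
fundus = np.random.randint(0, 256, size=(448, 448, 3), dtype=np.uint8)
mask = np.random.randint(0, 17, size=(448, 448))
overlay = overlay_mask(fundus, mask, palette)        # image to be shown on the GUI 103i
print(overlay.shape)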

[0091] In another embodiment, the system 100 detects the presence of several diseases that are not limited to retinal diseases, for example, diabetes, stroke, hypertension, cardiovascular diseases, etc., based on changes in the retinal features.

[0092] In an example, the fundus image processing application 103 provides suitable interactive elements 103h such as a drop down menu, a button, etc., to select a set of parameters specific to the image capturing device, for example, a version, a manufacturer detail, etc., while capturing the fundus image of the patient. The fundus image processing application 103 analyses the fundus image based on the selected set of parameters specific to the image capturing device. When the user desires a generic analysis of the fundus image (without consideration of the set of parameters specific to the image capturing device), an appropriate interactive element 103h is provided by the fundus image processing application 103. In another example, the user can upload an existing fundus image of the patient for processing by the fundus image processing application 103. When the information regarding the set of parameters of the image capturing device is uploaded by the user, the system 100 provides a specific analysis of the fundus image corresponding to the set of parameters of the image capturing device. If no information regarding the set of parameters of the image capturing device is uploaded by the user, then the system 100 provides a generic analysis of the fundus image. This feature enables customization of the results for various manufacturers of the image capturing device.
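
The choice between device-specific and generic analysis can be pictured as a simple dispatch on whether the set of parameters is available, as in the hedged sketch below; the parameter names and the gain correction are hypothetical and stand in for whatever manufacturer-specific preprocessing the application applies.

from typing import Optional
import numpy as np

def preprocess_fundus_image(image: np.ndarray, device_params: Optional[dict] = None) -> np.ndarray:
    """Apply device-specific preprocessing when parameters are supplied, else a generic path."""
    image = image.astype(np.float32)
    if device_params is not None:
        # Hypothetical manufacturer-dependent correction keyed on the supplied parameters.
        image = np.clip(image * device_params.get("gain", 1.0), 0, 255)
    return image

generic = preprocess_fundus_image(np.zeros((448, 448, 3), dtype=np.uint8))
specific = preprocess_fundus_image(np.zeros((448, 448, 3), dtype=np.uint8),
                                   {"manufacturer": "ExampleCam", "version": "1.0", "gain": 1.1})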

[0093] In an embodiment, the system 100 uses the processed fundus image to further train the deep convolutional neural network. In another embodiment, the system 100 refers to patient profiles to gather information such as age, gender, race, ethnicity, nationality, hereditary diseases, etc., of existing patients to further train the deep convolutional neural network, improve its performance and provide customized results to the patients.

[0094] In an embodiment, the system 100 may also be used to detect certain conditions such as a laser treated fundus. The system 100 is trained to identify and locate candidate objects related to laser marks in the laser treated fundus. The system 100 may be a part of a web cloud, with the fundus image and the report uploaded to the web cloud. The system 100, involving a computer-based process of supervised learning using the deep convolutional neural network as described, can thus be effectively used to screen fundus images. The system 100 identifies indicators which are further processed to automatically provide indications of relevant retinal diseases, in particular indications of DR.

[0095] The system 100 reduces the time consumed by the manual process in which a trained medical practitioner evaluates digital fundus photographs of the retina. The system 100 effectively improves the quality of analysis of the fundus image by detecting indicators of minute size which are often difficult to detect in the manual process of evaluating the fundus image.

[0096] Figure 3 exemplarily illustrates the architecture of a computer system 300 employed by the fundus image processing application 103. The fundus image processing application 103 of the computer implemented system 100 exemplarily illustrated in Figure 1 employs the architecture of the computer system 300 exemplarily illustrated in Figure 3. The computer system 300 is programmable using a high level computer programming language. The computer system 300 may be implemented using programmed and purposeful hardware.

[0097] The fundus image processing platform 104 hosting the fundus image processing application 103 communicates with user devices, for example, 101a, 101b, 101c, etc., of a user registered with the fundus image processing application 103 via the network 102. The network 102 is, for example, the internet, a local area network, a wide area network, a wired network, a wireless network, a mobile communication network, etc. The computer system 300 comprises, for example, a processor 301, a memory unit 302 for storing programs and data, an input/output (I/O) controller 303, a network interface 304, a data bus 305, a display unit 306, input devices 307, fixed disks 308, removable disks 309, output devices 310, etc.

[0098] As used herein, the term “processor” refers to any one or more central processing unit (CPU) devices, microprocessors, an application specific integrated circuit (ASIC), computers, microcontrollers, digital signal processors, logic, an electronic circuit, a field-programmable gate array (FPGA), etc., or any combination thereof, capable of executing computer programs or a series of commands, instructions, or state transitions. The processor 301 may also be realized as a processor set comprising, for example, a math or graphics co-processor and a general purpose microprocessor. The processor 301 is selected, for example, from the Intel® processors such as the Itanium® microprocessor or the Pentium® processors, Advanced Micro Devices (AMD®) processors such as the Athlon® processor, MicroSPARC® processors, UltraSPARC® processors, hp® processors, International Business Machines (IBM®) processors, the MIPS® reduced instruction set computer (RISC) processor of MIPS Technologies, Inc., RISC based computer processors of ARM Holdings, etc. The computer implemented system 100 disclosed herein is not limited to a computer system 300 employing a processor 301 but may also employ a controller or a microcontroller.

[0099] The memory unit 302 is used for storing data, programs, and applications. The memory unit 302 is, for example, a random access memory (RAM) or any type of dynamic storage device that stores information for execution by the processor 301. The memory unit 302 also stores temporary variables and other intermediate information used during execution of the instructions by the processor 301. The computer system 300 further comprises a read only memory (ROM) or another type of static storage device that stores static information and instructions for the processor 301.

[0100] The I/O controller 303 controls input actions and output actions performed by the fundus image processing application 103. The network interface 304 enables connection of the computer system 300 to the network 102. For example, the fundus image processing platform 104 hosting the fundus image processing application 103 connects to the network 102 via the network interface 304. The network interface 304 comprises, for example, one or more of a universal serial bus (USB) interface, a cable interface, an interface implementing Wi-Fi® of the Wireless Ethernet Compatibility Alliance, Inc., a FireWire® interface of Apple, Inc., an Ethernet interface, a digital subscriber line (DSL) interface, a token ring interface, a peripheral controller interconnect (PCI) interface, a local area network (LAN) interface, a wide area network (WAN) interface, interfaces using serial protocols, interfaces using parallel protocols, Ethernet communication interfaces, asynchronous transfer mode (ATM) interfaces, interfaces based on transmission control protocol (TCP)/internet protocol (IP), radio frequency (RF) technology, etc. The data bus 305 permits communications between the means/modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h and 103i) of the fundus image processing application 103.

[0101] The display unit 306, via the GUI 103i, displays information, display interfaces, interactive elements 103h such as drop down menus, text fields, checkboxes, text boxes, floating windows, hyperlinks, etc., for example, for allowing the user to enter inputs associated with the patient. In an example, the display unit 306 comprises a liquid crystal display, a plasma display, etc. The input devices 307 are used for inputting data into the computer system 300. A user, for example, an operator, registered with the fundus image processing application 103 uses one or more of the input devices 307 of the user devices, for example, 101a, 101b, 101c, etc., to provide inputs to the fundus image processing application 103. For example, a user may enter a patient’s profile information, the patient’s medical history, etc., using the input devices 307. The input devices 307 are, for example, a keyboard such as an alphanumeric keyboard, a touch pad, a joystick, a computer mouse, a light pen, a physical button, a touch sensitive display device, a track ball, etc.

[0102] Computer applications and programs are used for operating the computer system 300. The programs are loaded onto the fixed disks 308 and into the memory unit 302 of the computer system 300 via the removable disks 309. In an embodiment, the computer applications and programs may be loaded directly via the network 102. The output devices 310 output the results of operations performed by the fundus image processing application 103.

[0103] The processor 301 executes an operating system, for example, the Linux® operating system, the Unix® operating system, any version of the Microsoft® Windows® operating system, the Mac OS of Apple Inc., the IBM® OS/2, VxWorks® of Wind River Systems, Palm OS®, the Solaris operating system, the Android operating system, Windows Phone™ operating system developed by Microsoft Corporation, the iOS operating system of Apple Inc., etc.

[0104] The computer system 300 employs the operating system for performing multiple tasks. The operating system is responsible for management and coordination of activities and sharing of resources of the computer system 300. The operating system employed on the computer system 300 recognizes, for example, inputs provided by the user using one of the input devices 307, the output display, files, and directories stored locally on the fixed disks 308. The operating system on the computer system 300 executes different programs using the processor 301. The processor 301 and the operating system together define a computer platform for which application programs in high level programming languages are written.

[0105] The processor 301 retrieves instructions for executing the modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h and 103i) of the fundus image processing application 103 from the memory unit 302. A program counter determines the location of the instructions in the memory unit 302. The program counter stores a number that identifies the current position in the program of each of the modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h and 103i) of the fundus image processing application 103. The instructions fetched by the processor 301 from the memory unit 302 after being processed are decoded. The instructions are stored in an instruction register in the processor 301. After processing and decoding, the processor 301 executes the instructions.

[0106] Figure 4 illustrates a flowchart for processing the fundus image of the patient in accordance with the invention. At step S1, the fundus image processing application 103 receives the fundus image of the patient. The non-transitory computer readable storage medium is configured to store the fundus image processing application 103 and at least one processor is configured to execute the fundus image processing application 103. The fundus image processing application 103 is thus a part of the system 100 comprising the non-transitory computer readable storage medium communicatively coupled to the at least one processor. The fundus image processing application 103 comprises the GUI 103i comprising multiple interactive elements 103h configured to enable capture and processing of the fundus image via the user device 101a, 101b or 101c. The reception means 103a is adapted to receive the input from the image capturing device. The input is the fundus image of the patient displayed in a live mode. In an embodiment, the fundus image processing application 103 is a web application implemented on a web based platform, for example, a website hosted on a server or a setup of servers.

[0107] At step S2, the interactive fundus image rendering means 103b is adapted to dynamically render the input. The dynamically rendered input is configurably accessible on the GUI 103i via the user device 101a, 101b or 101c using the interactive elements 103h. At step S3, the fundus image capture means 103c is adapted to capture the fundus image based on the dynamically rendered input.

[0108] At step S4, the processing means 103g is adapted to train the deep convolutional neural network over the training dataset with the associated predetermined candidate segment mask for each of the training fundus images in the training dataset. The predetermined candidate segment mask is based on the pixel intensity annotation for the candidate objects.
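
A hedged sketch of this training step in PyTorch is given below: per-pixel cross-entropy between the network output and the predetermined candidate segment mask. The optimiser, learning rate, loss choice and the single-convolution stand-in for the full network are assumptions for illustration only.

import torch
import torch.nn as nn

num_classes = 17
model = nn.Conv2d(3, num_classes, kernel_size=1)       # placeholder for the deep network
optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()                      # per-pixel classification loss

images = torch.randn(2, 3, 448, 448)                   # training fundus images
masks = torch.randint(0, num_classes, (2, 448, 448))   # predetermined candidate segment masks

logits = model(images)                                 # (N, num_classes, 448, 448)
loss = criterion(logits, masks)                        # compares every pixel's category
loss.backward()
optimiser.step()
optimiser.zero_grad()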

[0109] At step S5, the processing means 103g is adapted to process the fundus image to identify an occurrence of each of the candidate objects throughout the fundus image by the trained deep convolutional neural network. The candidate object is a pathology indicator, an artefact, a retinal feature or the like. The pathology indicator indicates one or more retinal diseases. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. The retinal feature is an optic disc, a macula, a blood vessel or the like. The pathology indicator is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The processing means 103g determines the candidate object category for each pixel in the fundus image based on the pixel intensity. The processing means 103g achieves semantic segmentation of the fundus image based on determined candidate object category of the pixels. At step S6, the processing means 103g is adapted to generate a candidate segment mask for the received fundus image based on the processed fundus image by the trained deep convolutional neural network.

[0110] The method accurately detects indicators throughout the fundus image which are indicative of disease conditions to properly distinguish indicators of a healthy fundus from indicators which define retinal diseases. This improves efficiency and reduces errors in identifying various medical conditions. The method employing the system 100 acts as an important tool in monitoring a progression of one or more retinal diseases and/or a response to a therapy.

[0111] The present invention described above, although described functionally or sensibly, may be configured to work in a network environment comprising a computer in communication with one or more devices. It will be readily apparent that the various methods, algorithms, and computer programs disclosed herein may be implemented on computer readable media appropriately programmed for general purpose computers and computing devices. As used herein, the term “computer readable media” refers to non-transitory computer readable media that participate in providing data, for example, instructions that may be read by a computer, a processor or a similar device. Non-transitory computer readable media comprise all computer readable media. Non-volatile media comprise, for example, optical discs or magnetic disks and other persistent memory. Volatile media comprise, for example, a dynamic random access memory (DRAM), which typically constitutes a main memory, a processor cache, a register memory, a random access memory (RAM), etc. Transmission media comprise, for example, coaxial cables, copper wire, fiber optic cables, modems, etc., including wires that constitute a system bus coupled to a processor, etc. Common forms of computer readable media comprise, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, a Blu-ray Disc®, a magnetic medium, a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), any optical medium, a flash memory card, a laser disc, RAM, a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), a flash memory, any other cartridge, etc.

[0112] The database 104a is, for example, a structured query language (SQL) database or a not only SQL (NoSQL) database such as the Microsoft® SQL Server®, the Oracle® servers, the MySQL® database of MySQL AB Company, the MongoDB® of 10gen, Inc., the Neo4j graph database, the Cassandra database of the Apache Software Foundation, the HBase™ database of the Apache Software Foundation, etc. In an embodiment, the database 104a can also be a location on a file system. The database 104a is any storage area or medium that can be used for storing data and files. In another embodiment, the database 104a can be remotely accessed by the fundus image processing application 103 via the network 102. In another embodiment, the database 104a is configured as a cloud based database 104a implemented in a cloud computing environment, where computing resources are delivered as a service over the network 102, for example, the internet.

[0113] The foregoing examples have been provided merely for the purpose of explanation and do not limit the present invention disclosed herein. While the invention has been described with reference to various embodiments, it is understood that the words are used for illustration and are not limiting. Those skilled in the art may effect numerous modifications thereto, and changes may be made, without departing from the scope and spirit of the invention in its aspects.

Documents

Application Documents

# Name Date
1 201841010204-FORM-27 [18-09-2025(online)].pdf 2025-09-18
2 201841010204-FORM-27 [18-09-2025(online)]-1.pdf 2025-09-18
3 201841010204-FORM 13 [27-02-2025(online)].pdf 2025-02-27
4 201841010204-Correspondence to notify the Controller [21-02-2025(online)].pdf 2025-02-21
5 201841010204-FORM 4 [12-06-2024(online)].pdf 2024-06-12
6 201841010204-FORM-27 [12-06-2024(online)].pdf 2024-06-12
7 201841010204-FORM-15 [28-05-2024(online)].pdf 2024-05-28
8 201841010204-POWER OF AUTHORITY [28-05-2024(online)].pdf 2024-05-28
9 201841010204-RELEVANT DOCUMENTS [10-03-2023(online)].pdf 2023-03-10
10 201841010204-RELEVANT DOCUMENTS [24-05-2022(online)].pdf 2022-05-24
11 201841010204-US(14)-HearingNotice-(HearingDate-25-06-2021).pdf 2021-10-17
12 201841010204-IntimationOfGrant16-08-2021.pdf 2021-08-16
13 201841010204-PatentCertificate16-08-2021.pdf 2021-08-16
14 201841010204-Written submissions and relevant documents [08-07-2021(online)].pdf 2021-07-08
15 201841010204-FORM-26 [11-12-2020(online)].pdf 2020-12-11
16 201841010204-ABSTRACT [16-10-2020(online)].pdf 2020-10-16
17 201841010204-CLAIMS [16-10-2020(online)].pdf 2020-10-16
18 201841010204-Covering Letter [16-10-2020(online)].pdf 2020-10-16
19 201841010204-FER_SER_REPLY [16-10-2020(online)].pdf 2020-10-16
20 201841010204-PETITION u-r 6(6) [16-10-2020(online)].pdf 2020-10-16
21 201841010204-FER.pdf 2020-03-16
22 201841010204-FORM 18A [13-02-2020(online)].pdf 2020-02-13
23 201841010204-FORM28 [13-02-2020(online)].pdf 2020-02-13
24 201841010204-STARTUP [13-02-2020(online)].pdf 2020-02-13
25 201841010204-Request Letter-Correspondence [11-04-2019(online)].pdf 2019-04-11
26 201841010204-Form 1 (Submitted on date of filing) [11-04-2019(online)].pdf 2019-04-11
27 Form1_After Filing_23-04-2018.pdf 2018-04-23
28 Correspondence by Applicant_Form1_23-04-2018.pdf 2018-04-23
29 abstract201841010204.jpg 2018-03-21
30 201841010204-STATEMENT OF UNDERTAKING (FORM 3) [20-03-2018(online)].pdf 2018-03-20
31 201841010204-OTHERS [20-03-2018(online)].pdf 2018-03-20
32 201841010204-FORM FOR SMALL ENTITY(FORM-28) [20-03-2018(online)].pdf 2018-03-20
33 201841010204-FORM 1 [20-03-2018(online)].pdf 2018-03-20
34 201841010204-FIGURE OF ABSTRACT [20-03-2018(online)].jpg 2018-03-20
35 201841010204-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [20-03-2018(online)].pdf 2018-03-20
36 201841010204-DRAWINGS [20-03-2018(online)].pdf 2018-03-20
37 201841010204-DECLARATION OF INVENTORSHIP (FORM 5) [20-03-2018(online)].pdf 2018-03-20
38 201841010204-COMPLETE SPECIFICATION [20-03-2018(online)].pdf 2018-03-20

Search Strategy

1 SearchstrategyE_11-03-2020.pdf

ERegister / Renewals

3rd: 03 Sep 2021

From 20/03/2020 - To 20/03/2021

4th: 03 Sep 2021

From 20/03/2021 - To 20/03/2022

5th: 03 Sep 2021

From 20/03/2022 - To 20/03/2023

6th: 03 Sep 2021

From 20/03/2023 - To 20/03/2024

7th: 12 Jun 2024

From 20/03/2024 - To 20/03/2025

8th: 12 Jun 2024

From 20/03/2025 - To 20/03/2026

9th: 12 Jun 2024

From 20/03/2026 - To 20/03/2027

10th: 12 Jun 2024

From 20/03/2027 - To 20/03/2028