
System And Method For Detection Of Features In An Image Using Knowledge Of Expert’s Eye Gaze Pattern

Abstract: The invention proposes systems and methods for the detection and extraction of one or more regions of interest in an input image, using eye gaze patterns of experts who are adept at identifying the regions of interest, along with eye gaze data of non-experts. The method calculates eye fixation data for the expert and non-expert datasets, clusters the eye fixation data into image regions, labels these as attractor or distractor regions, and fits statistical distributions to the data. It then extracts from the data image features corresponding to the attractor and distractor regions and creates rankings of the image features. A fuzzy logic computational system creates top down knowledge comprising top down feature maps for recognizing the image features based on the rankings. The system then detects features of interest in a test image using the top down feature maps.


Patent Information

Application #
Filing Date
04 November 2016
Publication Number
19/2018
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
shankar@maxval.com
Parent Application
Patent Number
Legal Status
Grant Date
2024-02-29
Renewal Date

Applicants

AMRITA VISHWA VIDYAPEETHAM
Amrita School Of Engineering, Amritanagar, Coimbatore, Tamil Nadu, India - 641 112, patents@amrita.edu

Inventors

1. Dr. JOSEPH, AMUDHA
201, Sunny Palazzo Apartment, 25/3 Owner's Court Layout, Kasavanahalli, Bangalore, Karnataka 560 035.
2. Ms. KULKARNI, NILIMA
002, Unique Shilpa, 17th Cross, 2nd A Main Vignan Nagar, Kagadaspura, Bangalore, Karnataka 560 075

Specification

Field of the Invention
[0002] The disclosure relates generally to the detection of features in images and in particular to a system and method for the detection of such image features using knowledge of experts’ eye gaze pattern.
Description of the Related Art
[0003] Detection and extraction of regions of interest is widely performed on images to ease the analysis of those regions. Such techniques find application in various fields such as medical imaging, object detection, machine vision and geographical information systems. Various existing segmentation algorithms operate on an image by assigning a label to every pixel such that pixels with the same label share certain characteristics. Classical methods of segmentation include edge-based segmentation, where edges of a region are traced by identifying each pixel value and comparing it with those of the neighboring pixels. Region-based segmentation is a second approach, in which pixels that are related to an object are grouped. The third approach is thresholding, which uses threshold values obtained from a histogram of the edges of objects in the original image.
[0004] Eye tracking plays an important role in visual search tasks for interactive and diagnostic applications. The process of searching for something, whether an object or an abnormality, involves a deliberate process of choosing one location after another for analysis until the target of the search is found, or the searcher decides the target is not present. However, subject experts will often report the sensation of “knowing” that a particular image contains a known target before they are able to locate it. A method of analyzing an image is disclosed in US published application US20050105768A1, comprising carrying out eye tracking on an observer observing the image and applying factor analysis to the fixation regions to identify the underlying image attributes which the observer is seeking. Granted US patent US9039419B2 discloses a method and system for capturing expert behavior, such as gaze patterns, and creating a catalog (e.g. a database) of these behaviors.
[0005] PCT application WO2013150494A2 relates to systems and methods for detection and identification of anomalies in target images, with respect to reference images. US granted patent US8774498B2 discloses a system and method that extracts features representative of patches of the image, to generate weighting factors for the features based on location relevance data. The weighting factors are then used to form a representation of the image. US8929680B2 discloses a visual attention map to represent one or more regions of an image. A salient region map defines one or more regions of the image as salient. An intersection between the visual attention map and the salient region map is determined to identify a distracting element in the image.
[0006] The present disclosure describes a system and method for the detection of a region of interest in an image using the knowledge of expert’s eye gaze pattern that overcomes some of the drawbacks of the existing methods.

SUMMARY OF THE INVENTION
[0007] The disclosure relates to detection of features in images using machine learning methods. A method for the detection and extraction of one or more regions of interest in a test image is disclosed. The method comprises providing a set of stimulus images to a group of experts and a group of non-experts, and recording a first dataset of eye gaze data from the experts looking at the stimulus images, wherein the experts’ eye gaze focuses on at least one region of interest. A second dataset of eye gaze data from the non-experts looking at the stimulus images is then recorded. Subsequently, the method involves calculating eye fixation data of the first dataset and the second dataset with reference to image coordinates, clustering the eye fixation data into image regions, labeling the image regions as attractor regions and distractor regions, and fitting a statistical distribution to the eye fixation data in the attractor regions and the distractor regions. The method further involves extracting from the fitted distributions, using a machine learning algorithm, image features corresponding to the attractor regions and the distractor regions to create rankings of the image features. A fuzzy logic computational system creates top down knowledge comprising top down feature maps for recognizing the image features based on these rankings. The method further includes providing a test image, and detecting in the detection system, using either the top down knowledge or a bottom up map or a combination thereof, the one or more regions of interest in the test image.
[0008] In some embodiments of the method, the first dataset of eye gaze data comprises eye tracker records of eye gaze patterns of experts looking for one or more regions of interest in stimulus images displayed for a specified period of time placed at a predetermined distance from the expert. Each stimulus image comprises at least one region of interest and the stimulus images are displayed sequentially.
[0009] In some embodiments the second dataset of eye gaze data comprises eye tracker records of eye gaze patterns of non-experts looking for one or more regions of interest in stimulus images displayed for a specified period of time placed at a predetermined distance from the non-expert. Each image comprises at least one region of interest and the stimulus images are displayed sequentially.
[0010] In some embodiments of the method, the calculated fixation data comprises number of fixations, or duration of fixations wherein fixations are identified by finding data samples that are within a predetermined visual angle of one another for a predetermined period of time. In some embodiments the fixation is identified if the data points are within 1° of visual angle for a time period of at least 150 ms.
[0011] In some embodiments of the method, the clustering and labeling of the image into attractor regions comprises the steps of providing fixation data of the expert dataset, clustering the image fixation data into a set of clusters using a clustering technique, obtaining a set of cluster centers from the set of clusters, calculating the distance between the cluster centers, and comparing the calculated distance with a predetermined threshold value. If the calculated distance is less than the predetermined threshold value, the attractor region is identified by drawing a circle with the mean of the centers as center and the predetermined threshold value as radius. If the calculated distance is greater than the predetermined threshold value, the cluster with the greater number of fixation data points is identified and a circle with the identified cluster’s center and the predetermined threshold value as radius is drawn as the attractor region.
[0012] In various embodiments of the method, clustering and labeling of the image into distractor regions comprises the steps of providing fixation data of the non-expert dataset, clustering the image fixation data into a set of clusters using a clustering technique, obtaining a set of cluster centers from the clusters, calculating a first distance between the cluster centers, and comparing the first distance with a predetermined threshold value. If the first distance is less than the predetermined threshold value, the clusters are merged into a single region, and if the first distance is greater than the predetermined threshold value, a second distance between each cluster center and the center of the attractor region is calculated. A region is excluded if this distance is less than the predetermined threshold value. The clusters with the second distance greater than the threshold value are identified as distractor regions.
[0013] In various embodiments the machine learning algorithm performs the steps of extracting low level image features comprising one or more of color, intensity or orientation features and creating output feature maps. The feature maps may comprise one or more of an intensity feature map, a color feature map wherein the color feature map comprises one or more of red, green, blue or yellow, a color opponency map wherein the color-opponency map comprises one or more red vs. green (RG) color opponency, blue vs. yellow color opponency (BY), green vs. red (GR), or yellow vs. blue (YB) or an orientation map. The orientation map may be computed using pyramids oriented at 0⁰, 45⁰, 90⁰ and 135⁰ and obtained using log Gabor filters. The method further comprises ranking the output feature maps based on the extracted image features and rearranging the feature maps according to the ranking.
[0014] In some embodiments creating the top down knowledge base of the image features comprises the steps of assigning weightage to the features in the order of the ranking sequence, creating a top down map of the output image features based on the assigned weightage, and classifying each pixel in the test image as belonging to attractor or distractor regions using the top down map.
[0015] In some embodiments, creating the bottom up map comprises adding one or more of a color conspicuity map, wherein the color conspicuity map comprises a combination of one or more of a red vs. green (RG) color opponency map, a blue vs. yellow (BY) color opponency map, a green vs. red (GR) opponency map, or a yellow vs. blue (YB) opponency map, an intensity conspicuity map, or an orientation conspicuity map, wherein the orientation conspicuity map comprises a combination of feature maps of one or more orientations at 0⁰, 45⁰, 90⁰ and 135⁰. The obtained result is normalized to a range of 0 to 255 to obtain the bottom up map.
[0016] In some embodiments of the method, the image is a retinal fundus image and the region of interest comprises an optic disc, or a pathological lesion. In some embodiments the expert is an optometrist, a radiologist or a physician.
[0017] In some embodiments the combination of the bottom-up and top-down knowledge base constructs a classification map that compromises between a top-down requirement and a bottom-up constraint wherein the uniform regions created by the bottom-up process can be split into attractor and distractor regions.

BRIEF DESCRIPTION OF THE DRAWINGS
[0018] The invention has other advantages and features which will be more readily apparent from the following detailed description of the invention and the appended claims, when taken in conjunction with the accompanying drawings, in which:
[0019] FIG. 1 illustrates a system for detection of optic disc in retinal image using knowledge of expert’s eye gaze pattern according to embodiments of the invention.
[0020] FIG. 2 illustrates a method for the detection of target region in an image.
[0021] FIG. 3 shows a method for identifying attractor regions in an expert’s eye gaze.
[0022] FIG. 4 illustrates a method for identifying the distractor regions in a non-expert’s eye gaze.
[0023] FIG. 5 illustrates pictorially, the detection of optic disc in retinal image using knowledge of expert’s eye gaze pattern.
[0024] FIG. 6A shows a retinal image with the optic disc as a distinctive search target.
[0025] FIG. 6B shows a retinal image with the optic disc as a conjunctive search target. FIG. 7A illustrates eye fixation data of an expert.
[0026] FIG. 7B shows eye fixation data of non-expert.
[0027] FIG. 8A shows calculated clusters from domain expert gaze data.
[0028] FIG. 8B shows calculated clusters from domain non-expert gaze data.
[0029] FIG. 9A shows the Bottom Up map of an input image.
[0030] FIG. 9B shows the Top Down map of an input image.
[0031] FIG. 9C shows combined map of input image for three examples.
[0032] FIG. 10 illustrates success rates of the system with reference to standard datasets.
[0033] Referring to the drawings, like numbers indicate like parts throughout the views.

DETAILED DESCRIPTION
[0034] While the invention has been disclosed with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the invention. In addition, many modifications may be made to adapt to a particular situation or material to the teachings of the invention without departing from its scope.
[0035] Throughout the specification and claims, the following terms take the meanings explicitly associated herein unless the context clearly dictates otherwise. The meaning of "a", "an", and "the" include plural references. The meaning of "in" includes "in" and "on." Referring to the drawings, like numbers indicate like parts throughout the views. Additionally, a reference to the singular includes a reference to the plural unless otherwise stated or inconsistent with the disclosure herein.
[0036] The invention in its various embodiments proposes a system for the detection and extraction of one or more regions of interest in an input image, using eye gaze patterns of experts who are adept at identifying the regions of interest, and eye gaze data of non-experts. The system as shown in FIG. 1 consists of an input unit, a processing unit and a test unit. The input unit includes an eye tracker and a monitor to record the eye gaze patterns of domain experts and non-experts while looking at stimulus images. The recorded eye gaze data are stored in a database system. The processing unit comprises an Eye Gaze Data Processing (EGDP) unit and a Feature Extraction and Top Down Knowledge Building (FETDKB) unit. The EGDP unit calculates the eye fixation data of the eye gaze patterns received from the database system. The EGDP unit further comprises an Automatic Labeling of Retinal Images using Eye Gaze Analysis (ALRIEGA) unit. The ALRIEGA unit clusters the eye fixation data and automatically labels the stimulus images into attractor and distractor regions. Image features are extracted from the attractor regions and the distractor regions and are ranked using a machine learning algorithm. The fuzzy logic computational system assigns weightage to the features corresponding to their ranks and creates a top down knowledge of the features of the attractor and distractor regions. In some embodiments, when an input test image is given to the system, the test unit creates a top down map from the top down knowledge and a bottom up map of the input test image. The target region in the input test image is then detected using either the top down map, the bottom up map, or a combination of the two.
[0037] In various embodiments the system of FIG. 1 is used to implement a method as illustrated in FIG. 2. The method 200 involves, in step 201, recording the eye gaze patterns of experts and non-experts while looking at a set of stimulus images. The recorded eye gaze data of experts provide a first dataset and the recorded eye gaze data of non-experts provide a second dataset. The eye fixation data of experts and non-experts are calculated in step 202 from the first dataset and second dataset respectively with reference to the image coordinates. In step 203 the eye fixation data of experts are clustered into image regions which are labeled as attractor regions. The eye fixation data of non-experts are clustered into image regions which are labeled as distractor regions in step 204. Image features are extracted from the attractor regions and the distractor regions and are ranked using a machine learning algorithm in step 205. The fuzzy logic computational system assigns weightage to the features corresponding to their ranks and creates a top down knowledge of the features of the attractor and distractor regions in step 206. On providing an input test image in step 207, the fuzzy system constructs a top down map using the top down knowledge in step 209. The bottom up map is constructed from the image in step 208. In steps 210 and 211 the one or more regions of interest in the input test image are detected using either the top down map, the bottom up map, or a combination of the two.

[0038] In some embodiments the first dataset of eye gaze data is obtained by recording the eye gaze pattern of domain experts while looking for one or more regions of interest in a set of stimulus images. The images are displayed sequentially for a specific period of time and are placed at a predetermined distance from the expert. The stimulus image comprises at least one region of interest and once the region of interest is identified by the expert, the expert looks at the region of interest until the next image is displayed.
[0039] In some embodiments the second dataset of eye gaze data is obtained by recording the eye gaze pattern of non-experts while looking for one or more regions of interest in a set of stimulus images. The images are displayed sequentially for a specific period of time and are placed at a predetermined distance from the non-expert.
[0040] In various embodiments the eye fixation data of experts and non-experts comprise the number of fixations, duration of fixations, number of dwells and duration of dwells for the first dataset and for the second dataset. The fixations are identified by finding data samples that are within a predetermined visual angle of one another for a predetermined period of time. In some embodiments the predetermined visual angle between data samples to be identified as fixations is 1° for a predetermined time period of at least 150 ms.
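Purely as an illustration of this dispersion-based fixation criterion (the disclosure reports a MATLAB implementation), a minimal Python sketch of a dispersion-threshold detector is given below; the function name, the timestamp format, and the pixel-per-degree conversion are assumptions, not part of the disclosure.

```python
import numpy as np

def detect_fixations(gaze_xy, t_ms, max_dispersion_px, min_duration_ms=150):
    """Dispersion-threshold fixation detection (illustrative sketch).

    gaze_xy           : (N, 2) gaze samples in image coordinates (pixels)
    t_ms              : (N,) sample timestamps in milliseconds
    max_dispersion_px : pixel equivalent of the 1 degree visual-angle window
    Returns a list of (x_centroid, y_centroid, duration_ms).
    """
    fixations, start, n = [], 0, len(gaze_xy)
    while start < n:
        end = start
        # grow the window until it spans at least the minimum duration
        while end < n - 1 and t_ms[end] - t_ms[start] < min_duration_ms:
            end += 1
        if t_ms[end] - t_ms[start] < min_duration_ms:
            break
        window = gaze_xy[start:end + 1]
        if np.ptp(window[:, 0]) + np.ptp(window[:, 1]) <= max_dispersion_px:
            # extend while the dispersion stays inside the threshold
            while end + 1 < n:
                w = gaze_xy[start:end + 2]
                if np.ptp(w[:, 0]) + np.ptp(w[:, 1]) > max_dispersion_px:
                    break
                end += 1
            window = gaze_xy[start:end + 1]
            fixations.append((window[:, 0].mean(), window[:, 1].mean(),
                              t_ms[end] - t_ms[start]))
            start = end + 1
        else:
            start += 1
    return fixations

# At a 60 cm viewing distance, 1 degree of visual angle covers roughly 1 cm of
# screen; converting that to pixels depends on the display and is left as the
# caller's input (max_dispersion_px).
```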
[0041] In some embodiments the expert fixation data are clustered into regions as shown in FIG. 3 to obtain the attractor regions. The method 300 of clustering the expert fixation data into regions comprises the following steps. In step 301 eye fixation data of experts are provided to obtain the attractor regions. K-means clustering is applied in step 302 on the eye fixation data of experts, with the value of K determined using the elbow method. In step 303 the fixation data of experts are clustered into K clusters with cluster centers c1, c2, …, ck. In step 304 the distance dcc between the cluster centers is calculated as the Euclidean distance dcc = √((x1 − x2)² + (y1 − y2)²), where (x1, y1) and (x2, y2) are the coordinates of the cluster centers. The distance dcc is compared with the predetermined threshold value in step 305. If the distance dcc is less than the predetermined threshold value, then in step 311 the mean CE of the cluster centers c1, c2, …, ck is calculated and in step 312 a circle is drawn with CE as center and the predetermined threshold as radius to obtain the attractor region. If the distance dcc is more than the predetermined threshold value, then in step 321 the cluster ci with the greater number of fixation points is selected and a circle with ci as center and the threshold as radius is drawn in step 322 to obtain the attractor region.
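A brief sketch of this attractor-region rule (method 300) follows, here using scikit-learn's KMeans with K = 2, the value the elbow method yields in Example 2; the threshold is a free parameter and the helper name is illustrative only.

```python
import numpy as np
from sklearn.cluster import KMeans

def attractor_region(expert_fixations_xy, threshold):
    """Return (center, radius) of the attractor circle per method 300.

    expert_fixations_xy : (N, 2) expert fixation coordinates
    threshold           : predetermined distance threshold, also used as radius
    """
    km = KMeans(n_clusters=2, n_init=10).fit(expert_fixations_xy)
    c1, c2 = km.cluster_centers_
    d_cc = np.linalg.norm(c1 - c2)              # Euclidean distance, step 304
    if d_cc < threshold:                        # steps 311-312
        center = (c1 + c2) / 2.0                # mean of the cluster centers
    else:                                       # steps 321-322
        counts = np.bincount(km.labels_)        # fixations per cluster
        center = km.cluster_centers_[np.argmax(counts)]
    return center, threshold                    # circle of radius = threshold
```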
[0042] In some embodiments the non-expert fixation data are clustered into regions to obtain distractor regions. The method 400 of clustering the non-expert fixation data and labeling it into distractor regions, as shown in FIG. 4, comprises the following steps. In step 401 eye fixation data of non-experts are provided to obtain the distractor regions. K-means clustering is performed in step 402 on the eye fixation data of non-experts, with the value of K determined using the elbow method. In step 403 the fixation data of non-experts are clustered into K clusters, each cluster having a center denoted c1n, c2n, …, ckn. In step 404 a first distance dcc is calculated between the cluster centers as the Euclidean distance dcc = √((x1 − x2)² + (y1 − y2)²). The distance dcc is compared with the predetermined threshold value in step 410. If the first distance dcc is more than the predetermined threshold value, then in step 420 a second distance d between each cluster center and the cluster center of the attractor region is calculated. If the first distance dcc is less than the predetermined threshold value, then in step 411 the clusters are merged and the mean CM of the cluster centers is calculated; the second distance d is then calculated as in step 420 on the merged cluster. The second distance d is further compared with the threshold value. If d is less than the predetermined threshold value the cluster is excluded in step 421, and if d is greater than the predetermined threshold value a circle is drawn in step 431 with ci as center and the predetermined threshold as radius to obtain the distractor regions.
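A matching sketch of the distractor-region rule (method 400), again with K = 2; the attractor center comes from the previous step, the threshold is the same predetermined value, and the function name is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def distractor_regions(nonexpert_fixations_xy, attractor_center, threshold):
    """Return a list of (center, radius) distractor circles per method 400."""
    km = KMeans(n_clusters=2, n_init=10).fit(nonexpert_fixations_xy)
    centers = list(km.cluster_centers_)
    d_cc = np.linalg.norm(centers[0] - centers[1])   # first distance, step 404
    if d_cc < threshold:                             # step 411: merge clusters
        centers = [np.mean(km.cluster_centers_, axis=0)]
    regions = []
    for c in centers:
        d = np.linalg.norm(c - attractor_center)     # second distance, step 420
        if d > threshold:                            # step 431: keep the circle
            regions.append((c, threshold))
        # clusters overlapping the attractor region are excluded (step 421)
    return regions
```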
[0043] In various embodiments the stimulus images and the test images could be any medical image, such as retinal fundus images. The test images could be tested for pathological conditions such as diabetic maculopathy lesions, for glaucoma detected by measurement of a varying OD to cup diameter ratio, or for the detection of cancer. In various embodiments the expert is an optometrist, a radiologist or a physician.
[0044] In some embodiments the test images could be radiographic images such as X-ray or tomographic images, or optical images of anatomical features or tissue, containing one or more features of interest. The features of interest could be related to abnormality or pathological condition identifiable in the test image.
[0045] In some embodiments the image could be a radiographic transmission image, an ultrasonography or other acoustic image, or an optical reflection or transmission image. The feature of interest could be a defect such as a crack or discontinuity, porosity, etc.
[0046] In some embodiments the machine learning algorithm extracts color, intensity, depth, motion, orientation or any other feature to create the corresponding feature maps. The output feature maps are further ranked by the machine learning algorithm and rearranged according to the rank.
[0047] In some embodiments the fuzzy logic computational system creates the top down knowledge by associating a meaningful weight descriptor with the features obtained from the two classified groups, the attractor and distractor regions, and hence creates the top-down map.
[0048] In some embodiments the detection system creates a bottom up knowledge base for the detection of the region of interest in an input test image. The detection system computes, adds and normalizes the color conspicuity map, color opponency map, intensity conspicuity map, and orientation conspicuity map at an angle of 0⁰, 45⁰, 90⁰ and 135⁰ to create the bottom-up map.
[0049] In some other embodiments the detection system combines the bottom-up and top-down maps to construct a classification map that compromises between a top-down requirement and a bottom-up constraint. The uniform regions created by the bottom-up process are split into attractor and distractor regions of smaller segments. The final segmentation reaches either pixel or even sub-pixel accuracy.
[0050] In another embodiment, on providing an input test image to the system the detection system constructs a map of the input test image using the top down knowledge and the bottom up map from the conspicuity maps or a combination of both, and detects the one or more regions of interest in the input test image.
[0051] Auto segregation of regions derived from the knowledge of expert’s eye gaze pattern helps to improve the robustness of the system developed. This method can be incorporated into currently available instruments for ocular screening, such as glaucoma screening. It could also be incorporated in automated systems to annotate a medical image database.
[0052] While the above is a complete description of the embodiments of the invention, various alternatives, modifications, and equivalents may be used. Therefore, the above description and the examples to follow should not be taken as limiting the scope of the invention which is defined by the appended claims.
EXAMPLES
Example 1: Recording of Expert and Non-Expert Eye Gaze Patterns
[0053] The experiment was performed on a laptop computer with a screen size of 1366 X 768. Retinal stimulus images were displayed to participants on the screen and eye tracking data of the participants were collected. The SMI iView X RED-m (60 Hz) eye tracker was fixed and the participants faced the center of the screen at a viewing distance of 60 cm. They were instructed to restrict their head movement. All participants were initially told that a set of 25 retinal images had to be viewed one by one. They were asked to find the optic disc (i.e. the target) in each image. Once they identified the target, they were instructed to look at the target until the next image was displayed. The SMI iView X system was calibrated prior to each recording session using a 5 point grid covering the area in which images were presented. The images were displayed for 7 sec each, in sequence. For the non-expert group of participants, a demo was given for information on the optic disc. The main task of the participants was to remain still and look at the monitor. The steps conducted during the experiment were as follows.
[0054] (a) Eye tracking: The participants were instructed to look around the stimulus monitor and their eyes were tracked using the SMI iView X RED-m eye tracker. When a user sits at an optimal position in front of the RED-m Eye Tracking Device, the Eye Tracking Monitor shows the user's eyes as two ovals somewhere near the center of the screen. This means the user is at an ideal distance from the monitor and the RED-m Eye Tracking Device can track both of the user's eyes.
[0055] (b) Calibration & Validation: Calibration and validation are important steps. In the calibration process, participants see a small circle on the screen and are asked to follow it. To assess how good the calibration is, the same procedure is repeated for validation. Calibration is considered good if the validated points are close to the original points. Farther points indicate poor calibration, and in such cases the data was discarded and the process repeated.
[0056] (c) Watching images and recording of eye gaze data: The stimulus images were shown in full screen mode. All images were displayed sequentially one after another. The participants had to watch the images, search for the target (i.e. the optic disc), and look at the target. The program controls the SMI iView X software and instructs it to record the movement of the subject’s eyes while watching the images. 100 eye gaze samples were collected from experts and 300 eye gaze samples were collected from non-experts. The images were selected under two categories: one in which the optic disc (OD) pops out, as shown in FIG. 6A, where the OD is a disjunctive target; and another in which lesions that share similar properties with the target OD are also present, as shown in FIG. 6B, known as a conjunctive target. The stimulus images used in the experiment were subsets of the DRIVE, STARE, High Resolution Fundus Image and INSPIRE datasets.
Example 2: Automatic Labeling of Retinal Images using Eye Gaze Analysis (ALRIEGA) System
[0057] The detection of the optic disc in a retinal image using knowledge of the expert’s eye gaze pattern is illustrated pictorially in FIG. 5. The ALRIEGA system identifies and labels attractor and distractor regions. In the ALRIEGA system, fixations were detected by implementing a dispersion based algorithm. This algorithm identifies fixations by finding data samples that are close enough to one another for a specified minimal period of time. When the data samples were within 1⁰ of visual angle for at least 150 ms, that sequence of data samples was considered a fixation, as shown in FIG. 7A and FIG. 7B. Once fixations were detected, they were counted. When the fixations were inside the area of interest, the number of fixations was called the fixation density. Fixation duration gives the period of time for which the eyes are relatively stable. The dispersion threshold was set to 1⁰ of visual angle and the duration threshold was set to 150 ms.
[0058] The fixations were calculated and identified as shown in FIG. 7A and FIG. 7B. The identified fixations were classified using the K-means clustering algorithm. The elbow method was used to find a suitable number of clusters; it gives a value of k of 2. The fixation data calculated from the two groups were treated separately. The K-means algorithm finds the two clusters in near vicinity, as shown in FIG. 8A and FIG. 8B, using the expert and non-expert eye gaze data. Further, the distance dcc between the two cluster centers c1 and c2 was calculated for the experts’ data. The distance dcc was found to be greater than the threshold. Hence the cluster with more of the domain expert’s fixations and the maximum fixation duration was considered for further processing. The cluster center CE of this final cluster was used as the center for cropping the circular region, which gives the attractor’s region. The distance dcc was then calculated between the two cluster centers c1 and c2 for the non-expert’s data. This distance is greater than the threshold, and hence the distances between these cluster centers and the final cluster center CE of the attractor’s region were calculated. The purpose of this distance calculation is to check which cluster is overlapping with the attractor’s region. The cluster with the shorter distance is ignored while the cluster at a greater distance is considered for further processing. The area identified with the non-expert’s fixation data is identified as the distractor’s region. Once the attractor’s and distractor’s regions were identified, the number of fixations and the duration of fixations of the attractor’s region and distractor’s region were calculated.
Example 3: Feature Extraction and Top Down Knowledge Building System
[0059] The FETDKB unit contains image feature extraction and a fuzzy system to build top down knowledge. The low-level features color, intensity and orientation were extracted from the attractor’s and distractor’s regions. Along with the color feature maps for red, green, blue and yellow, color opponency maps for red-green (RG) and blue-yellow (BY), and their complements green-red (GR) and yellow-blue (YB), were extracted. The orientation maps are computed using oriented pyramids. Log Gabor filters are used to obtain the different orientation maps. The orientation pyramid consists of four pyramids, one for each orientation 0⁰, 45⁰, 90⁰ and 135⁰. A total of 13 features were extracted.
[0060] Feature ranking is applied to the features collected from the attractor and distractor regions. The purpose of ranking the features is to identify the sequence of features contributing to the classification of the attractor’s regions and distractor’s regions. 100 samples of attractor regions and 250 samples of distractor regions were used for ranking. This ranking gives the most important features. The ranker search method with InfoGainAttributeEval in WEKA is used for ranking.
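The disclosure ranks features with WEKA's ranker search over InfoGainAttributeEval; purely as a hedged stand-in, a comparable information-theoretic ranking can be sketched with scikit-learn's mutual_info_classif (this approximates, but is not, the WEKA evaluator).

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def rank_features(X, y, feature_names):
    """Rank features by their information content for the two-class problem.

    X : (n_samples, 13) feature values sampled from attractor/distractor regions
    y : labels, 1 for attractor-region samples and 0 for distractor-region ones
    Returns (feature names ordered best-first, corresponding scores).
    """
    scores = mutual_info_classif(X, y, random_state=0)
    order = np.argsort(scores)[::-1]
    return [feature_names[i] for i in order], scores[order]
```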

[0061] Fuzzy inference is employed in this system to compute the values for the feature maps. The fuzzy inference system has 13 input variables and 13 output variables, each representing a single pixel value. All the input and output variables have three membership functions represented by the linguistic variables small, medium and large. Triangular shaped membership functions are used as they are suitable to represent pixel values. The range of the pixel values is 0 to 255. This range is divided into three for assignment to the three membership functions: the range 0 to 100 belongs to the membership function small, the range 50 to 200 belongs to the membership function medium and the range 150 to 255 belongs to large. The rules are written using if and then statements. The system outputs thirteen maps. The rules are written such that the output features have values based on the feature ranking given in the previous step. A feature with a high ranking is given a high weightage, i.e. a large membership value. The feature maps were rearranged and given to the fuzzy system, which assigns high values to the features according to the sequence. The first five features in the input are assigned large values, the next four features are assigned medium values and the last four features are assigned low values. The top down knowledge is generated in the form of identifying the ranking of the features and assigning weightage (values) to the features according to their ranks. The top down map is created by finding the difference between maps.
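A minimal numpy sketch of the triangular membership functions and the rank-based weighting described above; the membership peaks and the numeric weights (1.0, 0.6, 0.3) are illustrative assumptions, since the full 13-input, 13-output rule base is not reproduced in this text.

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function over pixel values in [0, 255]."""
    x = np.asarray(x, dtype=float)
    left = np.clip((x - a) / max(b - a, 1e-9), 0.0, 1.0)
    right = np.clip((c - x) / max(c - b, 1e-9), 0.0, 1.0)
    return np.minimum(left, right)

# Linguistic variables small (0-100), medium (50-200) and large (150-255);
# the ranges follow the text, the peak positions are assumptions.
small  = lambda x: trimf(x,   0,  50, 100)
medium = lambda x: trimf(x,  50, 125, 200)
large  = lambda x: trimf(x, 150, 200, 255)

def top_down_weighting(feature_maps, ranking):
    """Weight the rearranged feature maps by rank: first five -> large,
    next four -> medium, last four -> small (numeric weights illustrative)."""
    weighted = []
    for rank, name in enumerate(ranking):
        w = 1.0 if rank < 5 else (0.6 if rank < 9 else 0.3)
        weighted.append(w * feature_maps[name])
    return weighted   # inputs to the rule base / top down map construction
```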
Example 4: Eye Gaze based Optic Disc Detection system (EGODD System)
[0062] This unit takes as input a fundus retinal image and calculates two maps: a bottom up map (BU map) and a top down map (TD map). The bottom up map is built from bottom up features and the top down map is calculated from the output of the fuzzy system, as shown in FIG. 9A and FIG. 9B. The steps include:
1. Reading the input test image
2. Calculating bottom up map based on color, intensity and orientation features
3. Calculating the top down map, which is built using the top down knowledge
4. Calculating combined map
Combined map = BU map + TD map
5. Identifying the most salient region
[0063] Initially the bottom up map is computed from the input image. Conspicuity maps are computed by summing up the feature maps corresponding to each feature. The color conspicuity map is the combination of RG, BY, GR and YB feature maps. The Intensity conspicuity map is the same as the intensity feature map. The orientation conspicuity map is obtained by adding the feature maps for all the four orientations. To get the bottom up map all the conspicuity maps are summed and normalized to the range 0 to 255.
BU map = N(color conspicuity map + intensity conspicuity map + orientation conspicuity map), where N(·) denotes normalization to the range 0 to 255.
[0064] The top down map is created by using the fuzzy system explained above. The bottom up map and top down map were combined to get the combined map, as shown in FIG. 9A-9C:
Combined map = BU map + TD map
The system is implemented using MATLAB. The Fuzzy Inference System was designed using the Fuzzy Toolbox in MATLAB.
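The authors' implementation is in MATLAB with its fuzzy toolbox; purely as an illustrative translation of steps 2 to 5 above, the numpy sketch below assumes the opponency, intensity and orientation maps have already been computed as in Example 5.

```python
import numpy as np

def normalize_0_255(m):
    """Scale a map to the range 0 to 255, as described in the text."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return (m - m.min()) / span * 255.0 if span > 0 else np.zeros_like(m)

def bottom_up_map(opponency, intensity, orientation):
    """opponency: dict with 'RG', 'BY', 'GR', 'YB' maps; orientation: dict
    keyed by 0, 45, 90, 135. Conspicuity maps are per-feature sums."""
    color_consp = sum(opponency[k] for k in ('RG', 'BY', 'GR', 'YB'))
    orient_consp = sum(orientation[a] for a in (0, 45, 90, 135))
    return normalize_0_255(color_consp + intensity + orient_consp)

def combined_map(bu_map, td_map):
    """Combined map = BU map + TD map; its maximum marks the most salient
    region, i.e. the candidate optic disc location."""
    cm = bu_map + td_map
    return cm, np.unravel_index(np.argmax(cm), cm.shape)
```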
[0065] The EGODD system was tested on different fundus retinal image datasets. Table 1 shows the results. The performance is measured in terms of success rate, as shown in FIG. 10. The system proposed here achieves 100% success rates for the DRIVE and INSPIRE datasets. For the High Resolution Fundus Images and STARE datasets we obtained 95.55% and 90.12% success rates respectively. Similarly, the performance is measured in terms of the average hit number. The hit number on an image I for a target t is the rank of the focus that hits the target in order of saliency. For example, if the 2nd focus is on the target, the hit number is 2. The lower the hit number, the better the performance. The average hit number for a dataset is the arithmetic mean of the hit numbers of all its images. The average hit number is computed for each dataset separately, as shown in Table 1. We have further computed the number of first hits for the different datasets. This number is converted into a percentage so that comparison becomes easy for datasets with different numbers of images. The proposed system is compared with an existing system as shown in Table 2, and the results show considerable improvement of the proposed system for optic disc detection.
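For clarity, the two evaluation measures can be sketched as follows; representing the target as a binary mask is an assumption made here for illustration.

```python
import numpy as np

def hit_number(foci, target_mask):
    """foci: (row, col) attention foci ordered by decreasing saliency;
    target_mask: boolean array that is True inside the target (optic disc).
    Returns the 1-based rank of the first focus hitting the target, or None."""
    for rank, (r, c) in enumerate(foci, start=1):
        if target_mask[r, c]:
            return rank
    return None

def dataset_metrics(hit_numbers):
    """hit_numbers: one value per image (None = target never hit).
    Returns (success rate in %, average hit number, % of first hits)."""
    hits = [h for h in hit_numbers if h is not None]
    success = 100.0 * len(hits) / len(hit_numbers)
    avg_hit = float(np.mean(hits)) if hits else float('nan')
    first_hits = 100.0 * sum(1 for h in hits if h == 1) / len(hit_numbers)
    return success, avg_hit, first_hits
```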

Example 5: Extraction of Low Level Features
[0066] The low-level features color, intensity and orientation were extracted from the attractor’s and distractor’s regions. Along with the color feature maps for red, green, blue and yellow, color opponency maps for red-green and blue-yellow, and their complements green-red and yellow-blue, were extracted.
[0067] The color feature map is obtained using the input image in RGB color space. The Red, Green, Blue and Yellow color channels are obtained using equations (1) to (4). Each channel yields maximum response to the hue to which it is tuned, and zero response to black and white inputs.

[0068] Here the color opponency maps are created using a center surround method. A two level Gaussian pyramid is created, where the first level is used as the center and the second level is used as the surround. Initially the image is filtered using a Gaussian filter and sub-sampled to half its size to obtain level one. The level one image is again filtered and sub-sampled to half its size to obtain level two. The color opponency maps are obtained by creating Gaussian pyramids for the red, green, blue and yellow feature maps. The RG color opponency map has red regions highlighted and green regions inhibited, while the BY color opponency map has blue regions highlighted and yellow regions inhibited (given in equations (5) to (8)).
[0069] The intensity map is created by finding the average of the red, green and blue components (equation (9)). After finding the average, the center surround difference of the intensity map is obtained for the calculation of the intensity feature map.
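Equations (1) to (9) are not reproduced in this text. The sketch below uses the broadly tuned channel definitions commonly used in Itti-Koch style saliency models and a two-level Gaussian pyramid for the centre-surround difference; these exact formulas are an assumption about what the referenced equations contain.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def color_channels(img):
    """img: float RGB image scaled to [0, 1]. Returns broadly tuned R, G, B, Y
    channels and the intensity map (assumed Itti-Koch style definitions)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    R = np.clip(r - (g + b) / 2, 0, None)
    G = np.clip(g - (r + b) / 2, 0, None)
    B = np.clip(b - (r + g) / 2, 0, None)
    Y = np.clip((r + g) / 2 - np.abs(r - g) / 2 - b, 0, None)
    I = (r + g + b) / 3
    return R, G, B, Y, I

def center_surround(channel):
    """Two-level Gaussian pyramid: level one is the center, level two (upsampled
    back to level-one size) is the surround; the map is their absolute difference."""
    level1 = gaussian_filter(channel, sigma=1.0)[::2, ::2]
    level2 = gaussian_filter(level1, sigma=1.0)[::2, ::2]
    surround = zoom(level2, 2, order=1)[:level1.shape[0], :level1.shape[1]]
    return np.abs(level1 - surround)

def opponency_maps(img):
    R, G, B, Y, _ = color_channels(img)
    return {'RG': center_surround(R - G), 'GR': center_surround(G - R),
            'BY': center_surround(B - Y), 'YB': center_surround(Y - B)}
```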
[0070] The orientation maps are computed using oriented pyramids. Log Gabor filters are used to obtain the different orientation maps. The orientation pyramid consists of four pyramids, one for each orientation 0⁰, 45⁰, 90⁰ and 135⁰. The pyramid for each orientation highlights the edges having this orientation at different scales. A Laplacian pyramid of the image is created using the Filter-Subtract-Decimate (FSD) method. The oriented pyramid is formed by modulating each level of the Laplacian pyramid with a set of oriented sine waves, followed by a Low Pass Filtering (LPF) operation using a separable filter and corresponding sub-sampling, as defined in equation (10). Four oriented components are used, corresponding to 0, 45, 90 and 135 degrees. The Gabor filters directly give the centre surround difference output, which is considered as the feature map.
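A hedged sketch of a single log-Gabor orientation filter applied in the frequency domain at one of the four orientations; the centre frequency and bandwidth parameters are illustrative assumptions, and this is not the FSD Laplacian-pyramid formulation referenced in equation (10).

```python
import numpy as np

def log_gabor_orientation_map(gray, theta_deg, f0=0.1, sigma_on_f=0.55,
                              sigma_theta=np.pi / 6):
    """Filter a grayscale image with one log-Gabor filter oriented at theta_deg
    (0, 45, 90 or 135). Returns the response magnitude as the feature map."""
    h, w = gray.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                             # avoid log(0) at DC
    theta = np.arctan2(fy, fx)

    # radial log-Gaussian component centred on the frequency f0
    radial = np.exp(-(np.log(radius / f0) ** 2) /
                    (2 * np.log(sigma_on_f) ** 2))
    radial[0, 0] = 0.0                             # no DC response

    # angular Gaussian component centred on the requested orientation
    theta0 = np.deg2rad(theta_deg)
    dtheta = np.arctan2(np.sin(theta - theta0), np.cos(theta - theta0))
    angular = np.exp(-(dtheta ** 2) / (2 * sigma_theta ** 2))

    response = np.fft.ifft2(np.fft.fft2(gray) * radial * angular)
    return np.abs(response)

# orientation feature maps for the four orientations used in the pyramid
# orientation = {a: log_gabor_orientation_map(gray, a) for a in (0, 45, 90, 135)}
```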
[0071] Conspicuity maps are computed by summing up the feature maps corresponding to each feature. The color conspicuity map is the combination of RG, BY, GR and YB feature maps. The Intensity conspicuity map is the same as the intensity feature map. The orientation conspicuity map is obtained by adding the feature maps for all the four orientations. To get the bottom up map all the conspicuity maps are summed and normalized to the range 0 to 255.

1. A method for the detection and extraction of one or more regions of interest
in a test image comprising:
providing a set of stimulus images to a group of experts and a group of non-experts;
recording a first dataset of eye gaze data from the experts looking at the stimulus
images, wherein the experts’ eye gaze focuses on at least one region of interest;
recording a second dataset of eye gaze data from the non-experts looking at the
stimulus images;
calculating eye fixation data of the first dataset and the second dataset with
reference to image coordinates;
clustering the eye fixation data into image regions and labeling the image regions
into attractor regions and distractor regions;
fitting a statistical distribution to the eye fixation data in the attractor regions and the
distractor regions;
extracting from the fitted distributions using a machine learning algorithm image
features corresponding to the attractor regions and the distractor regions to create
rankings of the image features, wherein the fuzzy logic computational system
creates a top down knowledge comprising top down feature maps for recognizing
the image features based on these rankings;
providing a test image, and,
detecting in the detection system using either the top down knowledge or a bottom
up map or a combination thereof, the one or more regions of interest in the test
image.
2. The method of claim 1, wherein the first dataset of eye gaze data comprises eye tracker records of eye gaze patterns of experts looking for one or more regions of interest in stimulus images displayed for a specified period of time placed at a predetermined distance from the expert, wherein each stimulus image comprises at least one region of interest and wherein the stimulus images are displayed sequentially.
3. The method of claim 1, wherein the second dataset of eye gaze data comprises eye tracker records of eye gaze patterns of non-experts looking for one or more regions of interest in stimulus images displayed for a specified period of time placed at a predetermined distance from the non-expert, wherein each image comprises at least one region of interest and wherein the stimulus images are displayed sequentially.
4. The method of claim 1, wherein the calculated fixation data comprises a number of fixations, or a duration of fixations wherein fixations are identified by finding data samples that are within a predetermined visual angle of one another for a predetermined period of time.
5. The method of claim 4, wherein the fixation is identified if the data points are within 1° of visual angle for a time period of at least 150 ms.
6. The method of claim 1, wherein clustering and labeling the image to attractor regions comprises the steps of:

a) providing fixation data of the expert dataset;
b) using a clustering technique, clustering the image fixation data into a set of clusters;
c) obtaining a set of cluster centers from the set of clusters;
d) calculating the distance between the cluster centers;

e) comparing the calculated distance with a predetermined threshold value;
f) if the calculated distance is less than the predetermined threshold value, identifying the attractor region by drawing a circle with the mean of the centers as center and the predetermined threshold value as radius; and
g) if the calculated distance is greater than the predetermined threshold value, identifying the cluster with the greater number of fixation data and drawing a circle with the identified cluster’s center and the predetermined threshold value as radius as the attractor region.
7. The method of claim 1, wherein clustering and labeling the image to distractor regions comprises the steps of:
h) providing fixation data of the non-expert dataset;
i) using a clustering technique, clustering the image fixation data into a set of clusters;
j) obtaining a set of cluster centers from the clusters;
k) calculating a first distance between the cluster centers;
l) comparing the first distance with a predetermined threshold value;
m) if the first distance is less than the predetermined threshold value, merging the clusters into a single region;
n) if the first distance is greater than the predetermined threshold value, calculating a second distance between the cluster center and the center of the attractor region and excluding the region if the distance is less than the predetermined threshold value; and
o) identifying the clusters with the second distance greater than the threshold value as distractor regions.

8. The method of claim 1, wherein the machine learning algorithm performs
the steps of:
p) extracting low level image features comprising one or more of
color, intensity or orientation features;
q) creating output feature maps comprising one or more of:
an intensity feature map;
a color feature map wherein the color feature map comprises one or more of red, green, blue or yellow;
a color opponency map wherein the color-opponency map comprises one or more red vs. green (RG) color opponency, blue vs. yellow color opponency (BY), green vs. red (GR), or yellow vs. blue (YB); or
an orientation map wherein the orientation maps are computed using pyramids oriented at 0⁰, 45⁰, 90⁰ and 135⁰ and wherein the orientation maps are obtained using log Gabor filters;
r) ranking the output feature maps based on the extracted image features; and
s) rearranging the feature maps according to the ranking.
9. The method of claim 1 wherein creating the top down knowledge base of the
image features comprises the steps of:
assigning weightage to the features in the order of the ranking sequence;
creating a top down map of the output image features based on the assigned weightage; and
classifying each pixel in the test image as belonging to attractor or distractor regions using the top down map.

10. The method of claim 1 wherein creating the bottom up map comprises adding one
or more of:
a color conspicuity map wherein the color conspicuity map comprises a combination of one or more of red vs. green (RG) color opponency map, blue vs. yellow color opponency (BY) map, green vs. red (GR) opponency map, or yellow vs. blue (YB) opponency map;
an intensity conspicuity map; or
an orientation conspicuity map wherein the orientation conspicuity comprises a combination of feature maps of one or more orientations at 0⁰, 45⁰, 90⁰ and 135⁰; and
normalizing the obtained result over a range of 0 to 255.
11. The method of claim 1 wherein the image is a retinal fundus image and the region of interest comprises an optic disc, or a pathological lesion.
12. The method of claim 1 wherein the expert is an optometrist, a radiologist or a physician.
13. The method of claim 1 wherein the combination of the bottom-up and top-down
knowledge base constructs a classification map that compromises between a top-down
requirement and a bottom-up constraint wherein the uniform regions created by the
bottom-up process can be split into attractor and distractor regions.

Documents

Application Documents

# Name Date
1 Form3 As Filed _ 04-11-2016.pdf 2016-11-04
2 Drawings_ 04-11-2016.pdf 2016-11-04
3 Description (Provisional)_ 04-11-2016.pdf 2016-11-04
4 Claims_ 04-11-2016.pdf 2016-11-04
5 Abstract_ 04-11-2016.pdf 2016-11-04
6 Drawing [01-03-2017(online)].pdf 2017-03-01
7 Description(Complete) [01-03-2017(online)].pdf 2017-03-01
8 Description(Complete) [01-03-2017(online)].pdf_197.pdf 2017-03-01
9 Other Patent Document [12-05-2017(online)].pdf 2017-05-12
10 Assignment [10-07-2017(online)].pdf 2017-07-10
11 201641037789-FORM-26 [01-09-2017(online)].pdf 2017-09-01
12 Correspondence By Agent_Power Of Attorney_04-10-2017.pdf 2017-10-04
13 201641037789-FORM 18 [22-10-2018(online)].pdf 2018-10-22
14 201641037789-Annexure [23-06-2021(online)].pdf 2021-06-23
15 201641037789-CLAIMS [23-06-2021(online)].pdf 2021-06-23
16 201641037789-COMPLETE SPECIFICATION [23-06-2021(online)].pdf 2021-06-23
17 201641037789-CORRESPONDENCE [23-06-2021(online)].pdf 2021-06-23
18 201641037789-DRAWING [23-06-2021(online)].pdf 2021-06-23
19 201641037789-FER_SER_REPLY [23-06-2021(online)].pdf 2021-06-23
20 201641037789-OTHERS [23-06-2021(online)].pdf 2021-06-23
21 201641037789-PETITION UNDER RULE 137 [23-06-2021(online)].pdf 2021-06-23
22 201641037789-FER.pdf 2021-10-17
23 201641037789-US(14)-HearingNotice-(HearingDate-09-02-2024).pdf 2024-01-16
24 201641037789-EDUCATIONAL INSTITUTION(S) [07-02-2024(online)].pdf 2024-02-07
25 201641037789-FORM 13 [07-02-2024(online)].pdf 2024-02-07
26 201641037789-OTHERS [07-02-2024(online)].pdf 2024-02-07
27 201641037789-POA [07-02-2024(online)].pdf 2024-02-07
28 201641037789-RELEVANT DOCUMENTS [07-02-2024(online)].pdf 2024-02-07
29 201641037789-Correspondence to notify the Controller [08-02-2024(online)].pdf 2024-02-08
30 201641037789-Annexure [21-02-2024(online)].pdf 2024-02-21
31 201641037789-Written submissions and relevant documents [21-02-2024(online)].pdf 2024-02-21
32 201641037789-IntimationOfGrant29-02-2024.pdf 2024-02-29
33 201641037789-PatentCertificate29-02-2024.pdf 2024-02-29
34 201641037789-PROOF OF ALTERATION [02-04-2025(online)].pdf 2025-04-02
35 201641037789-OTHERS [12-05-2025(online)].pdf 2025-05-12
36 201641037789-EDUCATIONAL INSTITUTION(S) [12-05-2025(online)].pdf 2025-05-12

Search Strategy

1 2020-12-2213-44-04E_22-12-2020.pdf
2 201641037789AE_15-12-2021.pdf

ERegister / Renewals

3rd: 14 May 2024

From 04/11/2018 - To 04/11/2019

4th: 14 May 2024

From 04/11/2019 - To 04/11/2020

5th: 14 May 2024

From 04/11/2020 - To 04/11/2021

6th: 14 May 2024

From 04/11/2021 - To 04/11/2022

7th: 14 May 2024

From 04/11/2022 - To 04/11/2023

8th: 14 May 2024

From 04/11/2023 - To 04/11/2024

9th: 14 May 2024

From 04/11/2024 - To 04/11/2025