
System And Method To Compute Active Visual Attention Score Using Eye Blink Rate Variability

Abstract: This disclosure relates generally to a system and method to compute an active visual attention score using eye blink rate variability. Visual attention is the cognitive ability to focus on important visual information and filter out unimportant information, and monitoring it is important in the context of human-behavior assessment and human-machine interactions. The method records blink data of a subject using a video recording sensor. A blink threshold is computed from a set of eye aspect ratio values, and a set of uniformly sampled blink rate variability (BRV) series signals is reconstructed from the blink data. The method further determines a pareto frequency feature and a reference pareto frequency feature. A frequency spectrum is constructed from the set of uniformly sampled BRV series signals, indicating behavior of the subject during the systematic execution of each task. Then, an active visual attention score is computed at every task window by using the frequency spectrum. [To be published with FIG. 3]


Patent Information

Application #
Filing Date
24 March 2023
Publication Number
39/2024
Publication Type
INA
Invention Field
BIO-MEDICAL ENGINEERING
Status
Email
Parent Application

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th floor, Nariman point, Mumbai 400021, Maharashtra, India

Inventors

1. KARMAKAR, Somnath
Tata Consultancy Services Limited, Building 1B, Ecospace, Plot - IIF/12, New Town, Rajarhat, Kolkata 700156, West Bengal, India
2. GAVAS, Rahul Dasharath
Tata Consultancy Services Limited, Gopalan Enterprises Pvt Ltd (Global Axis) SEZ "H" Block, No. 152 (Sy No. 147,157 & 158), Hoody Village, EPIP Zone, (II Stage), Whitefield, K.R. Puram Hobli, Bangalore 560066, Karnataka, India
3. CHATTERJEE, Debatri
Tata Consultancy Services Limited, Building 1B, Ecospace, Plot - IIF/12, New Town, Rajarhat, Kolkata 700156, West Bengal, India
4. RAMAKRISHNAN, Ramesh Kumar
Tata Consultancy Services Limited, Gopalan Enterprises Pvt Ltd (Global Axis) SEZ "H" Block, No. 152 (Sy No. 147,157 & 158), Hoody Village, EPIP Zone, (II Stage), Whitefield, K.R. Puram Hobli, Bangalore 560066, Karnataka, India
5. BASARALU SHESHACHALA, Mithun
Tata Consultancy Services Limited, Gopalan Enterprises Pvt Ltd (Global Axis) SEZ "H" Block, No. 152 (Sy No. 147,157 & 158), Hoody Village, EPIP Zone, (II Stage), Whitefield, K.R. Puram Hobli, Bangalore 560066, Karnataka, India
6. VARGHESE, Tince
Tata Consultancy Services Limited, Gopalan Enterprises Pvt Ltd (Global Axis) SEZ "H" Block, No. 152 (Sy No. 147,157 & 158), Hoody Village, EPIP Zone, (II Stage), Whitefield, K.R. Puram Hobli, Bangalore 560066, Karnataka, India
7. PAL, Arpan
Tata Consultancy Services Limited, Building 1B, Ecospace, Plot - IIF/12, New Town, Rajarhat, Kolkata 700156, West Bengal, India

Specification

Description: FORM 2

THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003

COMPLETE SPECIFICATION
(See Section 10 and Rule 13)

Title of invention:

SYSTEM AND METHOD TO COMPUTE ACTIVE VISUAL ATTENTION SCORE USING EYE BLINK RATE VARIABILITY

Applicant

Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India

Preamble to the description:

The following specification particularly describes the invention and the manner in which it is to be performed.
CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
The present application is a patent of addition of Indian Patent Application No. 202021038081, filed on September 03, 2020, the entire content of which is hereby incorporated herein by way of reference.

TECHNICAL FIELD
The disclosure herein generally relates to eye blink rate variability, and, more particularly, to a system and method to compute an active visual attention score using eye blink rate variability.

BACKGROUND
Attention is an important visual function by which important information can be selected and irrelevant information can be filtered out. Multiple brain centers of a user act in conjunction for selecting the most relevant information. Inability to maintain focused attention is an indicator of cognitive dysfunction and hence is treated as an early indicator of cognitive decline. Therefore, assessment of visual attention is important. Conventional methods utilize various markers for assessment of attention. However, blink rate variability (BRV) series signals have not yet been explored for assessment of an active visual attention score.
Visual attention can be measured using various modalities like questionnaires, psychological tests, and physiological sensing. Features such as blink rate and the magnitude of blinks are closely related to attentional control. Eye blinks can be detected from an assortment of physiological sensing, viz. electroencephalogram (EEG), infrared-based nearable eye trackers, wearable eye trackers, electro-oculogram (EOG), electromyogram (EMG) and video cameras. Detection of blinks using a video camera or nearable eye trackers makes it possible to deploy such systems in real-time scenarios as they are cheap and unobtrusive in nature. Active visual attention (AVA) is the cognitive ability to focus on important visual information while responding to a stimulus and is important for human-behavior and psychophysiological research. Existing eye-tracker or camera-based methods are either expensive or impose privacy issues during quantification of AVA.
The concerns and limitations of the conventional art and approaches are addressed in the Applicant's Indian patent application No. 202021038081, filed on 03 September 2020, which provides assessment of visual sustained attention of a target using the BRV series signal. Indian patent application No. 202021038081 determines appropriate frequency regions of BRV signals that can be indicative of visual attention. The blink rate variability (BRV), defined as the variability of the inter-blink durations, is used for analyzing sustained attention for tasks. However, that application does not discuss further additions and refinements that can mitigate privacy issues and improve accuracy by computing an active visual attention score of the target while performing a task. For calculation of the active visual attention score, only personalized thresholds and eye aspect ratio (EAR) values are generated from the face video. Existing approaches are expensive and impose privacy issues while handling the BRV series signal. Moreover, the active visual attention score metric achieves robustness across subjects under various external conditions.

SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a system to compute an active visual attention score using eye blink rate variability is provided. The system is configured to record blink data of a subject using a video recording sensor. The blink data comprises data associated with an eye blink response of the subject during a systematic execution of a set of tasks. A blink threshold is calculated from a set of eye aspect ratio (EAR) values of the blink data. Further, a set of uniformly sampled blink rate variability (BRV) series signals is reconstructed from the blink data. Each BRV series signal from the set of uniformly sampled BRV series signals comprises time series data constructed from intervals between consecutive eye blinks obtained during the systematic execution of the set of tasks. Further, from the set of uniformly sampled BRV series signals, a pareto frequency feature and a reference pareto frequency feature are determined, wherein the reference pareto frequency feature is determined from the pareto frequency feature. The pareto frequency feature comprises a frequency range within which a specific percentage of cumulative power of a frequency spectrum of the set of uniformly sampled BRV series signals resides. The frequency spectrum, constructed for the set of uniformly sampled BRV series signals, indicates behavior of the subject during the systematic execution of the set of tasks and is evaluated at every task window to compute a current pareto frequency feature. Then, an active visual attention score is computed at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and a state of the subject.
In another aspect, a method to compute an active visual attention score using eye blink rate variability is provided. The method includes recording blink data of a subject using a video recording sensor. The blink data comprises data associated with an eye blink response of the subject during a systematic execution of a set of tasks. A blink threshold is calculated from a set of eye aspect ratio (EAR) values of the blink data. Further, a set of uniformly sampled blink rate variability (BRV) series signals is reconstructed from the blink data. Each BRV series signal from the set of uniformly sampled BRV series signals comprises time series data constructed from intervals between consecutive eye blinks obtained during the systematic execution of the set of tasks. Further, from the set of uniformly sampled BRV series signals, a pareto frequency feature and a reference pareto frequency feature are determined, wherein the reference pareto frequency feature is determined from the pareto frequency feature. The pareto frequency feature comprises a frequency range within which a specific percentage of cumulative power of a frequency spectrum of the set of uniformly sampled BRV series signals resides. The frequency spectrum, constructed for the set of uniformly sampled BRV series signals, indicates behavior of the subject during the systematic execution of the set of tasks and is evaluated at every task window to compute a current pareto frequency feature. Then, an active visual attention score is computed at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and a state of the subject.
In yet another aspect, a non-transitory computer readable medium storing instructions which, when executed by one or more hardware processors, cause recording blink data of a subject using a video recording sensor is provided. The blink data comprises data associated with an eye blink response of the subject during a systematic execution of a set of tasks. A blink threshold is calculated from a set of eye aspect ratio (EAR) values of the blink data. Further, a set of uniformly sampled blink rate variability (BRV) series signals is reconstructed from the blink data. Each BRV series signal from the set of uniformly sampled BRV series signals comprises time series data constructed from intervals between consecutive eye blinks obtained during the systematic execution of the set of tasks. Further, from the set of uniformly sampled BRV series signals, a pareto frequency feature and a reference pareto frequency feature are determined, wherein the reference pareto frequency feature is determined from the pareto frequency feature. The pareto frequency feature comprises a frequency range within which a specific percentage of cumulative power of a frequency spectrum of the set of uniformly sampled BRV series signals resides. The frequency spectrum, constructed for the set of uniformly sampled BRV series signals, indicates behavior of the subject during the systematic execution of the set of tasks and is evaluated at every task window to compute a current pareto frequency feature. Then, an active visual attention score is computed at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and a state of the subject.
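The pareto frequency feature described above, i.e., the frequency up to which a specified percentage of the cumulative spectral power resides, can be sketched as follows. This is a minimal illustration assuming a discrete one-sided power spectrum; the function name, inputs, and the 90% cutoff used in the example are illustrative assumptions, not values taken from the specification.

```python
def pareto_frequency(freqs, power, percent=95.0):
    """Return the frequency below which `percent` of the total
    power of a one-sided power spectrum resides."""
    total = sum(power)
    target = total * percent / 100.0
    cumulative = 0.0
    for f, p in zip(freqs, power):
        cumulative += p
        if cumulative >= target:
            return f
    return freqs[-1]

# Example: most power concentrated in the low-frequency bins
freqs = [0.01, 0.02, 0.03, 0.04, 0.05]
power = [5.0, 3.0, 1.0, 0.5, 0.5]
print(pareto_frequency(freqs, power, percent=90.0))  # 0.03
```

A reference pareto frequency could then be taken from a baseline window and compared against the current pareto frequency at every task window, mirroring the per-window computation stated above.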
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
FIG.1 illustrates a block diagram of a system to compute active visual attention (AVA) score using eye blink rate variability (BRV) and mental state analysis, in accordance with some embodiments of the present disclosure.
FIG.2 is a functional block diagram illustrating a method to compute active visual attention score from the eye blink rate variability of the subject, in accordance with some embodiments of the present disclosure.
FIG.3 is a flow diagram of a method for computing active visual attention score from the eye blink rate variability of the subject, in accordance with some embodiments of the present disclosure.
FIG.4A through FIG.4D illustrate a set of tasks being performed by the subject for computing the active visual attention score from the eye blink rate variability, in accordance with some embodiments of the present disclosure.
FIG.5A through FIG.5D illustrate BRV series signal reconstruction from the eye aspect ratio (EAR) signal, involving detected blinks in time (FIG.5A), the constructed binary blink/no-blink signal (FIG.5B), the interpolated BRV signal (FIG.5C), and the spectrum of the signal (FIG.5D), respectively, in accordance with some embodiments of the present disclosure.
FIG.6A and FIG.6B illustrate zoomed frequency spectra of the BRV signal for a low attention score and a high attention score of the subject observed while performing the set of tasks using the system of FIG.1, in accordance with some embodiments of the present disclosure.
FIG.7A illustrates pareto frequencies corresponding to different ground truth feedback scores for the set of tasks using the system of FIG.1, in accordance with some embodiments of the present disclosure.
FIG.7B illustrates experimental simulation of the active visual attention score for different no-attention duration values of the subject using the system of FIG.1, in accordance with some embodiments of the present disclosure.
FIG.8A and FIG.8B illustrate distribution of feedback score and attention score computed for the set of tasks using the system of FIG.1, in accordance with some embodiments of the present disclosure.
FIG.9A illustrates plot of example active visual attention score of the subject at every predefined time for the set of tasks using the system of FIG.1, in accordance with some embodiments of the present disclosure.
FIG.9B illustrates plot of example active visual attention score of the subject for the set of tasks considering various external conditions using the system of FIG.1, in accordance with some embodiments of the present disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
Visual attention is a cognitive ability that helps users to focus on important visual information and filter out unimportant information. Eye trackers are most widely used for detection of blinks. Such eye trackers are very expensive and hence not suitable for applications that require mass deployment. Another widely adopted approach uses cameras or webcams for assessment of visual attention. There are three main types of visual attention: (i) spatial attention, which involves focusing on a particular location in visual space, (ii) feature-based attention, which involves focusing on a particular feature of the scene, and (iii) object-based attention, in which attention is influenced or guided by object structure. Each of these types can further be categorized as covert or overt. In overt visual attention, the stimulus physically directs the eyes, as in searching for a target item. On the other hand, covert attention deals with mentally shifting the attention without physically shifting the eyes. While engaged in an overt attention task, if the subject has to respond to the stimulus, it is termed active overt spatial attention. The present disclosure mainly focuses on active overt spatial attention, since active attention is applicable only to the overt spatial scenario.
The Applicant has addressed concerns and limitations in the art in the Applicant's Indian patent application No. 202021038081, filed on 03 September 2020, by providing assessment of visual sustained attention of a target using the BRV series signal. Indian patent application No. 202021038081 focuses only on determining appropriate frequency regions of BRV signals that can be indicative of visual attention. The blink rate variability (BRV), defined as the variability of the inter-blink durations, is used for analyzing sustained attention for tasks. Indian patent application No. 202021038081 identifies the eye blink positions in the gaze data, wherein a point in the gaze data is treated as an eye blink upon determination of an occurrence of a threshold number of consecutive zeros and missing data points in the gaze data.
The embodiments provide further additions and refinements, not discussed in Indian patent application No. 202021038081, for determining visual sustained attention of the target by systematically recording eye movements and eyelid movements during the set of tasks. For any human-computer interface application, the measurement needs to be continuous, easy to implement, and real-time using existing or frugal sensors. The hypothesized blink rate and its coherence is a key measure of the active visual attention score. The method mitigates privacy issues and improves the accuracy of the active visual attention score of the subject while performing each task. The method of the present disclosure provides inexpensive, efficient, and continuous quantification of the active visual attention score based on a personalized blink threshold. The active visual attention score is computed only from personalized thresholds and a set of eye aspect ratio (EAR) signals generated from a set of video frames capturing a face of the subject. Moreover, the active visual attention score achieves robustness across subjects under various external conditions comprising ambient light conditions, head pose, and occlusions such as spectacles. The disclosed system is further explained with the method as described in conjunction with FIG.1 to FIG.9B below.
Referring now to the drawings, and more particularly to FIG. 1 through FIG.9B, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
FIG. 1 illustrates a block diagram of a system to compute active visual attention (AVA) score using eye blink rate variability (BRV) and mental state analysis, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 includes processor(s) 104, communication interface(s), alternatively referred to as input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the processor(s) 104. The system 100, with the processor(s), is configured to execute functions of one or more functional blocks of the system 100.
Referring to the components of the system 100, in an embodiment, the processor (s) 104 can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) 104 is configured to fetch and execute computer-readable instructions stored in the memory. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud, and the like.
The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting a number of devices (nodes) of the system 100 to one another or to another server.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. Functions of the components of the system 100, for computing the active visual attention score, are explained in conjunction with FIG.2 through FIG.9B providing the flow diagram, architectural overviews, and performance analysis of the system 100.
FIG. 2 is a functional block diagram illustrating a method to compute active visual attention score from the eye blink rate variability of the subject, in accordance with some embodiments of the present disclosure. The system 200 may be an example of the system 100 (FIG. 1). In an example embodiment, the system 200 may be embodied in, or is in direct communication with the system, for example the system 100 (FIG. 1). Many factors affect the focus of the subject engaged in a set of tasks. Different subjects are distracted with different outside influences depending on the task at hand and often the subject will not recognize factors that are degrading his or her ability to perform and complete the task. In an embodiment, the system 200 includes a calibration module 202, an EAR estimation module 204, a blink threshold generator 206 and a scoring module 208. The system 200 receives a set of video frames from a user interface. The set of video frames are processed using the system 200 to compute active visual attention score from the eye blink rate variability of the subject in two phases. The first phase includes processing of the calibration module 202, the EAR estimation module 204 and the blink threshold generator 206. The second phase includes the scoring module 208 which processes the output of the first phase.
In the first phase, the calibration module 202 of the system 200 comprises an eye contour detector 202A, a brightness contrast adjustor 202B, a head pose detector 202C and calibration data 202D. The calibration module 202 performs initial calibration of the subject by checking ambience, lighting conditions and various external factors for smooth execution of the task being performed by the subject. The eye contour detector 202A detects the eye contour on the face of the subject. The brightness contrast adjustor 202B adjusts the brightness of the location where the subject is positioned to perform each task.
The head pose detector 202C detects the head pose of the subject, ensuring the subject's proper visual alignment with the screen or application screen prepared for performing the task. Head pose is the relative orientation and position of the head with respect to a camera, represented by yaw, pitch and roll angles along the three axes of the camera coordinate system (CCS), respectively. Head pose is estimated by solving the perspective point problem of a calibrated video recording sensor (using the known camera intrinsic parameters) on a set of 3D points in the coordinate system and their corresponding 2D projections in the image plane. The yaw angle ψ and pitch angle θ measured during the head pose calibration are called the reference angles ψ_r and θ_r. These angles deviate from the reference values when the subject looks towards left or right and up or down. It is to be noted that experiments were conducted to derive thresholds for both the yaw angle ψ and pitch angle θ for a standard 14-inch laptop screen. It is observed that more than 25° deviation in the ψ angle and more than 15° deviation in the θ angle from the base position angles can be observed if the participant is not looking towards the screen. Hence, these deviations are taken as the yaw angle threshold ψ_t and pitch angle threshold θ_t for deciding whether the participant is looking at the screen. The yaw angle ψ_i and pitch angle θ_i of every video frame i are compared with the reference yaw angle ψ_r and pitch angle θ_r to determine the gaze-on-screen and gaze-outside-screen conditions with respect to the head pose, as represented in equation 1,
Gaze state = {Gaze outside the screen, if (|ψ_i − ψ_r| > ψ_t) || (|θ_i − θ_r| > θ_t)
             {Gaze on the screen, otherwise
-----------------equation 1
where the thresholds are ψ_t = 25° and θ_t = 15°.
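Equation 1 can be sketched in code as below. The 25° and 15° defaults follow the thresholds stated above; the function and parameter names are illustrative, not from the specification.

```python
def gaze_state(yaw_i, pitch_i, yaw_ref, pitch_ref,
               yaw_thresh=25.0, pitch_thresh=15.0):
    """Classify gaze per equation 1: the gaze is outside the screen when
    either the yaw or the pitch deviation exceeds its threshold."""
    if (abs(yaw_i - yaw_ref) > yaw_thresh
            or abs(pitch_i - pitch_ref) > pitch_thresh):
        return "gaze outside the screen"
    return "gaze on the screen"

print(gaze_state(30.0, 5.0, 0.0, 0.0))  # yaw deviates by 30 degrees > 25
print(gaze_state(10.0, 5.0, 0.0, 0.0))  # both deviations within limits
```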
The EAR estimation module 204 of the system 200 includes a face detector 204A, a facial landmark detector 204B, an EAR calculation 204C and a DC offset correction 204D. The face detector 204A detects the face position of the subject. The facial landmark detector 204B detects the facial landmarks of the subject. The EAR calculation 204C calculates the set of eye aspect ratio values from the face captured in each video frame. The DC offset correction 204D removes the non-stationary baseline of the EAR signal. The blink threshold generator 206 of the system 200 computes the blink threshold specific to the subject based on historical experimental data.
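The DC offset correction 204D can be illustrated with a simple centered moving-average detrend. The specification only states that the non-stationary baseline of the EAR signal is removed, so the window length and the moving-average approach itself are illustrative assumptions.

```python
def remove_dc_offset(signal, window=5):
    """Subtract a centered moving-average baseline from the signal to
    suppress its slowly varying (non-stationary) DC component."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        baseline = sum(signal[lo:hi]) / (hi - lo)
        out.append(signal[i] - baseline)
    return out

ear = [0.30, 0.31, 0.12, 0.30, 0.31, 0.30]  # a blink dip around index 2
print([round(v, 3) for v in remove_dc_offset(ear, window=3)])
```

After detrending, the blink dip stands out as a clearly negative excursion, which simplifies thresholding against the personalized blink threshold.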
In the second phase, the attention scoring module 208 of the system 200 comprises an eye blink detector 208A, a blink/no-blink signal detector 208B, a BRV signal constructor 208C and an attention score module 208D. The eye blink detector 208A detects eye blinks of the subject from the set of video frames while performing the set of tasks. The blink/no-blink signal detector 208B detects frames corresponding to the cases when the subject is not looking at the computer screen. The BRV signal constructor 208C constructs the BRV signal from the EAR signal at every window of each video frame. The task window is preset at a determined interval. The attention score module 208D computes the active visual attention score for the set of uniformly sampled BRV series signals.
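The reconstruction performed by the BRV signal constructor can be sketched as follows: inter-blink intervals are formed from consecutive blink timestamps and then resampled on a uniform time grid by linear interpolation, consistent with the interpolated BRV signal shown in FIG.5C. The resampling rate and all names below are illustrative assumptions, not values from the specification.

```python
def uniform_brv(blink_times, fs=1.0):
    """Build a BRV series (inter-blink intervals, in seconds) from blink
    timestamps and linearly resample it on a uniform grid at `fs` Hz."""
    # Inter-blink intervals, each placed at the time of the later blink
    ibi = [b - a for a, b in zip(blink_times, blink_times[1:])]
    t = blink_times[1:]
    if len(ibi) < 2:          # too few blinks to interpolate
        return t, ibi
    # Uniform time grid spanning the interval series
    n = int((t[-1] - t[0]) * fs) + 1
    grid = [t[0] + k / fs for k in range(n)]
    # Piecewise-linear interpolation onto the grid
    out, j = [], 0
    for g in grid:
        while j < len(t) - 2 and t[j + 1] < g:
            j += 1
        t0, t1, y0, y1 = t[j], t[j + 1], ibi[j], ibi[j + 1]
        out.append(y0 + (y1 - y0) * (g - t0) / (t1 - t0))
    return grid, out

blinks = [0.0, 1.2, 2.0, 4.5, 5.0]  # blink timestamps in seconds
grid, brv = uniform_brv(blinks, fs=2.0)
print(len(grid), round(brv[0], 2))
```

The uniformly sampled series can then be fed to a standard spectral estimator to obtain the frequency spectrum used for the attention score.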
FIG.3 is a flow diagram of a method for computing active visual attention score from the eye blink rate variability of the subject, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 300 by the processor(s) or one or more hardware processors 104. The steps of the method 300 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG.2 through FIG.9B, and the steps of flow diagram as depicted in FIG.3. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps to be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
Referring now to the steps of the method 300, at step 302, the one or more hardware processors 104 record blink data of a subject using a video recording sensor, the blink data comprising data associated with an eye blink response of the subject during a systematic execution of a set of tasks. In an embodiment, gaze data may be recorded using the video recording sensor. For example, the video recording sensor may be at least one of an infrared based eye tracker, an RGB camera, an IR eye tracker, and a camera-based eye tracker. The example below assumes a scenario where the subject receives a set of tasks to be performed using the system 200. The set of tasks may be, for example, a counting game, a track-the-light task, an odd-one-out task, a spot-the-difference task, and the like. For each task being performed by the subject, the system 200 obtains data to compute the active visual attention score based on the subject's attention.
The calibration module 202 initializes with an initial screen on which each subject, being a participant, enters a unique masked user identifier (ID) and associated demographic data. The eye contour detector 202A of the calibration module 202 detects the eye contour of the subject's face, which is drawn using a dlib (digital library) shape predictor (a tool known in the art).
The brightness contrast adjustor 202B checks ambient conditions of the scenario where the subject is positioned to perform each task. Here, the subject is requested to adjust the video recording camera position and lighting to ensure that the eyes are detected properly. If the eyes of the subject are not properly visible, the brightness and contrast levels may be adjusted accordingly using a simple slider provided in the application screen. Such adjustments may be performed by the subject as known in the art. Along with this, metadata such as whether the subject is wearing glasses and whether the data collection is done under natural or artificial lighting conditions is also collected.
The head pose detector 202C of the calibration module 202 performs head pose calibration of each subject, indicating whether the subject is looking into the screen or out of the screen. Here, the application screen is shown to the subject, and the subject ensures that the camera is at eye level and then aligns their head to the center of the box by manually adjusting the camera. At that position, the head pose angles yaw (ψ) and pitch (θ) are estimated and stored as the base head pose angles.
External factors like ambient lighting conditions, distance from the camera and usage of spectacles might affect the accuracy of blink detection across participating subjects. To enable personalized blink detection, an initial calibration is performed during which a white screen appears with a black fixation cross ('+') at the center. A beep sound (for example, 2500 Hz of 500 ms duration) is played, and the subject is supposed to fixate at the fixation cross and blink as soon as the subject hears the beep sound. Here, one or more such beeps may be used to prompt the subject (e.g., m beeps, where the value of m is 5).
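One plausible way a personalized threshold could be derived from the m calibration blinks is sketched below. The specification does not give the exact formula, so the midpoint rule used here (halfway between the open-eye EAR baseline and the mean EAR minimum during the intentional blinks) is purely an illustrative assumption, as are all names.

```python
def personalized_blink_threshold(open_ear, blink_minima):
    """Illustrative threshold: midpoint between the median open-eye EAR
    and the mean of the EAR minima recorded at the calibration beeps."""
    open_sorted = sorted(open_ear)
    baseline = open_sorted[len(open_sorted) // 2]        # median open-eye EAR
    blink_level = sum(blink_minima) / len(blink_minima)  # mean closed-eye EAR
    return (baseline + blink_level) / 2.0

open_ear = [0.30, 0.31, 0.29, 0.32, 0.30]      # EAR while eyes are open
blink_minima = [0.10, 0.12, 0.11, 0.09, 0.13]  # EAR minima at the 5 beeps
print(personalized_blink_threshold(open_ear, blink_minima))
```

Any frame whose (DC-corrected) EAR drops below this subject-specific value would then be flagged as part of a blink.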
Experimental data were collected from 52 subjects who were screened for mental and physical health using the SF12 questionnaire (known in the art) providing a measure of quality of life. The calibration data 202D are collated after completion of the initial setup, after which the actual data collection process is started. Initially, the subjects were asked to rest for a duration of about two minutes while looking at the screen. This duration is treated as the baseline period. Further, the subjects performed four tasks, where each task spans a duration of about two minutes. Subjects could finish each trial of these tasks at their own pace, and after submitting the response for each trial in each of the given tasks, a new trial is presented. The trials keep appearing till the total session lasts for about two minutes. The sequence of tasks is kept uniform across subjects. Thus, the data collection session lasts for about eight minutes (4 tasks × 2 minutes). After finishing the session, each subject provides feedback about the level of visual attention experienced while performing each task.
Referring now to FIG.4A through FIG.4D, which illustrate a set of tasks being performed by the subject for computing the attention score from the eye blink rate variability. The set of tasks used in the system may be, for example, a counting game (Task 1), where the subject is asked to count the number of occurrences of a target entity and enter the count in the textbox.
Task 2 may be, for example, track the light, where this task may consist of an M × N grid based screen with all black grids. Any black grid turns red randomly, and the subject is instructed to observe it as early as possible and click on the red grid. The gap between two trials is five seconds. Each red grid remains in that state for a duration of three seconds before turning black in color.
Task 3 may be, for example, odd one out, where these trials consist of several images placed in grid form and one of them is the odd one. The subject is supposed to identify that image and type the id of that image in the textbox provided at the bottom of the screen.
Task 4 may be, for example, spot the difference, where a set of two adjacent, similar looking images with six differences is presented, and the participants are expected to identify the differences and click on those. These tasks are classic paradigms in the field of visual attention and conceptually characterize the visual attention of the subject concurrently.
Referring now to the steps of the method 300, at step 304, the one or more hardware processors 104 calculate a blink threshold from a set of eye aspect ratio (EAR) values of the blink data. The blink threshold generator 206 generates the blink threshold specific to the participating subject, which can be used for detecting eye blinks from the experimental data. Here, the set of eye aspect ratios is calculated from the calibration data 202D obtained from the calibration module 202 of the system 200. The set of video frames having the face image of the subject is acquired through the video recorder during calibration, which is used further to detect the face landmarks using the dlib (digital library) shape predictor. Here, a total of 68 facial landmarks are given by the dlib shape predictor, of which the ones corresponding to the eyes (P_1-P_6) are considered for further evaluation. Further, the eye aspect ratio (EAR_i) is computed as represented in equation 2,
EAR_i = (|P_(2_i) - P_(6_i)| + |P_(3_i) - P_(5_i)|) / (2 * |P_(1_i) - P_(4_i)|) --- equation 2
where P_(j_i) corresponds to the j-th facial landmark (among the 68 landmarks from dlib) for a given frame i ∈ {1, 2, 3, ..., N} image frames.
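The EAR of equation 2 can be sketched as below; a minimal illustration assuming each eye is given as a (6, 2) array of the landmark coordinates P_1..P_6 (the helper name is not from the disclosure):

```python
import numpy as np

def eye_aspect_ratio(pts):
    """EAR per equation 2: (|P2 - P6| + |P3 - P5|) / (2 * |P1 - P4|),
    where pts is a (6, 2) array of the eye landmarks P1..P6."""
    pts = np.asarray(pts, dtype=float)
    vertical = np.linalg.norm(pts[1] - pts[5]) + np.linalg.norm(pts[2] - pts[4])
    horizontal = np.linalg.norm(pts[0] - pts[3])
    return vertical / (2.0 * horizontal)
```

For dlib's 68-point shape predictor, landmark indices 36-41 and 42-47 correspond to the two eyes; the EAR drops sharply when the eye closes, which is what makes it usable for blink detection.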
Referring now to the steps of the method 300, at step 306, the one or more hardware processors 104 reconstruct from the blink data a set of uniformly sampled blink rate variability (BRV) series signals, where each BRV series signal from the set of uniformly sampled BRV series signals comprises a time series data constructed from intervals between consecutive eye blinks obtained during the systematic execution of the set of tasks.
Upon starting each task, the head pose of the subject categorizes the subject state into at least one of a no-visual attention duration, a no-data duration, and a visual attention duration. The no-visual attention duration indicates a change in the location of the subject's eyes within a screen. The no-data duration indicates an undetected head pose of the subject. The visual attention duration indicates attentiveness of the eyes of the subject towards the screen. The blink signals utilized are directly obtained from the gaze data recorded using the RGB camera. Here, the blink information is computed using the eye images acquired using the RGB camera. The video data recorded during the task interval is sub-divided into windows of predefined duration, which may be, for example, two minutes. Each window of data is processed to generate the EAR signal. The EAR values are computed for both the right and left eyes separately, and the average of the two is considered for further analysis. The set of EAR values computed for each image frame forms the EAR signal (FIG.5A). The peaks are detected as blinks (considering a threshold of 100 ms) as shown in the plot.
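The blink extraction step can be sketched as a run-length scan over the EAR series; the function name and the 3-frame minimum closure at 30 Hz (≈100 ms) are illustrative assumptions:

```python
import numpy as np

def detect_blinks(ear, tb, fs=30.0, min_dur=0.1):
    """Blink onsets as frame indices: contiguous runs with EAR below the
    personalized threshold tb lasting at least min_dur seconds
    (100 ms, i.e. about 3 frames at a 30 Hz camera)."""
    below = np.asarray(ear) < tb
    onsets, i, n = [], 0, len(below)
    while i < n:
        if below[i]:
            j = i
            while j < n and below[j]:
                j += 1                        # extend the below-threshold run
            if (j - i) / fs >= min_dur:        # reject too-short dips as noise
                onsets.append(i)
            i = j
        else:
            i += 1
    return onsets
```

A single-frame dip in the EAR series is rejected as noise, while a run of four or more frames at 30 Hz (≥ 133 ms) is counted as a blink.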
Referring now to the plot of FIG.5B, the binary blink-no blink signal is constructed from the EAR signal using the personalized threshold (Tb) obtained. Assuming the blinks occur at time instances d_n, where n ∈ {0, 1, 2, ..., N-1} and N is the number of frames in the video data captured, the gaps between blinks are used to construct the signal β given by equation 3,
β[i_n] = d_(n+1) - d_n -------------- equation 3
where,
i_n = C(d_(n+1) + d_n)/2 --------- equation 4
and C is a constant oversampling factor, with a value of 10 selected empirically. Oversampling was performed in order to have uniformly sampled intervals in the signal β. Here, a cubic model was used to interpolate the signal β[i_n] to obtain the set of uniformly sampled BRV series signals sampled at C × sampling rate, given by β[i], i ∈ {0, 1, 2, ..., I-1} and I ≈ C·d_(N-1). The sampling rate of the RGB camera used is 30 Hz.
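The reconstruction of equations 3 and 4 can be sketched with SciPy's cubic interpolation; the grid construction and function name are assumptions, and the symbol β follows the description above:

```python
import numpy as np
from scipy.interpolate import interp1d

def reconstruct_brv(d, c=10):
    """Uniform BRV series from blink instants d_n (frame indices):
    gaps beta[i_n] = d_{n+1} - d_n placed at midpoints
    i_n = c * (d_{n+1} + d_n) / 2 (equations 3-4), then
    cubic-interpolated onto a uniform grid."""
    d = np.asarray(d, dtype=float)
    gaps = np.diff(d)                      # inter-blink intervals (eq. 3)
    i_n = c * (d[1:] + d[:-1]) / 2.0       # oversampled midpoints (eq. 4)
    grid = np.arange(np.ceil(i_n[0]), np.floor(i_n[-1]) + 1.0)
    beta = interp1d(i_n, gaps, kind="cubic")(grid)
    return grid, beta
```

Cubic interpolation needs at least four blink gaps per window, which the two-minute window duration is chosen to guarantee.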
Referring now to FIG.5C, the interpolated signal is shown with the cross marks corresponding to β[i_n]. Thus, the inter-blink variations were used to construct the set of uniformly sampled BRV series signals. This BRV signal was carefully analyzed for rest versus attention inducing tasks, and it was found by the system 100 that, in the latter case, the gaps in blinks tend to become constant. This signal in the frequency domain aided in clearly distinguishing the attention versus no attention states.
FIG.5D shows the frequency spectrum of the signal β during attention, where the signals mostly occupy a small frequency range (0.05 Hz), in accordance with some embodiments of the present disclosure. It is also observed that the signal spreads across a wider spectrum for no attention (or low attention) levels.
Referring now to the steps of the method 300, at step 308, the one or more hardware processors 104 determine from the set of uniformly sampled BRV series signals, a pareto frequency feature and a reference pareto frequency feature. The reference pareto frequency feature is determined from the pareto frequency feature. The pareto frequency feature comprises a frequency range within which a specific percentage of cumulative power of a frequency spectrum of the set of uniformly sampled BRV series signals resides. The specific percentage of cumulative power may be for example 80%.
Referring now to the steps of the method 300, at step 310, the one or more hardware processors 104 construct a frequency spectrum for the set of uniformly sampled BRV series signals that indicates the behavior of the subject during the systematic execution of the set of tasks at every task window to compute a current pareto frequency feature. The frequency spectrum from the set of uniformly sampled BRV series signals is determined at every task window.
FIG.6A and FIG.6B show the frequency spectrum of β for low attention (baseline period) and for high attention (task duration) data for a sample participant, in accordance with some embodiments of the present disclosure. The active visual attention score is computed based on the pareto principle, which states that 80% of the effects come from 20% of the causes. Blinks become more coherent as the inter-blink interval increases, and the pareto principle identifies the dominating frequency of blinks. Further, the frequency up to which the cumulative sum of powers reaches 80% of the total power in the frequency spectrum is computed. The feature thus obtained is termed the pareto frequency feature O.
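The pareto frequency feature O can be sketched as the frequency below which a given percentage (80% by default) of the cumulative spectral power of the BRV signal resides; the FFT-based spectrum and mean removal are illustrative choices:

```python
import numpy as np

def pareto_frequency(beta, fs, pct=0.8):
    """Frequency below which pct of the cumulative power of the
    (mean-removed) BRV signal beta lies, from a one-sided FFT spectrum."""
    beta = np.asarray(beta, dtype=float)
    beta = beta - beta.mean()                       # drop the DC component
    power = np.abs(np.fft.rfft(beta)) ** 2
    freqs = np.fft.rfftfreq(len(beta), d=1.0 / fs)
    csum = np.cumsum(power)
    idx = int(np.searchsorted(csum, pct * csum[-1]))
    return float(freqs[min(idx, len(freqs) - 1)])
```

A narrow spectrum (attention) yields a small O, while a spread spectrum (low attention) yields a larger O, consistent with FIG.5D.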
Referring now to the steps of the method 300, at step 312, the one or more hardware processors 104 compute an active visual attention score at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and a state of the subject.
The active visual attention score for the set of uniformly sampled BRV series signals is continually computed to assess the visual attention of the subject during the systematic execution of the set of tasks. The active visual attention score is computed using the reference pareto frequency feature and the subject state by initially obtaining the set of uniformly sampled BRV series signals, the no-visual attention duration, the visual attention duration and the pareto frequency feature. Further, a deviation value between the pareto frequency feature and the reference pareto frequency feature is determined. The active visual attention score is computed based on the deviation value, the visual attention duration, and the no-visual attention duration.
Referring now to FIG.7A, which illustrates the pareto frequencies corresponding to different ground truth feedback scores for the set of tasks, in accordance with some embodiments of the present disclosure. Since the visual attention duration versus no-visual attention duration demarcation is found to be centered around the O value of 0.05 Hz, this value is taken as the reference pareto frequency feature O_r. The pareto frequency computed per j-th task window, O_j, is used to compute the active visual attention score F_j as depicted in equation 4,
F_j=100*O_r/O_j ----------equation 4
The active visual attention score F_j expects the participant to always look into the computer screen for the entire time window under test. The task window duration is about two minutes (in order to have a sufficient number of blinks).
However, in practical scenarios the subjects may not be looking onto the screen throughout the window. Thus, in terms of visual focus on the screen, three scenarios have been formulated, comprising a no-visual attention duration (NAD), a no-data duration (NDD) and a visual attention duration (AD).
The no-visual attention duration (NAD) corresponds to the frames in which the subject is not looking at the computer screen. This is ascertained by comparing the head pose with the baseline head pose determined during the initial calibration.
The no-data duration (NDD) corresponds to the frames with an undetectable face due to the subject looking sideways or out of camera range. These video frames are completely discarded from further analysis as only visual attention is considered.
The visual attention duration (AD) corresponds to the frames where the subject is attending to the screen of the app. Thus, AD = (100 - NAD), with both durations expressed as percentages of the window. Considering these cases, the active visual attention score for the j-th task window can be modified as in equation 5,
F_j=(100-NAD)*O_r/O_j ----------equation 5
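Equation 5 can be sketched directly; treating NAD as a percentage of the window and clipping the result to the reported 0-100 range are assumptions:

```python
def ava_score(o_ref, o_j, nad):
    """Active visual attention score per equation 5:
    F_j = (100 - NAD) * O_r / O_j, with NAD in percent of the window
    and O_r the reference pareto frequency feature (0.05 Hz above)."""
    score = (100.0 - nad) * o_ref / o_j
    return max(0.0, min(100.0, score))    # keep within the 0-100 range
```

With full screen attention (NAD = 0) and O_j at the reference value, the score is 100; doubling O_j or halving the attended fraction halves it.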
FIG.7B illustrates an experimental simulation of the active visual attention score for different no-visual attention duration values of the subject using the system of FIG.1, in accordance with some embodiments of the present disclosure. Since the NAD acts as the major attribute in the overall window duration, its impact on the computed active visual attention score is studied through simulations. From the simulations it is observed that when the NAD value goes above 50, the active visual attention score quickly tapers to zero. Hence, NAD = 50 has been considered as the threshold. The visual attention score has been calculated using camera input in the range of 0-100, where 0 indicates no visual attention and 100 indicates high attention. It is to be noted that the EAR signal is generated frame-wise, whereas the baseline correction, the BRV calculation and finally the attention score calculation have been performed window-wise by considering all the frames within the window.
In one embodiment, the efficacy of the active visual attention score is tested by comparing it with the perceived attention level of the subject. After executing the tasks, the participants rated each of these tasks in terms of attention level on a scale of 1 to 4, where 1 represents minimum attention and 4 represents maximum attention. This subjective feedback score is considered as the ground truth of attention.
FIG.8A and FIG.8B illustrate the distribution of the feedback score and the attention score computed for the set of tasks using the system of FIG.1, in accordance with some embodiments of the present disclosure. For each subject, the attention score is calculated using equation 5. After calculating the AVA scores, outlier rejection is performed and the outliers are analyzed across each task. A data point was treated as an outlier if it deviated from the mean by more than 2 SD. In any task, if the outliers were significantly more than 5%, they were removed, resulting in 5 out of 120 data points being removed from further analysis. The AVA scores for Task 1 (mean±SD: 90.59±17.82), Task 2 (mean±SD: 52.98±27.51), Task 3 (mean±SD: 92.16±14.36) and Task 4 (mean±SD: 93.07±12.82) match with the feedback scores and the effective task duration. The patterns formed by connecting the means of each of the tasks in the 3 measures are similar to each other. Both the ground truth feedback score and the active visual attention score (F) follow the trend of visual attention demand for the 4 tasks, {Task 2 < Task 1 < Task 3 < Task 4}, in increasing order of attention requirement.
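The mean ± 2 SD outlier rejection described above can be sketched as follows (the vectorized mask and the use of the population standard deviation are assumptions):

```python
import numpy as np

def reject_outliers(scores, k=2.0):
    """Drop AVA scores farther than k standard deviations from the mean."""
    scores = np.asarray(scores, dtype=float)
    mu, sd = scores.mean(), scores.std()
    return scores[np.abs(scores - mu) <= k * sd]
```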
FIG.9A illustrates a plot of a few example attention scores at every predefined time interval for the set of tasks using the system of FIG.1, in accordance with some embodiments of the present disclosure. The validity of the continuous AVA score has been assessed using the feedback scores. The feedback scores do not give a direct indication of one’s attention levels; moreover, they are post-facto, discrete measures and/or subjective in nature. This creates the need for an objective and continuous quantified measure of visual attention. The active visual attention score is capable of providing continuous information of one’s visual attention as shown in FIG.9A. The attention score is aggregated considering all the participants and all the tasks taken together for each participant, which accounted for data of 2 minutes (baseline duration) along with 8 minutes (2 minutes × 4 tasks), totaling 10 minutes. The attention score is computed on a 2 minute window length with an overlap of 105 seconds in order to have continuous scores at every 15 seconds. The value of 15 seconds is chosen in order to have a continuous measure of attention level, as this provides a sufficiently large overlapping window duration of 105 seconds. Analogous to the ground truth feedback scores, the trend of this plot shows a sharp downward jump for Task 2 and is high for the rest of the tasks, aligning with the trend observed in the previous section. The method of the present disclosure provides an accurate continuous attention measure which is a very good representation of active visual attention at any given instance of time.
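The overlapping-window schedule described above (2 minute windows with a 105 s overlap, hence a new score every 15 s) can be sketched as (helper name illustrative):

```python
def window_starts(total_s, win_s=120, overlap_s=105):
    """Start times (in seconds) of overlapping analysis windows:
    a 120 s window with a 105 s overlap gives a 15 s hop."""
    hop = win_s - overlap_s
    return list(range(0, int(total_s) - win_s + 1, hop))
```

For the 10 minute session described above, this yields a score every 15 seconds from 0 s to 480 s.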
FIG.9B illustrates a plot of a few example attention scores of the subject for the set of tasks considering various external conditions using the system of FIG.1, in accordance with some embodiments of the present disclosure. The method of the present disclosure computes the attention score under conditions encountered when deploying the solution in real life scenarios, such as an enterprise context, occlusion primarily due to spectacles, and varying ambient light conditions.
The effect of occlusion is assessed by collecting data from subjects using spectacles, and the results are compared with those from subjects who do not use spectacles. The active visual attention scores obtained for these two conditions are observed to be similar in both cases (with p > 0.05). The effect of ambient lighting conditions is assessed by collecting data under natural as well as artificial lighting conditions; on comparing the corresponding average attention scores, they are found to be comparable across the two scenarios (with p > 0.05).
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
The embodiments of the present disclosure herein address the unresolved problem of quantifying visual attention using eye blink rate variability. The embodiments thus provide a system and method to compute an active visual attention score using eye blink rate variability. Moreover, the embodiments herein further provide efficient and continuous monitoring while addressing privacy concerns during quantification of the active visual attention score. The method of the present disclosure is an affordable, easy-to-use solution for continuous monitoring of the active visual attention score. The active visual attention score is robust enough to handle external factors like ambient light conditions and occlusions, making it amenable to real time deployment situations.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
Claims:
We Claim:
1. A processor implemented method to compute active visual attention score using eye blink rate variability, the method comprising:
recording (302) blink data of a subject using a video recording sensor, via one or more hardware processors, the blink data comprising data associated with eye blink response of the subject during a systematic execution of a set of tasks;
calculating (304) via the one or more hardware processors, a blink threshold from a set of eye aspect ratio (EAR) values of the blink data;
reconstructing (306) from the blink data, a set of uniformly sampled blink rate variability (BRV) series signals via the one or more hardware processors, wherein each BRV series signal from the set of uniformly sampled BRV series signals comprises a time series data constructed from intervals between consecutive eye blinks obtained during the systematic execution of the set of tasks;
determining (308) from the set of uniformly sampled BRV series signals, via the one or more hardware processors, a pareto frequency feature and a reference pareto frequency feature, wherein the reference pareto frequency feature is determined from the pareto frequency feature, wherein the pareto frequency feature comprises a frequency range within which a specific percentage of cumulative power of a frequency spectrum of the set of uniformly sampled BRV series signals resides;
constructing (310) via the one or more hardware processors, a frequency spectrum for the set of uniformly sampled BRV series signals that indicates behavior of the subject during the systematic execution of the set of tasks at every task window to compute a current pareto frequency feature; and
computing (312) via the one or more hardware processors, an active visual attention score at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and a state of the subject.

2. The processor implemented method as claimed in claim 1, wherein a head pose of the subject categorizes the state of the subject into at least one of (i) a no-visual attention duration that indicates change in a location of the subject eyes within a screen, (ii) a no-data duration that indicates undetected head pose of the subject, and (iii) a visual attention duration that indicates attentiveness of eyes of the subject towards the screen.

3. The processor-implemented method as claimed in claim 1, wherein computing the active visual attention score at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and the state of the subject comprises performing the steps of:
obtaining the set of uniformly sampled BRV series signals, the no-visual attention duration, the visual attention duration and the pareto frequency feature;
determining a value deviated between the pareto frequency feature and the reference pareto frequency feature; and
computing the active visual attention score based on the deviated value, the visual attention duration, and the no-visual attention duration.

4. The processor implemented method as claimed in claim 1, wherein the active visual attention score for the set of uniformly sampled BRV series signals is computed continuously to assess visual attention of the subject during the systematic execution of the set of tasks.

5. The processor implemented method as claimed in claim 1, wherein the task window is preset at a determined interval.

6. The processor implemented method as claimed in claim 1, wherein the behavior of the subject is categorized into at least one of a low visual attention and a high visual attention.

7. A system (100) to compute active visual attention score using eye blink rate variability comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
record blink data of a subject using a video recording sensor, the blink data comprising data associated with eye blink response of the subject during a systematic execution of a set of tasks;
calculate a blink threshold from a set of eye aspect ratio (EAR) values of the blink data;
reconstruct from the blink data, a set of uniformly sampled blink rate variability (BRV) series signals, each BRV series signal from the set of uniformly sampled BRV series signals comprising a time series data constructed from intervals between consecutive eye blinks obtained during the systematic execution of the set of tasks;
determine from the set of uniformly sampled BRV series signals, a pareto frequency feature and a reference pareto frequency feature, wherein the reference pareto frequency feature is determined from the pareto frequency feature, wherein the pareto frequency feature comprises a frequency range within which a specific percentage of cumulative power of a frequency spectrum of the set of uniformly sampled BRV series signals resides;
construct a frequency spectrum for the set of uniformly sampled BRV series signals that indicates behavior of the subject during the systematic execution of the set of tasks at every task window to compute a current pareto frequency feature; and
compute an active visual attention score at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and a state of the subject.

8. The system as claimed in claim 7, wherein a head pose of the subject categorizes the state of the subject into at least one of (i) a no-visual attention duration that indicates change in a location of the subject eyes within a screen, (ii) a no-data duration that indicates undetected head pose of the subject, and (iii) a visual attention duration that indicates attentiveness of eyes of the subject towards the screen.

9. The system as claimed in claim 7, wherein computing the active visual attention score at every task window by using the frequency spectrum, the pareto frequency feature, the reference pareto frequency feature, the current pareto frequency feature, and the state of the subject comprises performing the steps of:
obtaining the set of uniformly sampled BRV series signals, the no-visual attention duration, the visual attention duration and the pareto frequency feature;
determining a value deviated between the pareto frequency feature and the reference pareto frequency feature; and
computing the active visual attention score based on the deviated value, the visual attention duration, and the no-visual attention duration.

10. The system as claimed in claim 7, wherein the active visual attention score for the set of uniformly sampled BRV series signals is computed continuously to assess visual attention of the subject during the systematic execution of the set of tasks.

11. The system as claimed in claim 7, wherein the task window is preset at a determined interval.

12. The system as claimed in claim 7, wherein the behavior of the subject is categorized into at least one of a low visual attention and a high visual attention.

Dated this 24th Day of March 2023

Tata Consultancy Services Limited
By their Agent & Attorney

(Adheesh Nargolkar)
of Khaitan & Co
Reg No IN-PA-1086
