Abstract: This disclosure relates generally to analysis of EEG signals for cognitive load assessment. Real-time cognitive load assessment helps in preventing burnout and prolonged stress and in ensuring safety in high mental load working environments. State-of-the-art techniques perform offline processing to analyze EEG signals for a post facto assessment, and the existing techniques mostly learn the spatial and temporal features sequentially. The disclosure provides a spatio-temporal analysis of EEG signals for cognitive load assessment using a spatio-temporal deep network architecture. The EEG signals are processed based on several techniques to obtain a topographic representation of the EEG signals and an EEG video. The topographic representations and the EEG videos are used to train the spatio-temporal deep network architecture for spatio-temporal analysis of EEG signals for cognitive load assessment. The cognitive load assessment includes analysis of multiple norms, comprising a cognitive activity state and a quality of cognitive usage.
Claims:
1. A processor-implemented method (300) for spatio-temporal analysis of EEG signals for cognitive load assessment comprising:
receiving a plurality of Electroencephalography (EEG) signals from a plurality of subjects, via one or more hardware processors, using a pre-defined number of electrodes (302);
pre-processing the plurality of EEG signals to obtain a plurality of pre-processed EEG signals, via the one or more hardware processors, using a plurality of signal pre-processing techniques (304);
transforming the plurality of pre-processed EEG signals to obtain a plurality of EEG transformation signals, via the one or more hardware processors, wherein the plurality of EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance and (b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals (306);
generating a plurality of topographic representations for the plurality of EEG transformation signals at the pre-defined time instance, via the one or more hardware processors, using a topographic generation technique, wherein each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width (308);
creating an EEG video using the plurality of topographic representations, via the one or more hardware processors, wherein the EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length and the EEG video is created by an optimized arrangement of the plurality of topographic representations (310);
extracting a set of EEG-keyframes from the EEG video, via the one or more hardware processors, wherein the EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique (312);
training a spatio-temporal deep network architecture for the cognitive load assessment using the EEG video and the set of EEG-keyframes, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network (314); and
receiving a plurality of user EEG signals from a user for cognitive load assessment, via the one or more hardware processors, wherein the user’s cognitive load is assessed using the trained spatio-temporal deep network architecture for multiple norms in cognitive load assessment comprising a cognitive activity state and a quality of cognitive usage (316).
2. The method of claim 1, wherein the spatio-temporal deep network architecture is trained for the cognitive load assessment to obtain a trained spatio-temporal deep network architecture, the training (400) comprising:
receiving a processed EEG video at the spatio-temporal deep network architecture, wherein the processed EEG video is obtained by processing the created EEG video using a video processing technique (402);
extracting a plurality of spatial features from the set of EEG-keyframes using a plurality of two-dimensional convolution layers (404);
extracting a plurality of short-term spatio-temporal features from the set of EEG-keyframes and the processed EEG video using a plurality of three-dimensional convolution layers (406);
extracting a plurality of long-term spatio-temporal features from the set of EEG-keyframes and the processed EEG video using a plurality of three-dimensional convolution layers and a plurality of long short-term memory layers (408);
extracting a plurality of attention-weighted spatial features from the plurality of spatial features using a spatial attention unit, wherein the spatial attention unit comprises a plurality of two-dimensional convolution layers and a plurality of fully connected layers (410);
extracting a plurality of attention-weighted spatio-temporal features from the plurality of short-term spatio-temporal features and the plurality of long-term spatio-temporal features using a spatial-temporal attention unit, wherein the spatial-temporal attention unit comprises a plurality of three-dimensional convolution layers and a plurality of fully connected layers (412); and
pooling the plurality of attention-weighted spatial features and the plurality of attention-weighted spatio-temporal features at a pre-defined temporal scale to obtain a plurality of video level features, wherein the plurality of video level features is utilized by the trained spatio-temporal deep network architecture for cognitive load assessment (414).
3. The method of claim 1, wherein the cognitive load assessment (500) of the user’s EEG signals based on spatio-temporal analysis using the trained spatio-temporal deep network architecture comprises:
receiving a plurality of user Electroencephalography (EEG) signals from a user, via the one or more hardware processors, using the pre-defined number of electrodes (502);
pre-processing the plurality of user EEG signals to obtain a plurality of pre-processed user EEG signals, via the one or more hardware processors, using the plurality of signal pre-processing techniques (504);
transforming the plurality of pre-processed user EEG signals to obtain a plurality of user EEG transformation signals, via the one or more hardware processors, wherein the plurality of user EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed user EEG signals at the pre-defined time instance and (b) applying the plurality of transformation techniques to the segmented plurality of pre-processed user EEG signals (506);
generating a plurality of user topographic representations for the plurality of user EEG transformation signals at the pre-defined time instance, via the one or more hardware processors, using the topographic generation technique, wherein each user topographic representation is a spatial representation of the plurality of user EEG transformation signals with the pre-defined height and the pre-defined width (508);
creating a user EEG video using the plurality of user topographic representations, via the one or more hardware processors, wherein the user EEG video is a spatio-temporal representation of the plurality of user topographic representations with the pre-defined length and the user EEG video is created by an optimized arrangement of the plurality of user topographic representations (510);
extracting a set of user EEG-keyframes from the user EEG video, via the one or more hardware processors, wherein the user EEG-keyframes are a sub-set of the user EEG video extracted based on a keyframe extraction technique (512); and
assessing the cognitive load of the user from the user EEG video for the multiple norms, via the one or more hardware processors, wherein the cognitive load assessment of the user’s EEG video comprises (a) classification of the user’s EEG video for the multiple norms using the trained spatio-temporal deep network architecture, and (b) visualization of the classification by generating a plurality of activation maps representing a plurality of activation regions in the user’s brain (514).
4. The method of claim 1, wherein the plurality of signal pre-processing techniques comprises one of a power line interference removal technique, a band pass filtering technique, an artifact removal technique, and a bad channel identification technique.
5. The method of claim 1, wherein applying the plurality of transformation techniques comprises computing at least one transformation parameter for the plurality of pre-processed EEG signals, wherein the transformation parameters comprise one of a plurality of statistical parameters, a plurality of power-based parameters, and a plurality of information theory parameters.
6. The method of claim 1, wherein the topographic generation technique comprises one of a Montage type setting technique, a channel mapping technique, a reference electrode setting technique, and an interpolation technique.
7. The method of claim 1, wherein the keyframe extraction technique comprises one of a clustering technique, a regular sampling technique and an inter-frame similarity technique.
8. The method of claim 1, wherein the classification is performed for multiple norms comprising the cognitive activity state and the quality of cognitive usage, wherein the cognitive activity state comprises a rest state and an active state, and the quality of cognitive usage comprises a good usage and a bad usage.
9. A system (100), comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive a plurality of Electroencephalography (EEG) signals from a plurality of subjects, via one or more hardware processors, using a pre-defined number of electrodes;
pre-process the plurality of EEG signals to obtain a plurality of pre-processed EEG signals, via the one or more hardware processors, using a plurality of signal pre-processing techniques;
transform the plurality of pre-processed EEG signals to obtain a plurality of EEG transformation signals, via the one or more hardware processors, wherein the plurality of EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance and (b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals;
generate a plurality of topographic representations for the plurality of EEG transformation signals at the pre-defined time instance, via the one or more hardware processors, using a topographic generation technique, wherein each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width;
create an EEG video using the plurality of topographic representations, via the one or more hardware processors, wherein the EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length and the EEG video is created by an optimized arrangement of the plurality of topographic representations;
extract a set of EEG-keyframes from the EEG video, via the one or more hardware processors, wherein the EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique;
train a spatio-temporal deep network architecture for the cognitive load assessment using the EEG video and the set of EEG-keyframes, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network; and
receive a plurality of user EEG signals from a user for cognitive load assessment, via the one or more hardware processors, wherein the user’s cognitive load is assessed using the trained spatio-temporal deep network architecture for multiple norms in cognitive load assessment comprising a cognitive activity state and a quality of cognitive usage.
10. The system of claim 9, wherein the one or more hardware processors are configured by the instructions to train the spatio-temporal deep network architecture for the cognitive load assessment, comprising:
receiving a processed EEG video at the spatio-temporal deep network architecture, wherein the processed EEG video is obtained by processing the created EEG video using a video processing technique;
extracting a plurality of spatial features from the set of EEG-keyframes using a plurality of two-dimensional convolution layers;
extracting a plurality of short-term spatio-temporal features from the set of EEG-keyframes and the processed EEG video using a plurality of three-dimensional convolution layers;
extracting a plurality of long-term spatio-temporal features from the set of EEG-keyframes and the processed EEG video using a plurality of three-dimensional convolution layers and a plurality of long short-term memory layers;
extracting a plurality of attention-weighted spatial features from the plurality of spatial features using a spatial attention unit, wherein the spatial attention unit comprises a plurality of two-dimensional convolution layers and a plurality of fully connected layers;
extracting a plurality of attention-weighted spatio-temporal features from the plurality of short-term spatio-temporal features and the plurality of long-term spatio-temporal features using a spatial-temporal attention unit, wherein the spatial-temporal attention unit comprises a plurality of three-dimensional convolution layers and a plurality of fully connected layers; and
pooling the plurality of attention-weighted spatial features and the plurality of attention-weighted spatio-temporal features at a pre-defined temporal scale to obtain a plurality of video level features, wherein the plurality of video level features is utilized by the trained spatio-temporal deep network architecture for cognitive load assessment.
11. The system of claim 9, wherein the one or more hardware processors are configured by the instructions for the cognitive load assessment, comprising:
receiving a plurality of user Electroencephalography (EEG) signals from a user, via one or more hardware processors, using the pre-defined number of electrodes;
pre-processing the plurality of user EEG signals to obtain a plurality of pre-processed user EEG signals, via the one or more hardware processors, using the plurality of signal pre-processing techniques;
transforming the plurality of pre-processed user EEG signals to obtain a plurality of user EEG transformation signals, via the one or more hardware processors, wherein the plurality of user EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed user EEG signals at the pre-defined time instance and (b) applying the plurality of transformation techniques to the segmented plurality of pre-processed user EEG signals;
generating a plurality of user topographic representations for the plurality of user EEG transformation signals at the pre-defined time instance, via the one or more hardware processors, using the topographic generation technique, wherein each user topographic representation is a spatial representation of the plurality of user EEG transformation signals with the pre-defined height and the pre-defined width;
creating a user EEG video using the plurality of user topographic representations, via the one or more hardware processors, wherein the user EEG video is a spatio-temporal representation of the plurality of user topographic representations with the pre-defined length and the user EEG video is created by an optimized arrangement of the plurality of user topographic representations; and
assessing the cognitive load of the user from the user EEG video for the multiple norms, via the one or more hardware processors, wherein the cognitive load assessment of the user’s EEG video comprises (a) classification of the user’s EEG video for the multiple norms using the trained spatio-temporal deep network architecture, and (b) visualization of the classification by generating a plurality of activation maps representing a plurality of activation regions in the user’s brain.
12. The system of claim 9, wherein the one or more hardware processors are configured by the instructions to perform the signal pre-processing techniques comprising one of a power line interference removal technique, a band pass filtering technique, an artifact removal technique and a bad channel identification technique.
13. The system of claim 9, wherein the one or more hardware processors are configured by the instructions to perform the plurality of transformation techniques comprising computing at least one transformation parameter for the plurality of pre-processed EEG signals, wherein the transformation parameters comprise one of a plurality of statistical parameters, a plurality of power-based parameters, and a plurality of information theory parameters.
14. The system of claim 9, wherein the one or more hardware processors are configured by the instructions to perform the keyframe extraction technique comprising one of a clustering technique, a regular sampling technique and an inter-frame similarity technique.
15. The system of claim 9, wherein the one or more hardware processors are configured by the instructions to perform the topographic generation technique comprising one of a Montage type setting technique, a channel mapping technique, a reference electrode setting technique, and an interpolation technique.
Description:
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
A METHOD AND A SYSTEM FOR SPATIO-TEMPORAL ANALYSIS OF ELECTROENCEPHALOGRAPHY SIGNALS FOR COGNITIVE LOAD ASSESSMENT
Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
[001] The disclosure herein generally relates to analysis of electroencephalography (EEG) signals for cognitive load assessment and, more particularly, to a method and a system for spatio-temporal analysis of EEG signals for cognitive load assessment.
BACKGROUND
[002] The load on the working memory of a person while performing various mental tasks is a measure of the cognitive load on that person. The cognitive load is used as an important criterion for analyzing/assessing proficiency in various tasks such as, but not limited to, problem-solving, driving and learning. The cognitive assessment can be useful in applications such as optimal work allocation, increasing efficiency in the workplace and ensuring safety in difficult work environments. Assessment of the cognitive load in real-time helps in preventing burnout and prolonged stress and in ensuring safety in high mental load working environments. For analysis of cognitive load, the study and analysis of electroencephalography (EEG) data plays an important role, as EEG has been used to capture the activity of the brain.
[003] State-of-the-art techniques for cognitive load assessment utilize multiple channels to calculate time and frequency domain features of the EEG signals for determining a cognitive state, which in turn gives a measure of the cognitive load. This restricts the assessment to post facto analysis of the data rather than continuous assessment of the cognitive state. Further, several classical machine learning techniques have been applied to EEG data for cognitive load analysis. Deep learning techniques have been combined with EEG topographic maps (topo maps) to understand brain functionality with respect to different tasks, wherein deep recurrent convolutional networks are utilized to learn spatial, spectral, and temporal features from topographic representations for cognitive load classification. However, the state-of-the-art topo map-based approaches sequentially learn the spatial and temporal features by having either spatial or spatial-spectral layers followed by temporal layers in the network, which may not be very efficient. Hence, there is a requirement for a real-time and efficient technique for cognitive load assessment.
SUMMARY
[004] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a system for spatio-temporal analysis of EEG signals for cognitive load assessment is provided. The system includes a memory storing instructions, one or more communication interfaces, and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to receive a plurality of Electroencephalography (EEG) signals from a plurality of subjects, via the one or more hardware processors, using a pre-defined number of electrodes. The system is further configured to pre-process the plurality of EEG signals to obtain a plurality of pre-processed EEG signals, via the one or more hardware processors, using a plurality of signal pre-processing techniques. The system is further configured to transform the plurality of pre-processed EEG signals to obtain a plurality of EEG transformation signals, via the one or more hardware processors, wherein the plurality of EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance and (b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals. The system is further configured to generate a plurality of topographic representations for the plurality of EEG transformation signals at the pre-defined time instance, via the one or more hardware processors, using a topographic generation technique, wherein each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width. The system is further configured to create an EEG video using the plurality of topographic representations, via the one or more hardware processors, wherein the EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length and the EEG video is created by an optimized arrangement of the plurality of topographic representations. The system is further configured to extract a set of EEG-keyframes from the EEG video, via the one or more hardware processors, wherein the EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique. The system is further configured to train a spatio-temporal deep network architecture for the cognitive load assessment using the EEG video and the set of EEG-keyframes, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network. The system is further configured to receive a plurality of user EEG signals from a user for cognitive load assessment, via the one or more hardware processors, wherein the user’s cognitive load is assessed using the trained spatio-temporal deep network architecture for multiple norms in cognitive load assessment comprising a cognitive activity state and a quality of cognitive usage.
[005] In another aspect, a method for spatio-temporal analysis of EEG signals for cognitive load assessment is provided. The method includes receiving a plurality of Electroencephalography (EEG) signals from a plurality of subjects using a pre-defined number of electrodes. The method further includes pre-processing the plurality of EEG signals to obtain a plurality of pre-processed EEG signals using a plurality of signal pre-processing techniques. The method further includes transforming the plurality of pre-processed EEG signals to obtain a plurality of EEG transformation signals, wherein the plurality of EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance and (b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals. The method further includes generating a plurality of topographic representations for the plurality of EEG transformation signals at the pre-defined time instance using a topographic generation technique, wherein each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width. The method further includes creating an EEG video using the plurality of topographic representations, wherein the EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length and the EEG video is created by an optimized arrangement of the plurality of topographic representations. The method further includes extracting a set of EEG-keyframes from the EEG video, wherein the EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique. The method further includes training a spatio-temporal deep network architecture for the cognitive load assessment using the EEG video and the set of EEG-keyframes, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network. The method further includes receiving a plurality of user EEG signals from a user for cognitive load assessment, wherein the user’s cognitive load is assessed using the trained spatio-temporal deep network architecture for multiple norms in cognitive load assessment comprising a cognitive activity state and a quality of cognitive usage.
[006] In yet another aspect, a non-transitory computer readable medium for spatio-temporal analysis of EEG signals for cognitive load assessment is provided. The program includes receiving a plurality of Electroencephalography (EEG) signals from a plurality of subjects using a pre-defined number of electrodes. The program further includes pre-processing the plurality of EEG signals to obtain a plurality of pre-processed EEG signals using a plurality of signal pre-processing techniques. The program further includes transforming the plurality of pre-processed EEG signals to obtain a plurality of EEG transformation signals, wherein the plurality of EEG transformation signals is obtained by (a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance and (b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals. The program further includes generating a plurality of topographic representations for the plurality of EEG transformation signals at the pre-defined time instance using a topographic generation technique, wherein each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width. The program further includes creating an EEG video using the plurality of topographic representations, wherein the EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length and the EEG video is created by an optimized arrangement of the plurality of topographic representations. The program further includes extracting a set of EEG-keyframes from the EEG video, wherein the EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique. The program further includes training a spatio-temporal deep network architecture for the cognitive load assessment using the EEG video and the set of EEG-keyframes, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network. The program further includes receiving a plurality of user EEG signals from a user for cognitive load assessment, wherein the user’s cognitive load is assessed using the trained spatio-temporal deep network architecture for multiple norms in cognitive load assessment comprising a cognitive activity state and a quality of cognitive usage.
[007] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[008] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[009] FIG.1 illustrates an exemplary system for spatio-temporal analysis of EEG signals for cognitive load assessment according to some embodiments of the present disclosure.
[010] FIG.2A is a functional block diagram of the system of FIG. 1, for spatio-temporal analysis of EEG signals for cognitive load assessment according to some embodiments of the present disclosure.
[011] FIG.2B is a functional block diagram of a cognitive load assessment module (a spatio-temporal deep network architecture) of the system of FIG.2A, for spatio-temporal analysis of EEG signals for cognitive load assessment in accordance with some embodiments of the present disclosure.
[012] FIG.3A, FIG.3B and FIG.3C are flow diagrams illustrating a method (300) for spatio-temporal analysis of EEG signals for cognitive load assessment, using the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[013] FIG.4A and FIG.4B are flow diagrams illustrating a method (400) for training the spatio-temporal deep network architecture for spatio-temporal analysis of EEG signals for cognitive load assessment, using the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[014] FIG.5A and FIG.5B are flow diagrams illustrating a method (500) for cognitive load assessment of the EEG signals based on spatio-temporal analysis using the trained spatio-temporal deep network architecture, by the system of FIG. 1, in accordance with some embodiments of the present disclosure;
[015] FIG.6A, FIG.6B and FIG.6C illustrate a frame in an entropy-based processed EEG video of a user at a rest state, wherein the frame comprises a blue channel (FIG.6A), a green channel (FIG.6B) and a red channel (FIG.6C), in accordance with some embodiments of the present disclosure;
[016] FIG.6D, FIG.6E and FIG.6F illustrate the frame in the entropy-based processed EEG video of a user at an active state, wherein the frame comprises a blue channel (FIG.6D), a green channel (FIG.6E) and a red channel (FIG.6F), in accordance with some embodiments of the present disclosure;
[017] FIG.7A, FIG.7B and FIG.7C illustrate a frame in a power spectral density (PSD)-based processed EEG video of a user at a rest state, wherein the frame comprises a blue channel (FIG.7A), a green channel (FIG.7B) and a red channel (FIG.7C), in accordance with some embodiments of the present disclosure; and
[018] FIG.7D, FIG.7E and FIG.7F illustrate the frame in the power spectral density (PSD)-based processed EEG video of a user at an active state, wherein the frame comprises a blue channel (FIG.7D), a green channel (FIG.7E) and a red channel (FIG.7F), in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[019] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[020] Referring now to the drawings, and more particularly to FIG. 1 through FIG.7F, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[021] FIG.1 is a functional block diagram of a system 100 for spatio-temporal analysis of EEG signals for cognitive load assessment in accordance with some embodiments of the present disclosure.
[022] In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware processors is configured to execute functions of one or more functional blocks of the system 100.
[023] Referring to the components of the system 100, in an embodiment, the processor(s) 104 can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, a network cloud and the like.
[024] The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, a touch user interface (TUI) and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting a number of devices (nodes) of the system 100 to one another or to another server.
[025] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[026] Further, the memory 102 may include a database 108 configured to include information regarding data associated with the cognitive load analysis. The memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. In an embodiment, the database 108 may be external (not shown) to the system 100 and coupled to the system via the I/O interface 106.
[027] Functions of the components of system 100 are explained in conjunction with functional overview of the system 100 in FIG.2A and flow diagram of FIGS.3A-3C, FIGS.4A-4B and FIGS.5A-5B for spatio-temporal analysis of EEG signals for cognitive load assessment.
[028] The system 100 supports various connectivity options such as BLUETOOTH®, USB, ZigBee and other cellular services. The network environment enables connection of various components of the system 100 using any communication link including Internet, WAN, MAN, and so on. In an exemplary embodiment, the system 100 is implemented to operate as a stand-alone device. In another embodiment, the system 100 may be implemented to work as a loosely coupled device to a smart computing environment. The components and functionalities of the system 100 are described further in detail.
[029] FIG.2A is an example functional block diagram of the various modules of the system of FIG.1, in accordance with some embodiments of the present disclosure. As depicted in the architecture, FIG.2A illustrates the functions of the modules of the system 100 for spatio-temporal analysis of EEG signals for cognitive load assessment.
[030] The system 200 for spatio-temporal analysis of EEG signals for cognitive load assessment works in two modes – a training mode and a testing mode – based on a user requirement. The training mode comprises training a spatio-temporal deep network architecture with information regarding spatio-temporal analysis of EEG signals, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network. The testing mode comprises using the trained spatio-temporal deep network architecture for spatio-temporal analysis of EEG signals.
[031] The system 200 for spatio-temporal analysis of EEG signals for cognitive load assessment is configured to receive a plurality of Electroencephalography (EEG) signals from a plurality of subjects, via one or more hardware processors 104, using a pre-defined number of electrodes in the training mode. During the testing mode, the system 200 is configured to receive a plurality of Electroencephalography (EEG) signals from a user, via one or more hardware processors 104, using the pre-defined number of electrodes.
[032] The system 200 further comprises a pre-processor 202 configured to pre-process the plurality of EEG signals to obtain a plurality of pre-processed EEG signals during the training mode. The plurality of pre-processed EEG signals is obtained using a plurality of signal pre-processing techniques. During the testing mode, the pre-processor 202 is configured to pre-process the plurality of user EEG signals to obtain a plurality of pre-processed user EEG signals. The signal pre-processing techniques comprise one of a power line interference removal technique, a band pass filtering technique, an artifact removal technique and a bad channel identification technique.
[033] The system 200 further comprises an EEG signal transformer 204 configured to transform the plurality of pre-processed EEG signals to obtain a plurality of EEG transformation signals during the training mode. The plurality of EEG transformation signals is obtained by:
(a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance, and
(b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals.
[034] Further, during the testing mode, the EEG signal transformer 204 is configured to transform the plurality of pre-processed user EEG signals to obtain a plurality of user EEG transformation signals.
[035] The system 200 further comprises a topographic representation generator 206 configured for generating a plurality of topographic representations for the plurality of EEG transformation signals at the pre-defined time instance during the training mode. The topographic representations are generated using a topographic generation technique. Each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width. The topographic generation technique comprises one of a Montage type setting technique, a channel mapping technique, a reference electrode setting technique, and an interpolation technique.
[036] During the testing mode, the topographic representation generator 206 is configured for generating a plurality of user topographic representations for the plurality of user EEG transformation signals at the pre-defined time instance.
[037] The system 200 further comprises an EEG video creator 208 configured for creating an EEG video using the plurality of topographic representations during the training mode. The EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length, and the EEG video is created by an optimized arrangement of the plurality of topographic representations.
[038] During the testing mode, the EEG video creator 208 is configured for creating a user EEG video using the plurality of user topographic representations.
[039] The system 200 further comprises a key-frame extractor 210 configured for extracting a set of EEG-keyframes from the EEG video during the training mode. The EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique.
[040] During the testing mode, the key-frame extractor 210 is configured for extracting a set of user EEG-keyframes from the user EEG video.
[041] The system 200 further comprises a cognitive load assessment module 212 configured for training a spatio-temporal deep network architecture for the cognitive load assessment. The spatio-temporal deep network architecture is trained using the EEG video and the set of EEG-keyframes, wherein the spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network. During the testing mode, the cognitive load assessment module 212 is configured for cognitive load assessment of the user’s EEG signals based on spatio-temporal analysis using the trained spatio-temporal deep network architecture.
[042] The cognitive load assessment module 212 is explained with reference to the example functional block diagram of FIG.2B. As depicted in the architecture, FIG.2B illustrates the functions of the modules of the cognitive load assessment module 212 of the system 200 for cognitive load assessment.
[043] The cognitive load assessment module 212 comprises a video processor 214 and a spatio-temporal deep network architecture 216, wherein the spatio-temporal deep network architecture is trained during the training mode based on the method described in FIG.4A and FIG.4B, which is explained in detail in the further sections.
[044] The video processor 214 in the cognitive load assessment module 212 is configured for processing the EEG video using a video processing technique to obtain a processed EEG video.
[045] The spatio-temporal deep network architecture 216 is trained for the cognitive load assessment to obtain a trained spatio-temporal deep network architecture during the training mode. During the testing mode, the trained spatio-temporal deep network architecture is configured for cognitive load assessment of the user’s EEG signals based on spatio-temporal analysis. The spatio-temporal deep network architecture 216 further comprises several modules, which are explained below:
[046] The spatio-temporal deep network architecture 216 comprises a 2D-CL 218 that is a plurality of two-dimensional convolution layers. The 2D-CL 218 is configured for extracting a plurality of spatial features from the set of EEG-keyframes.
[047] The spatio-temporal deep network architecture 216 further comprises a 3D-CL 220 that is a plurality of three-dimensional convolution layers. The 3D-CL 220 is configured for extracting a plurality of short-term spatio-temporal features from the set of EEG-keyframes and the processed EEG video.
[048] The spatio-temporal deep network architecture 216 further comprises a 3D-CL 222 and an LSTM 224, that is, a plurality of three-dimensional convolution layers and a plurality of long short-term memory (LSTM) layers. The 3D-CL 222 and the LSTM 224 are configured for extracting a plurality of long-term spatio-temporal features from the set of EEG-keyframes and the processed EEG video.
[049] The spatio-temporal deep network architecture 216 further comprises a spatial attention unit 226 that comprises a plurality of two-dimensional convolution layers and a plurality of fully connected layers. The spatial attention unit 226 is configured for extracting a plurality of attention-weighted spatial features from the plurality of spatial features.
[050] The spatio-temporal deep network architecture 216 further comprises a spatial-temporal attention unit 228 comprised of a plurality of three-dimensional convolution layers and a plurality of fully connected layers. The spatial-temporal attention unit 228 is configured for extracting a plurality of attention-weighted spatio-temporal features from the plurality of short-term spatio-temporal features and the plurality of long-term spatio-temporal features.
[051] The spatio-temporal deep network architecture 216 further comprises a pooler 230 configured for pooling the plurality of attention-weighted spatial features and the plurality of attention-weighted spatio-temporal features at a pre-defined temporal scale to obtain a plurality of video level features. The plurality of video level features is utilized by the trained spatio-temporal deep network architecture for cognitive load assessment.
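As a rough illustration of this pooling operation, the sketch below assumes the attention-weighted features are stacked along a temporal axis and pooled at a single pre-defined temporal scale; all tensor shapes and the concatenation step are illustrative assumptions, not the exact configuration of the pooler 230:

import torch

# Hypothetical attention-weighted features: (batch, time, feature_dim)
spatial_feats = torch.randn(4, 10, 256)
spatio_temporal_feats = torch.randn(4, 10, 256)

# Pool over the temporal axis, then concatenate into video-level features
video_level = torch.cat([spatial_feats.mean(dim=1),
                         spatio_temporal_feats.mean(dim=1)], dim=-1)  # (4, 512)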
[052] The various modules of the system 100 and the functional blocks in FIG.2A and FIG.2B, configured for spatio-temporal analysis of EEG signals for cognitive load assessment, are implemented as at least one of a logically self-contained part of a software program, a self-contained hardware component, and/or a self-contained hardware component with a logically self-contained part of a software program embedded into each of the hardware components that, when executed, perform the above-described method.
[053] Functions of the components of the system 200 are explained in conjunction with the functional modules of the system 100 stored in the memory 102 and further explained in conjunction with the flow diagrams of FIG.3A, FIG.3B and FIG.3C. FIG.3A, FIG.3B and FIG.3C, with reference to FIG.1, provide an exemplary flow diagram illustrating a method 300 for spatio-temporal analysis of EEG signals for cognitive load assessment using the system 100 of FIG.1, according to an embodiment of the present disclosure.
[054] The steps of the method of the present disclosure will now be explained with reference to the components of the system (200) for spatio-temporal analysis of EEG signals for cognitive load assessment and the modules (202-230) as depicted in FIG.2A and FIG.2B, and the flow diagrams as depicted in FIG.3A, FIG.3B and FIG.3C. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
[055] At step 302 of the method 300, a plurality of Electroencephalography (EEG) signals is received from a plurality of subjects, via one or more hardware processors 104. The EEG signals are collected using a pre-defined number of electrodes.
[056] In an embodiment, the number of electrodes required for collecting the EEG signals is pre-defined based on the norms of the application for which the cognitive load assessment is evaluated and on an application complexity, wherein the norms of the cognitive load assessment include a cognitive activity state and a quality of cognitive usage. The application complexity is defined by the possible use/application of the cognitive load assessment; for example, one application of the cognitive load assessment includes estimating whether the user is in an active/rest state. The application complexity may be high, medium or low. In an example scenario, for cognitive state classification with a medium system complexity, the EEG signal is collected with 23 electrodes and follows the international 10-20 standard for electrode placement.
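For illustration, a minimal sketch of configuring such an electrode layout with the open-source MNE-Python library is shown below; the channel list is abbreviated to the classic 19-electrode 10-20 set for clarity, and the sampling rate is an assumption:

import mne

# A hypothetical channel selection following the international 10-20 standard
ch_names = ['Fp1', 'Fp2', 'F7', 'F3', 'Fz', 'F4', 'F8', 'T7', 'C3', 'Cz',
            'C4', 'T8', 'P7', 'P3', 'Pz', 'P4', 'P8', 'O1', 'O2']
info = mne.create_info(ch_names=ch_names, sfreq=256.0, ch_types='eeg')
info.set_montage(mne.channels.make_standard_montage('standard_1020'))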
[057] In an example scenario, the plurality of EEG signals is collected using the pre-defined number of electrodes in a controlled environment from a plurality of subjects who were given an arithmetic task. The arithmetic task was carried out for four minutes, with EEG and ECG data of every subject collected during the experiment.
[058] At step 304 of the method 300, the plurality of EEG signals is pre-processed at the pre-processor 202. The plurality of EEG signals is pre-processed to obtain a plurality of pre-processed EEG signals using a plurality of signal pre-processing techniques.
[059] In an embodiment, the signal pre-processing techniques comprise a power line interference removal technique, a band pass filtering technique, an artifact removal technique and a bad channel identification technique.
[060] In an example scenario, a 23-channel EEG signal is collected and pre-processed using a high-pass filter with a 30 Hz cut-off frequency to remove the noise from the signal of interest. Further, a notch filter at 50 Hz is used to remove power line noise. Further, Independent Component Analysis (ICA) is used to eliminate artifacts such as those arising from eye movement, muscle movement and cardiac overlap.
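A minimal sketch of this pre-processing step using MNE-Python is given below. The recording file name, filter cut-offs, ICA component count and excluded component indices are illustrative assumptions; a band-pass configuration is shown as one plausible realization of the filtering described above:

import mne

# Load a hypothetical 23-channel recording (file name is illustrative)
raw = mne.io.read_raw_edf("subject01.edf", preload=True)

# Band-pass filtering to the EEG band of interest (assumed cut-offs)
raw.filter(l_freq=0.5, h_freq=30.0)

# Notch filter at 50 Hz to remove power line interference
raw.notch_filter(freqs=50.0)

# ICA-based artifact removal (eye movement, muscle movement, cardiac overlap)
ica = mne.preprocessing.ICA(n_components=20, random_state=0)
ica.fit(raw)
ica.exclude = [0, 1]   # artifact components, identified by inspection (assumed)
raw_clean = ica.apply(raw.copy())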
[061] At step 306 of the method 300, the plurality of pre-processed EEG signals is transformed to obtain a plurality of EEG transformation signals in the EEG signal transformer 204. The plurality of EEG transformation signals is obtained by:
(a) segmenting the plurality of pre-processed EEG signals at a pre-defined time instance, and,
(b) applying a plurality of transformation techniques to the segmented plurality of pre-processed EEG signals.
[062] In an embodiment, the plurality of transformation techniques comprises computing at least one transformation parameter for the plurality of pre-processed EEG signals. The transformation parameters comprise one of a plurality of statistical parameters, a plurality of power-based parameters, and a plurality of information theory parameters.
[063] The plurality of pre-processed EEG signals is segmented at a pre-defined time instance, wherein the pre-defined time instance is a smaller time duration in comparison with the total duration of the plurality of EEG signals and is usually less than or equal to one second. Hence the complete plurality of EEG signals is segmented into smaller sub-sets based on the pre-defined time instance.
[064] In an example scenario, the pre-processed EEG signal, with a total duration of 60 seconds, is segmented at a time instance of 0.5 seconds. This results in 120 segmented pre-processed EEG signals. The power spectral density (PSD) parameter is computed from each of the segmented plurality of pre-processed EEG signals using a Morlet wavelet transform at 40 frequencies from 1 Hz to 40 Hz, generating 4800 EEG transformation signals.
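A minimal sketch of this segmentation and PSD transformation is shown below, assuming a 256 Hz sampling rate and a synthetic 23-channel signal in place of the pre-processed recording; the Morlet computation uses MNE-Python, with short wavelets chosen so that they fit the 0.5 second segments:

import numpy as np
from mne.time_frequency import tfr_array_morlet

sfreq = 256                                  # assumed sampling rate (Hz)
eeg = np.random.randn(23, 60 * sfreq)        # placeholder pre-processed signal (60 s)

# Segment at 0.5 s -> 120 segments of shape (23, seg_len)
seg_len = int(0.5 * sfreq)
segments = eeg[:, :eeg.shape[1] // seg_len * seg_len]
segments = segments.reshape(23, -1, seg_len).transpose(1, 0, 2)   # (120, 23, seg_len)

# Morlet wavelet power at 40 frequencies from 1-40 Hz
freqs = np.arange(1, 41)
power = tfr_array_morlet(segments, sfreq=sfreq, freqs=freqs,
                         n_cycles=freqs / 4.0, output='power')
psd = power.mean(axis=-1)   # time-averaged power -> (120, 23, 40)
# 120 segments x 40 frequencies = 4800 EEG transformation signals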
[065] In another example scenario, the pre-processed EEG signal, with a total duration of 60 seconds, is segmented at a time instance of 1 second, resulting in 60 segmented pre-processed EEG signals. The entropy parameter is computed from each of the segmented plurality of pre-processed EEG signals, generating 60 EEG transformation signals.
At step 308 of the method (300), a plurality of topographic representations is generated for the plurality of EEG transformation signals in the topographic representation generator 206 at the pre-defined time instance. The plurality of topographic representations is generated using a topographic generation technique. Each topographic representation is a spatial representation of the plurality of EEG transformation signals with a pre-defined height and a pre-defined width.
[066] In an embodiment, the topographic generation technique comprises a Montage type setting technique, a channel mapping technique, a reference electrode setting technique, and an interpolation technique.
[067] The topographic representation is a two-dimensional representation, with the pre-defined width and the pre-defined height, wherein the pre-defined width and pre-defined height are determined based on the application complexity.
[068] The topographic representation is also alternatively referred to as a topographic map. In an example scenario, topographic representations are generated for the 4800 EEG transformation signals created using the PSD parameter. A standard channel mapping is applied using the International 10-20 system for electrode placement. Each of the plurality of EEG transformation signals is mapped to a topographic representation with a pre-defined height of 224 and a pre-defined width of 224 using a bilinear interpolation technique, for a system with medium application complexity.
[069] In another example scenario, topographic representations are generated for the 60 EEG transformation signals created using the entropy parameter. A standard channel mapping is applied using the International 10-20 system for electrode placement. Each of the plurality of EEG transformation signals (entropy-transformed signals) is mapped to a topographic representation with a pre-defined height of 224 and a pre-defined width of 224 using a bilinear interpolation technique, for a system with medium application complexity.
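A minimal sketch of this topographic generation step is given below; the flattened 2-D electrode coordinates and channel values are placeholders, and SciPy's scattered-data linear interpolation is used as a stand-in for the bilinear interpolation technique named above:

import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
electrode_xy = rng.uniform(-1.0, 1.0, size=(23, 2))  # placeholder 10-20 positions
values = rng.standard_normal(23)                     # one transformation parameter per channel

# Interpolate the 23 scattered electrode values onto a 224 x 224 grid
h = w = 224
grid_y, grid_x = np.mgrid[-1:1:complex(0, h), -1:1:complex(0, w)]
topo = griddata(electrode_xy, values, (grid_x, grid_y), method='linear')
topo = np.nan_to_num(topo)   # zero-fill points outside the electrode convex hull
# topo.shape -> (224, 224): one topographic representation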
[072] At step 310 of the method 300, an EEG video is created using the plurality of topographic representations in the EEG video creator 208. The EEG video is a spatio-temporal representation of the plurality of topographic representations with a pre-defined length. Further, the EEG video is created by an optimized arrangement of the plurality of topographic representations.
[073] In an embodiment, the EEG video is a 4-dimensional representation of the plurality of topographic representations, created by an optimized arrangement of the plurality of topographic maps into four dimensions such that the first dimension equals the number of segmented EEG signals, the second dimension is the pre-defined height (as in the topographic representation), the third dimension is the pre-defined width (as in the topographic representation), and the fourth dimension is the number of transformation parameters computed.
[074] In an example scenario, the EEG video is created from the topographic representations of the 4800 EEG transformation signals, created using the PSD parameter-based transformation technique. The created EEG video has a first dimension of 120, second and third dimensions equal to 224, and a fourth dimension of 40. The first dimension is also alternatively referred to as the number of frames; therefore the EEG video consists of 120 frames, wherein each frame has dimensions 224 x 224 x 40, which is represented as [120, 224, 224, 40].
[075] In another example scenario, the EEG video is created from the topographic representations of the 60 EEG transformation signals, created using the entropy parameter-based transformation technique. The created EEG video has 60 frames, wherein each frame has dimensions 224 x 224 x 3, which is represented as [60, 224, 224, 3].
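The 4-dimensional arrangement described above can be sketched as follows, using placeholder topographic maps with the shapes of the PSD example; the frame-first memory layout is an assumed convention:

import numpy as np

n_frames, h, w, n_params = 120, 224, 224, 40          # PSD example dimensions
topo_maps = np.random.rand(n_frames, n_params, h, w)  # placeholder maps, one per segment/parameter

# Arrange into the 4-D EEG video: [frames, height, width, parameters]
eeg_video = topo_maps.transpose(0, 2, 3, 1)
assert eeg_video.shape == (120, 224, 224, 40)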
[077] At step 312 of the method (300), a set of EEG-keyframes is extracted from the EEG video in the key-frame extractor 210. The EEG-keyframes are a sub-set of the EEG video extracted based on a keyframe extraction technique.
[078] In an embodiment, the keyframe extraction techniques comprise a clustering technique, a regular sampling technique and an inter-frame similarity technique.
[079] In an example scenario, the EEG-keyframes are extracted from the PSD-based EEG video using the inter-frame similarity technique. For each frame in the EEG video, the structural similarity with the next consecutive frame is computed, and frames with similarity below a threshold are selected as EEG-keyframes.
[080] In another example scenario, the EEG-keyframes are extracted from the entropy parameter-based EEG video using the clustering technique. All the frames in the EEG video are clustered into a pre-defined number of clusters based on the entropy value, and the center of each cluster is selected as an EEG-keyframe.
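A minimal sketch of the inter-frame similarity technique is given below, using the structural similarity index from scikit-image on a placeholder processed EEG video; the similarity threshold is a hypothetical value:

import numpy as np
from skimage.metrics import structural_similarity as ssim

video = np.random.rand(120, 224, 224, 3).astype(np.float32)  # placeholder processed EEG video
threshold = 0.7                                              # hypothetical similarity cut-off

keyframes = []
for t in range(len(video) - 1):
    # Structural similarity between consecutive frames
    score = ssim(video[t], video[t + 1], channel_axis=-1, data_range=1.0)
    if score < threshold:            # a large change marks an informative frame
        keyframes.append(video[t])
keyframes = np.stack(keyframes) if keyframes else video[:1]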
[081] At step 314 of the method (300), a spatio-temporal deep network architecture is trained for the cognitive load assessment in the cognitive load assessment module 212. The spatio-temporal architecture is a feature extraction network comprising a multi-scale recurrent convolutional neural network. The spatio-temporal deep network architecture is trained using the EEG video and the set of EEG-keyframes.
[082] In an embodiment, the method for training the spatio-temporal deep network architecture 216 to obtain a trained spatio-temporal deep network architecture is explained using the flowchart 400 depicted in FIG.4A and FIG.4B. The training of the spatio-temporal deep network architecture comprises the following steps:
[083] At step 402 of the method 400, a processed EEG video is received at the spatio-temporal deep network architecture in the video processor 214. The processed EEG video is obtained by processing the EEG video using a video processing technique.
[084] In an embodiment, the video processing techniques include an auto-encoder technique and an intensity normalization technique. The video processing technique is selected based on the transformation signal.
[085] In an example scenario, the PSD EEG video is processed using the auto-encoder technique. This technique uses an auto-encoder model that reconstructs the input at the decoder output. The PSD EEG video is given as input to the encoder in the auto-encoder model. The encoder outputs a processed EEG video with 120 frames, each of dimensions 224 x 224 x 3, represented as [120, 224, 224, 3].
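By way of illustration and not limitation, a minimal Keras sketch of such an auto-encoder, whose encoder bottleneck reduces the 40 transformation channels to 3, is given below; the layer counts, filter sizes and activations are assumptions, not the disclosed architecture.

```python
# A minimal Keras auto-encoder sketch: the encoder compresses 40 channels
# to 3; the decoder reconstructs the 40-channel input (training target).
import tensorflow as tf
from tensorflow.keras import layers, Model

inp = layers.Input(shape=(224, 224, 40))
enc = layers.Conv2D(16, 3, padding="same", activation="relu")(inp)
enc = layers.Conv2D(3, 3, padding="same", activation="sigmoid")(enc)
dec = layers.Conv2D(16, 3, padding="same", activation="relu")(enc)
dec = layers.Conv2D(40, 3, padding="same")(dec)

autoencoder = Model(inp, dec)      # trained to reconstruct the input
encoder = Model(inp, enc)         # yields the 3-channel processed frames
autoencoder.compile(optimizer="adam", loss="mse")

# Each frame is encoded independently: [120,224,224,40] -> [120,224,224,3].
frames = tf.random.uniform((120, 224, 224, 40))
processed = encoder(frames)
```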
[086] In another example scenario, the entropy EEG video is processed using the intensity normalization technique. This technique normalizes the intensity values in the EEG video to vary between 0 and 1, and generates a processed EEG video with 60 frames, each of dimensions 224 x 224 x 3, represented as [60, 224, 224, 3].
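The intensity normalization step reduces to a min-max rescaling; a one-line illustrative sketch (with random stand-in data) is shown below.

```python
# Min-max intensity normalization to [0, 1]; variable names are illustrative.
import numpy as np

entropy_video = np.random.rand(60, 224, 224, 3).astype(np.float32) * 255.0
lo, hi = entropy_video.min(), entropy_video.max()
processed = (entropy_video - lo) / (hi - lo)   # intensities now in [0, 1]
```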
[087] The processed EEG video also has four dimensions, wherein the first dimension equals the number of segmented EEG signals, the second dimension is the pre-defined height, the third dimension is the pre-defined width, and the fourth dimension is 3, representing RGB channels namely red color intensity, green color intensity and blue color intensity.
[088] At step 404 of the method (400), a plurality of spatial features is extracted from the set of EEG-keyframes at the 2D-CL 218. The plurality of spatial features is extracted using a plurality of two-dimensional convolution layers.
[089] In an embodiment, the plurality of spatial features is extracted in several steps: each EEG-keyframe is passed through multiple two-dimensional convolution layers with leaky rectified linear unit activation, using multiple filters to capture the distinct properties in each EEG-keyframe, followed by a set of maxpool layers that reduce the spatial dimensions to finally extract the set of spatial features.
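A hedged Keras sketch of such a two-dimensional convolution branch is given below; the number of layers and filters are assumptions for illustration.

```python
# Illustrative 2-D convolution branch: Conv2D + LeakyReLU + MaxPool stages.
import tensorflow as tf
from tensorflow.keras import layers, Sequential

spatial_branch = Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, 3, padding="same"),
    layers.LeakyReLU(),
    layers.MaxPool2D(),                      # halves the spatial dimensions
    layers.Conv2D(64, 3, padding="same"),
    layers.LeakyReLU(),
    layers.MaxPool2D(),
])

keyframe = tf.random.uniform((1, 224, 224, 3))
spatial_features = spatial_branch(keyframe)   # shape (1, 56, 56, 64)
```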
[090] At step 406 of the method (400), a plurality of short-term spatio-temporal features is extracted from the set of EEG-keyframes and the processed EEG video at the 3D-CL 220. The plurality of short-term spatio-temporal features is extracted using a plurality of three-dimensional convolution layers.
[091] In an embodiment, the plurality of short-term spatio-temporal features is extracted in several steps: a short clip of frames of a pre-defined short duration is extracted around each EEG-keyframe and passed through a set of three-dimensional convolution layers with leaky rectified linear unit activation, using a set of multiple filters to capture the patterns across the frames, followed by a set of three-dimensional maxpool layers that reduce the spatial and temporal dimensions to finally extract the set of short-term spatio-temporal features.
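A hedged Keras sketch of the short-term three-dimensional convolution branch follows; the clip length of 8 frames and the filter counts are assumptions for illustration.

```python
# Illustrative short-term 3-D convolution branch over a short clip.
import tensorflow as tf
from tensorflow.keras import layers, Sequential

short_term_branch = Sequential([
    layers.Input(shape=(8, 224, 224, 3)),     # (frames, height, width, channels)
    layers.Conv3D(32, 3, padding="same"),
    layers.LeakyReLU(),
    layers.MaxPool3D(pool_size=(2, 2, 2)),    # halves temporal and spatial dims
    layers.Conv3D(64, 3, padding="same"),
    layers.LeakyReLU(),
    layers.MaxPool3D(pool_size=(2, 2, 2)),
])

clip = tf.random.uniform((1, 8, 224, 224, 3))
short_term_features = short_term_branch(clip)  # shape (1, 2, 56, 56, 64)
```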
[092] At step 408 of the method (400), a plurality of long-term spatio-temporal features is extracted from the set of EEG-keyframes and the processed EEG video at the 3D-CL 222 and the LSTM 224. The plurality of long-term spatio-temporal features is extracted using a plurality of three-dimensional convolution layers and a plurality of long short-term memory layers.
[093] In an embodiment, the plurality of long-term spatio-temporal features is extracted in several steps: a long clip of frames of a pre-defined long duration is extracted around each EEG-keyframe and passed through a set of three-dimensional convolution layers with multiple filters and leaky rectified linear unit activation; to capture the patterns across the frames, a set of two-dimensional convolutional long short-term memory (LSTM) layers retains the patterns learned over the pre-defined long duration; and a set of three-dimensional maxpool layers reduces the spatial and temporal dimensions to finally extract the set of long-term spatio-temporal features.
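A hedged Keras sketch of the long-term branch combining three-dimensional convolutions with a convolutional LSTM is given below; the clip length of 16 frames and the layer sizes are assumptions for illustration.

```python
# Illustrative long-term branch: Conv3D, then ConvLSTM2D to retain patterns
# over the long clip, then 3-D max pooling.
import tensorflow as tf
from tensorflow.keras import layers, Sequential

long_term_branch = Sequential([
    layers.Input(shape=(16, 224, 224, 3)),
    layers.Conv3D(32, 3, padding="same"),
    layers.LeakyReLU(),
    layers.MaxPool3D(pool_size=(2, 2, 2)),
    # ConvLSTM2D carries state across the clip's frames.
    layers.ConvLSTM2D(32, 3, padding="same", return_sequences=True),
    layers.MaxPool3D(pool_size=(2, 2, 2)),
])

clip = tf.random.uniform((1, 16, 224, 224, 3))
long_term_features = long_term_branch(clip)   # shape (1, 4, 56, 56, 32)
```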
[094] At step 410 of the method (400), a plurality of attention-weighted spatial features is extracted from the plurality of spatial features using the spatial attention unit 226. The spatial attention unit 226 comprises a plurality of two-dimensional convolution layers and a plurality of fully connected layers configured for extracting a plurality of attention-weighted spatial features from the plurality of spatial features.
[095] In an embodiment, the spatial attention unit consists of a two-dimensional convolution layer to compute the attention weights at each spatial location, a multiplication unit to compute the product of the attention weights with the spatial features, and an addition unit to add this product back to the spatial features to extract the attention-weighted spatial features.
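A minimal sketch of this residual spatial attention pattern follows: a convolution produces per-location weights, which multiply the features and are added back. The sigmoid gating and kernel size are assumptions for illustration.

```python
# Illustrative residual spatial attention: out = x + x * attn(x).
import tensorflow as tf
from tensorflow.keras import layers

def spatial_attention(features):
    """features: (batch, height, width, channels) spatial feature map."""
    attn = layers.Conv2D(1, 7, padding="same", activation="sigmoid")(features)
    return features + features * attn   # attention-weighted spatial features

x = tf.random.uniform((1, 56, 56, 64))
x_att = spatial_attention(x)            # same shape as x
```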
[096] At step 412 of the method 400, a plurality of attention-weighted spatio-temporal features is extracted from the plurality of short-term spatio-temporal features and the plurality of long-term spatio-temporal features using a spatio-temporal attention unit 228.
[097] The spatio-temporal attention unit 228 comprises a plurality of three-dimensional convolution layers and a plurality of fully connected layers configured for extracting a plurality of attention-weighted spatio-temporal features from the plurality of short-term spatio-temporal features and the plurality of long-term spatio-temporal features.
[098] In an embodiment, the spatio-temporal attention unit consists of a three-dimensional convolution layer to compute the attention weights at each spatial and temporal location, a multiplication unit to compute the product of the attention weights with the short-term spatio-temporal features and the long-term spatio-temporal features, and an addition unit to add this product back to the short-term and long-term spatio-temporal features to extract the attention-weighted spatio-temporal features.
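The analogous sketch for the spatio-temporal case uses a three-dimensional convolution to compute weights at each spatial and temporal location; again the gating and kernel size are illustrative assumptions.

```python
# Illustrative residual spatio-temporal attention over a 5-D feature volume.
import tensorflow as tf
from tensorflow.keras import layers

def spatio_temporal_attention(features):
    """features: (batch, frames, height, width, channels) feature volume."""
    attn = layers.Conv3D(1, 3, padding="same", activation="sigmoid")(features)
    return features + features * attn

x = tf.random.uniform((1, 4, 56, 56, 32))
x_att = spatio_temporal_attention(x)    # same shape as x
```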
[099] At step 414 of the method 400, the plurality of attention-weighted spatial features and the plurality of attention-weighted spatio-temporal features are pooled at a pre-defined temporal scale to obtain a plurality of video-level features at the pooler 230. The pooling is applied using one of a plurality of pooling techniques such as a concatenation technique, a global average pooling technique, an attention-based weighted pooling technique, and a recurrent neural network (RNN)-based pooling technique. The plurality of video-level features is utilized by the trained spatio-temporal deep network architecture for cognitive load assessment.
[0100] In an example scenario, pooling is applied using the concatenation technique, wherein the attention-weighted spatial features and the attention-weighted spatio-temporal features are concatenated along the last dimension to obtain the video-level features.
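A minimal sketch of concatenation pooling is shown below, assuming the two branch outputs have first been brought to a common spatial shape (an assumption here, since the alignment step is not detailed in this scenario).

```python
# Illustrative concatenation pooling along the channel (last) dimension.
import tensorflow as tf

spatial_feats = tf.random.uniform((1, 56, 56, 64))
spatio_temporal_feats = tf.random.uniform((1, 56, 56, 32))

video_level = tf.concat([spatial_feats, spatio_temporal_feats], axis=-1)
# video_level shape: (1, 56, 56, 96)
```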
[0101] Referring to FIG.3C, at step 316 of the method 300, a plurality of user EEG signals is received from a user for cognitive load assessment. The user’s cognitive load is assessed using the trained spatio-temporal deep network architecture for multiple norms in cognitive load assessment comprising a cognitive activity state and a quality of cognitive usage.
[0102] In an embodiment, the multiple norms in cognitive load assessment comprise validation using two levels of cognitive load assessment tasks: the first identifies whether the user is in a rest or an active state, and the second classifies the subject based on the count quality in a given arithmetic task. The multiple norms in cognitive load assessment comprise:
(a) a cognitive activity state – wherein the user EEG video is classified for the activity state as a “rest” state or an “active” state, and
(b) a quality of cognitive usage – wherein the user EEG video is classified for the quality of cognitive usage as “good” or “bad”.
[0103] During the cognitive load assessment of the user EEG signal, the system 200 works in the testing mode. In an embodiment, the method for the cognitive load assessment of the user’s EEG signals based on spatio-temporal analysis using the spatio-temporal deep network architecture 216 is explained using the flowchart 500 depicted in FIG.5A and FIG.5B. The processing of the user EEG signal, until the user EEG video and the user EEG-keyframes are obtained, remains the same as the processing of the plurality of EEG signals received from the plurality of subjects, using the same modules (202 to 210). The steps involved in the cognitive load assessment of the user EEG signal are explained below using FIG.5A and FIG.5B.
[0104] At step 502 of the method 500, a plurality of user Electroencephalography (EEG) signals is received from a user, via one or more hardware processors 104. The user EEG signals are collected using a pre-defined number of electrodes.
[0105] In an embodiment, the plurality of user EEG signals is collected using the pre-defined number of electrodes in a controlled environment from a user who is given an arithmetic task. The arithmetic task is carried out for four minutes, with the user’s EEG signal collected during the experiment.
[0106] At step 504 of the method 500, the plurality of user EEG signals is pre-processed at the pre-processor 202. The plurality of user EEG signals is pre-processed to obtain a plurality of pre-processed user EEG signals using a plurality of signal pre-processing techniques.
[0107] In an embodiment, the signal pre-processing techniques comprise a power line interference removal technique, a band pass filtering technique, an artifact removal technique and a bad channel identification technique.
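By way of illustration and not limitation, a SciPy sketch of two of these steps, a 50 Hz power-line notch removal and a 0.5-45 Hz band-pass filter, is given below; the sampling rate, cutoffs and filter orders are assumptions for illustration.

```python
# Hedged sketch of power-line interference removal and band-pass filtering.
import numpy as np
from scipy.signal import iirnotch, butter, filtfilt

fs = 250.0                                   # assumed sampling rate (Hz)
raw = np.random.randn(32, int(fs * 240))     # 32 channels, 4 minutes

# Notch filter at the (assumed) 50 Hz power-line frequency.
b_notch, a_notch = iirnotch(w0=50.0, Q=30.0, fs=fs)
notched = filtfilt(b_notch, a_notch, raw, axis=-1)

# Zero-phase 0.5-45 Hz band-pass (assumed EEG band of interest).
b_bp, a_bp = butter(4, [0.5, 45.0], btype="bandpass", fs=fs)
preprocessed = filtfilt(b_bp, a_bp, notched, axis=-1)
```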
[0108] At step 506 of the method 500, the set of pre-processed user EEG signals is transformed to obtain a plurality of user EEG transformation signals in the EEG signal transformer 204. The plurality of user EEG transformation signals is obtained by:
(a) segmenting the plurality of pre-processed user EEG signals at a pre-defined time instance and,
(b) applying a plurality of transformation techniques to the segmented plurality of pre-processed user EEG signals.
[0109] In an embodiment, the plurality of transformation techniques comprises computing at least one transformation parameter for the set of pre-processed user EEG signals. The transformation parameters comprise one of a plurality of statistical parameters, a plurality of power-based parameters, and a plurality of information theory parameters.
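A hedged sketch of computing one power-based parameter (a Welch PSD band power) and one information-theory parameter (a Shannon entropy) for a single channel segment follows; the segment length, band limits and bin count are assumptions for illustration.

```python
# Hedged sketch: one power-based and one information-theory parameter
# computed on a single 2-second channel segment.
import numpy as np
from scipy.signal import welch

fs = 250.0
segment = np.random.randn(int(fs * 2))       # assumed 2-second segment

# Power-based parameter: mean band power from the Welch PSD estimate
# (alpha band 8-13 Hz chosen purely as an example).
freqs, psd = welch(segment, fs=fs, nperseg=256)
band = (freqs >= 8.0) & (freqs <= 13.0)
alpha_power = psd[band].mean()

# Information-theory parameter: Shannon entropy of the amplitude histogram.
hist, _ = np.histogram(segment, bins=64)
p = hist / hist.sum()
entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
```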
[0110] At step 508 of the method 500, a plurality of user topographic representations is generated for the plurality of user EEG transformation signals in the topographic representation generator 206 at the pre-defined time instance. The plurality of user topographic representations is generated using a topographic generation technique. The user topographic representation is a spatial representation of the plurality of user EEG transformation signals with a pre-defined height and a pre-defined width.
[0111] In an embodiment, the topographic generation technique comprises a montage type setting technique, a channel mapping technique, a reference electrode setting technique, and an interpolation technique.
[0112] At step 510 of the method 500, a user EEG video is created using the plurality of user topographic representations in the EEG video creator 208. The user EEG video is a spatio-temporal representation of the plurality of user topographic representations with a pre-defined length. Further, the user EEG video is created by an optimized arrangement of the plurality of topographic representations.
[0113] In an embodiment, the user EEG video is created by an optimized arrangement of the plurality of topographic maps into four dimensions such that the first dimension equals the number of segmented user EEG signals, the second dimension is the pre-defined height, the third dimension is the pre-defined width, and the fourth dimension is the number of transformation parameters computed.
[0114] At step 512 of the method 500, a set of user EEG-keyframes is extracted from the user EEG video by the key-frame extractor 210. The user EEG-keyframes are a sub-set of the user EEG video extracted based on a keyframe extraction technique.
[0115] In an embodiment, the keyframe extraction techniques comprise a clustering technique, a regular sampling technique and an inter-frame similarity technique.
[0116] At step 514 of the method 500, the cognitive load of the user EEG video is assessed for the multiple norms at the cognitive load assessment module 212 using the video processor 214 and the spatio-temporal deep network architecture 216. The cognitive load assessment of the user’s EEG video comprises:
(a) classification of the user’s EEG video for the multiple norms using the trained spatio-temporal deep network architecture, and
(b) the classification is visualized by generating a plurality of activation maps representing a plurality of activation regions in the user’s brain.
[0117] In an embodiment, the multiple norms comprise:
(a) a cognitive activity state – wherein the user EEG video is classified for the activity state as a “rest” state or an “active” state, and
(b) a quality of cognitive usage – wherein the user EEG video is classified for the quality of cognitive usage as “good” or “bad”.
The plurality of activation maps representing a plurality of activation regions in the user’s brain is generated using one of a plurality of techniques such as a class activation map (CAM) generation technique and a gradient class activation map (Grad-CAM) technique.
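By way of illustration and not limitation, a hedged TensorFlow sketch of Grad-CAM is given below; the stand-in model, layer name `last_conv` and input shape are hypothetical placeholders for the trained network (or its spatial branch).

```python
# Hedged Grad-CAM sketch: gradient-weighted sum of the last conv feature maps.
import tensorflow as tf

# Hypothetical stand-in model; in practice this would be the trained network.
inp = tf.keras.Input(shape=(224, 224, 3))
conv = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu",
                              name="last_conv")(inp)
out = tf.keras.layers.Dense(2, activation="softmax")(
    tf.keras.layers.GlobalAveragePooling2D()(conv))
model = tf.keras.Model(inp, out)

def grad_cam(model, last_conv_name, frame, class_idx):
    """frame: (1, height, width, channels) input; returns a heat map."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(frame)
        score = preds[:, class_idx]
    grads = tape.gradient(score, conv_out)
    weights = tf.reduce_mean(grads, axis=(1, 2))        # global-average gradients
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)  # weighted feature sum
    return tf.nn.relu(cam) / (tf.reduce_max(cam) + 1e-8)

heatmap = grad_cam(model, "last_conv",
                   tf.random.uniform((1, 224, 224, 3)), class_idx=1)
```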
[0118] Thus, the cognitive load assessment, comprising the analysis of (a) the cognitive load for multiple norms (cognitive activity state and quality of cognitive usage) and (b) the plurality of activation maps, is determined and displayed on the I/O interface(s) 106.
[0119] EXPERIMENTS:
[0120] An experiment has been conducted wherein the disclosed spatio-temporal deep network architecture is implemented using TensorFlow libraries and the experiments are performed on an NVIDIA Tesla V100 GPU (for experimental purposes only). The spatio-temporal deep network architecture is trained for 200 epochs using an Adam optimizer with a learning rate of 0.0001 and a binary cross-entropy loss. A tenfold cross-validation is performed for both levels of classification. The classification performance is evaluated by computing three metrics: accuracy, sensitivity and specificity, and is compared with existing state-of-the-art approaches.
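A minimal sketch of the stated training configuration is shown below; the placeholder model, data arrays and batch size are assumptions for illustration, while the optimizer, learning rate, loss and epoch count follow the description above.

```python
# Hedged sketch of the stated training setup: Adam (lr=0.0001), binary
# cross-entropy, 200 epochs; model and data are illustrative placeholders.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid")])

x_train = np.random.rand(16, 224, 224, 3).astype("float32")
y_train = np.random.randint(0, 2, size=(16, 1))

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=200, batch_size=8)
```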
[0121] The cross-validation performance for both levels of classification is summarized in Table 1. The disclosed technique (also referred to as EEG Topo net) with a PSD-based EEG video achieves the best results, with an accuracy of 98.3% for the cognitive activity state classification and 95% for the cognitive usage quality classification. The ResNet with maxpool achieves a comparable performance of 97.5% for the cognitive activity state classification. In comparison, a) the entropy-based EEG video does not perform well in both stages of classification; b) the ResNet + maxpool has the least number of training parameters but does not perform well for cognitive usage quality classification; c) the ResNet + LSTM has a large number of training parameters but does not impress at either task; and d) the convolution-based temporal aggregation achieves the second-best performance at the cognitive usage quality assessment with 1.8 million trainable parameters. The proposed approach with PSD has twice the number of parameters but gives the highest performance for both tasks and shows an improvement over the state-of-the-art by a considerable margin of 6.8% for cognitive usage quality classification.
[0122] A unique merit of the disclosed approach is that, unlike conventional approaches which use signal analysis and are based on a number of features, the proposed approach operates on only one feature: either a PSD-based EEG video or an entropy-based EEG video. Conventional techniques employ multiple hand-crafted features specific to each task. The baseline performance using handcrafted features and a classical machine learning classifier achieves an accuracy of 99.9% at the cognitive activity state task and 94.2% at the cognitive usage quality task. In contrast, the disclosed approach achieves comparable performance with a single feature and end-to-end training. Further, the same framework can be used with a variety of inputs and can target multiple applications, highlighting the horizontal nature of the proposed method.
[0123] FIG.6A, FIG.6B and FIG.6C illustrate a frame in an entropy-based processed EEG video of a user at a rest state, wherein the frame comprises a blue channel (FIG.6A), a green channel (FIG.6B) and a red channel (FIG.6C). From FIG.6A, FIG.6B and FIG.6C, it can be observed that the intensity variation is mostly uniform across the entire frame, and the same is observed in most other frames in the entropy-based EEG video of a user at a rest state.
[0124] FIG.6D, FIG.6E and FIG.6F illustrate the frame in the entropy-based processed EEG video of a user at an active state, wherein the frame comprises a blue channel (FIG.6D), a green channel (FIG.6E) and a red channel (FIG.6F). From FIG.6D, FIG.6E and FIG.6F, it can be observed that there is more intensity variation, especially around the boundaries of the frames, for a user in an active state.
[0125] FIG.7A, FIG.7B and FIG.7C illustrate a frame in a power spectral density (PSD)-based processed EEG video of a user at a rest state, wherein the frame comprises a blue channel (FIG.7A), a green channel (FIG.7B) and a red channel (FIG.7C). From FIG.7A, FIG.7B and FIG.7C, it can be observed that the intensity variation is mostly uniform across the entire frame and the intensity in the frame is mostly white with very few dark regions, for a user in a rest state.
[0126] FIG.7D, FIG.7E and FIG.7F illustrate the frame in the power spectral density (PSD)-based processed EEG video of a user at an active state, wherein the frame comprises a blue channel (FIG.7D), a green channel (FIG.7E) and a red channel (FIG.7F). From FIG.7D, FIG.7E and FIG.7F, it can be observed that there is more intensity variation across the frame and multiple darker regions indicating brain activity, for a user in an active state.
[0127] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0128] The embodiments of present disclosure herein provide a solution to address a problem of analysis of EEG signals for cognitive load assessment. The cognitive load assessment in real-time helps in preventing burnout and prolonged stress, and in ensuring safety in high mental load working environments. The state-of-the-art techniques perform offline processing to analyze EEG signals to make a post facto assessment, and the existing techniques mostly learn the spatial and temporal features sequentially. The disclosure is a spatio-temporal analysis of EEG signals for cognitive load assessment using a spatio-temporal deep network architecture. The EEG signals are processed based on several techniques to obtain a topographic representation of the EEG signals and an EEG video. The topographic representation and the EEG videos are used to train the spatio-temporal deep network architecture for spatio-temporal analysis of EEG signals for cognitive load assessment. The cognitive load assessment includes analysis of multiple norms, comprising a cognitive activity state and a quality of cognitive usage.
[0129] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[0130] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0131] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[0132] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0133] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.