Abstract: The present disclosure addresses the technical problem of noise originating from camera sensor noise, inevitable eye blinking, facial poses and expression variations, camera parameters, face movements, respiration and various environmental factors during heart rate monitoring of a person using facial videos. A system and method for accurate heart rate monitoring of a person using his/her facial videos is provided. The system provides an adaptive temporal signal selection mechanism which identifies and removes facial areas affected by facial expressions. The system further provides a post-processing mechanism which performs heart rate monitoring by utilizing face reconstruction and quality. The system estimates a local heart rate and a global heart rate of the person and corrects the local HR, if it is erroneous, based on a predefined condition.
Claims:
1. A method for monitoring heart rate of a person using a facial video of the person, the method comprising processor-implemented steps of:
capturing the facial video of the person using a camera (202);
dividing the facial video into a plurality of overlapping windows of a predefined time interval, wherein each window has a plurality of frames (204);
detecting a region of interest (ROI) from a first frame of the facial video (206);
extracting a temporal signal from the region of interest, wherein the temporal signal depicts facial variations due to the flow of blood (208);
applying a band pass filter to remove noise from the extracted temporal signal (210);
removing a part of the temporal signal corresponding to facial expressions of the person from the filtered temporal signal (212);
extracting a pulse signal from the temporal signal using a blind source separation methodology (214);
estimating a local heart rate (HR) of the person corresponding to each of the plurality of windows using an FFT analysis of the extracted pulse signal (216);
estimating a global heart rate (HR) of the person by extracting the temporal signal for the full facial video (218);
checking if the local HR is erroneous based on a predefined condition (220);
correcting the local HR if the local HR is erroneous (222); and
monitoring the heart rate of the person using either the local HR or the corrected HR depending on the predefined condition (224).
2. The method of claim 1, wherein the step of detecting the ROI further comprises:
determining a plurality of rectangular blocks by applying a Viola-Jones face detector on a frame of the facial video, wherein the rectangular blocks comprise a plurality of pixels;
determining and removing non-informative pixels out of the plurality of pixels;
determining and removing the pixels out of the plurality of pixels corresponding to eye blinks by applying a Viola-Jones eye detector;
performing a morphological erosion operation to remove face boundary pixels out of the plurality of pixels; and
dividing the resulting area into a plurality of non-overlapping square blocks, wherein each of the plurality of non-overlapping square blocks contains only skin pixels, referred to as the ROI.
3. The method of claim 1, wherein a quality of the pulse signal is defined using a peak signal to noise ratio (PSNR).
4. The method of claim 1, further comprising the step of determining the temporal signal corresponding to the facial expression of the person as follows:
calculating a confidence score by dividing the standard deviation by the mean of the temporal signal corresponding to each ROI; and
discarding the top 20% of temporal signals having a large confidence score.
5. The method of claim 1, wherein the predefined condition is that an absolute difference between the global HR and the local HR is greater than a first predefined threshold and a quality of the local HR is less than a second predefined threshold.
6. The method of claim 1, wherein the band pass filter is applied to remove the temporal signal outside a range of 0.7 Hz to 4 Hz.
7. The method of claim 1, further comprising the step of normalizing the temporal signal using z-score normalization to maintain a shape of the temporal signal.
8. The method of claim 1, further comprising the removal of non-stationary trends in the temporal signal using a de-trending filter.
9. A system for monitoring heart rate of a person using a facial video of the person, the system comprises:
a camera (102) for capturing the facial video of the person;
a memory (104); and
a processor (106) in communication with the memory, the processor further comprises:
a segmentation module (108) dividing the facial video into a plurality of overlapping windows of a predefined time interval, wherein each window has a plurality of frames;
an ROI detection module (110) detecting a region of interest (ROI) from a first frame of the facial video;
a signal extraction module (112) extracting a temporal signal from the region of interest, wherein the temporal signal depicts facial variations due to the flow of blood;
a filtering module (114) configured to:
apply a band pass filter to remove a noise from the extracted temporal signal, and
remove a part of temporal signal corresponding to facial expressions of the person from the filtered temporal signal;
a pulse extraction module (116) extracting a pulse signal from the temporal signal using a blind source separation methodology;
a local HR estimation module (118) estimating a local heart rate (HR) of the person corresponding to each of the plurality of windows using an FFT analysis of the extracted pulse signal;
a global HR estimation module (120) estimating a global heart rate (HR) of the person by extracting the temporal signal for the full facial video;
a checking module (122) for checking if the local HR is erroneous based on a predefined condition;
an error correction module (124) correcting the local HR if the local HR is erroneous; and
a monitoring module (126) for monitoring the heart rate of the person using either the local HR or the corrected HR depending on the predefined condition.
Description:
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
METHOD AND SYSTEM FOR MONITORING HEART RATE OF A PERSON USING FACIAL VIDEOS
Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
[001] The embodiments herein generally relate to the field of heart rate monitoring, and, more particularly, to a method and system for monitoring the heart rate of a person using a facial video of the person.
BACKGROUND
[002] Heart rate (HR) monitoring is indispensable for several real-world scenarios, especially when acquired in a non-contact manner. Doctors extensively utilize HR monitoring for diagnosing human illness. The applicability of HR monitoring in the evaluation of human physiological and pathological parameters has elicited attention in various fields.
[003] Existing HR monitoring approaches based on electrocardiography (ECG) and photo-plethysmography (PPG) require skin contact, due to which they are not user friendly, cannot perform long-term monitoring, and are unable to analyze neonates, persons with damaged skin, and humans while sleeping or exercising. Moreover, they require bulky, expensive and dedicated sensors, which further limits their applicability. This has motivated the design of HR monitoring systems using face videos, which allow acquisition from a cheap and portable sensor in a non-contact manner.
[004] The idea behind HR estimation using face videos is that blood traverses the human body and its flow varies according to the heart beat. These blood variations manifest as changes in facial color and head motion. Both the color and motion variations are subtle and hence imperceptible to the human eye, but they can be perceived using a camera.
[005] Existing face-video based HR estimation systems consist of four stages, viz., region of interest (ROI) detection, temporal signal extraction, pulse extraction and HR estimation. The temporal signals extracted from face videos contain the pulse signal along with noise originating from camera sensor noise; inevitable eye blinking; facial poses and expression variations; camera parameters (e.g., auto-focus can change illumination); face movements; respiration; and environmental factors (e.g., illumination variations). These issues make correct HR monitoring using face videos challenging.
SUMMARY
[006] The following presents a simplified summary of some embodiments of the disclosure in order to provide a basic understanding of the embodiments. This summary is not an extensive overview of the embodiments. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the embodiments. Its sole purpose is to present some embodiments in a simplified form as a prelude to the more detailed description that is presented below.
[007] In view of the foregoing, an embodiment herein provides a system for monitoring heart rate of a person using a facial video of the person. The system comprises a camera, a memory and a processor in communication with the memory. The camera captures the facial video of the person. The processor further comprises a segmentation module, an ROI detection module, a signal extraction module, a filtering module, a pulse extraction module, a local HR estimation module, a global HR estimation module, a checking module, an error correction module and a monitoring module. The segmentation module divides the facial video into a plurality of overlapping windows of a predefined time interval, wherein each window has a plurality of frames. The ROI detection module detects a region of interest (ROI) from a first frame of the facial video. The signal extraction module extracts a temporal signal from the region of interest, wherein the temporal signal depicts facial variations due to the flow of blood. The filtering module is configured to apply a band pass filter to remove noise from the extracted temporal signal, and to remove a part of the temporal signal corresponding to facial expressions of the person from the filtered temporal signal. The pulse extraction module extracts a pulse signal from the temporal signal using a blind source separation methodology. The local HR estimation module estimates a local heart rate (HR) of the person corresponding to each of the plurality of windows using an FFT analysis of the extracted pulse signal. The global HR estimation module estimates a global heart rate (HR) of the person by extracting the temporal signal for the full facial video. The checking module checks if the local HR is erroneous based on a predefined condition. The error correction module corrects the local HR if it is erroneous. The monitoring module monitors the heart rate of the person using either the local HR or the corrected HR depending on the predefined condition.
[008] In another aspect, the embodiment herein provides a method for monitoring heart rate of a person using a facial video of the person. Initially, the facial video of the person is captured using a camera. In the next step, the facial video is divided into a plurality of overlapping windows of a predefined time interval, wherein each window has a plurality of frames. In the next step, a region of interest (ROI) is detected from a first frame of the facial video. Later, a temporal signal is extracted from the region of interest, wherein the temporal signal depicts facial variations due to the flow of blood. In the next step, a band pass filter is applied to remove noise from the extracted temporal signal. In the next step, a part of the temporal signal corresponding to facial expressions of the person is removed from the filtered temporal signal. In the next step, a pulse signal is extracted from the temporal signal using a blind source separation methodology. A local heart rate (HR) of the person corresponding to each of the plurality of windows is then estimated using an FFT analysis of the extracted pulse signal. A global heart rate (HR) of the person is then estimated by extracting the temporal signal for the full facial video. In the next step, it is checked if the local HR is erroneous based on a predefined condition. In the next step, the local HR is corrected if it is erroneous. And finally, the heart rate of the person is monitored using either the local HR or the corrected HR depending on the predefined condition.
[009] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in a computer readable medium and so executed by a computing device or processor, whether or not such computing device or processor is explicitly shown.
BRIEF DESCRIPTION OF THE DRAWINGS
[010] The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
[011] Fig. 1 illustrates a block diagram of a system for monitoring heart rate of a person using a facial video of the person according to an embodiment of the present disclosure;
[012] Fig. 2 shows a schematic flow diagram of the system for monitoring heart rate of the person using the facial video of the person according to an embodiment of the disclosure;
[013] Figs. 3A and 3B show the importance of the adaptive temporal signal selection according to an embodiment of the disclosure;
[014] Fig. 4 shows examples of temporal signals discarded and utilized in the proposed adaptive temporal signal selection according to an embodiment of the disclosure;
[015] Fig. 5 shows examples of incorrect HR estimation when face reconstruction is avoided and when face reconstruction is used, according to an embodiment of the disclosure;
[016] Figs. 6A and 6B are a flowchart illustrating the steps involved in monitoring heart rate of the person using the facial video of the person according to an embodiment of the present disclosure; and
[017] Fig. 7 shows BA plots of the HR estimation according to an embodiment of the disclosure.
DETAILED DESCRIPTION
[018] The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
[019] Referring now to the drawings, and more particularly to Fig. 1 through Fig. 7, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[020] According to an embodiment of the disclosure, a system 100 for monitoring the heart rate of a person using a facial video of the person is shown in Fig. 1. A schematic flow diagram of the system 100 is shown in Fig. 2. The system 100 provides an adaptive temporal signal selection mechanism which identifies and removes facial areas affected by facial expressions. Further, the system 100 provides a post-processing mechanism to perform HR monitoring by utilizing face reconstruction and quality.
[021] According to an embodiment of the disclosure, the system 100 further comprises a camera 102, a memory 104 and a processor 106 as shown in the block diagram of Fig. 1. The processor 106 works in communication with the memory 104. The processor 106 further comprises a plurality of modules. The plurality of modules accesses the set of algorithms stored in the memory 104 to perform specific tasks. The processor 106 further comprises a segmentation module 108, an ROI detection module 110, a signal extraction module 112, a filtering module 114, a pulse extraction module 116, a local heart rate (HR) estimation module 118, a global HR estimation module 120, a checking module 122, an error correction module 124 and a monitoring module 126.
[022] According to an embodiment of the disclosure, the facial video is captured using the camera 102. The system 100 can use any camera available in the prior art.
[023] According to an embodiment of the disclosure, the system 100 comprises the segmentation module 108. The segmentation module 108 is configured to divide or segment the facial video into a plurality of overlapping windows of a predefined time interval. In HR monitoring, several local HRs are estimated at multiple contiguous time intervals and they are eventually concatenated. In an example, the facial video is divided into multiple overlapping windows such that the width of each window is chosen to be 250 frames and the overlap between successive windows is 150 frames.
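By way of illustration only (not part of the original specification), the windowing above can be sketched in a few lines of Python; the helper name overlapping_windows is hypothetical, and the 250-frame width and 150-frame overlap follow the example in the preceding paragraph.
```python
# Illustrative sketch: overlapping windows of 250 frames with a 150-frame
# overlap, i.e., successive windows start 100 frames apart.
def overlapping_windows(num_frames, width=250, overlap=150):
    step = width - overlap
    return [(s, s + width) for s in range(0, num_frames - width + 1, step)]

# Example: a 54-second video at 28 fps has 1512 frames.
print(overlapping_windows(1512))  # [(0, 250), (100, 350), ..., (1200, 1450)]
```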
[024] According to an embodiment of the disclosure, the system 100 further comprises the region of interest (ROI) detection module 110. The ROI detection module 110 is configured to detect the region of interest (ROI) from the first frame of the facial video. The ROI consists largely of skin pixels. The rectangular block containing the face is determined by applying a Viola-Jones face detector on the first frame.
[025] Apart from the facial skin pixels, the block also contains pixels belonging to the background, eye areas, hair and moustaches, which are devoid of any HR information. Such non-informative pixels are detected by applying skin detection and are subsequently removed from the ROI. It is observed that the eye areas are influenced by inevitable eye blinking, which results in spurious HR estimation. Hence, these areas are detected by applying a Viola-Jones eye detector and removed. Another prominent factor for erroneous HR estimation is the unavoidable slight variation in face boundary pixels that tremendously alters the temporal signals. Hence, a morphological erosion operation is performed to remove the boundary pixels. The resulting area is divided into multiple non-overlapping square blocks and each block containing only skin pixels is considered a region of interest (ROI).
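As an editorial sketch of this pipeline (not part of the original specification), OpenCV's Haar-cascade implementation of Viola-Jones can be used as below. The YCrCb threshold stands in for the specification's uncited skin-detection method, and detect_roi_blocks is a hypothetical helper that assumes at least one face is visible.
```python
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_roi_blocks(frame_bgr, block=20):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # 1. Rectangular face block via the Viola-Jones face detector
    #    (assumes at least one face is present in the frame).
    x, y, w, h = face_cascade.detectMultiScale(gray, 1.3, 5)[0]
    face, fgray = frame_bgr[y:y+h, x:x+w], gray[y:y+h, x:x+w]
    # 2. Keep skin pixels only; a YCrCb threshold stands in for the
    #    (uncited) skin-detection method of the specification.
    mask = cv2.inRange(cv2.cvtColor(face, cv2.COLOR_BGR2YCrCb),
                       (0, 133, 77), (255, 173, 127))
    # 3. Remove eye areas affected by inevitable blinking.
    for ex, ey, ew, eh in eye_cascade.detectMultiScale(fgray):
        mask[ey:ey+eh, ex:ex+ew] = 0
    # 4. Morphological erosion drops unstable face-boundary pixels.
    mask = cv2.erode(mask, np.ones((5, 5), np.uint8))
    # 5. Keep non-overlapping square blocks made entirely of skin pixels.
    return [(x + bx, y + by, block)
            for by in range(0, h - block + 1, block)
            for bx in range(0, w - block + 1, block)
            if np.all(mask[by:by+block, bx:bx+block] > 0)]
```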
[026] According to an embodiment of the disclosure, the system 100 further comprises the signal extraction module 112. The signal extraction module 112 is configured to extract a temporal signal from the region of interest. The temporal signal depicts the subtle facial variations due to blood flow. The Eulerian philosophy is used rather than the Lagrangian philosophy for the extraction because: i) features available in the ROI are less discriminatory, which makes Lagrangian temporal signals error-prone; and ii) signal extraction using the Lagrangian philosophy is highly time consuming as compared to extraction using the Eulerian philosophy. Amongst the RGB color channels, the strongest plethysmographic signal is present in the green channel. Hence, the temporal signal is defined by the mean green value of the pixels in a block. Mathematically, the temporal signal corresponding to the ith ROI is given by:
$T_i(j) = \frac{1}{n}\sum_{k=1}^{n} G_i(j,k)$, j = 1, …, f          (1)
where $G_i(j,k)$ denotes the green channel intensity of the kth pixel belonging to the ith ROI of the jth frame; f is the number of frames; and n is the number of pixels in the ROI.
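A minimal sketch of Equation (1), assuming the video is available as an (f, H, W, 3) RGB array and each ROI is a square block (x, y, size); temporal_signal is a hypothetical helper.
```python
import numpy as np

def temporal_signal(frames, roi):
    # Mean green-channel intensity of the ROI pixels in each frame: T_i(j).
    x, y, b = roi
    green = frames[:, y:y+b, x:x+b, 1].astype(float)
    return green.reshape(len(frames), -1).mean(axis=1)
```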
[027] According to an embodiment of the disclosure, the system 100 comprises the filtering module 114. Each temporal signal contains heart information along with noise due to facial expressions, respiration, illumination and facial movements. The filtering module 114 is configured to mitigate the above-mentioned sources of noise.
[028] A band pass filter is used to remove noise from the extracted temporal signal. Normally, the heart beats at a rate of 42 to 240 beats per minute (BPM). Thus, a band-pass filter is applied, which removes the frequencies outside the range of 0.7 to 4 Hz. It is useful for removing the noise due to respiration, which lies outside this frequency range. Further, it is observed that changes in illumination can introduce non-stationary trends in the temporal signals. Such trends are alleviated by a de-trending filter.
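The band-pass stage can be sketched with SciPy as below. The fourth-order Butterworth design and the linear detrend are editorial assumptions: the specification names neither a filter design nor a particular de-trending method, so the actual filters may differ.
```python
from scipy.signal import butter, filtfilt, detrend

def clean_signal(t, fps=28.0):
    # Band-pass 0.7-4 Hz (42-240 BPM); 4th-order Butterworth, zero-phase.
    b, a = butter(4, [0.7 / (fps / 2), 4.0 / (fps / 2)], btype="band")
    filtered = filtfilt(b, a, t)
    # Linear detrending as a simple stand-in for the de-trending filter.
    return detrend(filtered)
```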
[029] The filtering module 114 is further configured to remove a part of the temporal signal corresponding to facial expressions of the person from the band-pass filtered temporal signal. Facial expressions introduce large variations in the temporal signals. For instance, when the person smiles, the temporal signals obtained from the mouth area are affected while the temporal signals obtained from the nose and head are least affected.
[030] For correct HR estimation, the system 100 removes those temporal signals that can be affected by the facial expressions. The importance of avoiding the affected temporal signals can be visualized from Fig. 3. It can be observed from Fig. 3A that the estimated local HR deviates significantly from the actual local HR when the temporal signals affected by expressions are considered. But it can be seen from Fig. 3B that when the affected temporal signals are not considered, the estimated and actual HR are the same. The affected temporal signals are detected using the intuition that facial expressions tremendously alter the amplitude of the affected temporal signals and thereby increase the standard deviation. This is leveraged by defining the confidence of whether or not a signal is affected by the expression in the following manner:
$c_i = \frac{\mathrm{std}(T_i)}{\mathrm{mean}(T_i)}$          (2)
where $c_i$ denotes the confidence of the temporal signal $T_i$, while std and mean are the standard deviation and mean operators respectively. In Equation (2), the confidence is normalized by dividing by the mean because different ROIs have different signal strengths depending upon the facial structure and the blood flow mechanism. The top 20% of the temporal signals having large confidence values are discarded to reduce the problems due to facial expressions. Fig. 4 shows some examples of temporal signals that are discarded and utilized according to the confidence, c.
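A minimal sketch of Equation (2) and the 20% discard rule, assuming the raw (positive-mean) temporal signals are stacked row-wise in a NumPy array; select_signals is a hypothetical helper.
```python
import numpy as np

def select_signals(signals):
    # signals: (num_rois, f) array of raw (positive-mean) temporal signals.
    c = signals.std(axis=1) / signals.mean(axis=1)  # confidence c_i of Eq. (2)
    keep = c <= np.quantile(c, 0.80)                # discard the top 20%
    return signals[keep]
```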
[031] According to an embodiment of the disclosure, the system 100 comprises the pulse extraction module 116. The pulse extraction module 116 extracts a pulse signal from the filtered temporal signal using a blind source separation methodology. Each temporal signal contains pulse information along with noise. Blind source separation is used to estimate the source signal, i.e., the pulse signal. Blood flow in the face depends on the bone and artery structure of the face, which results in variable amplitudes of the temporal signals. Hence, the temporal signals are normalized using z-score normalization to maintain the signal shape. The normalized temporal signal, T, can be written as:
$T = AP + e$          (3)
where P is the original pulse signal; e is noise; and A is the transformation matrix. The objective of blind source separation is to estimate the pulse signal from the temporal signals, such that the original and estimated pulse signals are similar. That is shown in Equation (4):
$\hat{P} = BT = B(AP + e) = CP + Be$          (4)
where $\hat{P}$ is the estimated pulse; B is an appropriate transformation matrix; and C = BA. It can be observed from Equation (4) that the estimated and original pulse can be similar when the magnitude of C is approximately 1. Higher order cumulants are used to impose the shape constraint, viz., that the estimated pulse signal is similar to the original pulse signal. It has been shown that cumulants up to the fourth order are sufficient for this purpose. Cumulants of order higher than four are sensitive to outliers and are hence avoided. Hence, the objective function is defined as:
$B = \arg\max_{B} \left| \mathrm{kurt}(BT) \right|$, with $\mathrm{kurt}(y) = E\left[ y y^{*} y y^{*} \right] - 2\left( E\left[ y y^{*} \right] \right)^{2} - \left| E\left[ y y \right] \right|^{2}$          (5)
where $(\cdot)^{*}$, $\left| \cdot \right|$ and $\mathrm{kurt}(\cdot)$ denote the conjugate, absolute and kurtosis operations respectively. The solution of Equation (5) is obtained using kurtosis-based maximization because it provides global convergence in a time-efficient manner.
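The pulse-extraction stage can be sketched as below. This is an editorial illustration: scikit-learn's FastICA is used as a stand-in for the kurtosis-based maximization of Equation (5), with the separated component of largest absolute kurtosis taken as the pulse, and the component count of 5 is an arbitrary illustrative choice.
```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

def extract_pulse(signals):
    # z-score normalization preserves signal shape across ROIs (Eq. (3) setting).
    z = (signals - signals.mean(axis=1, keepdims=True)) / signals.std(axis=1, keepdims=True)
    # Blind source separation; the source with the largest absolute kurtosis
    # is taken as the pulse, mirroring the objective of Equation (5).
    ica = FastICA(n_components=min(5, len(z)), random_state=0)
    sources = ica.fit_transform(z.T).T              # (n_components, f)
    return sources[np.argmax(np.abs(kurtosis(sources, axis=1)))]
```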
[032] According to an embodiment of the disclosure, the system 100 further comprises the local heart rate (HR) estimation module 118. The local HR estimation module 118 estimates a local heart rate (HR) of the person corresponding to each of the plurality of windows using an FFT analysis of the extracted pulse signal. The FFT is applied to transform the pulse signal into the frequency domain, using which the local HR, h, is evaluated by:
$h = f \times 60$          (6)
where f is the frequency containing the maximum amplitude in the frequency spectrum. The quality of the pulse signal is defined using the peak signal to noise ratio, PSNR. Usually, the pulse spectrum contains large amplitudes near the HR frequency and low amplitudes at the remaining frequencies, which correspond to noise. Thus, the signal in the PSNR corresponds to the amplitudes of the local HR frequency and a few of its neighbors, while the noise is composed of the amplitudes at the remaining frequencies. That is, the quality, q, is given by:
$q = \frac{\sum_{m = m_h - n_e}^{m_h + n_e} S_e(m)^2}{\sum_{m} S_e(m)^2 \; - \; \sum_{m = m_h - n_e}^{m_h + n_e} S_e(m)^2}$          (7)
where $S_e$ is the frequency spectrum; $n_e$ is the neighborhood size; and $m_h$ provides the position of the local HR frequency. The neighborhood size is chosen such that, if h is the local HR, then the signal constitutes the amplitudes corresponding to [h − 5, h + 5].
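A minimal sketch of Equations (6) and (7), assuming a 28 fps pulse window and the ±5 BPM neighborhood mentioned above; local_hr_and_quality is a hypothetical helper.
```python
import numpy as np

def local_hr_and_quality(pulse, fps=28.0):
    spec = np.abs(np.fft.rfft(pulse))
    freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)          # plausible HR band
    m_h = np.argmax(np.where(band, spec, 0.0))      # peak position in spectrum
    hr = freqs[m_h] * 60.0                          # h = f * 60 (Eq. (6))
    near = np.abs(freqs * 60.0 - hr) <= 5.0         # [h-5, h+5] BPM neighborhood
    signal, noise = np.sum(spec[near] ** 2), np.sum(spec[~near] ** 2)
    return hr, signal / noise                       # quality q (Eq. (7))
```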
[033] The system 100 reduces the problems due to facial expressions, but residual noise can still deteriorate the heart rate (HR) estimation. Moreover, out-of-plane deformations can result in spurious HR estimation. Thus, such cases are determined and a better local HR estimation algorithm is applied to handle these problems. The HR of the full pulse signal, referred to as the global HR, is estimated and used to determine the erroneous cases. According to an embodiment of the disclosure, the global HR estimation module 120 estimates the global heart rate (HR) of the person by extracting the temporal signal for the full facial video.
[034] According to an embodiment of the disclosure, the system 100 further comprises the checking module 122. The checking module 122 is configured to check if the local HR is erroneous based on a predefined condition. It is intuitive that the HR derived from the full pulse signal (global HR) should not differ significantly from the HR derived from one of its windows (local HR) unless the window contains noise. A pulse signal affected by noise contains multiple peaks and thus its quality (i.e., PSNR) is low. Based on this, the predefined condition is defined. The predefined condition is that the absolute difference between the global HR and the local HR is greater than a first predefined threshold t1 and the quality of the local HR is less than a second predefined threshold t2. In an example, the thresholds t1 and t2 are set to 10 and 0.33 respectively.
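The predefined condition can be sketched directly, with the example thresholds t1 = 10 and t2 = 0.33 from above; is_erroneous is a hypothetical helper.
```python
def is_erroneous(local_hr, global_hr, local_quality, t1=10.0, t2=0.33):
    # Erroneous when the local HR strays far from the global HR AND the
    # pulse quality (PSNR) of the window is low.
    return abs(global_hr - local_hr) > t1 and local_quality < t2
```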
[035] According to an embodiment of the disclosure, the system 100 further comprises the error correction module 124 and the monitoring module 126. The error correction module 124 is configured to correct the erroneous heart rate if the local heart rate is erroneous as per the predefined condition. The erroneous local HR estimates are corrected for better HR monitoring. Assume that the previously estimated local HR and quality are denoted by hp and qp respectively, while the re-estimated local HR and quality are denoted by hn and qn respectively. It is possible that both hp and hn are affected by noise and are thus incapable of supporting good HR monitoring. One such example is illustrated in Fig. 5A and 5B. In such cases, it is better to set the local HR equal to the global HR for better HR monitoring. More clearly, if both the quality estimates qp and qn are less than a third threshold t3, then the local HR is given by the global HR. While if any quality estimate is greater than t3, the local HR is given by the HR estimate corresponding to the larger quality. Threshold t3 is set to 0.2 in an example of the system 100. The intuition behind the decision is that the face reconstruction can be erroneous, due to which hn does not provide better HR estimation than hp in some cases. In essence, the local HR, $h_l(i)$, for the ith window is:
$h_l(i) = \begin{cases} h_g, & \text{if } q_p < t_3 \text{ and } q_n < t_3 \\ h_p, & \text{else if } q_p \geq q_n \\ h_n, & \text{otherwise} \end{cases}$          (8)
where $h_g$ denotes the global HR.
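Equation (8) translates into a short decision rule, sketched below with the example threshold t3 = 0.2; corrected_local_hr is a hypothetical helper.
```python
def corrected_local_hr(h_p, q_p, h_n, q_n, h_global, t3=0.2):
    # Both quality estimates unreliable -> fall back to the global HR.
    if q_p < t3 and q_n < t3:
        return h_global
    # Otherwise keep the estimate with the larger quality.
    return h_p if q_p >= q_n else h_n
```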
[036] In operation, a flowchart 200 illustrating the steps of monitoring the heart rate of a person using a facial video of the person is shown in Figs. 6A-6B. Initially, at step 202, the facial video of the person is captured using the camera 102. At step 204, the facial video is divided or segmented into a plurality of overlapping windows of a predefined time interval, wherein each window has a plurality of frames. In the next step 206, a region of interest (ROI) is detected from the first frame of the facial video. At step 208, the temporal signal is extracted from the region of interest, wherein the temporal signal depicts facial variations due to the flow of blood. In the next step 210, a band pass filter is applied to remove noise from the extracted temporal signal and, at step 212, a part of the temporal signal corresponding to facial expressions of the person is removed from the filtered temporal signal. Thus, the temporal signal is selected adaptively by removing facial areas affected by facial expressions.
[037] In the next step 214, a pulse signal is extracted from the temporal signal using a blind source separation methodology. At step 216, the local heart rate (HR) of the person is estimated corresponding to each of the plurality of windows using an FFT analysis of the extracted pulse signal. And at step 218, the global heart rate (HR) of the person is estimated by extracting the temporal signal for the full facial video. In the next step 220, it is checked if the local HR is erroneous based on a predefined condition. The predefined condition is that the absolute difference between the global HR and the local HR is greater than a first predefined threshold and the quality of the local HR is less than a second predefined threshold. In the next step 222, the local HR is corrected if it is erroneous. And finally, the heart rate of the person is monitored using either the local HR or the corrected HR depending on the predefined condition.
[038] The system 100 can also be explained with the help of following experimental results.
Dataset Acquisition
[039] The efficacy of the proposed system is evaluated on an Intel i5-2400 CPU at 3.10 GHz in MATLAB 2016a. A total of 50 face videos have been acquired using a Logitech C270 webcam such that each video belongs to a different subject. The camera is placed on top of a laptop. The subjects have been instructed to sit in front of the laptop and camera such that their face is visible in the face video. They are permitted to perform natural facial movements, like slight head tilting, eye blinking and small lip movements. The acquisition has been performed under natural illumination. Each video is 54 seconds long and acquired at 28 frames per second. Ground truth for the performance evaluation is acquired simultaneously with the face videos by keeping a CMS 50D+ pulse oximeter on the right index fingertip.
Performance Measurement
[040] Assume that $h_e(i, j)$ and $h_a(i, j)$ represent the estimated and actual HR corresponding to the jth window of the ith subject. The performance of the proposed HR monitoring system is evaluated using the following metrics:
[041] 1. The Bland-Altman (BA) plot illustrates the consensus between estimated and actual measurements. Its abscissa and ordinate denote the average HR (viz., $(h_e(i, j) + h_a(i, j))/2$) and the estimation error (viz., $h_e(i, j) - h_a(i, j)$) respectively. An accurate HR monitoring system requires the estimation error to be close to zero. Alternatively, the mean µ and standard deviation σ of the estimation error should be close to zero. In addition, the percentage of samples with absolute error less than 5, 10 and 15 BPM, denoted by err5, err10 and err15, should be as large as possible.
[042] 2. The mean absolute error, MAE, is calculated by:
$MAE = \frac{1}{z} \sum_{i=1}^{z} \frac{1}{n_i} \sum_{j=1}^{n_i} \left| h_e(i, j) - h_a(i, j) \right|$          (9)
where z is the total number of subjects; $\left| \cdot \right|$ is the absolute operator; and $n_i$ is the number of windows for the ith subject. A lower value of MAE indicates better HR monitoring.
[043] 3. The Pearson correlation coefficient, ρ, provides the similarity between the estimated and actual HR. It is given by:
$\rho = \frac{\mathrm{cov}(h_e, h_a)}{\sigma_{h_e} \, \sigma_{h_a}}$          (10)
where cov and σ denote the covariance and standard deviation operators respectively. A high ρ denotes high similarity between the predicted and actual HR, i.e., better HR monitoring.
[044] 4. The mean and standard deviation of the average absolute error, err1(i), and the average absolute error percentage, err2(i), are also used for the evaluation [23]. Considering that $n_i$ represents the number of windows for the ith subject, then
$err_1(i) = \frac{1}{n_i} \sum_{j=1}^{n_i} \left| h_e(i, j) - h_a(i, j) \right|$          (11)
$err_2(i) = \frac{100}{n_i} \sum_{j=1}^{n_i} \frac{\left| h_e(i, j) - h_a(i, j) \right|}{h_a(i, j)}$          (12)
The mean and standard deviation of err1 for all the subjects are denoted by err1m and err1s respectively, while the mean and standard deviation of err2 are denoted by err2m and err2s respectively. Lower values of err1m, err1s, err2m and err2s indicate better HR monitoring.
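The metrics of Equations (9) to (12) can be sketched together, assuming per-subject arrays of estimated and actual window HRs; metrics is a hypothetical helper, and computing the Pearson coefficient over all windows pooled across subjects is an editorial assumption.
```python
import numpy as np

def metrics(he, ha):
    # he, ha: lists of per-subject arrays of estimated / actual window HRs.
    err1 = np.array([np.mean(np.abs(e - a)) for e, a in zip(he, ha)])            # Eq. (11)
    err2 = np.array([100 * np.mean(np.abs(e - a) / a) for e, a in zip(he, ha)])  # Eq. (12)
    mae = err1.mean()                                                            # Eq. (9)
    all_e, all_a = np.concatenate(he), np.concatenate(ha)
    rho = np.corrcoef(all_e, all_a)[0, 1]                                        # Eq. (10)
    return mae, rho, (err1.mean(), err1.std()), (err2.mean(), err2.std())
```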
[045] Figs. 7A and 7B and Table 1 depict the BA plots and performance metrics respectively for Systems I, II, III, IV, V, VI and the proposed system. To conform to the proposed system, the pulse signals of these systems are divided into multiple windows and the local HR is estimated.
[046] System I is designed by modifying the proposed system such that HR monitoring is performed using temporal signals extracted after face reconstruction. System II is designed by avoiding the post-processing and adaptive temporal signal selection mechanisms of the proposed system, i.e., it uses all the Eulerian temporal signals obtained without any face registration. System III avoids the post-processing, but uses the adaptive temporal signal selection mechanism. The remaining systems employ different post-processing techniques and the proposed adaptive temporal signal selection. Whenever the absolute difference between the global HR and the local HR is greater than a threshold t1, the local HR is replaced by the previous local HR, the average of the local HRs, and the global HR in Systems IV, V and VI respectively. It is evident from Figs. 7A and 7B and Table 1 that:
[047] System I can handle the non-linear deformations introduced in the face. Despite this, Systems II and III perform better than System I due to the noise originating from: i) the skin pixel interpolation required in face reconstruction; and ii) improper face registration using the error-prone locations of the eye centers. There exist some cases where System I performs better than Systems II and III. This points out that better HR monitoring can be achieved when System I is used along with System II or III. It can be seen that System I is the most computationally expensive due to face reconstruction.
[048] System III performs better than System II, which indicates that the adaptive temporal signal selection mechanism provides better HR monitoring. Moreover, Systems II and III neglect the non-rigid transformations introduced in the face, due to which the proposed system provides better HR monitoring than System II. Another reason for the better HR monitoring of the proposed system is that there are a few cases where none of the systems perform accurately due to noise; in such cases, the proposed system provides better HR monitoring using the global HR.
[049] System VI performs better than Systems IV and V, which indicates that it is better to replace the local HR with the global HR rather than with the average of the local HRs or the previous local HR. But the best performance is achieved when the proposed post-processing is employed, which incorporates the local HR, the global HR and face reconstruction.
[050] The time taken by the proposed system is 16.78 seconds, which can be considered negligible for a 54-second-long face video. Further, the proposed system provides significantly better HR monitoring than the other systems. Thus, it can be inferred that the proposed system can be used for HR monitoring.
[051] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[052] The embodiments of the present disclosure herein provide a method and system for monitoring the heart rate of a person using a facial video of the person. The method thus treats the problem of signal distortion generated by the facial expressions of the person.
[053] It is, however, to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[054] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[055] The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W) and DVD.
[056] A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
[057] Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
[058] A representative hardware environment for practicing the embodiments may include a hardware configuration of an information handling/computer system in accordance with the embodiments herein. The system herein comprises at least one processor or central processing unit (CPU). The CPUs are interconnected via system bus to various devices such as a random access memory (RAM), read-only memory (ROM), and an input/output (I/O) adapter. The I/O adapter can connect to peripheral devices, such as disk units and tape drives, or other program storage devices that are readable by the system. The system can read the inventive instructions on the program storage devices and follow these instructions to execute the methodology of the embodiments herein.
[059] The system further includes a user interface adapter that connects a keyboard, mouse, speaker, microphone, and/or other user interface devices such as a touch screen device (not shown) to the bus to gather user input. Additionally, a communication adapter connects the bus to a data processing network, and a display adapter connects the bus to a display device which may be embodied as an output device such as a monitor, printer, or transmitter, for example.
[060] The preceding description has been presented with reference to various embodiments. Persons having ordinary skill in the art and technology to which this application pertains will appreciate that alterations and changes in the described structures and methods of operation can be practiced without meaningfully departing from the principle, spirit and scope.