Abstract: The present disclosure provides a system 200 for estimating and explaining vital parameters from photoplethysmographic (PPG) signals. The system 200 is communicatively coupled to a PPG sensor 208. The system 200 includes a processor 202 and a memory 204 storing executable programs. The processor 202 receives the PPG signals and extracts spatial and temporal features using an ensembling model to predict heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP). To explain the predictions, the system 200 employs Gradient-weighted Class Activation Mapping (Grad-CAM) to compute gradients of the final feature with respect to convolutional layers, generating heatmaps that highlight the most influential regions of the PPG signal. These heatmaps are overlaid onto the signal and displayed with the predicted parameters on a graphical user interface (GUI), enabling users to visualize, interpret, and trust the predictions of the model. (To be published with Fig. 1)
Description:FIELD OF INVENTION
[0001] The present disclosure generally relates to a method for an estimation of vital parameters of a person. More particularly, the present disclosure relates to a method for explainable estimation of vital parameters of the person from photoplethysmography (PPG) signals using artificial intelligence (AI).
BACKGROUND
[0002] The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also correspond to implementations of the claimed technology.
[0003] In the field of healthcare technology, the integration of artificial intelligence (AI) has led to the development of advanced models aimed at improving patient care and clinical outcomes. AI-driven solutions have been increasingly employed to automate complex diagnostic and monitoring tasks, enhancing the ability of healthcare providers to deliver timely interventions.
[0004] Despite these advancements, many AI models, particularly those based on deep learning techniques, operate as "black boxes" with limited transparency into their decision-making processes. This lack of interpretability raises serious concerns about the trustworthiness, accountability, and clinical reliability of AI systems, especially when applied to critical healthcare tasks such as the estimation of vital parameters. In clinical environments, healthcare professionals require not only accurate predictions but also clear insights into how those predictions are derived to support informed decision-making.
[0005] Photoplethysmography (PPG) has emerged as a promising technology for continuous, non-invasive monitoring of vital physiological parameters such as heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP). PPG signals, captured using light to measure blood volume changes in the microvascular bed, are cost-effective, portable, and easily integrable into wearable medical devices.
[0006] However, existing AI models for estimating vital parameters from PPG signals often suffer from a lack of explainability. These models typically provide outputs without offering insights into the specific regions or features of the PPG signals that influenced their predictions. This opacity makes it difficult for healthcare professionals to understand, validate, and trust the predictions generated by such models, limiting their practical deployment in clinical settings.
[0007] Thus, there exists a need for a system that not only delivers accurate vital parameter estimations from PPG signals but also provides interpretable, transparent explanations for each prediction, thereby enhancing trust, usability, and adoption in healthcare environments.
[0008] Through applied effort, ingenuity, and innovation, the inventors have solved the above problem(s) by developing the solutions embodied in the present disclosure, the details of which are described further herein.
SUMMARY OF THE INVENTION
[0009] In general, embodiments of the present disclosure herein provide a solution for explaining an estimation of vital parameters of a person. Other implementations will be, or will become, apparent to one skilled in the art upon examination of the following figures and detailed description. It is intended that all such additional implementations be included within this description and within the scope of the disclosure.
[0010] According to an embodiment of the present disclosure, a system to explain an estimation of vital parameters of a person from photoplethysmographic (PPG) signals is disclosed. The system comprises a processor, and a memory storing machine-readable instructions that, when executed, cause the processor to receive PPG signals from one or more sources for the estimation of the vital parameters, wherein the vital parameters include heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP), using a deep learning model (VPE-Net). The processor is further configured to process the PPG signals using an ensembling model to extract a spatial feature and a temporal feature. The processor is further configured to combine the spatial feature and the temporal feature of the PPG signal to generate a final feature. The processor is further configured to predict the vital parameters of the person based on the final feature using the ensembling model. The processor is further configured to compare the final feature of the PPG signals with gradients of prediction of the ensembling model to determine an accuracy of the prediction. The processor is further configured to generate heatmaps based on the comparison to highlight the most influential regions of the PPG signal contributing to each prediction. The processor is further configured to display the heatmaps with the vital parameters to a user.
[0011] In an embodiment, the PPG signals are received from one or more sources, including a wearable device, a remote health monitoring system, a smartphone camera, a dedicated PPG sensor, or a multimodal monitoring system.
[0012] In an embodiment, the processor is configured to pre-process the received PPG signals by applying a bandpass Butterworth filter to remove noise and motion artifacts. The processor is configured to segment the PPG signals into fixed-length windows with overlapping intervals for feature extraction.
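As an illustrative, non-limiting sketch of this pre-processing step (the cutoff frequencies, filter order, sampling rate, window length, and overlap below are assumptions chosen for illustration, not prescribed values of the disclosure), the band-pass filtering and overlapping-window segmentation may be implemented as follows:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_ppg(signal, fs=125, low=0.5, high=8.0, win_s=8, overlap=0.5):
    """Band-pass filter a raw PPG trace and split it into overlapping windows."""
    # 4th-order Butterworth band-pass; the 0.5-8 Hz band covers typical
    # heart-rate and respiratory components (illustrative choice)
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, np.asarray(signal, dtype=float))
    # Fixed-length windows with the chosen fractional overlap
    win = int(win_s * fs)
    step = int(win * (1 - overlap))
    windows = [filtered[i:i + win]
               for i in range(0, len(filtered) - win + 1, step)]
    return np.stack(windows)
```

Zero-phase filtering (`filtfilt`) is used here so that the filter does not shift the temporal location of systolic peaks, which matters later when heatmaps are overlaid onto the signal.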
[0013] In an embodiment, the processor is configured to extract the spatial feature from the PPG signals using one or more convolutional neural network (CNN) layers. The processor is configured to extract the temporal feature from the PPG signals using at least one of a Long Short-Term Memory (LSTM) layer or a Gated Recurrent Unit (GRU) layer.
[0014] In an embodiment, the processor is configured to pre-process the PPG signals by normalizing them to improve stability of the ensembling model. The processor is configured to detect and handle missing or corrupted signal segments using interpolation techniques.
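A minimal sketch of this normalization and gap-handling step is given below; the use of linear interpolation and z-score scaling is one reasonable realization of the described "interpolation techniques" and normalization, assumed here for illustration:

```python
import numpy as np

def clean_and_normalize(segment):
    """Interpolate missing samples (NaNs) and z-score normalize a PPG window."""
    seg = np.asarray(segment, dtype=float)
    bad = np.isnan(seg)
    if bad.any():
        idx = np.arange(len(seg))
        # Linear interpolation across missing or corrupted samples
        seg[bad] = np.interp(idx[bad], idx[~bad], seg[~bad])
    # Zero-mean, unit-variance scaling improves numerical stability of training
    return (seg - seg.mean()) / (seg.std() + 1e-8)
```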
[0015] In an embodiment, the processor is configured to combine the spatial feature and the temporal feature of the PPG signal to generate a final feature. The processor is configured to pass the extracted spatial and temporal features through a fully connected dense layer. The processor is further configured to apply at least one of batch normalization and dropout regularization to improve generalization and reduce overfitting.
[0016] In an embodiment, the processor is configured to predict the vital parameters of the person based on the final feature using the ensembling model. The processor is configured to map the representation of the final feature to the heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP) using a prediction layer, and optimize the prediction accuracy using a loss function based on mean squared error (MSE) or mean absolute error (MAE).
[0017] In an embodiment, the processor is configured to determine the accuracy of the predicted vital parameters by computing the gradient-based attributions of the ensembling model using a Gradient-weighted Class Activation Mapping (Grad-CAM). The processor is configured to validate the accuracy using a reference dataset of known vital parameter values.
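As a simple illustration of validating predictions against a reference dataset (the function name and the choice of per-parameter MAE as the reported metric are assumptions for illustration), the comparison may be computed as:

```python
import numpy as np

def validate_predictions(predicted, reference):
    """Mean absolute error per vital parameter against reference values.

    Both inputs are (n_samples, 4) arrays ordered [HR, RR, SBP, DBP].
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mae = np.abs(predicted - reference).mean(axis=0)
    return dict(zip(["HR", "RR", "SBP", "DBP"], mae))
```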
[0018] In an embodiment, the processor is configured to generate the heatmaps by computing the gradients of the final feature with respect to the CNN feature maps. The processor is configured to assign higher intensities to the most influential regions of the PPG signal. The processor is further configured to overlay the heatmaps onto the input PPG signals for interpretability.
[0019] In an embodiment, the processor is configured to display the heatmaps with the vital parameters to the user. The processor is configured to present the estimated vital parameters alongside their respective heatmaps on a graphical user interface (GUI). The processor is further configured to allow the user to interact with visualizations for interpretability and trust assessment.
[0020] In an embodiment, the ensembling model is deployed on an edge device for real-time analysis of PPG signals in wearable or remote healthcare applications.
[0021] In an embodiment, the edge device is at least one of an Nvidia Jetson Orin, a mobile processor, or an embedded system, enabling real-time and energy-efficient inference.
[0022] In an embodiment, the processor is configured to fine-tune the ensembling model based on real-time feedback from healthcare professionals. The processor is configured to update model weights using newly collected PPG data to improve prediction accuracy over time.
[0023] According to an embodiment of the present disclosure, a method for explaining an estimation of vital parameters of a person from photoplethysmographic (PPG) signals is disclosed. The method includes receiving PPG signals from one or more sources for the estimation of the vital parameters, wherein the vital parameters include heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP), using a deep learning model (VPE-Net). The method further includes processing the PPG signals using an ensembling model to extract a spatial feature and a temporal feature. The method further includes combining the spatial feature and the temporal feature of the PPG signal to generate a final feature. The method further includes predicting the vital parameters of the person based on the final feature using the ensembling model. The method further includes comparing the final feature of the PPG signals with gradients of prediction of the ensembling model to determine an accuracy of the prediction. The method further includes generating heatmaps based on the comparison to highlight the most influential regions of the PPG signal contributing to each prediction. The method further includes displaying the heatmaps with the vital parameters to a user.
[0024] The above summary is provided merely for the purpose of summarizing some exemplary embodiments to provide a basic understanding of some aspects of the present disclosure. Accordingly, it will be appreciated that the above-described embodiments are merely examples and should not be construed to narrow the scope or spirit of the present disclosure in any way. It will be appreciated that the scope of the present disclosure encompasses many potential embodiments in addition to those here summarized, some of which will be further described below. Other features, aspects, and advantages of the subject will become apparent from the description, the drawings, and the claims.
DESCRIPTION OF THE DRAWINGS
[0025] Having thus described the embodiments of the disclosure in general terms, reference now will be made to the accompanying drawings, which are not necessarily drawn to scale, and wherein:
[0026] FIG. 1 illustrates an example environment for explaining an estimation of vital parameters from a photoplethysmographic (PPG) sensor, according to an embodiment of the present disclosure;
[0027] FIG. 2 illustrates a general block diagram of a system to explain the estimation of the vital parameters from the PPG sensor, according to an embodiment of the present disclosure;
[0028] FIG. 3 illustrates a block diagram of modules for explaining the estimation of the vital parameters, according to an embodiment of the present disclosure;
[0029] FIG. 4a illustrates an example of a heatmap for explaining a heart rate estimation from the PPG signal, according to an embodiment of the present disclosure;
[0030] FIG. 4b illustrates an example of a heatmap for explaining a respiratory rate estimation from the PPG signal, according to an embodiment of the present disclosure; and
[0031] FIG. 5 illustrates a flow chart for a method for explaining the estimation of the vital parameters from the PPG signal, according to an embodiment of the present disclosure.
DESCRIPTION OF THE INVENTION
[0032] The description set forth below in connection with the appended drawings is intended as a description of various embodiments of the present invention and is not intended to represent the only embodiments in which the present invention may be practiced. Each embodiment described in this invention is provided merely as an example or illustration of the present invention, and should not necessarily be construed as preferred or advantageous over other embodiments. The description includes specific details for the purpose of providing a thorough understanding of the present invention. However, it will be apparent to those skilled in the art that the present invention may be practiced without these specific details. Further, the reference numerals for similar components, modules, units, and operation steps have been kept same for the ease of understanding.
[0033] Some embodiments of the present disclosure now will be described with reference to the accompanying drawings, in which some, but not all, embodiments of the disclosure are shown. Indeed, embodiments of the disclosure may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements.
[0034] The present disclosure provides a method for explainable estimation of vital parameters from a photoplethysmographic (PPG) sensor using a deep learning-based ensembling model. The system receives PPG signals from various sources, preprocesses them using noise-reduction and segmentation techniques, and extracts spatial and temporal features using convolutional and recurrent neural networks. These features are combined to predict vital parameters. The explanation of the estimation is produced by Gradient-weighted Class Activation Mapping (Grad-CAM), which generates heatmaps that visually highlight the signal regions influencing each prediction. The predictions and corresponding heatmaps are displayed to users in real time, and the method supports operation on edge devices, making it suitable for wearable and remote healthcare applications. The forthcoming paragraphs explain the methodology in detail.
[0035] FIG. 1 illustrates an example environment 100 for explaining an estimation of vital parameters from a photoplethysmographic (PPG) sensor, according to an embodiment of the present disclosure. According to an embodiment, the environment 100 for explaining prediction of the estimation of the vital parameters may include a wearable device 102, a server 104, and a user device 106, all interconnected via a communication network.
[0036] In an exemplary embodiment, the wearable device 102 includes the PPG sensor embedded within a wristband, patch, or similar form factor, capable of detecting blood volume changes in the microvascular bed using light-based sensing. The wearable device 102 may be configured to continuously capture the PPG signals from a user. The wearable device 102 is equipped with wireless communication functionality, such as Bluetooth or Wi-Fi, enabling it to transmit the captured PPG signals in real time to the server 104 for processing. The wearable device 102 may also include local buffering or minimal preprocessing capabilities for noise reduction before data transmission.
[0037] In one embodiment, the server 104 is configured to receive the PPG signals from the wearable device 102 and perform a series of processing steps including pre-processing, prediction, and explainability mapping. The server 104 applies a bandpass Butterworth filter to remove motion artifacts and ambient noise, followed by segmentation of the signal into overlapping windows for time-series analysis. The segmented signals are passed through an ensembling deep learning model comprising convolutional neural network (CNN) layers for spatial feature extraction and either Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) layers for capturing temporal features. The spatial and temporal features are combined to form a final feature representation, which is then used to predict vital parameters including heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP).
[0038] According to one implementation, the server 104 further applies Gradient-weighted Class Activation Mapping (Grad-CAM) to compute gradient-based attributions over the final feature space. The Grad-CAM generates heatmaps that highlight the most influential time regions of the PPG signal that contributed to each predicted vital parameter. These heatmaps offer a visual explanation for each prediction, enhancing transparency and clinical interpretability. The server 104 formats the predictions and heatmaps for downstream transmission to a user device.
[0039] In another embodiment, the results from the server 104 are transmitted to a user device, such as a smartphone, tablet, or clinician workstation. The user device is configured to display both the estimated vital parameters and their corresponding heatmaps in real time. The interface may allow the user to interact with the displayed data, zoom in on specific signal regions, and review temporal trends. This real-time visualization enables healthcare professionals to interpret and validate the model’s outputs directly, facilitating trust and immediate decision-making in remote or clinical monitoring scenarios. The system may further support alerts for abnormal parameter readings or patterns identified in the signal explanation, enhancing proactive patient care.
[0040] FIG. 2 illustrates a general block diagram of a system to explain the estimation of the vital parameters from the PPG sensor, according to an embodiment of the present disclosure. In an embodiment, the system 200 may be communicatively coupled to a PPG sensor 208. The system 200 may be configured to explain the estimation of vital parameters in real-time to the user. The system 200 includes a processor 202, a memory 204, modules 206, the server 104, a network interface 212, and a Gradient-weighted Class Activation Mapping (Grad-CAM) 214. The PPG sensor 208 may continuously capture the PPG signals from the user and transmit them to the system 200. The system 200 may be configured to process the PPG signals using an ensembling deep learning model that combines convolutional and recurrent neural networks to extract spatial and temporal features for the prediction of the vital parameters.
[0041] To enhance interpretability, the system 200 may generate heatmaps using the Grad-CAM 214, visually identifying the most influential portions of the PPG signal for each prediction. These results are transmitted to the display unit 210 of the user device, where both the predicted parameters and heatmaps are displayed on a graphical interface for real-time monitoring and clinical decision support.
[0042] For an example, the processor(s) 202 may be a single processing unit or a number of units, all of which could include multiple computing units. The processor(s) 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logical processors, virtual processors, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) 202 is configured to fetch and execute computer-readable instructions and data stored in the memory 204.
[0043] The memory 204 may include any non-transitory computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read-only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[0044] In an embodiment, the processor 202 may be configured to receive the PPG signals from the wearable device 102. Upon receiving the PPG signals, the processor 202 applies the bandpass Butterworth filter to remove noise and motion artifacts and segments the signal into fixed-length overlapping windows for effective feature extraction. The processor 202 may further normalize the signal and handle any missing or corrupted segments using interpolation techniques to ensure data quality and consistency.
[0045] The processor 202 is further configured to process the segmented PPG signals using the ensembling deep learning model to extract the spatial and temporal features. These features are passed through a dense layer to generate the final feature. The processor 202 is configured to predict the vital parameters of the user based on the final feature by mapping the representation through a prediction layer.
[0046] According to an embodiment, the processor 202 is configured to compute gradient-based attributions using Grad-CAM 214. This computation highlights the contribution of specific time segments in the PPG signal to the predicted vital parameters. The processor 202 generates heatmaps that visually indicate these influential regions, thereby providing interpretability to the model’s output. The processor 202 may format and transmit the heatmaps, along with the predicted vital parameters, to the user device for real-time visualization.
[0047] In certain embodiments, the processor 202 may be present within a server 104 or be integrated into an edge computing device, such as the Nvidia Jetson Orin, to support real-time inference and visualization. The processor may also be operably coupled to a user interface module, enabling interactive display of the predicted parameters and heatmaps on a mobile device, tablet, or clinician dashboard. This setup facilitates remote health monitoring, clinical validation, and real-time decision support for healthcare professionals.
[0048] In an example, the module(s), engine(s), and/or unit(s) 206 may include a program, a subroutine, a portion of a program, a software component, or a hardware component capable of performing a stated task or function. As used herein, the module(s), engine(s), and/or unit(s) may be implemented on a hardware component such as a server independently of other modules, or a module can exist with other modules on the same server or within the same program. The module(s), engine(s), and/or unit(s) 206 may be implemented on a hardware component such as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. The module(s), engine(s), and/or unit(s) 206, when executed by the processor(s) 202, may be configured to perform any of the described functionalities.
[0049] As a further example, the server 104 may be implemented with integrated hardware and software. The hardware may include a hardware disk controller with programmable search capabilities or a software system running on general-purpose hardware. Examples of databases include, but are not limited to, in-memory databases, cloud databases, distributed databases, embedded databases, and the like. The database, amongst other things, serves as a repository for storing data processed, received, and generated by one or more of the processor(s) 202 and the modules/engines/units 206.
[0050] The modules/engines/units 206 may be implemented with an AI module that may include a plurality of neural network layers. Examples of neural networks include, but are not limited to, a convolutional neural network (CNN), a deep neural network (DNN), a recurrent neural network (RNN), and a Restricted Boltzmann Machine (RBM). The learning technique is a method for training a predetermined target device using a plurality of learning data to cause, allow, or control the target device to make a determination or prediction. Examples of the learning techniques include, but are not limited to, supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning. At least one of a plurality of CNN, DNN, RNN, RBM models and the like may be implemented to thereby achieve execution of the present subject matter's mechanism through an AI model. A function associated with the AI model may be performed through the non-volatile memory, the volatile memory, and the processor. The processor may include one or a plurality of processors. The one or more processors may be a general-purpose processor, such as a central processing unit (CPU), an application processor (AP), or the like, a graphics-only processing unit such as a graphics processing unit (GPU), a visual processing unit (VPU), and/or an AI-dedicated processor such as a neural processing unit (NPU). The one or more processors control the processing of the input data in accordance with a predefined operating rule or the artificial intelligence (AI) model stored in the non-volatile memory and the volatile memory. The predefined operating rule or artificial intelligence model is provided through training or learning.
[0051] As an example, the display unit 210 includes a computer monitor, a touch screen, an output device capable of displaying the graphics, and the like. The display unit 210 is configured to display visual output in desktops, laptops, and workstations. The display unit 210 may come in different sizes, resolutions, and types (such as LCD, LED, or OLED).
[0052] As a further example, the network interface 212 is configured to provide and establish communication with any electronic device via a public network, private network, or any wireless communication technology.
[0053] FIG. 3 illustrates a block diagram of modules for explaining the estimation of the vital parameters, according to an embodiment of the present disclosure. According to an embodiment, the module 206 includes a receiving module 302, a processing module 304, a generating module 306, and a displaying module 308 coupled with each other. In an alternate embodiment, the functions of the aforesaid modules may be performed by the processor(s) 202.
[0054] In an embodiment, the receiving module 302, the processing module 304, the generating module 306, and the displaying module 308 are uniquely designed hardware or software that are integrated within the system 200. According to some embodiments, the receiving module 302, the processing module 304, the generating module 306, and the displaying module 308 may be a part of an AI framework of the application to develop AI-powered solutions for specific tasks. An explanation will be made by referring to the modules depicted in FIG. 3. Furthermore, the labels depicted in the representative drawings are kept the same for similar components and operations throughout the disclosure for ease of understanding. The detailed functioning of each module will be explained in the following paragraphs.
[0055] The receiving module 302 is configured to acquire the PPG signals from one or more external sources for estimating the vital parameters, including heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP). In an embodiment, the PPG signals are received from one or more sources, including a wearable device, a remote health monitoring system, a smartphone camera, a dedicated PPG sensor, or a multimodal monitoring system.
[0056] For example, the receiving module 302 may obtain real-time PPG data from a smartwatch worn by a user, transmitting the signal over Wi-Fi to a healthcare server. Upon receipt, the module performs integrity checks, timestamp synchronization, and metadata extraction to ensure accurate classification and analysis. It can handle both streaming and batch-uploaded PPG data, supporting asynchronous transmission from mobile apps. In some implementations, the module also manages multi-sensor inputs, such as ECG, isolating the PPG signal for further processing. This flexibility allows seamless integration into diverse healthcare environments and ensures robust signal delivery to downstream AI models for vital parameter prediction.
[0057] The processing module 304 is configured to pre-process the received PPG signals by applying the bandpass Butterworth filter to remove noise and motion artifacts. Further, the processing module 304 segments the PPG signals into fixed-length windows with overlapping intervals for feature extraction. The processing module 304 is configured to process the received PPG signals using the ensembling model to extract both the spatial feature and the temporal feature, which are used to accurately estimate the vital parameters. In one exemplary implementation, after the receiving module 302 forwards a cleaned and segmented PPG signal, the processing module 304 first passes the PPG signal through one or more CNN layers to extract the spatial features. These layers act on the 1D waveform to identify spatial characteristics such as peaks, valleys, and amplitude variations, which correlate with physiological activity like heartbeats and respiratory oscillations. The processing module 304 then extracts the temporal feature from the PPG signals using at least one of the Long Short-Term Memory (LSTM) layer or the Gated Recurrent Unit (GRU) layer. The extracted spatial feature maps are passed to the LSTM or GRU layers, which analyze the time-series dependencies within the PPG signal, capturing trends, cycles, and delays over time. This sequential modeling is critical for understanding the long-term correlations necessary for accurate blood pressure and respiration rate estimation.
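The CNN-plus-recurrent pipeline described above may be sketched as follows using the Keras API. All layer counts, filter sizes, and unit counts here are illustrative assumptions, and the function name `build_vpe_sketch` is hypothetical; this is not the disclosed VPE-Net configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_vpe_sketch(window_len=1000):
    """Illustrative CNN+LSTM network mapping one PPG window to four vitals."""
    inp = layers.Input(shape=(window_len, 1))
    # Spatial features: 1-D convolutions over the PPG waveform
    x = layers.Conv1D(32, 7, activation="relu", padding="same")(inp)
    x = layers.MaxPooling1D(4)(x)
    x = layers.Conv1D(64, 5, activation="relu", padding="same")(x)
    x = layers.MaxPooling1D(4)(x)
    # Temporal features: recurrent layer over the conv feature sequence
    x = layers.LSTM(64)(x)
    # Final feature: dense layer with batch normalization and dropout
    x = layers.Dense(64, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    # Prediction layer: HR, RR, SBP, DBP as four regression outputs
    out = layers.Dense(4, name="vitals")(x)
    model = Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model
```

The MSE loss in `compile` corresponds to the mean-squared-error optimization mentioned in the summary; MAE (`loss="mae"`) would be a drop-in alternative.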
[0058] According to an embodiment, the processing module 304 may pre-process the PPG signals by normalizing them to improve stability of the ensembling model, and by detecting and handling missing or corrupted signal segments using interpolation techniques. The processing module 304 is further configured to combine the spatial feature and the temporal feature extracted from the PPG signal to generate a final feature. In an embodiment, this includes passing the extracted spatial and temporal features through a fully connected dense layer and applying at least one of batch normalization and dropout regularization to improve generalization and reduce overfitting.
[0059] Upon generation of the final feature, the processing module 304 may be configured to predict one or more vital parameters of the person using the ensembling model. In one embodiment, the ensembling model includes a plurality of sub-models operatively coupled to a prediction layer, wherein each sub-model is trained to analyze different aspects of the final feature derived from the PPG signals. The prediction layer is configured to map the representation of the final feature to one or more physiological metrics including, but not limited to, heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP).
[0060] In an embodiment, the prediction layer includes a set of fully connected neurons that perform a transformation of the input feature space into a corresponding output space of numerical values representing the estimated vital parameters. The outputs from the individual sub-models of the ensembling model may be combined using averaging, weighted aggregation, or model stacking techniques to produce a final predicted value for each of the vital parameters.
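The averaging and weighted-aggregation strategies named above can be illustrated with placeholder sub-model outputs; the values and weights below are purely illustrative and not derived from any trained model.

```python
import numpy as np

# Illustrative outputs of three hypothetical sub-models, each predicting
# [HR (bpm), RR (breaths/min), SBP (mmHg), DBP (mmHg)] for one segment
sub_model_outputs = np.array([
    [72.0, 16.0, 121.0, 79.0],
    [74.0, 15.0, 119.0, 81.0],
    [73.0, 17.0, 120.0, 80.0],
])

# Simple averaging across sub-models
avg_prediction = sub_model_outputs.mean(axis=0)

# Weighted aggregation, e.g. with weights reflecting per-model validation error
weights = np.array([0.5, 0.3, 0.2])
weighted_prediction = weights @ sub_model_outputs
```

Model stacking, the third option mentioned, would instead feed `sub_model_outputs` into a further learned combiner rather than fixed weights.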
[0061] In an exemplary implementation, the processing module 304 may receive a segmented PPG signal, extract spatial and temporal features therefrom, and generate a final feature representation. The final feature is then processed by the ensembling model to yield estimates of the vital parameters of the individual. The predicted values may subsequently be utilized for further processing, such as explainability mapping or visualization on the user interface. The processing module 304 may optimize the prediction accuracy using a loss function based on mean squared error (MSE) or mean absolute error (MAE).
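The MSE and MAE objectives referenced above reduce to the following; the reference and predicted vital-parameter values are illustrative.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error over the predicted vital parameters
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def mae(y_true, y_pred):
    # Mean absolute error over the predicted vital parameters
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Illustrative reference vs. predicted [HR, RR, SBP, DBP] values
reference = [72.0, 16.0, 120.0, 80.0]
predicted = [74.0, 15.0, 118.0, 81.0]
loss_mse = mse(reference, predicted)
loss_mae = mae(reference, predicted)
```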
[0062] According to an embodiment, the processing module 304 is further configured to determine the reliability or interpretability of the predicted vital parameters by performing a comparison between the final feature of the PPG signal and the gradients associated with the predictions generated by the ensembling model. Specifically, the processing module 304 may utilize a gradient-based explainability technique, such as Gradient-weighted Class Activation Mapping (Grad-CAM), to compute partial derivatives of the output prediction with respect to the feature maps derived from the convolutional layers of the ensembling model. The processing module 304 may determine the accuracy of the predicted vital parameters by computing the gradient-based attributions of the ensembling model using the Grad-CAM and validating the accuracy against a reference dataset of known vital parameter values.
[0063] The processing module 304 may analyze the gradient vectors to identify which regions of the final feature had the most significant influence on the output predictions. By comparing the location and magnitude of these gradients to the original final feature representation, the module may generate attribution scores or activation maps that indicate the degree of correspondence between the input signal features and the prediction pathway of the model. This comparison enables the system 200 to assess the explanatory alignment between the model’s decision-making process and the actual characteristics of the PPG signal, thereby serving as a proxy for accuracy validation. In some embodiments, regions of high gradient concentration aligned with physiologically relevant waveform features (e.g., systolic peaks or respiratory cycles) may indicate higher confidence in the prediction, while diffuse or misplaced gradient regions may suggest lower interpretability or potential model uncertainty.
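The gradient analysis described above can be sketched framework-agnostically. In practice the gradients would come from a deep-learning framework's autograd; random placeholders stand in for real activations and gradients here, so only the Grad-CAM combination step is shown.

```python
import numpy as np

def grad_cam_1d(feature_maps, gradients):
    """1-D Grad-CAM over conv activations.

    feature_maps: (channels, time) activations of a conv layer.
    gradients:    (channels, time) d(prediction)/d(activation),
                  here a placeholder for autograd output.
    """
    # alpha_k: global-average-pool each channel's gradient over time
    weights = gradients.mean(axis=1)              # shape (channels,)
    # ReLU(sum_k alpha_k * A_k): keep only positively contributing regions
    return np.maximum(weights @ feature_maps, 0.0)

rng = np.random.default_rng(0)
activations = rng.random((8, 100))     # placeholder conv feature maps
grads = rng.normal(size=(8, 100))      # placeholder d(output)/d(activation)
cam = grad_cam_1d(activations, grads)  # one importance value per time step
```

Peaks in `cam` that coincide with systolic peaks of the underlying waveform would, per the passage above, suggest higher confidence in the prediction.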
[0064] For example, upon receiving a 60-second PPG segment, the processing module 304 extracts the final feature, predicts the heart rate and blood pressure values, and computes the gradients of these outputs with respect to the CNN feature maps. The module then overlays the gradients onto the temporal axis of the original PPG signal and compares them with the known structural patterns of the final feature. This process yields an explanation-driven validation of the model’s output, which may be further visualized as a heatmap or stored for clinical review.
[0065] The generating module 306 may be configured to generate the heatmaps based on the comparison to highlight the most influential regions of the PPG signal contributing to each prediction. The generating module 306 is configured to compute the gradients of the final feature with respect to the convolutional neural network (CNN) feature maps used in the ensembling model. The gradients represent the partial derivatives of the predicted output with respect to each activation in the CNN layers, thereby indicating the contribution of each region of the input PPG signal to the final prediction. Based on the computed gradients, the generating module 306 may assign higher intensity values to the regions of the PPG signal that exert the most significant influence on the predicted vital parameters. These intensities are then utilized to construct a heatmap, wherein the most influential portions of the signal are visually emphasized. The generating module 306 overlays the generated heatmap onto the original PPG signal to provide an interpretable visual representation, enabling healthcare professionals to understand which segments of the signal the model relied upon during prediction.
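Assigning intensities and aligning them to the signal's temporal axis might look as follows. Linear interpolation is one plausible way to resample the coarse feature-map axis to the signal length; the disclosure does not prescribe a specific method.

```python
import numpy as np

def cam_to_heatmap(cam, signal_length):
    cam = np.asarray(cam, dtype=float)
    # Scale to [0, 1] so the brightest colour marks the most influential region
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    # Resample the coarse feature-map axis onto the signal's sample axis
    # so the heatmap can be overlaid directly on the PPG waveform
    x_src = np.linspace(0.0, 1.0, len(cam))
    x_dst = np.linspace(0.0, 1.0, signal_length)
    return np.interp(x_dst, x_src, cam)

# Four coarse attribution values stretched over a 16-sample signal
heat = cam_to_heatmap([0.0, 2.0, 1.0, 4.0], signal_length=16)
```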
[0066] Further, the displaying module 308 may be configured to display the generated heatmaps alongside the predicted vital parameters to the user via a graphical user interface (GUI) on the user device 106. The GUI is designed to present the predicted values of heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP) in a structured and easily interpretable format.
[0067] Concurrently, the heatmaps generated from the gradient-based analysis are superimposed onto the time-domain PPG signal, allowing the user to visually identify which segments of the signal contributed most significantly to each prediction. The interface may use color intensities or gradient scales to represent the relative importance of each region, where brighter or more saturated colors correspond to higher influence scores.
[0068] FIG. 4a illustrates an example of a heatmap for explaining a heart rate estimation 400 from the PPG signal, according to an embodiment of the present disclosure. FIG. 4b illustrates an example of a heatmap for explaining a respiratory rate estimation 402 from the PPG signal, according to an embodiment of the present disclosure. Referring to the accompanying illustrations, the images labeled Fig. 4a and Fig. 4b represent the heatmaps generated by the deep learning-based explainability framework to interpret the prediction of vital parameters from the PPG signals. Specifically, Fig. 4a illustrates the heatmaps for heart rate (HR) estimation 400, while Fig. 4b corresponds to the heatmaps for respiratory rate (RR) estimation 402. In each figure, three representative examples (A, B, and C) are shown, where the underlying black waveform represents the raw PPG signal plotted over time, and the overlaid color gradients represent the attention intensity values derived from Gradient-weighted Class Activation Mapping (Grad-CAM).
[0069] In these examples, the colored regions in each waveform highlight the temporal segments of the PPG signal that most significantly influenced the model’s prediction. Brighter color intensities (e.g., yellow to red) correspond to higher attribution weights, indicating that these segments were more influential in the estimation of the vital parameter. Conversely, cooler tones (e.g., blue or purple) represent regions with lower influence. The predicted and ground-truth values are displayed in the top-right corner of each subplot for reference.
[0070] The heatmaps shown in Fig. 4a enable interpretability of HR estimation 400 by revealing that the model primarily focuses on the peaks and rising edges of the PPG waveform, characteristic features that correlate strongly with heartbeats. In contrast, the heatmaps in Fig. 4b demonstrate the model’s focus on broader waveform fluctuations or low-frequency modulations, which are indicative of respiratory-induced changes in the PPG amplitude envelope. This variation in attention pattern between HR and RR heatmaps illustrates the model’s ability to adaptively extract relevant physiological patterns for each target parameter.
[0071] These visual explanations are overlaid onto the original PPG signal and presented to the user or clinician via the graphical user interface (GUI). This approach enables not only real-time vital sign estimation but also transparent insight into the AI model’s decision-making process, thereby enhancing trust, clinical usability, and model accountability.
[0072] FIG. 5 illustrates a flow chart for a method for explaining the estimation of the vital parameters from the PPG signal, according to an embodiment of the present disclosure. The method 500 is implemented in the system 200 of FIGs. 2 and 3. Further, the steps of the method 500 are explained in detail through FIGs. 2 to 3; therefore, for the sake of brevity, the detailed explanation has been omitted here.
[0073] In an embodiment, the system 200 explains the estimation of the vital parameters by generating heatmaps. The PPG signal is acquired by the PPG sensor 208 and processed by the processing module 304.
[0074] According to an embodiment, the method 500, at step 502 includes receiving PPG signals from one or more sources for the estimation of the vital parameters, wherein the one or more sources may include wearable devices, smartphone-based optical sensors, remote health monitoring systems, or dedicated clinical-grade PPG sensors. In an embodiment, this step 502 is performed by a receiving module 302, which is configured to acquire continuous or periodic PPG signal streams transmitted from the respective data sources over wireless or wired communication protocols. The receiving module 302 is configured to ensure that the acquired PPG signals are temporally aligned and formatted for downstream processing, thereby enabling robust and reliable estimation of vital parameters including heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP).
[0075] According to an embodiment, the method 500, at step 504 includes processing the PPG signals using an ensembling model to extract a spatial feature and a temporal feature. In one embodiment, this step 504 is performed by a processing module 304, which is configured to apply a deep learning architecture comprising both convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to the pre-processed PPG input.
[0076] The spatial features are extracted using one or more CNN layers that operate on the time-series PPG signal to identify localized patterns such as pulse peaks, amplitude variations, and waveform morphology. These CNN layers apply learnable filters across the temporal axis of the signal, capturing essential morphological characteristics that are often associated with cardiovascular dynamics and peripheral perfusion.
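A single such CNN layer amounts to sliding filters along the temporal axis. The kernels below are illustrative hand-picked filters rather than learned weights, chosen to mimic the slope-detecting and smoothing behaviour described above.

```python
import numpy as np

def conv1d_valid(signal, kernels):
    # Slide each filter along the temporal axis ('valid' convolution),
    # producing one feature map per filter, as a 1-D CNN layer would
    return np.stack([np.convolve(signal, k, mode="valid") for k in kernels])

t = np.arange(500) / 125.0                 # 4 s of samples at 125 Hz
ppg = np.sin(2 * np.pi * 1.2 * t)          # toy PPG-like waveform
kernels = [np.array([-1.0, 0.0, 1.0]),     # edge filter: rising/falling slopes
           np.array([0.25, 0.5, 0.25])]    # smoothing filter: local morphology
feature_maps = conv1d_valid(ppg, kernels)  # (filters, time) spatial features
```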
[0077] Simultaneously, the temporal features are extracted using one or more RNN layers, including but not limited to Long Short-Term Memory (LSTM) units or Gated Recurrent Units (GRU). These layers are configured to capture sequential dependencies, cyclic variations, and long-range temporal patterns within the PPG signal, such as respiratory modulation or blood pressure trends that unfold over time.
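For concreteness, one GRU step can be written out in plain NumPy; the weight matrices below are random stand-ins for trained parameters, and the dimensions are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Wr, Wh, Uz, Ur, Uh):
    z = sigmoid(Wz @ x_t + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))  # candidate state
    # Blend the previous state with the candidate, carrying long-range context
    return (1.0 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(1)
d_in, d_h = 4, 8  # illustrative input (feature-slice) and hidden sizes
Wz, Wr, Wh = (rng.normal(size=(d_h, d_in)) for _ in range(3))
Uz, Ur, Uh = (rng.normal(size=(d_h, d_h)) for _ in range(3))

# Run the cell over a short sequence of placeholder feature vectors
h = np.zeros(d_h)
for x_t in rng.normal(size=(10, d_in)):
    h = gru_step(x_t, h, Wz, Wr, Wh, Uz, Ur, Uh)
```

An LSTM layer would play the same role with an additional cell state and three gates instead of two.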
[0078] According to an embodiment, the method 500, at step 506 includes combining the spatial and temporal features extracted from the PPG signal using the dense neural layer to generate a final feature. The final feature encapsulates both morphological patterns and time-based variations, enabling more accurate and comprehensive prediction of the vital parameters.
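The fusion step above can be sketched as concatenation followed by a fully connected dense layer; the dimensions and weights below are illustrative stand-ins, and batch normalization/dropout (applied during training) are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
spatial = rng.random(32)   # stand-in for the CNN (spatial) feature vector
temporal = rng.random(16)  # stand-in for the LSTM/GRU (temporal) feature vector

# Concatenate morphological and time-based features, then apply a
# fully connected dense layer with ReLU; W and b are random stand-ins
fused_in = np.concatenate([spatial, temporal])      # shape (48,)
W = rng.normal(scale=0.1, size=(24, 48))
b = np.zeros(24)
final_feature = np.maximum(W @ fused_in + b, 0.0)   # the "final feature"
```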
[0079] According to an embodiment, the method 500, at step 508 includes predicting the vital parameters of the person based on the final feature using the ensembling model. In an embodiment, the step 508 is performed by the processing module 304, which is configured to apply a trained prediction layer that maps the final feature comprising both spatial and temporal characteristics to specific physiological outputs.
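The prediction layer reduces to a linear map from the final feature to the four vitals. The weights and biases below are illustrative stand-ins, with biases placed near typical resting physiological values purely for readability; real outputs would come from trained parameters.

```python
import numpy as np

rng = np.random.default_rng(3)
final_feature = rng.random(24)  # stand-in for the fused final feature

# Fully connected prediction layer mapping the final feature to
# [HR, RR, SBP, DBP]; weights are illustrative, not trained
W_out = rng.normal(scale=0.1, size=(4, 24))
b_out = np.array([70.0, 15.0, 120.0, 80.0])  # offsets near typical values
hr, rr, sbp, dbp = W_out @ final_feature + b_out
```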
[0080] According to an embodiment, the method 500, at step 510 includes comparing the final feature of the PPG signals with gradients of prediction of the ensembling model to determine an accuracy of the prediction. In an embodiment, the step 510 is performed by the processing module 304, which is configured to compute gradient-based attributions between the predicted output and the underlying feature representations generated by the ensembling model.
[0081] The processing module 304 utilizes explainability techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) to calculate the gradients of the predicted values with respect to the feature maps, particularly those derived from the CNN layers. These gradients indicate the contribution of different regions within the PPG signal to the final prediction. By analyzing the alignment between the gradient distribution and the original final feature, the module determines how accurately the model focused on physiologically relevant segments of the signal during prediction. This comparison enables transparency in model behavior and serves as a proxy for validating the reliability of the AI-generated vital parameter estimates.
[0082] According to an embodiment, the method 500, at step 512 includes generating heatmaps based on the comparison to highlight the most influential regions of the PPG signal contributing to each prediction. In an embodiment, the step 512 is performed by the generating module 306, which is configured to visualize the gradient-based attribution results as interpretable heatmaps.
[0083] The generating module 306 uses the gradients computed with respect to the convolutional feature maps to identify which temporal segments of the PPG signal most significantly influenced the model’s prediction of each vital parameter. The resulting heatmaps assign higher intensity values to these influential regions, indicating stronger contribution to the prediction outcome. The heatmaps are overlaid on the original PPG waveform to produce a composite visual representation that clearly delineates the model’s focus during inference. This enables healthcare professionals to interpret not only the predicted values but also the rationale behind them, enhancing trust and clinical transparency.
[0084] According to an embodiment, the method 500, at step 514 includes displaying the heatmaps along with the vital parameters on the graphical user interface (GUI). In this embodiment, the heatmaps are generated by comparing the final feature with the gradients of the model’s predictions, highlighting the most influential areas of the PPG signal. The step of displaying the heatmaps is carried out by the displaying module 308, which overlays the heatmaps onto the input PPG signals. This allows the user to view both the vital parameters (such as heart rate, respiratory rate, and blood pressure) and the corresponding heatmap, helping to understand which parts of the PPG signal contributed the most to the predictions, thus providing a clear and interpretable result.
[0085] The disclosed technique provides an improved method for explaining the estimation of vital parameters from PPG signals by generating heatmaps that highlight the most influential regions of the signal. This is achieved by comparing the final feature with the gradients of the prediction, which are used to generate the heatmaps. These heatmaps, along with the predicted vital parameters, are then displayed to the user on a graphical user interface (GUI). This approach enhances interpretability by visualizing the contributions of the PPG signal to each vital parameter prediction, enabling users to trust and assess the accuracy of the system's output.
[0086] The figures of the disclosure are provided to illustrate some examples of the invention described. The figures are not intended to limit the scope of the depicted embodiments or of the appended claims. Aspects of the disclosure are described herein with reference to example embodiments for illustration. It should be understood that specific details, relationships, and methods are set forth to provide a full understanding of the example embodiments. One of ordinary skill in the art will recognize that the example embodiments can be practiced without one or more of the specific details and/or with other methods.
[0087] Aspects of the present disclosure may be implemented as computer program products that comprise articles of manufacture. Such computer program products may include one or more software components including, for example, applications, software objects, methods, data structures, and/or the like. In some embodiments, a software component may be stored on one or more non-transitory computer-readable media, and the computer program product may comprise the computer-readable media with the software component, comprising computer executable instructions, included thereon. The various control and operational systems described herein may incorporate one or more of such computer program products and/or software components for causing the various systems and components thereof to operate in accordance with the functionalities described herein.
[0088] It is to be understood that the disclosure is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation, unless described otherwise.
Claims:
1. A method for explaining an estimation of vital parameters of a person from photoplethysmographic (PPG) signals, comprising:
receiving PPG signals from one or more sources for the estimation of the vital parameters, wherein the vital parameters include heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP), estimated using a deep learning model (VPE-Net);
processing the PPG signals using an ensembling model to extract a spatial feature and a temporal feature;
combining the spatial feature and the temporal feature of the PPG signal to generate a final feature;
predicting the vital parameters of the person based on the final feature using the ensembling model;
comparing the final feature of the PPG signals with gradients of prediction of the ensembling model to determine an accuracy of the prediction;
generating heatmaps based on the comparison to highlight the most influential regions of the PPG signal contributing to each prediction; and
displaying the heatmaps with the vital parameters to a user.
2. The method as claimed in claim 1, wherein the PPG signals are received from one or more sources, including a wearable device, a remote health monitoring system, a smartphone camera, a dedicated PPG sensor, or a multimodal monitoring system.
3. The method as claimed in claim 1, further comprising:
pre-processing the received PPG signals by applying a bandpass Butterworth filter to remove noise and motion artifacts; and
segmenting the PPG signals into fixed-length windows with overlapping intervals for feature extraction.
4. The method as claimed in claim 3, wherein processing the PPG signals using the ensembling model comprises:
extracting the spatial feature from the PPG signals using one or more convolutional neural network (CNN) layers; and
extracting the temporal feature from the PPG signals using at least one of a Long Short-Term Memory (LSTM) layer or a Gated Recurrent Unit (GRU) layer.
5. The method as claimed in claim 3, wherein pre-processing further comprises:
normalizing the PPG signals to improve stability of the ensembling model; and
detecting and handling missing or corrupted signal segments using interpolation techniques.
6. The method as claimed in claim 1, wherein combining the spatial feature and the temporal feature of the PPG signal to generate a final feature comprises:
passing the extracted spatial and temporal feature through a fully connected dense layer; and
applying at least one of batch normalization and dropout regularization to improve generalization and reduce overfitting.
7. The method as claimed in claim 1, wherein predicting the vital parameters of the person based on the final feature using the ensembling model comprises:
mapping representation of the final feature to the heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP) using a prediction layer; and
optimizing the prediction accuracy using a loss function based on mean squared error (MSE) or mean absolute error (MAE).
8. The method as claimed in claim 1, further comprising:
determining the accuracy of the predicted vital parameters by computing the gradient-based attributions of the ensembling model using a Gradient-weighted Class Activation Mapping (Grad-CAM); and
validating the accuracy using a reference dataset of known vital parameter values.
9. The method as claimed in claim 8, wherein generating the heatmaps comprises:
computing the gradients of the final feature with respect to CNN feature maps;
assigning higher intensities to the most influential regions of the PPG signal; and
overlaying the heatmaps onto the input PPG signals for interpretability.
10. The method as claimed in claim 1, wherein displaying the heatmaps with the vital parameters to the user comprises:
presenting the estimated vital parameters alongside their respective heatmaps on a graphical user interface (GUI); and
allowing the user to interact with visualizations for interpretability and trust assessment.
11. The method as claimed in claim 1, wherein the ensembling model is deployed on an edge device for real-time analysis of PPG signals in wearable or remote healthcare applications.
12. The method as claimed in claim 11, wherein the edge device is at least one of an Nvidia Jetson Orin, a mobile processor, or an embedded system, enabling real-time and energy-efficient inference.
13. The method as claimed in claim 1, further comprising:
fine-tuning the ensembling model based on real-time feedback from healthcare professionals; and
updating model weights using newly collected PPG data to improve prediction accuracy over time.
14. A system for explaining an estimation of vital parameters of a person from photoplethysmographic (PPG) signals comprises:
a processor 202;
a memory 204 storing machine readable instructions that, when executed, cause the processor 202 to:
receive PPG signals from one or more sources for the estimation of the vital parameters, wherein the vital parameters include heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP), estimated using a deep learning model (VPE-Net);
process the PPG signals using an ensembling model to extract a spatial feature and a temporal feature;
combine the spatial feature and the temporal feature of the PPG signal to generate a final feature;
predict the vital parameters of the person based on the final feature using the ensembling model;
compare the final feature of the PPG signals with gradients of prediction of the ensembling model to determine an accuracy of the prediction;
generate heatmaps based on the comparison to highlight the most influential regions of the PPG signal contributing to each prediction; and
display the heatmaps with the vital parameters to a user.
15. The system as claimed in claim 14, wherein the PPG signals are received from one or more sources, including a wearable device, a remote health monitoring system, a smartphone camera, a dedicated PPG sensor, or a multimodal monitoring system.
16. The system as claimed in claim 14, wherein the processor is configured to:
pre-process the received PPG signals by applying a bandpass Butterworth filter to remove noise and motion artifacts; and
segment the PPG signals into fixed-length windows with overlapping intervals for feature extraction.
17. The system as claimed in claim 16, wherein, to process the PPG signals using the ensembling model, the processor is configured to:
extract the spatial feature from the PPG signals using one or more convolutional neural network (CNN) layers; and
extract the temporal feature from the PPG signals using at least one of a Long Short-Term Memory (LSTM) layer or a Gated Recurrent Unit (GRU) layer.
18. The system as claimed in claim 16, wherein, to pre-process, the processor is further configured to:
normalize the PPG signals to improve stability of the ensembling model; and
detect and handle missing or corrupted signal segments using interpolation techniques.
19. The system as claimed in claim 17, wherein, to combine the spatial feature and the temporal feature of the PPG signal to generate a final feature, the processor is configured to:
pass the extracted spatial and temporal feature through a fully connected dense layer; and
apply at least one of batch normalization and dropout regularization to improve generalization and reduce overfitting.
20. The system as claimed in claim 14, wherein, to predict the vital parameters of the person based on the final feature using the ensembling model, the processor is configured to:
map representation of the final feature to the heart rate (HR), respiratory rate (RR), systolic blood pressure (SBP), and diastolic blood pressure (DBP) using a prediction layer; and
optimize the prediction accuracy using a loss function based on mean squared error (MSE) or mean absolute error (MAE).
21. The system as claimed in claim 14, wherein the processor is configured to:
determine the accuracy of the predicted vital parameters by computing the gradient-based attributions of the ensembling model using a Gradient-weighted Class Activation Mapping (Grad-CAM); and
validate the accuracy using a reference dataset of known vital parameter values.
22. The system as claimed in claim 21, wherein, to generate the heatmaps, the processor is configured to:
compute the gradients of the final feature with respect to CNN feature maps;
assign higher intensities to the most influential regions of the PPG signal; and
overlay the heatmaps onto the input PPG signals for interpretability.
23. The system as claimed in claim 14, wherein, to display the heatmaps with the vital parameters to the user, the processor is configured to:
present the estimated vital parameters alongside their respective heatmaps on a graphical user interface (GUI); and
allow the user to interact with visualizations for interpretability and trust assessment.
24. The system as claimed in claim 14, wherein the ensembling model is deployed on an edge device for real-time analysis of PPG signals in wearable or remote healthcare applications.
25. The system as claimed in claim 24, wherein the edge device is at least one of an Nvidia Jetson Orin, a mobile processor, or an embedded system, enabling real-time and energy-efficient inference.
26. The system as claimed in claim 14, wherein the processor is configured to:
fine-tune the ensembling model based on real-time feedback from healthcare professionals; and
update model weights using newly collected PPG data to improve prediction accuracy over time.
| # | Name | Date |
|---|---|---|
| 1 | 202541046970-STATEMENT OF UNDERTAKING (FORM 3) [15-05-2025(online)].pdf | 2025-05-15 |
| 2 | 202541046970-FORM FOR STARTUP [15-05-2025(online)].pdf | 2025-05-15 |
| 3 | 202541046970-FORM FOR SMALL ENTITY(FORM-28) [15-05-2025(online)].pdf | 2025-05-15 |
| 4 | 202541046970-FORM 1 [15-05-2025(online)].pdf | 2025-05-15 |
| 5 | 202541046970-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [15-05-2025(online)].pdf | 2025-05-15 |
| 6 | 202541046970-EVIDENCE FOR REGISTRATION UNDER SSI [15-05-2025(online)].pdf | 2025-05-15 |
| 7 | 202541046970-DRAWINGS [15-05-2025(online)].pdf | 2025-05-15 |
| 8 | 202541046970-DECLARATION OF INVENTORSHIP (FORM 5) [15-05-2025(online)].pdf | 2025-05-15 |
| 9 | 202541046970-COMPLETE SPECIFICATION [15-05-2025(online)].pdf | 2025-05-15 |
| 10 | 202541046970-STARTUP [21-07-2025(online)].pdf | 2025-07-21 |
| 11 | 202541046970-Proof of Right [21-07-2025(online)].pdf | 2025-07-21 |
| 12 | 202541046970-FORM28 [21-07-2025(online)].pdf | 2025-07-21 |
| 13 | 202541046970-FORM-9 [21-07-2025(online)].pdf | 2025-07-21 |
| 14 | 202541046970-FORM-26 [21-07-2025(online)].pdf | 2025-07-21 |
| 15 | 202541046970-FORM 18A [21-07-2025(online)].pdf | 2025-07-21 |
| 16 | 202541046970-FORM 3 [07-10-2025(online)].pdf | 2025-10-07 |