Abstract: Cardio-vascular disease (CVD) is one of the prominent disease types in humans. There is no criticality-aware ECG classification model, which is a practical requirement for an automated decisive diagnosis solution. This disclosure relates to a method for domain-principled inference with a ResNet-Transformer model for electrocardiogram (ECG) classification. A physiological signal from a twelve-lead ECG is received as an input. The physiological signal is processed by a residual network to extract lower dimensional feature embeddings. A classification model is constructed by a transformer network based on the one or more lower dimensional feature embeddings. One or more patterns are obtained based on the classification model at a domain-principled ResNet-Transformer network. A trained model is obtained by learning the one or more patterns and the information between the one or more lower dimensional feature embeddings. An inference decision boundary associated with one or more class labels is derived based on the trained model.
DESC:FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
METHOD AND SYSTEM FOR DOMAIN-PRINCIPLED INFERENCE WITH RESNET-TRANSFORMER MODEL FOR ECG CLASSIFICATION
Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.
CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
The present application claims priority from Indian provisional patent application no. 202121048027, filed on October 21, 2021. The entire contents of the aforementioned application are incorporated herein by reference.
TECHNICAL FIELD
The disclosure herein generally relates to health monitoring systems, and, more particularly, to a method and system for domain-principled inference with a ResNet-Transformer model for ECG classification.
BACKGROUND
Cardio-vascular disease (CVD) is one of the prominent disease types in humans. An Electrocardiogram or ECG is an essential and fundamental technique to observe cardiac activity and acts as a good reference for clinical experts to diagnose different kinds of heart diseases. Traditionally, in automatic diagnosis, ECG classification methods employ signal processing techniques and standard features of the ECG waveforms to distinguish between the waveforms of different cardiac diseases. The automated detection of cardiac abnormality is of vast practical importance, assisting both the physician and patient communities. Currently, end-to-end deep learning techniques have achieved a significant breakthrough in the field of healthcare. The automated detection of cardiovascular diseases (CVDs) from ECG recordings is a problem of immense practical interest and is associated with considerable research challenges due to a multitude of machine learning issues such as multi-class, multi-label classification, an unequal number of sampling instances, heterogeneity in class distribution (i.e., owing to disease prevalence or collection process constraints), and many others.
In a typical approach, a convolutional neural network (CNN) was developed for the classification of different arrhythmias from a single-lead ECG. This approach outperformed the diagnosis of cardiologists, but it is limited to single-lead ECG classification with the capability of detecting cardiac arrhythmia conditions. However, 12-lead ECG classification is significantly more complex, and it is a practical as well as research challenge to develop a model for 12-lead ECG signals for detecting diverse CVD conditions. Recent research efforts show that auto-encoder based embedding extractors operating on heartbeat segmentation can also be considered for 12-lead ECG classification, where an LSTM is also used for the classification and prediction of CVDs. It is noted that training auto-encoders is difficult, and the main drawback is the poor classification performance of a multi-step approach of LSTM classification preceded by heartbeat segmentation with an autoencoder based feature selector. The state-of-the-art methods have the sole aim of maximizing some aggregated or overall classification performance from the ECG recordings while not concentrating on the importance of domain-principled criteria.
SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one aspect, there is provided a processor implemented method of detecting the domain-principled inference with the ResNet-Transformer model for the ECG classification. The processor implemented method includes at least one of: receiving, via one or more hardware processors, at least one physiological signal from a twelve-lead electrocardiogram (ECG) as an input; processing, by a residual network (ResNet), the at least one physiological signal from the twelve-lead electrocardiogram (ECG) to extract one or more lower dimensional feature embeddings; constructing, by a transformer network, a classification model based on the one or more lower dimensional feature embeddings; obtaining, via the one or more hardware processors, one or more patterns based on the classification model at a domain-principled ResNet-Transformer network; obtaining, via the one or more hardware processors, a trained model by learning the one or more patterns and information between the one or more lower dimensional feature embeddings by a classification head; and deriving, via the one or more hardware processors, an inference decision boundary associated with one or more class labels based on the trained model. The residual network (ResNet) is a feature extractor. The transformer network corresponds to a multiclass multilabel classifier. The domain-principled ResNet-Transformer network includes one or more integrated functional components. The one or more integrated functional components correspond to (i) a ResNet-Transformer network model, and (ii) a domain-principled inference.
In an embodiment, the residual network (ResNet) includes one or more residual units. In an embodiment, the one or more residual units corresponds to ten residual blocks with two convolutional layers per block. In an embodiment, the at least one physiological signal is expressed as a function of an initial signal along with a learnable residue function (F) to create an indirect path for the input signal to propagate entire depth of the network. In an embodiment, the function of the initial signal is based on a derivable function (A) which is obtained by one or more skip connections. In an embodiment, the classification head includes a dense layer and a sigmoid activation function. In an embodiment, the sigmoid activation function learns probabilities independently with number of classes, and high probabilities to obtain the one or more class labels. In an embodiment, the one or more class labels corresponds to one or more cardiovascular diseases (CVDs) group derived based on a clinical severity level. In an embodiment, the one or more cardiovascular diseases (CVDs) group corresponds to: (a) a non-critical (Cn) cardiovascular disease (CVD) group, (b) a critical (Cc) cardiovascular disease (CVD) group, and (c) a super-critical (Cs) cardiovascular disease (CVD) group.
In another aspect, there is provided a system for detection of the domain-principled inference with the Resnet-transformer model for the ECG classification. The system includes a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: receive, at least one physiological signal from a twelve-lead electrocardiogram (ECG) as an input; process, by a residual network (ResNet), the at least one physiological signal from the twelve-lead electrocardiogram (ECG) to extract one or more lower dimensional feature embeddings; construct, by a transformer network, a classification model based on the one or more lower dimensional feature embeddings; obtain, one or more patterns based on the classification model at a domain-principled ResNet-Transformer network; obtain, a trained model by learning the one or more patterns and information between the one or more lower dimensional feature embeddings by a classification head; and derive, an inference decision boundary associated with one or more class labels based on the trained model. The residual network (ResNet) is a feature extractor. The transformer network corresponds to a multiclass multilabel classifier. The domain-principled ResNet-Transformer network includes one or more integrated functional components. The one or more integrated functional components corresponds to (i) a ResNet-Transformer network model, and (ii) a domain-principled inference.
In an embodiment, the residual network (ResNet) includes one or more residual units. In an embodiment, the one or more residual units corresponds to ten residual blocks with two convolutional layers per block. In an embodiment, the at least one physiological signal is expressed as a function of an initial signal along with a learnable residue function (F) to create an indirect path for the input signal to propagate entire depth of the network. In an embodiment, the function of the initial signal is based on a derivable function (A) which is obtained by one or more skip connections. In an embodiment, the classification head includes a dense layer and a sigmoid activation function. In an embodiment, the sigmoid activation function learns probabilities independently with number of classes, and high probabilities to obtain the one or more class labels. In an embodiment, the one or more class labels corresponds to one or more cardiovascular diseases (CVDs) group derived based on a clinical severity level. In an embodiment, the one or more cardiovascular diseases (CVDs) group corresponds to: (a) a non-critical (Cn) cardiovascular disease (CVD) group, (b) a critical (Cc) cardiovascular disease (CVD) group, and (c) a super-critical (Cs) cardiovascular disease (CVD) group.
In yet another aspect, there are provided one or more non-transitory machine readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors causes at least one of: receiving, at least one physiological signal from a twelve-lead electrocardiogram (ECG) as an input; processing, by a residual network (ResNet), the at least one physiological signal from the twelve-lead electrocardiogram (ECG) to extract one or more lower dimensional feature embeddings; constructing, by a transformer network, a classification model based on the one or more lower dimensional feature embeddings; obtaining, one or more patterns based on the classification model at a domain-principled ResNet-Transformer network; obtaining, a trained model by learning the one or more patterns and information between the one or more lower dimensional feature embeddings by a classification head; and deriving, an inference decision boundary associated with one or more class labels based on the trained model. The residual network (ResNet) is a feature extractor. The transformer network corresponds to a multiclass multilabel classifier. The domain-principled ResNet-Transformer network includes one or more integrated functional components. The one or more integrated functional components corresponds to (i) a ResNet-Transformer network model, and (ii) a domain-principled inference.
In an embodiment, the residual network (ResNet) includes one or more residual units. In an embodiment, the one or more residual units corresponds to ten residual blocks with two convolutional layers per block. In an embodiment, the at least one physiological signal is expressed as a function of an initial signal along with a learnable residue function (F) to create an indirect path for the input signal to propagate entire depth of the network. In an embodiment, the function of the initial signal is based on a derivable function (A) which is obtained by one or more skip connections. In an embodiment, the classification head includes a dense layer and a sigmoid activation function. In an embodiment, the sigmoid activation function learns probabilities independently with number of classes, and high probabilities to obtain the one or more class labels. In an embodiment, the one or more class labels corresponds to one or more cardiovascular diseases (CVDs) group derived based on a clinical severity level. In an embodiment, the one or more cardiovascular diseases (CVDs) group corresponds to: (a) a non-critical (Cn) cardiovascular disease (CVD) group, (b) a critical (Cc) cardiovascular disease (CVD) group, and (c) a super-critical (Cs) cardiovascular disease (CVD) group.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
FIG. 1 illustrates a system for detection of a domain-principled inference with a Resnet-transformer model for ECG classification, according to some embodiments of the present disclosure.
FIG. 2A and FIG. 2B are exemplary functional block diagrams illustrating architectural overviews of the system of FIG. 1, according to some embodiments of the present disclosure.
FIG. 3 is an exemplary flow diagram illustrating method of detecting the domain-principled inference with the Resnet-transformer model for the ECG classification, according to an embodiment of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
There is a need for criticality-aware ECG classification model development, which is a practical requirement for an automated diagnosis solution, i.e., minimizing false negative rates of decisive diagnosis and ensuring improved sensitivity of critical diagnosis classes. The clinically significant diagnosis classes are to be handled in a more sensitive manner such that the automated model is capable of better inference on critical classes. Embodiments of the present disclosure provide a method and system of domain-principled inference with a ResNet-Transformer model for ECG classification. An ECG classification model is developed which is capable of interpreting practical ECG signals (e.g., 12-lead ECG recordings). The embodiment of the present disclosure discloses a hybrid deep neural model architecture, i.e., a domain-principled ResNet-Transformer, which includes two functional components: a ResNet-Transformer network and a domain-principled inference. The residual network or ResNet acts as a feature extractor from the ECG signal to construct lower dimensional feature embeddings, which are fed to a transformer network to capture patterns in the ECG signals. The internal information between the lower dimensional feature embeddings is learned for classification by a classification model with one or more class labels.
Referring now to the drawings, and more particularly to FIGS. 1 through 3, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
FIG. 1 illustrates a system 100 for detection of a domain-principled inference with a Resnet-transformer model for ECG classification, according to some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more processor(s) 102, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 104 operatively coupled to the one or more processors 102. The memory 104 includes a database. The one or more processor(s) 102, the memory 104, and the I/O interface(s) 106 may be coupled by a system bus 108 or a similar mechanism. The one or more processor(s) 102 that are hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more processor(s) 102 are configured to fetch and execute computer-readable instructions stored in the memory 104. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud, and the like.
The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface device(s) 106 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a camera device, and a printer. Further, the I/O interface device(s) 106 may enable the system 100 to communicate with other devices, such as web servers and external databases. The I/O interface device(s) 106 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. In an embodiment, the I/O interface device(s) 106 can include one or more ports for connecting number of devices to one another or to another server.
The memory 104 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 104 includes a plurality of modules 110 and a repository 112 for storing data processed, received, and generated by the plurality of modules 110. The plurality of modules 110 may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types.
Further, the database stores information pertaining to inputs fed to the system 100 and/or outputs generated by the system (e.g., data/output generated at each stage of the data processing) 100, specific to the methodology described herein. More specifically, the database stores information being processed at each step of the proposed methodology.
Additionally, the plurality of modules 110 may include programs or coded instructions that supplement applications and functions of the system 100. The repository 112, amongst other things, includes a system database 114 and other data 116. The other data 116 may include data generated as a result of the execution of one or more modules in the plurality of modules 110. Herein, the memory, for example the memory 104, and the computer program code configured to, with the hardware processor, for example the processor 102, cause the system 100 to perform various functions described herein.
FIG. 2A and FIG. 2B are exemplary functional block diagrams illustrating the architectural overviews of the system 100 of FIG. 1, according to some embodiments of the present disclosure. A system 200 may be an example of the system 100 (FIG. 1). The system includes a twelve-lead electrocardiogram (ECG) 202, a Resnet block 204, an embedded spacing 206, a transformer block 208, and a classification head 210. The Resnet block 204 includes a residual network or a ResNet. The transformer block 208 includes a transformer network. A residual network with a transformer is utilized for the construction of a classification model. A hybrid deep neural model architecture, referred to as a domain-principled ResNet-Transformer, includes two integrated functional components: a ResNet-Transformer network and a domain-principled inference. The residual network or the ResNet acts as a feature extractor from the ECG signal to extract lower dimensional feature embeddings, which are fed to the transformer network with the classification head 210 to capture one or more patterns in the ECG signals and to learn internal information between the lower dimensional feature embeddings to construct the classification model with one or more class labels.
A ResNet-Transformer model considers annotated twelve-lead ECG datasets with different disease classes as input training examples, where a set of N examples constitutes a training dataset X^train = {x^(1), x^(2), ..., x^(N)}, where each x^(n), n = 1, 2, ..., N is a recording from the twelve-lead electrocardiogram (ECG) 202, and the complete training set also includes the corresponding labels C^(X^train), where each of the training examples is associated with a single label or multiple labels out of a total of G diagnosis classes. In an embodiment, practical ECG recordings may be multi-label, with more than a single diagnosis annotated by an expert. The classification model is developed for the recordings from the twelve-lead electrocardiogram (ECG) 202 based on: (a) a base deep learning model to perform an accurate classification of signals of the twelve-lead electrocardiogram (ECG) 202; and (b) a domain-principled model, where the inference capability of the ResNet-Transformer is influenced by the severity of the one or more class labels with the intention that severe cardiovascular diseases (CVDs) obtain better sensitivity over the base ResNet-Transformer model.
With reference to FIG. 2B, the base deep learning model includes one or more modules: (a) the ResNet extracts a low-level embedding representation from the input ECG waveform, and (b) the transformer encoder network learns information from the low-level embedding representations to construct the classification model. The ResNet part of the model includes one or more residual units, where each of the one or more residual units performs the following computation:
y_l = h(x_l) + F(x_l, W_l) (1)
x_{l+1} = f(y_l) (2)
x_l is the input feature to the l-th residual block, W_l is the set of weights and biases, and x_{l+1} is the output of the l-th residual block. f is the activation function for the l-th layer. Considering L residual units, the output of the final block x_L can be expressed in terms of the initial input x_1 using equations (1) and (2) as:
x_L = A(x_1) + B(x_1, W_1, W_2, ..., W_L) (3)
where A is a derivable function obtained because of the one or more skip connections, which can be expressed as a function of the initial signal. B can be expressed as a summation of the residue function F. A key architectural advantage of the residual network is the attachment of one or more skip connections. The given input signal can be expressed as a function of an initial signal along with a learnable residue function, which creates an indirect path for the input signal to propagate through the entire depth of the network, thereby preserving the gradient for deeper networks.
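The residual computation of equations (1)-(2), stacked over ten blocks as described above, can be sketched as follows. This is an illustrative sketch only: the identity shortcut h(x) = x, the toy linear residue function F, the ReLU activation f, and the random weights are assumptions, not the disclosed convolutional architecture.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def residual_unit(x, W, b):
    """One residual unit per eqs. (1)-(2): y = h(x) + F(x, W); x_next = f(y).
    h is the identity skip connection, F a toy linear residue, f a ReLU."""
    y = x + (W @ x + b)   # eq. (1): identity shortcut plus learnable residue
    return relu(y)        # eq. (2): activation applied to the block output

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W = 0.01 * rng.standard_normal((8, 8))  # small residue: block starts near identity
b = np.zeros(8)

out = x
for _ in range(10):                     # ten residual blocks, as in the disclosure
    out = residual_unit(out, W, b)
```

Because the shortcut carries x through unchanged, the signal propagates the full depth even when the residue F contributes little, which is the gradient-preservation property noted above.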
In an embodiment, the network considers representations of one or more images by segmenting them into patches, and learnable positional encodings are added to apply positional information to the images. Subsequently, the transformer network is developed for the time series classification task. Consider the one or more representations as (a_1, a_2, ..., a_m), where a_i ∈ R^g and g represents the number of representations. Positional encodings (p_1, p_2, ..., p_m) are added to the representations to retain the positional information and to form the set of sequences Z = (z_1, z_2, ..., z_m). Each representation is processed in the attention block based on the following equations:
Q = W_q Z (4)
K = W_k Z (5)
V = W_v Z (6)
The individual vectors Q, K, V are the query, key, and value vectors, respectively, calculated using the weights W_q, W_k, W_v inside each attention head. Self-attention is calculated as:
Attention(Q, K, V) = softmax(QK^T / √(d_k)) V (7)
The attention vector is normalized and fed to a multi-layer perceptron (MLP) block. The √(d_k) is a scaling factor, where d_k is the size of the Q vector. Alternating layers of the self-attention and the MLP blocks are considered with residual connections between blocks.
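The single-head self-attention of equations (4)-(7) can be sketched minimally as follows. The dimensions, random weights, and row-wise layout (tokens as rows, so projections are computed as Z·W rather than W·Z) are illustrative assumptions.

```python
import numpy as np

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Z, Wq, Wk, Wv):
    """Single-head scaled dot-product attention per eqs. (4)-(7)."""
    Q, K, V = Z @ Wq, Z @ Wk, Z @ Wv          # eqs. (4)-(6): query/key/value
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k)) # eq. (7): scaled, row-normalized
    return weights @ V                        # weighted sum of value vectors

rng = np.random.default_rng(1)
m, g, d = 5, 16, 8                 # m positions, g-dim embeddings, d-dim heads
Z = rng.standard_normal((m, g))    # embeddings with positional encoding added
Wq, Wk, Wv = (rng.standard_normal((g, d)) for _ in range(3))
A = self_attention(Z, Wq, Wk, Wv)  # one (m, d) attention output
```

Each output row is a convex combination of the value vectors, with weights determined by the scaled query-key similarities.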
The classification head 210 includes a dense layer followed by a sigmoid activation function. As the classification task contains samples with one or more output classes, the sigmoid activation function is used to learn class probabilities independently, which assists the multi-label classification. In an embodiment, with a decision threshold hyper-parameter (< 0.5), the prediction is controlled for multi-label inference such that the predicted classes are those with sigmoid output probabilities more than a pre-assigned decision threshold. For a high number of target classes and the multi-label scenario, the decision threshold is chosen in a conservative manner.
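The multi-label thresholding described above can be sketched as follows; the threshold value 0.3 and the probability values are illustrative, not values taken from the disclosure.

```python
def multilabel_predict(sigmoid_probs, threshold=0.3):
    """Predict every class whose independent sigmoid probability exceeds the
    conservative (< 0.5) decision threshold; values here are illustrative."""
    return [i for i, p in enumerate(sigmoid_probs) if p > threshold]

probs = [0.72, 0.10, 0.35, 0.28]   # illustrative sigmoid outputs for 4 classes
print(multilabel_predict(probs))   # → [0, 2]
```

Unlike an argmax over a softmax, each class is thresholded independently, so a recording can receive zero, one, or several diagnosis labels.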
The diagnosis or treatment plan of a cardiologist is imitated in the inference decision making process of the ResNet-Transformer model over unseen test recordings from the twelve-lead ECG 202. The inference decision making process is termed a domain-principled inference that targets higher sensitivity on the severe or critical classes such that the detection capability of the ResNet-Transformer model increases on the critical CVDs. The classification labels, i.e., with respect to one or more diagnosis classes from one or more datasets obtained from the twelve-lead ECG 202, are divided into three groups: (a) a non-critical CVDs group, (b) a critical CVDs group, and (c) a super-critical CVDs group, denoted as Cn, Cc, and Cs respectively. A domain-principled inference model M* is derived from the base deep learning model (i.e., the ResNet-Transformer (M)) such that M* guarantees higher sensitivities (e.g., at least not lesser than that of M) in case of the critical CVDs group (Cc) and the super-critical CVDs group (Cs). For example, the classification of the diagnoses into different CVDs, i.e., the criticality-aware categorization, is as mentioned in Table 1 below:
A reward centric decision for critically sensitive classes is derived by incentivizing the inference decision to correctly identify the Cc and Cs groups of CVDs over the Cn CVDs under the multi-label classification task. In an embodiment, to estimate one or more decision boundaries for each of the CVD groups, the responses of the sigmoid outcome probabilities are captured as apriori knowledge (e.g., an estimation of the decision threshold relaxation parameter (d)). Consider the apriori knowledge from each of the ct, t = 1, 2, ..., G classes as the sigmoid outcome probabilities for the X^Val datasets when validated by the trained model, denoted by O_ct, ∀ct, t = 1, 2, ..., G. For example, a CVD class ct consists of a total of N_ct training instances, and r% of the training instances are taken as a validation set, so the set O_ct for that CVD class contains (r × N_ct)/100 elements. Each class contributes to build the apriori knowledge to understand the classifier response on X^Val, which is denoted by O = {O_c1, O_c2, ..., O_cG}, where each O_ct, t = 1, 2, ..., G contains the sigmoid output probabilities ∈ [0, 1]. In an embodiment, a reliable decision boundary τ_ct from the knowledge of O_ct is estimated by eliminating one or more spurious outcomes, which mainly arise from one or more outliers in O_ct. Consider an upper bound of τ_ct = 0.5, ∀ct, where τ_ct is the decision boundary of class ct. The rewards of correctly detecting the critical (Cc) and super-critical (Cs) classes are conferred by relaxing the decision boundaries, τ_ct < 0.5, ct ∈ Cc and Cs. The inference degradation due to a higher penalty in specificity and a lower gain in sensitivity is avoided by removing the spurious or outlier sigmoid probabilities on the validation datasets from the decision boundary estimation. An affinity propagation clustering technique is configured to identify such outliers in O_ct.
For example, a loopy belief propagation technique in the affinity propagation maximizes the chance that similar data points are grouped together by evaluating net similarity. The spurious cluster is identified as the one with the minimum centroid value.
The domain-principled inference algorithm ensures differential decision thresholds to infer a class prediction from the last activation layer (i.e., sigmoid) output probabilities, where the decision thresholds are relaxed with a single parameter, i.e., a decision threshold relaxation parameter (d), to ensure a better detection capability of the critical and the super-critical classes. The single parameter (d) is to be chosen as d > 1 but not as a very high value; it is recommended that d < 2, so that the specificity is not significantly (i.e., negatively) impacted.
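One way the differential thresholds could be realised is sketched below. Dividing the base threshold by d for the Cc and Cs groups is an assumption about how "relaxing" is implemented, and the class labels and values are hypothetical.

```python
def relaxed_thresholds(base_threshold, class_groups, d=1.5):
    """Per-class decision thresholds: critical (Cc) and super-critical (Cs)
    classes have the base threshold divided by d (1 < d < 2), lowering the
    bar for a positive prediction and improving sensitivity; non-critical
    (Cn) classes keep the base threshold. Division by d is an assumption."""
    thresholds = {}
    for label, group in class_groups.items():
        if group in ("Cc", "Cs"):
            thresholds[label] = base_threshold / d  # relaxed for critical CVDs
        else:
            thresholds[label] = base_threshold      # unchanged for Cn
    return thresholds

groups = {"sinus_rhythm": "Cn", "af": "Cc", "vt": "Cs"}  # hypothetical labels
th = relaxed_thresholds(0.45, groups, d=1.5)
print(th)  # Cc/Cs thresholds are lowered relative to Cn
```

Keeping d below 2 bounds how far the critical-class thresholds drop, which limits the specificity penalty the passage warns about.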
FIG. 3 is an exemplary flow diagram illustrating a method 300 of detecting the domain-principled inference with the ResNet-Transformer model for the ECG classification, according to an embodiment of the present disclosure. In an embodiment, the system 100 comprises one or more data storage devices or the memory 104 operatively coupled to the one or more hardware processors 102 and is configured to store instructions for execution of the steps of the method by the one or more processors 102. The depicted flow diagram is better understood by way of the following explanation/description. The steps of the method of the present disclosure will now be explained with reference to the components of the system as depicted in FIGS. 1, 2A, and 2B.
At step 302, one or more physiological signals are received from the twelve-lead electrocardiogram (ECG) 202 as an input. The one or more physiological signals are expressed as a function of an initial signal along with a learnable residue function (F) to create an indirect path for the input signal to propagate through the entire depth of the network. The function of the initial signal is based on a derivable function (A) which is obtained by one or more skip connections. At step 304, the one or more physiological signals from the twelve-lead electrocardiogram (ECG) 202 are processed by the residual network (ResNet) to extract one or more lower dimensional feature embeddings. The residual network (ResNet) is a feature extractor. The residual network (ResNet) includes one or more residual units. The one or more residual units correspond to ten residual blocks with two convolutional layers per block.
At step 306, a classification model is constructed by a transformer network based on the one or more lower dimensional feature embeddings. The transformer network corresponds to a multiclass multilabel classifier. At step 308, one or more patterns are obtained based on the classification model at a domain-principled ResNet-Transformer network. The domain-principled ResNet-Transformer network includes one or more integrated functional components. The one or more integrated functional components correspond to (i) a ResNet-Transformer network model, and (ii) a domain-principled inference. At step 310, a trained model is obtained by learning the one or more patterns and the information between the one or more lower dimensional feature embeddings by the classification head 210. The classification head 210 includes a dense layer and a sigmoid activation function. The sigmoid activation function learns class probabilities independently, and the high probabilities are used to obtain the one or more class labels. At step 312, an inference decision boundary associated with one or more class labels is derived based on the trained model. The one or more class labels correspond to one or more cardiovascular diseases (CVDs) groups derived based on a clinical severity level. The one or more cardiovascular diseases (CVDs) groups correspond to: (a) a non-critical (Cn) cardiovascular disease (CVD) group, (b) a critical (Cc) cardiovascular disease (CVD) group, and (c) a super-critical (Cs) cardiovascular disease (CVD) group.
Exemplary pseudo codes for the above-described method steps are mentioned below.
In an exemplary scenario, when the apriori knowledge is available, the domain-principled inference algorithm takes as input a trained model M, a test signal X^test, an inference decision threshold θ < 0.5, and a domain-principled factor d, where d > 1. The output corresponds to the inference labels on X^test, denoted by C^(X^test), consisting of C_(C_n)^(X^test), C_(C_c)^(X^test), C_(C_s)^(X^test) for C_n, C_c, C_s respectively. Let the output probabilities obtained from the model M for the X^test input be P_j^(X^test), j ∈ G, where G is the total number of class labels, i.e., P_j^(X^test) = {p_1, p_2, …, p_G}.
[040] In another exemplary scenario, when the apriori knowledge is unavailable or uncertain, the domain-principled inference algorithm takes as input the trained model M, a set of validation data X^val, a minimum inference decision boundary θ_min, and a domain-principled factor d, where d > 1. The output corresponds to an inference decision boundary for each class, denoted by θ_t, t ∈ G, such that a test data belongs to the t-th class when the sigmoid outcome is more than θ_t. Let the sigmoid output probabilities for each of the classes obtained from the model M over the validation set X^val be P_t^(X^val):
P_t^(X^val) = {p_1^(X^val), p_2^(X^val), …},
where p_i is a response of the i-th relevant validation data (i.e., only those validation data where a ground truth is positive) for the t-th class, and p_i constitutes the apriori knowledge.
for t ← 1 to G do
Step 1: Identify the sigmoid outcomes that require investigation:
P_t^(X^val)* = {p ∈ P_t^(X^val) : p < 0.5}
Step 2: Find the set of affinity propagation cluster centers S_c, where S_c = AffinityPropagation(P_t^(X^val)*, λ), and λ is a damping ratio.
if |S_c| == 0 then
p_val|selected = min(P_t^(X^val)*)
end
else
S_c|selected = S_c \ min(S_c) [removing the outlier cluster to minimize the chance of spurious interference]
p_val|selected = min(P_t^(X^val)*(S_c|selected)) [selecting P_t^(X^val)* from the remaining clusters]
end
Step 3: The inference decision boundary for each class is calculated as:
θ_t = max(p_val|selected, θ_min) [the decision boundary is lower bounded by θ_min to restrict the penalty in specificity]
end
Step 4: With super-critical class detection getting more reward than critical class detection, the decision boundary for each of these classes is finally calculated as:
θ_t = θ_t / d^a, where the degree of criticality a = 0, 1, 2 for the non-critical, critical, and super-critical classes respectively [disease group severity-based reward, parameterized by a].
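The threshold derivation above may be sketched as follows. This is a simplified illustration: the affinity-propagation clustering of Step 2 is replaced by a direct minimum over the sub-0.5 outcomes (as in the empty-cluster branch), and the function name and default values of θ_min and d are hypothetical:

```python
def derive_decision_boundaries(val_probs_per_class, criticality,
                               theta_min=0.1, d=1.5):
    # val_probs_per_class: sigmoid outcomes on positive validation data,
    # one list per class. criticality: degree a = 0 (non-critical),
    # 1 (critical), 2 (super-critical).
    boundaries = []
    for probs, a in zip(val_probs_per_class, criticality):
        # Step 1: outcomes below 0.5 require investigation.
        low = [p for p in probs if p < 0.5]
        # Simplified stand-in for Step 2's affinity-propagation
        # clustering: take the minimum investigated outcome, falling
        # back to 0.5 when no outcome needs investigation.
        selected = min(low) if low else 0.5
        # Step 3: lower-bound by theta_min to restrict the penalty
        # in specificity.
        theta = max(selected, theta_min)
        # Step 4: severity-based reward lowers the boundary for the
        # critical and super-critical classes (d > 1).
        boundaries.append(theta / d ** a)
    return boundaries
```

Dividing by d^a leaves non-critical thresholds unchanged (a = 0) while progressively lowering the decision boundary, and hence raising the sensitivity, for the critical and super-critical groups.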
[041] Experimental results:
[042] For example, a study is conducted with experimental datasets which consist of 43,101 ECG recording samples from six different sources. A challenge is to classify twenty-seven clinical diagnosis classes from the 12-lead ECG recordings as a multi-label classification task. As the samples are collected from different sources, the time domain length of the signal and the sampling frequency differ among the sources. Out of 111 abnormalities, a total of 27 abnormalities were included as they were relatively common during diagnosis and have more clinical importance from a real-world perspective. The datasets are sourced from one or more healthcare organizations at different geographic locations, and the ECG recordings are sampled at different frequencies and resampled to a fixed rate of 500 Hz. A uniform number of time steps with a time period of T sec is chosen. The initial T sec window is taken if the recording is of length more than T sec, and zero padding is applied if the length of the raw signal is less than T sec.
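The fixed-length windowing described above (initial T-sec window if longer, zero padding if shorter) may be sketched as below; the function name is illustrative:

```python
def fix_signal_length(signal, target_len):
    # Take the initial window when the recording exceeds T sec;
    # zero-pad when the raw signal is shorter than T sec.
    if len(signal) >= target_len:
        return signal[:target_len]
    return signal + [0.0] * (target_len - len(signal))
```
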
[043] Let C = [c_i] be a collection of diagnoses. A multiclass confusion matrix A = [a_ij] is defined, where a_ij is a normalized number of recordings that were classified as belonging to class c_i but belong to class c_j. A metric assigns different weights W = [w_ij] in the matrix for different entries based on similarity of treatment. The unnormalized score is given by s = Σ_ij [w_ij a_ij]. The score is normalized between 0 and 1. A higher score indicates better performance in overall classification.
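The unnormalized score s = Σ_ij [w_ij a_ij] may be computed as below. This is a sketch of the weighted sum only; the final 0-to-1 normalization depends on the best and worst achievable scores, which are not detailed here:

```python
def challenge_score(confusion, weights):
    # Unnormalized score: element-wise weighted sum over the
    # multiclass confusion matrix A = [a_ij] with weights W = [w_ij]
    # that encode similarity of treatment between diagnoses.
    return sum(w * a
               for w_row, a_row in zip(weights, confusion)
               for w, a in zip(w_row, a_row))
```
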
[044] Due to prevalence or limitations of the collection process, the class distribution may not be uniform, and accuracy may not be a good metric. Under such a scenario, the F_β score is considered as a more reliable metric. The F_β score can be interpreted as a weighted harmonic mean of precision and recall, which allows deciding on the balance between precision and recall using a parameter β.
F_β = ((1 + β^2) × precision × recall) / (β^2 × precision + recall)    (8)
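Equation (8) may be evaluated directly, for example:

```python
def f_beta(precision, recall, beta):
    # Weighted harmonic mean of precision and recall (Eq. 8);
    # beta > 1 weights recall (sensitivity) more than precision.
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With β = 1 this reduces to the usual F1 score; larger β values reward recall, which is the behavior the domain-principled inference targets.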
[045] The selection of β in the F_β score is of importance because β acts as a criterion for the trade-off between precision and recall (sensitivity). As with a typical prediction objective in medical domain analysis, the aim of this classification model is to ensure minimal false negatives. Therefore, the emphasis is on the outcomes when β > 1, where sensitivity is weighted more than precision.
[046] The residual network part consists of ten residual blocks with two convolutional layers per block. Batch Normalization (BN) with a Rectified Linear Unit (ReLU) activation function is used before each convolution layer. Alternate residual blocks are down-sampled using a max-pooling operation by a factor of two, which brings down the original input length by 2^5 times. In the transformer network, a scaled dot-product attention is calculated for each individual embedded representation inside a multiheaded attention layer. For each embedded representation, query, key, and value vectors are calculated by multiplying with learned weight matrices. In an embodiment, eight attention modules are stacked, which are referred to as one or more heads. The output of the multiheaded attention module is normalized and fed to the MLP layer to generate an input to the next layer. Alternating layers of self-attention and the MLP layer are stacked to obtain the final output before classification. In an embodiment, to solve the multi-label classification task, the sigmoid activation function is considered to compute the output probabilities.
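The scaled dot-product attention described above may be sketched as a single-head computation in plain Python. This is a minimal sketch: the query, key, and value vectors are assumed to have already been produced by the learned weight matrices, and the function and variable names are illustrative:

```python
import math

def scaled_dot_product_attention(queries, keys, values):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, per query.
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Dot-product scores against every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        # Numerically stabilized softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        attn = [e / z for e in exps]
        # Attention-weighted sum of the value vectors.
        outputs.append([sum(a * v[j] for a, v in zip(attn, values))
                        for j in range(len(values[0]))])
    return outputs
```

Stacking eight such modules with their own learned projections would give the multiheaded layer; here only the core per-head computation is shown.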
[047] In an embodiment, Table 2 highlights the important hyper-parameters that are used in developing the trained model.
Minimum inference decision boundary (θ_min): 0.1
TABLE 2
The experimental dataset is partitioned into training and testing parts with a random split by k-fold (k=10) stratified cross-validation. The model is constructed on each of the training sets (e.g., 90%) and the inference is drawn by the trained model over the rest, i.e., the test dataset (e.g., 10%). The outcome of the empirical investigation is illustrated below:
Two types of studies are focused on. The first type is a macro study, i.e., performance under an aggregate scale such as the average challenge metric and the F_β score, where all the diagnosis classes are collectively considered and compared under a ten-fold cross-validation study against the current benchmark model, PRNA. Table 3 is a comparative study of the average challenge metric (C.M) with standard deviation (σ) (i.e., values in bold indicate better performance merit).
As mentioned below, Table 4 depicts the variation in F_β at different β. The F_β scores for β > 1 in the domain-principled ResNet-Transformer model comprehensively outperform existing models. This is an empirical validation that the domain-principled model responds better on the sensitivity metric, while the average challenge metric value is only insignificantly impacted. In fact, at high β such as β = 5, 6, 7, 8, the domain-principled method predominates.
[053] Table 5 and Table 6 show the efficacy of the domain-principled model, where the sensitivity is always higher than the base ResNet-Transformer model for critical and super-critical diagnosis, which is the emphasis of this research work. However, the cost of higher sensitivity is traditionally balanced with lower specificity. The drop in specificity values, as depicted in Table 7 and Table 8, is much less than the gain in sensitivity. For non-critical diagnosis, the decision thresholds for inference do not change.
[054] The second type is a micro study or diagnosis-specific study, i.e., the efficacy of the domain-principled model over the base deep learning model, showing the sensitivity metric improvement of the critical and super-critical class prediction of the domain-principled model in Table 5, Table 6, Table 7, and Table 8. As mentioned below, Table 7 is a comparative study of specificity for super-critical classes.
[056] The outcomes of the inference vary with respect to the domain-principled factor d, which is a control parameter for fine-tuning the rewards in the inference method.
[057] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[058] The embodiment of the present disclosure herein addresses the unresolved problem of overall classification performance from the ECG recordings. The embodiment of the present disclosure herein provides a CVD screening solution, where a medical domain principle is to minimize false negative rates of decisive diagnosis, or to improve the sensitivity of the critical CVD classes. The domain-principled ResNet-Transformer model-based inference algorithm ensures higher sensitivity measures for the critical diagnosis classes, where a reward-based decision-making approach is proposed to maximize the reward of successfully predicting the critical CVDs, and a risk-averted approach is largely considered. A clinical domain principle of minimizing the chance of misdiagnosis is translated into a machine learning context as maximizing the sensitivity of the critical disease classes.
[059] The embodiment of the present disclosure herein thus provides a hybrid deep neural model architecture called domain-principled ResNet-Transformer for overall classification performance. While the proposed ResNet-Transformer model demonstrates good classification performance, the domain-principled inference ensures that the model is capable of higher sensitivity for the detection of the critical CVD diagnosis classes. The embodiment of the present disclosure herein provides better relative sensitivity of clinically significant classes over the base deep learning model. The domain-principled model achieves better sensitivity of Atrial fibrillation (AF) classification relative to the inference from the base deep learning model. The domain-principled ResNet-Transformer model is uniquely positioned to not only improve the aggregated metric for classification performance over the current state-of-the-art algorithms, but also respect the cardiologist's criteria for CVD detection as required in a practical automated ECG classification model. The ResNet-Transformer, or the corresponding domain-principled version, is around 7.6x lighter and with better performance efficacy than the relevant state-of-the-art model PRNA.
[060] The proposed approach incorporates the diagnosis principle followed by cardiologists such that false negative alarms are minimized for specific types of disease classes. Automated and reliable identification of critical CVDs from ECG recordings paves a path towards the development of artificial intelligence (AI) assistants for medical caregivers, particularly in emergency situations and in rural or remote areas. The claimed approach may be extended to introduce domain knowledge, e.g., rules that guide the cardiologists to decide the diagnosis, into the proposed domain-principled model with an objective of enriching the feature space for more accurate learning model development. The proposed domain-principled inference algorithm uses an affinity-propagation based clustering approach that estimates decision boundaries to provide a higher reward for correctly classifying the critical CVD diagnoses. While the proposed ResNet-Transformer model demonstrates reliable classification performance, the domain-principled inference algorithm ensures that the model is capable of higher sensitivity measures for the critical diagnosis classes. The proposed domain-principled ResNet-Transformer model is uniquely positioned not only to improve the aggregated metric for multilabel classification performance over the current state-of-the-art algorithms on publicly available, expert-annotated 12-lead ECG recording datasets, but also to respect the cardiologist's criteria for CVD detection as required in a practical automated ECG classification model.
[061] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[062] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[063] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[064] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[065] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
CLAIMS:
1. A processor implemented method (300), comprising:
receiving, via one or more hardware processors, at least one physiological signal from a twelve-lead electrocardiogram (ECG) as an input (302);
processing, by a residual network (ResNet), the at least one physiological signal from the twelve-lead electrocardiogram (ECG) to extract a plurality of lower dimensional feature embeddings, and wherein the residual network (ResNet) is a feature extractor (304);
constructing, by a transformer network, a classification model based on the plurality of lower dimensional feature embeddings, wherein the transformer network corresponds to a multiclass multilabel classifier (306);
obtaining, via the one or more hardware processors, a plurality of patterns based on the classification model at a domain-principled ResNet-Transformer network, wherein the domain-principled ResNet-Transformer network comprises a plurality of integrated functional components, wherein the plurality of integrated functional components corresponds to (i) a ResNet-Transformer network model, and (ii) a domain-principled inference (308);
obtaining, via the one or more hardware processors, a trained model by learning the plurality of patterns and information between the plurality of lower dimensional feature embeddings by a classification head (310); and
deriving, via the one or more hardware processors, an inference decision boundary associated with a plurality of class labels based on the trained model (312).
2. The processor implemented method (300) as claimed in claim 1, wherein the residual network (ResNet) comprises a plurality of residual units, and wherein the plurality of residual units corresponds to ten residual blocks with two convolutional layers per block.
3. The processor implemented method (300) as claimed in claim 1, wherein the at least one physiological signal is expressed as a function of an initial signal along with a learnable residue function (F) to create an indirect path for the input signal to propagate entire depth of the network, and wherein the function of the initial signal is based on a derivable function (A) which is obtained by a plurality of skip connections.
4. The processor implemented method (300) as claimed in claim 1, wherein the classification head comprises a dense layer and a sigmoid activation function, and wherein the sigmoid activation function learns probabilities independently with number of classes, and high probabilities to obtain the plurality of class labels.
5. The processor implemented method (300) as claimed in claim 1, wherein the plurality of class labels corresponds to a plurality of cardiovascular diseases (CVDs) group derived based on a clinical severity level, and wherein the plurality of cardiovascular diseases (CVDs) group corresponds to: (a) a non-critical (Cn) cardiovascular disease (CVD) group, (b) a critical (Cc) cardiovascular disease (CVD) group, and (c) a super-critical (Cs) cardiovascular disease (CVD) group.
6. A system (100), comprising:
a memory (104) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (102) coupled to the memory (104) via the one or more communication interfaces (106), wherein the one or more hardware processors (102) are configured by the instructions to:
receive, at least one physiological signal from a twelve-lead electrocardiogram (ECG) (202) as an input;
process, by a residual network (ResNet), the at least one physiological signal from the twelve-lead electrocardiogram (ECG) (202) to extract a plurality of lower dimensional feature embeddings, and wherein the residual network (ResNet) is a feature extractor;
construct, by a transformer network, a classification model based on the plurality of lower dimensional feature embeddings, wherein the transformer network corresponds to a multiclass multilabel classifier;
obtain, a plurality of patterns based on the classification model at a domain-principled ResNet-Transformer network, wherein the domain-principled ResNet-Transformer network comprises a plurality of integrated functional components, wherein the plurality of integrated functional components corresponds to (i) a ResNet-Transformer network model, and (ii) a domain-principled inference;
obtain, a trained model by learning the plurality of patterns and information between the plurality of lower dimensional feature embeddings by a classification head (210); and
derive, an inference decision boundary associated with a plurality of class labels based on the trained model.
7. The system (100) as claimed in claim 6, wherein the residual network (ResNet) comprises a plurality of residual units, and wherein the plurality of residual units corresponds to ten residual blocks with two convolutional layers per block.
8. The system (100) as claimed in claim 6, wherein the at least one physiological signal is expressed as a function of an initial signal along with a learnable residue function (F) to create an indirect path for the input signal to propagate entire depth of the network, and wherein the function of the initial signal is based on a derivable function (A) which is obtained by a plurality of skip connections.
9. The system (100) as claimed in claim 6, wherein the classification head (210) comprises a dense layer and a sigmoid activation function, and wherein the sigmoid activation function learns probabilities independently with number of classes, and high probabilities to obtain the plurality of class labels.
10. The system (100) as claimed in claim 6, wherein the plurality of class labels corresponds to a plurality of cardiovascular diseases (CVDs) group derived based on a clinical severity level, and wherein the plurality of cardiovascular diseases (CVDs) group corresponds to: (a) a non-critical (Cn) cardiovascular disease (CVD) group, (b) a critical (Cc) cardiovascular disease (CVD) group, and (c) a super-critical (Cs) cardiovascular disease (CVD) group.