Abstract: Embodiments of the present disclosure provide a method and system for few-shot meta-learning based Remaining Useful Life (RUL) estimation of machines. The method disclosed herein generates a task definition through support and query set preparation for the multi-sensor timeseries partial lifecycle data of machines and, unlike existing methods, does not require full life cycle data. The task definition is used to train a meta-learning model implemented using an external storage (ES) entry overriding strategy with a Long Short Term External Memory (LSTEM) network to improve the performance of the RUL estimation. The ES search result is combined with the normal LSTM cell penultimate layer output using a sigmoid gate and a Pearson’s correlation coefficient in the LSTEM network. Further, a median based RUL estimation process is applied on the prediction result of the meta-learning based model. [To be published with FIG. 1B]
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
METHOD AND SYSTEM FOR FEW-SHOT META-LEARNING BASED REMAINING USEFUL LIFE (RUL) ESTIMATION
Applicant
Tata Consultancy Services Limited, a company incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
[001] The embodiments herein generally relate to field of Remaining Useful Life (RUL) estimation of machines and, more particularly, to a method and system for few-shot meta-learning based RUL estimation.
BACKGROUND
[002] Most of the existing Machine Learning (ML) or Deep Learning (DL) based solutions for Remaining Useful Life (RUL) estimation of a machine, in healthcare and industrial IoT (IIoT) use cases such as manufacturing, energy and utility industry segments, largely depend on either or all of the following conditions: a large number of annotated samples is required for model creation (i.e., training), and each training sample instance must represent the full lifecycle of similar types of machines (or previous runs of the machine), i.e., each training instance must contain the start to the end of a machine lifecycle.
[003] However, in the real world, for the above mentioned use cases there is hardly any properly annotated dataset that satisfies all the above conditions. Most clients have real life datasets with constraints such as i) sparsity in size, as very few relevant training samples are present, and ii) incompleteness, as the majority of the training sample instances do not represent the entire lifecycle from start to end but only partial lifecycles (e.g., mid-life to end).
[004] The existing state of the art ML/DL based solutions underperform (or even cannot be applied) in RUL estimation in these cases. As is well known in the art, meta-learning addresses the challenges in few-shot learning scenarios, or scenarios where labelled data for training is scarce. Recently, attempts have been made in the literature to utilize few-shot meta-learning based approaches for RUL estimation. However, RUL estimation models need to handle time-series data across different scales and across multiple sensors. Thus, Long Short-Term Memory (LSTM) neural networks (NNs), which are effective and scalable models for handling time-sequential data, are commonly utilized to implement few-shot meta-learning models. Works in the literature have proposed modified LSTMs for RUL estimation, including an LSTM that integrates a novel partial least squares method based on a genetic algorithm (GAPLS-LSTM), and a variant LSTM tracking cell states actively (AST-LSTM NN), where the AST-LSTM cell determines old information and new data simultaneously through a fixed connection. However, the above modified LSTMs are not standard approaches for implementation of few-shot meta-learning.
[005] Further, most works in the literature rely on, or assume the availability of, full life cycle data of machines while building meta-learning models, and hardly attempt to resolve the technical challenge posed by the practical real life problem of unavailability of full life cycles for training RUL estimation models.
SUMMARY
[006] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
[007] For example, in one embodiment, a method for few-shot meta-learning based Remaining Useful Life (RUL) estimation is provided. The method receives training data comprising time series data captured from a plurality of sensors associated with a limited number of machines being monitored for building a Meta-model for the RUL estimation. Further, the method creates a training task definition comprising a plurality of training tasks and a dataset pair (ds, dq) per each of the plurality of training tasks, wherein the dataset pair (ds, dq) to generate a training support set (ds) and a training query set (dq) is derived from the training data, wherein creating the training task definition comprises: i) recording, from the training data, the time series data associated with the limited number of machines (m) of varying types and comprising a limited number of instances (p) of each of the limited number of machines, wherein each of the limited number of instances comprises partial life cycles of each of the limited number of machines (m) and is represented by a sequence of tuples with a tuple count (n’); ii) determining a minimum instance length (n) from among lengths of the plurality of time instances, wherein the minimum instance length is selected in accordance with a minimum instance length that is present in the training data; iii) creating datasets for each of the plurality of training tasks by selecting a set of machines (m’) from among the limited number of machines (m) of varying types with a randomly selected instance of tuple count (n’) from among the limited number of instances (p) available in the training data, wherein the summation of the tuple count (n’) for the randomly selected instances for each of the set of machines (m’) is equal to the minimum instance length (n); and iv) selecting the dataset pair (ds, dq) for the training support set and the training query set respectively for each of the plurality of task definitions. Further, the method builds the Meta-model for the few-shot meta-learning using a Long Short Term External Memory (LSTEM) network, wherein the training support set per task is used for learning of the Meta-model and the training query set per task is used for evaluating of the Meta-model to generate a trained Meta-model. Furthermore, the method predicts the RUL of a machine based on a median based mechanism used by the trained Meta-model for prediction when tested in accordance with a testing task definition comprising a testing support set and a testing query set.
[008] In another aspect, a system for few-shot meta-learning based Remaining Useful Life (RUL) estimation is provided. The system comprises a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to receive training data comprising time series data captured from a plurality of sensors associated with a limited number of machines being monitored for building a Meta-model for the RUL estimation. Further, to create a training task definition comprising a plurality of training tasks and a dataset pair (ds, dq) per each of the plurality of training tasks, wherein the dataset pair (ds, dq) to generate a training support set (ds) and a training query set (dq) is derived from the training data, wherein creating the training task definition comprises: i) recording, from the training data, the time series data associated with the limited number of machines (m) of varying types and comprising a limited number of instances (p) of each of the limited number of machines, wherein each of the limited number of instances comprises partial life cycles of each of the limited number of machines (m) and is represented by a sequence of tuples with a tuple count (n’); ii) determining a minimum instance length (n) from among lengths of the plurality of time instances, wherein the minimum instance length is selected in accordance with a minimum instance length that is present in the training data; iii) creating datasets for each of the plurality of training tasks by selecting a set of machines (m’) from among the limited number of machines (m) of varying types with a randomly selected instance of tuple count (n’) from among the limited number of instances (p) available in the training data, wherein the summation of the tuple count (n’) for the randomly selected instances for each of the set of machines (m’) is equal to the minimum instance length (n); and iv) selecting the dataset pair (ds, dq) for the training support set and the training query set respectively for each of the plurality of task definitions. Further, to build the Meta-model for the few-shot meta-learning using a Long Short Term External Memory (LSTEM) network, wherein the training support set per task is used for learning of the Meta-model and the training query set per task is used for evaluating of the Meta-model to generate a trained Meta-model. Furthermore, to predict the RUL of a machine based on a median based mechanism used by the trained Meta-model for prediction when tested in accordance with a testing task definition comprising a testing support set and a testing query set.
[009] In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions, which when executed by one or more hardware processors cause a method for few-shot meta-learning based Remaining Useful Life (RUL) estimation to be performed.
[0010] The method receives training data comprising time series data captured from a plurality of sensors associated with a limited number of machines being monitored for building a Meta-model for the RUL estimation. Further, the method creates a training task definition comprising a plurality of training tasks and a dataset pair (ds, dq) per each of the plurality of training tasks, wherein the dataset pair (ds, dq) to generate a training support set (ds) and a training query set (dq) is derived from the training data, wherein creating the training task definition comprises: i) recording, from the training data, the time series data associated with the limited number of machines (m) of varying types and comprising a limited number of instances (p) of each of the limited number of machines, wherein each of the limited number of instances comprises partial life cycles of each of the limited number of machines (m) and is represented by a sequence of tuples with a tuple count (n’); ii) determining a minimum instance length (n) from among lengths of the plurality of time instances, wherein the minimum instance length is selected in accordance with a minimum instance length that is present in the training data; iii) creating datasets for each of the plurality of training tasks by selecting a set of machines (m’) from among the limited number of machines (m) of varying types with a randomly selected instance of tuple count (n’) from among the limited number of instances (p) available in the training data, wherein the summation of the tuple count (n’) for the randomly selected instances for each of the set of machines (m’) is equal to the minimum instance length (n); and iv) selecting the dataset pair (ds, dq) for the training support set and the training query set respectively for each of the plurality of task definitions. Further, the method builds the Meta-model for the few-shot meta-learning using a Long Short Term External Memory (LSTEM) network, wherein the training support set per task is used for learning of the Meta-model and the training query set per task is used for evaluating of the Meta-model to generate a trained Meta-model. Furthermore, the method predicts the RUL of a machine based on a median based mechanism used by the trained Meta-model for prediction when tested in accordance with a testing task definition comprising a testing support set and a testing query set.
[0011] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0012] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[0013] FIG. 1A is a functional block diagram of a system for few-shot meta-learning based Remaining Useful Life (RUL) estimation, in accordance with some embodiments of the present disclosure.
[0014] FIG. 1B illustrates an architectural and process overview of the system of FIG. 1A, in accordance with some embodiments of the present disclosure.
[0015] FIG. 2A and FIG. 2B (collectively referred to as FIG. 2) illustrate a flow diagram of a method for few-shot meta-learning based Remaining Useful Life (RUL) estimation, using the system of FIG. 1A, in accordance with some embodiments of the present disclosure.
[0016] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION OF EMBODIMENTS
[0017] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[0018] Embodiments of the present disclosure provide a method and system for few-shot meta-learning based Remaining Useful Life (RUL) estimation of machines. The machines herein refer to any industrial machines, vehicle engines or the like.
[0019] The method disclosed herein generates a task definition through support and query set preparation for the multi-sensor timeseries partial lifecycle data of machines and, unlike existing methods, does not require full life cycle data. The task definition is used to train a Meta-model implemented using an external storage (ES) entry overriding strategy with a Long Short Term External Memory (LSTEM) network. The LSTEM disclosed herein is an approach that comes under the Memory Augmented Neural Network (MANN) based implementation of the few-shot meta-learning technique. However, in a standard MANN, few-shot meta-learning follows a weight based approach. The MANN uses Least Recently Used (LRU) as a memory replacement strategy, but LRU is not efficient in the above mentioned use cases like manufacturing, energy and utility industry segments. The technical approach followed by the LSTEM and its advancement over the MANN are better understood in conjunction with the description of the figures.
[0020] The LSTEM improves the performance of RUL estimation when using the meta-learning process for RUL estimation of machines, specifically in the manufacturing and energy-utility domain, where there exists scarcity of labelled training data. The ES search result is combined with the normal LSTM cell penultimate layer output using a sigmoid gate and a Pearson’s correlation coefficient in the LSTEM network. Further, a median based RUL estimation process is applied on the prediction result of the meta-learning based model.
[0021] Referring now to the drawings, and more particularly to FIGS. 1A through 2B, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[0022] FIG. 1A is a functional block diagram of a system for few-shot meta-learning based Remaining Useful Life (RUL) estimation, in accordance with some embodiments of the present disclosure.
[0023] In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware processors is configured to execute functions of one or more functional blocks of the system 100.
[0024] Referring to the components of system 100, in an embodiment, the processor(s) 104, can be one or more hardware processors 104 and can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, and the like.
[0025] The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface to display the generated target images and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular and the like. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting to a number of external devices to receive multisensory time series data for RUL estimation received from a plurality of sensors.
[0026] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or
non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[0027] Further, the memory 102 includes a database 108 that stores the training data received, a training support set, a training query set, a testing support set, and a testing query set generated by the system 100. Further, the memory 102 includes modules such as the LSTEM and the like. Further, the memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. The database 108 may also comprise a plurality of executable instructions which when executed cause the hardware processor(s) 104 to perform various actions/steps associated with the few-shot meta-learning based Remaining Useful Life (RUL) estimation being handled by the system of FIG. 1A. In an embodiment, the database 108 may be external (not shown) to the system 100 and coupled to the system via the I/O interface 106. Functions of the components of the system 100 are explained in conjunction with the architectural overview of FIG. 1B and the flow diagram of FIG. 2.
[0028] FIG. 1B illustrates an architectural and process overview of the system of FIG. 1A, in accordance with some embodiments of the present disclosure and is explained in conjunction with flow diagram of FIG. 2.
[0029] FIG. 2A and FIG. 2B (collectively referred as FIG. 2) is a flow diagram illustrating a method 200 for the few-shot meta-learning based Remaining Useful Life (RUL) estimation, using the system of FIG. 1A, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 200 by the processor(s) or one or more hardware processors 104. The steps of the method 200 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1A and 1B and the steps of flow diagram as depicted in FIG. 2. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily
indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
[0030] Referring to the steps of the method 200, at step 202 of the method 200, the one or more hardware processors 104 receive training data comprising time series data captured from a plurality of sensors associated with a limited number of machines being monitored for building a Meta-model for the RUL estimation. As mentioned, there exists a practical challenge in obtaining a large number of labelled data samples from a large number of machines, and also in getting full life cycle data for generating training datasets. Thus, the method 200 discloses a data preparation approach to address the technical challenges of i) sparsity in size, due to very few relevant training samples being present, and ii) the training data being incomplete in nature, as the majority of the training sample instances do not represent the entire lifecycle from the start to the end but only partial lifecycles (e.g., mid-life to end), when handling minimum-instance, active-supervised learning based prognostics for the time-series data captured from the plethora of sensors. Thus, to handle this technical challenge with training data, at step 204 of the method 200, the one or more hardware processors 104 create a training task definition comprising a plurality of training tasks and a dataset pair (ds, dq) per training task to generate a training support set (ds) and a training query set (dq), derived from the training data. The training task definition creation comprises:
a) Recording, from the training data, the time series data associated with a limited number of machines (m) of varying types and comprising a limited number of instances (p) of each of the limited number of machines. Each of the limited number of instances comprises partial life cycles of each of the limited number of machines (m) and is represented by a sequence of tuples with a tuple count (n’) (204a).
b) Determining a minimum instance length (n) from among lengths of the plurality of time instances. The minimum instance length is selected in accordance with a minimum instance length that is present in the training data (204b).
c) Creating datasets for each of the plurality of training tasks by selecting a set of machines (m’) from among the limited number of machines (m) of varying types with a randomly selected instance of tuple count (n’) from among a limited number of instances (p) available in the training data. The summation of the tuple count (n’) for the randomly selected instances for each of the set of machines (m’) is equal to the minimum instance length (n) (204c).
d) Selecting data set pair (ds, dq) for the training support set and the training query set respectively for each of the plurality of task definitions (204d).
[0031] The steps 204a through 204d for training task definition through data preparation are explained below:
• Let there be, in the training dataset, m different machines (e.g., vehicles) of different types (like make/model), with at least p instances for each machine.
• Each instance does not necessarily contain the full life cycle but only a partial lifecycle (e.g., mid-life to end).
• Both m and p are very small in number, with m >= 15 machines and p >= 4 instances available (few-shot learning scenario).
• Let n = the minimum of the set of the training instance lengths, i.e., at least n timesteps are given for each of the training instances.
Note: This sequence length ‘n’ should be chosen in such a way that each test instance in the test dataset (on which RUL predictions are to be made) has length at least ‘n’.
• Let one instance be represented as a sequence of tuples
{(x_1, r_1), (x_2, r_2), ..., (x_t, r_t), ..., (x_{n'}, r_{n'})}, where x_t = feature vector at the time-step t, r_t = annotated RUL at the time-step t, and n' ≥ n.
• One training task is associated with one Training Support Set and one Training Query Set.
• One dataset pertaining to the training task is defined as a set of n tuples, d = {(x_i, r_i)}, i = 1, 2, ..., n.
• Formation of the dataset (associated with the task):
° Step-1: m_1 different machines can be chosen at random out of the m different training machines such that 1 ≤ m_1 ≤ m.
° Step-2: Each of these m_1 different machines has at least p instances, and one subsequence is chosen at random from one instance of each machine. Let n_i be the length of the subsequence chosen from any of the instances of machine i, where i = 1, 2, ..., m_1. Then Σ_{i=1}^{m_1} n_i = n.
e.g., From 3 randomly selected machines out of the m different machines, any 3 subsequences of lengths n_1, n_2 and n_3 can be selected such that Σ_{i=1}^{3} n_i = n.
Furthermore, a subsequence of length n_1 chosen from the machine-1 instances can be represented as:
{(x_{11}, r_{11}), (x_{12}, r_{12}), ..., (x_{1t}, r_{1t}), ..., (x_{1n_1}, r_{1n_1})}
• From the above steps, it is evident that a set of a very large number of datasets, {d_1, d_2, ..., d_k}, where d_j = {(x_i, r_i)}_{i=1}^{n}, j = 1, 2, ..., k, can be formed by the random selection of the sub-sequences and by varying the order of the selected sub-sequences.
e.g., If 3 sub-sequences SS1, SS2 and SS3 are put in this order to form a dataset, then many other datasets like {SS2, SS1, SS3}, {SS1, SS3, SS2} etc. can be formed just by changing their order of inclusion. Thus, in this way the method 200 generates abundant data from the limited training data received from sensors or other sources.
• Support and Query set formation: After forming a very large number of datasets associated with the training tasks, this set of datasets {d_1, d_2, ..., d_k} can be used to select dataset-pairs (ds, dq) such that both ds and dq are formed using instances of the same types of machines. ds is treated as the support set and used for learning the mechanism for solving the task during the Meta-model training phase. dq is treated as the query set and used for evaluation of the mechanism learned using ds during the Meta-model training phase.
• Justification for the Task definition: The main aim is NOT that the model should learn the exact sequence mapping based on the actual content of the data and label, but that it learns the general mechanism of binding the data with the label (i.e., to learn to learn). Data shuffling using the random sub-sequence selection and sub-sequence order variation during the task creation is used to prevent the model from exact sequence learning and to encourage the meta-learning.
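By way of illustration, the following is a minimal Python sketch of the above task-definition procedure. It assumes that every partial-lifecycle instance has length at least n (so any required subsequence can be drawn from it); the function names (make_task_dataset, make_task) and the data layout are purely illustrative and not part of the disclosed method.

import random

def make_task_dataset(pool, n):
    # Concatenate one random subsequence per machine in `pool` so that the
    # subsequence lengths n_1, ..., n_m1 sum to n (Step-1/Step-2 above).
    machines = list(pool)
    random.shuffle(machines)                    # order variation between datasets
    m1 = len(machines)
    cuts = sorted(random.sample(range(1, n), m1 - 1))
    lengths = [b - a for a, b in zip([0] + cuts, cuts + [n])]   # sums to n
    dataset = []
    for machine, n_i in zip(machines, lengths):
        inst = random.choice(pool[machine])     # one of the p partial lifecycles
        start = random.randrange(len(inst) - n_i + 1)
        dataset.extend(inst[start:start + n_i]) # n_i consecutive (x_t, r_t) tuples
    return dataset                              # d = {(x_i, r_i)}, i = 1, ..., n

def make_task(instances_by_machine, n):
    # One training task: a (support set d_s, query set d_q) pair, both of
    # length n and both formed from instances of the same chosen machines.
    m1 = random.randint(1, len(instances_by_machine))
    chosen = random.sample(list(instances_by_machine), m1)
    pool = {mc: instances_by_machine[mc] for mc in chosen}
    return make_task_dataset(pool, n), make_task_dataset(pool, n)

Here instances_by_machine maps a machine identifier to its list of annotated partial-lifecycle instances, each instance being a list of (feature_vector, rul_label) tuples; repeated calls to make_task yield the abundance of distinct tasks described above.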
[0032] Further, at step 206, the one or more hardware processors 104 build the Meta-model for the few-shot meta-learning using the Long Short Term External Memory (LSTEM) network. The training support set per task is used for learning of the Meta-model and the training query set per task is used for evaluating of the Meta-model to generate a trained Meta-model. The LSTEM network combines the search result of an External Storage (ES) with the output of a standard LSTM cell penultimate layer using the Sigmoid gate and a Pearson’s Correlation Coefficient (PCC). The ES is read in accordance with a read frequency vector that records the frequency of read operations of memory locations of the ES, and the ES is overwritten in accordance with an unused frequency vector, which records the frequency of write operations of the memory locations of the ES, together with the read frequency vector.
[0033] The Meta-model Building using LSTEM:
• During the Meta-model training/building phase, the training support sets are used for learning the mechanism for solving the tasks and the training query sets are used for the evaluation of that learning.
• For the Meta-model building, the LSTEM network is used, which employs both a standard LSTM and a 2-dimensional External Storage, denoted by ES, to build and execute the Meta-model.
• The ES is used to store (i, l) pairs, where i = input feature vector, and l = label (annotated RUL value) corresponding to the input i.
• ES ∈ R^{N×L}, where N = number of (i, l) pairs, and L = length(i) + length(l).
• For any time-step t, let the input be i_t. This i_t is first checked against the ES, and from there the most suitable (i_s, l_s) pair is retrieved using the ES Read operation (discussed in detail later).
• As discussed under the ES Read operation later, r (Pearson’s Correlation Coefficient) is used as the similarity measure, and r ∈ [-1, 1].
• Next, i_t is processed using an LSTM cell; let l′ be the output from its penultimate layer.
• Then the effective prediction/output (l_t) from this LSTM cell is given by
l_t = w(sig(r) · l_s + (1 - sig(r)) · l′) + b,
where w (weight) and b (bias) are trainable parameters which are learnt using back-propagation, and sig(r) = (1 + exp(-r))^(-1) is the Sigmoid function of r (Pearson’s Correlation Coefficient).
Note: sig(r) is used to control the trade-off in importance between l_s and l′ as the input to the final output layer of the LSTM cell.
• When the ES Read operation yields a high degree of similarity, the value of r will be highly positive, i.e., r → +1, and sig(r) approaches its maximum. Hence, in this case l_s will be given higher importance as compared to l′.
• Similarly, when the ES Read operation is not so effective, the value of r will be low/negative, and in this case l′ will get the higher importance.
• The label l_s retrieved from the ES is also passed to the next LSTM cell as an additional state.
• During the back-propagation phase (applicable only to the training), as the weight vector is updated, it automatically induces/promotes this (i, l) pairing strategy in the Meta-model.
• Let l_actual be the actual label for the input i. After the training at the time-step t, the (i, l_actual) pair is put in the ES using the ES Write operation (discussed in detail later).
• During the testing, the last snapshot of the ES as used in the training is used.
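For concreteness, a minimal numpy sketch of this gated combination is given below, under the simplifying assumption that the label values l_s and l′ are scalars and that the ES is a plain list of (i, l) pairs; the names pearson_r, sigmoid and lstem_output are illustrative only, not part of the disclosed network.

import numpy as np

def pearson_r(a, b):
    # Pearson's Correlation Coefficient between two feature vectors, r in [-1, 1].
    return float(np.corrcoef(a, b)[0, 1])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstem_output(i_t, l_prime, es_entries, w, b):
    # l_t = w * (sig(r) * l_s + (1 - sig(r)) * l') + b, where (i_s, l_s) is the
    # ES entry most similar to the current input i_t, l' (l_prime) is the LSTM
    # penultimate-layer output, and w, b are trainable parameters.
    r, l_s = max((pearson_r(i_t, i), l) for i, l in es_entries)   # ES Read
    g = sigmoid(r)             # high similarity -> favour the retrieved label l_s
    return w * (g * l_s + (1.0 - g) * l_prime) + b

In a trainable implementation, w and b would be parameters of the network learnt by back-propagation, as described above.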
ES Operations:
As already stated, ES ∈ R^{N×L}. The following two data structures are maintained to carry out the ES Read and Write operations.
• Read-frequency Vector (f_r):
A vector of real numbers ∈ R^N.
Initially f_r(k) = 0 for all k, where k is the vector index.
When an ES Read operation takes place at ES index k, f_r(k) = f_r(k) + 1. When an ES Write operation takes place at ES index k, set f_r(k) = 0.
• Unused-frequency Vector (f_un):
A vector of real numbers ∈ R^N.
Initially f_un(k) = 0 for all k, where k is the vector index.
Once an ES location with index k (say) is written, then onwards, for every read/write operation at an ES location k′:
f_un(k) = f_un(k) + 1 for all k ≠ k′
f_un(k′) = 0
Pseudo code 1:
ES_Read():
Input: i_t at time-step t.
Processing: Searches the ES for the most similar input entry out of all (i, l) pairs:
i_s = argmax_i r(i_t, i), where r(i_t, i) = Pearson’s Correlation Coefficient between i_t and i, and r ∈ [-1, 1].
Pseudo code 2:
ES_Write():
Input: The (i, l) pair to be written.
Processing:
If the ES is not full, then (i, l) can be written to the next available ES location.
But as ES ∈ R^{N×L} can contain only N such pairs, the ES would eventually become completely full. In that case, ES entry overwriting would be required.
Once the writing is done at an index (say, k_s), the following updates are done:
Update f_r: f_r(k_s) = 0
Update f_un: f_un(k) = f_un(k) + 1 for all k ≠ k_s
f_un(k_s) = 0
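The following is a minimal Python sketch tying the ES Read/Write operations to the two frequency vectors, including the slot-selection rule for a full store that is detailed in the paragraph below; the class layout and the default value of the parameter y are illustrative assumptions, not the disclosed implementation.

import numpy as np

class ExternalStorage:
    def __init__(self, capacity):
        self.entries = []                    # stored (i, l) pairs, at most N
        self.capacity = capacity             # N
        self.f_r = np.zeros(capacity)        # read-frequency vector f_r
        self.f_un = np.zeros(capacity)       # unused-frequency vector f_un

    def _touch(self, k):
        # After any read/write at slot k: f_un(k') += 1 for k' != k, f_un(k) = 0.
        self.f_un[:len(self.entries)] += 1
        self.f_un[k] = 0

    def read(self, i_t):
        # ES_Read: return the (i_s, l_s) pair maximising Pearson's r with i_t.
        sims = [np.corrcoef(i_t, i)[0, 1] for i, _ in self.entries]
        k = int(np.argmax(sims))
        self.f_r[k] += 1                     # count the read at index k
        self._touch(k)
        return self.entries[k], sims[k]

    def write(self, i, l, y=10):
        # ES_Write: append while space remains; otherwise overwrite the entry
        # with the least read frequency among the top-y most unused entries.
        if len(self.entries) < self.capacity:
            k = len(self.entries)
            self.entries.append((i, l))
        else:
            top_unused = np.argsort(self.f_un)[-y:]      # y most unused slots
            k = int(top_unused[np.argmin(self.f_r[top_unused])])
            self.entries[k] = (i, l)
        self.f_r[k] = 0                      # reset read frequency of written slot
        self._touch(k)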
ES entry overwriting strategy (specific to RUL estimation in the manufacturing and utility domain): From the f_un vector, the top y (y <= 10) unused entries are taken, and then out of these y entries, the entry which has the least read frequency, as determined from the f_r vector, is selected for the overwriting.
Justification: The rationale behind the approach disclosed in the embodiments herein is that, unlike in the MANN, the topmost unused entry may not be the one with the least read frequency out of the top y unused entries. So that location is found for writing which has the least read frequency out of the group of the rarely used entries. It can be understood that a rarely used entry with a high read-frequency has a greater chance of being read again than a comparatively more recently used entry with a lesser read frequency. It is seen that the frequency vector based approach of the method 200 disclosed herein is particularly helpful to improve the performance in the RUL estimation using the few-shot meta-learning process in the manufacturing and energy-utility domain. It may be that the same make (manufacturer) and a nearby model of a machine (like a vehicle) seen in the remote past plays an important role in the RUL estimation of a current machine with the same make and a nearby (e.g., next or previous) model.
[0034] Once the Meta-model is built, at step 208, the one or more hardware processors 104 predict the RUL of a machine based on a median based mechanism used by the trained Meta-model for prediction, when tested in accordance with a testing task definition comprising a testing support set and a testing query set. The data set pairs for the testing support set and the testing query set are generated in accordance with the steps for the data set pairs of the training support set and the training query set.
[0035] Testing Task Definition through Data Preparation:
• The test dataset contains the data for a large number of machines, each of a different type (make/model) which was not present in the training dataset.
• For each machine type, at least one labelled instance is present along with the current running instance on which RUL needs to be predicted.
In practice, it is most common to have a few labelled lifecycle instances for a machine, and what is required is to predict the RUL for the current running instance.
• One testing task is associated with one testing support set and one testing query set.
• The testing support set is formed from the labelled instances following the same process as discussed in the training support set creation section; the only difference is that the testing query set is created from the unlabelled instances.
• At the testing time, no learning takes place. Testing support set is used only for the forward propagation of LSTEM network. Testing query set is used for the prediction purpose.
[0036] Prediction Generation: For the prediction generation, the testing support set and the testing query set are used along with the final trained model. At the testing time, no learning takes place. The testing support set is used only for the forward propagation in the LSTEM network. The testing query set is used for the prediction purpose.
[0037] Median based mechanism for the RUL estimation:
• Consider any unlabelled test instance of length p,
{x_{t_1}, x_{t_2}, ..., x_{t_p}}
where p ≥ n (n is defined above in the Section-1 Training task definition through Data Preparation), t_i = t_{i-1} + 1, and x_{t_i} is a feature vector at the time-step t_i.
• Now [(t_p - t_1) - n + 1] different testing queries, each of length n, can be formed from this test instance by selecting any subsequence of length n, and together they form the testing query set.
• Next, consider any testing query,
q_test = {x_{t'_1}, x_{t'_2}, ..., x_{t'_n}}
where x_{t'_i} is the feature vector at the time-step t'_i, t'_i = t'_{i-1} + 1, t_1 ≤ t'_1 ≤ (t_p - n + 1) and (t_1 + n - 1) ≤ t'_n ≤ t_p.
• Then n different RUL predictions, {r_{t'_1}, r_{t'_2}, ..., r_{t'_i}, ..., r_{t'_n}}, will result from the model execution. Now, for any such prediction r_{t'_i}, the actual RUL estimation r̂_{t'_i} can be obtained as r̂_{t'_i} = (t'_i + r_{t'_i}) - t_p, where (t'_i + r_{t'_i}) = estimated full lifecycle of the instance and t_p = last known time-step for that instance.
• Thus, for the given testing query q_test, there will be n actual RUL estimations {r̂_{t'_1}, r̂_{t'_2}, ..., r̂_{t'_i}, ..., r̂_{t'_n}}.
• The effective RUL r_eff for q_test is then defined as the median of the series {r̂_{t'_i}}_{i=1}^{n},
i.e., r_eff = median({r̂_{t'_i}}_{i=1}^{n})
• Since q_test was one of the [(t_p - t_1) - n + 1] = v (say) different possible queries, there will be v effective RUL(s) for the unlabelled test instance; let them be denoted as {r_eff_1, r_eff_2, ..., r_eff_v}.
• Then the predicted RUL r_predicted is calculated as the median of the series {r_eff_j}_{j=1}^{v}, i.e., r_predicted = median({r_eff_j}_{j=1}^{v})
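A minimal Python sketch of this median based aggregation follows; here `model` is assumed to be a callable standing in for the trained LSTEM Meta-model, mapping a length-n query to its n per-time-step RUL predictions, and 0-based time-step indexing is used for simplicity.

import numpy as np

def predict_rul(test_instance, model, n):
    # test_instance: the unlabelled sequence {x_{t_1}, ..., x_{t_p}}, p >= n.
    p = len(test_instance)
    t_p = p - 1                              # last known time-step (0-indexed)
    r_effs = []
    for t1 in range(p - n + 1):              # the v possible length-n queries
        query = test_instance[t1:t1 + n]
        preds = model(query)                 # n RUL predictions r_{t'_i}
        # Actual RUL per time-step: (t'_i + r_{t'_i}) - t_p.
        actuals = [(t1 + i + r) - t_p for i, r in enumerate(preds)]
        r_effs.append(np.median(actuals))    # effective RUL for this query
    return float(np.median(r_effs))          # predicted RUL for the instance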
[0038] The method 200 is better understood with a practical use case explained herein. USE CASE: Consider a real-life business scenario. A vehicle engine manufacturing company (referred to as the client) has a small amount of vehicle engine maintenance related timeseries data (limited number of machines). The data includes 20 different vehicle engines (different with respect to their make, model, workings etc.). For each of these vehicles, the client has annotated partial lifecycle data for the last 5 maintenance cycles (limited number of instances). Note: Here the full lifecycle implies the time-period from just after one maintenance service of the engine to just before the next maintenance. But in reality it is very difficult (if not impossible) to collect full lifecycle data, and hence the client could gather the data only when the vehicle reached a maintenance center for its servicing. In this case, partial lifecycle data ranging from 40% to 60% of the full lifecycle is available for each maintenance cycle. Also, the data is annotated with the Remaining Useful Life (RUL) duration for each maintenance cycle.
[0039] Now the client’s requirement is to devise a process which can predict the RUL for a very large number of engines (nearly 100) which are of types different from the aforementioned 20 vehicle engines. For each of them, the client is able to provide annotated partial lifecycle timeseries data for the last 2 maintenance cycles. The client also has timeseries data for the current run (i.e., since the last maintenance) of those engines. The client wants to find the RUL of the current run for each of them for the obvious betterment of their business.
[0040] Constraints on the system 100 while predicting RUL:
• Annotated data are very few in number, and
• Full-lifecycle data are not available for the RUL estimation.
[0041] Approach as provided by the method 200 implemented by the system 100: In the real world this is a common scenario, and here the few-shot, meta-learning based approach is applied as disclosed. The overall process is as follows:
• The system 100 utilizes the annotated timeseries data for the last 5 maintenance cycles of the 20 different vehicle engines (as described above) for the Meta-model (i.e., meta-learning based model) creation (training) purpose.
• For the prediction purpose, data from 100 vehicle engines is used, including both the annotated 2 maintenance cycles’ (partial) data for each engine, and the current running cycle data (unannotated) for which the RUL needs to be predicted.
• From the training dataset, a large number of training tasks, each consisting of (training support set, training query set) pair are generated. Training support set and query set are formed from the
training data using the data shuffling technique which includes the random subsequence selection, subsequence order variation etc.
• Prediction dataset (i.e. the test dataset on which RUL prediction are carried out) is also treated in the similar fashion for the testing support and query set generation.
• Next, the meta-learning based model is created by training on the large number of training tasks (as generated above) using the LSTEM based technique.
• Once the Meta-model is created successfully, it is used to predict the RUL on the test/prediction dataset by applying the median based RUL estimation approach.
[0042] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0043] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means, and software means. The method embodiments described herein could be
implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[0044] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0045] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[0046] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which
information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0047] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
We Claim:
1. A processor implemented method (200) for few-shot meta-learning based Remaining Useful Life (RUL) estimation, the method comprising:
receiving, by one or more hardware processors, training data comprising time series data captured from a plurality of sensors associated with a limited number of machines being monitored for building a Meta-model for the RUL estimation (202);
creating, by the one or more hardware processors, a training task definition comprising a plurality of training tasks and a dataset pair (ds, dq) per each of the plurality of training tasks, wherein the dataset pair (ds, dq) to generate a training support set (ds) and a training query set (dq) is derived from the training data (204), wherein creating the training task definition comprises:
recording, from the training data, the time series data associated with the limited number of machines (m) of varying types and comprising a limited number of instances (p) of each of the limited number of machines, wherein each of the limited number of instances comprise partial life cycles of each of the limited number of machines (m) and is represented by a sequence of tuples with a tuple count (n’) (204a);
determining a minimum instance length (n) from among lengths of the plurality of time instances, wherein the minimum instance length is selected in accordance with a minimum instance length that is present in the training data (204b);
creating datasets for each of the plurality of training tasks by selecting a set of machines (m’) from among the limited number of machines (m) of varying types with a randomly selected instance of tuple count (n’) from among a limited number of instances (p) available in the training data, wherein summation of the tuple count (n’) for the randomly selected instances for each of the set of machines (m’) is equal to the minimum instance length (n) (204c); and
selecting the data set pair (ds, dq) for the training support set and the training query set respectively for each of the plurality of task definitions (204d); and
building, by the one or more hardware processors, the Meta-model for the few-shot meta-learning using a Long Short Term External Memory (LSTEM) network (206), wherein the training support set per task is used for learning of the Meta-model and the training query set per task is used for evaluating of the Meta-model to generate a trained Meta-model.
2. The method as claimed in claim 1, further comprising predicting, by the one or more hardware processors, the RUL of a machine based on a median based mechanism used by the trained Meta-model for prediction when tested in accordance with a testing task definition comprising a testing support set and a testing query set (208).
3. The method as claimed in claim 1, wherein the Long Short Term External Memory (LSTEM) network combines search result of an External Storage (ES) along with output of a standard LSTM cell penultimate layer using the Sigmoid gate and a Pearson’s Correlation Coefficient (PCC).
4. The method as claimed in claim 3, wherein the ES is read in accordance with a read frequency vector that records frequency of read operations of memory locations of the ES, and the ES is overwritten in accordance with an unused frequency vector that records frequency of write operations of the memory locations of the ES and the read frequency vector.
5. The method as claimed in claim 1, wherein predicting the RUL based on the median based mechanism comprises:
a) determining a plurality of actual RULs from predictions with respect to each time-step of a given query instance for a plurality of query instances in the testing query set;
b) determining an effective RUL from the plurality of the actual RULs for the each of the plurality of query instances using the median based approach; and
c) determining a final RUL from the plurality of the effective RULs corresponding to the plurality of query instances generated for a single test instance using median based approach.
6. A system (100) for few-shot meta-learning based Remaining Useful Life (RUL) estimation , the system (100) comprising:
a memory (102) storing instructions;
one or more Input/Output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the
one or more I/O interfaces (106), wherein the one or more hardware
processors (104) are configured by the instructions to:
receive training data comprising time series data captured from a plurality of sensors associated with a limited number of machines being monitored for building a Meta-model for the RUL estimation;
create a training task definition comprising a plurality of training tasks and a dataset pair (ds, dq) per each of the plurality of training tasks, wherein the dataset pair (ds, dq) to generate a training support set (ds) and a training query set (dq) is derived from the training data, wherein creating the training task definition comprises:
recording, from the training data, the time series data associated with the limited number of machines (m) of varying types and comprising a limited number of instances (p) of each of the limited number of machines, wherein each of the limited number of instances comprise partial life cycles of each of the limited
number of machines (m) and is represented by a sequence of tuples with a tuple count (n’);
determining a minimum instance length (n) from among lengths of the plurality of time instances, wherein the minimum instance length is selected in accordance with a minimum instance length that is present in the training data;
creating datasets for each of the plurality of training tasks selecting a set of machines (m’) from among the limited number of machines (m) of varying types with a randomly selected instance of tuple count (n’) from among a limited number of instances (p) available in the training data, wherein summation of the tuple count (n’) for the randomly selected instances for each of the set of machines (m’) is equal to the minimum instance length (n); and
selecting data set pair (ds, dq) for the training support set and the training query set respectively for each of the plurality of task definitions; and
build the Meta-model for the few-shot meta-learning using a Long Short Term External Memory (LSTEM) network, wherein the training support set per task is used for learning of the Meta-model and the training query set per task is used for evaluating of the Meta-model to generate a trained Meta-model.
7. The system as claimed in claim 6, wherein the one or more hardware processors are further configured to predict the RUL of a machine based on a median based mechanism used by the trained Meta-model for prediction when tested in accordance with a testing task definition comprising a testing support set and a testing query set.
8. The system as claimed in claim 6, wherein the Long Short Term External Memory (LSTEM) network combines search result of an External Storage
(ES) along with output of a standard LSTM cell penultimate layer using the Sigmoid gate and a Pearson’s Correlation Coefficient (PCC).
9. The system as claimed in claim 8, wherein the ES is read in accordance with a read frequency vector that records frequency of read operations of memory locations of the ES, and the ES is overwritten in accordance with an unused frequency vector that records frequency of write operations of the memory locations of the ES and the read frequency vector.
10. The system as claimed in claim 6, wherein the one or more hardware processors are configured to predict the RUL based on a median based mechanism by:
a) determining a plurality of actual RULs from predictions with respect to each time-step of a given query instance for a plurality of query instances in the testing query set;
b) determining an effective RUL from the plurality of the actual RULs for the each of the plurality of query instances using the median based approach; and
c) determining a final RUL from the plurality of the effective RULs
corresponding to the plurality of query instances generated for a single test
instance using median based approach.