System And Method For Scalable Machine Fault Analysis



Abstract: The present disclosure provides a system (132) and a method that use a source machine (low-capacity) to generate data representative of a target machine (high-capacity) and model the target system for fault diagnosis with that data. This is achieved by learning a feature space transformation from the source machine to the target machine in the healthy condition, employing constrained maximum likelihood linear regression (CMLLR). The feature spaces of the source and target systems in the healthy condition are modelled using two Gaussian mixture models (GMMs) to estimate the CMLLR transformation parameters. The transformation is applied to the source system’s fault data to synthesize data representative of the target system’s fault condition. For the feature space transformation to work, the feature spaces should be linearly transformable. Linearity of the feature space is maintained by varying the number of nodes in the feature extraction layer. A deep neural network with discriminative feature extraction capabilities is employed for deriving the features.


Patent Information

Application #

Filing Date

23 May 2024

Publication Number

21/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

AMRITA VISHWA VIDYAPEETHAM

Amrita Vishwa Vidyapeetham, Coimbatore Campus Coimbatore- 641 112, Tamil Nadu, India

Inventors

1. KUMAR, C. Santhosh

180 Chandranagar Colony, Chandranagar PO Palakkad, Kerala 678007

2. KUDUKKEN THAZHATHEVEETTIL, Sreekumar

Kudukken Thazhatheveettil, Alappadamba Ettukudukka PO Kannur, Kerala 670521

3. KI, Ramachandran

Amrita Vishwa Vidyapeetham, Amrita Nagar PO Ettimadai, Coimbatore, Tamil Nadu 641112

Specification

FIELD OF THE INVENTION
[001] The invention generally relates to machine fault analysis and in particular the present disclosure relates to a system and method for scalable machine fault analysis.

DESCRIPTION OF THE RELATED ART
[002] Condition monitoring of machines is an important area of research in industrial applications. Various methods and techniques have been proposed to diagnose and predict faults in machines, including current signal analysis, vibration analysis, acoustic emission, and thermography. Recently, with the advancements in machine learning and deep learning techniques, researchers have explored the use of these techniques in machine fault diagnosis and prognostics.
[003] Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are among the popular deep learning techniques used for machine fault diagnosis. J. Fu et al. proposed a method that combines a CNN and an LSTM to monitor and warn of bearing faults in gearboxes. W. Zhang et al. described a methodology that makes use of CNNs and transfer learning to achieve intelligent machine fault diagnosis. Y. Zhang et al. proposed a method for fault diagnosis in rotating machinery based on RNNs that exploits temporal information and achieves superior performance and robustness. However, the imbalanced nature of machine fault datasets poses a challenge in developing accurate and robust models. Researchers have proposed various techniques to address this issue, including data augmentation, transfer learning, and cost-sensitive learning. C. Liu et al. proposed a domain adaptation method for rolling element bearing fault diagnosis that utilizes physics-based simulations to generate vibration signals. The method, which uses a domain adversarial neural network (DANN) to align source and target domain data, achieved high accuracy with minimal real data and holds promise for industrial applications. Xiang Li et al. proposed a new fault diagnosis method for industrial scenarios that addresses the problem of limited labelled data. This approach leverages unsupervised data exploration to improve the performance of the model with limited labelled data. The method combines deep distance metric learning and k-means clustering within a deep learning framework, which improves the model robustness and enables effective handling of semi-supervised and unsupervised fault diagnosis challenges. Yanting Li et al. proposed a fault diagnosis method for wind turbines using parameter-based transfer learning and a convolutional autoencoder (CAE) to extract relevant features from operational data. This method aims to transfer knowledge from similar wind turbines to the target wind turbine to establish fault diagnosis models. Yaowei Shi et al. proposed a fault diagnosis method based on domain generalization that utilizes multi-source augmentation and adversarial training to enhance the robustness and generalization of feature representations. Yongyi Chen et al. proposed PfAReLU, a parameter-free adaptively rectified linear unit activation function that improves the adaptability of deep learning models for cross-domain fault diagnosis in rolling bearings. Zuoyi Chen et al. proposed the deep attention related network (DARN), a zero-shot learning method for bearing fault diagnosis under multiple unknown domains. Wang et al. used the unilateral feature alignment method to enhance the discriminatory power of domain adversarial learning in a missing-class training scenario. In real industrial scenarios, it is challenging to collect fault data from the target high-capacity machine, and this work addresses this critical gap by proposing a scalable machine fault diagnosis method that utilizes knowledge transfer from the source low-capacity machine to the target high-capacity machine. This approach differs from existing cross-domain fault diagnosis methods based on transfer learning and domain adaptation, which generally focus on knowledge transfer across different operating conditions within the same machine. In general, limited research has explored fault diagnosis without real fault condition data.
[004] Fault diagnosis and classification are critical in the maintenance of large-scale industries to ensure human safety and machinery reliability and to reduce downtime. To enhance diagnostic capabilities for different faults, data-driven approaches have been proposed. Data-driven machine fault diagnosis includes signal acquisition, feature extraction, and classification. The necessary signals are acquired through multiple sensors connected to the machines. Feature extraction techniques such as time-frequency analysis, decomposition techniques, and convolutional neural networks (CNNs) aim to remove redundant information, reduce dimensionality, and de-noise the data. In the classification stage, the extracted features are used as input to machine learning models. Performing data-driven condition-based maintenance (CBM) of any engineering system requires data captured during both normal and faulty conditions from the system to be monitored. Inducing faults in high-capacity or expensive industrial machinery in order to capture signals during faulty conditions can be challenging for several reasons. Inducing faults in a high-capacity or expensive machine can be costly, not only in terms of the direct financial cost of inducing the fault but also in terms of the potential cost of repairing the damage caused by the fault. Further, inducing certain types of faults may not be technically feasible due to the design or operation of the machine. Also, inducing faults may pose a risk to the safety of personnel working with or in close proximity to the machine. Furthermore, inducing faults and capturing data in faulty conditions can be a time-consuming process that may disrupt the normal operation of the machine.
[005] It is not possible to inject faults into the high-capacity machines to be monitored and capture fault data samples that characterize the faults of the system. This poses a major challenge to applying data-driven methods based on conventional supervised techniques to actual fault diagnosis. Therefore, there is a requirement for developing an intelligent fault diagnosis system.

OBJECTS OF THE PRESENT DISCLOSURE
[006] The present disclosure relates to a system and method for scalable machine fault analysis that receive input samples that include features associated with healthy and fault condition data of a source machine (low-capacity), and healthy condition data of a target machine (high-capacity).
[007] The present disclosure relates to a system that determines discriminative features associated with the source and the target machines based on the analyzed features.
[008] The present disclosure relates to a system that generates Gaussian Mixture Models (GMMs) associated with the source machine and the target machine that include the discriminative features using healthy condition data.
[009] The present disclosure relates to a system that generates the transformation matrix associated with the healthy condition data of the source and target machines using the generated GMMs, where the transformation matrix maps the source machine feature space to a corresponding target machine feature space.
[0010] The present disclosure relates to a system that generates the synthetic fault data associated with the target machine based on the generated transformation matrix corresponding to the fault condition data of the source machine.

SUMMARY
[0011] In an aspect, the present disclosure relates to a system for fault diagnosis. The system includes a processor, and a memory communicatively coupled to the processor, where said memory stores instructions which, when executed by the processor, cause the processor to receive one or more input samples. The one or more input samples include one or more features associated with a source machine (low-capacity) and a target machine (high-capacity). The processor determines one or more discriminative features associated with the source machine and the target machine based on the analyzed one or more features. The processor generates one or more Gaussian Mixture Models (GMMs) associated with the source and target machines comprising the one or more discriminative features in the healthy condition. The processor generates a transformation matrix associated with the healthy condition data of the source and target machines using the generated one or more GMMs, where the transformation matrix maps a source machine feature space to a corresponding target machine feature space. The processor generates synthetic fault data of the target machine based on the transformation matrix for the fault diagnosis using the fault condition data of the source machine.
[0012] In an embodiment, to determine the one or more discriminative features, the processor may be configured to aggregate the one or more inputs samples to determine a feature representation of the aggregated one or more inputs samples. The processor may be configured to utilize a supervised clustering technique to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representations. The processor may be configured to determine the aggregated one or more samples belonging to a similar class by minimizing the distance between the feature representations of the aggregated one or more samples from the aggregated one or more samples of the similar class and maximizing the distance between the feature representation of the aggregated one or more samples belonging to different classes.
[0013] In an embodiment, the transformation matrix generated by the processor may map the source healthy data to the target healthy data while minimizing divergence between one or more Gaussian Mixture Models (GMMs).
[0014] In an embodiment, the corresponding target data may include a log-likelihood of a target GMM associated with the target data from the one or more Gaussian Mixture Models (GMMs).
[0015] In an embodiment, the one or more inputs samples may include any or a combination of a healthy data associated with a source machine, a fault condition data associated with the source machine, and a healthy data associated with the target machine.
[0016] In an embodiment, the processor may be configured to generate the synthetic fault data associated with a target machine based on the generated transformation matrix corresponding to the fault condition data of the source machine.
[0017] In an aspect, the present disclosure relates to a method for fault diagnosis. The method includes receiving, by a processor, associated with a system, one or more inputs samples, where the one or more inputs samples include one or more features associated with a source machine and a target machine. The method includes determining, by the processor, one or more discriminative features associated with the source machine and the target machine based on the analyzed one or more features. The method includes generating, by the processor, one or more Gaussian Mixture Models (GMMs) associated with the source and target machines comprising the one or more discriminative features in the healthy condition. The method includes generating, by the processor, a transformation matrix associated with the source and target machines healthy condition data using the generated one or more GMMs that maps a source machine feature space to a corresponding target machine feature space data. The method includes generating, by the processor, synthetic fault data of the target machine based on the transformation matrix for the fault diagnosis using the fault condition data of the source machine.
[0018] In an embodiment, for determining the one or more discriminative features, the method may include aggregating, by the processor, the one or more inputs samples to determine a feature representation of the aggregated one or more inputs samples. The method may include utilizing, by the processor, a supervised clustering technique to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representations. The method may include determining, by the processor, the aggregated one or more samples belonging to a similar class by minimizing the distance between the feature representations of the aggregated one or more samples from the aggregated one or more samples of the similar class and maximizing the distance between the feature representation of the aggregated one or more samples belonging to different classes.
[0019] In an embodiment, the corresponding fault condition data may include a log-likelihood maximization of a target GMM associated with the fault condition data from the one or more Gaussian Mixture Models (GMMs).
[0020] In an embodiment, the one or more inputs samples may include any or a combination of a healthy data associated with a source machine, a source fault condition data associated with the source machine, and a healthy data associated with the target machine.
[0021] In an embodiment, the method may include generating, by the processor, the synthetic fault data associated with a target machine based on the generated transformation matrix corresponding to the fault condition data of the source machine.
BRIEF DESCRIPTION OF DRAWINGS
[0022] The accompanying drawings, which are incorporated herein and constitute a part of this invention, illustrate exemplary embodiments of the disclosed methods and systems, in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale; emphasis instead being placed upon clearly illustrating the principles of the present invention. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that the invention of such drawings includes the invention of electrical components, electronic components or circuitry commonly used to implement such components.
[0023] FIG. 1 illustrates a schematic representation of the proposed system 132, in accordance with an embodiment of the present disclosure.
[0024] FIG. 2 illustrates an exemplary block diagram architecture of the proposed system 132, in accordance with an embodiment of the present disclosure.
[0025] FIG. 3 illustrates a schematic diagram of a synchronous generator fault diagnosis by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0026] FIG. 4 illustrates a schematic diagram of a gearbox fault diagnosis by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0027] FIG. 5 illustrates an architecture diagram of a baseline Convolutional Neural network (CNN) implemented by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0028] FIG. 6 illustrates an architecture diagram of a deep discriminative learning method implemented by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0029] FIGs. 7-9 illustrate T-distributed stochastic neighbour embedding (t-SNE) 3D plot to visualize the distribution of data, in accordance with an embodiment of the present disclosure.
[0030] FIGs. 10-12 illustrate performance analysis of each case study carried out using DET curves, in accordance with an embodiment of the present disclosure.
[0031] FIGs. 13-15 illustrate fault data showing imbalanced distributions, with normal operating conditions, in accordance with an embodiment of the present disclosure.

DETAILED DESCRIPTION
[0032] While the present disclosure has been disclosed with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made, and equivalents may be substituted without departing from the scope of the invention. In addition, many modifications may be made to adapt to a particular situation or material to the teachings of the invention without departing from its scope.
[0033] Throughout the specification and claims, the following terms take the meanings explicitly associated herein unless the context clearly dictates otherwise. The meaning of "a", "an", and "the" include plural references. The meaning of "in" includes "in" and "on." Referring to the drawings, like numbers indicate like parts throughout the views. Additionally, a reference to the singular includes a reference to the plural unless otherwise stated or inconsistent with the disclosure herein.
[0034] The present disclosure discloses a constrained maximum likelihood linear regression (CMLLR) based feature space transformation 136 that leverages available source (low-capacity) healthy data and target (high-capacity) healthy data. This method effectively finds the linear feature transformation between the source and target healthy data. The resulting transformation is then applied to source machine's fault data, enabling the synthesis of target fault condition data which is used in the training of the models for the target system.
[0035] Various embodiments of the present disclosure are described using FIGs. 1 to 15.
[0036] FIG. 1 illustrates a schematic representation of the proposed system 132, in accordance with an embodiment of the present disclosure.
[0037] As illustrated in FIG. 1, in an embodiment, the system 132 may receive raw Fast Fourier Transform (FFT) data 102, where the raw FFT data 102 may include one or more input samples and the one or more input samples may include one or more features associated with source data (available source healthy data 104 (with low capacity)) and target data (target healthy data 106 (with high capacity)). The one or more input samples may include, but are not limited to, the healthy data 104 associated with a source machine, fault condition data 108 associated with the source machine, and healthy data 106 associated with the target machine.
[0038] In an embodiment, the system 132 may determine one or more discriminative features 134 associated with the source machine and the target machine based on the analyzed one or more features. To determine the one or more discriminative features 134, the system 132 may aggregate the one or more input samples to determine a feature representation of the aggregated one or more input samples. The system 132 may utilize a supervised clustering technique 110 to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representations. The system 132 may determine the aggregated one or more samples belonging to a similar class by minimizing the distance between the feature representations of the aggregated one or more samples from the aggregated one or more samples of the similar class and maximizing the distance between the feature representation of the aggregated one or more samples belonging to different classes.
In an embodiment, during the task of detecting machine faults, a binary CNN classification model may be used by the system 132 to predict the probability distribution of two classes based on the input data. Let $D = \{(x_i, y_i)\}_{i=1}^{N}$ be a fault diagnosis dataset with $N$ samples, where $X = \{x_i\}$ is the training data and $Y = \{y_i\}$, with $y_i \in \{0, 1\}$, is the set of training labels. Usually, the classifier is trained by minimizing the cross-entropy loss ($L_{CE}$) function

$$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N} w_{y_i}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right]$$

where $\hat{y}_i$ represents the predicted probability and $w_{y_i}$ denotes the class weights. The conventional training approach for machine fault diagnosis classifiers is to give equal weight to the loss-term components for both healthy and faulty classes. This approach maximizes classification accuracy, which is the ratio of the number of correct predictions to the total number of input samples. However, when training is performed on an imbalanced dataset, this approach may not effectively learn class-discriminative features, as the classifier may simply bias its predictions towards the majority class, which contains a substantially larger number of samples. Although non-equal weights in the loss term can alleviate this bias, they may not always result in optimal performance due to differences between the training and test sets regarding the level of class imbalance.
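The effect of class weights on the cross-entropy loss can be illustrated with a minimal sketch. It assumes a 0/1 label encoding (0 = healthy, 1 = faulty) and uses inverse-frequency weights, a common heuristic that the specification does not prescribe; the function name `weighted_bce` and the toy batch are hypothetical.

```python
import math

def weighted_bce(y_true, y_prob, class_weights, eps=1e-12):
    """Mean class-weighted binary cross-entropy.

    y_true: list of 0/1 labels (assumed encoding: 0 = healthy, 1 = faulty)
    y_prob: predicted probability of class 1 for each sample
    class_weights: (weight_for_class_0, weight_for_class_1)
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        w = class_weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Imbalanced toy batch: 8 healthy samples, 2 faulty samples.
labels = [0] * 8 + [1] * 2
# A classifier biased toward the majority class: it always leans "healthy".
probs = [0.1] * 10

equal = weighted_bce(labels, probs, (1.0, 1.0))

# Inverse-frequency weights (an assumed heuristic, not taken from the source).
n = len(labels)
w_healthy = n / (2 * labels.count(0))
w_faulty = n / (2 * labels.count(1))
balanced = weighted_bce(labels, probs, (w_healthy, w_faulty))

print(f"equal weights: {equal:.4f}, balanced weights: {balanced:.4f}")
```

With equal weights the majority-biased classifier is penalised only mildly; the balanced weights raise the loss because each missed faulty sample counts for more.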
[0039] In various embodiments, the system 132 may implement scalable machine fault analysis with deep discriminative feature learning 112 and further implement synthetic data generation of the target high-capacity system fault data using feature space transformation 136 as illustrated in FIG. 1. The deep discriminative feature learning 112 may combine inherent supervised clustering of deep features and discriminative feature extraction 114. The deep discriminative feature learning 112 not only increases the separation between the data belonging to separate classes but also increases the intra-class compactness by pulling the samples of the same class closer to their corresponding class centers in the feature space. The deep discriminative feature learning 112 uses a weighted combination of the standard cross-entropy loss, supervised clustering loss, and discriminative loss. The supervised clustering loss encourages the feature representations of samples from the same class to be close to their corresponding class centers, while the discriminative loss enforces the feature representations of the separate classes to be far apart. As described earlier, the system 132 utilizes the supervised clustering technique 110 to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representations.
[0040] The combined loss function may be defined as:

$$L = L_{CE} + \alpha L_{SC} + \beta L_{D} + \gamma L_{R}$$

where $L_{CE}$ is the standard cross-entropy loss, and $\alpha$, $\beta$ and $\gamma$ are weight factors that balance the influences of the supervised clustering loss ($L_{SC}$), discriminative loss ($L_{D}$), and regularization term ($L_{R}$) respectively.
[0041] In an embodiment, the discriminative loss may be computed over triplets of ($x_a, x_p, x_n$), where $x_a$ is the deep feature representation of an input sample (for example, the source data), $x_p$ is a positive sample (for example, source healthy data) from the same class, and $x_n$ (for example, source faulty data) is a negative sample from a different class. The discriminative loss enforces the distance between $x_a$ and $x_p$ to be smaller than the distance between $x_a$ and $x_n$, by a margin specified by the parameter $m$. The supervised clustering loss may be computed over the feature representation of all samples in a batch. This encourages the feature representation of samples from the same class to be closer to their corresponding class center, which is defined as the average of the feature representations of all samples from the same class. As described earlier, the system 132 may determine the aggregated samples belonging to the similar class by minimizing the distance between the feature representations of the aggregated samples from the aggregated samples of the similar class and maximizing the distance between the feature representations of the aggregated samples belonging to different classes.
[0042] In an embodiment, the regularization term may be defined as the L2 norm of the distances between the class centers and the overall mean of the feature representation. This encourages the class centers to be close to the mean of the feature representations, which can help reduce over-fitting. The gradient of the proposed loss function with respect to the network parameters $\theta$ can be computed as:

$$\nabla_{\theta} L = \nabla_{\theta} L_{CE} + \alpha \nabla_{\theta} L_{SC} + \beta \nabla_{\theta} L_{D} + \gamma \nabla_{\theta} L_{R}$$

[0043] In an embodiment, during training, triplets of ($x_a, x_p, x_n$) are sampled to optimize the loss function using stochastic gradient descent. The weight factors $\alpha$, $\beta$ and $\gamma$ can be tuned to achieve a good trade-off between the influences of the different loss terms on the final feature representation. By using supervised clustering 110 and discriminative feature extraction 114, a deep feature representation can be learned that not only separates different classes but also increases the intra-class compactness by pulling the samples of the same class closer to their corresponding class centers in the feature space, while also reducing over-fitting through regularization. The feature vectors derived from the deep discriminative neural network are empirically ensured to lie in a linear space by adjusting the number of nodes in the feature extraction layer, making the feature space transformation 136 applicable.
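As a rough illustration of how the loss terms described above might be combined, the following sketch computes the regularization term (the L2 norm of the distances between the class centers and the overall feature mean) and forms the weighted sum. The names `center_regularizer`, `combined_loss`, `alpha`, `beta` and `gamma` are hypothetical, and the individual loss values are placeholders rather than outputs of a trained network.

```python
import math

def center_regularizer(centers, overall_mean):
    """L2 norm of the distances between each class center and the overall mean."""
    dists = [math.dist(c, overall_mean) for c in centers]
    return math.sqrt(sum(d * d for d in dists))

def combined_loss(l_ce, l_sc, l_d, l_reg, alpha, beta, gamma):
    """Cross-entropy plus weighted clustering, discriminative and regularization terms."""
    return l_ce + alpha * l_sc + beta * l_d + gamma * l_reg

# Two class centers in a 2-D feature space and the mean of all features.
centers = [[0.0, 0.0], [2.0, 0.0]]
overall_mean = [1.0, 0.0]
l_reg = center_regularizer(centers, overall_mean)

# Placeholder loss values and weight factors, chosen only for illustration.
total = combined_loss(l_ce=0.5, l_sc=0.2, l_d=0.1, l_reg=l_reg,
                      alpha=1.0, beta=0.5, gamma=0.1)
print(f"regularizer: {l_reg:.5f}, combined loss: {total:.5f}")
```

Because the combined loss is a plain weighted sum, its gradient is the same weighted sum of the gradients of the individual terms, which is what makes the weight factors a direct tuning knob during stochastic gradient descent.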
[0044] In an embodiment, the supervised clustering loss may be used as a regularization term in addition to the standard classification loss. The supervised clustering loss minimizes the distance between the deep feature representation of each sample and the corresponding class center in the feature space. This encourages the samples of the same class to be closer to each other in the feature space, which in turn increases the intra-class compactness. Let $X = \{x_1, \ldots, x_N\}$ be the training data of $N$ samples, where each sample $x_i$ is associated with a label $y_i$ indicating its class. Let $f(x_i)$ be the learned feature representation of $x_i$. The center $c_k$ of each class $k$, the average of the feature representations of all samples in that class, is defined as:

$$c_k = \frac{1}{N_k}\sum_{i:\, y_i = k} f(x_i)$$

where $N_k$ is the number of samples in class $k$.
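The class-center definition above amounts to a per-class average of feature vectors. A minimal sketch follows, assuming the features are already extracted and given as plain lists; the function name `class_centers` and the toy data are hypothetical.

```python
def class_centers(features, labels):
    """Return {label: mean feature vector of all samples with that label}."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for d, v in enumerate(f):
            acc[d] += v  # accumulate per-dimension sums for this class
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

# Toy 2-D features: two healthy samples and two faulty samples.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
labels = ["healthy", "healthy", "faulty", "faulty"]

centers = class_centers(features, labels)
print(centers)
```

In batch training the centers would typically be recomputed or updated per batch; that bookkeeping is omitted here.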
[0045] The system 132 may utilize the supervised clustering technique 110 to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representations. The objective of the supervised clustering technique 110 is to maintain a specified margin $m$ between samples belonging to different classes while minimising the cosine distance between the feature representation $f(x_i)$ and its corresponding class centre $c_{y_i}$:

$$L_{SC} = \frac{1}{N}\sum_{i=1}^{N} \max\left(0,\ \cos(m) - \cos\big(f(x_i), c_{y_i}\big)\right)$$

where $\cos(\cdot,\cdot)$ denotes the cosine similarity and $\cos(m)$ is the cosine of the margin $m$. By minimizing $L_{SC}$, the feature representations of samples within the same class are encouraged to be similar.
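One plausible reading of this objective is a hinge on cosine similarity: a sample is penalised only when the cosine similarity to its own class centre falls below the cosine of the margin angle. The sketch below implements that assumed form; it is not confirmed by the specification, and `supervised_clustering_loss`, `margin_rad` and the toy data are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def supervised_clustering_loss(features, labels, centers, margin_rad):
    """Mean hinge penalty for samples whose angle to their class centre exceeds the margin."""
    cos_m = math.cos(margin_rad)
    penalties = [
        max(0.0, cos_m - cosine_similarity(f, centers[y]))
        for f, y in zip(features, labels)
    ]
    return sum(penalties) / len(penalties)

centers = {0: [1.0, 0.0], 1: [0.0, 1.0]}
features = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1]

# 30-degree margin: only the second sample (45 degrees from its centre) is penalised.
loss = supervised_clustering_loss(features, labels, centers, math.pi / 6)
print(f"supervised clustering loss: {loss:.6f}")
```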
[0046] In an embodiment, discriminative feature learning 112 may be used to obtain feature representations that can discriminate well between different classes. Discriminative feature learning 112 minimizes the distance between feature representations of samples within the same class while maximizing the distance between samples from different classes. To formalize this, let $X = \{x_i\}$ be the set of training samples with associated labels $Y = \{y_i\}$, where each sample $x_i$ is associated with a class label $y_i$. The aim is to learn a mapping function $f$ that maps each sample to a feature representation in a high-dimensional space, such that samples from the same class are close to each other and samples from different classes are far apart. Here, a discriminative loss function may be used that uses cosine similarity as the distance metric. This loss function encourages the feature representations of samples from the same class to be closer to each other than those from different classes. The discriminative loss function is defined as:

$$L_{D} = \sum_{(x_a,\, x_p,\, x_n)} \max\left(0,\ \mathrm{margin} - s(x_a, x_p) + s(x_a, x_n)\right)$$

where $x_a$ is a training sample, $x_p$ is a positive sample that belongs to the same class, $x_n$ is a negative sample that belongs to a different class, $s(x_i, x_j)$ is the cosine similarity between the feature representations of samples $x_i$ and $x_j$ in the high-dimensional space, and margin is a parameter that specifies the minimum distance between the positive and negative pairs.
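A triplet loss with cosine similarity, as described above, can be sketched for a single triplet as follows. The hinge form is the conventional triplet formulation and is assumed here; `triplet_cosine_loss` and the example vectors are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def triplet_cosine_loss(anchor, positive, negative, margin):
    """Zero once the anchor is more similar to the positive than to the negative by `margin`."""
    s_pos = cosine_similarity(anchor, positive)
    s_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - s_pos + s_neg)

anchor = [1.0, 0.0]
positive = [1.0, 0.1]       # same class, nearly the same direction
easy_negative = [0.0, 1.0]  # different class, orthogonal to the anchor
hard_negative = [1.0, 0.2]  # different class, but close to the anchor

easy = triplet_cosine_loss(anchor, positive, easy_negative, margin=0.5)
hard = triplet_cosine_loss(anchor, positive, hard_negative, margin=0.5)
print(f"easy triplet: {easy:.6f}, hard triplet: {hard:.6f}")
```

Only the hard triplet contributes a non-zero loss, which is why triplet sampling strategies tend to favour negatives that lie close to the anchor.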
[0047] Accurately diagnosing faults in complex high-capacity systems often necessitates extensive testing under diverse fault conditions. However, acquiring real fault data can be very expensive and impractical due to potential downtime and safety concerns. To address this critical problem, the system 132 may be configured to generate synthetic fault data for high-capacity machines using a feature transformation method called constrained maximum likelihood linear regression (CMLLR). The CMLLR may determine the feature space transformation 136 between low-capacity and high-capacity machines.
[0048] In an embodiment, the system 132 may determine one or more discriminative features 134 associated with the source machine and the target machine based on the analyzed one or more features. Further, the system 132 may generate one or more Gaussian Mixture Models (GMMs) associated with the source machine and the target machine including the one or more discriminative features.
[0049] In an embodiment, the data distributions of both the source and target systems in the healthy condition may be modelled using two separate Gaussian mixture models (GMMs). Each GMM captures the inherent multi-modality and complexities of the feature space through a weighted sum of Gaussian components. This probabilistic framework allows representation of the diverse healthy operating conditions observed in each system. Let $X_s$ and $X_t$ denote the sets of source and target healthy feature vectors, respectively. For the source healthy data, the GMM may be denoted as $p_s(x)$ and for the target healthy data the GMM may be denoted as $p_t(x)$:

$$p_s(x) = \sum_{k=1}^{K_s} w_k^{s}\, \mathcal{N}\big(x;\ \mu_k^{s}, \Sigma_k^{s}\big), \qquad p_t(x) = \sum_{k=1}^{K_t} w_k^{t}\, \mathcal{N}\big(x;\ \mu_k^{t}, \Sigma_k^{t}\big)$$

where $w_k^{s}$ and $w_k^{t}$ represent the mixture weights, $\mu_k^{s}$ and $\mu_k^{t}$ are the mean vectors, and $\Sigma_k^{s}$ and $\Sigma_k^{t}$ are the covariance matrices of the source and target, respectively, with $K_s$ and $K_t$ mixture components. The system 132 may generate a linear transformation matrix 138, denoted $W$, associated with the healthy condition data of the source and target machines using the generated one or more GMMs, where $W$ maps a source machine feature space to a corresponding target machine feature space. The system 132 may evaluate the corresponding target data using the log-likelihood of the target GMM from the one or more Gaussian Mixture Models (GMMs). To achieve this, the joint log-likelihood may be maximised by the system 132 for both the transformed source healthy data and the target healthy data:

$$W^{*} = \arg\max_{W}\ \big[\log p_t(W X_s) + \log p_t(X_t)\big]$$

where $X_s$ and $X_t$ are the source and target healthy data matrices, respectively, $W X_s$ represents the transformed source data and $\log p_t(\cdot)$ is the log-likelihood of the target GMM. To prevent unrealistic deviations, CMLLR incorporates two crucial constraints:
• $\det(W) = 1$, which guarantees that the transformed space preserves the original data density.
• A constraint on the mean of the transformed source data, which ensures that the transformed source data maintains the average characteristics of the target machine.
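The log-likelihood of data under a GMM, the quantity that the transformation is chosen to maximise, can be sketched as follows. The example assumes diagonal covariances and hand-set mixture parameters (a real system would fit them, for instance with expectation-maximisation), and it stands in a simple shift for the transformation; all names and numbers are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(data, weights, means, variances):
    """Total log-likelihood of all points in `data` under a diagonal-covariance GMM."""
    total = 0.0
    for x in data:
        comps = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(comps)
        # log-sum-exp keeps the sum over components numerically stable
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total

# Hand-set target healthy GMM with two operating modes (assumed parameters).
t_weights = [0.5, 0.5]
t_means = [[0.0, 0.0], [4.0, 4.0]]
t_vars = [[1.0, 1.0], [1.0, 1.0]]

# Source healthy features sit in a different region of the feature space.
source = [[10.0, 10.0], [14.0, 14.0]]
# A stand-in transformation (pure shift) that moves them onto the target modes.
shifted = [[v - 10.0 for v in row] for row in source]

ll_raw = gmm_log_likelihood(source, t_weights, t_means, t_vars)
ll_shifted = gmm_log_likelihood(shifted, t_weights, t_means, t_vars)
print(f"untransformed: {ll_raw:.2f}, transformed: {ll_shifted:.2f}")
```

The transformed source data scores a much higher log-likelihood under the target GMM, which is the signal a likelihood-maximising transformation estimate follows.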
[0050] These constraints are incorporated using Lagrange multipliers, leading to a generalized eigenvalue problem for solving the transformation matrix $W$:

where $\lambda$ is the Lagrange multiplier. The joint log-likelihood function can be reformulated in terms of the trace operation:

[0051] Two constraint matrices are defined, corresponding to the two constraints. The objective function and constraints are combined into a generalized eigenvalue problem:

[0052] Solving the generalized eigenvalue problem allows the system 132 to determine the transformation matrix 138. The eigenvectors corresponding to the smallest non-zero eigenvalues may be the desired solution.
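For orientation, the healthy-condition models and the objective described in paragraph [0049] can be written in conventional GMM notation. Every symbol below is an assumption chosen for this illustration (the GMMs \lambda_s and \lambda_t, component counts M_s and M_t, transformation W, and sample indices n and m). The form shown is a standard mixture density together with one plausible reading of the joint log-likelihood, not necessarily the exact expressions of the disclosure.

```latex
% Assumed notation: \lambda_s, \lambda_t are the source and target healthy GMMs,
% M_s, M_t their numbers of components, W the linear transformation,
% x_n^{s} and x_m^{t} the source and target healthy feature vectors.
\begin{align}
p(x \mid \lambda_s) &= \sum_{i=1}^{M_s} w_i^{s}\, \mathcal{N}\!\left(x;\ \mu_i^{s}, \Sigma_i^{s}\right), \\
p(x \mid \lambda_t) &= \sum_{j=1}^{M_t} w_j^{t}\, \mathcal{N}\!\left(x;\ \mu_j^{t}, \Sigma_j^{t}\right), \\
\mathcal{L}(W) &= \sum_{n} \log p\!\left(W x_n^{s} \mid \lambda_t\right)
                + \sum_{m} \log p\!\left(x_m^{t} \mid \lambda_t\right).
\end{align}
```

Under this reading, the transformation sought is the W that maximises \mathcal{L}(W) subject to the two constraints listed in paragraph [0049].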
[0053] Finally, the estimated may be applied to the actual faulty source data ( ) to generate the synthesized faulty target data ( ):

[0054] To generate synthetic fault data, the transformed feature vectors may be randomly perturbed using a small amount of Gaussian noise. Let denote the set of random Gaussian noise vectors, where each for some small variance . The synthetic fault data can be generated by adding the noise vectors to the transformed fault feature vectors as follows:

where and . The synthesized data are then used to train fault diagnosis models for the target machines. Thus, the system 132 may estimate 138 the synthetic fault data 116 associated with a high-capacity machine based on the generated transformation matrix 138 and the fault condition data of the low-capacity machine. The system 132 may apply 140 the feature space transformation matrix 138 to the source data to generate the synthetic fault data 116 associated with the high-capacity machine.
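The flow of paragraphs [0053] and [0054] (learn a mapping from healthy data, apply it to source fault features, then perturb with small Gaussian noise) can be sketched as follows. This is a deliberately simplified stand-in, not CMLLR: it assumes one Gaussian per machine with diagonal covariance, so the mapping reduces to a per-dimension scale and offset obtained by matching means and standard deviations. All function names are hypothetical.

```python
import random
import statistics


def fit_diagonal_affine(source_healthy, target_healthy):
    """Per-dimension scale/offset matching source mean and std to the target's.

    Simplified stand-in for the CMLLR estimate (assumption: a single Gaussian
    with diagonal covariance per machine, so a closed form exists).
    """
    dims = len(source_healthy[0])
    scales, offsets = [], []
    for d in range(dims):
        src = [row[d] for row in source_healthy]
        tgt = [row[d] for row in target_healthy]
        mu_s, mu_t = statistics.fmean(src), statistics.fmean(tgt)
        sd_s, sd_t = statistics.pstdev(src), statistics.pstdev(tgt)
        scale = sd_t / sd_s if sd_s > 0 else 1.0
        scales.append(scale)
        offsets.append(mu_t - scale * mu_s)
    return scales, offsets


def apply_affine(features, scales, offsets):
    """Map feature vectors from the source space to the target space."""
    return [[a * x + b for x, a, b in zip(row, scales, offsets)]
            for row in features]


def add_gaussian_noise(features, sigma, rng):
    """Perturb each coordinate with zero-mean Gaussian noise of std sigma."""
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in features]


# Toy data standing in for bottleneck features (values are illustrative only).
source_healthy = [[0.0, 1.0], [2.0, 3.0]]
target_healthy = [[10.0, 0.0], [14.0, 4.0]]
source_fault = [[1.0, 2.0], [3.0, 5.0]]

scales, offsets = fit_diagonal_affine(source_healthy, target_healthy)
transformed_fault = apply_affine(source_fault, scales, offsets)
synthetic_fault = add_gaussian_noise(transformed_fault, 0.01, random.Random(0))
```

The learned mapping uses healthy data only; the fault data of the source machine enters solely at the application step, which mirrors the separation described above.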
[0055] In an embodiment, performance evaluation using a single operating point may not be optimum for describing the capabilities of the system when a trade-off exists. Thus, a performance curve plotted across all the operating points can effectively describe the system. The detection error trade-off (DET) curve captures the intricate balance between the false acceptance rate (FAR) and the false rejection rate (FRR) across various decision thresholds. This curve offers a comprehensive system performance perspective, revealing trade-offs between FAR and FRR. Equal error rate (EER), where FAR equals FRR, signifies a balanced operational point. Additionally, the DET curve can be used to compute the minimum detection cost function (minDCF), which factors in costs associated with false acceptances and rejections, assisting in optimizing system parameters for cost efficiency.

where and are the costs associated with FAR and FRR, respectively. An EER threshold is often used, where FAR and FRR are equal. The DCF can be computed at this threshold, providing a single performance metric that balances the trade-off between FAR and FRR based on the specified cost factors.
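The metrics above can be illustrated with a small threshold sweep. The sketch assumes that higher scores indicate the target class, that FAR is the fraction of non-target samples accepted, that FRR is the fraction of target samples rejected, and that the detection cost is a plain weighted sum of the two rates. The weighted-sum form, the function names, and the toy scores are all assumptions for illustration.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (FAR, FRR) when samples scoring >= threshold are accepted."""
    far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    frr = sum(s < threshold for s in target_scores) / len(target_scores)
    return far, frr


def det_points(target_scores, nontarget_scores):
    """Sweep every observed score as a threshold: list of (thr, FAR, FRR)."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    return [(t, *error_rates(target_scores, nontarget_scores, t))
            for t in thresholds]


def equal_error_rate(points):
    """Pick the threshold where |FAR - FRR| is smallest; report their mean."""
    thr, far, frr = min(points, key=lambda p: abs(p[1] - p[2]))
    return thr, (far + frr) / 2


def min_dcf(points, cost_fa=1.0, cost_fr=1.0):
    """Minimum over thresholds of cost_fa * FAR + cost_fr * FRR (assumed form)."""
    return min(cost_fa * far + cost_fr * frr for _, far, frr in points)


# Toy classifier scores (illustrative values only).
target = [0.9, 0.8, 0.7, 0.4]
nontarget = [0.6, 0.3, 0.2, 0.1]

points = det_points(target, nontarget)
eer_threshold, eer = equal_error_rate(points)
best_cost = min_dcf(points)
```

Plotting FAR against FRR for all entries of `points` gives the DET curve; the EER and minDCF are then two single-number summaries read off that curve.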
[0056] In an embodiment, the system 132 may use the generated output 118 (including the generated synthetic fault data/generated faulty data 116, the target healthy data 106, the source healthy data 104, and the faulty data 108) as training data 120. The system 132 may perform classification 122 on the training data 120. Further, the system 132 may implement training 124 on the training data 120 using a dense network 126 and a sigmoid classifier 128. The system 132 may use test data 130 with the sigmoid classifier 128 and further generate a decision 130.
[0057] FIG. 2 illustrates an exemplary block diagram architecture of the proposed system 132, in accordance with an embodiment of the present disclosure.
[0058] In an aspect, referring to FIG. 2, the system 132 may include one or more processor(s) 202. The one or more processor(s) 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, edge or fog microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that process data based on operational instructions. Among other capabilities, the one or more processor(s) 202 may be configured to fetch and execute computer-readable instructions stored in a memory 204 of the system 132. The memory 204 may be configured to store one or more computer-readable instructions or routines in a non-transitory computer readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory 204 may include any non-transitory storage device including, for example, volatile memory such as Random Access Memory (RAM), or non-volatile memory such as Erasable Programmable Read-Only Memory (EPROM), flash memory, and the like.
[0059] Referring to FIG. 2, the system 132 may include an interface(s) 206. The interface(s) 206 may include a variety of interfaces, for example, interfaces for data input and output devices, referred to as I/O devices, storage devices, and the like. The interface(s) 206 may facilitate communication to/from the system 132. The interface(s) 206 may also provide a communication pathway for one or more components of the system 132. Examples of such components include, but are not limited to, processing unit/engine(s) 208, a database 210, and a data parameter engine 212.
[0060] In an embodiment, the processing unit/engine(s) 208 may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing engine(s) 208. In examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processing engine(s) 208 may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processing engine(s) 208 may include a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the machine-readable storage medium may store instructions that, when executed by the processing resource, implement the processing engine(s) 208. In such examples, the system 132 may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separate but accessible to the system 132 and the processing resource. In other examples, the processing engine(s) 208 may be implemented by electronic circuitry.
[0061] In an embodiment, the processor 202 may receive one or more input samples through the data parameter engine 212. The processor 202 may record the one or more input samples in the database 210. The one or more input samples may include one or more features associated with a source machine and a target machine. The one or more input samples may include, but are not limited to, healthy data associated with the source machine, fault condition data associated with the source machine, and healthy data associated with the target machine.
[0062] In an embodiment, the processor 202 may determine one or more discriminative features associated with the source machine and the target machine based on the analyzed one or more features. To determine the one or more discriminative features, the processor 202 may be configured to aggregate the one or more input samples to determine a feature representation of the aggregated one or more input samples. The processor 202 may be configured to utilize a supervised clustering technique to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representations. The processor 202 may be configured to group the aggregated one or more samples belonging to a similar class by minimizing the distance between the feature representations of samples of the similar class and maximizing the distance between the feature representations of samples belonging to different classes.
[0063] In an embodiment, the processor 202 may generate one or more Gaussian Mixture Models (GMMs) associated with the source machine and the target machine comprising the one or more discriminative features.
[0064] In an embodiment, the processor 202 may generate a transformation matrix from the healthy-condition data of the source and target machines, using the generated one or more GMMs, that maps a source machine feature space to a corresponding target machine feature space. The corresponding target data may include a log-likelihood of a target GMM associated with the target data from the one or more Gaussian Mixture Models (GMMs). Further, the transformation matrix generated by the processor 202 may map the source data to the target data while minimizing divergence between the one or more Gaussian Mixture Models (GMMs).
[0065] In an embodiment, the processor 202 may generate synthetic fault data based on the transformation matrix for the fault diagnosis. The processor 202 may be configured to generate the synthetic fault data associated with a target machine based on the generated transformation matrix corresponding to the healthy data of the source machine and the target machine, and the fault condition data of the source machine.
[0066] FIG. 3 illustrates a schematic diagram of a synchronous generator fault diagnosis by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0067] Synchronous generators are crucial components of power systems, but faults in their stator and field windings can cause significant problems. Inter-turn short circuit faults in a machine disrupt the magnetic field symmetry and cause an increase in certain harmonic components in the electrical signals. These harmonic components can cause problems for the machine and its operation, making it important to detect and diagnose the faults early to prevent further damage. Such faults can be detected by analysing the current spectrum of a synchronous generator. However, varying load conditions can also increase the harmonics, so basic spectrum analysis alone does not reliably identify inter-turn short circuit faults. To develop better methods for diagnosing these faults, it is necessary to inject faults of varying magnitudes in the generator windings. Presented are the design and fabrication of 3 kVA and 5 kVA (3 phase) synchronous generators 302 with fault injection capabilities in the stator winding coils. As illustrated in FIG. 3, in an embodiment, the synchronous generator(s) 302 may include front panel access to the leads from the stator winding coils, which allows injection of short circuit faults between any of these points using a fault inducing terminal 304. The synchronous generator(s) 302 may be coupled to a DC shunt motor 306 for operation. The synchronous generator(s) 302 may further be coupled to a three-phase resistive load 308 for loading the synchronous generator 302 under various conditions. The stator winding of the synchronous generator(s) 302 may include 18 taps in each phase, providing a total of 54 taps across three phases. Further, an auto transformer 320 and a rectifier module 318 may be utilized to conduct the experiments. The experiments may be conducted to inject short circuit faults 304 at different magnitudes in the stator winding coils.
Each phase of the stator winding may include six coils, each with 28 turns, providing a total of 168 turns per phase. In the experiments, only a small percentage of the total number of turns in each phase is shorted, specifically 6 turns (3.57%), 8 turns (4.76%), and 14 turns (8.33%). The experiments may be performed individually on the 3 kVA and 5 kVA synchronous generators 302, using the same methodology for both. Current sensors 310 connected to the stator winding terminals may be integrated with a data acquisition device 312 to record data. Current may be measured through an ammeter 316. Software may be used to acquire signals through the PXI system. The system 132 may be implemented to identify inter-turn faults in the stator windings of synchronous generators 302 by examining the current signals through the current sensors 310. For safety reasons, inter-turn faults may be introduced in a controlled manner through a rheostat 314. Each experiment was conducted for 10 seconds, and current signals were recorded at a sampling frequency of 1 kHz. This resulted in 10,000 samples for each trial. To collect complete data, this process was repeated for various fault conditions. To capture the current signals from the synchronous generator 302 under both no-fault and inter-turn fault conditions, loads ranging from 0.5 A to 3.5 A were used.
[0068] This experiment explores the feasibility of a scalable machine fault diagnosis system, using two synchronous generators of different capacities (3 kVA and 5 kVA) as test subjects. Employing stator current data from both healthy and faulty conditions under various loading scenarios, fast Fourier transform (FFT) analysis may be used for detailed frequency domain representation. The number of samples in the FFT was selected to give a resolution sufficient to capture the 50 Hz component and its harmonics accurately. The optimum window for the FFT computation was found to be 2000 samples per frame, with an overlap of 200 samples. To implement the scalable fault modelling method, the 3 kVA synchronous generator 302 may be considered to be the low-capacity machine and the 5 kVA synchronous generator may be considered to be the high-capacity machine. The fault diagnosis model for the target system is trained using the healthy data retrieved from the target machine and fault data synthesized from the low-capacity machine fault data. Final testing of the model is done using both healthy and fault data of the high-capacity machine. Fault data from the high-capacity system was not used for building the model. Here, a dataset comprising 28,126 samples in total may be employed. The training dataset included 22,500 samples, with 7,500 samples each for low-capacity healthy, low-capacity fault, and high-capacity healthy conditions. Each sample consisted of 2,000 features (FFT points) across 3 channels (data from R-phase, Y-phase, and B-phase). Similarly, the test dataset included 5,626 samples, with 2,813 samples each for high-capacity healthy and faulty conditions. These samples also had dimensions of 2,000 features across 3 channels. Hence, the system 132 generated synthetic fault data for the fault diagnosis of the high-capacity machine.
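The framing figures quoted for the synchronous generator recordings can be checked with a few lines of arithmetic. The sketch assumes that "overlap" means the number of samples shared by consecutive frames and that incomplete trailing frames are discarded. The helper names are hypothetical.

```python
def frame_count(num_samples, frame_length, overlap):
    """Number of full frames when consecutive frames share `overlap` samples."""
    hop = frame_length - overlap
    if num_samples < frame_length:
        return 0
    return (num_samples - frame_length) // hop + 1


def frequency_resolution(sampling_rate_hz, frame_length):
    """Spacing between adjacent FFT bins in Hz."""
    return sampling_rate_hz / frame_length


def bin_index(frequency_hz, sampling_rate_hz, frame_length):
    """FFT bin closest to the given frequency."""
    return round(frequency_hz * frame_length / sampling_rate_hz)


# Values from the text: 10 s at 1 kHz, 2000-sample frames, 200-sample overlap.
frames_per_trial = frame_count(10_000, 2000, 200)
resolution_hz = frequency_resolution(1000, 2000)
harmonic_bins = [bin_index(50 * k, 1000, 2000) for k in range(1, 6)]
```

With these settings the bin spacing is 0.5 Hz, so the 50 Hz component and each of its harmonics up to 250 Hz fall exactly on an FFT bin rather than between two bins.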
[0069] The dimension of the extracted features is fine-tuned to ensure linearity in the learnt feature representations. To validate the efficiency of the method, three different case studies were conducted. The first case study focused on synchronous generator fault diagnosis, utilizing a 3 kVA machine as the low-capacity variant and a 5 kVA machine as the high-capacity variant. The second case study focused on geared motor fault diagnosis, leveraging vibration data from two geared motors of different capacities. Here, the low-capacity data was employed to train the system, enabling it to predict gear faults in the high-capacity geared motor. Finally, the third case study utilized publicly available bearing fault diagnosis datasets featuring two distinct experimental setups. Collectively, these diverse case studies provide evidence for the effectiveness and generalizability of the proposed system 132 in tackling real-world fault diagnosis challenges.
[0070] FIG. 4 illustrates a schematic diagram of a gearbox fault diagnosis by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0071] To investigate gearbox failure mechanisms and evaluate fault diagnosis techniques, a comprehensive experimental facility was constructed to replicate real-world conditions. This setup utilizes an industrial helical geared motor 402 as the primary driver, coupled with two parallel automobile gearboxes (404, 406) and an eddy current dynamometer 408. The two parallel automobile gearboxes (404, 406) may be operatively coupled to the geared motor 402 and a gearbox 402, where the geared motor 402 may be operated through a variable frequency drive 414. The additional gearboxes (404, 406) effectively increase the rotational speed to match the specifications of the eddy current dynamometer 408, enabling higher-speed experimentation. The dynamometer 408 may be coupled to a load cell 422 and monitored through a load monitor 424. The operation of the dynamometer 408 may be based on a dynamometer torque controller 426. The automotive gearboxes (404, 406) may also provide a means to adjust the load on the system, further enhancing the realism of the simulated operational scenarios. The schematic diagram and complete experimental facility are illustrated in FIG. 4. To introduce controlled damage and analyse its unique vibration signature, a precise 0.1 cm cut may be made within a gear, replicating a common fault. High-fidelity data acquisition may be achieved through strategically placed PCB 325C33 sensors (vertical and horizontal direction) 416. An NI-9234 data acquisition module 410 and software may be used to collect the vibration data using a computer 424. Each signal may be captured at a sampling frequency of 25.6 kHz for a duration of 5 seconds under various operating conditions (including input speeds of 500 rpm, 600 rpm, and 700 rpm, as well as full-load and no-load scenarios). Two helical geared motors with different capacities were used in this study. Their specifications are listed in TABLE I.

[0072] To validate the proposed methodology, the experiments leverage the publicly available MAFAULDA (Machinery Fault Database) dataset, a rich repository of multivariate time-series data for machinery fault analysis. The dataset was generated simulating six distinct states: normal operation, imbalance, misalignment, and inner and outer bearing faults. Recordings capture machine behaviour at a sampling frequency of 50 kHz for 5 seconds across various speeds (700-3000 rpm) using RPM sensors 420. The dataset incorporates data from various sensors 418, including Industrial IMI Sensor accelerometers 422, a triaxial accelerometer, a tachometer, and a microphone, ensuring comprehensive machine behaviour analysis. The experiments also employ the HUST bearing dataset, a public resource offering high-resolution vibration data for diverse ball bearing faults. This dataset captures machine behaviour under controlled conditions, utilizing a 750 W induction motor with a multi-step shaft and a powder brake (load). Different bearing faults, including inner race, outer race, and ball faults, are monitored using an accelerometer (PCB 325C33). The dataset may include 99 raw vibration signals encompassing six different fault conditions at three operating conditions (0 W, 200 W, and 400 W). Each signal is captured at 51,200 samples per second for 10 seconds, providing a high-resolution representation of machine health. The research focuses specifically on vibration data obtained during normal conditions, inner race faults, and ball faults.
[0073] In an embodiment, to investigate the adaptability of the fault analysis method across different gearbox configurations, vibration data may be leveraged to explore the model’s potential for machine scalability. The model may be trained on a comprehensive dataset incorporating both healthy and fault data from a low-capacity geared motor, complemented by healthy data from a high-capacity geared motor. This training approach aims to equip the model with a robust understanding of fault signatures while fostering adaptability to variations in machine characteristics. The raw vibration signals may be segmented into frames of 4096 samples with an overlap of 256 samples, and an FFT may be applied to transform them into the frequency domain. A training dataset may be utilized comprising 4320 samples, with 1440 samples each for low-capacity healthy, low-capacity faulty, and high-capacity healthy conditions. Each sample consisted of 4096 features (FFT points) across 2 channels (data from 2 different accelerometers). Similarly, the test dataset comprised 720 samples, with 360 samples each for high-capacity healthy and high-capacity faulty categories. Each sample consisted of 4096 features (FFT points) across 2 channels. As described earlier, the fault model for the target high-capacity system may be trained by the system 132 using the healthy data retrieved from the target machine and synthesized fault data generated from the low-capacity machine’s fault data. Final testing of the model may be performed using both healthy and fault data of the high-capacity machine.
[0074] FIG. 5 illustrates an architecture diagram of a baseline Convolutional Neural network (CNN) implemented by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0075] As illustrated in FIG. 5, in an embodiment, the baseline system may be developed using a CNN. The input layer of the CNN may be configured as an N-channel input, where each channel represents a specific aspect of the data. In the scalable system for synchronous generators, the healthy and faulty signals collected from the R, Y, and B phases may be considered as the three channels (as described in FIG. 3). The CNN can simultaneously process and analyze the information from each source separately. In another case study (as described in FIG. 4), the vibration signals acquired from two different accelerometers may be considered as two channels by the CNN. In the cross-machine scalable system, the healthy and faulty vibration signals collected from a single accelerometer may be considered as a single-channel input. To process information, the CNN architecture may include two 1-D convolutional layers. The first convolutional layer may include 256 filters, each with a kernel size of 4 × 1. This layer specializes in capturing long-range dependencies and detecting larger patterns in the input data.
[0076] The second convolutional layer may include 128 filters, each with a kernel size of 3 × 1. The second layer may be designed to capture more localized patterns and finer details in the data. By utilizing these two convolutional layers (the first layer and the second layer) with different filter sizes, the system 132 can effectively extract and learn features at different scales, enabling the system 132 to capture both global and local information from the input data. In addition, two fully connected layers (with 64 and 32 neurons) may be used. The fully connected layers transform the features extracted by the preceding convolutional layers into a format that can be used to make accurate classifications or predictions. An optimizer with a learning rate of 0.01 is used with the CNN architecture. Further, the detailed CNN architecture is illustrated in FIG. 5. Performance metrics obtained from the baseline system are tabulated in TABLE II.
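To make the layer sizes concrete, the following sketch computes output lengths and trainable parameter counts for the two convolutional layers on a three-channel, 2000-point input. Stride 1, no padding, bias terms, and the absence of pooling between the layers are assumptions, since the text does not state them. The helper names are hypothetical.

```python
def conv1d_output_length(input_length, kernel_size, stride=1, padding=0):
    """Length of a 1-D convolution output along the signal axis."""
    return (input_length + 2 * padding - kernel_size) // stride + 1


def conv1d_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Trainable weights of a 1-D convolutional layer."""
    weights = in_channels * out_channels * kernel_size
    return weights + (out_channels if bias else 0)


# Synchronous generator case: 2000 FFT points, 3 channels (R, Y, B phases).
input_length, input_channels = 2000, 3

# First layer: 256 filters with kernel size 4.
len1 = conv1d_output_length(input_length, kernel_size=4)
params1 = conv1d_param_count(input_channels, 256, kernel_size=4)

# Second layer: 128 filters with kernel size 3, fed by the 256 feature maps.
len2 = conv1d_output_length(len1, kernel_size=3)
params2 = conv1d_param_count(256, 128, kernel_size=3)
```

Under these assumptions most of the convolutional parameters sit in the second layer, because its input has 256 channels rather than 3.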

[0077] FIG. 6 illustrates an architecture diagram of a deep discriminative learning method implemented by the proposed system 132, in accordance with an embodiment of the present disclosure.
[0078] In an embodiment, in addition to the baseline CNN architecture, a custom metric embedding layer may be introduced, which may be responsible for learning a metric space where similar samples are closer together and dissimilar samples are farther apart. To achieve better feature representations, a combination of supervised clustering loss and discriminative loss may be used. The supervised clustering loss incorporates the class labels during the clustering process, encouraging samples from the same class to be grouped together. This helps in learning discriminative clusters that capture the underlying class structure in the data. The discriminative loss focuses on maximizing the inter-class separability and minimizing the intra-class variance. During training, the CNN minimizes the loss function to encourage effective feature representations. Finally, bottleneck features may be extracted from the CNN to capture the most compact and informative representation of the data. This may effectively reduce dimensionality and capture the most relevant and discriminative information from the input signal. These bottleneck features serve as the input for the subsequent dense network that performs the classification task. The CNN architecture may use an activation function in the dense layers that addresses the dying ReLU problem by allowing the propagation of gradients for negative input values. As a result, training becomes more efficient and reliable, allowing the network to take in both positive and negative input values and learn complex patterns and relationships. Further, the dense network consists of three dense layers of different sizes (32, 512, and 64 units), together with batch normalization. Classification may be carried out in the final output layer using a sigmoid activation function, which allows the network to map its learned representations to binary classification probabilities.
To fine-tune the number of units, activation functions, and dropout rates within the CNN, a systematic approach may be used by defining a search space for the hyperparameters. The performance of the model may be evaluated on a validation dataset while iteratively testing various combinations of the hyperparameters. A Bayesian optimization algorithm may be used to search efficiently for optimal hyperparameter configurations for the deep learning model. The hyperparameters used are presented in TABLE III. The performance evaluation of the scalable machine fault diagnosis system using discriminative feature learning is illustrated in TABLE IV, and that of discriminative feature learning with supervised clustering is illustrated in TABLE V.
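The pull-together, push-apart behaviour of the two losses can be illustrated with a pairwise margin loss on cosine distance. This contrastive form is an assumption chosen for illustration; the disclosure does not give the exact expressions of its supervised clustering and discriminative losses, and the function names and the margin value are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_margin_loss(embeddings, labels, margin=0.5):
    """Average contrastive-style loss over all unordered pairs.

    Same-class pairs are penalised by their squared distance (intra-class
    compactness); different-class pairs are penalised only while they are
    closer than `margin` (inter-class separation).
    """
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            d = cosine_distance(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                total += d * d
            else:
                total += max(0.0, margin - d) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


# Toy embeddings: two aligned vectors and one orthogonal vector.
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
well_clustered = pairwise_margin_loss(embeddings, [0, 0, 1])
badly_clustered = pairwise_margin_loss(embeddings, [0, 1, 1])
```

In a training setup such a term would typically be added to the classification loss, so that the bottleneck features are both separable and compact.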

[0079] Further, in an embodiment, the performance of the deep discriminative features may be evaluated using a support vector machine (SVM) classifier with different kernels, as illustrated in TABLE VI. This analysis was done to verify whether the deep discriminative features exhibit linear separability between classes. The linear kernel consistently showed better performance, indicating that linear separability of the embedding vector space is achieved. However, the performance comparison between the SVM (TABLE VI) and the CNN (TABLE V) using deep discriminative features demonstrated superior results for the CNN.

[0080] T-distributed stochastic neighbour embedding (t-SNE) 3D plots may be employed for each case to visualize the distribution of the data, as illustrated in FIGs. 7, 8, and 9. In the visualization of features, the system 132 may separate different classes and increase intra-class compactness.
[0081] In an embodiment, the synthetic data generation method may initially use source healthy and faulty discriminative bottleneck features and target healthy discriminative bottleneck features extracted by the CNN. Using the source and target healthy features, a linear transformation may be obtained from the source healthy space to the target healthy space with the help of the CMLLR algorithm. Then, this linear transformation matrix may be applied to the source fault data to generate synthetic target fault data. Further, the generated target fault data may be added to the training dataset and a dense network may be trained. The dense network architecture used in the discriminative feature learning with supervised clustering approach may also be used for classification. TABLE VII illustrates the performance evaluation of the scalable system using synthetic target fault data for training. The experimental results show strong evidence of the effectiveness of the proposed approach in addressing the data unavailability problem in machine fault diagnosis. The t-SNE based 3D visualization shows that the synthesized data and the original data exhibit a similar feature distribution.

[0082] The performance analysis of each case study may be carried out using DET curves as shown in FIGs. 10, 11, and 12, and the corresponding EER and minDCF may be tabulated in TABLE IX. DET analysis offers a visual representation of the discriminative power and effectiveness of the proposed methodology in each scenario. This evaluation provides insights into the model’s performance across diverse case studies, strengthening the robustness of the findings and underscoring the generalizability of the method.

[0083] In an embodiment, the feature space transformation may be compared with state-of-the-art domain adaptation techniques, which are widely used in cases where there is a lack of training data or an imbalanced condition. Domain adaptation techniques aim to address the challenges arising from such disparities by aligning the feature distributions between the source and target domains. The present method falls within the domain adaptation framework and focuses on transforming the features to bridge the gap between the domains and improve generalization on the target domain. Feature-based domain adaptation methods such as, but not limited to, deep correlation alignment (deep CORAL), domain-adversarial neural network (DANN), generative adversarial network (GAN)-based domain adaptation (GAN-DA), and adversarial domain adaptation convolutional neural network (ADACNN) may be used for comparison with the present method. These methods often require unlabelled data from the target fault condition, which can be particularly hard to obtain for complex, high-capacity machines. Furthermore, fault data itself often suffers from imbalanced distributions, with normal operating conditions vastly outnumbering actual faults, as depicted in FIGs. 13, 14, and 15.
[0084] The proposed system 132 overcomes these limitations by using a scalable and practical solution that does not require any unlabelled target fault data. The imbalanced data challenge is addressed by leveraging synthetic fault data generation from a low-capacity counterpart. This ensures a balanced representation of fault and non-fault scenarios in the training data, empowering the model to effectively diagnose faults in high-capacity machines. The results are tabulated in TABLE X. From the experimental analysis, it is evident that the method outperformed domain adaptation techniques in the context of scalable fault diagnosis.

[0085] The need to collect fault-condition signals from high-capacity, expensive machinery in order to model faults for diagnosis applications may be addressed by introducing the concept of scalable machine fault diagnosis. In this method, faults are injected into a low-capacity machine to capture the machine’s behaviour. The intelligence learned from the fault and healthy data of the source system and the healthy data of the target system is then used for synthesizing the fault data of the target system. The method for scalable machine fault diagnosis centres on two pivotal components:
• Deep discriminative feature learning: This method plays a vital role in extracting hierarchical and adaptive representations from raw data. Leveraging deep neural networks, this method captures intricate fault patterns and relationships, providing a nuanced understanding of complex fault scenarios. The objective is to extract robust and discriminative features that effectively differentiate between various fault classes. To achieve this, two loss functions may be introduced called supervised clustering loss and discriminative loss. The supervised clustering loss function allows the features to group samples belonging to the same class closer together (minimize intra-class variance). Discriminative loss function emphasizes maximizing the distance between features belonging to different classes (maximize inter-class distance). By combining these losses with the standard cross-entropy loss, the network learns feature representations that are not only accurate for classification but also highly discriminative for fault detection. Furthermore, an empirical fine-tuning of the feature dimension, by varying the number of nodes in the feature extraction layer of the neural network, is employed to ensure linearity within the learned representation.
• Synthetic data generation using feature space transformation: The synthetic data generation method leverages both constrained maximum likelihood linear regression (CMLLR) and Gaussian mixture models (GMMs) to transform the healthy data of the low-capacity source machine to that of the high-capacity target machine. This approach identifies an optimal linear transformation aligning both machines’ healthy data in a shared feature space. Applying this feature space transformation 136 to the fault data of the low-capacity source machine makes it possible to synthesize diverse and realistic fault data for the high-capacity machine. The transformed data may be used to train a fault diagnosis model for the target high-capacity system. The system 132 may thus be configured to train the fault diagnosis model for the target system without using fault data from the target system; instead, the system 132 may use the data synthesized with the learnt transformation.
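The two components above can be illustrated with a minimal NumPy sketch. It is not the disclosed implementation and rests on several assumptions: the loss terms are written in one plausible centroid-based form (the exact formulas, the `margin` value, and the cross-entropy term they are combined with are not reproduced here); the GMM/CMLLR estimation is replaced by its single-Gaussian special case, in which the maximum-likelihood affine map simply matches the source healthy mean and covariance to the target's; and all function names and the toy data are hypothetical.

```python
import numpy as np


def clustering_loss(features, labels):
    """Intra-class spread: mean squared distance of samples to their class centroid."""
    features, labels = np.asarray(features, dtype=float), np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        members = features[labels == c]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return total / len(features)


def discriminative_loss(features, labels, margin=1.0):
    """Inter-class hinge: penalize class centroids that lie closer than `margin`."""
    features, labels = np.asarray(features, dtype=float), np.asarray(labels)
    centroids = [features[labels == c].mean(axis=0) for c in np.unique(labels)]
    penalties = [
        max(0.0, margin - float(np.linalg.norm(centroids[i] - centroids[j])))
        for i in range(len(centroids))
        for j in range(i + 1, len(centroids))
    ]
    return float(np.mean(penalties)) if penalties else 0.0


def fit_healthy_transform(source_healthy, target_healthy):
    """Affine map (A, b) learned from healthy data only.

    Assumption: one Gaussian per machine instead of a GMM, so the map
    reduces to matching first and second moments of the healthy features.
    """
    mu_s = source_healthy.mean(axis=0)
    mu_t = target_healthy.mean(axis=0)
    chol_s = np.linalg.cholesky(np.cov(source_healthy, rowvar=False))
    chol_t = np.linalg.cholesky(np.cov(target_healthy, rowvar=False))
    A = chol_t @ np.linalg.inv(chol_s)  # gives A @ cov_s @ A.T == cov_t
    b = mu_t - A @ mu_s                 # maps the source mean onto the target mean
    return A, b


def synthesize_target_fault(source_fault, A, b):
    """Apply the healthy-condition transform to source fault features."""
    return source_fault @ A.T + b


# Toy features, made up purely for illustration (not experimental data).
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.3, 0.0], [0.0, 1.5, 0.2], [0.0, 0.0, 0.8]])
source_healthy = rng.normal(size=(500, 3))
target_healthy = rng.normal(size=(400, 3)) @ mixing + np.array([1.0, -2.0, 0.5])
source_fault = rng.normal(loc=3.0, size=(200, 3))

A, b = fit_healthy_transform(source_healthy, target_healthy)
target_fault_synth = synthesize_target_fault(source_fault, A, b)
```

In this sketch, target fault data never enters the estimation: `A` and `b` are fitted on healthy features alone, and `target_fault_synth` could then be used to train a classifier for the target machine. A multi-component GMM version would instead estimate the transform iteratively against the target mixture.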
[0086] In an embodiment, to validate the effectiveness of the system 132, three diverse case studies were conducted. The first case study focussed on inter-turn short circuit faults in the R-phase, Y-phase, and B-phase of synchronous generators with capacities of 3 kVA (source machine) and 5 kVA (target machine). The approach outperformed a baseline convolutional neural network (CNN) based method, achieving accuracy improvements of 4.65%, 4.75%, and 5.11% for R-, Y-, and B-phase faults, respectively. This demonstrates enhanced fault diagnosis across different machine capacities. The second case study, on geared motors of two different capacities, showed the method’s generalizability with a 23.76% improvement in accuracy compared to a baseline CNN. In the third case study, entirely different bearing fault datasets (MAFAULDA and HUST), representing distinct experimental facilities, were utilized. Using the MAFAULDA dataset, the system 132 generated synthetic fault data for the HUST dataset. Compared to a baseline system, the system 132 achieved a 32.5% absolute improvement in accuracy. This showcases the cross-machine scalability of the system 132 and its effectiveness in handling diverse data sources from different machines and environments, emphasizing its potential for broad applicability. The experimental findings show that the method outperforms various state-of-the-art baselines, affirming its potential for enhancing the reliability of deep-learning-based machine fault diagnosis for scalable fault modeling.
[0087] While considerable emphasis has been placed herein on the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the invention. These and other changes in the preferred embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.

ADVANTAGES OF THE PRESENT DISCLOSURE
[0088] The present disclosure leverages available source (low-capacity) healthy data and target (high-capacity) healthy data and effectively finds the linear feature transformation between the source and target healthy condition data.
[0089] The present disclosure provides transformation of the source machine's fault data, enabling the synthesis of target fault condition data which is used in the training of the models for the target system.
[0090] The present disclosure generates the synthetic fault data associated with the target machine based on the fault condition data of the source machine.
CLAIMS:
1. A system (132) for fault diagnosis, comprising:
a processor (202); and
a memory communicatively coupled to the processor (202), said memory storing instructions which, when executed by the processor (202), cause the processor (202) to:
receive one or more input samples, wherein the one or more input samples comprise one or more features associated with a source machine and a target machine;
determine one or more discriminative features associated with the source machine and the target machine based on the analyzed one or more features;
generate one or more Gaussian Mixture Models (GMMs) associated with the source machine (low-capacity) and the target machine (high-capacity) comprising the one or more discriminative features;
generate a transformation matrix associated with the source and target machines' healthy condition data using the generated one or more GMMs that maps a source machine feature space to a corresponding target machine feature space; and
generate synthetic fault data of the target machine based on the transformation matrix for the fault diagnosis using the fault condition data of the source machine.
2. The system (132) for fault diagnosis as claimed in claim 1, wherein to determine the one or more discriminative features, the processor (202) is configured to:
aggregate the one or more input samples to determine a feature representation of the aggregated one or more input samples;
utilize a supervised clustering technique to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representation; and
determine the aggregated one or more samples belonging to a similar class by minimizing the distance between the feature representations of the aggregated one or more samples from the aggregated one or more samples of the similar class and maximizing the distance between the feature representation of the aggregated one or more samples belonging to different classes.
3. The system (132) for fault diagnosis as claimed in claim 1, wherein the transformation matrix generated by the processor (202) maps the source healthy data to the target healthy data while minimizing divergence between one or more Gaussian Mixture Models (GMMs).
4. The system (132) for fault diagnosis as claimed in claim 1, wherein the corresponding target data comprises a log-likelihood of a target GMM associated with the target data from the one or more Gaussian Mixture Models (GMMs).
5. The system (132) for fault diagnosis as claimed in claim 1, wherein the one or more input samples comprise any or a combination of: healthy condition data associated with the source machine, fault condition data associated with the source machine, and healthy condition data associated with the target machine.
6. The system (132) for fault diagnosis as claimed in claim 5, wherein the processor (202) is configured to generate the synthetic fault data associated with a target machine based on the generated transformation matrix corresponding to the fault condition data of the source machine.
7. A method for fault diagnosis, the method comprising:
receiving, by a processor (202) associated with a system (132), one or more input samples, wherein the one or more input samples comprise one or more features associated with a source machine and a target machine;
determining, by the processor (202), one or more discriminative features associated with the source machine and the target machine based on the analyzed one or more features;
generating, by the processor (202), one or more Gaussian Mixture Models (GMMs) associated with the source machine and the target machine comprising the one or more discriminative features;
generating, by the processor (202), a transformation matrix associated with the source and target machines' healthy condition data using the generated one or more GMMs that maps a source machine feature space to a corresponding target machine feature space; and
generating, by the processor (202), synthetic fault data based on the transformation matrix for the fault diagnosis using the fault condition data of the source machine.
8. The method as claimed in claim 7, wherein for determining the one or more discriminative features, the method comprises:
aggregating, by the processor (202), the one or more input samples to determine a feature representation of the aggregated one or more input samples;
utilizing, by the processor (202), a supervised clustering technique to generate a specified margin between the aggregated one or more samples belonging to different classes while minimising the cosine distance between the feature representation; and
determining, by the processor (202), the aggregated one or more samples belonging to a similar class by minimizing the distance between the feature representations of the aggregated one or more samples from the aggregated one or more samples of the similar class and maximizing the distance between the feature representation of the aggregated one or more samples belonging to different classes.
9. The method as claimed in claim 7, wherein the corresponding target data comprises a log-likelihood of a target GMM associated with the target data from the one or more Gaussian Mixture Models (GMMs).
10. The method as claimed in claim 7, wherein the one or more input samples comprise any or a combination of: healthy condition data associated with the source machine, fault condition data associated with the source machine, and healthy condition data associated with the target machine.
11. The method as claimed in claim 10, comprising generating, by the processor (202), the synthetic fault data associated with a target machine based on the generated transformation matrix corresponding to the fault condition data of the source machine.

Documents

Application Documents

#	Name	Date
1	202441040203-STATEMENT OF UNDERTAKING (FORM 3) [23-05-2024(online)].pdf	2024-05-23
2	202441040203-PROVISIONAL SPECIFICATION [23-05-2024(online)].pdf	2024-05-23
3	202441040203-OTHERS [23-05-2024(online)].pdf	2024-05-23
4	202441040203-FORM FOR SMALL ENTITY(FORM-28) [23-05-2024(online)].pdf	2024-05-23
5	202441040203-FORM 1 [23-05-2024(online)].pdf	2024-05-23
6	202441040203-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [23-05-2024(online)].pdf	2024-05-23
7	202441040203-EDUCATIONAL INSTITUTION(S) [23-05-2024(online)].pdf	2024-05-23
8	202441040203-FORM-26 [05-07-2024(online)].pdf	2024-07-05
9	202441040203-RELEVANT DOCUMENTS [01-04-2025(online)].pdf	2025-04-01
10	202441040203-POA [01-04-2025(online)].pdf	2025-04-01
11	202441040203-FORM 13 [01-04-2025(online)].pdf	2025-04-01
12	202441040203-OTHERS [12-05-2025(online)].pdf	2025-05-12
13	202441040203-EDUCATIONAL INSTITUTION(S) [12-05-2025(online)].pdf	2025-05-12
14	202441040203-FORM-5 [15-05-2025(online)].pdf	2025-05-15
15	202441040203-DRAWING [15-05-2025(online)].pdf	2025-05-15
16	202441040203-CORRESPONDENCE-OTHERS [15-05-2025(online)].pdf	2025-05-15
17	202441040203-COMPLETE SPECIFICATION [15-05-2025(online)].pdf	2025-05-15
18	202441040203-FORM-9 [19-05-2025(online)].pdf	2025-05-19
19	202441040203-FORM 18 [19-05-2025(online)].pdf	2025-05-19