Abstract: The present disclosure provides an unsupervised feature recommendation method for time series data without using any label information. Initially, the system receives time series data. A feature matrix is computed based on the time series data and a corresponding transposed feature matrix is generated. A latent feature matrix with an Auto Encoded Compact Sequence (AECS) representation is computed based on the transposed feature matrix. Further, a plurality of consistent hierarchical clusters are generated using the AECS of features based on a selected best distance measure. Further, a plurality of unique features are obtained from the plurality of AECS representations based on a corresponding autocorrelation score. Finally, a plurality of discriminative features are selected from the plurality of unique features based on a sorted list of the pair-wise distances between the feature representations, computed using the selected distance measure.
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
METHOD AND SYSTEM FOR UNSUPERVISED FEATURE RECOMMENDATION FOR UNLABELED TIME SERIES DATA
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD [001] The disclosure herein generally relates to the field of signal processing and, more particularly, to a method and system for unsupervised feature recommendation for unlabeled time series data.
BACKGROUND
[002] Feature selection or recommendation can be defined as a process of choosing a subset of features from a larger set of features which, in combination, retains the most important properties of the dataset. It contributes to reducing the redundancy between the features and represents the data in a reduced feature space, thereby reducing the computation time. Further, feature selection influences learning mostly by controlling overfitting and by improving the learning efficacy. In most scenarios, such feature selection algorithms require annotated data to recommend the features which contribute most to discrimination between the classes. But in real world applications, annotations may not be available and labelling by experts is costly, especially for time-series data.
[003] Further, feature extraction from a dataset is a completely unsupervised process. Hence, building an unsupervised feature selection method can contribute massively towards building an end-to-end feature recommendation approach in the absence of any annotated data. Exploiting these recommended features in either unsupervised or supervised settings impacts diverse machine learning applications to achieve robust performance.
[004] Conventional methods for feature selection and recommendation use label information and sometimes a specific classifier. Hence, there is an urgent need to develop an accurate end-to-end unsupervised method for generic feature selection and recommendation.
SUMMARY [005] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one
embodiment, a method for unsupervised feature recommendation for unlabeled time series data is provided. The method includes receiving, by one or more hardware processors, a time series data with a plurality of instances, wherein the time series data comprises one of, a univariate time series data and a multivariate time series data. Further, the method includes computing, by the one or more hardware processors, a feature matrix comprising a plurality of features based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique. Furthermore, the method includes generating, by the one or more hardware processors, a transposed feature matrix by interchanging a plurality of rows and a plurality of columns associated with the feature matrix. Furthermore, the method includes generating, by the one or more hardware processors, a latent feature matrix by computing an Auto Encoded Compact Sequence (AECS) representation on the transposed feature matrix using a multi-layer undercomplete auto encoder. Furthermore, the method includes generating, by the one or more hardware processors, a plurality of consistent hierarchical clusters based on the latent feature matrix using a hierarchical clustering technique, wherein the latent feature matrix comprises a plurality of auto encoded compact feature representations. Furthermore, the method includes computing, by the one or more hardware processors, an auto-correlation score between each of the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters. 
Furthermore, the method includes obtaining, by the one or more hardware processors, a plurality of unique auto encoded features from the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters based on the corresponding autocorrelation score, wherein the plurality of auto encoded compact features with the autocorrelation score more than a predetermined threshold are removed. Furthermore, the method includes computing, by the one or more hardware processors, a triangular distance matrix for each of the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters, wherein each distance measure in the triangular distance matrix is a distance measure between a pair of unique auto encoded features. Finally, the method
includes selecting, by the one or more hardware processors, a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix by: (i) obtaining a plurality of sorted unique feature pairs by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix and (ii) selecting the plurality of discriminative features from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
[006] In another aspect, a system for unsupervised feature recommendation for unlabeled time series data is provided. The system includes at least one memory storing programmed instructions, one or more Input /Output (I/O) interfaces, and one or more hardware processors operatively coupled to the at least one memory, wherein the one or more hardware processors are configured by the programmed instructions to receive a time series data with a plurality of instances, wherein the time series data comprises one of, a univariate time series data and a multivariate time series data. Further, the one or more hardware processors are configured by the programmed instructions to compute a feature matrix comprising a plurality of features based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique. Furthermore, the one or more hardware processors are configured by the programmed instructions to generate a transposed feature matrix by interchanging a plurality of rows and a plurality of columns associated with the feature matrix. Furthermore, the one or more hardware processors are configured by the programmed instructions to generate a latent feature matrix by computing an Auto Encoded Compact Sequence (AECS) representation on the transposed feature matrix using a multi-layer undercomplete auto encoder. Furthermore, the one or more hardware processors are configured by the programmed instructions to generate a plurality of consistent hierarchical clusters based on the latent feature
matrix using a hierarchical clustering technique, wherein the latent feature matrix comprises a plurality of auto encoded compact feature representations. Furthermore, the one or more hardware processors are configured by the programmed instructions to compute an auto-correlation score between each of the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters. Furthermore, the one or more hardware processors are configured by the programmed instructions to obtain a plurality of unique auto encoded features from the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters based on the corresponding autocorrelation score, wherein the plurality of auto encoded compact features with the autocorrelation score more than a predetermined threshold are removed. Furthermore, the one or more hardware processors are configured by the programmed instructions to compute a triangular distance matrix for each of the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters, wherein each distance measure in the triangular distance matrix is a distance measure between a pair of unique auto encoded features. 
Finally, the one or more hardware processors are configured by the programmed instructions to select a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix by: (i) obtaining a plurality of sorted unique feature pairs by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix and (ii) selecting the plurality of discriminative features from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
[007] In yet another aspect, a computer program product including a non-transitory computer-readable medium having embodied therein a computer program for unsupervised feature recommendation for unlabeled time series data is provided. The computer readable program, when executed on a computing device,
causes the computing device to receive a time series data with a plurality of instances, wherein the time series data comprises one of, a univariate time series data and a multivariate time series data. Further, the computer readable program, when executed on a computing device, causes the computing device to compute a feature matrix comprising a plurality of features based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to generate a transposed feature matrix by interchanging a plurality of rows and a plurality of columns associated with the feature matrix. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to generate a latent feature matrix by computing an Auto Encoded Compact Sequence (AECS) representation on the transposed feature matrix using a multi-layer undercomplete auto encoder. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to generate a plurality of consistent hierarchical clusters based on the latent feature matrix using a hierarchical clustering technique, wherein the latent feature matrix comprises a plurality of auto encoded compact feature representations. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to compute an auto-correlation score between each of the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters. 
Furthermore, the computer readable program, when executed on a computing device, causes the computing device to obtain a plurality of unique auto encoded features from the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters based on the corresponding autocorrelation score, wherein the plurality of auto encoded compact features with the autocorrelation score more than a predetermined threshold are removed. Furthermore, the computer readable program, when executed on a computing device, causes the computing device to compute a triangular distance matrix for each of the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters, wherein each distance measure in the triangular
distance matrix is a distance measure between a pair of unique auto encoded features. Finally, the computer readable program, when executed on a computing device, causes the computing device to select a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix by: (i) obtaining a plurality of sorted unique feature pairs by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix and (ii) selecting the plurality of discriminative features from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
[008] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[009] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[0010] FIG. 1 is a functional block diagram of a system for unsupervised feature recommendation for unlabeled time series data, in accordance with some embodiments of the present disclosure.
[0011] FIGS. 2A and 2B are exemplary flow diagrams illustrating a method for unsupervised feature recommendation for unlabeled time series data, implemented by the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[0012] FIG. 3 illustrates a lower triangular distance matrix for the processor implemented method for unsupervised feature recommendation for unlabeled time
series data implemented by the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[0013] FIG. 4 is an example overall architecture for the processor implemented method for unsupervised feature recommendation for unlabeled time series data implemented by the system of FIG. 1, in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS [0014] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. [0015] Embodiments herein provide a method and system for unsupervised feature recommendation for unlabeled time series data for identifying discriminative features. Initially, the system receives a time series data with a plurality of instances. A feature matrix comprising a plurality of features is computed based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique. Further, a transposed feature matrix is generated by interchanging a plurality of rows and a plurality of columns associated with the feature matrix. Post generating the transposed matrix, a latent feature matrix is generated by computing an Auto Encoded Compact Sequence (AECS) representation on the transposed feature matrix using a multi-layer undercomplete auto encoder. Further, a plurality of consistent hierarchical clusters are generated based on the latent feature matrix using a hierarchical clustering technique. The latent feature matrix includes a plurality of auto encoded compact feature representations. Further, an auto-correlation score is computed between each of the plurality of auto encoded compact feature representations corresponding to each of the plurality of hierarchical
clusters. After computing the autocorrelation score, a plurality of unique auto encoded features are obtained from the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters based on the autocorrelation score. Further, a triangular distance matrix is computed for each of the plurality of unique features corresponding to each of the plurality of hierarchical clusters. Finally, a plurality of discriminative features are selected from the plurality of unique features corresponding to the triangular distance matrix for each of the plurality of hierarchical clusters based on the corresponding distance measure. The unique features belonging to the first top predetermined number of feature pairs, taken from the list of feature pairs sorted in descending order of pair-wise distance, are selected as discriminative features.
[0016] Referring now to the drawings, and more particularly to FIGS. 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[0017] FIG. 1 is a functional block diagram of a system 100 for unsupervised feature recommendation for unlabeled time series data, according to some embodiments of the present disclosure. The system 100 includes or is otherwise in communication with hardware processors 102, at least one memory such as a memory 104, and an I/O interface 112. The hardware processors 102, memory 104, and the Input/Output (I/O) interface 112 may be coupled by a system bus such as a system bus 108 or a similar mechanism. In an embodiment, the hardware processors 102 can be one or more hardware processors.
[0018] The I/O interface 112 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a printer and the like. Further, the I/O interface 112 may enable the system 100 to communicate with other devices, such as web servers, and external databases.
[0019] The I/O interface 112 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For this purpose, the I/O interface 112 may include one or more ports for connecting several computing systems or devices with one another or to another server.
[0020] The one or more hardware processors 102 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, node machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 102 is configured to fetch and execute computer-readable instructions stored in the memory 104.
[0021] The memory 104 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 104 includes a plurality of modules 106. The memory 104 also includes a data repository (or repository) 110 for storing data processed, received, and generated by the plurality of modules 106.
[0022] The plurality of modules 106 include programs or coded instructions that supplement applications or functions performed by the system 100 for unsupervised feature recommendation for unlabeled time series data. The plurality of modules 106, amongst other things, can include routines, programs, objects, components, and data structures, which perform particular tasks or implement particular abstract data types. The plurality of modules 106 may also be implemented as signal processor(s), node machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the plurality of modules 106 can be implemented in hardware, in computer-readable instructions executed by the one or more hardware processors 102, or by a
combination thereof. The plurality of modules 106 can include various sub-modules (not shown). The plurality of modules 106 may include computer-readable instructions that supplement applications or functions performed by the system 100 for unsupervised feature recommendation for unlabeled time series data. In an embodiment, plurality of modules 106 includes a feature matrix computation module (not shown in FIG. 1), a transpose computation module (not shown in FIG. 1), an AECS matrix computation module (not shown in FIG. 1), a clustering module (not shown in FIG. 1), an autocorrelation computation module (not shown in FIG. 1), a distance measure computation module (not shown in FIG. 1) and a discriminative feature selection module (not shown in FIG. 1).
[0023] The data repository (or repository) 110 may include a plurality of abstracted pieces of code for refinement and data that is processed, received, or generated as a result of the execution of the plurality of modules in the module(s) 106.
[0024] Although the data repository 110 is shown internal to the system 100, it will be noted that, in alternate embodiments, the data repository 110 can also be implemented external to the system 100, where the data repository 110 may be stored within a database (not shown in FIG. 1) communicatively coupled to the system 100. The data contained within such external database may be periodically updated. For example, new data may be added into the database (not shown in FIG. 1) and/or existing data may be modified and/or non-useful data may be deleted from the database (not shown in FIG. 1). In one example, the data may be stored in an external system, such as a Lightweight Directory Access Protocol (LDAP) directory and a Relational Database Management System (RDBMS).
[0025] FIGS. 2A and 2B are exemplary flow diagrams illustrating a method 200 for unsupervised feature recommendation for unlabeled time series data implemented by the system of FIG. 1 according to some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more data storage devices or the memory 104 operatively coupled to the one or more hardware processor(s) 102 and is configured to store instructions for execution of steps of the method 200 by the one or more hardware processors 102. The steps of the method
200 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1 and the steps of flow diagram as depicted in FIG. 2A and 2B. The method 200 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method 200 may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communication network. The order in which the method 200 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 200, or an alternative method. Furthermore, the method 200 can be implemented in any suitable hardware, software, firmware, or combination thereof.
[0026] At step 202 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to receive a time series data. The time series data includes one of, a univariate time series data and a multivariate time series data. In one embodiment, the univariate time series data is received as input and, in another embodiment, the multivariate time series data is received as input. The univariate time series data includes only one variable varying over time. For example, data collected from a sensor measuring the temperature of a room every second: here, for each second, there is a one dimensional value called temperature. On the other hand, the multivariate time series data has a plurality of variables varying over time. For example, a tri-axial accelerometer provides three accelerations, for the x-axis, y-axis and z-axis, which vary simultaneously over time.
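As a minimal illustration of the two input shapes described above (Python and NumPy are assumed here purely for illustration; the disclosure does not prescribe a language):

```python
import numpy as np

# Univariate: one value per time step (e.g. room temperature each second).
univariate = np.array([21.0, 21.1, 21.3, 21.2])          # shape (T,)

# Multivariate: several values per step (e.g. tri-axial accelerometer x, y, z).
multivariate = np.array([[0.1, 9.8, 0.0],
                         [0.2, 9.7, 0.1],
                         [0.1, 9.8, 0.2]])               # shape (T, 3)

print(univariate.ndim, multivariate.shape)
```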
[0027] At step 204 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to compute a feature matrix comprising a plurality of features based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique. The feature matrix includes a plurality of rows and a plurality of columns. Each of the plurality of rows includes an instance of the time series data
and each of the plurality of columns includes a feature associated with the corresponding instance of the time series data. In an embodiment, the plurality of features can be Signal Properties based Generic Features (SPGF). The SPGF are a group of 392 features computed from the raw time-series which capture the morphological and statistical properties of the signal. Features are mainly extracted from three domains: (1) temporal, (2) spectral and (3) wavelet. For example, the plurality of features includes 'Box-PierceStat1 of Discrete Wavelet Transform (DWT) (d1)', 'RootMeanSquare of DWT (a2)', 'Box-PierceStat3 of DWT (a3)', 'Mean of DWT (a1)', 'Variance of Temporal Difference', 'Variance of DWT (a1)' and the like.
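The feature-matrix computation of step 204 can be sketched as follows. This is a hypothetical Python/NumPy sketch: the handful of temporal and spectral features below merely stands in for the 392 SPGF features of the disclosure.

```python
import numpy as np

def compute_feature_matrix(series_list):
    """Build a feature matrix: one row per time-series instance, one column
    per feature. The five features here are illustrative stand-ins for the
    SPGF group, not the disclosure's actual feature set."""
    rows = []
    for x in series_list:
        x = np.asarray(x, dtype=float)
        spectrum = np.abs(np.fft.rfft(x))       # spectral-domain stand-in
        rows.append([
            x.mean(),                           # temporal: mean
            x.var(),                            # temporal: variance
            np.sqrt(np.mean(x ** 2)),           # temporal: root mean square
            np.var(np.diff(x)),                 # variance of temporal difference
            spectrum.argmax(),                  # index of the dominant frequency
        ])
    return np.array(rows)

# Three toy instances of a univariate time series.
data = [np.sin(np.linspace(0, 4 * np.pi, 64)),
        np.ones(64),
        np.linspace(0, 1, 64)]
F = compute_feature_matrix(data)
print(F.shape)  # (3, 5): instances x features
```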
[0028] At step 206 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to generate a transposed feature matrix by interchanging a plurality of rows and a plurality of columns associated with the feature matrix.
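The transposition of step 206 is a plain row/column interchange; a one-line NumPy sketch (toy values):

```python
import numpy as np

# Feature matrix: rows = instances, columns = features.
F = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # 2 instances x 3 features

Ft = F.T                          # transposed: rows now correspond to features
print(Ft.shape)  # (3, 2)
```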
[0029] At step 208 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to generate a latent feature matrix by computing AECS on the transposed feature matrix using a multi-layer undercomplete auto encoder. For example, a multilayer Seq-2-Seq auto-encoder which learns a compact representation is utilized. Towards generating/obtaining the AECS, the Mean Squared Error (MSE) loss function is utilized. Stochastic gradient descent is used for optimization, with a learning rate of 0.004 and a momentum of zero. In an embodiment, the batch size used was 32. For example, the latent feature matrix for the MiddlePhalanxTW dataset is depicted in Table I, where every row represents a corresponding AECS representation.
Table I
       0       1       2       3     …      9      10      11      12
0    -0.280  -1.203  -0.288   0.236  …   1.357  -0.377  -1.611   0.502
1     0.990  -0.532  -0.772  -0.530  …  -1.574   0.652   1.519   0.681
2     0.990  -0.532  -0.771  -0.530  …  -1.574   0.652   1.520   0.680
3     0.983  -0.413  -0.710  -0.526  …  -1.609   0.659   1.574   0.596
…      …       …       …       …     …    …       …       …       …
391   0.863   0.633  -0.066  -0.449  …  -1.619   0.547   1.676  -0.269
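The AECS step above can be sketched as follows. This is a deliberately simplified, hypothetical stand-in: a single linear encoder/decoder pair trained by plain gradient descent replaces the multi-layer Seq-2-Seq auto-encoder, keeping only the MSE loss, the 0.004 learning rate and zero momentum from the disclosure.

```python
import numpy as np

def aecs_latent(X, latent_dim, lr=0.004, epochs=500, seed=0):
    """Minimal undercomplete auto-encoder: latent_dim < input dim forces a
    compact code. Trained with MSE loss and plain gradient descent
    (momentum 0). Returns the latent matrix and the per-epoch losses."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, d))
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc                  # encode: compact latent codes
        err = Z @ W_dec - X            # decode and take reconstruction error
        losses.append(float(np.mean(err ** 2)))    # MSE loss
        g_dec = (Z.T @ err) / n        # gradient w.r.t. decoder weights
        g_enc = (X.T @ (err @ W_dec.T)) / n        # gradient w.r.t. encoder
        W_dec -= lr * g_dec
        W_enc -= lr * g_enc
    return X @ W_enc, losses

# Transposed feature matrix: one row per feature (392 x instances in the
# disclosure); here a 12-feature, 8-instance toy stand-in.
rng = np.random.default_rng(1)
Ft = rng.normal(size=(12, 8))
Z, losses = aecs_latent(Ft, latent_dim=4)
print(Z.shape)  # (12, 4): one compact representation per feature
```

Each row of `Z` plays the role of one row of Table I: the compact representation of one feature.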
[0030] At step 210 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to generate a plurality of consistent hierarchical clusters based on the latent feature matrix using a hierarchical clustering technique. The latent feature matrix includes a plurality of auto encoded compact feature representations.
[0031] In an embodiment, the hierarchical clustering generates a plurality of consistent clusters formed either by merging or dividing existing clusters recursively, choosing the best clustering by exploiting the best distance measure among Chebyshev, Manhattan and Mahalanobis. Here, agglomerative hierarchical clustering is used. It takes each input time-series as an individual cluster and then starts successively merging the pair of clusters having the highest similarity between them until a single cluster is formed. Hence it measures similarities in terms of the distance measure among pairs of time-series to put them into the same cluster, and the dissimilarities between pairs of groups of time-series to take the decision of further fusing the clusters. Pairwise inter-cluster dissimilarities are computed using average linkage.
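The agglomerative step above maps directly onto standard library routines; a sketch using SciPy's average-linkage clustering (the cluster count, data, and the choice of Chebyshev for this run are illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_features(Z, n_clusters, metric):
    """Average-linkage agglomerative clustering of latent feature
    representations; metric is one of 'chebyshev', 'cityblock' (Manhattan)
    or 'mahalanobis', the three measures named in the disclosure."""
    D = pdist(Z, metric=metric)            # pairwise distances between features
    link = linkage(D, method='average')    # successively merge closest clusters
    return fcluster(link, t=n_clusters, criterion='maxclust')

rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0, 0.1, size=(5, 3)),    # one tight group of features
               rng.normal(5, 0.1, size=(5, 3))])   # a second, far-away group
labels = cluster_features(Z, n_clusters=2, metric='chebyshev')
print(labels)  # first five features fall in one cluster, last five in the other
```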
[0032] At step 212 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to compute an auto-correlation score between each of the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters.
[0033] At step 214 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to obtain a plurality of unique auto encoded compact features from the plurality of auto encoded compact feature representations corresponding to each of the plurality of hierarchical clusters based on the autocorrelation score. The plurality of auto encoded compact feature representations with an autocorrelation score more than a predetermined threshold
are removed. In an embodiment, the plurality of auto encoded compact features with the autocorrelation score greater than or equal to 0.96 are removed.
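One plausible reading of this de-duplication step — treating the score as a pairwise correlation between compact feature representations within a cluster and dropping any feature whose correlation with an already-kept feature meets the 0.96 threshold — can be sketched as follows (the greedy keep-first rule is an assumption, not from the disclosure):

```python
import numpy as np

def unique_features(Z, idx, threshold=0.96):
    """Within one cluster (feature indices idx into latent matrix Z), keep a
    feature only if it is not too correlated with any already-kept feature.
    The 0.96 threshold is from the disclosure; the greedy order is assumed."""
    kept = []
    for i in idx:
        corr = [abs(np.corrcoef(Z[i], Z[j])[0, 1]) for j in kept]
        if all(c < threshold for c in corr):
            kept.append(i)
    return kept

Z = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.1],    # near-duplicate of feature 0: dropped
              [4.0, 1.0, 3.0, 2.0]])   # distinct: kept
print(unique_features(Z, idx=[0, 1, 2]))  # [0, 2]
```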
[0034] At step 216 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to compute a triangular distance matrix for each of the plurality of unique features corresponding to each of the plurality of hierarchical clusters using a selected distance measure. Here, the distance measure is performed using a best distance measure technique from a plurality of distance measure techniques including a Chebyshev distance, a Manhattan distance and a Mahalanobis distance. Here, the best distance measure is chosen based on the distance recommended for the best clustering.
[0035] In an embodiment, the present disclosure utilizes a modified Hubert statistic value, an internal clustering measure, to identify the best clustering and the corresponding distance measure. The Hubert statistic evaluates the sum of the distance between each pair of instances weighted by the distance between the centers of the clusters to which they belong. In the present disclosure, the Hubert statistic uses the Mahalanobis distance. The clusterings formed by different distance measures are ranked based on the Hubert statistic value; the best clustering has the highest Hubert statistic value. In case two or more distance measures produce the highest value of the Hubert statistic, all are chosen as the best clustering and corresponding distance measures.
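A sketch of the Hubert statistic described above (the pair-count normalisation and the small covariance ridge below are assumptions for illustration, not from the disclosure):

```python
import numpy as np
from scipy.spatial.distance import mahalanobis

def hubert_statistic(Z, labels):
    """Modified Hubert statistic sketch: sum over instance pairs of their
    distance weighted by the distance between the centers of the clusters
    they belong to (same-cluster pairs get weight zero). Uses Mahalanobis
    distance; a small ridge keeps the covariance invertible."""
    VI = np.linalg.pinv(np.cov(Z.T) + 1e-6 * np.eye(Z.shape[1]))
    centers = {c: Z[labels == c].mean(axis=0) for c in np.unique(labels)}
    n = len(Z)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (mahalanobis(Z[i], Z[j], VI)
                      * mahalanobis(centers[labels[i]], centers[labels[j]], VI))
    return 2.0 * total / (n * (n - 1))   # normalisation over pairs (assumed)

# Two well-separated groups: the clustering matching them should score
# higher than one that mixes them.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0, 0.2, (5, 2)), rng.normal(5, 0.2, (5, 2))])
good = np.array([0] * 5 + [1] * 5)
bad = np.array([0, 1] * 5)
print(hubert_statistic(Z, good) > hubert_statistic(Z, bad))  # True
```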
[0036] At step 218 of the method 200, the one or more hardware processors 102 are configured by the programmed instructions to select a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix by the following steps. Initially, a plurality of sorted unique feature pairs are obtained by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix. Further, the plurality of discriminative features are selected from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
[0037] For example, considering the distance measures between the feature pairs (A, D) = 6, (A, B) = 8, (D, E) = 1 and (A, C) = 10, the list of unique feature pairs sorted in decreasing order of the corresponding distance measure is (A, C), (A, B), (A, D), (D, E) and so on. The selected discriminative features are then A, B, C and D, corresponding to (A, C), (A, B) and (A, D). Here, the predefined selection threshold is three.
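The worked example above can be reproduced with a short sketch; the helper name `select_discriminative` and the dictionary representation of the pairwise distances are assumptions for illustration.

```python
def select_discriminative(tri, n):
    """Rank feature pairs by distance (descending) and return the
    top-n pairs together with the union of features they contain.
    `tri` maps a feature-pair tuple to its distance measure;
    `n` is the predefined selection threshold."""
    ranked = sorted(tri.items(), key=lambda kv: kv[1], reverse=True)
    chosen = []
    for (a, b), _ in ranked[:n]:
        for f in (a, b):
            if f not in chosen:
                chosen.append(f)
    return [pair for pair, _ in ranked[:n]], chosen
```

Applied to the distances in the example with threshold three, the top pairs are (A, C), (A, B) and (A, D), and the selected features are A, B, C and D.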
[0038] FIG. 3 illustrates a lower triangular distance matrix for the processor implemented method for unsupervised feature recommendation for unlabeled time series data implemented by the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[0039] Now referring to FIG. 3, the lower triangular distance matrix for each of the plurality of unique features corresponding to each of the plurality of hierarchical clusters is depicted. The rows and the columns of the triangular distance matrix represent the plurality of unique auto encoded features. Each element of the triangular distance matrix is the distance between a pair of unique auto encoded features (computed on the AECS representation). Since the triangular distance matrix is symmetric and one pair is considered at a time, either the lower or the upper part is considered. The unique auto encoded features 199 and 254 of FIG. 3 are the most distant from each other. Hence, the unique auto encoded features 199 and 254 are selected from the cluster.
[0040] FIG. 4 is an example overall architecture for the processor implemented method for unsupervised feature recommendation for unlabeled time series data implemented by the system of FIG. 1, in accordance with some embodiments of the present disclosure. Now referring to FIG. 4, the architecture includes a feature matrix computation module 402, a transpose computation module 404, an AECS matrix computation module 406, a clustering module 408, an autocorrelation computation module 410, a distance measure computation module 412 and a discriminative feature selection module 414. The feature matrix computation module 402 receives the time series data and computes a plurality of features corresponding to a plurality of instances. The transpose computation module 404 generates the transpose of the feature matrix. The AECS matrix
computation module 406 computes a compressed representation of the transpose matrix using a seq-2-seq auto encoder. The clustering module 408 clusters the AECS matrix features into the plurality of clusters using a consistent agglomerative hierarchical clustering method. The autocorrelation computation module 410 computes the autocorrelation score for the plurality of compact features corresponding to each of the plurality of clusters. The plurality of unique features are selected based on the corresponding autocorrelation score. The distance measure computation module 412 computes the distance measure for each of the plurality of unique features corresponding to each of the plurality of hierarchical clusters. The discriminative feature selection module 414 selects a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix. Here, initially, the plurality of sorted unique feature pairs are obtained by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix. Further, the plurality of discriminative features are selected from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
[0041] In an embodiment, the present disclosure has been evaluated on 15 diverse datasets from the UCR Time Series Classification Archive. The 392 features of the SPGF feature matrix have been used. The classifier used is the TreeBagger classifier. The accuracy of the present system 100 is depicted in Table II.
Table II

| Sensor | Dataset | Accuracy |
|---|---|---|
| Camera (Image outlines as time-series) | ProximalPhalanxOutlineAgeGroup | 0.795 |
| | DistalPhalanxOutlineAgeGroup | 0.778 |
| | ProximalPhalanxTW | 0.68 |
| | MiddlePhalanxTW | 0.591 |
| | MedicalImages | 0.699 |
| | DistalPhalanxOutlineCorrect | 0.728 |
| | DistalPhalanxTW | 0.68 |
| | MiddlePhalanxOutlineCorrect | 0.6 |
| | ProximalPhalanxOutlineCorrect | 0.8 |
| | Yoga | 0.68 |
| Food Spectrograph | Strawberry | 0.92 |
| ECG | ECG5000 | 0.88 |
| Vibration | FordB | 0.855 |
| Simulated | ChlorineConcentration | 0.59 |
| | SyntheticControl | 0.92 |
[0042] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0043] The embodiments of present disclosure herein address the unresolved problem of unsupervised hierarchical clustering based feature recommendation for classifiers. The present disclosure is capable of recommending discriminative features even in the presence of unlabeled data. Further, the present disclosure uses AECS based encoding for compact representation of features. The consistent hierarchical agglomerative clustering and the autocorrelation score plays an important role in accurate selection of discriminating features.
[0044] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein such computer-readable storage means contain program-code means for implementation of one or more steps of the method when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC
and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs, GPUs and edge computing devices.
[0045] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Furthermore, one or more computer-
readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e. non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0046] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
WE CLAIM:
1. A processor implemented method (200), the method comprising:
receiving (202), by one or more hardware processors, a time series data with a plurality of instances, wherein the time series data comprises one of, a univariate time series data and a multivariate time series data;
computing (204), by the one or more hardware processors, a feature matrix comprising a plurality of features based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique;
generating (206), by the one or more hardware processors, a transposed feature matrix by interchanging a plurality of rows and a plurality of columns associated with the feature matrix;
generating (208), by the one or more hardware processors, a latent
feature matrix by computing an Auto Encoded Compact Sequence (AECS)
representation on the transposed feature matrix using a multi-layer
undercomplete auto encoder;
generating (210), by the one or more hardware processors, a plurality of consistent hierarchical clusters based on the latent feature matrix using a hierarchical clustering technique, wherein the latent feature matrix comprises a plurality of auto encoded compact feature representations;
computing (212), by the one or more hardware processors, an auto-correlation score between each of the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters;
obtaining (214), by the one or more hardware processors, a plurality of unique auto encoded features from the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters based on the corresponding autocorrelation score, wherein the plurality of auto encoded compact features with the autocorrelation score more than a predetermined threshold are removed;
computing (216), by the one or more hardware processors, a triangular distance matrix for each of the plurality of unique auto encoded
features corresponding to each of the plurality of hierarchical clusters, wherein each distance measure in the triangular distance matrix is a distance measure between a pair of unique auto encoded features; and
selecting (218), by the one or more hardware processors, a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix by:
obtaining a plurality of sorted unique feature pairs by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix; and
selecting the plurality of discriminative features from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
2. The method as claimed in claim 1, wherein the feature matrix comprises a plurality of rows and a plurality of columns, wherein each of the plurality of rows comprise an instance of the time series data, wherein each of the plurality of columns comprises a feature associated with the corresponding instance of the time series data.
3. The method as claimed in claim 1, wherein the consistent hierarchical clusters are generated by iteratively clustering the plurality of compact feature representations until no further feature sub grouping is possible.
4. The method as claimed in claim 1, wherein the discriminative features are in latent feature space.
5. The method as claimed in claim 1, wherein the discriminative features are
selected by forming one of, a lower triangular distance matrix and an upper
triangular distance matrix using the plurality of auto encoded compact
feature representations of plurality of hierarchical clusters exploiting the
recommended distance measure during clustering.
6. A system (100) comprising:
at least one memory (104) storing programmed instructions; one or more Input /Output (I/O) interfaces (112); and one or more hardware processors (102) operatively coupled to the at least one memory (104), wherein the one or more hardware processors (102) are configured by the programmed instructions to:
receive a time series data with a plurality of instances, wherein the time series data comprises one of, a univariate time series data and a multivariate time series data;
compute a feature matrix comprising a plurality of features based on the time series data using at least one of a signal processing technique, a statistical model and an information theory technique;
generate a transposed feature matrix by interchanging a plurality of rows and a plurality of columns associated with the feature matrix;
generate a latent feature matrix by computing an Auto Encoded
Compact Sequence (AECS) representation on the transposed feature
matrix using a multi-layer undercomplete auto encoder;
generate a plurality of consistent hierarchical clusters based on the latent feature matrix using a hierarchical clustering technique, wherein the latent feature matrix comprises a plurality of auto encoded compact feature representations;
compute an auto-correlation score between each of the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters;
obtain a plurality of unique auto encoded features from the plurality of auto encoded compact features corresponding to each of the plurality of hierarchical clusters based on the corresponding autocorrelation score, wherein the plurality of auto encoded compact features with the autocorrelation score more than a predetermined threshold are removed;
compute a triangular distance matrix for each of the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters, wherein each distance measure in the triangular distance matrix is a distance measure between a pair of unique auto encoded features; and
select a plurality of discriminative features from the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters based on the triangular distance matrix by:
obtaining a plurality of sorted unique feature pairs by sorting the plurality of unique auto encoded features corresponding to each of the plurality of hierarchical clusters in descending order based on a corresponding distance measure associated with the triangular distance matrix; and
selecting the plurality of discriminative features from the plurality of sorted unique feature pairs based on a predefined selection threshold, wherein a first n number of sorted unique feature pairs are selected as the plurality of discriminative features and, wherein n is the predefined selection threshold.
7. The system as claimed in claim 6, wherein the feature matrix comprises a plurality of rows and a plurality of columns, wherein each of the plurality of rows comprise an instance of the time series data, wherein each of the plurality of columns comprises a feature associated with the corresponding instance of the time series data.
8. The system as claimed in claim 6, wherein the consistent hierarchical clusters are generated by iteratively clustering the plurality of compact feature representations until no further feature sub grouping is possible.
9. The system as claimed in claim 6, wherein the discriminative features are in latent feature space.
10. The system as claimed in claim 6, wherein the discriminative features are selected by
forming one of, a lower triangular distance matrix and an upper triangular
distance matrix using the plurality of auto encoded compact feature
representations of plurality of hierarchical clusters exploiting the
recommended distance measure during clustering.
| # | Name | Date |
|---|---|---|
| 1 | 202121044014-STATEMENT OF UNDERTAKING (FORM 3) [28-09-2021(online)].pdf | 2021-09-28 |
| 2 | 202121044014-REQUEST FOR EXAMINATION (FORM-18) [28-09-2021(online)].pdf | 2021-09-28 |
| 3 | 202121044014-FORM 18 [28-09-2021(online)].pdf | 2021-09-28 |
| 4 | 202121044014-FORM 1 [28-09-2021(online)].pdf | 2021-09-28 |
| 5 | 202121044014-FORM 1 [28-09-2021(online)]-1.pdf | 2021-09-28 |
| 6 | 202121044014-FIGURE OF ABSTRACT [28-09-2021(online)].jpg | 2021-09-28 |
| 7 | 202121044014-DRAWINGS [28-09-2021(online)].pdf | 2021-09-28 |
| 8 | 202121044014-DECLARATION OF INVENTORSHIP (FORM 5) [28-09-2021(online)].pdf | 2021-09-28 |
| 9 | 202121044014-COMPLETE SPECIFICATION [28-09-2021(online)].pdf | 2021-09-28 |
| 10 | 202121044014-FORM-26 [21-10-2021(online)].pdf | 2021-10-21 |
| 11 | Abstract1.jpg | 2021-12-01 |
| 12 | 202121044014-Proof of Right [11-01-2022(online)].pdf | 2022-01-11 |