
Method And System For Detecting Point Anomalies In Time Series Data

Abstract: A method (300) for detecting point anomalies in time-series data is disclosed. The method (300) includes receiving time series data (208) of a plurality of time units; allocating each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data (208); for each time bucket, determining a moving average and corresponding variance of a set of points allocated in the time bucket; calculating a distance metric of each of the set of points in the time bucket based on the moving average; comparing the distance metric of each of the set of points with a pre-determined threshold value of the time bucket; for each point of the set of points, classifying the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison. [To be published with FIG. 3]


Patent Information

Application #
Filing Date
09 September 2025
Publication Number
39/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

HCL Technologies Limited
806, Siddharth, 96, Nehru Place, New Delhi, 110019, India

Inventors

1. Vedasamhitha Challapalli
Flat no 1505, Tower 12, NCC Urban One, near Narsingi Junction, Kokapet, Telangana, 500075, India
2. Rupesh Prasad
House No-1259, Part no-1, Gali No-86, Tri Nagar, New Delhi, 110035, India
3. Atul Singh
C406 SJR Brooklyn, ITPL Main Road Bengaluru, Karnataka, 560037, India
4. Arvind Maurya
B1/212, T-9, Silvercity-2, Sec-Pi2, Greater Noida, Uttar Pradesh, 201310, India
5. Archana Ganji
D/O Rachappa, #5128/1, kanyal agasi, Betgeri, Gadag, Karnataka, 582102, India
6. Swathika Nivedini
No: 141, Eraniyan Nagar, V.A.O Street, Opposite Shakthi School, Sirkali, Mayiladuthurai dist, Tamil Nadu, 609110, India
7. Navin Sabharwal
N 3A Jangpura Extension New Delhi, 110014, India

Specification

DESCRIPTION
Technical Field
This disclosure relates generally to anomaly detection, and more particularly to a method and system for detecting point anomalies in time-series data.
Background
Anomaly detection plays an important role in identifying unusual events or observations in data that statistically differ significantly from the rest of the data. Such unusual events or observations are anomalies that may indicate potential problems, such as credit card fraud, system failures, or cyberattacks. In time series data, the anomalies may be real anomalies or contextual anomalies (or contextual spikes). The contextual anomalies may correspond to data points that deviate significantly within a specific context but appear normal outside of that context. In other words, the contextual anomalies may only be anomalous within some specific contexts/situations. For example, if users of a particular machine typically start their day at 10 AM, and a CPU usage of 30% is observed for the machine around 10 AM, then the CPU usage may not indicate an anomaly. However, if the CPU usage of 30% is observed for the machine at 6 AM (i.e., after business hours), the CPU usage may indicate an anomaly and may raise suspicion of a cyberattack on the machine.
In the present state of the art, various time series techniques are being used to detect contextual anomalies, such as ARIMA, Kalman Filters, and Facebook® Prophet. These techniques may identify trends and deviations in the time series data. However, the existing techniques require up to 6 cycles of seasonality data to be trained to capture seasonality in data. Moreover, the existing techniques may require hyperparameter tuning and prior assumptions about data distribution, which may not be feasible in dynamic and evolving environments.
By way of an example, a goal may be to detect contextual anomalies within device usage data (e.g., CPU or memory) on devices with limited resources. The computational requirements of training the time series techniques on the device usage data may exceed the capabilities of resource-constrained environments, such as a consumer laptop. A static threshold is typically used to address the resource limitations. However, this approach fails to capture contextual anomalies.
The present invention is directed to overcome one or more limitations stated above or any other limitations associated with the known arts.

SUMMARY
In one embodiment, a method for detecting point anomalies in time-series data is disclosed. In one example, the method may include receiving time series data of a plurality of time units. It should be noted that the time series data may include a plurality of points of each of one or more variables. It should also be noted that each of the plurality of time units may include a set of predefined time buckets. The method may further include allocating each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data. For each time bucket of the set of time buckets, the method may further include determining a moving average and a corresponding variance of a set of points allocated in the time bucket. It should be noted that the moving average is one of a sliding window average and an Exponentially Weighted Moving Average (EWMA). The method may further include calculating a distance metric of each of the set of points in the time bucket based on the moving average. The method may further include comparing the distance metric of each of the set of points with a pre-determined threshold value of the time bucket. It should be noted that the pre-determined threshold value is based on the moving average and the variance. For each point of the set of points, the method may further include classifying the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison. It should be noted that the point anomaly corresponds to an outlier point in a data distribution of the set of points.
In another embodiment, a system for detecting point anomalies in time-series data is disclosed. In one example, the system may include a processor and a computer-readable medium communicatively coupled to the processor. The computer-readable medium may store processor-executable instructions, which, on execution, may cause the processor to receive time series data of a plurality of time units. It should be noted that the time series data may include a plurality of points of each of one or more variables. It should also be noted that each of the plurality of time units may include a set of predefined time buckets. The processor-executable instructions, on execution, may further cause the processor to allocate each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data. For each time bucket of the set of time buckets, the processor-executable instructions, on execution, may further cause the processor to determine a moving average and a corresponding variance of a set of points allocated in the time bucket. It should be noted that the moving average is one of a sliding window average and an EWMA. The processor-executable instructions, on execution, may further cause the processor to calculate a distance metric of each of the set of points in the time bucket based on the moving average. The processor-executable instructions, on execution, may further cause the processor to compare the distance metric of each of the set of points with a pre-determined threshold value of the time bucket. It should be noted that the pre-determined threshold value is based on the moving average and the variance. For each point of the set of points, the processor-executable instructions, on execution, may further cause the processor to classify the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison.
It should be noted that the point anomaly corresponds to an outlier point in a data distribution of the set of points.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles.
FIG. 1 is a block diagram of an exemplary system for detecting point anomalies in time-series data, in accordance with some embodiments of the present disclosure.
FIG. 2 illustrates a functional block diagram of a system for detecting point anomalies in time-series data, in accordance with some embodiments of the present disclosure.
FIG. 3 illustrates a flow diagram of an exemplary process for detecting point anomalies in time-series data, in accordance with some embodiments of the present disclosure.
FIG. 4 illustrates a flow diagram of an exemplary process for classifying a new point as point anomaly or non-anomaly in a time bucket, in accordance with some embodiments of the present disclosure.
FIG. 5A illustrates an exemplary confusion matrix showing results of anomaly detection analysis for contextual spikes in a univariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 5B illustrates an exemplary confusion matrix showing results of anomaly detection analysis for real anomalies in a univariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 6A illustrates an exemplary confusion matrix showing results of anomaly detection analysis for contextual spikes in a multivariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 6B illustrates an exemplary confusion matrix showing results of anomaly detection analysis for real anomalies in a multivariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 7A illustrates an exemplary comparison table representing performance data of anomaly detection analysis with pre-existing anomaly detection analysis in the univariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 7B illustrates an exemplary comparison table representing performance data of anomaly detection analysis with pre-existing anomaly detection analysis in the multivariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 8A illustrates an exemplary comparison table representing training and inference time of anomaly detection analysis with pre-existing anomaly detection analysis in the univariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 8B illustrates an exemplary comparison table representing training and inference time of anomaly detection analysis with pre-existing anomaly detection analysis in the multivariate scenario, in accordance with some embodiments of the present disclosure.
FIG. 9 is a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure.
DETAILED DESCRIPTION
Exemplary embodiments are described with reference to the accompanying drawings. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope and spirit being indicated by the following claims.
Referring now to FIG. 1, an exemplary system 100 for detecting point anomalies in time-series data is illustrated, in accordance with some embodiments of the present disclosure. The system 100 may include a computing device 102. The computing device 102 may be, for example, but may not be limited to, server, desktop, laptop, notebook, netbook, tablet, smartphone, mobile phone, or any other computing device, in accordance with some embodiments of the present disclosure. The computing device 102 may classify points as one of a point anomaly or a non-anomaly in time series data using time buckets based on moving average and variance. The time series data may be univariate or multivariate. If any new point may be added to previously existing points in the time series data, the computing device 102 may update the moving average and the variance based on the new point. Further, the computing device 102 may classify the new point as one of the point anomaly or the non-anomaly in the time bucket based on the updated moving average and the variance.
As will be described in greater detail in conjunction with FIGS. 2 – 9, the computing device 102 may receive time series data of a plurality of time units. By way of an example, a time unit may be a day, a week, a month, or the like. It should be noted that the time series data may include a plurality of points of one or more variables. It should also be noted that each of the plurality of time units may include a set of predefined time buckets. The computing device 102 may further allocate each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data. For each time bucket of the set of time buckets, the computing device 102 may further determine a moving average and a corresponding variance of a set of points allocated in the time bucket. It should be noted that the moving average is one of a sliding window average and an Exponentially Weighted Moving Average (EWMA). The computing device 102 may further calculate a distance metric of each of the set of points in the time bucket based on the moving average. The computing device 102 may further compare the distance metric of each of the set of points with a pre-determined threshold value of the time bucket. It should be noted that the pre-determined threshold value is based on the moving average and the variance. For each point of the set of points, the computing device 102 may further classify the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison. It should be noted that the point anomaly corresponds to an outlier point in a data distribution of the set of points.
In some embodiments, the computing device 102 may include one or more processors 104 and a memory 106. Further, the memory 106 may store instructions that, when executed by the one or more processors 104, may cause the one or more processors 104 to detect point anomalies in time-series data, in accordance with aspects of the present disclosure. The memory 106 may also store various data (for example, time series data, a plurality of points, distance metric, moving average, variance, a set of time buckets, anomaly points, non-anomaly points, and the like) that may be captured, processed, and/or required by the system 100. The memory 106 may be a non-volatile memory (e.g., flash memory, Read Only Memory (ROM), Programmable ROM (PROM), Erasable PROM (EPROM), Electrically EPROM (EEPROM) memory, etc.) or a volatile memory (e.g., Dynamic Random Access Memory (DRAM), Static Random-Access memory (SRAM), etc.).
The system 100 may further include a display 108. The system 100 may interact with a user interface 110 accessible via the display 108. The system 100 may also include one or more external devices 112. In some embodiments, the computing device 102 may interact with the one or more external devices 112 over a communication network 114 for sending or receiving various data. The communication network 114 may include, for example, but may not be limited to, a wireless fidelity (Wi-Fi) network, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the internet, a fiber optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, and a combination thereof. The one or more external devices 112 may include, but may not be limited to, a remote server, a laptop, a netbook, a notebook, a smartphone, a mobile phone, a tablet, or any other computing device.
Referring now to FIG. 2, a functional block diagram of a system 200 for detecting point anomalies in time-series data is illustrated, in accordance with some embodiments of the present disclosure. FIG. 2 is explained in conjunction with FIG. 1. The system 200 may be analogous to the system 100. The system 200 may implement the computing device 102. The system 200 may include, within the memory 106, a time series data receiving module 202, a bucketing module 204, and a classifying module 206.
Initially, the time series data receiving module 202 may receive time series data 208 of a plurality of time units (for example, days, weeks, months). It should be noted that the time series data 208 may include a plurality of points of each of one or more variables. Each of the plurality of points may include a variable value and a corresponding time value in the time series data 208. As will be appreciated, the time series data 208 may be univariate data when the time series data 208 includes the plurality of points of one variable. The time series data 208 may be multivariate data when the time series data 208 includes the plurality of points of each of two or more variables.
The time series data 208 may be obtained from a plurality of data sources. By way of an example, the one or more variables in the time series data 208 may correspond to one or more computational metrics (such as, CPU load, CPU utilization, CPU time, memory usage, flops, clock, idle time, GPU utilization, GPU memory usage, power, memory bandwidth, etc.) obtained from a plurality of computing devices in an organization. It should also be noted that each of the plurality of time units may be divided into a set of predefined time buckets of a fixed size (for example, 20 minutes, 30 minutes, 40 minutes, 60 minutes, or the like). By way of an example, system usage data of 30 days (i.e., time units) may be received from a laptop in form of the time series data 208. The system usage data of each day may be further divided into time buckets of 30 minutes each (i.e., 48 time buckets per day). Further, the time series data receiving module 202 may send the time series data 208 to the bucketing module 204.
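As an illustrative sketch (not the claimed implementation), mapping a timestamped point to one of the 48 half-hour buckets of its day reduces to integer arithmetic on the minutes elapsed since midnight; the function name and default bucket size below are assumptions for illustration:

```python
from datetime import datetime

def bucket_index(ts: datetime, bucket_minutes: int = 30) -> int:
    """Map a timestamp to its time bucket within the day.

    With 30-minute buckets this yields indices 0..47 (48 buckets per day).
    """
    minutes_into_day = ts.hour * 60 + ts.minute
    return minutes_into_day // bucket_minutes

# A point observed at 10:15 falls into bucket 20 of its day.
```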
Further, the bucketing module 204 may allocate each of the plurality of points in a corresponding time bucket from the set of time buckets, based on the associated time value of the point in the time series data 208. Further, for each time bucket of the set of time buckets, the bucketing module 204 may determine a moving average and a corresponding variance of a set of points allocated in the time bucket. It should be noted that when the time series data 208 is univariate data, the moving average may be one of a sliding window average and an EWMA. When the time series data 208 is multivariate data, the moving average may be an EWMA mean vector. In some embodiments, if any new point may be added in the set of points, the bucketing module 204 may update the moving average and the variance based on the new point.
When the moving average is the sliding window average, the sliding window average may be the average of points over a fixed-size time window (e.g., the last ‘n’ days, where ‘n’ corresponds to 30 days). When a new point is added to the set of points, the bucketing module 204 may update the sliding window average and the corresponding sliding window variance instead of recalculating the full sum at each step. A smart adjustment formula may be used to update the average by considering only the outgoing and incoming values in the window. Suppose a user wants to compute the sliding window average of the last ‘n’ values on day ‘t’ (i.e., from day ‘t − n + 1’ to day ‘t’). The traditional formula for calculating the sliding window average is described in equation (1).
\bar{x}_{t-n+1 \to t} = \frac{1}{n} \sum_{i=t-n+1}^{t} x_i    (1)
However, instead of computing the entire sum from scratch each time, the bucketing module 204 may use an efficient update rule to update the sliding window average based on the previous average using equation (2).
\bar{x}_{t-n+1 \to t} = \bar{x}_{t-n \to t-1} + \frac{x_t - x_{t-n}}{n}    (2)
Where, ‘x̄ₜ₋ₙ₊₁→ₜ’ corresponds to the current average (from day ‘t − n + 1’ to day ‘t’), ‘x̄ₜ₋ₙ→ₜ₋₁’ corresponds to the previous average (from day ‘t − n’ to day ‘t − 1’), ‘xₜ₋ₙ’ corresponds to the value exiting the window, ‘xₜ’ corresponds to the new value entering the window, and ‘n’ corresponds to the window size (e.g., 30 days).
To efficiently calculate the corresponding sliding window variance at time ‘t’ over the window of size ‘n’ for current window (i.e., from day ‘t – n + 1’ to day ‘t’), the pre-determined average (i.e., ‘μₜ₋ₙ→ₜ₋₁’) and variance (i.e., ‘σ²ₜ₋ₙ→ₜ₋₁’) of the previous window (from day ‘t – n’ to day ‘t – 1’) may be used. In other words, to update the sliding window average and the corresponding sliding window variance, the bucketing module 204 may add the new point to the set of points in the time bucket. Further, the bucketing module 204 may simultaneously remove an oldest of the set of points from the time bucket to obtain an updated set of points. Upon obtaining the updated set of points, the bucketing module 204 may determine an updated sliding window average and an updated sliding window variance for the updated set of points in the time bucket.
A general algebraic form of variance is stated in equation (3).
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \mu^2    (3)
Where, ‘μ’ corresponds to the mean over the time window and ‘Σxᵢ²’ corresponds to the sum of squares.
Thus, to calculate the variance at time ‘t’ for the current window, equation (4) may be used.
\sigma^2_{t-n+1 \to t} = \frac{1}{n} \sum_{i=t-n+1}^{t} x_i^2 - \mu^2_{t-n+1 \to t}    (4)
Thus, two components are required to calculate the variance at time ‘t’. The first component is the sum of squares of the plurality of points in the current time window. The second component is the average at time ‘t’, which may be calculated using equation (2). To calculate the sum of squares of the plurality of points in the current time window, equation (5) may be used.
\sum_{i=t-n+1}^{t} x_i^2 = \sum_{i=t-n}^{t-1} x_i^2 + x_t^2 - x_{t-n}^2    (5)
Thus, for an initial window, the sum of squares may have to be calculated by individually adding the squares of each of the plurality of points in that window. However, for the next window, the previously determined sum of squares of the initial window may be updated by using the equation (5) by adding the square of a newly added point in the next window and subtracting the square of a first point of the initial window. In other words, from the value of sum of squares at time ‘t-1’, a new value entering the window may be included and the value exiting the window may be excluded, to obtain the sum of squares at time ‘t’ (i.e., the updated sum of squares).
Further, the updated sum of squares (obtained from the equation (5)) and the new mean (obtained from the equation (2)) may be substituted into the equation (4). Thus, a final variance update formula may be obtained as described in equation (6).
\sigma^2_{t-n+1 \to t} = \frac{1}{n} \left( \sum_{i=t-n}^{t-1} x_i^2 + x_t^2 - x_{t-n}^2 \right) - \mu^2_{t-n+1 \to t}    (6)
The sliding window average may be accurate for fixed-window analysis. Additionally, the sliding window average may be intuitive and simple to implement.
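The incremental updates of equations (2), (5), and (6) can be sketched in a few lines of Python; the class below is a hypothetical illustration under the stated formulas, not the patented module:

```python
from collections import deque

class SlidingWindowStats:
    """Running mean and variance over a fixed-size window.

    Mirrors equations (2), (5) and (6): the running sum and the running
    sum of squares are adjusted only by the value entering and the value
    leaving the window, so no full recomputation is needed.
    """

    def __init__(self, n: int):
        self.n = n
        self.window = deque()
        self.total = 0.0
        self.sum_sq = 0.0

    def update(self, x: float) -> tuple[float, float]:
        self.window.append(x)
        self.total += x
        self.sum_sq += x * x
        if len(self.window) > self.n:        # evict the value exiting the window
            old = self.window.popleft()
            self.total -= old
            self.sum_sq -= old * old
        k = len(self.window)
        mean = self.total / k
        variance = self.sum_sq / k - mean * mean   # equation (3) form
        return mean, variance
```

Each update is O(1) in time regardless of the window size, which matches the resource-constrained setting the disclosure targets.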
When the moving average is the EWMA, initially, a simple average of the first ‘n’ values may be calculated as a baseline. Further, the simple average may be updated as the EWMA using equation (7).
\mu_t = \alpha x_t + (1 - \alpha)\, \mu_{t-1}    (7)
Where, ‘xₜ’ corresponds to the observed value at time ‘t’ (i.e., the value of a point at the time value ‘t’), ‘μₜ’ corresponds to the EWMA at time ‘t’, ‘μₜ₋₁’ corresponds to the EWMA at time ‘t − 1’, and ‘α’ corresponds to a predefined smoothing factor (0 < α < 1). It should be noted that a value of ‘α’ close to ‘1’ may give more weight to recent values, while a smaller ‘α’ may emphasize historical trends.
Further, the EWMA variance may be calculated as a weighted average of past squared deviations from the (EWMA) mean using equation (8).
\sigma_t^2 = \alpha (x_t - \mu_{t-1})^2 + (1 - \alpha)\, \sigma_{t-1}^2    (8)
Where, ‘xₜ’ corresponds to the newly observed value at time ‘t’, ‘μₜ₋₁’ corresponds to the EWMA at time ‘t − 1’ (i.e., the previous EWMA), ‘σₜ²’ corresponds to the exponentially weighted variance at time ‘t’, and ‘α’ corresponds to the predefined smoothing factor (0 < α < 1).
In other words, when the moving average is the EWMA, the bucketing module 204 may update the EWMA and a corresponding Exponentially Weighted Variance (EWV). To update the EWMA and the corresponding EWV, the bucketing module may determine the updated EWMA based on the new point, the determined EWMA, and a predefined smoothing factor. Further, the bucketing module 204 may determine the updated EWV based on the new point, the determined EWMA, the determined variance, and the predefined smoothing factor.
Advantageously, the EWMA may be time efficient and memory efficient. Also, the EWMA may be ideal for adaptive and streaming/online systems (e.g., spot real-time CPU usage anomaly on a server), where responsiveness and memory efficiency are critical.
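A minimal sketch of one EWMA step per equations (7) and (8); the function name and default ‘α’ are assumptions for illustration:

```python
def ewma_update(x_t: float, mu_prev: float, var_prev: float,
                alpha: float = 0.3) -> tuple[float, float]:
    """One EWMA step: equation (8) is applied first because it uses the
    previous mean, then equation (7) updates the mean."""
    var_t = alpha * (x_t - mu_prev) ** 2 + (1 - alpha) * var_prev  # eq. (8)
    mu_t = alpha * x_t + (1 - alpha) * mu_prev                     # eq. (7)
    return mu_t, var_t
```

Note that the variance update in equation (8) deliberately uses ‘μₜ₋₁’, so it must be evaluated before the mean is overwritten; only the previous mean and variance need to be stored, which is what makes the EWMA memory efficient.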
When the time series data 208 is multivariate data, an EWMA mean vector may be calculated for the plurality of points in the window at time ‘t’ using equation (9).
\mu_t = \alpha x_t + (1 - \alpha)\, \mu_{t-1}    (9)
Where, ‘μₜ’ corresponds to the EWMA mean vector at time ‘t’, ‘xₜ’ corresponds to the newly observed value (vector) at time ‘t’, ‘μₜ₋₁’ corresponds to the EWMA mean vector at time ‘t − 1’ (i.e., the previous EWMA mean vector), and ‘α’ corresponds to the predefined smoothing factor (0 < α < 1). This may ensure that recent values in the time series data 208 have more influence while historical trends are also retained.
Further, an EWMA covariance matrix may be calculated for the plurality of points in the window at time ‘t’ using equation (10).
\Sigma_t = \alpha (x_t - \mu_{t-1})(x_t - \mu_{t-1})^T + (1 - \alpha)\, \Sigma_{t-1}    (10)
Where, ‘Σₜ’ corresponds to the EWMA covariance matrix at time ‘t’, ‘xₜ’ corresponds to the newly observed value (vector) at time ‘t’, ‘μₜ₋₁’ corresponds to the EWMA mean vector at time ‘t − 1’ (i.e., the previous EWMA mean vector), and ‘α’ corresponds to the predefined smoothing factor (0 < α < 1). It should be noted that the outer product of the deviation vector with itself (i.e., (xₜ − μₜ₋₁)(xₜ − μₜ₋₁)ᵀ) is used in equation (10) to capture both variance and cross-variable covariance, adjusting smoothly over time.
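Equations (9) and (10) translate directly to NumPy, with the outer product of the deviation vector supplying the rank-one covariance update; the names below are illustrative assumptions:

```python
import numpy as np

def ewma_multivariate_update(x_t: np.ndarray, mu_prev: np.ndarray,
                             cov_prev: np.ndarray, alpha: float = 0.3):
    """One step of equations (9) and (10) for a multivariate point."""
    dev = (x_t - mu_prev).reshape(-1, 1)                     # column deviation vector
    cov_t = alpha * (dev @ dev.T) + (1 - alpha) * cov_prev   # equation (10)
    mu_t = alpha * x_t + (1 - alpha) * mu_prev               # equation (9)
    return mu_t, cov_t
```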
Further, the bucketing module 204 may calculate a distance metric of each of the set of points in the time bucket based on the moving average. When the time series data 208 corresponds to the univariate data, the distance metric may be the Z-score distance. When the time series data 208 corresponds to the multivariate data, the distance metric may be the Mahalanobis distance.
Thus, when the time series data 208 corresponds to the univariate data, the bucketing module 204 may calculate the Z-score for each of the set of points in each time bucket based on the moving average. The Z-score may standardize data points by measuring how far a point is from the mean (i.e., the moving average) of the distribution in terms of standard deviations. The Z-score may assume the data follows a normal distribution and highlights deviations that are significantly different from the expected range. This makes the Z-score more effective for detecting point anomalies in the univariate (i.e., single-dimensional) data with a single parameter. The Z-score may be calculated using the equation (11).
Z = \frac{X - \mu}{\sigma}    (11)
Where, ‘X’ corresponds to the data point being evaluated within the time bucket, ‘μ’ corresponds to the mean (i.e., the moving average) of the time bucket, and ‘σ’ corresponds to the standard deviation of the time bucket.
When the time series data 208 corresponds to the multivariate data, the bucketing module 204 may calculate the Mahalanobis distance of each of the set of points in each time bucket based on the moving average. The Mahalanobis distance may account for correlations between features and adjust for different variances. The Mahalanobis distance may measure how far a point is from the mean (i.e., the moving average) of a distribution while considering the overall shape of the data. This makes the Mahalanobis distance capable of capturing both variance and cross-variable covariance for detecting anomalies in high-dimensional data with multiple parameters. The Mahalanobis distance may be calculated using equation (12).
d_M(x) = \sqrt{(x - \mu)^T S^{-1} (x - \mu)}    (12)
Where, ‘x’ corresponds to the data point being evaluated (a vector of multiple variables) in the time bucket, ‘μ’ corresponds to the mean vector (i.e., moving average vector) of the time bucket, ‘S’ corresponds to the covariance matrix of the time bucket, and ‘S⁻¹’ corresponds to the inverse of the covariance matrix.
The Mahalanobis distance may capture correlations considering the covariance between features. Additionally, the Mahalanobis distance may adapt to changes in the data distribution and accurately detect anomalies in high dimensional data.
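Equation (12) in NumPy, as a minimal sketch (a production version would guard against a singular covariance matrix, e.g., via the pseudo-inverse):

```python
import numpy as np

def mahalanobis_distance(x: np.ndarray, mu: np.ndarray,
                         cov: np.ndarray) -> float:
    """Mahalanobis distance of point x from mean mu, per equation (12)."""
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))
```

With an identity covariance matrix the Mahalanobis distance reduces to the ordinary Euclidean distance, which is a convenient sanity check.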
Further, the bucketing module 204 may compare the distance metric of each of the set of points with a pre-determined threshold value of the time bucket. It should be noted that the pre-determined threshold value is based on the moving average and the variance.
Further, for each point of the set of points, the classifying module 206 may classify the point as one of a point anomaly (i.e., a real anomaly) or a non-anomaly (i.e., a contextual spike) in the time bucket based on the comparison. It should be noted that the point anomaly corresponds to an outlier point in a data distribution of the set of points. The point anomaly may be a genuine deviation from the expected behavior of the system 200. The point anomaly may not follow any pattern (i.e., it may occur at any random point in time). The point anomaly may indicate potential issues that require immediate attention. For example, the potential issues may include, but are not limited to, system failures, security threats, abnormal resource consumption, or other critical problems.
Additionally, when a new point is added to the set of points, the bucketing module 204 may determine an updated threshold value of the time bucket based on the updated moving average and the updated variance. Further, the classifying module 206 may classify the new point as one of the point anomaly or the non-anomaly in the time bucket based on the comparison with the updated threshold value.
When the time series data 208 corresponds to the univariate data, the pre-determined threshold value may be selected based on the statistical properties of the data distribution. The Z-score may measure how far a data point deviates from the mean in terms of standard deviations. For example, ‘±3 Sigma Rule’ may be used to select the pre-determined threshold value.
As will be appreciated, in a normal distribution, most of the data points may be concentrated around the mean and their spread is determined by the standard deviation (σ). According to the empirical rule for normal distribution, 68.27% of the data points lie within ±1σ, 95.45% of the data points lie within ±2σ, and 99.73% of the data points lie within ±3σ.
Thus, as per the ‘±3 Sigma Rule’, since only 0.27% of data points lie beyond ±3σ, values outside this range may be considered as point anomalies. Hence, the pre-determined threshold value range may be taken as ‘μ ± 3σ’.
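The ‘±3 Sigma Rule’ reduces to a one-line check on the Z-score of equation (11); the threshold ‘k = 3’ below is the rule’s default and the function name is a hypothetical illustration:

```python
def is_univariate_anomaly(x: float, mu: float, sigma: float,
                          k: float = 3.0) -> bool:
    """Flag x as a point anomaly if it falls outside mu +/- k*sigma,
    i.e., if |Z| > k per equation (11)."""
    z = (x - mu) / sigma
    return abs(z) > k
```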
When the time series data 208 corresponds to the multivariate data, point anomalies are detected based on the Mahalanobis distance. As will be appreciated, the squared Mahalanobis distance follows a chi-square (χ²) distribution. In particular, if the time series data 208 follows a multivariate normal distribution, then the squared Mahalanobis distance follows the chi-square (χ²) distribution with degrees of freedom ‘d’ equal to the number of features (or variables) in the time series data 208.
Thus, the pre-determined threshold value may be obtained based on the chi-square distribution. Any point with squared Mahalanobis distance greater than the threshold chi-square value may be considered a point anomaly. The pre-determined threshold value is set at a high confidence level, e.g., 99.7% (α = 0.003). At this confidence interval, approximately 0.3% of the extreme points may be considered as outliers (or point anomalies). Thus, the classifying module 206 may provide an output 210 including bucketed time series data and point anomalies detected (if any) for each time bucket.
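A minimal sketch of the Mahalanobis-based thresholding described above, assuming NumPy and SciPy are available; all names and the sample data are assumptions of this sketch:

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_anomalies(X, confidence=0.997):
    """Flag rows whose squared Mahalanobis distance exceeds the chi-square
    threshold at the given confidence level (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    cov_inv = np.linalg.inv(cov)
    diff = X - mu
    # Squared Mahalanobis distance of each row: diff_i^T * cov_inv * diff_i
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
    # Chi-square threshold with d = number of features degrees of freedom
    threshold = chi2.ppf(confidence, df=X.shape[1])
    return d2 > threshold

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 4))        # 4 features, e.g. CPU/memory/swap/disk
X[-1] = [8.0, 8.0, 8.0, 8.0]          # one injected multivariate outlier
flags = mahalanobis_anomalies(X)
```

At the 99.7% confidence level only a fraction of a percent of the inlying rows would be flagged, while the injected outlier exceeds the threshold by a wide margin.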

Through the use of time buckets, any recurring fluctuations in system metrics (such as, Central Processing Unit (CPU) or memory usage) that occur within a specific time bucket may not be classified as point anomalies. This is because these spikes are not necessarily anomalies in the broader sense but may instead be expected variations due to routine system behavior, scheduled tasks, or user activity patterns.

It should be noted that all such aforementioned modules 202 – 206 may be represented as a single module or a combination of different modules. Further, as will be appreciated by those skilled in the art, each of the modules 202 – 206 may reside, in whole or in parts, on one device or multiple devices in communication with each other. In some embodiments, each of the modules 202 – 206 may be implemented as dedicated hardware circuit comprising custom application-specific integrated circuit (ASIC) or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. Each of the modules 202 – 206 may also be implemented in a programmable hardware device such as a field programmable gate array (FPGA), programmable array logic, programmable logic device, and so forth. Alternatively, each of the modules 202 – 206 may be implemented in software for execution by various types of processors (e.g., processor 104). An identified module of executable code may, for instance, include one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, function, or other construct. Nevertheless, the executables of an identified module or component need not be physically located together but may include disparate instructions stored in different locations which, when joined logically together, include the module and achieve the stated purpose of the module. Indeed, a module of executable code could be a single instruction, or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices.
As will be appreciated by one skilled in the art, a variety of processes may be employed for detecting point anomalies in time-series data. For example, the exemplary system 100 and the associated computing device 102 may detect point anomalies in time-series data by the processes discussed herein. In particular, as will be appreciated by those of ordinary skill in the art, control logic and/or automated routines for performing the techniques and steps described herein may be implemented by the system 100 and the associated computing device 102 either by hardware, software, or combinations of hardware and software. For example, suitable code may be accessed and executed by the one or more processors on the system 100 to perform some or all of the techniques described herein. Similarly, application specific integrated circuits (ASICs) configured to perform some or all of the processes described herein may be included in the one or more processors on the system 100.
Referring now to FIG. 3, an exemplary process 300 for detecting point anomalies in time-series data is depicted via a flow chart, in accordance with some embodiments of the present disclosure. FIG. 3 is explained in conjunction with FIG. 1 and FIG. 2. The process 300 may be implemented by the computing device 102 of the system 100. In some embodiments, the process 300 may include receiving, by a time series data receiving module (such as the time series data receiving module 202), time series data (such as the time series data 208) of a plurality of time units, at step 302. It should be noted that the time series data may include a plurality of points of each of one or more variables. It should also be noted that each of the plurality of time units may include a set of predefined time buckets. Upon receiving the time series data, the process 300 may include allocating, by a bucketing module (such as the bucketing module 204), each of the plurality of points in a corresponding time bucket from the set of time buckets, based on the associated time value in the time series data, at step 304.
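The allocation at steps 302 and 304 may be sketched as follows, assuming per-sample timestamps and a 30-minute bucket width; the function names are assumptions of this sketch:

```python
from datetime import datetime
from collections import defaultdict

BUCKET_MINUTES = 30  # yields 48 fixed buckets per day

def bucket_of(ts: datetime) -> int:
    """Map a timestamp to its time-bucket index within the day."""
    return (ts.hour * 60 + ts.minute) // BUCKET_MINUTES

def allocate(points):
    """points: iterable of (timestamp, value); returns bucket index -> values."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[bucket_of(ts)].append(value)
    return buckets

# Points from different days landing in the same daily bucket are pooled
samples = [(datetime(2025, 1, 1, 9, 5), 41.0),
           (datetime(2025, 1, 2, 9, 20), 44.0),
           (datetime(2025, 1, 1, 16, 1), 97.0)]
buckets = allocate(samples)
```

Note that bucketing keys on the time-of-day only, so the 09:05 and 09:20 samples from different days share one bucket.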
Further, for each time bucket of the set of time buckets, the process 300 may include determining, by the bucketing module, a moving average and a corresponding variance of a set of points allocated in the time bucket, at step 306. It may be noted that the moving average is one of a sliding window average and an EWMA. This step is explained in greater detail in conjunction with FIG. 4 and FIG. 5.
Upon determining the moving average and the corresponding variance, the process 300 may include calculating, by the bucketing module, a distance metric of each of the set of points in the time bucket based on the moving average, at step 308. Further, the process 300 may include comparing, by the bucketing module, the distance metric of each of the set of points with a pre-determined threshold value of the time bucket, at step 310. It should be noted that the pre-determined threshold value is based on the moving average and the variance.
Further, for each point of the set of points, the process 300 may include classifying, by a classifying module (such as the classifying module 206), the point as one of a point anomaly or a non-anomaly (such as the output 210) in the time bucket based on the comparison, at step 312. It should be noted that the point anomaly corresponds to an outlier point in a data distribution of the set of points.
Referring now to FIG. 4, an exemplary process 400 for classifying a new point as a point anomaly or a non-anomaly in a time bucket is depicted via a flow chart, in accordance with some embodiments of the present disclosure. FIG. 4 is explained in conjunction with FIGS. 1, 2, and 3. The process 400 may be implemented by the computing device 102 of the system 100. In some embodiments, when any new point is added to a set of points, the process 400 may include updating, by a bucketing module (such as the bucketing module 204), a moving average and a variance based on the new point, at step 402.
When the moving average is a sliding window average, the process 400 may include adding, by the bucketing module, the new point to the set of points in a time bucket, at step 404. Upon adding the new point to the set of points, the process 400 may include simultaneously removing, by the bucketing module, an oldest of the set of points from the time bucket to obtain an updated set of points, at step 406. Further, the process 400 may include determining, by the bucketing module, an updated sliding window average and an updated sliding window variance for the updated set of points in the time bucket, at step 408.
When the moving average is an EWMA, the process 400 may include updating the EWMA and a corresponding Exponentially Weighted Variance (EWV), at step 410. The step 410 may include steps 412 and 414. To update the EWMA and the EWV, the process 400 may include determining, by the bucketing module, the updated EWMA based on the new point, the determined EWMA, and a predefined smoothing factor, at step 412. Further, the process 400 may include determining, by the bucketing module, the updated EWV based on the new point, the determined EWMA, the determined variance, and the predefined smoothing factor, at step 414.
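Steps 412 and 414 may be sketched using the common EWMA and exponentially weighted variance recurrences; the smoothing factor value and initial state are assumptions of this sketch:

```python
def update_ewma_ewv(x, ewma, ewv, alpha=0.1):
    """One incremental update of the exponentially weighted moving average
    (EWMA) and exponentially weighted variance (EWV) for a new point x.
    alpha is the predefined smoothing factor (illustrative sketch)."""
    delta = x - ewma
    new_ewma = ewma + alpha * delta                     # step 412
    new_ewv = (1.0 - alpha) * (ewv + alpha * delta * delta)  # step 414
    return new_ewma, new_ewv

# Starting state and a few new points (assumed values)
ewma, ewv = 50.0, 4.0
for x in [51.0, 49.5, 50.2]:
    ewma, ewv = update_ewma_ewv(x, ewma, ewv, alpha=0.1)
```

Unlike the sliding window variant, this update needs only the previous EWMA and EWV, not the stored set of points, so its memory cost per bucket is constant.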
Upon updating the moving average and the variance, the process 400 may include determining, by the bucketing module, an updated threshold value of the time bucket based on the updated moving average and the updated variance, at step 416. Further, the process 400 may include classifying, by a classifying module (such as the classifying module 206), the new point as one of a point anomaly or a non-anomaly (such as the output 210) in the time bucket based on a comparison with the updated threshold value, at step 418.
EXAMPLES
By way of an example, to evaluate anomaly detection techniques, a system usage dataset (analogous to the time series data 208) was collected from a laptop over a 4-hour period (i.e., a time unit). The system usage dataset includes various system metrics, such as CPU usage, memory usage, swap usage, and network activity (bytes sent/received). For the univariate experiments, only the CPU usage data were used. On the other hand, all system usage data (i.e., the CPU usage, the memory usage, the swap usage, and the network activity) were used for the multivariate experiments. To simulate a longer operational period, the 4-hour data was extrapolated to 30 days, creating a continuous system usage dataset. Further, anomalies were artificially introduced to test detection accuracy.
Two types of anomalies were injected into the univariate and multivariate data: contextual spikes (i.e., contextual anomalies) and real anomalies (i.e., point anomalies). The contextual spike anomalies are time-dependent and occur daily within specific time windows (i.e., time buckets). For example, the contextual spike anomalies may occur between 09:55 and 10:05 and between 15:55 and 16:05. As will be appreciated, in an ideal anomaly detection scenario, the contextual spikes should not be identified as point anomalies since they are repeated every day. In other words, when the context of the spikes is considered, spikes that repeat around the same time bucket (or window) every day are not point anomalies.
The real anomalies were randomly distributed across the 30-day system usage dataset. Additionally, the real anomalies may represent the abrupt and unexpected changes that are independent of time patterns.
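The injection scheme described above may be sketched as follows, assuming a per-minute series and the spike windows mentioned earlier; the magnitudes, counts, and names are assumptions of this sketch:

```python
import numpy as np

def inject_anomalies(series, minutes, seed=0):
    """Inject daily contextual spikes (09:55-10:05 and 15:55-16:05) and a few
    randomly placed real anomalies into a per-minute series (illustrative)."""
    rng = np.random.default_rng(seed)
    out = series.copy()
    # Contextual spikes: same two 10-minute windows every day (time-dependent)
    tod = minutes % 1440  # minute of day
    contextual = ((tod >= 595) & (tod < 605)) | ((tod >= 955) & (tod < 965))
    out[contextual] += 40.0
    # Real anomalies: abrupt values at random, time-independent positions
    real_idx = rng.choice(len(series), size=5, replace=False)
    out[real_idx] = 99.0
    return out, contextual, real_idx

minutes = np.arange(3 * 1440)              # three simulated days
base = np.full(minutes.shape, 30.0)
series, contextual, real_idx = inject_anomalies(base, minutes)
```

Here 09:55 corresponds to minute 595 of the day and 15:55 to minute 955, so each day contributes two 10-minute contextual windows, while the five real anomalies land anywhere in the 30-day equivalent span.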
In the univariate data, the real anomalies were inserted as extreme values of the CPU usage data. On the other hand, in the multivariate data, the real anomalies were inserted in the form of a plurality of combinations. The plurality of combinations included a high memory, low CPU usage combination; a high CPU usage, low memory, and disk spike combination; a low memory, high swap usage combination; an all resources maxed combination; and a CPU spike, memory unchanged combination.
The real anomalies of the high memory, low CPU usage combination were based on a rule that memory utilization spikes to 95-100%, while CPU utilization drops to 0-2%. The real anomalies of the high memory, low CPU usage combination represent cases where memory-intensive background processes run without significant CPU involvement. The real anomalies of the high CPU, low memory, and disk spike combination were based on a rule that CPU usage increases to 90-100%, memory drops to 0-10%, and disk activity spikes to 90-100%. The real anomalies of the high CPU, low memory, and disk spike combination simulate intensive computation tasks with high disk access and minimal memory usage. The real anomalies of the low memory, high swap usage combination were based on a rule that memory utilization drops to 0-5%, while swap usage rises to 90-100%. The real anomalies of the low memory, high swap usage combination represent memory exhaustion scenarios where the system relies heavily on swap memory. The real anomalies of the all resources maxed combination were based on a rule that CPU, memory, disk, and swap usage all rise to 98-100%. The real anomalies of the all resources maxed combination simulate system overload conditions where all resources are critically utilized. The real anomalies of the CPU spike, memory unchanged combination were based on a rule that CPU utilization rises to 85-100%, while memory fluctuates slightly within a ±10% range of its previous value. The real anomalies of the CPU spike, memory unchanged combination model short bursts of CPU-intensive processes without significant impact on memory.
The anomaly insertion in the system usage dataset allows for realistic testing of anomaly detection techniques by ensuring that injected anomalies resemble real-world irregularities rather than artificial patterns.
The system usage dataset was then divided into train and test sets (80:20 ratio, i.e., 80% of the dataset formed the train set and 20% of the dataset formed the test set). The time buckets were created. Each day of the 30 days was divided into 48 fixed intervals (i.e., time buckets) of 30 minutes. Each fixed interval started from midnight. The anomaly detection threshold value for each time bucket was determined from the train set by using all the points in the same time bucket of the 30 days as a single data distribution. For example, for the time bucket 9:00 AM to 9:30 AM, the points within this time period for all 30 days were taken as a single dataset and the anomaly detection threshold value was determined for that time bucket accordingly. The test set was bucketed according to the time buckets extracted from the train set, and anomalies were detected in the test set based on the anomaly detection threshold computed using the train set.
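The per-bucket threshold computation described above may be sketched as follows, assuming a univariate '±3 Sigma' band per bucket; all names and the synthetic data are assumptions of this sketch:

```python
import numpy as np

def bucket_thresholds(values, bucket_ids, k=3.0):
    """For each time bucket, pool all training points that fall in that bucket
    (across all days) and derive a mu +/- k*sigma band (illustrative sketch)."""
    thresholds = {}
    for b in np.unique(bucket_ids):
        pts = values[bucket_ids == b]
        mu, sigma = pts.mean(), pts.std()
        thresholds[b] = (mu - k * sigma, mu + k * sigma)
    return thresholds

def detect(values, bucket_ids, thresholds):
    """Flag test points that fall outside their own bucket's band."""
    flags = np.zeros(len(values), dtype=bool)
    for i, (v, b) in enumerate(zip(values, bucket_ids)):
        lo, hi = thresholds[b]
        flags[i] = v < lo or v > hi
    return flags

rng = np.random.default_rng(1)
train = rng.normal(50.0, 2.0, size=4800)      # 100 simulated days of 48 buckets
train_buckets = np.arange(4800) % 48
thr = bucket_thresholds(train, train_buckets)
test = np.array([50.5, 95.0])
test_b = np.array([0, 1])
flags = detect(test, test_b, thr)
```

Because the band is computed per bucket, a value that is ordinary for its own time-of-day window is not flagged, while an extreme value is flagged regardless of which bucket it falls in.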
Referring now to FIG. 5A, an exemplary confusion matrix 500A showing results of time bucket-based anomaly detection analysis for contextual spikes in univariate test set is illustrated, in accordance with some embodiments of the present disclosure. The confusion matrix 500A provides a comparison of true and false predictions of Z-score anomalies 502A with true and false inserted CPU spikes 504A among a plurality of data points of the CPU usage variable in the system usage dataset. The confusion matrix 500A includes a column for false predicted Z-score anomalies 506A and a column for true predicted Z-score anomalies 508A. Additionally, the confusion matrix 500A includes a row for false inserted CPU spikes 510A and a row for true inserted CPU spikes 512A. Thus, the rows define the actual data and the columns define the predicted data.
‘1679’ data points were the false predicted Z-score anomalies 506A and were also the false inserted CPU spikes 510A. In other words, ‘1679’ data points are ‘True Negatives’ (TN).
‘25’ data points were the true predicted Z-score anomalies 508A but were the false inserted CPU spikes 510A. In other words, ‘25’ data points are ‘False Positives’ (FP).
‘24’ data points were the false predicted Z-score anomalies 506A but were the true inserted CPU spikes 512A. In other words, ‘24’ data points are ‘False Negatives’ (FN).
‘0’ data points were the true predicted Z-score anomalies 508A and were also the true inserted CPU spikes 512A. In other words, ‘0’ data points are ‘True Positives’ (TP).
Thus, the test set included 24 contextual spikes (i.e., inserted CPU spikes) and none of the 24 contextual spikes was falsely detected as an anomaly (i.e., Z-score anomaly). Thus, ‘0’ True Positives in this case denotes effectiveness of the predefined time bucketing in not classifying contextual spikes as anomalies.
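By way of an illustrative sketch, counts such as those discussed above may be derived from the actual and predicted label vectors; the function name and sample labels are assumptions of this sketch:

```python
import numpy as np

def confusion_counts(actual, predicted):
    """Return (TN, FP, FN, TP) for boolean actual/predicted vectors,
    matching the row/column layout described above (illustrative sketch)."""
    actual = np.asarray(actual, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tn = int(np.sum(~actual & ~predicted))  # neither inserted nor flagged
    fp = int(np.sum(~actual & predicted))   # flagged but not inserted
    fn = int(np.sum(actual & ~predicted))   # inserted but not flagged
    tp = int(np.sum(actual & predicted))    # inserted and flagged
    return tn, fp, fn, tp

actual    = [False, False, True,  True,  False]
predicted = [False, True,  False, True,  False]
tn, fp, fn, tp = confusion_counts(actual, predicted)
```

With the rows defining the actual data and the columns the predicted data, these four counts populate one confusion matrix of the kind shown in FIGS. 5A through 6B.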
Referring now to FIG. 5B, an exemplary confusion matrix 500B showing results of time bucket-based anomaly detection analysis for real anomalies in the univariate test set is illustrated, in accordance with some embodiments of the present disclosure. The confusion matrix 500B provides a comparison of true and false predictions of Z-score anomalies 502B with true and false CPU real anomalies 504B among a plurality of data points. The confusion matrix 500B includes a column for false predicted Z-score anomalies 506B and a column for true predicted Z-score anomalies 508B. Additionally, the confusion matrix 500B includes a row for false CPU real anomalies 510B and a row for true CPU real anomalies 512B. Thus, the rows define the actual data and the columns define the predicted data.
‘1703’ data points were the false predicted Z-score anomalies 506B and were also the false CPU real anomalies 510B. In other words, ‘1703’ data points are ‘TN’.
‘1’ data point was the true predicted Z-score anomaly 508B but was a false CPU real anomaly 510B. In other words, ‘1’ data point is an ‘FP’.
‘0’ data points were the false predicted Z-score anomalies 506B and the true CPU real anomalies 512B. In other words, ‘0’ data points are ‘FN’.
‘16’ data points were the true predicted Z-score anomalies 508B and were also the true CPU real anomalies 512B. In other words, ‘16’ data points are ‘TP’.
Thus, the test set included 16 real anomalies (i.e., CPU real anomalies) and all of the 16 real anomalies were detected as an anomaly (i.e., Z-score anomaly). Thus, ‘16’ True Positives in this case denotes effectiveness of the predefined time bucketing in identifying real anomalies (i.e., point anomalies).
Referring now to FIG. 6A, an exemplary confusion matrix 600A showing results of time bucket-based anomaly detection analysis for contextual spikes in the multivariate test set is illustrated, in accordance with some embodiments of the present disclosure. The confusion matrix 600A includes the experimental results of the 30-day synthetic data on the 20% testing dataset (i.e., %CPU, %memory, %swap, and %disk). The confusion matrix 600A provides a comparison of true and false predictions of Mahalanobis anomalies 602A with true and false inserted CPU spikes 604A among a plurality of data points. The confusion matrix 600A includes a column for false predicted Mahalanobis anomalies 606A and a column for true predicted Mahalanobis anomalies 608A. Additionally, the confusion matrix 600A includes a row for false inserted CPU spikes 610A and a row for true inserted CPU spikes 612A.
‘1680’ data points were the false predicted Mahalanobis anomalies 606A and were also the false inserted CPU spikes 610A. In other words, ‘1680’ data points are ‘TN’.
‘24’ data points were the true predicted Mahalanobis anomalies 608A but were the false inserted CPU spikes 610A. In other words, ‘24’ data points are ‘FP’.
‘5’ data points were the false predicted Mahalanobis anomalies 606A but were the true inserted CPU spikes 612A. In other words, ‘5’ data points are ‘FN’.
‘19’ data points were the true predicted Mahalanobis anomalies 608A and were also the true inserted CPU spikes 612A. In other words, ‘19’ data points are ‘TP’.
Thus, the test set included 24 contextual spikes (i.e., inserted CPU spikes), of which 19 were detected as anomalies (i.e., Mahalanobis anomalies) and 5 were not flagged. The ‘5’ False Negatives in this case denote the contextual spikes that the predefined time bucketing avoided classifying as anomalies.
Referring now to FIG. 6B, an exemplary confusion matrix 600B showing results of time bucket-based anomaly detection analysis for real anomalies in the multivariate test set is illustrated, in accordance with some embodiments of the present disclosure. The confusion matrix 600B provides a comparison of true and false predictions of Mahalanobis anomalies 602B with true and false actual CPU real anomalies 604B among a plurality of data points. The confusion matrix 600B includes a column for false predicted Mahalanobis anomalies 606B and a column for true predicted Mahalanobis anomalies 608B. Additionally, the confusion matrix 600B includes a row for false actual CPU real anomalies 610B and a row for true actual CPU real anomalies 612B.
‘1684’ data points were the false predicted Mahalanobis anomalies 606B and were also the false actual CPU real anomalies 610B. In other words, ‘1684’ data points are ‘TN’.
‘31’ data points were the true predicted Mahalanobis anomalies 608B but were the false actual CPU real anomalies 610B. In other words, ‘31’ data points are ‘FP’.
‘1’ data point were the false predicted Mahalanobis anomalies 606B but were the true actual CPU real anomalies 612B. In other words, ‘1’ data point are ‘FN’.
‘12’ data points were the true predicted Mahalanobis anomalies 608B and were also the true actual CPU real anomalies 612B. In other words, ‘12’ data points are ‘TP’.
Thus, the test set included 13 real anomalies (i.e., CPU real anomalies) and 12 out of the 13 real anomalies were detected as anomalies (i.e., Mahalanobis anomalies). Thus, ‘12’ True Positives in this case denotes effectiveness of the predefined time bucketing in identifying real anomalies (i.e., point anomalies).
Referring now to FIG. 7A, an exemplary comparison table 700A representing performance data of time bucket-based anomaly detection analysis and pre-existing anomaly detection analysis techniques for the univariate test set is illustrated, in accordance with some embodiments of the present disclosure. Anomaly detection was performed on the univariate test set by the time bucket-based anomaly detection analysis technique and each of the pre-existing anomaly detection analysis techniques. The comparison table 700A includes a column for real anomalies 702A, a column for contextual spikes 704A, and a column for anomaly detection techniques 706A. The real anomalies column 702A includes a column for True Positive (TP) anomalies predicted, a column for False Positive (FP) anomalies predicted, a column for True Negative (TN) anomalies predicted, and a column for False Negative (FN) anomalies predicted. The contextual spikes column 704A also includes a column for TP, a column for FP, a column for TN, and a column for FN.
In an ideal scenario, in the case of the contextual spikes 704A, none of the contextual spikes 704A should be identified as anomalies because they are repeating events in the time buckets. Hence, TP for the contextual spikes 704A should be minimal. In the case of the real anomalies 702A, all the real anomalies 702A should be identified as point anomalies because they are exceptional events. Hence, FN for the real anomalies 702A should be minimal.
For the anomaly detection technique 706A ‘ARIMA’ 708A, among the real anomalies 702A, ‘16’ TP data points, ‘26’ FP data points, ‘1686’ TN data points, and ‘0’ FN data point were obtained. Among the contextual spikes 704A, ‘24’ TP data points, ‘18’ FP data points, ‘1686’ TN data points, and ‘0’ FN data point were obtained.
For the anomaly detection technique 706A ‘Kalman Filters’ 710A, among the real anomalies 702A, ‘16’ TP data points, ‘31’ FP data points, ‘1681’ TN data points, and ‘0’ FN data points were obtained. Among the contextual spikes 704A, ‘24’ TP data points, ‘23’ FP data points, ‘1681’ TN data points, and ‘0’ FN data points were obtained.
For the anomaly detection technique 706A ‘FBProphet’ 712A, among the real anomalies 702A, ‘4’ TP data points, ‘118’ FP data points, ‘1594’ TN data points, and ‘12’ FN data points were obtained. Among the contextual spikes 704A, ‘6’ TP data points, ‘116’ FP data points, ‘1588’ TN data points, and ‘18’ FN data points were obtained.
For the anomaly detection technique 706A ‘Static Window Z-score’ 714A (i.e., the time bucket-based anomaly detection analysis), among the real anomalies 702A, ‘16’ TP data points, ‘9’ FP data points, ‘1703’ TN data points, and ‘0’ FN data points were obtained. Among the contextual spikes 704A, ‘0’ TP data points, ‘25’ FP data points, ‘1679’ TN data points, and ‘24’ FN data points were obtained.
For the anomaly detection technique 706A ‘MAD’ 716A, among the real anomalies 702A, ‘16’ TP data points, ‘101’ FP data points, ‘1611’ TN data points, and ‘0’ FN data point were obtained. Among the contextual spikes 704A, ‘24’ TP data points, ‘34’ FP data points, ‘1670’ TN data points, and ‘0’ FN data point were obtained.
Thus, in the univariate test set, through the static window Z-score 714A (i.e., the time bucket-based anomaly detection analysis), all the 16 real anomalies 702A were successfully identified and none of the 24 contextual spikes 704A were identified as anomalies. Other anomaly detection techniques 706A mostly identified the 24 contextual spikes 704A as anomalies. FBProphet 712A was the closest to the static window Z-score 714A in identifying the contextual spikes 704A as non-anomalies (only 6 out of 24 contextual spikes identified as anomalies). However, FBProphet 712A failed to optimally detect the real anomalies 702A (only 4 out of 16 real anomalies identified as anomalies).
Referring now to FIG. 7B, an exemplary comparison table 700B representing performance data of time bucket-based anomaly detection analysis and pre-existing anomaly detection analysis techniques in multivariate test set, in accordance with some embodiments of the present disclosure. Anomaly detection was performed on the multivariate test set by the time bucket-based anomaly detection analysis technique and each of the pre-existing anomaly detection analysis techniques. The comparison table 700B includes a column for a real anomaly 702B, a column for a contextual spikes 704B, and a column for anomaly detection techniques 706B. The real anomaly column 702B includes a column for TP anomalies predicted, a column for FP anomalies predicted, a column for TN anomalies predicted, and a column for a FN anomalies predicted. The contextual spikes column 704B also includes a column for TP, a column for FP, a column for TN, and a column for FN.
For the anomaly detection technique 706B ‘PCA’ 708B, among the real anomalies 702B, ‘8’ TP data points, ‘15’ FP data points, ‘1700’ TN data points, and ‘5’ FN data points were obtained. Among the contextual spikes 704B, ‘15’ TP data points, ‘8’ FP data points, ‘1696’ TN data points, and ‘9’ FN data points were obtained.
For the anomaly detection technique 706B ‘DBSCAN’ 710B, among the real anomalies 702B, ‘1’ TP data point, ‘1’ FP data point, ‘1714’ TN data points, and ‘12’ FN data points were obtained. Among the contextual spikes 704B, ‘1’ TP data point, ‘1’ FP data point, ‘1703’ TN data points, and ‘23’ FN data points were obtained.
For the anomaly detection technique 706B ‘Mahalanobis’ 712B, among the real anomalies 702B, ‘13’ TP data points, ‘31’ FP data points, ‘1684’ TN data points, and ‘0’ FN data points were obtained. Among the contextual spikes 704B, ‘24’ TP data points, ‘20’ FP data points, ‘1684’ TN data points, and ‘0’ FN data points were obtained.
For the anomaly detection technique 706B ‘Static Window Mahalanobis’ 714B (i.e., the time bucket-based anomaly detection analysis), among the real anomalies 702B, ‘12’ TP data points, ‘31’ FP data points, ‘1681’ TN data points, and ‘1’ FN data point were obtained. Among the contextual spikes 704B, ‘19’ TP data points, ‘24’ FP data points, ‘1680’ TN data points, and ‘5’ FN data points were obtained.
For the anomaly detection technique 706B ‘MCD’ 716B, among the real anomalies 702B, ‘13’ TP data points, ‘104’ FP data points, ‘1611’ TN data points, and ‘0’ FN data points were obtained. Among the contextual spikes 704B, ‘24’ TP data points, ‘93’ FP data points, ‘1611’ TN data points, and ‘0’ FN data points were obtained.
Thus, in the multivariate test set, through the static window Mahalanobis 714B (i.e., the time bucket-based anomaly detection analysis), 12 out of the 13 real anomalies 702B were successfully identified, while 19 out of the 24 contextual spikes 704B were identified as anomalies. Other anomaly detection techniques 706B mostly identified all 24 contextual spikes 704B as anomalies. DBSCAN 710B was the closest to the static window Mahalanobis 714B in identifying the contextual spikes 704B as non-anomalies (only 1 out of 24 contextual spikes identified as anomalies). However, DBSCAN 710B failed to optimally detect the real anomalies 702B (only 1 out of 13 real anomalies identified as anomalies).
Referring now to FIG. 8A, an exemplary comparison table 800A representing training and inference times of time bucket-based anomaly detection analysis and pre-existing anomaly detection analysis techniques in univariate test set is illustrated, in accordance with some embodiments of the present disclosure. The comparison table 800A includes a column for anomaly detection technique 802A, a column for training time 804A, and a column for inference time 806A.
For the anomaly detection technique 802A ‘ARIMA’ 808A, the training time 804A was ‘0.045’ seconds and the inference time 806A was ‘190.297’ seconds.
For the anomaly detection technique 802A ‘Kalman Filters’ 810A, the training time 804A was ‘1.556’ seconds and the inference time 806A was ‘1.416’ seconds.
For the anomaly detection technique 802A ‘FBProphet’ 812A, the training time 804A was ‘0.448’ seconds and the inference time 806A was ’25.813’ seconds.
For the anomaly detection technique 802A ‘Z-test’ 814A, the training time 804A was ‘0.001’ seconds and the inference time 806A was ‘0.004’ seconds.
For the anomaly detection technique 802A ‘Static Window Z-score’ 816A, the training time 804A was ‘0.033’ seconds and the inference time 806A was ‘0.033’ seconds.
Thus, the training and inference times of the time bucket-based anomaly detection analysis technique (i.e., ‘Static Window Z-score’ 816A) are comparable with those of the pre-existing anomaly detection techniques. In fact, the training and inference times of the time bucket-based anomaly detection analysis technique are second only to those of the Z-test 814A.
Referring now to FIG. 8B, an exemplary comparison table 800B representing training and inference times of time bucket-based anomaly detection analysis and pre-existing anomaly detection analysis techniques in the multivariate test set is illustrated, in accordance with some embodiments of the present disclosure. The comparison table 800B includes a column for anomaly detection technique 802B, a column for training time 804B, and a column for inference time 806B.
For the anomaly detection technique 802B ‘PCA’ 808B, the training time 804B was ‘0.031’ seconds, and the inference time 806B was ‘0.001’ seconds.
For the anomaly detection technique 802B ‘DBSCAN’ 810B, the training time 804B was ‘0.947’ seconds, and the inference time 806B was ‘0.085’ seconds.
For the anomaly detection technique 802B ‘Mahalanobis’ 812B, the training time 804B was ‘0.011’ seconds, and the inference time 806B was ‘0.007’ seconds.
For the anomaly detection technique 802B ‘Static Window Mahalanobis’ 814B, the training time 804B was ‘0.136’ seconds and the inference time 806B was ‘0.687’ seconds.
Thus, the training and inference times of the time bucket-based anomaly detection analysis technique (i.e., ‘Static Window Mahalanobis’ 814B) are comparable with those of the pre-existing anomaly detection techniques.
As will be also appreciated, the above-described techniques may take the form of computer or controller implemented processes and apparatuses for practicing those processes. The disclosure can also be embodied in the form of computer program code containing instructions embodied in tangible media, such as floppy diskettes, solid state drives, CD-ROMs, hard drives, or any other computer-readable storage medium, wherein, when the computer program code is loaded into and executed by a computer or controller, the computer becomes an apparatus for practicing the invention. The disclosure may also be embodied in the form of computer program code or signal, for example, whether stored in a storage medium, loaded into and/or executed by a computer or controller, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.
The disclosed methods and systems may be implemented on a conventional or a general-purpose computer system, such as a personal computer (PC) or server computer. Referring now to FIG. 9, an exemplary computing system 900 that may be employed to implement processing functionality for various embodiments (e.g., as a SIMD device, client device, server device, one or more processors, or the like) is illustrated. Those skilled in the relevant art will also recognize how to implement the invention using other computer systems or architectures. The computing system 900 may represent, for example, a user device such as a desktop, a laptop, a mobile phone, personal entertainment device, DVR, and so on, or any other type of special or general-purpose computing device as may be desirable or appropriate for a given application or environment. The computing system 900 may include one or more processors, such as a processor 902 that may be implemented using a general or special purpose processing engine such as, for example, a microprocessor, microcontroller or other control logic. In this example, the processor 902 is connected to a bus 904 or other communication medium. In some embodiments, the processor 902 may be an Artificial Intelligence (AI) processor, which may be implemented as a Tensor Processing Unit (TPU), or a graphical processor unit, or a custom programmable solution Field-Programmable Gate Array (FPGA).
The computing system 900 may also include a memory 906 (main memory), for example, Random Access Memory (RAM) or other dynamic memory, for storing information and instructions to be executed by the processor 902. The memory 906 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by the processor 902. The computing system 900 may likewise include a read only memory (“ROM”) or other static storage device coupled to bus 904 for storing static information and instructions for the processor 902.
The computing system 900 may also include storage devices 908, which may include, for example, a media drive 910 and a removable storage interface. The media drive 910 may include a drive or other mechanism to support fixed or removable storage media, such as a hard disk drive, a floppy disk drive, a magnetic tape drive, an SD card port, a USB port, a micro USB, an optical disk drive, a CD or DVD drive (R or RW), or other removable or fixed media drive. A storage media 912 may include, for example, a hard disk, magnetic tape, flash drive, or other fixed or removable medium that is read by and written to by the media drive 910. As these examples illustrate, the storage media 912 may include a computer-readable storage medium having stored therein particular computer software or data.
In alternative embodiments, the storage devices 908 may include other similar instrumentalities for allowing computer programs or other instructions or data to be loaded into the computing system 900. Such instrumentalities may include, for example, a removable storage unit 914 and a storage unit interface 916, such as a program cartridge and cartridge interface, a removable memory (for example, a flash memory or other removable memory module) and memory slot, and other removable storage units and interfaces that allow software and data to be transferred from the removable storage unit 914 to the computing system 900.
The computing system 900 may also include a communications interface 918. The communications interface 918 may be used to allow software and data to be transferred between the computing system 900 and external devices. Examples of the communications interface 918 may include a network interface (such as an Ethernet or other NIC card), a communications port (such as, for example, a USB port or a micro USB port), Near Field Communication (NFC), etc. Software and data transferred via the communications interface 918 are in the form of signals which may be electronic, electromagnetic, optical, or other signals capable of being received by the communications interface 918. These signals are provided to the communications interface 918 via a channel 920. The channel 920 may carry signals and may be implemented using a wireless medium, wire or cable, fiber optics, or other communications medium. Some examples of the channel 920 may include a phone line, a cellular phone link, an RF link, a Bluetooth link, a network interface, a local or wide area network, and other communications channels.
The computing system 900 may further include Input/Output (I/O) devices 922. Examples may include, but are not limited to a display, keypad, microphone, audio speakers, vibrating motor, LED lights, etc. The I/O devices 922 may receive input from a user and also display an output of the computation performed by the processor 902. In this document, the terms “computer program product” and “computer-readable medium” may be used generally to refer to media such as, for example, the memory 906, the storage devices 908, the removable storage unit 914, or signal(s) on the channel 920. These and other forms of computer-readable media may be involved in providing one or more sequences of one or more instructions to the processor 902 for execution. Such instructions, generally referred to as “computer program code” (which may be grouped in the form of computer programs or other groupings), when executed, enable the computing system 900 to perform features or functions of embodiments of the present invention.
In an embodiment where the elements are implemented using software, the software may be stored in a computer-readable medium and loaded into the computing system 900 using, for example, the removable storage unit 914, the media drive 910 or the communications interface 918. The control logic (in this example, software instructions or computer program code), when executed by the processor 902, causes the processor 902 to perform the functions of the invention as described herein.
Various embodiments provide a method and system for detecting point anomalies in time-series data. The disclosed method and system may receive time series data of a plurality of time units. The time series data may include a plurality of points of each of one or more variables. Each of the plurality of time units may include a set of predefined time buckets. Further, the disclosed method and system may allocate each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data. Further, for each time bucket of the set of time buckets, the disclosed method and system may determine a moving average and a corresponding variance of a set of points allocated in the time bucket. The moving average is one of a sliding window average and an Exponentially Weighted Moving Average (EWMA). Further, the disclosed method and system may calculate a distance metric of each of the set of points in the time bucket based on the moving average. Moreover, the disclosed method and system may compare the distance metric of each of the set of points with a pre-determined threshold value of the time bucket. The pre-determined threshold value is based on the moving average and the variance. Thereafter, for each point of the set of points, the disclosed method and system may classify the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison. The point anomaly corresponds to an outlier point in a data distribution of the set of points.
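The bucketed, sliding-window variant of the steps above may be sketched as follows. This is a minimal illustrative sketch only, not the claimed implementation: the function name, the representation of a time bucket as an integer key (e.g., hour of day), and the use of population statistics over the whole bucket are assumptions introduced for clarity.

```python
import math
from collections import defaultdict

def detect_point_anomalies(points, threshold=3.0):
    """Classify each (bucket, value) point as a point anomaly or not.

    `points` is a list of (time_bucket, value) pairs, where the bucket
    key (e.g., hour of day) stands in for the predefined time buckets.
    Returns a list of booleans, one per input point (True = anomaly).
    """
    # Step 1: allocate each point to its corresponding time bucket.
    buckets = defaultdict(list)
    for bucket, value in points:
        buckets[bucket].append(value)

    labels = []
    for bucket, value in points:
        values = buckets[bucket]
        # Step 2: average and variance of the points in this bucket.
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(var)
        # Step 3: distance metric (Z-score) of the point in its bucket.
        z = abs(value - mean) / std if std > 0 else 0.0
        # Step 4: compare against a threshold derived from the statistics.
        labels.append(z > threshold)
    return labels
```

Because every point is compared only against the statistics of its own bucket, a value that is normal for a peak-traffic hour is not flagged merely because it exceeds the global average.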
Thus, the disclosed method and system try to overcome the technical problem of detecting point anomalies in time-series data. The disclosed method and system may improve the accuracy of anomaly detection by using a context-aware static windowing mechanism. Additionally, the disclosed method and system may effectively differentiate between true anomalies and recurring contextual spikes, which reduces false alarms. Further, the disclosed method and system may be designed to handle large-scale datasets efficiently. Further, the disclosed method and system may be able to adapt to different data patterns by using a static windowing approach that automatically adjusts to varying data distributions, making it suitable for real-time anomaly detection in complex environments. Further, the disclosed method and system may be applicable in various fields, such as cyber-security (e.g., for intrusion detection), finance (e.g., for fraud detection), healthcare (e.g., for patient monitoring), and industrial systems (e.g., for fault prediction). Further, the disclosed method and system may reduce manual intervention by adjusting detection criteria based on the data patterns. Further, the disclosed method and system may extend seamlessly to multivariate data. Additionally, the disclosed method and system may effectively capture correlations between multiple features.
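For the real-time setting, the EWMA and exponentially weighted variance (EWV) recurrences may be sketched per time bucket as below. The class name, the smoothing factor `alpha`, and the k-sigma threshold rule are illustrative assumptions; the update formulas are the standard exponential recurrences rather than a definitive rendering of the disclosed system.

```python
import math

class EwmaBucketDetector:
    """Streaming anomaly check for a single time bucket using an EWMA
    and an exponentially weighted variance (EWV)."""

    def __init__(self, alpha=0.1, k=3.0):
        self.alpha = alpha   # predefined smoothing factor
        self.k = k           # threshold in standard deviations
        self.ewma = None     # running exponentially weighted mean
        self.ewv = 0.0       # running exponentially weighted variance

    def update(self, x):
        """Classify x against the current statistics, then fold it in.
        Returns True if x is a point anomaly for this bucket."""
        if self.ewma is None:          # first point seeds the average
            self.ewma = x
            return False
        diff = x - self.ewma
        std = math.sqrt(self.ewv)
        is_anomaly = std > 0 and abs(diff) > self.k * std
        # Exponentially weighted updates; the variance update uses the
        # pre-update mean, as in the standard incremental recurrences:
        #   EWV'  = (1 - a) * (EWV + a * (x - EWMA)^2)
        #   EWMA' = EWMA + a * (x - EWMA)
        self.ewv = (1 - self.alpha) * (self.ewv + self.alpha * diff * diff)
        self.ewma += self.alpha * diff
        return is_anomaly
```

Each new point thus updates the bucket's statistics in constant time and memory, which is what makes the exponentially weighted variant attractive for streaming, large-scale data.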
In light of the above-mentioned advantages and the technical advancements provided by the disclosed method and system, the claimed steps as discussed above are not routine, conventional, or well understood in the art, as the claimed steps enable the following solutions to the existing problems in conventional technologies. Further, the claimed steps clearly bring an improvement in the functioning of the device itself as the claimed steps provide a technical solution to a technical problem.
The specification has described method and system for detecting point anomalies in time-series data. The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.

CLAIMS
I/We Claim:
1. A method (300) for detecting point anomalies in time-series data, the method (300) comprising:
receiving (302), by a processor (104), time series data (208) of a plurality of time units, wherein the time series data (208) comprises a plurality of points of each of one or more variables, wherein each of the plurality of time units comprises a set of predefined time buckets;
allocating (304), by the processor (104), each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data (208);
for each time bucket of the set of time buckets,
determining (306), by the processor (104), a moving average and a corresponding variance of a set of points allocated in the time bucket, wherein the moving average is one of a sliding window average and an Exponentially Weighted Moving Average (EWMA);
calculating (308), by the processor (104), a distance metric of each of the set of points in the time bucket based on the moving average;
comparing (310), by the processor (104), the distance metric of each of the set of points with a pre-determined threshold value of the time bucket, wherein the pre-determined threshold value is based on the moving average and the variance; and
for each point of the set of points, classifying (312), by the processor (104), the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison, wherein the point anomaly corresponds to an outlier point in a data distribution of the set of points.

2. The method (300) as claimed in claim 1, further comprising updating (402) the moving average and the variance based on a new point received in the time bucket.

3. The method (300) as claimed in claim 2, wherein updating the sliding window average and a corresponding sliding window variance comprises:
adding (404) the new point to the set of points in the time bucket;
simultaneously removing (406) an oldest of the set of points from the time bucket to obtain an updated set of points; and
determining (408) an updated sliding window average and an updated sliding window variance for the updated set of points in the time bucket.

4. The method (300) as claimed in claim 2, wherein updating the EWMA and a corresponding Exponentially Weighted Variance (EWV) comprises:
determining (410) an updated EWMA based on the new point, the determined EWMA, and a predefined smoothing factor; and
determining (412) an updated EWV based on the new point, the determined EWMA, the determined variance, and the predefined smoothing factor.

5. The method (300) as claimed in claim 2, comprising:
determining (414) an updated threshold value of the time bucket based on the updated moving average and the updated variance; and
classifying (416) the new point as one of the point anomaly or the non-anomaly in the time bucket based on a comparison with the updated threshold value.

6. The method (300) as claimed in claim 1, wherein the distance metric is a Z-score when the time series data (208) corresponds to one variable, and wherein the distance metric is a Mahalanobis distance when the time series data (208) corresponds to two or more variables.

7. A system (100) for detecting point anomalies in time-series data, the system (100) comprising:
a processor (104); and
a memory (106) communicatively coupled to the processor (104), wherein the memory (106) stores processor instructions, which when executed by the processor (104), cause the processor (104) to:
receive (302) time series data (208) of a plurality of time units, wherein the time series data (208) comprises a plurality of points of each of one or more variables, wherein each of the plurality of time units comprises a set of predefined time buckets;
allocate (304) each of the plurality of points in a corresponding time bucket from the set of time buckets, based on an associated time value in the time series data (208);
for each time bucket of the set of time buckets,
determine (306) a moving average and a corresponding variance of a set of points allocated in the time bucket, wherein the moving average is one of a sliding window average and an Exponentially Weighted Moving Average (EWMA);
calculate (308) a distance metric of each of the set of points in the time bucket based on the moving average;
compare (310) the distance metric of each of the set of points with a pre-determined threshold value of the time bucket, wherein the pre-determined threshold value is based on the moving average and the variance; and
for each point of the set of points, classify (312) the point as one of a point anomaly or a non-anomaly in the time bucket based on the comparison, wherein the point anomaly corresponds to an outlier point in a data distribution of the set of points.

8. The system (100) as claimed in claim 7, wherein the processor instructions, on execution, cause the processor (104) to update (402) the moving average and the variance based on a new point received in the time bucket.

9. The system (100) as claimed in claim 8, wherein to update the sliding window average and a corresponding sliding window variance, the processor instructions, on execution, further cause the processor (104) to:
add (404) the new point to the set of points in the time bucket;
simultaneously remove (406) an oldest of the set of points from the time bucket to obtain an updated set of points; and
determine (408) an updated sliding window average and an updated sliding window variance for the updated set of points in the time bucket.

10. The system (100) as claimed in claim 8, wherein to update the EWMA and a corresponding Exponentially Weighted Variance (EWV), the processor instructions, on execution, cause the processor (104) to:
determine (410) an updated EWMA based on the new point, the determined EWMA, and a predefined smoothing factor; and
determine (412) an updated EWV based on the new point, the determined EWMA, the determined variance, and the predefined smoothing factor.

11. The system (100) as claimed in claim 8, wherein the processor instructions, on execution, cause the processor (104) to:
determine (414) an updated threshold value of the time bucket based on the updated moving average and the updated variance; and
classify (416) the new point as one of the point anomaly or the non-anomaly in the time bucket based on a comparison with the updated threshold value.

12. The system (100) as claimed in claim 7, wherein the distance metric is a Z-score when the time series data (208) corresponds to one variable, and wherein the distance metric is a Mahalanobis distance when the time series data (208) corresponds to two or more variables.

Documents

Application Documents

# Name Date
1 202511085764-STATEMENT OF UNDERTAKING (FORM 3) [09-09-2025(online)].pdf 2025-09-09
2 202511085764-REQUEST FOR EXAMINATION (FORM-18) [09-09-2025(online)].pdf 2025-09-09
3 202511085764-REQUEST FOR EARLY PUBLICATION(FORM-9) [09-09-2025(online)].pdf 2025-09-09
4 202511085764-PROOF OF RIGHT [09-09-2025(online)].pdf 2025-09-09
5 202511085764-POWER OF AUTHORITY [09-09-2025(online)].pdf 2025-09-09
6 202511085764-FORM-9 [09-09-2025(online)].pdf 2025-09-09
7 202511085764-FORM 18 [09-09-2025(online)].pdf 2025-09-09
8 202511085764-FORM 1 [09-09-2025(online)].pdf 2025-09-09
9 202511085764-FIGURE OF ABSTRACT [09-09-2025(online)].pdf 2025-09-09
10 202511085764-DRAWINGS [09-09-2025(online)].pdf 2025-09-09
11 202511085764-DECLARATION OF INVENTORSHIP (FORM 5) [09-09-2025(online)].pdf 2025-09-09
12 202511085764-COMPLETE SPECIFICATION [09-09-2025(online)].pdf 2025-09-09