Abstract: The forecasting analysis of time series data is of great practical importance for decision-making. Existing methods for data forecasting suffer from two key issues: ineffective forecasting and the non-stationary nature of the data. A system and method for making optimally timed decisions by forecasting a set of extreme future values of time series data have been provided. A learning approach has been disclosed that learns to forecast a set of extreme future values of the time series, instead of forecasting all the future values of the same, and bases the final decision on the set of extreme forecast values. Furthermore, to handle the non-stationarity in the time series data, adaptively learned decision-thresholds have been proposed based on recent historical episodes. Through extensive empirical evaluation, the benefits of estimating extreme-K future values have been shown over the traditional multi-step forecasting of future values. [To be published with FIG. 1]
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
METHOD AND SYSTEM FOR OPTIMALLY TIMING DECISIONS BY FORECASTING EXTREME VALUES OF TIME SERIES DATA
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
[001] The disclosure herein generally relates to the field of data forecasting of time series data, and, more particularly, to a method and system for optimally timing decisions by forecasting a set of extreme future values of time series data.
BACKGROUND
[002] The analysis of time series data is of great practical importance in many application areas including stock market analysis, astronomy, environmental analysis, molecular biology, and pharmacogenomics. As a consequence, a lot of research work has focused on similarity search in time series databases in past years. Time series forecasting is a fundamental problem associated with a wide range of science, engineering, and other issues. Time series forecasting with neural networks has been the focus of much research in the past few decades. Given the recent deep learning revolution, much attention has been paid to using deep learning models for time series prediction.
[003] Further, time series data forecasting is also popularly used in decision-making, for example, in weather prediction, where crop cultivation related decisions are taken based on the weather forecast, in foreign exchange conversion decisions, and so on. The timing of such decision-making is very important, as it must be determined what the optimal time to take the decision is.
[004] To address the many challenges in time series forecasting, a variety of time series forecasting approaches have been developed to capture certain structural assumptions of time series. Traditional methods include non-stationary models and the like that consider different types of uncertainties. Machine learning and deep neural network approaches have also been developed in the past to tackle the forecasting problem.
[005] Existing literature suggests using multi-step forecasting of future values of the desired time series for subsequent decision-making. But forecasting values close to the actual ones is difficult, and erroneous estimates tend to degrade the overall performance instead of improving it. Moreover, there is no notion in the existing literature of handling the inherent non-stationary nature of time series data at the time of final decision-making. Due to this non-stationary nature, the consideration of a fixed threshold at final decision-making time is highly ineffective.
SUMMARY
[006] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a system for making optimally timed decisions by forecasting a set of extreme future values of a time series data is provided. The system comprises a user interface, one or more hardware processors, and a memory. The user interface provides the time series data as an input. The memory is in communication with the one or more hardware processors, wherein the one or more hardware processors are configured to execute programmed instructions stored in the memory, to: convert the time series data into a plurality of episodes of a predefined length using a shifted window of a predefined length; split the plurality of episodes into a training dataset, a validation dataset and a test dataset, wherein the training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets; train a plurality of models using the training dataset of the plurality of episodes corresponding to a plurality of hyperparameter combinations, wherein the plurality of models is trained using a weighted loss function; calculate a performance metric for each model of the plurality of trained models on the validation dataset; select a model from amongst the plurality of models that yields a highest calculated performance metric as a best model; forecast the set of extreme future values over a time horizon in the future using the best model for each of the plurality of episodes in the test dataset; adaptively select an adaptive threshold for each episode of the plurality of episodes of the test dataset, wherein the adaptive selection comprises: selecting the adaptive threshold from a candidate set of
thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode, wherein the performance is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes; calculate an average future value of the set of extreme future values for the current episode for each time point of the time series; calculate a difference between the average future value and a current value of the time series data for the current episode; and compare the calculated difference with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode.
[007] In another aspect, a method for making optimally timed decisions by forecasting a set of extreme future values of a time series data is provided. Initially, the time series data is received as an input via a user interface. The time series data is then converted into a plurality of episodes of a predefined length using a shifted window of a predefined length. In the next step, the plurality of episodes is split into a training dataset, a validation dataset and a test dataset, wherein the training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets. In the next step, a plurality of models is trained using the training dataset of the plurality of episodes corresponding to a plurality of hyperparameter combinations, wherein the plurality of models is trained using a weighted loss function. Further, a performance metric is calculated for each model of the plurality of trained models on the validation dataset. A model is then selected from amongst the plurality of models that yields a highest calculated performance metric as a best model. In the next step, the set of extreme future values is forecasted over a time horizon in the future using the best model for each of the plurality of episodes in the test dataset. Further, an adaptive threshold is adaptively selected for each episode of the plurality of episodes of the test dataset, wherein the adaptive selection comprises: selecting the adaptive threshold from a candidate set of thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode, wherein the performance is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes. In the next step, an average future value of the set of extreme
future values is calculated for the current episode for each time point of the time series. Further, a difference is calculated between the average future value and a current value of the time series data for the current episode. And finally, the calculated difference is compared with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode.
[008] In yet another aspect, one or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause making optimally timed decisions by forecasting a set of extreme future values of a time series data. Initially, the time series data is received as an input via a user interface. The time series data is then converted into a plurality of episodes of a predefined length using a shifted window of a predefined length. In the next step, the plurality of episodes is split into a training dataset, a validation dataset and a test dataset, wherein the training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets. In the next step, a plurality of models is trained using the training dataset of the plurality of episodes corresponding to a plurality of hyperparameter combinations, wherein the plurality of models is trained using a weighted loss function. Further, a performance metric is calculated for each model of the plurality of trained models on the validation dataset. A model is then selected from amongst the plurality of models that yields a highest calculated performance metric as a best model. In the next step, the set of extreme future values is forecasted over a time horizon in the future using the best model for each of the plurality of episodes in the test dataset. Further, an adaptive threshold is adaptively selected for each episode of the plurality of episodes of the test dataset, wherein the adaptive selection comprises: selecting the adaptive threshold from a candidate set of thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode, wherein the performance is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes. 
In the next step, an average future value of the set of extreme future values is calculated for the current episode for each time point of the time series. Further,
a difference is calculated between the average future value and a current value of the time series data for the current episode. And finally, the calculated difference is compared with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode.
[009] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[0011] FIG. 1 illustrates a block diagram of a system for making optimally timed decisions by forecasting a set of extreme future values of a time series data according to some embodiments of the present disclosure.
[0012] FIG. 2 is a flowchart showing the steps involved in the training of the model according to some embodiments of the disclosure.
[0013] FIG. 3 is a flowchart showing the steps involved in selecting the adaptive threshold according to some embodiments of the disclosure.
[0014] FIG. 4 is a flowchart showing the steps involved in optimally timing the decisions according to some embodiments of the disclosure.
[0015] FIGS. 5A-5B is a flowchart of a method for making optimally timed decisions by forecasting a set of extreme future values of a time series data according to some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[0016] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number
identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[0017] Existing methods for data forecasting have two key issues when working with highly non-stationary time series data that render standard solutions ineffective. Firstly, while good forecasts of future values of the desired time series can be highly effective in guiding good decisions, forecasting these values is difficult, and erroneous estimates tend to degrade the performance instead of improving it. Secondly, the non-stationary nature of time series data renders a fixed decision-threshold highly ineffective. To address these problems, a supervised learning approach is disclosed that learns to forecast a set of extreme future values (extreme-K future values) of the underlying time series, instead of forecasting all the future values of the same, and bases the final decision on the extreme-K forecast values.
[0018] The present disclosure provides a system and method for making optimally timed decisions by forecasting a set of extreme future values of a time series data. Furthermore, to handle the non-stationarity in the time series data, which poses challenges to the independent and identically distributed assumption in supervised learning methods, adaptively learned decision-thresholds have been proposed based on recent historical episodes (trends).
[0019] For better decision-making, only a set of extreme future values (extreme-K future values) (bottom-K or top-K) is forecast, instead of using traditional multi-step forecasting, which needs to forecast all the future values rather than just the K values. Estimating only K future values instead of all of them is of high importance owing to the non-stationary nature of the time series data.
[0020] Through extensive empirical evaluation, the benefits of estimating extreme-K future values have been shown over the traditional multi-step forecasting of future values. The inherent issue of non-stationarity in the time series data is handled using adaptive local decision-thresholds based on recent historical trends of the underlying time series data, instead of a global decision-threshold obtained using the entire available history of the time series data.
[0021] The method solves the problem of identifying an optimal time for making a decision, as described in the prior art. In essence, the present disclosure involves estimating, at time t, a decision variable (dt) representing the difference between the best possible future value and the current value of the time series data for a current episode and comparing it with a decision boundary in the form of an adaptive threshold.
[0022] According to an embodiment of the disclosure, the decision variable is estimated as follows. In the first approach, the problem is cast as a general, non-Markovian optimal stopping problem, and recursive regression is applied using a known technique to estimate the decision variable. The recursive regression requires as many regression models as the number of time steps. Hence, dynamic-programming-based value-function approximation in a finite-horizon Markovian framework is also used to estimate the decision variable. Value-function approximation and Q-learning were also considered for optimal decision-making, using a Deep Q-Network-type algorithm in an infinite-horizon, stationary Markovian setting, to obtain estimators for the decision variable that do not explicitly depend on time. The proposed approach is designed to mitigate the problem of overestimation bias by using a specialized loss function, and involves using estimates of extreme-K future values to build an estimate of the decision variable. Overestimation bias means overestimating certain values that are actually not close to the actual expected values (ground truth). Estimates turn out to be spuriously inflated due to this bias, eventually affecting the model's performance.
[0023] According to an embodiment of the disclosure, the adaptive threshold (δe) is calculated as follows. Most traditional as well as learning-based agents apply a threshold over a decision variable (dt) for the final decision, to handle bias in the estimates. Hence, instead of comparing the decision variable with 0 to trigger a decision, the decision variable is compared with a threshold δe computed at the start of each episode e. The condition dt - δe < 0 then triggers the decision at time t in the episode e. In its simplest form, the threshold δe can be chosen to be a constant independent of e. However, given the non-stationary nature of the input data series, an episode-dependent threshold is chosen from a finite set of candidate thresholds. Each candidate threshold is used along with the chosen estimator for the decision variable for decision-making on a collection of historical episodes immediately preceding the episode e. The candidate threshold that yields the highest average cumulative payoff on the historical episodes is chosen as the δe for the episode e.
[0024] Referring now to the drawings, and more particularly to FIG. 1 through FIG. 5B, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[0025] According to an embodiment of the disclosure, FIG. 1 illustrates a block diagram of a system 100 for making optimally timed decisions by forecasting a set of extreme future values of a time series data. It may be understood that the system 100 comprises one or more computing devices 102, such as a laptop computer, a desktop computer, a notebook, a workstation, a cloud-based computing environment and the like. It will be understood that the system 100 may be accessed through one or more input/output interfaces 104-A, 104-B…, collectively referred to as I/O interface 104 or user interface 104. Examples of the I/O interface 104 may include, but are not limited to, a user interface, a portable computer, a personal digital assistant, a handheld device, a smartphone, a tablet computer, a workstation and the like. The I/O interface 104 is communicatively coupled to the system 100 through a network 106.
[0026] In an embodiment, the network 106 may be a wireless or a wired network, or a combination thereof. In an example, the network 106 can be implemented as a computer network, as one of the different types of networks, such as virtual private network (VPN), intranet, local area network (LAN), wide area network (WAN), the internet, and such. The network 106 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer
Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), and Wireless Application Protocol (WAP), to communicate with each other. Further, the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices. The network devices within the network 106 may interact with the system 100 through communication links.
[0027] The system 100 may be implemented in a workstation, a mainframe computer, a server, and a network server. In an embodiment, the computing device 102 further comprises one or more hardware processors 108, one or more memory 110, hereinafter referred as a memory 110 and a data repository 112, for example, a repository 112. The memory 110 is in communication with the one or more hardware processors 108, wherein the one or more hardware processors 108 are configured to execute programmed instructions stored in the memory 110, to perform various functions as explained in the later part of the disclosure. The repository 112 may store data processed, received, and generated by the system 100. The memory 110 further comprises a plurality of units for performing various functions. The plurality of units comprises a preprocessor 114, a model training and selection unit 116, an adaptive threshold selection unit 118 and an optimal timing decision making unit 120 as shown in the block diagram of FIG. 1.
[0028] The system 100 supports various connectivity options such as BLUETOOTH®, USB, ZigBee and other cellular services. The network environment enables connection of various components of the system 100 using any communication link including Internet, WAN, MAN, and so on. In an exemplary embodiment, the system 100 is implemented to operate as a stand-alone device. In another embodiment, the system 100 may be implemented to work as a loosely coupled device to a smart computing environment. The components and functionalities of the system 100 are described further in detail.
[0029] According to an embodiment of the disclosure, the system 100 is configured to receive the time series data as an input via the I/O interface 104. The time series data can be taken from any scenario in which it is required to forecast future values in order to take decisions.
[0030] According to an embodiment of the disclosure, the preprocessor 114 is configured to preprocess the time series data. Since the time series data is typically very long, the preprocessor 114 is configured to convert the time series data into a plurality of episodes of a predefined length using a shifted window of the predefined length. The predefined length is decided by the user. The plurality of episodes is then normalized depending on the type of use case. The plurality of episodes is then split into a training dataset, a validation dataset and a test dataset, wherein the training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets.
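The episode construction and chronological split described above can be sketched as follows. This is an illustrative sketch only, not part of the specification; the episode length, shift, and split fractions shown are assumptions for demonstration.

```python
import numpy as np

def make_episodes(series, episode_len, shift):
    """Slice a long series into fixed-length episodes using a shifted window."""
    episodes = []
    for start in range(0, len(series) - episode_len + 1, shift):
        episodes.append(series[start:start + episode_len])
    return np.array(episodes)

def split_episodes(episodes, train_frac=0.6, val_frac=0.2):
    """Chronological split into three mutually exclusive sets."""
    n = len(episodes)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (episodes[:n_train],
            episodes[n_train:n_train + n_val],
            episodes[n_train + n_val:])

series = np.arange(100.0)          # toy stand-in for a real time series
eps = make_episodes(series, episode_len=10, shift=5)
train, val, test = split_episodes(eps)
```

A chronological (rather than random) split is used in this sketch so that validation and test episodes come strictly after training episodes, which matches how forecasting systems are usually evaluated.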
[0031] According to an embodiment of the disclosure, the model training and selection unit 116 is configured to train a plurality of models and then select the appropriate model to be used for forecasting a set of extreme future values over a time horizon, as shown in flowchart 200 of FIG. 2. The plurality of models is trained using the training dataset of the plurality of episodes corresponding to a plurality of hyperparameter combinations. The model is a three layered feed forward neural network, wherein a third layer is an output layer comprising the same number of values as the set of extreme future values (K values). Any neural network model can be used depending on the problem defined by the user; it just needs to be ensured that the number of output units is the same as K. The plurality of models is trained using a weighted loss function. The model training and selection unit 116 is further configured to calculate a performance metric for each model of the plurality of trained models on the validation dataset. In an example, the average cumulative reward (ACR) has been used to determine the performance of each of the plurality of models. A model from amongst the plurality of models that yields the highest calculated performance metric is selected as the best model. The best model is then used to forecast the set of extreme future values (extreme-K values) over the time horizon in the future for each of the plurality of episodes in the test dataset. An average is taken of all values of the set of extreme future values and used as the final estimated future value (W^) for further calculations.
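The extreme-K regression targets implied by this training setup can be sketched as follows, under the assumption (consistent with the formulation later in the specification) that for each time point the targets are the K largest future values of the episode. The function name and padding convention are illustrative.

```python
import numpy as np

def extreme_k_targets(episode, K):
    """For each time t, the target is the K largest future values of
    X_{t+1..T}, sorted largest first (NaN-padded near the episode end)."""
    T = len(episode)
    targets = np.full((T, K), np.nan)
    for t in range(T - 1):
        future = np.sort(episode[t + 1:])[::-1]   # future values, descending
        k = min(K, len(future))
        targets[t, :k] = future[:k]
    return targets

ep = np.array([3.0, 7.0, 1.0, 9.0, 5.0])
tgt = extreme_k_targets(ep, K=2)
# e.g. at t=0 the two largest future values of {7, 1, 9, 5} are 9 and 7
```

A model with K output units would then be trained against these K-dimensional targets, and the best hyperparameter combination chosen by the validation-set metric.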
[0032] According to an embodiment of the disclosure, the system 100 comprises the adaptive threshold selection unit 118. The working of the adaptive threshold selection unit 118 is shown in the flowchart of FIG. 3. The adaptive threshold selection unit 118 is configured to adaptively select an adaptive threshold (δe) for each episode of the plurality of episodes of the test dataset. The adaptive threshold is selected from a candidate set of thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode ‘e’. The candidate set of thresholds consists of some fixed number of thresholds within some range (specific to the underlying problem and dataset). The performance is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes. The candidate threshold that yields the best performance on the historical episodes is chosen as the adaptive threshold (δe) for the current episode ‘e’.
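The adaptive threshold selection just described can be sketched as below. This is a hedged illustration: the payoff function is caller-supplied (an assumption standing in for the performance metric, e.g. average cumulative reward), and the function name is hypothetical.

```python
def select_adaptive_threshold(candidates, past_episodes, payoff_fn):
    """Score each candidate threshold on the most recent past episodes and
    return the one with the highest average payoff (performance metric)."""
    best_delta, best_score = None, float("-inf")
    for delta in candidates:
        score = sum(payoff_fn(ep, delta) for ep in past_episodes) / len(past_episodes)
        if score > best_score:
            best_delta, best_score = delta, score
    return best_delta

# toy usage: the payoff peaks at threshold 0.5, so 0.5 is selected
chosen = select_adaptive_threshold(
    candidates=[0.1, 0.5, 0.9],
    past_episodes=[0, 1],                      # dummy stand-ins for episodes
    payoff_fn=lambda ep, d: -abs(d - 0.5))
```

Because the scoring window contains only the most recent episodes, the selected threshold tracks local trends rather than the global history, which is the point of the adaptive scheme.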
[0033] According to an embodiment of the disclosure, the system 100 comprises the optimal timing decision making unit 120. The optimal timing decision making unit 120 is configured to trigger the optimally timed decisions at each time point of the time series for the current episode ‘e’. The steps involved in the optimal timing decision making unit 120 are shown in the flowchart of FIG. 4. The optimal timing decision making unit 120 is configured to calculate the average future value (W^) of the set of extreme future values for the current episode for each time point ‘t’ of the time series. This is followed by the calculation of a difference between the average future value (W^) and a current value h(s), where s is the current state of the time series data at time ‘t’ for the current episode. The optimal timing decision making unit 120 is further configured to compare the calculated difference with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode. Thus, the gap between W^ and h(s) is compared with the threshold, i.e., W^ - h(s) < δe or h(s) - W^ < δe (as per the requirement), to trigger the final optimal decision. It should be appreciated that the optimally timed decision is dependent on the predefined condition and can be changed by the user.
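The per-time-point decision rule can be sketched as follows; this is an illustrative assumption (variable names and the direction of the comparison are chosen for the example, since the specification allows either W^ - h(s) < δe or h(s) - W^ < δe per the requirement).

```python
def should_trigger(w_hat_values, h_s, delta_e):
    """Trigger the decision when the gap between the average extreme-K
    forecast and the current value falls below the adaptive threshold."""
    w_hat = sum(w_hat_values) / len(w_hat_values)   # average of extreme-K forecasts
    d_t = w_hat - h_s                               # decision variable d_t
    return d_t < delta_e                            # predefined condition

# e.g. forecasts [1.05, 1.02] vs current value 1.00 and threshold 0.05:
# the gap is 0.035 < 0.05, so the decision fires
fired = should_trigger([1.05, 1.02], 1.00, 0.05)
```

If the forecast extreme values were much higher than the current value (a large gap), the rule would keep waiting instead of triggering.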
[0034] To describe the multi-task supervised learning formulation of the system 100, some notations are introduced. First, for each t and each i = 1, . . ., T - t, where T indicates the terminal time, i.e., where the current episode ends/terminates, let Xt+[i] denote the ith largest value, with Xt+[1] being the largest future value of the time series data, of the collection {Xt+1, . . ., XT}. Next, let K < T be a fixed integer, and define Zt = (Xt+[1] + . . . + Xt+[K])/K. Note that Zt is the average of the top K extreme highest values of the time series data from time t + 1 onwards. Finally, define
W(s) = max {h(s), E[Zt+1|St = s]} ……. (1)
[0035] The recommended decision for state s at time t is to stop (i.e., act now) if and only if h(s) ≥ W(s). In the present disclosure, the objective is to estimate W in equation (1) using data samples and apply the recommended action according to the estimate.
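A small numeric illustration of Zt and the stopping rule above, using the empirical top-K average of the realized future values in place of the conditional expectation E[Zt+1|St = s] (an assumption made purely for the example, with h(s) taken to be the current value):

```python
import numpy as np

def z_t(series, t, K):
    """Empirical Z_t: average of the K largest values of X_{t+1..T}."""
    future = np.sort(series[t + 1:])[::-1]   # future values, descending
    return future[:K].mean()

X = np.array([1.0, 1.2, 0.9, 1.5, 1.3])
h = X[1]                 # h(s) as the current value at t = 1 (assumption)
z = z_t(X, 1, K=2)       # average of the two largest future values {1.5, 1.3}
w = max(h, z)            # W(s) per equation (1), with z standing in for E[Z|s]
```

Here z = 1.4 exceeds h = 1.2, so W(s) > h(s) and the rule recommends waiting rather than stopping at t = 1.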
[0036] Thus, the following top extreme future values forecasting approach (extreme-K + δe) is proposed. A multi-task setting with K tasks is considered, where each task is a regression task corresponding to estimating one of the extreme-K future values. In other words, a regression task is considered where the target variable is K-dimensional. A neural network is used to obtain an approximation W^(s, θ) to E[Zt+1|St = s], where θ represents the neural network parameters. The following weighted loss function is considered, based on the ranks of the future extreme values, for a sample state s at time step t in an episode:
Lt(θ) = Σi=1..K wi (W^i(s, θ) - Xt+[i])2 ……. (2)
where W^i(s, θ) denotes the ith output of the network and the weight wi depends on the rank i.
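The rank-weighted loss described above can be sketched in NumPy as follows. The specific weights wi = 1/i are an illustrative assumption; the specification states only that the weights depend on the ranks of the extreme values.

```python
import numpy as np

def weighted_extreme_k_loss(pred, target):
    """Rank-weighted squared loss over the K extreme targets.
    pred, target: arrays of shape (K,), ordered by rank (largest first)."""
    K = len(target)
    weights = 1.0 / np.arange(1, K + 1)          # rank-based weights (assumed)
    return float(np.sum(weights * (pred - target) ** 2))

loss = weighted_extreme_k_loss(np.array([1.0, 0.5]), np.array([1.5, 0.5]))
# only rank 1 contributes: 1.0 * (1.0 - 1.5)^2 = 0.25
```

Because the targets are observed future values rather than bootstrapped estimates, this objective avoids the overestimation bias discussed earlier.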
[0037] In practice, for t > T - K, the summation in equation (2) is over T - t terms only. Finally, the estimate W^(s, θ) is obtained by averaging the K outputs:
W^(s, θ) = (W^1(s, θ) + . . . + W^K(s, θ))/K ……. (3)
[0038] Note that, in contrast to the prior art, the target in the loss function is not an estimated value but rather is obtained from direct observations. Hence, the agents learned using this objective are easier to train in practice, as the targets are less noisy and the learning does not suffer from the over-estimation bias. Upon the completion of training, the recommended decision whenever in state s is decided using
dt = W^(s, θ) - h(s) ……. (4)
[0039] Note that K = 1 is a special case with a single task with the maximum future data value as the univariate target:
Lt(θ) = (W^(s, θ) - max{Xt+1, . . ., XT})2 ……. (5)
[0040] According to an embodiment of the disclosure, a 3-layered feed forward neural network is used for all learning-based approaches. The third layer is an output layer comprising the same number of values as the set of extreme future values. In an example, the 1st and 2nd layers have 256 and 128 ReLUs, respectively. For the proposed extreme-K + δe approach, the number of output units is the same as K; for the DQN-based Q-network, the number of output units is 2; and for the MC formulations, the network has 1 output corresponding to decision-making. A learning rate of 0.003 is used with the Adam optimizer, a batch size of 128, and an episode length T of 58. The feature vector ft-1, as well as the state s at time t in the case of the MC approaches, consists of the past n days' data values, where n ∈ {5, 10, 20}. The value of K for the extreme-K data value forecasts is selected from {1, 2, 3, 4, 5}. Normalized data values are considered as input to the neural network, where all data values in an episode are divided by the first day value X1 of the current episode time series data. All the hyper-parameters across all the approaches are tuned using grid search over the validation set.
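The network shapes described above can be sketched as a minimal NumPy forward pass: two ReLU hidden layers of 256 and 128 units and a linear output layer with K units, one per extreme future value. Weight initialization scale and the training loop are omitted; only the shapes follow the text, the rest is assumed for illustration.

```python
import numpy as np

def init_params(n_features, K, rng):
    """Parameters for a 3-layer feed-forward net: n_features -> 256 -> 128 -> K."""
    sizes = [n_features, 256, 128, K]
    return [(rng.standard_normal((a, b)) * 0.01, np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:        # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x                           # K extreme-value forecasts (linear output)

rng = np.random.default_rng(0)
params = init_params(n_features=20, K=3, rng=rng)   # e.g. past 20 days as features
out = forward(params, np.ones(20))
```

For the DQN and MC variants mentioned in the text, only the final layer size would change (2 and 1 outputs, respectively).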
[0041] FIG. 5A-5B illustrates a flow chart of a method 500 for making optimally timed decisions by forecasting the set of extreme future values of the time series data, in accordance with an example embodiment of the present disclosure. The method 500 depicted in the flow chart may be executed by a system, for example, the system 100 of FIG. 1. In an example embodiment, the system 100 may be embodied in the computing device.
[0042] Operations of the flowchart, and combinations of operations in the flowchart, may be implemented by various means, such as hardware, firmware, processor, circuitry and/or other device associated with execution of software including one or more computer program instructions. For example, one or more of the procedures described in various embodiments may be embodied by computer program instructions. In an example embodiment, the computer program instructions, which embody the procedures, described in various embodiments may be stored by at least one memory device of a system and executed by at least one processor in the system. Any such computer program instructions may be loaded onto a computer or other programmable system (for example, hardware) to produce a machine, such that the resulting computer or other programmable system embody
means for implementing the operations specified in the flowchart. It will be noted herein that the operations of the method 500 are described with help of system 100. However, the operations of the method 500 can be described and/or practiced by using any other system.
[0043] Initially at step 502 of the method 500, the time series data is received as an input via the user interface 104. At step 504, the time series data is then converted into the plurality of episodes of the predefined length using the shifted window of the predefined length. Further, at step 506, the plurality of episodes is split into the training dataset, the validation dataset and the test dataset. The training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets.
[0044] At step 508 of the method 500, the plurality of models is trained using the training dataset of the plurality of episodes corresponding to the plurality of hyperparameter combinations. The plurality of models is trained using a weighted loss function as described in equation (2) above. At step 510, a performance metric such as ACR is calculated for each model of the plurality of trained models on the validation dataset. At step 512, a model from amongst the plurality of models that yields the highest calculated performance metric is selected as a best model. Further at step 514, the set of extreme future values is forecasted over a time horizon in the future using the best model for each of the plurality of episodes in the test dataset.
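Steps 508-512 may be sketched as follows. The exact weighted loss of equation (2) and the ACR metric are not reproduced here; the linearly decaying rank weights and the `metric_fn` callback are illustrative assumptions standing in for them.

```python
import numpy as np

def rank_weighted_loss(y_true_extreme, y_pred, weights=None):
    # Squared error over the K extreme targets, weighted by rank.
    # Linearly decaying weights are an assumption, not equation (2) itself.
    y_true = np.asarray(y_true_extreme, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if weights is None:
        raw = np.arange(len(y_true), 0, -1, dtype=float)
        weights = raw / raw.sum()
    return float(np.sum(weights * (y_true - y_pred) ** 2))

def select_best_model(models, metric_fn, val_episodes):
    # Score every trained model on the validation episodes (e.g. with ACR)
    # and keep the one with the highest metric, as in step 512.
    scores = [metric_fn(model, val_episodes) for model in models]
    return models[int(np.argmax(scores))]
```

In practice each element of `models` would be a network trained under one hyperparameter combination; here they are treated as opaque objects scored by the caller-supplied metric.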
[0045] At step 516 of the method 500, the adaptive threshold is adaptively selected for each episode of the plurality of episodes of the test dataset. The adaptive selection comprises selecting the adaptive threshold from a candidate set of thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode, wherein the performance is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes.
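The adaptive selection of step 516 can be sketched as a search over candidate thresholds scored on the most recent past episodes. Averaging the per-episode scores is an assumption; any aggregation consistent with the chosen performance metric would serve.

```python
def select_adaptive_threshold(candidate_thresholds, recent_episodes, metric_fn):
    # Score each candidate threshold on the most recent past episodes and
    # return the one with the highest average performance metric.
    best_threshold, best_score = None, float("-inf")
    for threshold in candidate_thresholds:
        score = sum(metric_fn(episode, threshold)
                    for episode in recent_episodes) / len(recent_episodes)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold
```

Because the threshold is re-selected per episode from recent history, it tracks non-stationary drift in the series rather than staying fixed for the whole test set.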
[0046] Further at step 518 of the method 500, the average future value (ŵt) of the set of extreme future values is calculated for the current episode for each time point of the time series. At step 520, a difference (dt) between the average future value (ŵt) and the current value of the time series data is calculated. Finally, at step 522, the calculated difference is compared with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode.
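Steps 518-522 reduce to a few lines. The direction of the comparison (acting when the difference falls to or below the threshold) is an illustrative assumption, since the predefined condition depends on the application.

```python
def decision_signal(extreme_forecasts, current_value, threshold):
    # Step 518: average the K extreme forecast values.
    w_hat = sum(extreme_forecasts) / len(extreme_forecasts)
    # Step 520: difference between the average future value and the current value.
    d_t = w_hat - current_value
    # Step 522: compare against the adaptive threshold to trigger the decision.
    return "act" if d_t <= threshold else "wait"
```

Intuitively, if the extreme future values are still expected to exceed the current value by more than the threshold, the agent waits for a better opportunity; otherwise it acts now.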
EXPERIMENTAL RESULTS
[0047] To highlight the effectiveness of the system 100, the following scenario is considered: learning a trading agent acting on behalf of the treasury of a firm that earns revenue in a foreign currency (FC) and incurs expenses in the home currency (HC). The goal of the agent is to maximize the expected HC at the end of the trading episode (e.g., over a financial quarter) by deciding to hold or sell the FC at each time step in the trading episode. Converting FC to HC requires conversion rates, referred to as foreign exchange (FX) rates, and forecasting FX rates is extremely difficult owing to their non-stationary nature. Through extensive empirical evaluation across 21 approaches and 7 currency pairs, it was shown that the present disclosure provides the only approach that consistently improves upon a simple heuristic baseline. Further experiments show the inefficacy of state-of-the-art statistical and deep-learning-based forecasting methods, as they degrade the performance of the trading agent.
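A minimal sketch of one trading episode in this scenario follows. The all-or-nothing conversion at a single step and the `policy` callback interface are simplifying assumptions for illustration; a real agent could sell partial amounts.

```python
def run_episode(rates, policy):
    # Hold one unit of FC; convert it all to HC the first time the policy
    # says "sell", or at the final step of the episode otherwise.
    fc_amount = 1.0
    for t, rate in enumerate(rates):
        if policy(rates[:t + 1]) == "sell" or t == len(rates) - 1:
            return fc_amount * rate
```

A policy built from the extreme-value forecasts would say "sell" when the forecast-versus-current difference falls below the adaptive threshold, i.e. when no materially better rate is expected within the horizon.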
[0048] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0049] The embodiments of the present disclosure herein address the unresolved problems of ineffective forecasting of future values of a desired time series and the non-stationary nature of time series data. The embodiments thus provide a method and a system for making optimally timed decisions by forecasting a set of extreme future values of a time series data.
[0050] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[0051] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0052] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are
appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[0053] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0054] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
We Claim:
1. A processor implemented method (500) for making optimally timed decisions by forecasting a set of extreme future values of a time series data, the method comprising:
receiving, via a user interface, the time series data as an input (502);
converting, via one or more hardware processors, the time series data into a plurality of episodes of a predefined length using a shifted window of a predefined length (504);
splitting, via the one or more hardware processors, the plurality of episodes into a training dataset, a validation dataset and a test dataset, wherein the training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets (506);
training, via the one or more hardware processors, a plurality of models using the training dataset of the plurality of episodes corresponding to a plurality of hyperparameter combinations, wherein the plurality of models is trained using a weighted loss function (508);
calculating, via the one or more hardware processors, a performance metric for each model of the plurality of trained models on the validation dataset (510);
selecting, via the one or more hardware processors, a model from amongst the plurality of models that yields a highest calculated performance metric as a best model (512);
forecasting, via the one or more hardware processors, the set of extreme future values over a time horizon in the future using the best model for each of the plurality of episodes in the test dataset (514);
adaptively selecting, via the one or more hardware processors, an adaptive threshold for each episode of the plurality of episodes of the test dataset, wherein the adaptive selection comprises:
selecting the adaptive threshold from a candidate set of thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode, wherein the performance
is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes (516);
calculating, via the one or more hardware processors, an average future value of the set of extreme future values for the current episode for each time point of the time series (518);
calculating, via the one or more hardware processors, a difference between the average future value and a current value of the time series data for the current episode (520); and
comparing, via the one or more hardware processors, the calculated difference with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode (522).
2. The processor implemented method of claim 1, wherein the model is a deep neural network, wherein an output layer of the deep neural network comprises the same number of values as the set of extreme future values.
3. The processor implemented method of claim 1, wherein the candidate threshold that yields the highest performance metric is chosen as the adaptive threshold for the current episode.
4. The processor implemented method of claim 1, wherein the model is trained using a weighted loss function based on ranks of the future extreme values.
5. The processor implemented method of claim 1, further comprising the step of normalizing the plurality of episodes based on the predefined condition.
6. A system (100) for making optimally timed decisions by forecasting a set of extreme future values of a time series data, the system comprising:
a user interface (104) for providing the time series data as an input;
one or more hardware processors (108); and
a memory (110) in communication with the one or more hardware processors, wherein the one or more hardware processors are configured to execute programmed instructions stored in the memory (110), to:
convert the time series data into a plurality of episodes of a predefined length using a shifted window of a predefined length;
split the plurality of episodes into a training dataset, a validation dataset and a test dataset, wherein the training dataset, the validation dataset and the test dataset are three independent mutually exclusive sets;
train a plurality of models using the training dataset of the plurality of episodes corresponding to a plurality of hyperparameter combinations, wherein the plurality of models is trained using a weighted loss function;
calculate a performance metric for each model of the plurality of trained models on the validation dataset;
select a model from amongst the plurality of models that yields a highest calculated performance metric as a best model;
forecast the set of extreme future values over a time horizon in the future using the best model for each of the plurality of episodes in the test dataset;
adaptively select an adaptive threshold for each episode of the plurality of episodes of the test dataset, wherein the adaptive selection comprises:
selecting the adaptive threshold from a candidate set of thresholds by calculating a performance of each candidate threshold on a set of most recent past episodes of a current episode, wherein the performance is calculated in terms of the performance metric for each candidate threshold on the set of most recent past episodes;
calculate an average future value of the set of extreme future values for the current episode for each time point of the time series;
calculate a difference between the average future value and a current value of the time series data for the current episode; and
compare the calculated difference with the adaptive threshold based on a predefined condition to trigger the optimally timed decisions at each time point of the time series for the current episode.
7. The system of claim 6, wherein the model is a deep neural network, wherein an output layer of the deep neural network comprises the same number of values as the set of extreme future values.
8. The system of claim 6, wherein the candidate threshold that yields the highest performance metric is chosen as the adaptive threshold for the current episode.
9. The system of claim 6, wherein the model is trained using a weighted loss function based on ranks of the future extreme values.
10. The system of claim 6, wherein the one or more hardware processors are further configured to normalize the plurality of episodes based on the predefined condition.
| # | Name | Date |
|---|---|---|
| 1 | 202221009702-STATEMENT OF UNDERTAKING (FORM 3) [23-02-2022(online)].pdf | 2022-02-23 |
| 2 | 202221009702-REQUEST FOR EXAMINATION (FORM-18) [23-02-2022(online)].pdf | 2022-02-23 |
| 3 | 202221009702-FORM 18 [23-02-2022(online)].pdf | 2022-02-23 |
| 4 | 202221009702-FORM 1 [23-02-2022(online)].pdf | 2022-02-23 |
| 5 | 202221009702-FIGURE OF ABSTRACT [23-02-2022(online)].jpg | 2022-02-23 |
| 6 | 202221009702-DRAWINGS [23-02-2022(online)].pdf | 2022-02-23 |
| 7 | 202221009702-DECLARATION OF INVENTORSHIP (FORM 5) [23-02-2022(online)].pdf | 2022-02-23 |
| 8 | 202221009702-COMPLETE SPECIFICATION [23-02-2022(online)].pdf | 2022-02-23 |
| 9 | 202221009702-Proof of Right [21-04-2022(online)].pdf | 2022-04-21 |
| 10 | 202221009702-FORM-26 [21-04-2022(online)].pdf | 2022-04-21 |
| 11 | 202221009702-FER.pdf | 2025-03-04 |
| 12 | 202221009702-OTHERS [05-08-2025(online)].pdf | 2025-08-05 |
| 13 | 202221009702-FER_SER_REPLY [05-08-2025(online)].pdf | 2025-08-05 |
| 14 | 202221009702-DRAWING [05-08-2025(online)].pdf | 2025-08-05 |
| 15 | 202221009702-CLAIMS [05-08-2025(online)].pdf | 2025-08-05 |
| 16 | 202221009702-US(14)-HearingNotice-(HearingDate-08-12-2025).pdf | 2025-11-19 |
| 1 | SearchE_04-03-2024.pdf | 2024-03-04 |