
System And Method For Determining An Optimal And Cost Effective Vaccine Distribution Chain Network

Abstract: This disclosure relates to a system and method for determining an optimal and cost-effective vaccine distribution chain network. The method of the present disclosure addresses unresolved problems of optimizing vaccine distribution during mass vaccinations. Embodiments of the present disclosure utilize a smart framework that is capable of learning and predicting the distribution chain network of vaccines optimally by leveraging supervised deep learning and reinforcement learning. More specifically, the present disclosure describes a deep neural architecture design that predicts state-wise cost-efficient optimal vaccine allocations as per daily vaccination demands during a mass vaccination program, using an RNN-based vaccine demand prediction model and a reinforcement learning-based cost optimization technique for the cold chain network. [To be published with FIG. 2]


Patent Information

Application #

Filing Date

16 November 2021

Publication Number

20/2023

Publication Type

INA

Invention Field

BIOTECHNOLOGY

Status

kcopatents@khaitanco.com

Parent Application

Patent Number

Legal Status

Grant Date

2024-06-20

Renewal Date

Applicants

Tata Consultancy Services Limited

Nirmal Building, 9th Floor, Nariman Point Mumbai Maharashtra India 400021

Inventors

1. MONDAL, Jayeeta

Tata Consultancy Services Limited Block -1B, Eco Space, Plot No. IIF/12 (Old No. AA-II/BLK 3. I.T) Street 59 M. WIDE (R.O.W.) Road, New Town, Rajarhat, P.S. Rajarhat, Dist - N. 24 Parganas, Kolkata West Bengal India 700160

2. DUTTA, Jeet

3. BARUA, Hrishav Bakul

Claims

1. A processor implemented method, comprising: receiving (202), via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site; preprocessing (204), via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors; training (206), via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors; predicting (208), via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model; inputting (210), via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique; determining (212), via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and performing (214), via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.

2. The processor implemented method of claim 1, wherein the daily vaccination data at the vaccination site comprises location wise population, total number of vaccinations, number of partially vaccinated persons and number of fully vaccinated persons.

3. The processor implemented method of claim 1, wherein the cost matrix is determined using a storage cost and a transportation cost associated with the vaccine distribution.

4. The processor implemented method of claim 1, wherein the state space is indicative of cost optimized daily vaccinations at the specific location.

5. The processor implemented method of claim 1, wherein the optimal vaccine distribution is further scaled to a granular level of locations.

6. A system (100), comprising: a memory (102) storing instructions; one or more communication interfaces (106); and one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to: receive, via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site; preprocess, via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors; train, via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors; predict, via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model; input, via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique; determine, via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and perform, via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.

7. The system of claim 6, wherein the one or more control parameters include an optimized mean value of a subset of the velocities corresponding to an optimized set of predicted optical flow losses that are used to adaptively update one or more parameters of a multivariate gaussian distribution until the predicted flow loss reaches the pre-defined threshold.

8. The system of claim 6, wherein the cost matrix is determined using a storage cost and a transportation cost associated with the vaccine distribution.

9. The system of claim 6, wherein the state space is indicative of cost optimized daily vaccinations at the specific location.

10. The system of claim 6, wherein the optimal vaccine distribution is further scaled to a granular level of locations.

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
SYSTEM AND METHOD FOR DETERMINING AN OPTIMAL AND COST-EFFECTIVE VACCINE DISTRIBUTION CHAIN NETWORK
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.

TECHNICAL FIELD
[001] The disclosure herein generally relates to the field of vaccine distribution, and, more particularly, to a system and method for determining an optimal and cost-effective vaccine distribution chain network.
BACKGROUND
[002] Infectious diseases and pandemic outbreaks have impacted world history by shaping societies, changing war outcomes, influencing socio-economic policies and political standpoints, and overall paving the way for innovation in medicine and technology. Vaccinations against infectious diseases, especially in an epidemic or a pandemic, are paramount in any public health policymaker’s strategy. However, it is a great challenge to distribute vaccines efficiently and on time to all corners of a country, especially during a pandemic. Considering the vastness of the population, diversified communities, and the demands of a smart society, the objective is to optimize vaccine distribution in any country or state such that the cost incurred is minimized. A few conventional approaches evaluated selective objective functions to estimate vaccination coverage and effectiveness. However, these approaches fail to optimize the cost of vaccine distribution.
SUMMARY
[003] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a processor implemented method is provided. The method comprises: receiving, via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site; preprocessing, via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors; training, via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors; predicting, via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model; inputting, via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique; determining, via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and performing, via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.
[004] In another aspect, a system is provided. The system comprises a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: receive, via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site; preprocess, via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors; train, via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors; predict, via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model; input, via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique; determine, via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and perform, via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.
[005] In yet another aspect, a non-transitory computer readable medium is provided. The non-transitory computer readable medium, comprising: receiving, via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site; preprocessing, via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors; training, via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors; predicting, via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model; inputting, via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique; determining, via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and performing, via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.
[006] In accordance with an embodiment of the present disclosure, the one or more control parameters include an optimized mean value of a subset of the velocities corresponding to an optimized set of predicted optical flow losses that are used to adaptively update one or more parameters of a multivariate gaussian distribution until the predicted flow loss reaches the pre-defined threshold.
[007] In accordance with an embodiment of the present disclosure, the cost matrix is determined using a storage cost and a transportation cost associated with the vaccine distribution.
[008] In accordance with an embodiment of the present disclosure, the state space is indicative of cost optimized daily vaccinations at the specific location.

[009] In accordance with an embodiment of the present disclosure, the optimal vaccine distribution is further scaled to a granular level of locations.
[010] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[011] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[012] FIG. 1 illustrates an exemplary system for determining an optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure.
[013] FIG. 2 is a model architecture for determining the optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure.
[014] FIG. 3 depicts an exemplary flow diagram illustrating a method for determining the optimal and cost-effective vaccine distribution chain network in accordance with some embodiments of the present disclosure.
[015] FIG. 4 is a detailed model architecture for determining the optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure.
[016] FIG. 5 depicts a Pearson’s correlation matrix plot for one or more attributes in an input dataset for determining the optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure.
[017] FIG. 6 depicts the Pearson’s correlation matrix plot for a final set of inputs provided to an SRU predictor model described in the detailed model architecture for determining the optimal and cost-effective vaccine distribution chain network, in accordance with an embodiment of the present disclosure.

[018] FIG. 7 depicts a graphical representation illustrating train-validation curves for the SRU predictor model with and without attention mechanism in accordance with some embodiments of the present disclosure.
[019] FIG. 8 depicts a graphical representation illustrating a precision-recall (PR) curve of the SRU predictor model with and without attention mechanism in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[020] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following embodiments described herein.
[021] Vaccine distribution primarily depends on vaccine coverage, release time, and deployment methods. A few vaccines need expensive Controlled Temperature Chain (CTC) handling for thermostability and hence need accurately optimized procurement, distribution, and storage management strategies to help curb a global pandemic and economic crisis. Conventionally, parameters such as occupation based infection risks and age based fatality risks of the general population are employed and examined to model an optimal vaccine allocation strategy. A few conventional approaches evaluated selective objective functions to estimate vaccination coverage and effectiveness. However, these approaches fail to optimize the cost of vaccine distribution.
[022] There exist several methods that discuss adoption of optimization policies by governments and health institutions in a small span of time, prioritizing vaccine allocation at an early stage of a pandemic (prioritized vaccination is referred to as phase-1 of a vaccination drive). However, there exists a gap in developing economically optimal vaccine distribution strategies during mass vaccination (the mass vaccination is referred to as phase-2 of the vaccination drive). For example, in the case of Covid-19 vaccination, doses were made available to the most vulnerable population based on age and occupation in phase-1. A very recent conventional approach (e.g., refer to Raghav Awasthi, Keerat Guliani, Arshita Bhatt, Mehrab Gill, Aditya Nagori, Ponnurangam Kumaraguru, and Tavpritesh Sethi. 2020. "VacSIM: Learning Effective Strategies for COVID-19 Vaccine Distribution using Reinforcement Learning." (09 2020)) employs a concatenation of Reinforcement Learning (RL) and Contextual Bandits sub-models in feed-forward for phase-1 Covid-19 vaccination distribution. This conventional approach utilizes attributes of a state such as death-rate, recovery-rate, hospital facilities, and/or the like as input attributes to predict which part of the population is at a higher risk of pandemic fatalities and therefore needs priority vaccinations. This narrows down the scope of performance of the conventional approach in real world scenarios of mass vaccination, where a dense mismanaged cold chain distribution network can incur unnecessarily large expenditures. For example, for delivering 1 billion Covid-19 vaccines in a country like the USA, nearly 2 billion US dollars were spent and about 22% of the expenditure was associated with Controlled Temperature Chain (CTC) transportation. Even in front-line health centres, there is a cost for refrigerated storage of the supplied vaccines. Thus, an uncoordinated vaccine distribution may lead to global GDP loss.
[023] The present disclosure addresses unresolved problems of optimizing vaccine distribution during mass vaccinations. Embodiments of the present disclosure provide a system and method for determining the optimal and cost-effective vaccine distribution chain network, which utilizes a smart framework that is capable of learning and predicting the distribution chain network of vaccines optimally by leveraging supervised deep learning and reinforcement learning. More specifically, the present disclosure describes the following:
1. A deep neural architecture design that predicts state-wise cost-efficient optimal vaccine allocations as per daily vaccination demands during a mass vaccination program.
2. A reinforcement learning (RL)-based cost optimization technique for cold chain network.
3. Design of an RNN-based vaccine demand prediction model.
[024] Referring now to the drawings, and more particularly to FIGS. 1 through 8, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[025] FIG. 1 illustrates an exemplary system 100 for determining an optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more hardware processors 104, communication interface device(s) or input/output (I/O) interface(s) 106 (also referred as interface(s)), and one or more data storage devices or memory 102 operatively coupled to the one or more hardware processors 104. The one or more processors 104 may be one or more software processing components and/or hardware processors. In an embodiment, the hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) is/are configured to fetch and execute computer-readable instructions stored in the memory. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
[026] The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.

[027] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, a database 108 is comprised in the memory 102, wherein the database 108 comprises one or more learning models such as supervised deep learning models and reinforcement learning model which when invoked and executed perform corresponding steps/actions as per the requirement by the system 100 to perform the methodologies described herein. The database 108 further comprises input and output attributes and other relevant information associated with supervised deep learning models and reinforcement learning model. The database 108 further stores the plurality of input time series data, preprocessed data, ground truth data, sequence of input feature vectors, output of the supervised deep learning models such as recurrent neural network based models, and reinforcement learning model, and optimized reward parameter.
[028] The memory 102 further comprises (or may further comprise) information pertaining to input(s)/output(s) of each step performed by the systems and methods of the present disclosure. In other words, input(s) fed at each step and output(s) generated at each step are comprised in the memory 102 and can be utilized in further processing and analysis.
[029] FIG. 2, with reference to FIG. 1, is a model architecture as implemented by the system 100 of FIG. 1 for determining the optimal and cost-effective vaccine distribution chain network, in accordance with an embodiment of the present disclosure.
[030] FIG. 3, with reference to FIGS. 1-2, depicts an exemplary flow chart illustrating a method 200 for determining the optimal and cost-effective vaccine distribution chain network, using the system 100 of FIG. 1, in accordance with an embodiment of the present disclosure.
[031] Referring to FIG. 3, in an embodiment, the system(s) 100 comprises one or more data storage devices or the memory 102 operatively coupled to the one or more hardware processors 104 and is configured to store instructions for execution of steps of the method by the one or more processors 104. The steps of the method 200 of the present disclosure will now be explained with reference to components of the system 100 of FIG. 1, the block diagram of FIG. 2, the flow diagram as depicted in FIG. 3 and the detailed model architecture of FIG. 4. In an embodiment, at step 202 of the present disclosure, the one or more hardware processors 104 are configured to receive a plurality of input time series data indicative of daily vaccination data at a vaccination site. As depicted in the block diagram of FIG. 2, the raw data (e.g., refer to the environment block in the first row of the block diagram) indicates the plurality of input time series data. In an embodiment, the environment represents the daily vaccination data at the vaccination monitoring sites of various states of a country. In an embodiment, the daily vaccination data at the vaccination site comprises location wise population, total number of vaccinations, number of partially vaccinated persons and number of fully vaccinated persons.
[032] In an embodiment, at step 204 of the present disclosure, the one or more hardware processors 104 are configured to preprocess the plurality of input time series data to obtain a sequence of input feature vectors. As depicted in the block diagram of FIG. 2, the plurality of input time series data (e.g., refer to the dimensionality reduction and feature engineering block that is in the first row of the block diagram) is preprocessed to remove redundant data and select dominant features suitable for training models used in the present disclosure. In an embodiment, the plurality of input time series data is preprocessed using a dimensionality reduction technique and a feature engineering technique. In an embodiment, the dimensionality reduction techniques may include, but are not limited to, missing value ratio, high correlation filter, and/or the like. In an embodiment, the feature engineering technique is used to increase the importance of the input features in the final prediction, making them suitable for further processing.
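The two dimensionality reduction techniques named above can be illustrated with a minimal sketch. It is not the implementation of the present disclosure: the thresholds `max_missing` and `max_corr`, the helper names, and the columns `people_vaccinated_per_hundred` and `booster_doses` are assumptions introduced only for illustration, and the sample values are made up.

```python
import math

def missing_value_ratio(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)

def pearson(xs, ys):
    """Pearson correlation over the positions where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x, _ in pairs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for _, y in pairs))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def select_features(table, max_missing=0.3, max_corr=0.95):
    """Drop sparse columns, then drop columns that duplicate one already kept."""
    kept = []
    for name, column in table.items():
        if missing_value_ratio(column) > max_missing:
            continue  # missing value ratio filter
        if any(abs(pearson(column, table[k])) > max_corr for k in kept):
            continue  # high correlation filter
        kept.append(name)
    return kept

# Made-up daily values for one state (illustration only).
table = {
    "people_partially_vaccinated": [10, 14, 19, 22, 30],
    "people_vaccinated_per_hundred": [1.0, 1.4, 1.9, 2.2, 3.0],  # hypothetical column
    "people_fully_vaccinated": [1, 5, 4, 9, 6],
    "booster_doses": [None, None, None, 2, 3],                   # hypothetical column
}
selected = select_features(table)
print(selected)
```

In this toy table the per-hundred column is dropped because it is a rescaled copy of the partially vaccinated count, and the mostly empty column is dropped by the missing value ratio filter.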
[033] In an embodiment, at step 206 of the present disclosure, the one or more hardware processors 104 are configured to train a supervised learning based predictor model using a ground truth and the sequence of input feature vectors. As depicted in the block diagram of FIG. 2, a recurrent neural network (RNN)-based predictor model (e.g., refer to the RNN-based predictor model and supervised learning block that are in third row and fourth row respectively of the block diagram) is used as the supervised learning based predictor model and trained using the ground truth and the sequence of input feature vectors. In an embodiment, at step 208, the one or more hardware processors 104 are configured to predict one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model.
[034] Steps 202 through 208 are better understood by way of the following description provided as exemplary explanation.
[035] In the system of the present disclosure, the environment represents the daily vaccination data at the vaccination monitoring sites of various states of a country. Further, a preprocessing step on raw numerical data is performed to derive suitable input feature attributes for training the recurrent neural network (RNN)-based predictor model. In an embodiment, the aim of the RNN-based predictor model is to find a daily vaccination requirement of a state based on the state’s population and the total number of people vaccinated till a stipulated date. In the present disclosure, the learning models are chosen such that the system of the present disclosure becomes efficient in terms of computational load and training time. Further, manual feature engineering requirements are minimized in the present disclosure. In an embodiment, input data for the RNN-based predictor model is in the form of a time-series, gathered daily at vaccine administration points. In an embodiment, a deep Simple Recurrent Unit (SRU) is used as the RNN-based predictor model. The SRU is a recent variant of RNNs that has advantages over other known recurrent neural networks, such as Quasi RNNs (QRNNs) and Kernel Neural Networks, owing to its "light recurrence" connections with fewer parameters. "Highway connections" in the SRUs give them an edge over Long Short Term Memories (LSTMs) and Gated Recurrent Units (GRUs) with improved training and inference time. In the present disclosure, Deep Q networks (DQNs) are used to optimize the distribution chain using Deep Q-learning.
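The time-series form of the predictor input described above can be sketched as a sliding window over one state's daily records. This is an illustrative assumption rather than the disclosed preprocessing: the helper `make_sequences`, the window length, the choice of the last day's `daily_vaccinations` as the training target, and the sample values are all hypothetical.

```python
def make_sequences(rows, feature_names, target_name, window):
    """Turn one state's daily rows into (feature sequence, target) pairs."""
    samples = []
    for end in range(window, len(rows) + 1):
        history = rows[end - window:end]
        # One feature vector per day, in date order.
        features = [[day[name] for name in feature_names] for day in history]
        # Assumed target: the demand recorded on the last day of the window.
        target = history[-1][target_name]
        samples.append((features, target))
    return samples

FEATURES = ["total_population", "people_partially_vaccinated", "people_fully_vaccinated"]

# Made-up records for four consecutive dates in one state.
rows = [
    {"total_population": 1000, "people_partially_vaccinated": 10,
     "people_fully_vaccinated": 0, "daily_vaccinations": 10},
    {"total_population": 1000, "people_partially_vaccinated": 25,
     "people_fully_vaccinated": 5, "daily_vaccinations": 20},
    {"total_population": 1000, "people_partially_vaccinated": 40,
     "people_fully_vaccinated": 15, "daily_vaccinations": 25},
    {"total_population": 1000, "people_partially_vaccinated": 60,
     "people_fully_vaccinated": 30, "daily_vaccinations": 35},
]
samples = make_sequences(rows, FEATURES, "daily_vaccinations", window=3)
print(len(samples), samples[0][1], samples[1][1])
```

Each sample pairs a window of consecutive daily feature vectors with one scalar target, which is the shape a recurrent predictor trained with a regression loss would consume.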

[036] FIG. 4, with reference to FIGS. 1-3, is the detailed model architecture for determining the optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure.
1. Model architecture details: As shown in FIG. 4, the deep SRU predictor model consists of two SRU layers with 10 units each, an attention block, and a single unit dense layer. Inputs to the SRU units are processed feature attributes such as total_population (state population), people_partially_vaccinated and people_fully_vaccinated. Further, an attention mechanism is leveraged in obtaining the final output, which indicates the predicted daily state-wise vaccine demand. Connections inside an SRU unit and their details are shown in FIG. 4.
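2. Mathematical formulations: The data provided as input to the deep SRU predictor model represents a sequence of features, $x$, over a set of contiguous dates per state and represented as $x = \{x_0, x_1, x_2, \ldots, x_T\}$, where $x_t$ is a set of features for the timestep/date $t$. $T$ represents an upper bound to the sequence. The SRUs are formulated as follows:

$$f_t = \sigma(W_f x_t + v_f \odot c_{t-1} + b_f) \tag{1}$$
$$c_t = f_t \odot c_{t-1} + (1 - f_t) \odot (W x_t) \tag{2}$$
$$r_t = \sigma(W_r x_t + v_r \odot c_{t-1} + b_r) \tag{3}$$
$$h_t = r_t \odot c_t + (1 - r_t) \odot x_t \tag{4}$$

where $\sigma$ is the sigmoid function, $\odot$ denotes point-wise multiplication, and $W$, $W_f$, $W_r$, $v_f$, $v_r$, $b_f$ and $b_r$ are learned parameters. The output of the SRU layer is given by $h_t$ in equation 4, and $f_t$ represents a light recurrence computation in equation 1 which is highly parallelizable due to the point-wise multiplication of $v_f$ and $c_{t-1}$. $r_t$ in equation 3 represents a highway connection and $c_t$ in equation 2 represents the state of an SRU cell. Coupled with the attention mechanism, the deep SRU predictor model is trained using a mean-squared error loss against the daily_vaccinations of an area, denoted by $y$.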
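The recurrence in equations 1 to 4 can be traced with a small pure-Python sketch of one SRU layer. It assumes the standard published SRU formulation, with input and hidden sizes equal so that the highway term can reuse `x_t` directly; the parameter dictionary layout and the function names are illustrative, and the attention block, the dense layer and training are not shown.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def sru_step(x_t, c_prev, p):
    """One SRU timestep; returns the output h_t and the new cell state c_t."""
    wx = matvec(p["W"], x_t)
    wfx = matvec(p["W_f"], x_t)
    wrx = matvec(p["W_r"], x_t)
    h_t, c_t = [], []
    for i in range(len(x_t)):
        f = sigmoid(wfx[i] + p["v_f"][i] * c_prev[i] + p["b_f"][i])  # eq. 1: forget gate
        c = f * c_prev[i] + (1.0 - f) * wx[i]                        # eq. 2: cell state
        r = sigmoid(wrx[i] + p["v_r"][i] * c_prev[i] + p["b_r"][i])  # eq. 3: highway gate
        h = r * c + (1.0 - r) * x_t[i]                               # eq. 4: output
        c_t.append(c)
        h_t.append(h)
    return h_t, c_t

def sru_layer(xs, p):
    """Run the recurrence over a whole sequence, starting from a zero state."""
    c = [0.0] * len(xs[0])
    outputs = []
    for x_t in xs:
        h, c = sru_step(x_t, c, p)
        outputs.append(h)
    return outputs

def zero_params(d):
    """All-zero parameters, handy as a sanity check (both gates become 0.5)."""
    zeros = [[0.0] * d for _ in range(d)]
    return {"W": zeros, "W_f": zeros, "W_r": zeros,
            "v_f": [0.0] * d, "v_r": [0.0] * d,
            "b_f": [0.0] * d, "b_r": [0.0] * d}

sequence = [[0.2, 0.4, 0.6], [0.1, 0.3, 0.5]]  # made-up, already scaled features
outputs = sru_layer(sequence, zero_params(3))
print(outputs)
```

With all-zero parameters both gates equal 0.5 and the cell state stays at zero, so each output is exactly half of its input; this gives a quick check that the highway connection passes `x_t` through as equation 4 describes.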
[037] Referring to steps of FIG. 3 of the present disclosure, at step 210, the one or more hardware processors 104 are configured to input the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique. As depicted in the block diagram of FIG. 2, the cost matrix (e.g., refer to the cost matrix block that is in the first column and the fourth row of the block diagram) is determined using a storage cost and a transportation cost associated with the vaccine distribution. Further, as depicted in the block diagram of FIG. 2, the state space is constructed for the auto regressive agent (e.g., refer to the RL-agent block that is in the first column and the fourth row of the block diagram) implementing the reinforcement learning-based cost optimization technique (e.g., refer to the reinforcement learning state space block that is in the first column and the fourth row of the block diagram). In an embodiment, the state space is indicative of cost optimized daily vaccinations at the specific location. In an embodiment, at step 212 of the present disclosure, the one or more hardware processors 104 are configured to determine an optimized reward parameter using the reinforcement learning (RL)-based cost optimization technique. The optimized reward parameter is determined based on a time series data from the state space and an action taken by the auto regressive agent.
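As a rough illustration of step 210, the sketch below combines a predicted demand per location with a storage cost and a transportation cost into one state entry per location. The layout of the cost matrix (one `[storage, transport]` row per location), the helper names, the location labels and all numbers are assumptions made for this example; the disclosure does not fix these details.

```python
def build_cost_matrix(storage_cost, transport_cost):
    """One row per location: [storage cost per dose, transport cost per dose]."""
    return {loc: [storage_cost[loc], transport_cost[loc]] for loc in storage_cost}

def build_state(predicted_demand, cost_matrix):
    """Join the predictor output with the cost row of the same location."""
    return {loc: (predicted_demand[loc], *cost_matrix[loc]) for loc in predicted_demand}

# Made-up inputs for two hypothetical locations.
predicted_demand = {"state_A": 1200.0, "state_B": 800.0}  # predictor output
storage_cost = {"state_A": 0.5, "state_B": 0.7}           # assumed cost per dose
transport_cost = {"state_A": 1.5, "state_B": 2.0}         # assumed cost per dose

cost_matrix = build_cost_matrix(storage_cost, transport_cost)
state = build_state(predicted_demand, cost_matrix)
print(state)
```

Keeping demand and both cost terms in the same tuple means an agent reading the state sees, for each location, what is needed and what it costs to store and ship it.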
[038] Steps 210 and 212 can be better understood by way of the following description, provided as an exemplary explanation.
[039] In the present disclosure, the output of the recurrent neural network (RNN)-based predictor model, along with the transportation and storage cost data from the environment, constitutes the state space for the RL-based optimization technique. Further, the aim of the auto-regressive agent (alternatively referred to as the RL agent) is to learn to predict state-wise vaccine allocations, where the reward parameter is calculated based on supplied time series data from the state space and the action taken by the RL agent. In an embodiment, the state space, denoted by s, is constructed for RL optimization using the output from the SRU and cost data from the environment, which includes the storage cost and the transportation cost, as shown below in equation 5:
s = {daily_vaccinations, Cstore, Cdist} (5)
where Cstore is the storage cost and Cdist is the distribution transport cost (alternatively referred to as the transportation cost in the present disclosure). Further, for a given state space s, the autoregressive agent follows a policy π with which it produces an action a. From this action, the RL agent receives a reward (referred to as the reward parameter in the present disclosure) given by Qπ(s, a). A reinforcement learning (RL)-based model tries to maximize the reward parameter of the RL agent by optimizing its action space, thereafter represented by Q*(s, a), which is defined as follows:
Q*(s, a) = E[r + γ max_a′ Q*(s′, a′) | s, a] (6)
where r is the immediate reward parameter and γ is the discount factor. s′ and a′ represent the subsequent state and action spaces for the reinforcement learning (RL)-based model to select, following an optimal policy to maximize the reward parameters.
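As an illustration of how an optimal action value of this kind is commonly approximated, the following sketch applies a tabular Q-learning update, in which the target is the immediate reward plus the discounted best value of the next state. The table layout, the state and action labels, and the learning rate are assumptions made for the example and are not taken from the disclosure.

```python
def bellman_target(reward, gamma, next_q_values):
    """Immediate reward plus the discounted best next-state value."""
    return reward + gamma * max(next_q_values)

def q_update(q, state, action, reward, next_state, gamma=0.9, lr=0.5):
    """Move Q(state, action) toward the Bellman target; returns the new value.

    q is a dict of dicts: q[state][action] -> value (hypothetical layout).
    """
    target = bellman_target(reward, gamma, q[next_state].values())
    q[state][action] += lr * (target - q[state][action])
    return q[state][action]

# Toy table with two states and two hypothetical allocation actions.
q_table = {
    "s0": {"low": 0.0, "high": 0.0},
    "s1": {"low": 1.0, "high": 2.0},
}
new_value = q_update(q_table, "s0", "high", reward=1.0,
                     next_state="s1", gamma=0.5, lr=1.0)
```

With a learning rate of 1.0 the stored value is replaced by the target outright; smaller learning rates average the target over repeated visits.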
[040] In an embodiment, at step 214 of the present disclosure, the one or more hardware processors 104 are configured to perform an optimal vaccine distribution to the specific region based on the optimized reward parameter. In an embodiment, based on the output of the supervised learning based model that predicts location-wise vaccine demand, the RL agent varies the vaccine allocation and distribution for that particular location. Depending on the vaccine demand and the cost of transport/storage, the best policy for a single location is reached through policy gradient based learning, by reinforcing the appropriate actions (i.e., vaccine allocation and distribution) using positive and negative rewards from the distribution environment. In an embodiment, the optimal vaccine distribution is further scaled to a granular level of locations. The granular level of locations may include, but is not limited to, a locality level, a city level, a state, a country, a subcontinent, and a world level.
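The policy gradient idea described above can be sketched for a single location as follows. This is a simplified illustration under stated assumptions: the candidate allocation levels, the reward function (negative of a per-dose cost plus a shortage penalty), and the hyperparameters are all hypothetical, and the sketch uses the exact gradient of the expected reward for a softmax policy rather than sampled episodes, so that the result is deterministic.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward(allocation, demand, unit_cost=1.0, shortage_penalty=5.0):
    """Hypothetical reward: negative of handling cost plus unmet-demand penalty."""
    shortage = max(0, demand - allocation)
    return -(unit_cost * allocation + shortage_penalty * shortage)

def train_policy(actions, demand, steps=500, lr=0.1):
    """Gradient ascent on expected reward of a softmax policy over allocations."""
    logits = [0.0] * len(actions)
    rewards = [reward(a, demand) for a in actions]
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, rewards))
        # d/d(logit_k) of expected reward = p_k * (r_k - expected reward):
        # actions better than average are reinforced, worse ones suppressed.
        logits = [z + lr * p * (r - baseline)
                  for z, p, r in zip(logits, probs, rewards)]
    return softmax(logits)

actions = [0, 50, 100, 150]          # candidate daily allocations (assumed)
policy = train_policy(actions, demand=100)
best_allocation = actions[policy.index(max(policy))]
```

Under this assumed reward, over-allocation is penalized through cost and under-allocation through the shortage penalty, so the policy concentrates on the allocation that matches the predicted demand.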
[041] The entire approach/method of the present disclosure can be further understood by way of the following pseudo code, provided as an example:
Data: state-wise vaccination data, transportation and storage cost
Result: distribution_chain
D ← {state-wise vaccination data};
daily_vaccinations ← {ϕ};
for state, input_features in D do
    daily_vaccinations[state] ← SRUPredictor(input_features);
end
state_space ← {daily_vaccinations, transport cost, storage cost};
distribution_chain ← RL_Agent(state_space);
Experimental Results:
[042] In the present disclosure, the data processing techniques implemented to carefully select suitable input features for the deep SRU predictor model (hereafter referred to as the SRU predictor model throughout the description) and preliminary results of training the SRU predictor model with and without the attention mechanism are discussed.
Data Analysis and Preparation
[043] The data used in the method of the present disclosure consists of state-wise Covid-19 vaccine administration details of the USA, with date stamps from 1st January to 9th August 2021. There are 240 sets of data points for each state. The present disclosure performs an analysis of Pearson’s correlation matrix plots for one or more attributes in the input dataset. The one or more attributes considered in the present disclosure are total_vaccinations, total_distributed, people_vaccinated, people_fully_vaccinated, and daily_vaccinations, since they have higher correlation coefficients. Further, the population of each state is determined from the official census of the USA. In an embodiment, from the attributes total_vaccinations and people_fully_vaccinated, a new feature, people_partially_vaccinated, is calculated. The attributes total_vaccinations and people_fully_vaccinated represent the total number of Covid-19 vaccinations administered and the total number of fully vaccinated people (first and second dose), respectively, in a particular state till a specific date. The formula to derive the new feature is given by Equation 7 below:
NS = total_vaccinations - 2 × ND (7)
Here, NS and ND represent the number of people vaccinated with a single dose and with both doses, respectively. FIG. 5 depicts the Pearson’s correlation matrix plot for the one or more attributes in the input dataset for determining the optimal and cost-effective vaccine distribution chain network according to some embodiments of the present disclosure. As shown in FIG. 5, the mapping of the values on the x and y axes is as follows:
0 → total_vaccinations
1 → total_distributed
2 → people_vaccinated
3 → people_fully_vaccinated_per_hundred
4 → total_vaccinations_per_hundred
5 → people_fully_vaccinated
6 → people_vaccinated_per_hundred
7 → distributed_per_hundred
8 → daily_vaccinations
9 → daily_vaccinations_per_million.
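The derived feature of Equation 7 and the Pearson correlation analysis described above can be sketched as follows. The function names and the dictionary-of-columns layout are assumptions for illustration; the disclosure does not state which tooling was used to produce the correlation plots.

```python
import math

def people_partially_vaccinated(total_vaccinations, people_fully_vaccinated):
    """Equation 7: NS = total_vaccinations - 2 * ND."""
    return total_vaccinations - 2 * people_fully_vaccinated

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict mapping attribute name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Toy values only; not data from the disclosure.
toy = {
    "total_vaccinations": [100, 220, 360, 500],
    "people_fully_vaccinated": [10, 40, 90, 150],
}
toy["people_partially_vaccinated"] = [
    people_partially_vaccinated(t, f)
    for t, f in zip(toy["total_vaccinations"], toy["people_fully_vaccinated"])
]
matrix = correlation_matrix(toy)
```

Attributes whose coefficient against daily_vaccinations is high would be the candidates retained as inputs, in line with the selection described above.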
FIG. 6 depicts the Pearson’s correlation matrix plot for the final set of inputs provided to the SRU predictor model for determining the optimal and cost-effective vaccine distribution chain network, in accordance with an embodiment of the present disclosure. As shown in FIG. 6, the x and y axes values mapping are as follows:
0 → total_population
1 → people_partially_vaccinated
2 → people_fully_vaccinated
Hence, the input to the SRU predictor model is given by the sequence of a set of 3 features, namely total_population, people_partially_vaccinated, and people_fully_vaccinated, from a first date to a selected vaccination date. The data is split into train, test, and validation sets with a split ratio of 80%, 10%, and 10%, respectively. The ground truth for training and evaluating the SRU predictor model is given by the daily_vaccinations target attribute values from the input dataset.
Model training and evaluation
[044] The present disclosure requires model training and reports performance evaluation of the trained model. The SRU encoder shown in FIG. 4 does not require more than 10 dimensionally expanded features to learn better representations. Hence, for simplicity of the detailed model architecture, a limit of 10 is imposed on the number of SRU units per layer. The attention mechanism aids in constructing fixed size context vectors from the encoded representations. The context vector improves the decoding step performed by the single dense/linear layer. FIG. 7 depicts a graphical representation illustrating train-validation curves for the SRU predictor model with and without the attention mechanism in accordance with some embodiments of the present disclosure. It is observed from FIG. 7 that the SRU predictor model without the attention network starts at a higher loss and that the SRU predictor model with the attention mechanism converges 2.9x faster than the SRU predictor model without the attention network. Thus, without the attention mechanism, the SRU predictor model takes a much longer time to converge to lower values, as analyzed from FIG. 7. Further, without the attention mechanism, the input sequences are required to be fixed to 240 × 3 (i.e., 240 timesteps for each of the 3 features) so that the input to the dense layer is always of size 240 × 10 = 2400 (10 feature outputs per SRU layer). The sequences are also required to be padded. As a result of padding, an excess of zeros is included, which contributes to poor learning; this is solved using attention. Padding of the sequences also enables variable length inputs. FIG. 8 depicts a graphical representation illustrating a precision-recall (PR) curve of the SRU predictor model with and without the attention mechanism in accordance with some embodiments of the present disclosure. As shown in FIG. 8, the precision-recall (PR) curve of the SRU predictor model with the attention mechanism has a better Area Under Curve (AUC) and F1-score (F1) than that without attention on the test dataset.
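The construction of a fixed size context vector from encoded representations can be sketched as below. The scoring function used here, a dot product between each encoded vector and a learned query vector, is an assumption for illustration; the disclosure does not specify the form of the attention mechanism.

```python
import math

def attention_context(encoded, query):
    """Collapse a variable-length list of encoded vectors into one context vector.

    encoded: list of per-timestep vectors (e.g. 10 SRU outputs per timestep).
    query:   hypothetical learned vector of the same width as each encoding.
    """
    # One scalar score per timestep.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in encoded]
    # Numerically stable softmax over timesteps.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum over timesteps: the width depends only on the encoding size.
    dim = len(query)
    context = [sum(w * h[d] for w, h in zip(weights, encoded))
               for d in range(dim)]
    return context, weights

short_seq = [[1.0, 3.0], [3.0, 5.0]]
long_seq = [[1.0, 3.0], [3.0, 5.0], [0.0, 1.0], [2.0, 2.0]]
ctx_short, w_short = attention_context(short_seq, [0.0, 0.0])
ctx_long, w_long = attention_context(long_seq, [0.5, -0.5])
```

Because the context vector has the same width for a 2-step and a 4-step sequence, the dense layer that follows sees a fixed size input regardless of how many dates are supplied.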
[045] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined herein and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the present disclosure if they have similar elements that do not differ from the literal language of the embodiments or if they include equivalent elements with insubstantial differences from the literal language of the embodiments described herein.

[046] The embodiments of present disclosure address unresolved problems of optimizing vaccine distribution during mass vaccinations. The method of the present disclosure provides a cost optimized solution for the mass vaccination using supervised machine learning and reinforcement learning to mitigate a crisis of producing an optimized vaccine distribution chain during an adverse situation such as a pandemic situation.
[047] The method of the present disclosure is extendable/adaptable to a reinforcement learning (RL)-based training of a Deep Q Network, which can be accomplished by accessing state-wise data by optimizing vaccine transportation and storage cost, identifying a policy function for Q-learning, and identifying an RL-agent optimization strategy. The RL-agent optimization strategy may include, but is not limited to, Deep Deterministic Policy Gradients (DDPG). The present disclosure provides future prediction of vaccine demands in a region to help in strategizing the manufacturing and distribution of vaccines. In the present disclosure, a scalar value is obtained from the SRU predictor model for daily vaccine demand using previous data. However, the present disclosure is further extendable to prediction of a sequence of daily_vaccinations for a set of dates. The system of the present disclosure is scalable to the supply of other essential commodities, including food and medicines, during a crisis situation when finance management becomes critical. The present disclosure includes optimizing a denser vaccine distribution chain, given more granular data on the exact locations of medical centers, pharmacies, and vaccination drives and their respective vaccine delivery costs, storage costs, and nearby population density. Thus, the present disclosure provides optimized cost cutting and faster time to market for vaccines by manufacturing companies based on optimal transportation and storage strategies for a region.
[048] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[049] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[050] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[051] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[052] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

We Claim:
1. A processor implemented method, comprising:
receiving (202), via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site;
preprocessing (204), via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors;
training (206), via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors;
predicting (208), via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model;
inputting (210), via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique;
determining (212), via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and
performing (214), via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.
2. The processor implemented method of claim 1, wherein the daily vaccination data at the vaccination site comprises location wise population, total number of vaccinations, number of partially vaccinated persons and number of fully vaccinated persons.

3. The processor implemented method of claim 1, wherein the cost matrix is determined using a storage cost and a transportation cost associated with the vaccine distribution.
4. The processor implemented method of claim 1, wherein the state space is indicative of cost optimized daily vaccinations at the specific location.
5. The processor implemented method of claim 1, wherein the optimal vaccine distribution is further scaled to a granular level of locations.
6. A system (100), comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive, via one or more hardware processors, a plurality of input time series data indicative of daily vaccination data at a vaccination site;
preprocess, via the one or more hardware processors, the plurality of input time series data to obtain a sequence of input feature vectors;
train, via the one or more hardware processors, a supervised learning based predictor model using a ground truth and the sequence of input feature vectors;
predict, via the one or more hardware processors, one or more variables indicative of daily vaccination requirement in a specific region using the trained supervised learning based predictor model;
input, via the one or more hardware processors, the one or more variables and a cost matrix to construct a state space for an auto regressive agent implementing a reinforcement learning-based cost optimization technique;
determine, via the one or more hardware processors, an optimized reward parameter using the reinforcement learning-based cost optimization technique, wherein the optimized reward parameter is determined by the auto regressive agent based on a time series data from the state space and an action taken; and
perform, via the one or more hardware processors, an optimal vaccine distribution to the specific region based on the optimized reward parameter.
7. The system of claim 6, wherein the one or more control parameters include an optimized mean value of a subset of the velocities corresponding to an optimized set of predicted optical flow losses that are used to adaptively update one or more parameters of a multivariate gaussian distribution until the predicted flow loss reaches the pre-defined threshold.
8. The system of claim 6, wherein the cost matrix is determined using a storage cost and a transportation cost associated with the vaccine distribution.
9. The system of claim 6, wherein the state space is indicative of cost optimized daily vaccinations at the specific location.
10. The system of claim 6, wherein the optimal vaccine distribution is further scaled to a granular level of locations.

Documents

Application Documents

#	Name	Date
1	202121052641-STATEMENT OF UNDERTAKING (FORM 3) [16-11-2021(online)].pdf	2021-11-16
2	202121052641-REQUEST FOR EXAMINATION (FORM-18) [16-11-2021(online)].pdf	2021-11-16
3	202121052641-PROOF OF RIGHT [16-11-2021(online)].pdf	2021-11-16
4	202121052641-FORM 18 [16-11-2021(online)].pdf	2021-11-16
5	202121052641-FORM 1 [16-11-2021(online)].pdf	2021-11-16
6	202121052641-FIGURE OF ABSTRACT [16-11-2021(online)].jpg	2021-11-16
7	202121052641-DRAWINGS [16-11-2021(online)].pdf	2021-11-16
8	202121052641-DECLARATION OF INVENTORSHIP (FORM 5) [16-11-2021(online)].pdf	2021-11-16
9	202121052641-COMPLETE SPECIFICATION [16-11-2021(online)].pdf	2021-11-16
10	Abstract1.jpg	2022-02-14
11	202121052641-FORM-26 [20-04-2022(online)].pdf	2022-04-20
12	202121052641-FER.pdf	2023-11-20
13	202121052641-FER_SER_REPLY [16-04-2024(online)].pdf	2024-04-16
14	202121052641-CLAIMS [16-04-2024(online)].pdf	2024-04-16
15	202121052641-US(14)-HearingNotice-(HearingDate-24-05-2024).pdf	2024-05-02
16	202121052641-Correspondence to notify the Controller [17-05-2024(online)].pdf	2024-05-17
17	202121052641-Written submissions and relevant documents [05-06-2024(online)].pdf	2024-06-05
18	202121052641-Annexure [05-06-2024(online)].pdf	2024-06-05
19	202121052641-PatentCertificate20-06-2024.pdf	2024-06-20
20	202121052641-IntimationOfGrant20-06-2024.pdf	2024-06-20

Search Strategy

1	SearchHistory_202121052641E_17-11-2023.pdf