Equilibrium Solution For Periodic Double Auctions To Generate Optimal

< Back

Equilibrium Solution For Periodic Double Auctions To Generate Optimal Bidding Strategies For Energy Market

Abstract: ABSTRACT EQUILIBRIUM SOLUTION FOR PERIODIC DOUBLE AUCTIONS TO GENERATE OPTIMAL BIDDING STRATEGIES FOR ENERGY MARKET Periodic double auction (PDA) setting is where buyers of the auction have multiple (but finite) opportunities to procure multiple but fixed units of commodity. The goal of each buyer participating in such auctions is to reduce their cost of procurement by planning purchase across multiple rounds of the PDA. Formulating such optimal bidding strategies in multi-agent periodic double auction setting is a challenging problem as such strategies involve planning across current and future auctions. The method and system disclosed herein addresses such setup wherein the composite supply curve is known to all buyers. Specifically, for the complete information setting, the method models the PDA as Markov game and derives Markov perfect Nash equilibrium (MPNE) solution to devise an optimal bidding strategy for the case when each buyer is allowed to make one bid per round of the PDA. The efficacy of the Nash policies obtained is demonstrated with numerical experiments. [To be published with 1]

Get Free WhatsApp Updates!
Notices, Deadlines & Correspondence

Patent Information

Application #

Filing Date

08 September 2023

Publication Number

11/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

Tata Consultancy Services Limited

Nirmal Building, 9th floor, Nariman point, Mumbai 400021, Maharashtra, India

Inventors

1. SUBRAMANIAN, Easwara Naga

Tata Consultancy Services Limited, Plot No 1, Survey No. 64/2, Software Units Layout, Serilingampally Mandal, Madhapur, Hyderabad 500081, Telangana, India

2. MANVI, Bharat Basavaraj

Tata Consultancy Services Limited, Plot No 1, Survey No. 64/2, Software Units Layout, Serilingampally Mandal, Madhapur, Hyderabad 500081, Telangana, India

Specification

Description:FORM 2

THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003

COMPLETE SPECIFICATION
(See Section 10 and Rule 13)

Title of invention:
EQUILIBRIUM SOLUTION FOR PERIODIC DOUBLE AUCTIONS TO GENERATE OPTIMAL BIDDING STRATEGIES FOR ENERGY MARKET

Applicant
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India

Preamble to the description:

The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD

The embodiments herein generally relate to the field of bidding strategies in energy markets and, more particularly, to a method and system for equilibrium solution for Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market.

BACKGROUND

Auctions are mechanisms that facilitate buying and selling of goods between market participants. A double auction consists of multiple buyers and sellers submitting their asks and bids to a market institution in order to procure a target unit of a commodity. A bid or ask consists of a price quantity pair (??,??) indicating that the participant is willing to buy/sell ‘q’ units of the commodity at a unit price ??. The market institution matches the buy bids with the seller asks to determine the clearing price and cleared quantities for all sellers and buyers. These type of auctions are very prevalent in stock exchanges and energy markets. For example, in energy markets, power generating companies are the sellers while energy brokers servicing retail customers are the buyers, and an energy market regulator plays the role of the central market institution. Since the volume of trade is very high in such markets, it is prudent to design an optimal bidding strategy on behalf of a market participant to bring in profits and system efficiency to the ecosystem. The design of such optimal bidding strategies become more pronounced in a periodic double auction (PDA) setup, also referred to as multi-shot set up, wherein buyers and sellers participate in a (finite) sequence of auctions to exchange certain units of a commodity. For example, an energy broker, armed with an estimated energy requirement for a future time slot, participates in day-ahead auctions, to procure the required energy from power generating companies by competing with other energy brokers. In these auctions, the broker will have more than one opportunity to procure the estimated energy by participating in a sequence of auctions. Evidently, in this PDA setup, an optimal bidding strategy involves planning across current and future time auctions and any small improvement in the bidding strategies of the market participants can lead to improved profits and systemic efficiency.
Equilibrium solutions for double auctions have been studied extensively in the past. For example, the work of Satterthwaite and Williams proved the existence of non-trivial equilibria for k-double auctions. Analytical solutions for Nash equilibrium strategies for double auctions with average clearing price rule (ACPR) have been derived for the one buyer one seller, which is a single shot case for uniformly distributed valuations and scale based strategies. The Nash equilibrium in game theory is a situation in which a player will continue with their chosen strategy, having no incentive to deviate from it, after taking into consideration the opponent's strategy Thus, most of the existing approaches based on finding equilibrium solutions are developed for single shot (single agent) cases and not PDAs which is a multi-shot or multiagent scenario. It can be understood that sequential decision making in a multi-agent setting was required to study PDAs. In recent years, Markov game framework has been in use to devise bidding strategies in multi-shot or PDA auctions wherein approaches such as multiagent Q-learning, multi-agent deep Q networks, deep deterministic policy gradients are deployed. However, much of these works consider only single-side auctions, where it is assumed that only one side is presenting bids (buyers) or asks (suppliers). Thus, true equilibrium, that needs to consider match of demand and supply with multiple players remains unaddressed. Thus, the above approaches do not involve deriving analytical solutions for equilibrium strategies for PDAs. Periodic double auction (PDA) setting is where buyers of the auction have multiple (but finite) opportunities to procure multiple but fixed units of a commodity. The goal of each buyer participating in such auctions is to reduce their cost of procurement by planning their purchase across multiple rounds of the PDA. Formulating such optimal bidding strategies in a multi-agent PDA setting is a challenging problem as such strategies involve planning across current and future auctions.
Furthermore, there are few works that propose modelling double auctions as Markov game and determining Nash equilibrium using Reinforcement Learning (RL) approaches to recommend optimal trading strategies for resource allocation in edge server market. Such RL based solutions are numerical setups that normally provides an approximate solution and are not exact. Second, using a numerical approach for equilibrium, or Nash strategies, involves offline or online training. Offline training would mean that the current solution is based on some past data, and there remains a risk of the model not being updated to current observations. In online learning - the model is updated on the fly, which means trade-off between time versus accuracy. Thus, in numerical approaches, if focus is on quick solutions, accuracy is compromised, and focus is on decent approximation time is lost. Furthermore, there is no way to check if the solution to arrive at optimal strategy that is eventually used in the art is really an equilibrium solution or is a Nash equilibrium. Works in the art just use a solution (either based on convergence or based on some stopping criterion) and tend to loosely use the word Nash because of use of algorithm based on game theory and reinforcement. Convergence to true Nash equilibria is hardly verified.

SUMMARY

Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
For example, in one embodiment, a method for equilibrium solutions for multi-shot Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market is provided.
The method includes sourcing for a current round (h) from among a plurality of rounds (H) of an auction offered by a PDA before termination of the PDA, (i) a composite supply curve providing a set of asks that are ordered based on value of price-supply quantity pairs consolidated for a set of suppliers in an energy market, and (ii) an outstanding demand of energy of each of a set of buyers (N) participating in the PDA, wherein the PDA is modeled as a finite horizon Markov game (M) for the set of buyers (N) restricted with the plurality of rounds (H) for procuring the outstanding demand of energy, wherein finite horizon Markov game (M) is specified by a plurality of parameters comprising the plurality of buyers (N), a state space ?? defining distinct states of a buyer (b) among the plurality of buyers (N), an action set ???? for the buyer (b) at the current round (h) among the plurality of rounds (H) a cost function representing a cost of energy procurement at the current round (h) for the buyer (b), and the plurality of rounds (H).
Further, the method includes applying a joint policy (p_*^(b,h) (s)) for each of the set of buyers for the current round (h), wherein the joint policy is determined in accordance with the finite horizon Markov game (M) to generate an action for the buyer (b) defining an optimal bidding strategy with a price-demand quantity pair for procuring energy for the current round (h), wherein the joint policy minimizes the cost function at a Markov Perfect Nash Equilibrium (MPNE).
Obtaining the MPNE by applying the joint policy (p_*^(b,h) (s)) comprises: determining a position of the buyer (b) in a sorting list generated by sorting the outstanding demand of the set of buyers in increasing order; checking availability of sufficient rounds for remaining buyers except the buyer (b) to satiate the outstanding demand of the remaining buyers, wherein availability of the sufficient rounds is determined from the ordered set of asks obtained from the composite supply curve; applying a first policy ?(p?_1^(b,h)) if sufficient round are available, and a second policy ?(p?_2^(b,h)) if the sufficient round are unavailable.
The first policy comprising, if the sorting list determines that the buyer (b) has a maximum outstanding demand, and the sufficient rounds are available for the remaining buyers, (i) holding the buyer (b) until the outstanding demand of the remaining buyers is satiated, (ii) allowing the buyer (b) to buy the maximum outstanding demand at a price at which the outstanding demand of the remaining buyers gets cleared, and (iii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks.
The second policy comprising, if enough rounds are unavailable for the remaining buyers, then the buyer (b) determined as a first buyer from the sorting list whose outstanding demand when reduced from an overall demand of the set of buyers is cleared from the available supply, (i) allowing the buyer (b) to buy the maximum outstanding demand at the lowest price at which the outstanding demand of the buyer (b) gets cleared, and ii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks.
Further, the method includes computing an optimal bid comprising value of the price-demand quantity pair for the current round for each of the set of buyers in accordance with the joint policy. Furthermore, the method includes determining a market clearing price for trading of the energy by the set of buyers for the current round, wherein the market clearing price is determined using a market clearing price mechanism based on the composite supply curve and the optimal bid of each of the set of buyers comprising the price-demand quantity pair of each of the set of buyers. Computing the optimal bid for each of the set of buyers by applying the joint policy is repeated for each round until the plurality of rounds (H) are exhausted.
In another aspect, a system for equilibrium solutions for multi-shot Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market is provided. The system comprises a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to source for a current round (h) from among a plurality of rounds (H) of an auction offered by a PDA before termination of the PDA, (i) a composite supply curve providing a set of asks that are ordered based on value of price-supply quantity pairs consolidated for a set of suppliers in an energy market, and (ii) an outstanding demand of energy of each of a set of buyers (N) participating in the PDA, wherein the PDA is modeled as a finite horizon Markov game (M) for the set of buyers (N) restricted with the plurality of rounds (H) for procuring the outstanding demand of energy, wherein finite horizon Markov game (M) is specified by a plurality of parameters comprising the plurality of buyers (N), a state space ?? defining distinct states of a buyer (b) among the plurality of buyers (N), an action set ???? for the buyer (b) at the current round (h) among the plurality of rounds (H) a cost function representing a cost of energy procurement at the current round (h) for the buyer (b), and the plurality of rounds (H).
Further, the one or more hardware processors are configured to apply a joint policy (p_*^(b,h) (s)) for each of the set of buyers for the current round (h), wherein the joint policy is determined in accordance with the finite horizon Markov game (M) to generate an action for the buyer (b) defining an optimal bidding strategy with a price-demand quantity pair for procuring energy for the current round (h), wherein the joint policy minimizes the cost function at a Markov Perfect Nash Equilibrium (MPNE).
Obtaining the MPNE by applying the joint policy (p_*^(b,h) (s)) comprises: determining a position of the buyer (b) in a sorting list generated by sorting the outstanding demand of the set of buyers in increasing order; checking availability of sufficient rounds for remaining buyers except the buyer (b) to satiate the outstanding demand of the remaining buyers, wherein availability of the sufficient rounds is determined from the ordered set of asks obtained from the composite supply curve; applying a first policy ?(p?_1^(b,h)) if sufficient round are available, and a second policy ?(p?_2^(b,h)) if the sufficient round are unavailable.
The first policy comprising, if the sorting list determines that the buyer (b) has a maximum outstanding demand, and the sufficient rounds are available for the remaining buyers, (i) holding the buyer (b) until the outstanding demand of the remaining buyers is satiated, (ii) allowing the buyer (b) to buy the maximum outstanding demand at a price at which the outstanding demand of the remaining buyers gets cleared, and (iii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks.
The second policy comprising, if enough rounds are unavailable for the remaining buyers, then the buyer (b) determined as a first buyer from the sorting list whose outstanding demand when reduced from an overall demand of the set of buyers is cleared from the available supply, (i) allowing the buyer (b) to buy the maximum outstanding demand at the lowest price at which the outstanding demand of the buyer (b) gets cleared, and ii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks.
Further, the one or more hardware processors are configured to compute an optimal bid comprising value of the price-demand quantity pair for the current round for each set of buyers in accordance with the joint policy. Furthermore, the one or more hardware processors are configured to determine a market clearing price for trading of the energy by the set of buyers for the current round, wherein the market clearing price is determined using a market clearing price mechanism based on the composite supply curve and the optimal bid of each of the set of buyers comprising the price-demand quantity pair of each of the set of buyers. Computing of the optimal bid for each of the set of buyers by applying the joint policy is repeated for each round until the plurality of rounds (H) are exhausted.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions, which when executed by one or more hardware processors causes a method for equilibrium solutions for multi-shot Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market.
The method includes sourcing for a current round (h) from among a plurality of rounds (H) of an auction offered by a PDA before termination of the PDA, (i) a composite supply curve providing a set of asks that are ordered based on value of price-supply quantity pairs consolidated for a set of suppliers in an energy market, and (ii) an outstanding demand of energy of each of a set of buyers (N) participating in the PDA, wherein the PDA is modeled as a finite horizon Markov game (M) for the set of buyers (N) restricted with the plurality of rounds (H) for procuring the outstanding demand of energy, wherein finite horizon Markov game (M) is specified by a plurality of parameters comprising the plurality of buyers (N), a state space ?? defining distinct states of a buyer (b) among the plurality of buyers (N), an action set ???? for the buyer (b) at the current round (h) among the plurality of rounds (H) a cost function representing a cost of energy procurement at the current round (h) for the buyer (b), and the plurality of rounds (H).
Further, the method includes applying a joint policy (p_*^(b,h) (s)) for each of the set of buyers for the current round (h), wherein the joint policy is determined in accordance with the finite horizon Markov game (M) to generate an action for the buyer (b) defining an optimal bidding strategy with a price-demand quantity pair for procuring energy for the current round (h), wherein the joint policy minimizes the cost function at a Markov Perfect Nash Equilibrium (MPNE).
Obtaining the MPNE by applying the joint policy (p_*^(b,h) (s)) comprises: determining a position of the buyer (b) in a sorting list generated by sorting the outstanding demand of the set of buyers in increasing order; checking availability of sufficient rounds for remaining buyers except the buyer (b) to satiate the outstanding demand of the remaining buyers, wherein availability of the sufficient rounds is determined from the ordered set of asks obtained from the composite supply curve; applying a first policy ?(p?_1^(b,h)) if sufficient round are available, and a second policy ?(p?_2^(b,h)) if the sufficient round are unavailable.
The first policy comprising, if the sorting list determines that the buyer (b) has a maximum outstanding demand, and the sufficient rounds are available for the remaining buyers, (i) holding the buyer (b) until the outstanding demand of the remaining buyers is satiated, (ii) allowing the buyer (b) to buy the maximum outstanding demand at a price at which the outstanding demand of the remaining buyers gets cleared, and (iii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks.
The second policy comprising, if enough rounds are unavailable for the remaining buyers, then the buyer (b) determined as a first buyer from the sorting list whose outstanding demand when reduced from an overall demand of the set of buyers is cleared from the available supply, (i) allowing the buyer (b) to buy the maximum outstanding demand at the lowest price at which the outstanding demand of the buyer (b) gets cleared, and ii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks.
Further, the method includes computing an optimal bid comprising value of the price-demand quantity pair for the current round for each of the set of buyers in accordance with the joint policy. Furthermore, the method includes determining a market clearing price for trading of the energy by the set of buyers for the current round, wherein the market clearing price is determined using a market clearing price mechanism based on the composite supply curve and the optimal bid of each of the set of buyers comprising the price-demand quantity pair of each of the set of buyers. Computing the optimal bid for each of the set of buyers by applying the joint policy is repeated for each round until the plurality of rounds (H) are exhausted.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
FIG. 1 is a functional block diagram of a system, for equilibrium solutions for multi-shot Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market, in accordance with some embodiments of the present disclosure.
FIGS. 2A through 2B (collectively referred as FIG. 2) is a flow diagram illustrating a method for equilibrium solutions for multi-shot PDAs to generate optimal bidding strategies for the energy market, using the system depicted in FIG. 1, in accordance with some embodiments of the present disclosure.
FIGS. 3A and 3B illustrates approach used for checking of conditions to be satisfied to decide on joint policy to be applied by the system at a specific round among a plurality of rounds of auction opportunities of the PDA, in accordance with some embodiments of the present disclosure.
FIG. 4 illustrates comparison of value function of Markov Perfect Nash Equilibrium (MPNE) using graphical charts, in accordance with some embodiments of the present disclosure.
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

DETAILED DESCRIPTION OF EMBODIMENTS

Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
Periodic double auction (PDA) setting is where buyers of the auction have multiple (but finite) opportunities to procure multiple but fixed units of a commodity. The goal of each buyer participating in such auctions is to reduce their cost of procurement by planning their purchase across multiple rounds of the PDA. Formulating such optimal bidding strategies in a multi-agent periodic double auction setting with double sided (buyer -seller consideration) is a technically challenging problem as such strategies involve planning across current and future auctions with consideration of demand and supply match
Embodiments of the present disclosure provide a method and system for equilibrium solution for Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market.
The method and system disclosed models the PDA as a Markov game and derives Markov perfect Nash equilibrium (MPNE) solution, that enables formulating a joint policy for a buyer in consideration with remaining buyers to devise an optimal bidding strategy, wherein each buyer is allowed to make one bid per round of the PDA.
The joint policy formulated herein addresses the technical challenges for PDA equilibria by system sourcing information such as composite supply curve and demand requirement of all the buyers, wherein complete information is present to generate a closed form or easily computable expression for evaluation. . The system enables distributing bidding for the entire outstanding demand of a buyer by spreading it optimally across the multiple bidding opportunities of the PDA such that the procurement cost is minimal for all the buyers participating in the bid. The efficacy of the Nash policies obtained is demonstrated with numerical experiments.
Referring now to the drawings, and more particularly to FIGS. 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
FIG. 1 is a functional block diagram of a system 100, for equilibrium solution for Periodic Double Auctions (PDAs) to generate optimal bidding strategies for the energy market, in accordance with some embodiments of the present disclosure.
In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware processors is configured to execute functions of one or more functional blocks of the system 100.
Referring to the components of system 100, in an embodiment, the processor(s) 104, can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, and the like.
The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface and the like to provide an interface for buyers and suppliers to place their bids and asks respectively for initiating bidding in the energy markets by receiving optimal bidding strategies for each auction round of PDA. The system 100, via the I/O interface 106 can connect with external sources and also sources to source energy market information such as a composite supply curve and the like. The I/O interface 106 can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular and the like. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting to a number of external devices or to another server or devices.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
In an embodiment, the memory 102 includes a plurality of modules 110 such as a PDA model (wherein the PDA is modeled as Finite horizon Markov game), a Joint policy execution module providing optimal bidding strategies to all buyers receiving bidding services from the system 100.
The plurality of modules 110 further include programs or coded instructions that supplement applications or functions performed by the system 100 for executing different steps involved in the process of obtaining equilibrium solutions for multi-shot PDAs to generate optimal bidding strategies for the energy market, being performed by the system 100. The plurality of modules 110, amongst other things, can include routines, programs, objects, components, and data structures, which performs particular tasks or implement particular abstract data types. The plurality of modules 110 may also be used as, signal processor(s), node machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the plurality of modules 110 can be used by hardware, by computer-readable instructions executed by the one or more hardware processors 104, or by a combination thereof. The plurality of modules 110 can include various sub-modules (not shown).
Further, the memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system100 and methods of the present disclosure.
Further, the memory 102 includes a database 108. The database (or repository) 108 may include a plurality of abstracted pieces of code for refinement and data that is processed, received, or generated as a result of the execution of the plurality of modules in the module(s) 110.
Although the data base 108 is shown internal to the system 100, it will be noted that, in alternate embodiments, the database 108 can also be implemented external to the system 100, and communicatively coupled to the system 100. The data contained within such an external database may be periodically updated. For example, new data may be added into the database (not shown in FIG. 1A) and/or existing data may be modified and/or non-useful data may be deleted from the database. In one example, the data may be stored in an external system, such as a Lightweight Directory Access Protocol (LDAP) directory and a Relational Database Management System (RDBMS). Functions of the components of the system 100 are now explained with reference to steps in flow diagrams in FIG. 2.
FIGS. 2A through 2B (collectively referred as FIG. 2) is a flow diagram illustrating a method for equilibrium solutions for multi-shot PDAs to generate optimal bidding strategies for the energy market, using the system depicted in FIG. 1, in accordance with some embodiments of the present disclosure.
In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 200 by the processor(s) or one or more hardware processors 104. The steps of the method 200 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1 and the steps of flow diagram as depicted in FIG. 2. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
Referring to the steps of the method 200, at step 202 of the method 200, the one or more hardware processors 104 are configured by the instructions to source, for a current round (h) from among a plurality of rounds (H) of an auction offered by a PDA before termination of the PDA: (i) a composite supply curve providing a set of asks that are ordered based on value of price-supply quantity pairs consolidated for a set of suppliers in an energy market, and (ii) an outstanding demand of energy of each of a set of buyers (N) participating in the PDA. This information is sourced, for example, from external or internal sources or computed based on inputs received form from the buyers and suppliers.
The PDA herein is modeled as a finite horizon Markov game (M) for the set of buyers (N) restricted with the plurality of rounds (H) for procuring the outstanding demand of energy, wherein finite horizon Markov game (M) is specified by a plurality of parameters comprising the plurality of buyers (N), a state space ?? defining distinct states of a buyer (b) among the plurality of buyers (N), an action set ???? for the buyer (b) at a current round (h) among the plurality of rounds (H) a cost function representing a cost of energy procurement at the current round (h) for the buyer (b), and the plurality of rounds (H).
The PDA considered herein is based on certain constraints or assumptions while carrying out equilibrium analysis as below: First, it is assumed that the total supply available is enough to meet the overall demand requirement and that all the asks from the suppliers can be clubbed into a composite supply curve and is made known to the system 100to generate bids. This implies that the resultant Markov game has only the buyers as the game participants. Second, each PDA consists of a fixed number of rounds, i.e., each buyer has the same fixed number of auctions to procure their respective estimated demand. Third, buyers estimate their respective procurement need before the start of the first auction of the PDA and is sourced by the system 100. The estimate is not altered during the course of the PDA. Fourth, the buyers are not allowed to attempt to buy more than their outstanding requirement in any auction. Fifth, in the case that a particular buyer is not able to procure their targeted units of the commodity even after exhausting all rounds of the PDA, it will be procured outside of the auction at a higher cost. Finally, a uniform payment rule (UPR) is considered, wherein, the clearing price decided through the auction mechanism is the same for all market participants.
PDA modeled as Markov game framework: The PDA consisting of ?? buyers with horizon ?? as a finite horizon Markov game. For any positive integer K, let [K] denote the set {1,··· ,??}, consider a system of ?? buyers participating in ?? rounds of a PDA to procure multiple units of a commodity from a set of suppliers (sellers). For the purpose of this exposition, it can be assumed that the composite supply curve from all the sellers or suppliers at round h ? [??] consisting of ??h asks is sourced by the system 100 (available to plan bidding strategies) and is given by L^h= {(p_1^h,q_1^h ),(p_2^h,q_2^h ),.......(p_(M^h)^h,q_(M^h)^h )} where ??h ? [0, ??max] and ??h ? [0,??max] are respectively the price and quantity components of the ask (p_(M^h)^h,q_(M^h)^h), with ?? ? [??h] and ??max,??max are suitable upper bounds for the ask.
The Markov game is specified by M = ???,??, ??,??,??,???. The ingredients of M are a finite set of players ??; a state space ??; for each player ?? ? [??], an action set ????; a transition probability ?? from ?? × ?? ? ??, where ?? = ×????? ???? is the action profile, with ??(??'|??,??) is the probability that the next state is ??' ? ??, given the current state is ?? ? ?? and current action profile is ?? ? ??; and a payoff function ( or cost function) C from ?? × ?? ? R??, where the ??-th coordinate of C is ????, is the payoff to player ?? as a function of state and action profile.
More specifically, the state at round h denoted by ??h is let to consist of {Qh, Lh} and the action ????,h ? ???? by player ?? ? [??] at round h consists of ????,h bids, with each bid belong to the bounded set [0, ??max] × [0,??max] with the condition that the sum of the quantity components of the bids does not exceed the outstanding demand of player ?? ? [??] at round h. The payoff or cost function ????,h : ?? × ?? ? R returns a scalar value to player ?? specifying players cost of procurement (if any) for the auction at round h. More precisely,
C^(b,h) (s^h,a^h )={¦(?^h·a^(b,h) at non terminal state s^h@?×Q^(b,h) when h=H+1,)¦ (1)
where ??h = (????,h,??-??,h) ? ?? is the joint action set containing one action for each player at round h with ??-??,h specifying the ?? -1 actions of all players except ??. In addition, ??h = 0 is the clearing price of the auction at round h, ????,h is cleared quantity for the buyer ??’s bid ?? at round h. The entity ? = 0 is the unit price of procuring the commodity
outside of the ?? auctions and ????,??+1 is the remaining units of the commodity to be procured by buyer ?? after exhausting the ?? rounds of the PDA. Given a state ??h ? ?? and action profile ??h ? ??, the next state at round h+1 is given by ??h+1 = {Q^(h+1), L^(h+1)}, here L^(h+1) refers to the uncleared asks of the supply curve from round h and Qh+1 = {??1,h+1,··· ,????,h+1} with ????,h+1 = ????,h - ????,h ? ?? ? [??]. The above mathematical statement indicates that a decision (action) taken for current round (h) by all buyers is based on current information, which takes all buyers to a next state that is influenced or varied based on actions of all buyers in previous round.
At any round h, having seen the state ??h, the players choose their action based on a policy. A (Markov) policy for a player ?? ? [??] is a collection of policies ???? = {????,h : ?? ? + ?_Ab }_(h=1)^H, where each ????,h(·|??h) ? ????? specifies the probability of taking action ??h ? ???? at state ??h. Let ?? = (????,??-??) be the joint policy containing one policy for each player ?? ? [??] where ??-?? denotes the ?? - 1 policies of all players except ??. The value of a joint policy ??, at round h, for any player ?? is a function ????h : ?? ? R defined as below. The value of joint policy ?? (not necessary Markov), the value at round h, for a player ?? ? [??] is given by,
V_p^h (S)= E_(t~(P,p^b,p^(-b))) [?_(h^'=h)^(H+1)¦+ C^(b,h^' ) (s^(h^' ),a^(b,h^' ),a^(-b,h^' ))¦| s^h=s] (2)
with ????,h' ~ ???? and ??-??,h' ~ ??-?? and t is a trajectory of the Markov game generated by the joint policy ??. As the Markov game pertaining to the PDA herein involves cost minimization as the objective, the optimal policy for any player is to find a policy that minimizes the value function. However, in a multiagent scenario, when other players act rationally, finding optimal policy is equivalent to finding (Nash) equilibrium solutions which is the best response to rational behavior of other participating agents. Hence, for the PDA modeled as a Markov game, Markov Perfect Nash Equilibrium (MPNE) solutions is defined as below:
Definition 1: Given a ?? player finite horizon stochastic game specified by M=< ??,??, ??,??,??,?? > a joint policy ??* = [p^b,p_*^(-b)] is a (Markov) perfect Nash equilibrium (MPNE) if for all ?? ? [??], for all ?? ? ??, for all h ? [??] and for all
???? : ?? ? ?????, there exists:
V_(p_*^b,p_*^(-b))^h (s)= V_(p^b,p_*^(-b))^h (s) (3)
The perfectness of the Nash equilibrium is due to condition that the inequality in Definition (1) holds for every round h ? [??] and for every element of the state space ??.
A NASH STRATEGY FOR THE SINGLE BID CASE- Formulating the joint policy: .
As is well-known in the art the total demand requirement of all buyers and total supply available from all suppliers at round h from the total rounds H can be obtained. The total supply available, at any round h ? [??], is given by equation 4.
Q^(S,h)=?_(m?[M_h])¦q_(m·)^h (4)
Denote the outstanding requirement of a buyer ?? ? [??] at round h as ????,h = 0. Let Qh = {??1,h,??2,h,...,????,h} be the vector that contains the outstanding requirement of all the ?? buyers at round h. Let Bh denote the set of all bids placed by all buyers at round h, wherein, each buyer ?? ? [??] places at most one bid consisting of price-quantity pair (????,h,????,h) with ????,h ? [0, ??max] and ????,h ? [0,????,h]. The number of bids placed at round h is given by ??h = ?? and the total demand from all buyers round h is given by equation 5.
Q^(D,h)=?_b?[N]¦q^(b,h) (5)
At each round h, let [??h] denote the set of ?? players indexed by the decreasing order of their quantity requirement. Now, let ??h be the index of the ask from the set such that all of the demand requirement at round h is met. That is, ??h = argmin?? ???? =?_(m=1)^j¦q_m^h . At round h, denote ????-??,h = ????,h - ????,h, ?? ? [??h] as the demand requirement of all players except the player ??. Let ????h, ?? ? [??h] be the lowest index of the ordered set L^h such that the total supply available for the first ????h asks satisfies the demand requirement of all players except ??. That is, ????h = argmin?? ????-?? = ?_(m=1)^j¦q_m^h ? ?? ? [??h ]. Next, defined is the index ??h as ??h = ??h - (?? - h). Finally, let ??h = max{1,argmax??{v_h^j= z_h} as the player who bids p_(z_h ) and let ??h as the player with maximum requirement. Note that ??h = ??h when ??h = 1.
The joint policy ??* for a player ?? ? [??h] at round h ? [??] for state ??h ? ??, can now be formulated as:

p_1^(b,h) (s)={¦(p^(b,h)=0,q^(b,h)=Q^(b,h) if Q^(b,h)=0,? b?[N^h]@p_1^(b,h) (s) if H-h=u_h-v_h^?@p_2^(b.h) (s) otherwise) ¦ (5)

Where the policies p_1^(b,h) (s) and p_2^(b,h) (s) are defined as,
p_1^(b,h) (s)={¦(p^(b,h)=p_(v_h^? ),q^(b,h)=Q^(b,h) if Q^(b,h)>0,b=?^h@p^(b,h)= p_max,q^(b,h)= Q^(b,h) if Q^(b,h)>0,b??^h )¦

p_2^(b,h) (s)={¦(p^(b,h)=p_zh,q^(b,h)=Q^(b,h) if Q^(b,h)>0,b=?^h@p^(b,h)=p_max,q^(b,h)=Q^(b,h) if Q^(b,h)>0,b??^h )¦

Based on the formulated joint policy, for the current round, at step 204 of the method 200, the joint policy implementation module executed by the one or more hardware processors 104 are configured by the instruction to apply the a joint policy (p_*^(b,h) (s)) for each of the set of buyers for the current round (h). The joint policy is determined in accordance with the finite horizon Markov game (M) to generate an action for the buyer (b) defining an optimal bidding strategy with a price-demand quantity pair for procuring energy for the current round (h). The joint policy, so formulated minimizes the cost function at a Markov Perfect Nash Equilibrium (MPNE) as understood form equation 1, 2 and 3.
Obtaining the MPNE by applying the joint policy (p_*^(b,h) (s)) as in equation 5 comprises following steps:
Determine a position of the buyer (b) in a sorting list generated by sorting the outstanding demand of the set of buyers in increasing order (204a).
Checking availability of sufficient rounds (?? - h = ??h - ??h??) for remaining buyers except the buyer (b) to satiate the outstanding demand of the remaining buyers. The availability of the sufficient rounds is determined from the ordered set of asks obtained from the composite supply curve (204b).
Provide one of a first policy ?(p?_1^(b,h)) if sufficient rounds (?? - h = ??h - ??h??) are available, and a second policy ?(p?_2^(b,h)) if the sufficient round are unavailable (204c).
The first policy ?(p?_1^(b,h)): If the sorting list determines that the buyer (b) has a maximum outstanding demand, and the sufficient rounds are available for the remaining buyers:
holding the buyer (b) until the outstanding demand of the remaining buyers is satiated, and
allowing the buyer (b) to buy the maximum outstanding demand at a price at which the outstanding demand of the remaining buyers gets cleared,.
allowing the remaining buyers to bid a price that is greater than a maximum ask from the set of asks
The second policy ?(p?_2^(b,h)): If enough rounds are unavailable (?? - h < ??h - ??h??), for the remaining buyers, then the buyer (b) determined as a first buyer (??) from the sorting list whose outstanding demand when reduced from an overall demand of the set of buyers is cleared from the available supply:
allowing the buyer (b) to buy the maximum outstanding demand at the lowest price at which the outstanding demand of the buyer (b) gets cleared, and
allowing the remaining buyers to bid a price that is greater than a maximum ask from the set of asks.
An illustrative example below simplifies the logical flow of the system 100 to determine whether the first policy or the second policy is to be applied.
FIGS. 3A and 3B ( collectively referred as FIG. 3) illustrates approach used for checking of conditions to be satisfied to decide on the joint policy to be applied by the system 100 at a specific round among a plurality of rounds of auction opportunities of the PDA, in accordance with some embodiments of the present disclosure. FIGS. 3A and 3B depicts the composite supply curve sourced by the system for a given PDA indicating the ask for supplier end, wherein each ask is a pair of price in dollars against number of units of energy (p,q). Further, the FIGS. 3A and 3B also depicting three buyers (Q1, Q2, Q3) interested in the PDA having total 4 rounds of auction (H=4), with outstanding demand requirement of 20, 10 and 5 units of energy respectively. Thus, here buyer Q1 is buyer having maximum demand requirement. The total supply provided by the set of suppliers, as seen for composite supply curve is 40 units, and exceeds the total demand of (20+10+5=35 units) by the three buyers. The composite supply curve consists of 4 asks sorted in increasing order price and energy units as :
Sorted list of asks
Index 1----(5,12)
Index 2----(10,23)
Index 3----(15,30)
Index 4----(20,43)
The interpretation of sorted list of asks is well understood by person having ordinary skill in the domain. Thus, the first ask means that first 12 units of power is available at 5 dollars and second the next 11 units is available at 10 dollars and so on (the second ask being (10,23) means that the next 11 units (23 - 12) is available at 10 units).
The finite horizon length for the example of FIG. 3 is 4 (H=4). In FIG. 3A, the current round is the first round of the auction with h=1. The available rounds, after the first round is over, is H-h; (4 -1=3). The term uh corresponds to index value in the sorted list of asks above where all of the 35 units of demand by all the buyers gets satisfied. Thus, uh = 4. Similarly, ??h?? refers to the index value of ask where the total demand of remaining buyers (Q2, Q3) excluding the demand of maximum demanding buyer Q1 is satisfied (10+5 units is satisfied at Index 2, i.e., ??h?? =2). Hence the condition check for deciding on first policy or second policy evaluates to H-h (4-1=3)>u_h-v_h^? (4-2=2). This satisfies the condition requirement for first policy and the system 100 executes first policy.
In the second figure, assuming the current round is h=3, wherein the total demand of (20+10+5=35 units) is made by the three buyers. In this case, with the assumption that composite supply curve is unaltered, uh = 4 is still valid. However, unlike FIG. 3A, while evaluating policy in this scenario, the condition H-h (4-3=1) 0. In the case, that at round h, there is adequate supply to cater to the demand of all buyers, that is, ????,h = ????,h, the player ??h has maximum requirement and the bid at price p_(?v_h?^? ). By construction, p_(?v_h?^? ) is also the point where the supply and demand curve intersect and hence the MCP is p_(?v_h?^? ) . It is now easy to see that, the total market cleared quantity is given by,
Q^h=min- ?(?_(j=1)^(v_h^?)¦q_j'^h ?_(b?[N^h])¦Q^(b,h) ) (6)
As the bids placed at the higher price ??max gets cleared first and since the available supply is enough to cater to outstanding demand requirement at round h, bids gets cleared exactly as stated in the first column of the Table 1 in Lemma 1. In similar lines, it can be shown for the case (?? - h < ??h - ??h??).
Having described the clearing implications for a buyer ?? ? [??h] for following the policy p_*^(b,h) (s) of Equation (5) at state ??h ? ??, now computed is the value of the equilibrium policy for a buyer ?? ? [??h] which follows from Lemma 1. When (?? - h = ??h - ??h??), it can be seen that
V_(p_*^b,p_*^(-b))^h (s)= {¦(?p_(v_h )?^?×Q^(b,h),if b??^h@+ [?p_(v_h )?^? ¦×(Q^h-?_(b?[N]\?^h)¦q_j^(b,h) )+ ?_(k=h+1)^H¦??p_(v_k )?^?×Q^k ?],if b =?^h )¦ (7)

V_(p_*^b,p_*^(-b))^h (s)= {¦(p_(z_h )×Q^(b,h),if b??^h@+ [p_(z_h ) ¦×(Q^h-?_(b?[N]\?^h)¦q_j^(b,h) )+ ?_(k=h+1)^H¦?p_(z_k )×Q^k ?],if b =?^h )¦ (8)

Equilibrium analysis: In this section, for the PDA considered in this exposition, it is shown that the policy in (equation 5) is an MPNE in the space of all deterministic policies. More precisely, it is needed to show that, for all ?? ? [??h], for all ?? ? ??, for all h ? [??] and for any deterministic policy ???? : ?? ? ????, and as in equation 3, V_(p_*^b,p_*^(-b))^h (s)= V_(p^b,p_*^(-b))^h (s).
Denote the bid of buyer ?? ? [??h] at state ?? and round h as prescribed by the policy ?? as . Further, recall that each bid of a player ?? ? [??h] belongs to the bounded set [0, ??max] × [0,??max] and at any round h, the player ?? does not bid more than the outstanding demand requirement ????,h, the possible deviations available for a player ?? at a state ?? and round h can be tabulated as in table 2 below.

TABLE 2: POSSIBLE DEVIATIONS
Higher Priced Deviations Lower Priced Deviations
p_b^(b,h)>p_*^(b,h),q_b^(b,h)p_*^(b,h),q_b^(b,h)=q_*^(b,h) p_b^(b,h) ?? · ??max (?? > 1) and let ?? ? [??h] be a player that is prescribed by the joint policy in Equation (5) to bid at price ??max to procure his outstanding demand requirement at round h. If the player deviates from the said policy to another policy ?? at round h at state ??, then, from equation 3, V_(p_*^?,p_*^(-?))^h (s)= V_(p^?,p_*^(-?))^h (s).
Proof: Among the five deviations enumerated in Equation Table 2, the deviations suggesting that the bid price greater than ??max are not applicable to player ?? as by design ??max is the maximum bid price. The other three deviations (at round h) either suggest that the bid of player ?? has bid price less than equal to ??max or bid quantity less than equal to ????,h. In the case ????,h < ????,h, the bid placed by ?? will lose out on priority when compared to following the joint policy in Equation (5). This implies that the bid quantity ????,h could be partially cleared at round h (as opposed to ????,h being fully cleared if joint policy (equation 5) is followed). Further, by Lemma 2, as the remaining requirement of ????,h is likely to be cleared at higher price in future rounds, the overall cost incurred by player ?? is greater than or equal to the cost incurred when joint policy in (equation 5) is followed. Now if player ?? deviates in bid price but with fixed bid quantity as ????,h = ????,h. First, when there are enough or sufficient rounds for player ??h , i.e., (?? - h = ??h - ??h??) and the player ??h deviates below the price ??max then priority would decrease and hence it leads to increase in cost.
Next if the player ??h does not have enough or sufficient rounds (?? - h < ??h - ??h??), the policy ?(p?_2^(b,h)) is used. Here, it is considered are the deviations by two types of player ??. That is, if player ?? has requirement greater than the player bidding at ????h , then with bid price ????,h ? [????h , ??max), the value function is unchanged for player ??. However, if the player ?? bids at price ????,h ? [p_(?v_h?^? ) , ????h ), by construction the number of remaining rounds for the player ?? would be less. Hence the player has to buy non-zero quantity from balancing market at a price ?. Finally, if the player ?? has less requirement than the player bidding at ????h , then with bid price ????,h ? (????h , ??max), the value function is unchanged for player ??. And for the bid price ????,h = ????h , the value function might increase due to lemma 2. Now, for the bid price ????,h ? [p_(?v_h?^? ) , ????h ), the condition on ? > ?? · ??max will lead to higher value function for the player ??.
Lemma 4: Let the conditions of Lemma 2 with the balancing cost ? > ?? · ??max (?? > 1) and let ?? ? [??h] be a player that is prescribed by the policy in Equation (3) to bid at the price p_(?v_h?^? ) or ????h to procure its outstanding demand requirement at round h. If the player deviates to another policy ?? at round h, instead of following the policy in Equation (3), then, for any state ?? ? ??, V_(p_*^?,p_*^(-?))^h (s)= V_(p^?,p_*^(-?))^h (s).
Proof: Here for player ??, all the five deviations listed in Equation Table 2) are possible. Now if the policy recommended is p_1^(b,h) with the bid price p_(?v_h?^? ) , the deviations with the bid price greater than p_(?v_h?^? ) will increase the MCP hence increases the value function. Next if the bid price is less than p_(?v_h?^? ) , by construction the player ?? is not cleared. Similarly for bidding ????,h < ????,h, the cleared quantity is less. Hence in both cases, with Lemma 2, the cost increases.
For the policy p_2^(b,h), the recommended price is ????h . If the player ?? bids at a price more than ????h , then the MCP at round h would be greater than or equal to ????h resulting in possible increase of overall cost of procurement. And, if the player bids at price ????h with bid quantity less than ????,h, the cleared quantity could be less than the demand procured by following policy (5) which would imply more demand needs to be satisfied in the remaining rounds. Again from Lemma 2, this could lead to higher procurement. Finally, if the bid price is ????,h ? [p_(?v_h?^? ) , ????h ) then by the choice of ? > ?? · ??max, the value function increases.
Theorem 1: Let the conditions of Lemma 2 hold with the balancing cost ? > ?? · ??max (?? > 1). If a buyer ?? ? [??h] deviates to another policy ?? at round h, instead of following the policy in Equation (3), then, for any state ?? ? ?? and h ? [??], V_(p_*^?,p_*^(-?))^h (s)= V_(p^?,p_*^(-?))^h (s)
Proof: From Lemmas 2, 3 and 4 the value function of the policy of equation 5 satisfies V_(p_*^?,p_*^(-?))^h (s)= V_(p^?,p_*^(-?))^h (s).
Note that the policy in (equation 5) is a Markov policy since it depends on only on present state ?? not on history and since the above Lemmas hold for all ?? ? ?? and h ? [??], the policy in (equation 5) is an MPNE for the Markov game described in Section III for the case when each buyer submits one bid per round of the PDA.
SIMULATIONS: This section considers a simple numerical setup to demonstrate the efficacy of the Nash policies described herein. The setup consists of three players (buyers) in the market. The players go through a PDA simulator, which has ?? = 24 rounds to procure the required quantity. The quantity requirement of the four players (P0, P1, P2) at some round h = ?? is given as Qh = (232.18,164.6,90.7). The players P0 and P2 are the players with the largest, and smallest requirement respectively. The players know the supply curve (ask pattern) L^h which has 31 asks and the total supply ????= 1502.38 > ????,h = 487.48. We consider two values of h, namely h =1 and h =23, wherein the choice h =1 satisfies the condition (?? - h = ??h - ??h??) and the latter does not.
FIG. 4 depicting subfigures (a) though (d) is comparison of the value function of MPNE and deviation. FIG. 4, (a), the value function for h = 1 with the player P0 deviating to a price higher than ????h . FIG. 4, (b) the value function for h = 23 with player P0 deviating from MPNE to a lesser price than the prescribed price ??max. FIG. 4, (c) the value function for h =24 with player P2 deviating from Nash policy to bid at a price lower than ????h . FIG. 4, (d) the average cost incurred by players in a series of 100 PDAs each with horizon ?? = 24 with deviation by player P0 to ZI policy
FIG. 4, (a) compares the value function when all players adopt the policy in (equation 5) with a joint policy in which player P0 deviates to a bid price higher than the prescribed price ????h at h = 1. The higher bid price of P0 results in higher cost because of increased MCP. In Figure 3(b), for h = 23, the condition (?? - h = ??h - ??h??) is not satisfied and hence the prescribed bid price of P0 is ??max. When P0 bids less than ??max its bid priority decreases resulting in procurement outside of the PDA at higher cost ? thereby increasing the overall cost. FIG. 4, (c), considers the case h = 24, wherein the condition (?? - h = ??h - ??h??) is not satisfied. Here, the deviation by the minimum requirement player P2 to bid at a price ?? ? (?????? , ????h ) less than the prescribed price ????h . This deviation to a lower price, although results in lower cost of procurement at round h, leads to higher overall cost as the player has to buy more units of the commodity outside of the auction at higher price ? = ?? · ??max. Finally, in FIG.3, (d), considered is average cost incurred by the players in 100 PDAs (each with ?? =24 rounds) with varying demand requirement. In each of these 100 PDAs, the player P0 is allowed to deviate from the prescribed Nash policy to the Zero intelligent (ZI) policy and the corresponding value functions are compared with the value function for the Nash policy.
Thus, the method and system disclosed herein generates optimal bidding strategies for the periodic double auction (PDA) setting consisting of multiple buyers competing with each other to satisfy their respective demand. Each buyer has multiple opportunities to procure their need and the composite supply curve is known to all of them. The problem is modeled as the Markov game and system generates equilibrium solutions that could act as optimal bidding strategies when all buyers behave rationally. It is proved that joint policy applied by the system are indeed MPNE. The system disclosed that implements the joint policy devises optimal bidding strategies for day-ahead electricity markets.
Although constraints such as having adequate supply with one bid per auction per buyer, are considered, the case of multiple bids per auction and inadequate supply can be handled using the techniques disclosed. Despite the fact that the equilibrium solutions disclosed herein are for the complete information setting, they are still important for two reasons. The system and method provides a technical solution for the unaddressed problem of providing analytical equilibrium solutions for multi-shot auctions in PDA. The joint policy disclosed can be used as a baseline to compare with a policy that is obtained in an incomplete information setting
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means, and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
, Claims:We Claim:
A processor implemented method (200) for generating optimal bidding strategies for Periodic Double Auctions (PDAs) in an energy market, the method comprising:
sourcing (202), via one or more hardware processors, for a current round (h) from among a plurality of rounds (H) of an auction offered by a PDA before termination of the PDA, (i) a composite supply curve providing a set of asks that are ordered based on value of price-supply quantity pairs consolidated for a set of suppliers in an energy market, and (ii) an outstanding demand of energy of each of a set of buyers (N) participating in the PDA, wherein the PDA is modeled as a finite horizon Markov game (M) for the set of buyers (N) restricted with the plurality of rounds (H) for procuring the outstanding demand of energy, wherein finite horizon Markov game (M) is specified by a plurality of parameters comprising the plurality of buyers (N), a state space ?? defining distinct states of a buyer (b) among the plurality of buyers (N), an action set ???? for the buyer (b) at the current round (h) among the plurality of rounds (H) a cost function representing a cost of energy procurement at the current round (h) for the buyer (b), and the plurality of rounds (H);
applying (204), via the one or more hardware processors, a joint policy (p_*^(b,h) (s)) for each of the set of buyers for the current round (h), wherein the joint policy is determined in accordance with the finite horizon Markov game (M) to generate an action for the buyer (b) defining an optimal bidding strategy with a price-demand quantity pair for procuring energy for the current round (h), wherein the joint policy minimizes the cost function at a Markov Perfect Nash Equilibrium (MPNE), wherein obtaining the MPNE by applying the joint policy (p_*^(b,h) (s)) comprises:
determining a position of the buyer (b) in a sorting list generated by sorting the outstanding demand of the set of buyers in increasing order (204a);
checking availability of sufficient rounds for remaining buyers except the buyer (b) to satiate the outstanding demand of the remaining buyers, wherein availability of the sufficient rounds is determined from the ordered set of asks obtained from the composite supply curve (204b);
applying a first policy ?(p?_1^(b,h)) if sufficient round are available, and a second policy ?(p?_2^(b,h)) if the sufficient round are unavailable (204c), wherein:
the first policy comprising, if the sorting list determines that the buyer (b) has a maximum outstanding demand, and the sufficient rounds are available for the remaining buyers, (i) holding the buyer (b) until the outstanding demand of the remaining buyers is satiated, (ii) allowing the buyer (b) to buy the maximum outstanding demand at a price at which the outstanding demand of the remaining buyers gets cleared, and (iii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks; and
the second policy comprising, if enough rounds are unavailable for the remaining buyers, then the buyer (b) determined as a first buyer from the sorting list whose outstanding demand when reduced from an overall demand of the set of buyers is cleared from the available supply, (i) allowing the buyer (b) to buy the maximum outstanding demand at the lowest to bid at which the outstanding demand of the buyer (b) gets cleared, and ii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks;
computing (206), via the one or more hardware processors, an optimal bid comprising value of the price-demand quantity pair for the current round for each of the set of buyers in accordance with the joint policy; and
determining (208), via the one or more hardware processors, a market clearing price for trading of the energy by the set of buyers for the current round, wherein the market clearing price is determined using a market clearing price mechanism based on the composite supply curve and the optimal bid of each of the set of buyers comprising the price-demand quantity pair of each of the set of buyers.

The method as claimed in claim 1, wherein computing the optimal bid for each of the set of buyers by applying the joint policy is repeated for each round until the plurality of rounds (H) are exhausted (210).

The method as claimed in claim 1, wherein obtaining the MPNE by applying the joint policy considers a set of constraints comprising:
the total supply available from the set of suppliers is enough to meet a total demand from the set of buyers, wherein the set of asks from the set of suppliers is clubbed into the composite supply curve and is known to plan the optimal bids during the PDA, wherein the PDA, modeled as the finite horizon Markov game (M), includes the set of buyers as game participants and excludes the set of suppliers;
the plurality of rounds (H) is finite for the PDA;
a precomputed estimate of the outstanding demand of each buyer from a set of buyers to be procured before start of a first round among the plurality of rounds (H) of the PDA, and the precomputed estimate of the outstanding demand that remains unaltered during the plurality of rounds (H) of the PDA;
each buyer is not allowed to buy more than the outstanding demand in any of the plurality of rounds (H);
if the buyer is unable to procure the outstanding demand post exhausting the plurality of rounds (H), a deficit is procured outside of the auction at a higher cost; and
executing a uniform payment rule (UPR), wherein, the market clearing price is kept same across the set of buyers (b).

The method as claimed in claim 1, wherein the market clearing price per unit of the energy for each round among the plurality of rounds (h) is determined using an average clearing price rule (ACPR).

A system (100) for generating optimal bidding strategies for Periodic Double Auctions (PDAs) in an energy market, the system (100) comprising:
a memory (102) storing instructions;
one or more Input/Output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more I/O interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
source for a current round (h) from among a plurality of rounds (H) of an auction offered by a PDA before termination of the PDA, (i) a composite supply curve providing a set of asks that are ordered based on value of price-supply quantity pairs consolidated for a set of suppliers in an energy market, and (ii) an outstanding demand of energy of each of a set of buyers (N) participating in the PDA, wherein the PDA is modeled as a finite horizon Markov game (M) for the set of buyers (N) restricted with the plurality of rounds (H) for procuring the outstanding demand of energy, wherein finite horizon Markov game (M) is specified by a plurality of parameters comprising the plurality of buyers (N), a state space ?? defining distinct states of a buyer (b) among the plurality of buyers (N), an action set ???? for the buyer (b) at the current round (h) among the plurality of rounds (H) a cost function representing a cost of energy procurement at the current round (h) for the buyer (b), and the plurality of rounds (H);
apply a joint policy (p_*^(b,h) (s)) for each of the set of buyers for the current round (h), wherein the joint policy is determined in accordance with the finite horizon Markov game (M) to generate an action for the buyer (b) defining an optimal bidding strategy with a price-demand quantity pair for procuring energy for the current round (h), wherein the joint policy minimizes the cost function at a Markov Perfect Nash Equilibrium (MPNE), wherein obtaining the MPNE by applying the joint policy (p_*^(b,h) (s)) comprises:
determining a position of the buyer (b) in a sorting list generated by sorting the outstanding demand of the set of buyers in increasing order (204a);
checking availability of sufficient rounds for remaining buyers except the buyer (b) to satiate the outstanding demand of the remaining buyers, wherein availability of the sufficient rounds is determined from the ordered set of asks obtained from the composite supply curve (204b);
applying a first policy ?(p?_1^(b,h)) if sufficient round are available, and a second policy ?(p?_2^(b,h)) if the sufficient round are unavailable (204c), wherein:
the first policy comprising, if the sorting list determines that the buyer (b) has a maximum outstanding demand, and the sufficient rounds are available for the remaining buyers, (i) holding the buyer (b) until the outstanding demand of the remaining buyers is satiated, (ii) allowing the buyer (b) to buy the maximum outstanding demand at a price at which the outstanding demand of the remaining buyers gets cleared, and (iii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks; and
the second policy comprising, if enough rounds are unavailable for the remaining buyers, then the buyer (b) determined as a first buyer from the sorting list whose outstanding demand when reduced from an overall demand of the set of buyers is cleared from the available supply, (i) allowing the buyer (b) to buy the maximum outstanding demand at the lowest price at which the outstanding demand of the buyer (b) gets cleared, and ii) allowing the remaining buyers bid a price that is greater than a maximum ask from the set of asks;
compute an optimal bid comprising value of the price-demand quantity pair for the current round for each of the set of buyers in accordance with the joint policy; and
determine a market clearing price for trading of the energy by the set of buyers for the current round, wherein the market clearing price is determined using a market clearing price mechanism based on the composite supply curve and the optimal bid of each of the set of buyers comprising the price-demand quantity pair of each of the set of buyers.

The system as claimed in claim 5, wherein computing the optimal bid for each of the set of buyers by applying the joint policy is repeated for each round until the plurality of rounds (H) are exhausted.

The system as claimed in claim 5, wherein obtaining the MPNE by applying the joint policy considers a set of constraints comprising:
the total supply available from the set of suppliers is enough to meet a total demand from the set of buyers, wherein the set of asks from the set of suppliers is clubbed into the composite supply curve and is known to plan the optimal bids during the PDA, wherein the PDA, modeled as the finite horizon Markov game (M), includes the set of buyers as game participants and excludes the set of suppliers;
the plurality of rounds (H) is finite for the PDA;
a precomputed estimate of the outstanding demand of each buyer from a set of buyers to be procured before start of a first round among the plurality of rounds (H) of the PDA, and the precomputed estimate of the outstanding demand that remains unaltered during the plurality of rounds (H) of the PDA;
each buyer is not allowed to buy more than the outstanding demand in any of the plurality of rounds (H);
if the buyer is unable to procure the outstanding demand post exhausting the plurality of rounds (H), a deficit is procured outside of the auction at a higher cost; and
executing a uniform payment rule (UPR), wherein, the market clearing price is kept same across the set of buyers (b).

The system as claimed in claim 5, wherein the market clearing price per unit of the energy for each round among the plurality of rounds (h) is determined using an average clearing price rule (ACPR).

Documents

Application Documents

#	Name	Date
1	202321060569-STATEMENT OF UNDERTAKING (FORM 3) [08-09-2023(online)].pdf	2023-09-08
2	202321060569-REQUEST FOR EXAMINATION (FORM-18) [08-09-2023(online)].pdf	2023-09-08
3	202321060569-FORM 18 [08-09-2023(online)].pdf	2023-09-08
4	202321060569-FORM 1 [08-09-2023(online)].pdf	2023-09-08
5	202321060569-FIGURE OF ABSTRACT [08-09-2023(online)].pdf	2023-09-08
6	202321060569-DRAWINGS [08-09-2023(online)].pdf	2023-09-08
7	202321060569-DECLARATION OF INVENTORSHIP (FORM 5) [08-09-2023(online)].pdf	2023-09-08
8	202321060569-COMPLETE SPECIFICATION [08-09-2023(online)].pdf	2023-09-08
9	202321060569-Proof of Right [04-10-2023(online)].pdf	2023-10-04
10	202321060569-FORM-26 [19-01-2024(online)].pdf	2024-01-19
11	Abstract.jpg	2024-02-07
12	202321060569-Power of Attorney [28-10-2024(online)].pdf	2024-10-28
13	202321060569-Form 1 (Submitted on date of filing) [28-10-2024(online)].pdf	2024-10-28
14	202321060569-Covering Letter [28-10-2024(online)].pdf	2024-10-28
15	202321060569-FORM 3 [06-11-2024(online)].pdf	2024-11-06
16	202321060569-FORM-26 [07-11-2025(online)].pdf	2025-11-07