Abstract: This disclosure relates to multi-vehicle formation among vehicles/robots/agents. Multi-vehicle formation is challenging as multiple vehicles have to maintain formation simultaneously. The disclosure is enables multivehicle formation based on rapidly-exploring random tree (RRT) and game theory. The disclosed technique for multi-vehicle formation is implemented in several steps that includes generation a set of random control signal vectors for each vehicle in the group of the multi-vehicles using the rapidly exploring random tree (RRT). Subsequently, for every control signal vector, the vehicle computes a formation error, a penalty and a reward considering the neighbors state vector (rather than neighbors’ control signal vector). Further the vehicle selects the optimal control signal vector among the set of random control signal vectors based on game theory (based on correlated equilibrium) using the reward. The process of RRT and game theory continues until the desired multi-vehicle formation is attained. [To be published with FIG. 2]
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
METHOD AND SYSTEM FOR MULTI-VEHICLE FORMATION BASED ON CORRELATED EQUILIBRIUM FROM GAME THEORY
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in
which it is to be performed.
TECHNICAL FIELD
[001] The disclosure herein generally relates to the field multi-vehicle formation by robots, and, more particularly, to a method and a system for multi-vehicle formation based on correlated equilibrium from game theory.
BACKGROUND
[002] Multi-vehicle system refers to the collection of multiple rational (or self-interested) interacting vehicles, which collaboratively attain common goal autonomously. Application of multi-vehicle system includes military, health care, smart cities, transportation systems, patrolling, etc. The success of multi-vehicle is largely dependent on formation among the multiple vehicles. Few challenges of multi-vehicle system include control architecture design, communication, task allocation, coalition and simultaneously maintaining multi-vehicle formation.
[003] During control architecture design in multi-vehicle system, organizing and interacting with heterogeneous vehicles is challenging. Heterogeneity of vehicles becomes necessity for automating complex tasks that involve several different types of tasks, wherein the challenges for heterogeneity/heterogeneous vehicles can be categorized in terms of different sensing capabilities and dynamics of the vehicle. To handle the challenges involved in heterogeneity, the traditional control approaches take vehicle dynamics into consideration, while designing a control law. However, the designed control laws are vehicle specific hence the control law needs to be modified as the heterogeneity in system changes. To meet the growing requirement of heterogeneity in multi-vehicle system, a universally accepted control law is required, which should not change with the heterogeneity and has to remain a standard yet complete the formation in synchronization with the other heterogeneous vehicles.
SUMMARY
[004] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a method and a system for multi-vehicle formation based on correlated equilibrium from game theory is provided. The system includes a memory storing instructions, one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to receive a plurality of inputs associated with a group of multi-vehicles, via the hardware processors. The system is further configured to generate a set of random control signal vectors for each vehicle from the group of multi-vehicles, via the hardware processors, based on a Rapidly-exploring random tree (RRT) technique, wherein the set of random control signal vectors comprises of a plurality of control signal vectors. The system is further configured to generate the target formation by the group of multi-vehicles, via the hardware processors, comprising several steps that includes: computing a formation error for each of the control signal vectors, computing a penalty for each of the plurality of control signal vectors and computing a reward for each of the plurality of control signal vectors iteratively selecting an optimal control signal vector from the set of random control signal vectors using the rewards of the plurality of control signal vectors based on the correlated equilibrium from game theory technique to enable the target formation.
[005] In another aspect, a method for multi-vehicle formation based on correlated equilibrium from game theory is provided. The method includes receiving a plurality of inputs associated with a group of multi-vehicles. The method further includes generating a set of random control signal vectors for each vehicle from the group of multi-vehicles based on a Rapidly-exploring random tree (RRT) technique, wherein the set of random control signal vectors comprises of a plurality of control signal vectors. The method further includes generating the target
formation by the group of multi-vehicles, comprising several steps that includes : computing a formation error for each of the control signal vectors, computing a penalty for each of the plurality of control signal vectors and computing a reward for each of the plurality of control signal vectors iteratively selecting an optimal control signal vector from the set of random control vectors using the rewards of the plurality of control signal vectors based on the correlated equilibrium from game theory technique to enable the target formation.
[006] In yet another aspect, a non-transitory computer readable medium for multi-vehicle formation based on correlated equilibrium from game theory is provided. The program includes receiving a plurality of inputs associated with a group of multi-vehicles. The program further includes generating a set of random control signal vectors for each vehicle from the group of multi-vehicles based on a Rapidly-exploring random tree (RRT) technique, wherein the set of random control signal vectors comprises of a plurality of control signal vectors. The program further includes generating the target formation by the group of multi-vehicles, comprising several steps that includes : computing a formation error for each of the control signal vectors, computing a penalty for each of the plurality of control signal vectors and computing a reward for each of the plurality of control signal vectors iteratively selecting an optimal control signal vector from the set of random control signal vectors using the rewards of the plurality of control signal vectors based on the correlated equilibrium from game theory technique to enable the target formation.
[007] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[008] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[009] FIG.1 illustrates an exemplary system for multi-vehicle formation based on correlated equilibrium from game theory according to some embodiments of the present disclosure.
[010] FIG.2 is a functional block diagram of the system for multi-vehicle formation based on correlated equilibrium from game theory according to some embodiments of the present disclosure.
[011] FIG.3A and FIG.3B is a flow diagram illustrating a method for multi-vehicle formation based on correlated equilibrium from game theory in accordance with some embodiments of the present disclosure.
[012] FIG.4 illustrates an example scenario of multi-vehicle formation along with a target formation, a current formation and a possible formation in accordance with some embodiments of the present disclosure.
[013] FIG.5A and FIG.5B illustrates an example scenario of generation of random control signal vectors for two vehicles based on their vehicle dynamics in accordance with some embodiments of the present disclosure.
[014] FIG.6 illustrates an example of inter-vehicle collision avoidance based on a safe envelope for multi-vehicle formation in accordance with some embodiments of the present disclosure.
[015] FIG.7 illustrates a graph result for optimal trajectories for consensus with four quadcopters for simulation-one in accordance with some embodiments of the present disclosure.
[016] FIG.8 illustrates a graph result for tracking error during consensus formation with four quadcopters for simulation-one in accordance with some embodiments of the present disclosure.
[017] FIG.9A and FIG.9B illustrates a multi-vehicle homogeneous formation, (a) Rectangle formation (FIG.9A) (b) Triangle formation (FIG.9B) for simulation-two in accordance with some embodiments of the present disclosure.
[018] FIG.10 illustrates a multi-vehicle heterogeneous rectangle formation for simulation-three in accordance with some embodiments of the present disclosure.
[019] FIG.11 illustrates a multi-vehicle homogeneous cube formation for simulation-three in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[020] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[021] Referring now to the drawings, and more particularly to FIG.1 through FIG.11, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[022] FIG.1 is a functional block diagram of a system 100 for multi-vehicle formation based on correlated equilibrium from game theory in accordance with some embodiments of the present disclosure.
[023] In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware
processors is configured to execute functions of one or more functional blocks of the system 100.
[024] Referring to the components of system 100, in an embodiment, the processor(s) 104, can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 is configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, a network cloud and the like.
[025] The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, a touch user interface (TUI) and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting a number of devices (nodes) of the system 100 to one another or to another server.
[026] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[027] Further, the memory 102 may include a database 108 configured to include information regarding several algorithms for path planning and multi-vehicle formation. Thus, the memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. In an embodiment, the database 108 may be
external (not shown) to the system 100 and coupled to the system via the I/O interface 106.
[028] Functions of the components of system 100 are explained in conjunction with functional overview of the system 100 in FIG.2 and flow diagram of FIGS.3A and 3B for multi-vehicle formation based on correlated equilibrium from game theory.
[029] The system 100 supports various connectivity options such as BLUETOOTH®, USB, ZigBee and other cellular services. The network environment enables connection of various components of the system 100 using any communication link including Internet, WAN, MAN, and so on. In an exemplary embodiment, the system 100 is implemented to operate as a stand-alone device. In another embodiment, the system 100 may be implemented to work as a loosely coupled device to a smart computing environment. The components and functionalities of the system 100 are described further in detail.
[030] FIG.2 is a functional block diagram of the various modules of the system of FIG.1, in accordance with some embodiments of the present disclosure. As depicted in the architecture, the FIG.2 illustrates the functions of the components of a system 200 that includes multi-vehicle formation based on correlated equilibrium from game theory. The system 200 is an example of system 100 (FIG.1).
[031] The system 200 for multi-vehicle formation based on correlated equilibrium from game theory is configured for receiving a plurality of inputs associated with a group of multi-vehicles (m) via an input module 202. The system 200 further comprises of a random control signal vector generator 204 configured to generate a set of random control signal vectors for each vehicle from the group of multi-vehicles based on a Rapidly-exploring random tree (RRT) technique. The system 200 further comprises of a target formation module 206 configured to generate the target formation by the group of multi-vehicles using a formation error module 208, a penalty module 210, a reward module 212 and an optimal control signal vector selector 214. The formation error module 208 is configured to compute a formation error for each of the control signal vectors from the set of
random control signal vectors. The penalty module 210 is configured to compute a penalty for each of the plurality of control signal vectors based on a penalty computation technique. The reward module 212 is configured to compute a reward for each of the plurality of control signal vectors based on the formation error and the penalty. The optimal control signal vector selector 214 is configured to iteratively select an optimal control signal vector from the set of random control vectors using the rewards of the plurality of control signal vectors. The system 200 further comprises an output module configured to share and display the target formation enabled using the optimal control signal vector for the group of multi-vehicles.
[032] The various modules of the system 100 for multi-vehicle formation based on correlated equilibrium from game theory are implemented as at least one of a logically self-contained part of a software program, a self-contained hardware component, and/or, a self-contained hardware component with a logically self-contained part of a software program embedded into each of the hardware component that when executed perform the above method described herein.
[033] Functions of the components of the system 200 are explained in conjunction with functional modules of the system 100 stored in the memory 102 and further explained in conjunction with flow diagram of FIGS. 3A and FIG.3B. The FIG.3A and FIG.3B with reference to FIG.1, is an exemplary flow diagram illustrating a method 300 for multi-vehicle formation based on correlated equilibrium from game theory using the system 100 of FIG.1 according to an embodiment of the present disclosure.
[034] The steps of the method of the present disclosure will now be explained with reference to the components of the system(100) for multi-vehicle formation based on correlated equilibrium from game theory and the modules (202-216) as depicted in FIGS.2 and the flow diagrams as depicted in FIG.3A and FIG.3B. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the
steps to be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
[035] At step 302 of the method (300), a plurality of inputs associated with a group of multi-vehicles (m) is received at the input module 202.
[036] In an embodiment, the plurality of inputs comprises a target formation of the multi-vehicle (Fr), a state vector of a vehicle from the group of multi-vehicles (xi), a state vector of the vehicle's neighbors (XNi), a vehicle dynamics factor for the vehicle, a maximum velocity each of the vehicle from the group of multi-vehicles, a tolerance factor (ε), a time step size (∆t) and a plurality of sampling parameters.
[037] In an embodiment, the group of multi-vehicles is a collection of one of: multiple rational and self-interested interacting vehicles that collaboratively attain common target autonomously and comprises one of: a group of heterogeneous vehicles or a group of homogeneous vehicles. The Further the target formation of the multi-vehicle formation is a goal formation to be formed by each of a vehicle in sync with all the other vehicles in the multi-vehicle formation.
[038] In an embodiment, the plurality of sampling parameters comprises a threshold number of control signal vectors for the vehicle (i) represented as C i max and the vehicle dynamics factor defines all necessary constraints required to impose upon the vehicle’s movement. The vehicle (i) belongs to the group of multi-vehicles (m), wherein [i ∈1…m].
[039] An example scenario of a multi-vehicle formation problem with three vehicles is illustrated in FIG.4. In the FIG.4, the three vehicles are attempting to the target formation Fr( to form a triangle) and the corresponding vehicle’s state vectors are represented as 1′, 2′ and 3′ . The current formation Fc is inscribed and current state vectors of vehicles are shown by (1,2 and 3) . Fp1k represents a possible formation by vehicle 1 because of the Kth signal vector, which is formed by hypothetical movement of vehicle 1 and current state vectors of neighbors (vehicle 2 and 3). In FIG.4, To attain the target formation, each vehicle
from the group of multi-vehicles must collaborate with each other or the vehicle’s neighbor.
[040] In an embodiment, the vehicle's neighbors comprises of at
least one neighbor vehicle (j) that is connected to the vehicle (i) by direct communication link(s) where an essential spacing between the vehicle (i) and the neighbor vehicle (j) is denoted by Considering the example scenario in the
FIG.4, the vehicle 1 has two neighbors – Vehicle 2 and Vehicle 3. The essential spacing between vehicle (i) and the neighbor vehicle (j) in an n-dimensional
state space ( assuming that vehicle (i) can access a state vector of all its neighbors) is expressed as shown below:
[041] The vehicle dynamics factor defines all necessary constraints required to impose upon the vehicle’s movement and is expressed as shown below:
(2)
where,
Ci is the d-dimensional control signal vector applied to vehicle i generated
from
[042] At the next step 304 of the method (300), a set of random control signal vectors is generated for each vehicle from the group of multi-vehicles
(m), via the random control signal vector generator 204. The set of random control signal vectors is generated based on a Rapidly-exploring random tree (RRT)
technique. The set of random control signal vectors comprises of a plurality of
control signal vectors .
[043] In an embodiment, the RRT generates control signal vectors randomly. However, every randomly generated control signal vector cannot generate a feasible move in the space, this is because of the actuation limitation of
the vehicle or vehicle dynamics as defined in equation 2. A vehicle can generate several next state vectors based on RRT corresponding to a number of control signal vectors. However, because of the limitations set by the vehicle dynamics as defined in equation 2, moving to all state vectors is not feasible. Hence, the disclosed RRT techniques identify the control signal vector, which leads to feasible state vectors of the vehicle. RRT inherently takes care of the equation (2) which naturally generates matrix of control signal vectors as shown in an example scenario in FIG.5A and FIG.5B.
[044] FIG.5A and FIG.5B shows examples of generation of random control signal vector for two vehicles based on their vehicle dynamics. The vehicle 1, in FIG.5A cannot move in lateral direction due to differential drive constraints on Vehicle 1 due to the vehicle’s dynamics. However, in case of Vehicle 2 illustrated in FIG.5B, a quadrotor with 6-degree of freedom can move in all possible directions. Hence, RRT generates the set of random control signal vectors ,
wherein the random control signal vectors comprises of a plurality of control
signal vectors based on the vehicle dynamics.
[045] At step 306 of the method (300), the target formation is generated by the group of multi-vehicles via the target formation module 206. The target formation is generated in each vehicle among the group of multi-vehicles based on several steps using the formation error module 208, the penalty module 210, the reward module 212 and the optimal control signal vector selector 214.
[046] The process of target formation by the group of multi-vehicles is performed in several steps as explained in exemplary flow diagram FIG.3B illustrating the method 300. The steps are explained in detail below:
[047] At the next step 306A of the method (300), a formation error is computed for each of the control signal vectors from the set of random control
signal vectors . The formation error is computed based on the state vector of
the vehicle , the state vector of the vehicle's neighborsand the target
formation of the multi-vehicle .
[048] In an embodiment, a possible formation is generated between the current formation Fc and the target formation Fr. For each generated
an error vector is generated between Fr and . The generated error vector is computed and named as the formation error . The
formation error for vehicle i, when the vehicle i selects is as shown
below:
where, the formation error vector between vehicle i and neighbor) is as shown below:
(5) and next state vector of because ofis :
(6)
here is the time step size in RRT from the plurality of sampling
parameters.
[049] At the next step 306B of the method (300), a penalty is computed for each of the plurality of control signal vectors (cik) . The penalty is computed based on a penalty computation technique.
[050] In an embodiment, the penalty computation technique comprises computing penalty based on generation of a safe envelope between a vehicle i and a vehicle) from the group of multi-vehicles (m) using the maximum velocity of the and the maximum velocity of the
[051] Penalty computation is performed for inter-vehicle collision avoidance, wherein if the vehicles are close, then the control signal vectors leading them closer are penalized. An example scenario for which inter-vehicle collision avoidance is required is illustrated in the FIG.6. In FIG.6, the vehicle i and the vehicle j are close enough to create a collision. However, as shown in FIG.6,
vehicles are inscribed by a safe envelope, wherein safe envelope is an imaginary sphere around the vehicle, centered at the vehicle’s center of mass. The safe envelope is expressed as the function of vehicle’s maximum velocity and RRT time step size ∆t. The safe envelope radius for the vehicle i and the vehicle j are expressed as shown below:
(7)
(8)
[052] In an example scenario, considering the FIG.6 if vehicle i generates three control signal vectors, , then it is apparent that penetrates the
safe envelope of vehicle j. Hence, should be penalized to avoid inter-vehicle
collision. Hence to ensure if a next state vector of vehicle i is inscribed by the safe envelope from vehicle j , then the below expression is applied:
(9) [053] Hence the penalty of vehicle i because of is
expressed as shown below:
(10)
where, penalty vector between vehicle i and vehicle due to
is as shown below:
(11)
[054] At the next step 306C of the method (300), a reward is computed for each of the plurality of control signal vectors . The reward is computed based
on the formation error and the penalty.
[055] In an embodiment, reward is computed based on the
formation error and the penalty for a corresponding control signal vectors .
The reward is expressed as shown below:
(12)
[056] At step 306C of the method (300), an optimal control signal vector is iteratively selected from the set of random control signal vectors Ci). The optimal control signal vector is iteratively selected using the rewards of the
plurality of control signal vectors based on the correlated equilibrium from
game theory technique to enable the target formation.
[057] In an embodiment, the correlated equilibrium from game theory comprises of maximizing the sum of rewards of all scalar components of the control signal vectors for each vehicle within the group of multi-vehicles (m) and is expressed as shown below :
(13)
wherein,
is a reward for the control signal vector .
[058] The equation (10) is based on the correlated equilibrium from game theory, which is based on the utilitarian equilibrium (UE). The optimal control signal vector is iteratively selected from the set of random control vectors (Ci)
individually be each vehicle in the group of multi-vehicles till the target formation is reached/enabled.
[059] The target formation enabled using the optimal control signal vector for the group of multi-vehicles is displayed/shared on an output module 216.
[060] Assuming vehicles are rational too, and they maintain privacy regarding the about to execute control signal vectors, which reduces the unnecessary communication burden over the wireless ad hoc network. Hence, unlike traditional game theoretic approaches, the disclosed formulation of joint (or collective) control signal vector is avoided. Rather, vehicle observes the state vector of the neighbor and updates the reward to finally select the optimal control signal vector to enable the target formation.
[061] EXPERIMENTS:
[062] Several experiments have been conducted to simulate the performance of the above disclosed for multi-vehicle formation based on correlated equilibrium from game theory. For experimentation purposes, Simulation are conducted in a system with an All the simulations are conducted in an Intel il acta - core processor™ with a clock speed of 2:90GHz and 16GB of RAM. An 20X 20X10 arena is considered for all simulations. The Gazebo based Hector platform™ is used for simulation, which offers Robot Operating System (ROS) packages related to modelling, control and simulation of aerial (UAV: unmanned aerial vehicle) and ground vehicle. In an embodiment, an aerial vehicle quadcopter with a pre-defined vehicle dynamic and a ground vehicle with a pre-defined vehicle dynamic are used. All the vehicles move with
because of the control signal vectors generated from RRT. The pre-defined parameters are empirically selected. In RRT, time step size, At is 0.3s. The maximum control signal vector generated by RRT for each vehicle is 300
. The value of tolerance and n-dimension tolerance respectively in Algorithm 1 are 0.8 and 0.1. ”penalty” for vehicle safety is considered as 1000 (large positive value). For square and triangle, and cube formation the inter-vehicle distance is considered as 5m. While for consensus formation the inter-vehicle distance is 0m. All the parameters are for the disclosed technique and the following results are obtained.
[063] Simulation results are explained in four parts. The simulation 1 is regarding the consensus formation among the vehicles. The simulation 2 includes results for homogeneous formation of the group of multi-vehicles. The simulation
3 covers results on heterogeneous formation of the group of multi-vehicles. Finally, the simulation 4 is regarding the multi-vehicle formation in 3D.
[064] The simulation 1 is regarding the consensus formation among the vehicles. The results of simulation 1 are illustrated in FIG.7 that four quadcopters’ trajectories are meeting to a single point, that is., they are forming consensus. FIG.8 shows the decaying tracking error while four quadcopters are forming consensus among themselves. It is also apparent from FIG.8 that consensus among four quadcopters is formed within 70s.
[065] The simulation 2 includes results for homogeneous formation. The FIG.9A and FIG.9B illustrates a homogeneous formation for four (FIG.9A) and three (FIG.9B) quadcopters.
[066] The simulation 3 covers results on heterogeneous formation of the group of multi-vehicles. The FIG.10 illustrates the heterogeneous formation among three quadcopters and one differential drive husky.
[067] The simulation 4 is regarding the multi-vehicle formation in 3D. A cube is formed by employing eight quadcopters as illustrated in FIG.11.
[068] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[069] The disclosure provides a method and system for multi-vehicle formation based on correlated equilibrium from game theory. The disclosure is an approach for multivehicle formation based rapidly-exploring random tree (RRT) and game theory. The disclosed technique for multi-vehicle formation is implemented in several steps that includes generation a set of random control signal vectors for each vehicle in the group of the multi-vehicles by employing the rapidly exploring random tree (RRT). Subsequently, for every control signal vector, the vehicle computes a formation error, a penalty and a reward considering the
neighbors state vector ( rather than neighbors’ control signal vector) . Among all the generated control signal vectors, the vehicle selects the optimal one by employing the merit of game theory using the reward. The process of RRT and game theory continues until the desired multi-vehicle formation is attained.
[070] Further the disclosed technique also provides an improvisation over traditional methods wherein, the disclosure of formulation of joint (or collective) control signal vector is avoided. Assuming vehicles are rational too, and maintain privacy regarding the their control signal vectors (which reduces the unnecessary communication burden over the wireless ad hoc network), the disclosed techniques enables multi-vehicle formation wherein the vehicle observes the state vector of the neighbor ( rather than the control signal vectors) and updates the reward to finally select the optimal control signal vector to enable the target formation.
[071] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[072] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by
18
various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[073] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[074] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include
random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[075] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
We Claim:
1. A processor-implemented method (300) comprising:
receiving a plurality of inputs associated with a group of multi-vehicles (m), via one or more hardware processors, the plurality of inputs comprising a target formation of the multi-vehicle , a state vector of a
vehicle from the group of multi-vehicles , a state vector of the vehicle's
neighbors a vehicle dynamics factor for the vehicle, a maximum
velocity each of the vehicle from the group of multi-vehicles, a
tolerance factor , a time step sizeand a plurality of sampling
parameters; (302)
generating a set of random control signal vectors for each vehicle
from the group of multi-vehicles (m), via one or more hardware processors, based on a Rapidly-exploring random tree (RRT) technique, wherein the set of random control signal vectors comprises of a plurality of control
signal vectors ; (304) and
generating the target formation by the group of multi-vehicles, comprising:
computing a formation error for each of the control signal vectors from the set of random control signal vectors
based on the state vector of the vehicle , the state vector of
the vehicle's neighbors and the target formation of the
multi-vehicle
computing a penalty for each of the plurality of control signal vectors based on a penalty computation technique;
(304B)
computing a reward for each of the plurality of control signal vectors based on the formation error and the penalty;
(304C) and
iteratively selecting an optimal control signal vector from the set of random control signal vectors using the
rewards of the plurality of control signal vectors based on
the correlated equilibrium from game theory technique to enable the target formation. (304D).
2. The method of claim 1, wherein the group of multi-vehicles is a collection of one of: multiple rational and self-interested interacting vehicles which collaboratively attain common target autonomously and comprises one of: a group of heterogeneous vehicles or a group of homogeneous vehicles.
3. The method of claim 1, wherein the target formation of the multi-vehicle formation is a goal formation to be formed by each of a vehicle in sync with all the other vehicles in the multi-vehicle formation.
4. The method of claim 1, wherein the vehicle's neighbors comprises of at least one neighbor vehicle (j) that is connected to the vehicle by direct communication link(s) where an essential spacing between the vehicle (i) and the neighbor vehicle(j) is denoted by
5. The method of claim 1, wherein the plurality of sampling parameters comprises a threshold number of control signal vectors for the vehicle (i) represented as and the vehicle dynamics factor defines all necessary constraints required to impose upon the vehicle’s movement.
6. The method of claim 1, wherein the penalty computation technique comprises computing penalty based on generation of a safe envelope between a vehicle i and a vehicle j from the group of multi-vehicles (m) using the maximum velocity of the ith vehicle and the maximum velocity of the jth vehicle
7. The method of claim 1, wherein the correlated equilibrium from game theory comprises of maximizing the sum of rewards of all scalar components of the control signal vectors for each vehicle within the group of multi-vehicles (m) and is expressed as shown below :
wherein,
is a reward for the control signal vector (Cik).
8. A system (100), comprising:
an input/output interface (106); one or more memories (102); and
one or more hardware processors (104), the one or more memories
(102) coupled to the one or more hardware processors (104), wherein the
one or more hardware processors (104) are configured to execute
programmed instructions stored in the one or more memories (102), to:
receive a plurality of inputs associated with a group of multi-vehicles
(m), via one or more hardware processors, the plurality of inputs comprising
a target formation of the multi-vehicle (Fr), a state vector of a vehicle from
the group of multi-vehicles (Xi), a state vector of the vehicle's neighbors
, a vehicle dynamics factor for the vehicle, a maximum velocity each
of the vehicle from the group of multi-vehicles, a tolerance factor
(ε), a time step size (∆t) and a plurality of sampling parameters;
generate a set of random control signal vectors (Ci ) for each vehicle from the group of multi-vehicles (m), via one or more hardware processors, based on a Rapidly-exploring random tree (RRT) technique, wherein the set of random control signal vectors (Ci) comprises of a plurality of control signal vectors (Cik) ; and
generate the target formation by the group of multi-vehicles, comprising:
computing a formation error for each of the control signal vectors (Cik) from the set of random control signal vectors (Ci)
based on the state vector of the vehicle , the state vector of
the vehicle's neighbors and the target formation of the
multi-vehicle
computing a penalty for each of the plurality of control signal vectors based on a penalty computation technique;
computing a reward for each of the plurality of control signal vectors based on the formation error and the penalty;
and
iteratively selecting an optimal control signal vector from the set of random control signal vectors using the
rewards of the plurality of control signal vectors based on
the correlated equilibrium from game theory technique to enable the target formation.
9. The system of claim 8, wherein the one or more hardware processors are configured by the instructions to perform the penalty computation technique comprises computing penalty based on generation of a safe envelope between a vehicle i and a vehicle j from the group of multi-vehicles (m) using the maximum velocity of the ith vehicle and the maximum velocity of the
10. The system of claim 8, wherein the one or more hardware processors are configured by the instructions to perform the correlated equilibrium from game theory comprises of maximizing the sum of rewards of all scalar components of the control signal vectors for each vehicle within the group of multi-vehicles (m) and is expressed as shown below :
wherein,
is a reward for the control signal vector
| # | Name | Date |
|---|---|---|
| 1 | 202121029153-STATEMENT OF UNDERTAKING (FORM 3) [29-06-2021(online)].pdf | 2021-06-29 |
| 2 | 202121029153-REQUEST FOR EXAMINATION (FORM-18) [29-06-2021(online)].pdf | 2021-06-29 |
| 3 | 202121029153-FORM 18 [29-06-2021(online)].pdf | 2021-06-29 |
| 4 | 202121029153-FORM 1 [29-06-2021(online)].pdf | 2021-06-29 |
| 5 | 202121029153-FIGURE OF ABSTRACT [29-06-2021(online)].jpg | 2021-06-29 |
| 6 | 202121029153-DRAWINGS [29-06-2021(online)].pdf | 2021-06-29 |
| 7 | 202121029153-DECLARATION OF INVENTORSHIP (FORM 5) [29-06-2021(online)].pdf | 2021-06-29 |
| 8 | 202121029153-COMPLETE SPECIFICATION [29-06-2021(online)].pdf | 2021-06-29 |
| 9 | 202121029153-Proof of Right [17-08-2021(online)].pdf | 2021-08-17 |
| 10 | 202121029153-FORM-26 [22-10-2021(online)].pdf | 2021-10-22 |
| 11 | Abstract1..jpg | 2021-12-13 |
| 12 | 202121029153-FER.pdf | 2023-04-20 |
| 13 | 202121029153-OTHERS [18-09-2023(online)].pdf | 2023-09-18 |
| 14 | 202121029153-FER_SER_REPLY [18-09-2023(online)].pdf | 2023-09-18 |
| 15 | 202121029153-CLAIMS [18-09-2023(online)].pdf | 2023-09-18 |
| 1 | 202121029153E_17-04-2023.pdf |