Abstract: The present disclosure relates to a method and a system for managing fault tolerance associated with an Auditor Service (AU). The method comprises receiving, by a Load Balancer (LB), a request message from one or more service instances. The method comprises identifying, by the LB, one or more available auditor service instances from a set of auditor service instances. The method comprises identifying, by the LB, a positive health status associated with the one or more available auditor service instances. The method comprises transmitting, by the LB to an available auditor service instance from the one or more available auditor service instances, the request message based on the positive health status. The method further comprises managing the fault tolerance associated with the AU by transmitting the request message. [FIG. 4]
FORM 2
THE PATENTS ACT, 1970
(39 OF 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See section 10 and rule 13)
“METHOD AND SYSTEM FOR MANAGING FAULT
TOLERANCE ASSOCIATED WITH AN AUDITOR SERVICE
UNIT”
We, Jio Platforms Limited, an Indian National, of Office - 101, Saffron, Nr.
Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad - 380006, Gujarat, India.
The following specification particularly describes the invention and the manner in
which it is to be performed.
METHOD AND SYSTEM FOR MANAGING FAULT TOLERANCE
ASSOCIATED WITH AN AUDITOR SERVICE UNIT
FIELD OF THE DISCLOSURE
[0001] Embodiments of the present disclosure generally relate to network
performance management systems. More particularly, embodiments of the present
disclosure relate to managing fault tolerance associated with an Auditor Service
(AU) unit.
BACKGROUND
[0002] The following description of the related art is intended to provide background information pertaining to the field of the disclosure. This section may include certain aspects of the art that may be related to various features of the present disclosure. However, it should be appreciated that this section is used only to enhance the understanding of the reader with respect to the present disclosure, and not as an admission of prior art.
[0003] 5G core networks are based on a service-based architecture (SBA) that is centred around network function (NF) services. In the said Service-Based Architecture (SBA), a set of interconnected Network Functions (NFs) delivers the control plane functionality and common data repositories of the 5G network, where each NF is authorized to access the services of other NFs. Particularly, each NF can register itself and its supported services with a Network Repository Function (NRF), which is used by other NFs for the discovery of NF instances and their services. Further, the network functions may include, but are not limited to, a cloud-native network function (CNF) and a virtual network function (VNF).
[0004] The CNFs are a set of small, independent, and loosely coupled services such as microservices. These microservices work independently, which may increase speed and flexibility while reducing deployment risk. In 5G communication, a cloud-native 5G network offers the fully digitized architecture necessary for deploying new cloud services and taking full advantage of cloud-native 5G features such as edge computing, as well as network slicing and other services. The VNFs, on the other hand, may run in virtual machines (VMs) on a common virtualization infrastructure. The VNFs may be created on top of a network function virtualization infrastructure (NFVI) which may allocate resources like compute, storage, and networking efficiently among the VNFs. MANO, which stands for Management and Orchestration, is a key NFV architectural framework that includes all the essential management modules and coordinates network resources in the NFV framework. Further, due to such vast usage and implementation of the CNFs and the VNFs, there is a need to maintain such microservice and application data in a secure way, so that the data is not lost and is retained safely in case of unwanted incidents such as a network service crash, an outage, cyber-attacks and any other undesirable incidents.
[0005] A network function virtualization (NFV) and software defined network (SDN) design function module platform provides the facility to act as a single platform to manage all the Virtual Network Functions (VNFs) and Cloud-Native Network Functions (CNFs) being deployed in a telecom network. As the platform is completely based on a micro-service architecture, it is highly scalable and is able to handle hundreds of NFVs. The platform is completely event driven and is based on standard REST Application Program Interfaces (APIs). An Auditor Service (AU) audits the resources in terms of physical memory, RAM and CPU at an Inventory Manager (IM). The IM maintains the virtual inventory and a limited physical inventory. It maintains the relation between physical and virtual resources (with respect to the overlay). Also, it describes physical and virtual resources with respect to different attributes using updates from external micro-services. Thus, its data accuracy depends on the micro-services which create, update and delete these resources and, at the same time, update these events with the IM. Other services can query IM relations, attributes, etc. using Query APIs provided by the IM. The AU brings the inventory into synchronization with the real-time available/used resources and minimizes the mismatch between the Inventory Manager (IM) and the real-time hardware. The data accuracy depends primarily on a Swarm Adaptor (SA) and the Inventory Manager (IM). The AU detects whether the hosts contain fewer or more containers than the number present in the inventory managed by the IM. It accordingly sends an API request to the IM to update its inventory. The AU interacts with these microservices (MS) to fetch the real-time data using various APIs. Due to high traffic on a particular instance of a dependent MS, or an unhealthy instance, requests might fail, which results in failure of the system. To avoid this type of scenario and to ensure that no server is overworked, there is a need for a new type of interface to resolve the distribution of incoming/outgoing requests among all auditor instances.
[0006] Thus, there exists an imperative need in the art to manage fault tolerance for any event failure via an interface by efficiently maintaining the distribution of incoming and outgoing requests among all AU instances, which the present disclosure aims to address.
SUMMARY
[0007] This section is provided to introduce certain aspects of the present disclosure in a simplified form that are further described below in the detailed description. This summary is not intended to identify the key features or the scope of the claimed subject matter.
[0008] An aspect of the present disclosure may relate to a method for managing a fault tolerance associated with an Auditor Service (AU) unit. The method comprises receiving, by a Load Balancer (LB) unit, a request message from one or more service instances. The method further comprises identifying, by the LB unit, one or more available auditor service instances from a set of auditor service instances. Further, the method comprises identifying, by the LB unit, a positive health status associated with the one or more available auditor service instances from the set of auditor service instances. The method further comprises transmitting, by the LB unit via an AU_LB interface to an available auditor service instance from the one or more available auditor service instances, the request message based on the positive health status. The method further comprises receiving, by the LB unit [304], a request message response from the available auditor service instance based on the request message. The method further comprises transmitting, by the LB unit [304], the request message response to the one or more service instances. The method further comprises managing, by the LB unit via the AU_LB interface, the fault tolerance associated with the AU unit by transmitting the request message based on the positive health status.
[0009] In an exemplary aspect of the present disclosure, the set of auditor service instances comprises at least one of the one or more available auditor service instances and one or more unavailable auditor service instances associated with the AU unit.
[0010] In an exemplary aspect of the present disclosure, the set of auditor service
instances is identified by an OAM unit via the AU_LB interface based on a
predefined auditor status determination rule.
[0011] In an exemplary aspect of the present disclosure, the managing, by the LB unit via the AU_LB interface, of the fault tolerance associated with the AU unit further comprises transmitting the request message to another available auditor service instance from the one or more available auditor service instances, in an event the available auditor service instance from the one or more available auditor service instances becomes unavailable during processing of the request message.
[0012] In an exemplary aspect of the present disclosure, the set of auditor service
instances is received at the LB unit from the OAM unit.
[0013] In an exemplary aspect of the present disclosure, the positive health status associated with the one or more available auditor service instances from the set of auditor service instances is received by the LB unit from an OAM unit based on a predefined health determination rule.
[0014] In an exemplary aspect of the present disclosure, at least the positive health status associated with the one or more available auditor service instances from the set of auditor service instances is received by the LB unit [304] from the OAM unit in real time.
[0015] Another aspect of the present disclosure may relate to a system for managing a fault tolerance associated with an Auditor Service (AU) unit. The system comprises at least a Load Balancer (LB) unit. The LB unit is configured to receive a request message from one or more service instances. The LB unit is further configured to identify, via an AU_LB interface, one or more available auditor service instances from a set of auditor service instances. Furthermore, the LB unit is configured to identify a positive health status associated with the one or more available auditor service instances from the set of auditor service instances. The LB unit is configured to transmit, via the AU_LB interface to an available auditor service instance from the one or more available auditor service instances, the request message based on the positive health status. The LB unit is configured to receive a request message response from the available auditor service instance based on the request message. The LB unit is configured to transmit the request message response to the one or more service instances. The LB unit is configured to manage, via the AU_LB interface, the fault tolerance associated with the AU unit by transmitting the request message based on the positive health status.
[0016] Yet another aspect of the present disclosure may relate to a non-transitory computer readable storage medium storing instructions for managing fault tolerance associated with an Auditor Service (AU) unit, the instructions including executable code which, when executed by one or more units of a system, causes a load balancer (LB) unit to receive a request message from one or more service instances. The instructions when executed by the system further cause the LB unit to identify, via an AU_LB interface, one or more available auditor service instances from a set of auditor service instances. The instructions when executed by the system further cause the LB unit to identify a positive health status associated with the one or more available auditor service instances from the set of auditor service instances. The instructions when executed by the system further cause the LB unit to transmit, via the AU_LB interface to an available auditor service instance from the one or more available auditor service instances, the request message based on the positive health status. The instructions when executed by the system further cause the LB unit to receive a request message response from the available auditor service instance based on the request message. The instructions when executed by the system further cause the LB unit to transmit the request message response to the one or more service instances. The instructions when executed by the system further cause the LB unit to manage, via the AU_LB interface, the fault tolerance associated with the AU unit by transmitting the request message based on the positive health status.
OBJECTS OF THE DISCLOSURE
[0017] Some of the objects of the present disclosure, which at least one embodiment disclosed herein satisfies, are listed herein below.
[0018] It is an object of the present disclosure to provide a system and a method for managing fault tolerance for any event failure via an interface by efficiently maintaining the distribution of incoming and outgoing requests among all AU instances.
[0019] It is another object of the present disclosure to provide a solution that effectively routes client requests across all servers in a manner that maximizes speed and capacity utilization.
[0020] It is another object of the invention to provide an async event-based
implementation to utilize the interface efficiently.
[0021] It is yet another object of the present disclosure to provide the ability to support HTTP/HTTPS in parallel (configurable).
DESCRIPTION OF THE DRAWINGS
[0022] The accompanying drawings, which are incorporated herein and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Also, the embodiments shown in the figures are not to be construed as limiting the disclosure, but the possible variants of the method and system according to the disclosure are illustrated herein to highlight the advantages of the disclosure. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components or circuitry commonly used to implement such components.
[0023] FIG. 1 illustrates an exemplary system architecture, in accordance with
exemplary implementations of the present disclosure.
[0024] FIG. 2 illustrates an exemplary block diagram of a computing device upon which the features of the present disclosure may be implemented, in accordance with exemplary implementations of the present disclosure.
[0025] FIG. 3 illustrates an exemplary block diagram of a system for managing a fault tolerance associated with an Auditor Service (AU) unit, in accordance with exemplary implementations of the present disclosure.
[0026] FIG. 4 illustrates a method flow diagram for managing a fault tolerance
associated with an Auditor Service (AU) unit, in accordance with exemplary
implementations of the present disclosure.
[0027] FIG. 5 illustrates an implementation of the method for managing a fault tolerance associated with an Auditor Service (AU) unit, in accordance with exemplary implementations of the present disclosure.
[0028] The foregoing shall be more apparent from the following more detailed description of the disclosure.
DETAILED DESCRIPTION
[0029] In the following description, for the purposes of explanation, various specific details are set forth in order to provide a thorough understanding of embodiments of the present disclosure. It will be apparent, however, that embodiments of the present disclosure may be practiced without these specific details. Several features described hereafter may each be used independently of one another or with any combination of other features. An individual feature may not address any of the problems discussed above or might address only some of the problems discussed above.
[0030] The ensuing description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the disclosure as set forth.
[0031] Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail.
[0032] Also, it is noted that individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations may be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure.
[0033] The word “exemplary” and/or “demonstrative” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “exemplary” and/or “demonstrative” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as an open transition word without precluding any additional or other elements.
[0034] As used herein, a “processing unit” or “processor” or “operating processor” includes one or more processors, wherein processor refers to any logic circuitry for processing instructions. A processor may be a general-purpose processor, a special purpose processor, a conventional processor, a digital signal processor, a plurality of microprocessors, one or more microprocessors in association with a Digital Signal Processing (DSP) core, a controller, a microcontroller, Application Specific Integrated Circuits, Field Programmable Gate Array circuits, any other type of integrated circuits, etc. The processor may perform signal coding, data processing, input/output processing, and/or any other functionality that enables the working of the system according to the present disclosure. More specifically, the processor or processing unit is a hardware processor.
[0035] As used herein, “a user equipment”, “a user device”, “a smart-user-device”, “a smart-device”, “an electronic device”, “a mobile device”, “a handheld device”, “a wireless communication device”, “a mobile communication device”, “a communication device” may be any electrical, electronic and/or computing device or equipment, capable of implementing the features of the present disclosure. The user equipment/device may include, but is not limited to, a mobile phone, smart phone, laptop, a general-purpose computer, desktop, personal digital assistant, tablet computer, wearable device or any other computing device which is capable of implementing the features of the present disclosure. Also, the user device may contain at least one input means configured to receive an input from at least one of a transceiver unit, a processing unit, a storage unit, a detection unit and any other such unit(s) which are required to implement the features of the present disclosure.
[0036] As used herein, “storage unit” or “memory unit” refers to a machine or computer-readable medium including any mechanism for storing information in a form readable by a computer or similar machine. For example, a computer-readable medium includes read-only memory (“ROM”), random access memory (“RAM”), magnetic disk storage media, optical storage media, flash memory devices or other types of machine-accessible storage media. The storage unit stores at least the data that may be required by one or more units of the system to perform their respective functions.
[0037] As used herein, “interface” or “user interface” refers to a shared boundary across which two or more separate components of a system exchange information or data. The interface may also refer to a set of rules or protocols that define the communication or interaction of one or more modules or one or more units with each other, which also includes the methods, functions, or procedures that may be called.
[0038] All modules, units, components used herein, unless explicitly excluded herein, may be software modules or hardware processors, the processors being a general-purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASIC), Field Programmable Gate Array circuits (FPGA), any other type of integrated circuits, etc.
[0039] As used herein, the transceiver unit includes at least one receiver and at least one transmitter configured respectively for receiving and transmitting data, signals, information or a combination thereof between units/components within the system and/or connected with the system.
[0040] As discussed in the background section, a network function virtualization (NFV) and software defined network (SDN) design function module platform provides the facility to act as a single platform to manage all the Virtual Network Functions (VNFs) and Cloud-Native Network Functions (CNFs) being deployed in a telecom network. As the platform is completely based on a micro-service architecture, it is highly scalable and is able to handle hundreds of NFVs. The platform is completely event driven and is based on standard REST Application Program Interfaces (APIs). The Auditor Service (AU) audits the resources in terms of physical memory, RAM and CPU at the Inventory Manager (IM). The AU brings the inventory into close sync with the real-time available/used resources and minimizes the mismatch between the Inventory Manager (IM) and the real-time hardware. The data accuracy depends primarily on the Swarm Adaptor (SA) and the Inventory Manager (IM). The AU receives information from the SA, which constantly monitors the resources. Further, the AU reconciles any differences between the real-time available and used resource information obtained from the SA and the information present in the IM. The AU detects whether the hosts contain fewer or more containers than the number present in the inventory managed by the IM, by querying the SA via API requests. It accordingly sends API requests to the IM to update its inventory. The AU interacts with these microservices to fetch the real-time data using various APIs. Due to high traffic on a particular instance of a dependent MS, or an unhealthy instance, requests might fail, which results in failure of the system. To avoid this type of scenario and to ensure that no server is overworked, there is a need for a new type of interface to resolve the distribution of incoming/outgoing requests among all auditor instances. Furthermore, the currently known solutions have several shortcomings.
[0041] Thus, there exists an imperative need in the art to manage fault tolerance for any event failure via an interface by efficiently maintaining the distribution of incoming and outgoing requests among all AU instances, which the present disclosure aims to address. The present disclosure aims to overcome the above-mentioned and other existing problems in this field of technology by providing a method and a system for managing a fault tolerance associated with an Auditor Service (AU) unit.
[0042] Referring to FIG. 1, an exemplary system architecture [100] comprises a user interface (UI) [102] or a user experience (UX), an Elastic Load Balancer (ELB) 1 [104], an ELB 2 [106], an Identity and Access Management (IAM) [108], an Event Routing Manager (ERM) unit [110], an ELB 3 [112], an ELB 4 [114], Auditor service instances (AU) [116], an Elastic Search Database [118], an Orchestration Manager (OAM) unit [120], and a Central Log Management Service (CLMS) [122], wherein all the components are assumed to be connected to each other in a manner as obvious to the person skilled in the art for implementing the features of the present disclosure. The Auditor service instances [116] may comprise multiple instances as shown in FIG. 1, such as AU 1 [116-1], AU 2 [116-2], AU 3 [116-3] … AU N [116-N].
[0043] The UI/UX [102] refers to an interface for a user to interact with the system architecture [100]. The system architecture [100] provides a graphically rich and alluring UI/UX interface [102] which helps the user to on-board VNFs and CNFs, design a Network Service Chain, define CNF auto scaling and healing policies, and instantiate Network Services and CNFs. It also allows users to create storage volume pools and availability zones and to define host aggregates. The UI/UX [102] also provides a one-stop solution to users who want to investigate a mismatch of resources at the server level and at the inventory’s database. The user may be a system operator, a network consumer, and the like. The UI/UX [102] may be one of a graphical user interface (GUI), a command line interface (CLI), and the like. The GUI refers to an interface to interact with the system [100] by the user through a visual or graphical representation of icons, menus, etc. The GUI is an interface that may be used within a smartphone, laptop, computer, etc. The CLI refers to a text-based interface to interact with the system [100] by the user. The user may input text lines called command lines in the CLI to access the data in the system.
[0044] The Elastic Load Balancers (ELBs) such as ELB 1 [104], ELB 2 [106], ELB 3 [112], and ELB 4 [114] are exemplary ELBs. The system may include ‘N’ number of ELBs. The ELBs are configured to distribute the load on the auditor service instances (AU 1 [116-1], AU 2 [116-2] … AU N [116-N]). The ELBs are further configured to distribute the traffic of incoming requests (JSON) based on the availability of the auditor service instances. The distribution of traffic enhances fault tolerance. The fault tolerance refers to a capability of the auditor service system [116] to handle failures even when one of the auditor service instances fails.
[0045] The IAM [108] refers to Identity and Access Management, which verifies the identity of the user who is trying to access the system. The verification may include entering a password, a biometric, and the like. Based on the authentication, the IAM unit [108] may authorise access for the user. The verification and the authorisation ensure that the user who is making the access request gets access to the system, and that the user has the right level of access to the system. In this way, the IAM also acts as a gateway to access other micro services. The IAM provides tokens for API access to other microservices. Further, the IAM [108] is configured to define roles and access privileges of the user.
[0046] The Event Routing Manager (ERM) unit [110] is a central routing manager to which an MS publishes an event; the ERM unit [110] then sends the JSON to whichever other MSs have subscribed to the same event. Here, the ERM unit [110] routes the incoming events to the appropriate auditor service instances [116] in an asynchronous manner. The auditor service instances [116] include one or more auditor service instances, where each instance audits resources in terms of physical memory, RAM and CPU at an Inventory Manager and a Swarm Adaptor. Each of the auditor service instances may be served by at least two ELB units. For instance, the AU 1 [116-1] may be served by the ELB 3 [112] and the ELB 4 [114]. The one or more auditor service instances may have an active status or an inactive status. A non-limiting sketch of such publish/subscribe routing is given below.
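By way of a non-limiting illustration only, the following Python sketch outlines how such asynchronous, subscription-based event routing may be realised. The class and handler names (EventRoutingManager, auditor_handler) are hypothetical and are not part of the drawings; they merely assume that subscribed microservices expose asynchronous callables.

import asyncio
from collections import defaultdict

class EventRoutingManager:
    # Minimal publish/subscribe router: an MS publishes an event and the
    # router forwards the JSON payload to every MS that subscribed to it.
    def __init__(self):
        self._subscribers = defaultdict(list)  # event name -> list of handlers

    def subscribe(self, event_name, handler):
        self._subscribers[event_name].append(handler)

    async def publish(self, event_name, payload):
        # Deliver the event asynchronously to all subscribed microservices.
        await asyncio.gather(*(h(payload) for h in self._subscribers[event_name]))

# Hypothetical usage: an auditor instance subscribes to an audit event.
async def auditor_handler(payload):
    print("auditor received:", payload)

async def main():
    erm = EventRoutingManager()
    erm.subscribe("GET_CNFC_INVENTORY_AUDIT", auditor_handler)
    await erm.publish("GET_CNFC_INVENTORY_AUDIT", {"region": "region-1"})

asyncio.run(main())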
[0047] The Elastic Search Database (DB) [118] refers to a database that organizes data into documents. The documents are grouped under different headers based on characteristics of the data. The Elastic Search Database [118] stores the data and performs searches and analyses of the data quickly and in real time to give a response in milliseconds. The Elastic Search Database [118] produces a fast search response by performing the search in the header instead of in the whole data. All micro services not only maintain the state information in their local cache but also persist it in the Elastic Search DB [118].
[0048] The OAM unit [120] is a framework that stores data of the auditor service instances. The data includes, but may not be limited to, an internet protocol (IP) address, a port, and a server disk location. The OAM unit [120] is further configured to maintain a ping-pong communication with all the instances of the auditor service [116]. The OAM unit [120] maintains the ping-pong communication to check whether an instance is running or is down. All instances not only maintain the state information in their local cache but also persist it in the Elastic Search DB [118]. In case one of the instances goes down, the OAM unit [120] detects it and broadcasts the status to the other running instances and also to the ELB serving the instance which has gone down, as sketched below.
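Purely as a non-limiting sketch, the following Python fragment (using the well-known requests HTTP client; the /health endpoint, instance names and addresses are hypothetical) illustrates one possible form of the ping-pong check and of the status payload that the OAM unit may broadcast.

import requests  # assumed HTTP client; the actual transport is not mandated

def check_instances(instances, timeout_s=2):
    # Ping every registered auditor instance; return the broadcast payload
    # describing which instances are running and which are down.
    status = {}
    for name, base_url in instances.items():
        try:
            # "Ping" the OAM client of the instance; a reply is the "pong".
            reply = requests.get(f"{base_url}/health", timeout=timeout_s)
            status[name] = "RUNNING" if reply.ok else "DOWN"
        except requests.RequestException:
            status[name] = "DOWN"
    return {"event": "AU_INSTANCE_STATUS", "instances": status}

# Hypothetical registry of auditor instances (IP address and port held by the OAM unit).
broadcast = check_instances({"AU-1": "http://10.0.0.11:8080",
                             "AU-2": "http://10.0.0.12:8080"})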
[0049] FIG. 2 illustrates an exemplary block diagram of a computing device [200] upon which the features of the present disclosure may be implemented, in accordance with exemplary implementations of the present disclosure. In an implementation, the computing device [200] may also implement a method for managing a fault tolerance associated with an Auditor Service (AU) unit, utilising the system. In another implementation, the computing device [200] itself implements the method for managing a fault tolerance associated with an Auditor Service (AU) unit, using one or more units configured within the computing device [200], wherein said one or more units are capable of implementing the features as disclosed in the present disclosure.
[0050] The computing device [200] may include a bus [202] or other communication mechanism for communicating information, and a hardware processor [204] coupled with the bus [202] for processing information. The hardware processor [204] may be, for example, a general-purpose microprocessor. The computing device [200] may also include a main memory [206], such as a random-access memory (RAM), or other dynamic storage device, coupled to the bus [202] for storing information and instructions to be executed by the processor [204]. The main memory [206] also may be used for storing temporary variables or other intermediate information during execution of the instructions to be executed by the processor [204]. Such instructions, when stored in non-transitory storage media accessible to the processor [204], render the computing device [200] into a special-purpose machine that is customized to perform the operations specified in the instructions. The computing device [200] further includes a read only memory (ROM) [208] or other static storage device coupled to the bus [202] for storing static information and instructions for the processor [204].
[0051] A storage device [210], such as a magnetic disk, optical disk, or solid-state drive is provided and coupled to the bus [202] for storing information and instructions. The computing device [200] may be coupled via the bus [202] to a display [212], such as a cathode ray tube (CRT), Liquid Crystal Display (LCD), Light Emitting Diode (LED) display, Organic LED (OLED) display, etc. for displaying information to a computer user. An input device [214], including alphanumeric and other keys, touch screen input means, etc. may be coupled to the bus [202] for communicating information and command selections to the processor [204]. Another type of user input device may be a cursor controller [216], such as a mouse, a trackball, or cursor direction keys, for communicating direction information and command selections to the processor [204], and for controlling cursor movement on the display [212]. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allow the device to specify positions in a plane.
[0052] The computing device [200] may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computing device [200] causes or programs the computing device [200] to be a special-purpose machine. According to one implementation, the techniques herein are performed by the computing device [200] in response to the processor [204] executing one or more sequences of one or more instructions contained in the main memory [206]. Such instructions may be read into the main memory [206] from another storage medium, such as the storage device [210]. Execution of the sequences of instructions contained in the main memory [206] causes the processor [204] to perform the process steps described herein. In alternative implementations of the present disclosure, hard-wired circuitry may be used in place of or in combination with software instructions.
[0053] The computing device [200] also may include a communication interface [218] coupled to the bus [202]. The communication interface [218] provides a two-way data communication coupling to a network link [220] that is connected to a local network [222]. For example, the communication interface [218] may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, the communication interface [218] may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, the communication interface [218] sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
[0054] The computing device [200] can send messages and receive data, including program code, through the network(s), the network link [220] and the communication interface [218]. In the Internet example, a server [230] might transmit a requested code for an application program through the Internet [228], the ISP [226], the local network [222], the host [224] and the communication interface [218]. The received code may be executed by the processor [204] as it is received, and/or stored in the storage device [210], or other non-volatile storage for later execution.
[0055] The present disclosure is implemented by a system [300] (as shown in FIG.
3). In an implementation, the system [300] may include the computing device [200]
(as shown in FIG. 2). It is further noted that the computing device [200] is able to
perform the steps of a method [400] (as shown in FIG. 4).
[0056] Referring to FIG. 3, an exemplary block diagram of a system [300] for managing a fault tolerance associated with an Auditor Service (AU) unit [302] is shown, in accordance with the exemplary implementations of the present disclosure. The system [300] comprises at least one Auditor Service (AU) unit [302], at least one Load Balancer (LB) unit [304] and at least one AU_LB interface [306]. Also, all of the components/units of the system [300] are assumed to be connected to each other unless otherwise indicated below. As shown in FIG. 3, all units shown within the system should also be assumed to be connected to each other. Also, in FIG. 3 only a few units are shown; however, the system [300] may comprise multiple such units, or the system [300] may comprise any such number of said units, as required to implement the features of the present disclosure. In an implementation, the system [300] may reside in a server or a network entity. In yet another implementation, the system [300] may reside partly in the server/network entity.
[0057] The system [300] is configured for managing fault tolerance associated with an Auditor Service (AU) unit, with the help of the interconnection between the components/units of the system [300]. The system [300] is based on a microservice-based architecture. The fault tolerance refers to a capability of a microservice-based system to handle failures even when one of the microservice instances fails. The fault tolerance avoids complete failure of the microservice-based system. These microservices (MSs) have specific tasks and functionality which they need to perform. The MSs work collectively to achieve the overall functionality of the system [300]. Each MS exposes certain APIs which are called by other micro services. A microservice refers to an independent service in a system that performs a specific function. Each microservice may include one or more microservice instances. The one or more microservice instances are responsible for handling requests related to a specific functionality. The AU unit audits the resources in terms of physical memory, RAM and CPU at the Inventory Manager. It brings the inventory into close sync with the real-time available/used resources and minimizes the mismatch between the Inventory Manager (IM) and the real-time hardware, by querying a Swarm Adaptor via API requests. The Swarm Adaptor constantly monitors the real-time resources available and used, and the AU reconciles any differences between the real-time available and used resources and the information present in the IM.
[0058] The Load Balancer (LB) unit [304] is configured to receive a request message from one or more service instances. The service can be a producer service or a consumer service. The consumer service builds a request and sends a POST request to the producer service. The producer service then builds a response and returns it to the consumer service. In an exemplary implementation, the service may comprise one or more microservices. A microservice is a small, loosely coupled distributed service. Each microservice is designed to perform a specific business function and can be developed, deployed, and scaled independently. For example, one microservice may be defined to handle capacity management tasks, while another can be defined to handle routing tasks in a network architecture. In the present disclosure, for example, the auditor service is a microservice which is implemented as part of the larger architecture. Further, the request message includes parameters required for an application programming interface (API) related to a process/event for which the request message is sent. For example, the request message includes parameters like an event name, an identifier of the LB unit [304], a message type, and the like. In an implementation, the event name is the name of the API. For example:
X-Event-Name=GET_DETAILS_FROM_TARGET_MS
[0059] An exemplary implementation for the event name may be GET_CNFC_INVENTORY_AUDIT. This event is for fetching the audit report of actual and allocated resources used by a CNF ranging over various nodes in a region. A non-limiting sketch of such a request is given below.
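By way of a non-limiting illustration only, the following Python sketch (using the well-known requests library) shows how a consumer service instance might build and POST such a request message towards the LB unit. The endpoint URL and the JSON body field names are hypothetical; only the X-Event-Name header follows the example given above.

import requests  # assumed HTTP client; any REST client may be used

LB_URL = "http://au-lb.example.internal:8080/events"  # hypothetical LB endpoint

def send_audit_request(cnf_id, region):
    # The event name identifies the API to be invoked at the auditor service.
    headers = {"X-Event-Name": "GET_CNFC_INVENTORY_AUDIT",
               "Content-Type": "application/json"}
    # Hypothetical JSON body carrying the parameters of the process/event.
    body = {"cnf_id": cnf_id, "region": region, "message_type": "REQUEST"}
    response = requests.post(LB_URL, json=body, headers=headers, timeout=5)
    response.raise_for_status()
    return response.json()  # the request message response (e.g., the audit report)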
[0060] In one example, the request message may be to identify the RAM and CPU usage of another service instance (MS). The request message will be processed via the one or more instances of the AU unit [302]. The one or more instances of the AU unit [302] refer to multiple segments within the AU unit [302] that handle multiple incoming requests at a given time.
[0061] When an auditor service instance starts, it registers itself with the OAM unit [120]. Further, the OAM unit [120] stores data about the initiated set of auditor service instances. The data may include an internet protocol (IP) address, a port, a server disk location, and the like. The port is an endpoint in a connection that allows a system to differentiate between multiple services or applications running on the same IP address. The server disk location refers to a specific path on a server’s storage where files, applications, or data are stored. The OAM unit [120] maintains a ping-pong communication with the set of auditor service instances. The ping-pong communication may check whether an instance from the set of auditor service instances is running or down. So, on request, the OAM unit [120] broadcasts a message to all the instances registered to it. This message may be in JSON format and contains information on whether an auditor service instance is running or down. Thus, with this process the OAM unit [120] obtains the availability status; this process is referred to as the predefined auditor status determination rule. In this way, the one or more available auditor service instances from the set of auditor service instances are identified based on the predefined auditor status determination rule, as sketched below.
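As a non-limiting illustration, the following Python sketch shows how an auditor service instance might register its IP address, port and server disk location with the OAM unit at start-up. The registration endpoint, field names and instance identifiers are hypothetical.

import requests  # assumed HTTP client

OAM_URL = "http://oam.example.internal:9090"  # hypothetical OAM endpoint

def register_instance(instance_id, ip_address, port, disk_location):
    # Data stored by the OAM unit about the initiated auditor service instance.
    record = {"instance_id": instance_id,
              "ip": ip_address,
              "port": port,
              "server_disk_location": disk_location}
    requests.post(f"{OAM_URL}/register", json=record, timeout=5).raise_for_status()

# Hypothetical registration of auditor instance AU-1 at start-up.
register_instance("AU-1", "10.0.0.11", 8080, "/var/lib/auditor/au-1")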
[0062] In an exemplary aspect of the present disclosure, a set of auditor service instances is received at the LB unit [304] from the Orchestration Manager (OAM) unit [120]. Further, the set of auditor service instances comprises at least one of the one or more available auditor service instances and one or more unavailable auditor service instances associated with the AU unit [302]. The list of the auditor service instances includes at least one of one or more available auditor service instances, and one or more unavailable auditor service instances associated with the AU unit [302]. The one or more available auditor service instances refer to the one or more instances of the AU unit [302] that may be available to process the request message. The one or more unavailable auditor service instances refer to the one or more instances of the AU unit [302] that are not available to process the request message. The one or more auditor service instances may be unavailable because another request is being processed by them. To automatically distribute incoming traffic across multiple instances of the AU unit [302], the AU_LB interface [306] may monitor the one or more instances of the AU unit [302].
[0063] In an exemplary aspect of the present disclosure, the LB unit [304] identifies one or more available auditor service instances from the set of auditor service instances. As described earlier, the LB unit [304] receives the set of auditor service instances from the OAM unit [120]. The set of auditor service instances comprises the one or more available auditor service instances and the one or more unavailable auditor service instances. The LB unit [304], on receiving the request message, identifies the one or more available auditor service instances. The one or more available auditor service instances refer to the instances of the AU unit [302] that may be available to process the request message.
[0064] Further, a positive health status is received by the LB unit [304] from the Orchestration Manager (OAM) unit [120] based on a predefined health determination rule. The predefined health determination rule includes the OAM unit [120] performing the ping-pong communication with the one or more available auditor service instances. The OAM unit [120] sends the message request to an OAM client that is located in the one or more available auditor service instances. If the one or more auditor service instances are running, the corresponding OAM client sends a response back to the OAM unit [120]. If the response is received, it means the one or more available auditor service instances are running. Based on the response received, the one or more available auditor service instances may be identified with the positive health status. Further, the positive health status associated with the one or more available auditor service instances is received in real time.
[0065] In an exemplary aspect of the present disclosure, the LB unit [304] is configured to identify a health status of the one or more available auditor service instances from the set of auditor service instances of the AU unit [302]. As described earlier, the LB unit [304] receives the positive health status associated with the one or more available auditor service instances from the OAM unit [120]. The LB unit [304] then identifies the positive health status from the already identified one or more available auditor service instances. The health status may be one of a positive health status or a negative health status. The positive health status is associated with the one or more available service instances of the AU unit [302] that are running. The negative health status is associated with the one or more available service instances of the AU unit [302] that are not running.
[0066] Hereinafter, based on the positive health status of an available auditor service instance, the LB unit [304] is configured to transmit the request message to an available auditor service instance from the one or more available auditor service instances. The request message may be transmitted via the AU_LB interface [306]. The AU_LB interface [306] is configured to distribute all incoming requests and outgoing requests. The distribution of the incoming requests balances the load equally on the AU unit [302]. Once the LB unit [304] has determined the health status of the one or more available auditor service instances, the LB unit [304] transmits the received request message to an available auditor service instance which has a positive health status, as sketched below.
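The following non-limiting Python sketch illustrates how the LB unit may filter the set of auditor service instances down to the available instances with a positive health status and forward the request message to one of them. The instance record fields, the health values and the /audit forwarding path are hypothetical.

import requests  # assumed HTTP client

def forward_request(instances, headers, body):
    # instances: list of dicts received from the OAM unit, e.g.
    # {"url": "http://10.0.0.11:8080", "available": True, "health": "POSITIVE"}
    # Keep only the available instances that report a positive health status.
    healthy = [i for i in instances if i["available"] and i["health"] == "POSITIVE"]
    if not healthy:
        raise RuntimeError("no available auditor service instance with positive health status")
    target = healthy[0]  # the selection policy (e.g., round robin) may vary
    response = requests.post(f"{target['url']}/audit", json=body, headers=headers, timeout=5)
    response.raise_for_status()
    return response.json()  # request message response returned to the service instance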
[0067] In an exemplary aspect of the present disclosure, the LB unit [304] receives a request message response from the available auditor service instance based on the request message. The request message response is sent by the available auditor service instance which processed the request message sent by the one or more service instances. The request message response comprises response values to the parameters included in the request message. In an implementation, the request message may comprise an event name which is the name of the API. For example:
X-Event-Name=GET_DETAILS_FROM_TARGET_MS
[0068] An exemplary implementation for the event name may be GET_CNFC_INVENTORY_AUDIT. This event is for fetching the audit report of actual and allocated resources used by a CNF ranging over various nodes in a region. Based on this request message, the available auditor service instance will provide the request message response, which will include the audit report of the actual and allocated resources of the CNF.
[0069] In an exemplary aspect of the present disclosure, the LB unit [304] transmits
the request message response to the one or more service instances.
[0070] Further, each of the auditor service instances may be served by at least two LB units. The LB unit [304] is configured to distribute incoming request messages on each of the auditor service instances by round robin scheduling. The round robin scheduling refers to an algorithm to distribute tasks in a uniform manner, where the incoming requests may be placed in a queue. The LB unit [304] may select the first request message in the queue, and a time duration for execution is allocated to each of the request messages. After expiry of the time duration, the LB unit [304] may select the next request in the queue. Furthermore, the LB unit [304] may ensure that an acknowledgement of receiving the request message is sent back to the service instance which may have sent the request message. A minimal sketch of such round robin distribution is given below.
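A minimal, non-limiting sketch of such round robin distribution follows, in Python. The queue contents, instance addresses and time-slice value are hypothetical, and the sketch only records the dispatch decision rather than executing the requests.

from collections import deque
from itertools import cycle

def round_robin_dispatch(request_queue, instance_urls, time_slice_s=1.0):
    # Distribute queued request messages uniformly over the auditor instances.
    # Each request is notionally given a time slice; here the slice value is
    # only recorded alongside the dispatch decision.
    if not instance_urls:
        raise ValueError("no auditor service instance to dispatch to")
    dispatch_plan = []
    instances = cycle(instance_urls)       # cycle through instances in fixed order
    queue = deque(request_queue)
    while queue:
        request = queue.popleft()          # select the first request in the queue
        target = next(instances)           # next instance in round robin order
        dispatch_plan.append({"request": request,
                              "instance": target,
                              "time_slice_s": time_slice_s})
    return dispatch_plan

# Hypothetical usage with two auditor instances.
plan = round_robin_dispatch(["req-1", "req-2", "req-3"],
                            ["http://10.0.0.11:8080", "http://10.0.0.12:8080"])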
[0071] Based on the above, it is seen that the LB unit [304] is configured to manage the fault tolerance associated with the AU unit [302] by transmitting the request message via the AU_LB interface [306]. The request message is transmitted by the AU_LB interface [306] based on the positive health status of an available auditor service instance of the AU unit [302]. Further, to manage the fault tolerance associated with the AU unit [302], the LB unit [304] is configured to transmit the request message to another available auditor service instance from the one or more available auditor service instances, in an event the available auditor service instance from the one or more available auditor service instances becomes unavailable during processing of the request message, ensuring that the AU unit [302] does not shut down in case of a fault occurrence. To transmit the request message to another available auditor service instance, the LB unit [304] fetches state information of the incomplete request message being served from the instance that may have become unavailable. The state information of the event included in the request message indicates the step that the event has reached in the overall flow. The state information may be one of: event received, data fetched, processing done, data insertion in the database completed, or response sent to the requesting microservice. The state information may be stored in a database after completion of every step.
[0072] In an implementation, all auditor service instances maintain the state information in their local cache and also persist it in the Elastic Search (ES) Database (DB). In case one of the instances goes down, the OAM unit [120] detects it and broadcasts the status to the other running auditor instances and also to the LB unit [304] serving the instance. The LB unit [304] as such distributes the ingress traffic on the remaining available instances. One of the available auditor service instances takes ownership of the instance which has gone down. It fetches the state information of the incomplete transactions being served by the instance which has gone down from the ES and re-executes them. In case any transaction has not been persisted in the ES, there will be a timeout, and the publisher service instance of that request message will re-transmit the same for execution to the LB unit [304], and the LB unit [304] transmits it to a healthy and available auditor service instance. A timeout will also occur when the request message acknowledgement is not sent by the auditor service instance in time, indicating a failure at the instance.
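Purely as a non-limiting sketch, the following Python fragment shows one possible form of the takeover of incomplete transactions. The terminal state value, document fields and the fetch_states/re_execute callables are hypothetical; retrieval from the ES DB is abstracted behind fetch_states so that no particular client library is assumed.

COMPLETED = "RESPONSE_SENT_TO_REQUESTING_MICROSERVICE"  # hypothetical terminal state

def take_over(down_instance_id, fetch_states, re_execute):
    # fetch_states: callable returning the state documents persisted in the ES DB
    # for the given instance; re_execute: callable that replays one transaction.
    # Returns the identifiers of the transactions that were re-executed.
    re_executed = []
    for doc in fetch_states(down_instance_id):
        # Only the incomplete transactions of the failed instance are replayed.
        if doc.get("state") != COMPLETED:
            re_execute(doc)
            re_executed.append(doc.get("transaction_id"))
    return re_executed

# Hypothetical usage with in-memory state documents standing in for the ES DB.
states = [{"transaction_id": "t-1", "state": "DATA_FETCHED"},
          {"transaction_id": "t-2", "state": COMPLETED}]
replayed = take_over("AU-2", lambda _id: states, lambda doc: None)  # -> ["t-1"]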
[0073] Referring to FIG. 4, an exemplary method flow diagram [400] for managing fault tolerance associated with an Auditor Service (AU) unit [302] is shown, in accordance with exemplary implementations of the present disclosure. In an implementation, the method [400] is performed by the system [300]. Further, in an implementation, the system [300] may be present in a server device to implement the features of the present disclosure. Also, as shown in FIG. 4, the method [400] starts at step [402].
[0074] At step [404], the method comprises receiving, by a Load Balancer (LB) unit [304], a request message from one or more service instances. In an implementation, each of the service instances may comprise one or more microservices (MSs). A microservice (MS) is a small, loosely coupled distributed service. These microservices (MSs) have specific tasks and functionality which they need to perform. For example, one microservice may be defined to handle capacity management tasks, while another can be defined to handle routing tasks in a network. The MSs work collectively to achieve the overall functionality of the system [300]. Each MS exposes certain APIs which are called by other micro services. Further, the request message includes parameters required for an application programming interface (API) related to a process/event for which the request message is sent. For example, the request message includes parameters like an event name, an identifier of the LB unit [304], a message type, and the like. In an implementation, the event name is the name of the API. For example:
X-Event-Name=GET_DETAILS_FROM_TARGET_MS
[0075] An exemplary implementation for the event name may be GET_CNFC_INVENTORY_AUDIT. This event is for fetching the audit report of actual and allocated resources used by a CNF ranging over various nodes in a region.
[0076] When an auditor service instance starts, it registers itself with an Orchestration Manager (OAM) unit [120]. Further, the OAM unit [120] stores data about the initiated set of auditor service instances. The data may include an internet protocol (IP) address, a port, a server disk location, and the like. The port is an endpoint in a connection that allows a system to differentiate between multiple services or applications running on the same IP address. The server disk location refers to a specific path on a server’s storage where files, applications, or data are stored. The OAM unit [120] maintains a ping-pong communication with the set of auditor service instances. The ping-pong communication may check whether an instance from the set of auditor service instances is running or down. So, on request, the OAM unit [120] broadcasts a JSON to all the instances registered to it. This JSON contains information on whether an auditor service instance is running or down. Thus, this process obtains the availability status and is referred to as the predefined auditor status determination rule. In this way, the one or more available auditor service instances from the set of auditor service instances are identified based on the predefined auditor status determination rule.
[0077] In an exemplary aspect of the present disclosure, a set of auditor service instances is received at the LB unit [304] from the Orchestration Manager (OAM) unit [120]. Further, the set of auditor service instances comprises at least one of the one or more available auditor service instances and one or more unavailable auditor service instances associated with the AU unit [302]. The list of the auditor service instances includes at least one of one or more available auditor service instances, and one or more unavailable auditor service instances associated with the AU unit [302]. The one or more available auditor service instances refer to the one or more instances of the AU unit [302] that may be available to process the request message. The one or more unavailable auditor service instances refer to the one or more instances of the AU unit [302] that are not available to process the request message. The one or more auditor service instances may be unavailable because another request is being processed by them. To automatically distribute incoming traffic across multiple instances of the AU unit [302], the AU_LB interface [306] may monitor the one or more instances of the AU unit [302].
[0078] At step [406], the method comprises identifying, by the LB unit [304], one or more available auditor service instances from a set of auditor service instances. As described earlier, the LB unit [304] receives the set of auditor service instances from the OAM unit [120]. The set of auditor service instances comprises the one or more available auditor service instances and the one or more unavailable auditor service instances. The LB unit [304], on receiving the request message, identifies the one or more available auditor service instances. The one or more available auditor service instances refer to the instances of the AU unit [302] that may be available to process the request message.
[0079] Further, a positive health status is received by the LB unit [304] from the Orchestration Manager (OAM) unit [120] based on a predefined health determination rule. The predefined health determination rule requires the OAM unit [120] to perform the ping-pong communication with the one or more available auditor service instances. The OAM unit [120] sends the request to an OAM client that is located in the one or more available auditor service instances. If an auditor service instance is running, the corresponding OAM client sends a response back to the OAM unit [120]. If the response is received, it means the one or more available auditor service instances are running. Based on the response received, the one or more available auditor service instances may be identified with the positive health status. Further, the positive health status associated with the one or more available auditor service instances is received in real time.
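In a non-limiting illustration of the predefined health determination rule, the ping-pong check performed by the OAM unit [120] may be sketched as follows; the HTTP-style endpoint, URL pattern, and timeout value are illustrative assumptions and not part of the disclosure:

    # Minimal sketch, assuming an HTTP-style OAM client endpoint, of the
    # ping-pong health check performed by the OAM unit [120].
    import urllib.request

    def ping_pong_health(instance_host, timeout_seconds=2):
        """Return True (positive health) if the OAM client in the instance responds."""
        try:
            url = f"http://{instance_host}/oam-client/ping"   # hypothetical endpoint
            with urllib.request.urlopen(url, timeout=timeout_seconds) as resp:
                return resp.status == 200        # response received: instance is running
        except OSError:
            return False                         # no response: instance not running

    def health_statuses(available_instances):
        """Map each available auditor service instance to a positive/negative health status."""
        return {host: ("POSITIVE" if ping_pong_health(host) else "NEGATIVE")
                for host in available_instances}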
[0080] Next, at Step [408], the method comprises identifying, by the LB unit [304], a health status of one or more available auditor service instances from the set of auditor service instances of the AU unit [302]. As described earlier, the LB unit [304] receives the positive health status associated with the one or more available auditor service instances from the OAM unit [120]. The LB unit [304] then identifies, from among the already identified one or more available auditor service instances, those having the positive health status. The health status may be one of a positive health status or a negative health status. The positive health status is associated with the one or more available service instances of the AU unit [302] that are running. The negative health status is associated with the one or more available service instances of the AU unit [302] that are not running.
[0081] Next, at Step [410], the method comprises transmitting, by the LB unit [304], the request message to an available auditor service instance from the one or more available auditor service instances, based on the positive health status. The request message may be transmitted via the AU_LB interface [306]. The AU_LB interface [306] is configured to distribute all incoming requests and outgoing requests. The distribution of the incoming requests balances the load equally on the AU unit [302]. Once the LB unit [304] has determined the health status of the one or more available auditor service instances, the LB unit [304] transmits the received request message to an available auditor service instance which has a positive health status.
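A non-limiting sketch of this selection is given below; the forward_via_au_lb callable is a hypothetical placeholder standing in for the AU_LB interface [306], and the "first healthy instance" choice is an assumption made only to keep the example short:

    # Illustrative sketch of how the LB unit [304] might select a target: it keeps
    # only available instances whose health status is positive and forwards the
    # request message to one of them.
    def select_and_transmit(request_message, available_instances,
                            health_status_map, forward_via_au_lb):
        healthy = [i for i in available_instances
                   if health_status_map.get(i) == "POSITIVE"]
        if not healthy:
            raise RuntimeError("no auditor service instance with a positive health status")
        target = healthy[0]              # selection policy left to the scheduler (e.g. round robin)
        return forward_via_au_lb(target, request_message)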
[0082] Next, at Step [412], the method comprises receiving, by the LB unit [304], a request message response from the available auditor service instance based on the request message. The request message response is sent by the available auditor service instance which processed the request message sent by the one or more service instances. The request message response comprises response values for the parameters included in the request message. In an implementation, the request message may comprise an event name, which is the name of the API. For example:
X-Event-Name=GET_DETAILS_FROM_TARGET_MS
[0083] An exemplary implementation of the event name may be GET_CNFC_INVENTORY_AUDIT. This event is for fetching the audit report of actual and allocated resources used by a CNF across various nodes in a region. Based on this request message, the available auditor service instance provides the request message response, which includes the audit report of actual and allocated resources for the CNF.
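For illustration only, the request message and its response for this event may take a shape such as the following; the header follows the example above, while the remaining field names and the report contents are hypothetical placeholder values:

    # Hypothetical shape of a request message and its response for the
    # GET_CNFC_INVENTORY_AUDIT event (placeholder field names and values).
    request_message = {
        "headers": {"X-Event-Name": "GET_CNFC_INVENTORY_AUDIT"},
        "lb_id": "lb-304",                   # identifier of the LB unit [304]
        "resource_info": {"cnf": "cnf-1", "region": "region-a"},
        "message_type": "REQUEST",
    }

    request_message_response = {
        "headers": {"X-Event-Name": "GET_CNFC_INVENTORY_AUDIT"},
        "message_type": "RESPONSE",
        "audit_report": {                    # actual vs. allocated resources for the CNF
            "node-1": {"actual_cpu": 4, "allocated_cpu": 6},
            "node-2": {"actual_cpu": 2, "allocated_cpu": 2},
        },
    }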
[0084] Next, at Step [414], the LB unit [304] transmits the request message response to the one or more service instances.
[0085] Each of the auditor service instances may be served by at least two Load Balancer units. The LB unit [304] is configured to distribute incoming request messages across the microservice instances by round robin scheduling. Round robin scheduling refers to an algorithm to distribute tasks in a uniform manner, where the incoming requests may be placed in a queue. The LB unit [304] may select the first request message in the queue, and a time duration for execution is allocated to each of the request messages. After expiry of the time duration, the LB unit [304] may select the next request in the queue. Furthermore, the LB unit [304] may ensure that an acknowledgement of receiving the request message is sent back to the microservice instance which sent the request message.
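A minimal, non-limiting sketch of such round robin distribution is shown below; the dispatch and acknowledge callables are hypothetical placeholders, and the per-request time duration described above is omitted for brevity:

    # Sketch of round-robin distribution of queued request messages across
    # auditor service instances, with an acknowledgement returned to the sender.
    from collections import deque
    from itertools import cycle

    def round_robin_dispatch(request_queue, instances, dispatch, acknowledge):
        """Pop requests in arrival order and hand each to the next instance in turn."""
        targets = cycle(instances)
        while request_queue:
            request = request_queue.popleft()
            acknowledge(request["sender"])   # confirm receipt to the publishing microservice
            dispatch(next(targets), request)

    # Usage with placeholder callables and a single queued request.
    queue = deque([{"sender": "inventory-manager", "event": "GET_CNFC_INVENTORY_AUDIT"}])
    round_robin_dispatch(queue, ["au-1", "au-2"],
                         lambda inst, req: None, lambda sender: None)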
[0086] Next, at step [416], the method comprises managing, by the LB unit [304] via the AU_LB interface [306], the fault tolerance associated with the AU unit [302] by transmitting the request message based on the positive health status. The request message is transmitted by the AU_LB interface [306] based on the positive health status of an available auditor service instance of the AU unit [302]. Further, to manage the fault tolerance associated with the AU unit [302], the LB unit [304] is configured to transmit the request message to another available auditor service instance from the one or more available auditor service instances, in an event the available auditor service instance from the one or more available auditor service instances becomes unavailable during processing of the request message, ensuring that the AU unit [302] does not shut down in case of a fault occurrence. To transmit the request message to another available auditor service instance, the LB unit [304] fetches the state information of the incomplete request message that was being served by the instance that has become unavailable. The state information of the event included in the request message indicates the step that the event has reached in the overall flow. The state information may be one of: event received, data fetched, processing done, data insertion in database completed, or response sent to the requesting microservice. The state information may be stored in a database after completion of every step.
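A non-limiting sketch of such per-event state tracking is given below; the state labels mirror the steps listed above, and the persist_state callable is a hypothetical placeholder standing in for the database write:

    # Illustrative sketch of the per-event state information, persisted after
    # every completed step so that another instance can resume an incomplete request.
    EVENT_STATES = [
        "EVENT_RECEIVED",
        "DATA_FETCHED",
        "PROCESSING_DONE",
        "DATA_INSERTED_IN_DB",
        "RESPONSE_SENT",
    ]

    def advance_state(event_id, current_state, persist_state):
        """Move the event to the next state in the overall flow and persist it."""
        next_index = EVENT_STATES.index(current_state) + 1
        if next_index >= len(EVENT_STATES):
            return current_state             # already at the final state
        new_state = EVENT_STATES[next_index]
        persist_state(event_id, new_state)   # stored after completion of every step
        return new_state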
[0087] In an implementation, all auditor service instances maintain the state information in their local cache and persist it in an Elastic Search (ES) Database (DB). In case one of the instances goes down, the OAM unit [120] detects it and broadcasts the status to the other running auditor instances and also to the LB unit [304] serving the instance. The LB unit [304] then distributes the ingress traffic across the remaining available instances. One of the available auditor service instances takes ownership of the instance which has gone down. It fetches from the ES the state information of the incomplete transactions that were being served by the instance which has gone down and re-executes them. In case any transaction has not been persisted in the ES, there will be a timeout, and the publisher service instance of that request message will re-transmit the same for execution to the LB unit [304], and the LB unit [304] transmits it to a healthy and available auditor service instance. A timeout will also occur when the request message acknowledgement is not sent by the auditor service instance in time, indicating a failure at the instance.
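A minimal failover sketch, under the assumptions stated above, is shown below; the es_fetch_incomplete and re_execute callables are hypothetical placeholders for the ES read and for re-execution of a transaction, and the "first running instance takes ownership" choice is illustrative only:

    # Sketch: when an instance goes down, a surviving instance takes ownership,
    # reads the incomplete transaction states persisted in the ES database, and
    # re-executes them.
    def handle_instance_down(down_instance, running_instances,
                             es_fetch_incomplete, re_execute):
        if not running_instances:
            raise RuntimeError("no running auditor service instance to take ownership")
        owner = running_instances[0]                     # one available instance takes ownership
        incomplete = es_fetch_incomplete(down_instance)  # state info of unfinished transactions
        for transaction in incomplete:
            re_execute(owner, transaction)               # resume from the last persisted step
        return owner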
[0088] The method thereafter terminates at step [418].
[0089] Referring to FIG. 5, an exemplary implementation [500] of the method for managing fault tolerance associated with an Auditor Service (AU) unit is shown in accordance with exemplary implementations of the present disclosure. Further, FIG. 5 is intended to be read in conjunction with FIG. 1 and FIG. 3.
[0090] At Step [502], the Orchestration Manager (OAM) unit [120] sends a set of auditor service instances to the Load Balancer (LB) unit [304]. The set of auditor service instances comprises at least one of the one or more available auditor service instances and one or more unavailable auditor service instances associated with the AU unit [302]. The one or more available auditor service instances refer to the one or more instances of the AU unit [302] that may be available to process a request message from a service instance or a microservice instance. Further, a positive health status is received by the LB unit [304] from the OAM unit [120] based on a predefined health determination rule. The predefined health determination rule requires the OAM unit [120] to perform the ping-pong communication with the one or more available auditor service instances. The OAM unit [120] sends the request to an OAM client that is located in the one or more available auditor service instances. If an auditor service instance is running, the corresponding OAM client sends a response back to the OAM unit [120]. If the response is received, it means the one or more available auditor service instances are running. Based on the response received, the one or more available auditor service instances may be identified with the positive health status. Further, the positive health status associated with the one or more available auditor service instances is received in real time.
[0091] At Step [504], the one or more microservices [501] may send a request message to the Load Balancer (LB) unit [304]. Further, the request message relates to a process/event and includes parameters required for the application programming interface (API) for which the request message is sent. Here, each of the one or more microservices [501] is a publisher of the event. Examples of the publisher microservices may be an Inventory Manager (IM) and a Swarm Adaptor (SA). For example, the request message includes parameters such as an event name, an identifier of the LB unit [304], resource information, a message type, and the like. The request message is then received by the LB unit [304] from the one or more microservice instances [501]. Further, based on the request message, the LB unit [304] identifies an available auditor service instance from the set of auditor service instances received from the OAM unit [120]. Further, the LB unit [304] also identifies the health status associated with the identified available auditor service instance.
[0092] Next, at step [506], the LB unit [304] transmits the request message to an available auditor service instance from the one or more available auditor service instances, based on the positive health status. The request message is transmitted via the AU_LB interface [306]. The AU_LB interface [306] is configured to distribute all incoming requests and outgoing requests. The distribution of the incoming requests balances the load equally on the AU unit [302].
[0093] At Step [508], the LB unit [304] receives a request message response from the available auditor service instance based on the request message. The request message response is sent by the available auditor service instance which processed the request message sent by the one or more service instances. The request message response comprises response values for the parameters included in the request message. In an implementation, the request message may comprise an event name, which is the name of the API. For example:
X-Event-Name=GET_DETAILS_FROM_TARGET_MS
[0094] An exemplary implementation of the event name may be GET_CNFC_INVENTORY_AUDIT. This event is for fetching the audit report of actual and allocated resources used by a CNF across various nodes in a region. Based on this request message, the available auditor service instance provides the request message response, which includes the audit report of actual and allocated resources for the CNF.
[0095] Finally, at Step [510], the LB unit [304] transmits the request message response to the one or more service instances that sent the request message.
[0096] In an implementation, all auditor service instances maintain the state information in their local cache and also persist it in an Elastic Search (ES) Database (DB). In case one of the instances goes down, the OAM unit [120] detects it and broadcasts the status to the other running auditor instances and also to the LB unit [304] serving the instance. The LB unit [304] then distributes the ingress traffic across the remaining available instances. One of the available auditor service instances takes ownership of the instance which has gone down. It fetches from the ES the state information of the incomplete transactions that were being served by the instance which has gone down and re-executes them. In case any transaction has not been persisted in the ES, there will be a timeout, and the publisher service instance of that request message will re-transmit the same for execution to the LB unit [304], and the LB unit [304] transmits it to a healthy and available auditor service instance. A timeout will also occur when the request message acknowledgement is not sent by the auditor service instance in time, indicating a failure at the instance.
[0097] The present disclosure further discloses a non-transitory computer readable storage medium storing instructions for managing fault tolerance associated with an Auditor Service (AU) unit, the instructions including executable code which, when executed by one or more units of a system, causes a Load Balancer (LB) unit [304] to receive a request message from one or more service instances. The instructions when executed by the system further cause the LB unit to identify, via the AU_LB interface, one or more available auditor service instances from the set of auditor service instances. The instructions when executed by the system further cause the LB unit to identify a positive health status associated with the one or more available auditor service instances from the set of auditor service instances. The instructions when executed by the system further cause the LB unit to transmit, via the AU_LB interface to an available auditor service instance from the one or more available auditor service instances, the request message based on the positive health status. The instructions when executed by the system further cause the LB unit to receive a request message response from the available auditor service instance based on the request message. The instructions when executed by the system further cause the LB unit to transmit the request message response to the one or more service instances. The instructions when executed by the system further cause the LB unit to manage, via the AU_LB interface, the fault tolerance associated with the AU unit by transmitting the request message based on the positive health status.
[0098] As is evident from the above, the present disclosure provides a technically advanced solution for managing fault tolerance associated with an Auditor Service (AU) unit. The present disclosure provides a system and a method for managing fault tolerance for any event failure via an AU_LB interface by efficiently maintaining the distribution of incoming and outgoing requests among all AU instances. The present disclosure further provides a solution that effectively routes client requests across all servers in a manner that maximizes speed and capacity utilization. Further, the present disclosure provides the ability to support HTTP/HTTPS in parallel. Additionally, requests might fail due to high traffic on a particular instance of a dependent MS or due to an unhealthy auditor service instance, which results in failure of the system. To avoid this type of scenario and to ensure that no server is overworked, the AU_LB interface provides a means by which incoming/outgoing requests can be easily distributed among all auditor service instances.
[0099] While considerable emphasis has been placed herein on the disclosed implementations, it will be appreciated that many implementations can be made and that many changes can be made to the implementations without departing from the principles of the present disclosure. These and other changes in the implementations of the present disclosure will be apparent to those skilled in the art, whereby it is to be understood that the foregoing descriptive matter is illustrative and non-limiting.
[0100] Further, in accordance with the present disclosure, it is to be acknowledged that the functionality described for the various components/units can be implemented interchangeably. While specific embodiments may disclose a particular functionality of these units for clarity, it is recognized that various configurations and combinations thereof are within the scope of the disclosure. The functionality of specific units as disclosed in the disclosure should not be construed as limiting the scope of the present disclosure. Consequently, alternative arrangements and substitutions of units, provided they achieve the intended functionality described herein, are considered to be encompassed within the scope of the present disclosure.
We Claim:
1. A method [400] for managing fault tolerance associated with an Auditor Service (AU) unit [302], the method comprising:
- receiving, by a Load Balancer (LB) unit [304], a request message from one or more service instances;
- identifying, by the LB unit [304], one or more available auditor service instances from a set of auditor service instances;
- identifying, by the LB unit [304], a positive health status associated with the one or more available auditor service instances from the set of available auditor service instances;
- transmitting, by the LB unit [304] via the AU_LB interface [306], the request message to an available auditor service instance from the one or more available auditor service instances based on the positive health status;
- receiving, by the LB unit [304], a request message response from the available auditor service instance based on the request message;
- transmitting, by the LB unit [304], the request message response to the one or more service instances; and
- managing, by the LB unit [304] via the AU_LB interface [306], the fault tolerance associated with the AU unit [302] by transmitting the request message based on the positive health status.

2. The method [400] as claimed in claim 1, wherein the set of auditor service instances comprises at least one of the one or more available auditor service instances and one or more unavailable auditor service instances associated with the AU unit [302].

3. The method [400] as claimed in claim 2, wherein the set of auditor service instances is identified by an Orchestration Manager (OAM) unit [120] via the AU_LB interface [306] based on a predefined auditor status determination rule.
4. The method as claimed in claim 3, wherein the set of auditor service instances
is received at the LB unit [304] from the OAM unit [120].
5. The method [400] as claimed in claim 1, wherein the managing, by the LB unit [304] via the AU_LB interface [306], the fault tolerance associated with the AU unit [302] further comprises:
transmitting the request message to another available auditor service instance from the one or more available auditor service instances, in an event the available auditor service instance from the one or more available auditor service instances becomes unavailable during processing of the request message.

6. The method [400] as claimed in claim 3, wherein the positive health status associated with the one or more available auditor service instances from the set of available auditor service instances is received by the LB unit [304] from the OAM unit [120] based on a predefined health determination rule.

7. The method [400] as claimed in claim 6, wherein at least the positive health status associated with the one or more available auditor service instances from the set of available auditor service instances is received by the LB unit [304] from the OAM unit [120] in real time.
8. A system [300] for optimising fault tolerance associated with an Auditor Service (AU) unit, the system comprising:
- at least a Load Balancer (LB) unit [304], wherein the load balancer unit is configured to:
• receive a request message from one or more service instances;
• identify one or more available auditor service instances from a set of auditor service instances;
• identify a positive health status associated with the one or more available auditor service instances from the set of auditor service instances;
• transmit, via the AU_LB interface [306] to an available auditor service instance from the one or more available auditor service instances, the request message based on the positive health status;
• receive a request message response from the available auditor service instance based on the request message;
• transmit the request message response to the one or more service instances; and
• manage, via the AU_LB interface [306], the fault tolerance associated with the AU unit [302] by transmitting the request message based on the positive health status.
9. The system [300] as claimed in claim 8, wherein the set of auditor service instances comprises at least one of the one or more available auditor service instances and one or more unavailable auditor service instances associated with the AU unit [302].

10. The system [300] as claimed in claim 8, wherein the one or more available auditor service instances from the set of auditor service instances are identified by the Orchestration Manager (OAM) unit [120] via the AU_LB interface [306] based on a predefined auditor status determination rule.
11. The system as claimed in claim 10, wherein the set of auditor service instances
is received at the LB unit [304] from the OAM unit [120].
12. The system [300] as claimed in claim 8, wherein, to manage the fault tolerance associated with the AU unit [302], the LB unit [304] via the AU_LB interface [306] is further configured to:
transmit the request message to another available auditor service instance from the one or more available auditor service instances, in an event the available auditor service instance from the one or more available auditor service instances becomes unavailable during processing of the request message.
13. The system [300] as claimed in claim 10, wherein the positive health status associated with the one or more available auditor service instances from the set of available auditor service instances is received by the LB unit [304] from the OAM unit [120] based on a predefined health determination rule.
14. The system [300] as claimed in claim 13, wherein at least the positive health
status associated with the one or more available auditor service instances from
the set of available auditor service instances is received by the LB unit [304] from the OAM unit [120] in real time.