Abstract: SYSTEM AND METHOD FOR MANAGING SERVERS CONFIGURED IN A CLOUD NETWORK The present disclosure relates to a system (115) and a method of managing servers configured in a cloud network (110). The system (115) fetches information related to one or more servers added to the cloud network (110) and fetches details of containers running on the one or more servers. The system (115) distributes the details of the containers to a plurality of agent managers. The plurality of agent managers fetches metrics of the one or more servers. Ref. Fig. 1
DESC:
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION
SYSTEM AND METHOD FOR MANAGING SERVERS CONFIGURED IN A CLOUD NETWORK
2. APPLICANT(S)
NAME NATIONALITY ADDRESS
JIO PLATFORMS LIMITED INDIAN OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA
3.PREAMBLE TO THE DESCRIPTION
THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.
FIELD OF THE INVENTION
[0001] The present invention relates to the field of cloud networks. More particularly, the invention pertains to a system and method for automatically capturing metrics for servers added to a cloud network.
BACKGROUND OF THE INVENTION
[0002] Virtualization is a process that allows a computing device to share its hardware resources with multiple digitally separated environments. For resource virtualization, the computing devices such as servers are made available over cloud networks, to different users.
[0003] Currently, when a modification is made to the servers configured over a cloud network, for example when a new server is added, the server details and metrics of the server need to be added manually into a system configured for monitoring the entire environment. Because of this manual process, real-time visibility cannot be maintained and issues occurring in the server cannot be managed efficiently. Further, performance anomalies cannot be monitored effectively.
[0004] Hence, there is a need for a system and a method that can provide real-time visibility of the entire cloud environment, including the servers that are modified over the cloud network. Such a system and method should enable efficient management and provide a proactive response to any issues or performance anomalies.
BRIEF SUMMARY OF THE INVENTION
[0005] One or more embodiments of the present disclosure provide a system and method of managing servers configured in a cloud network.
[0006] In one aspect of the present invention, a system for managing servers configured in a cloud network is disclosed. The system includes an infrastructure manager configured to fetch information related to one or more servers added to the cloud network, fetch details of containers running on the one or more servers, and distribute the details of the containers to a plurality of agent managers. The infrastructure manager is further configured to monitor modification of the one or more servers, identify the addition of a new server, and collect metrics of the new server. In one implementation, the details of the containers are distributed equally between the plurality of agent managers or distributed between the plurality of agent managers based on prior load at each agent manager. The plurality of agent managers is configured to fetch metrics of the one or more servers. The metrics data is collected from a Virtual Machine (VM).
[0007] In one aspect, the system further comprises a metric ingestion unit configured to validate metrics from the broker topics and perform data enrichment.
[0008] In one aspect, the system further comprises an infrastructure normalizer unit configured to process, filter, and store the metrics into a distributed data lake.
[0009] In one aspect, the system further comprises a Machine Learning/Artificial Intelligence (ML/AI) unit configured to detect metric anomalies and trigger forecasting for metrics upon detecting the metric anomalies. The ML/AI unit may operate using one of supervised, unsupervised, and reinforcement learning techniques. The supervised learning techniques may include decision trees, support vector machines, Naïve-Bayes classifier, and K-Nearest Neighbors. The unsupervised learning techniques may include K-means clustering, hierarchical clustering, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Gaussian Mixture Models (GMM).
[0010] In one aspect, the system further comprises a forecasting engine configured to obtain a request from the ML/AI unit for taking a pre-emptive action for the metric anomalies, using a threshold value. The forecasting engine has a capability of performing network expansion based on data trends from the ML/AI unit.
[0011] In one aspect, the system further comprises a reporting and alarm engine configured to generate an alarm for the metric anomalies based on an instruction received from the ML/AI unit.
[0012] In one aspect, the system further comprises an anomaly detection engine configured to obtain an anomaly request from the ML/AI unit to take an action in a closed loop.
[0013] In another aspect of the present invention, a method of managing servers configured in a cloud network is disclosed. The method includes the steps of fetching information related to one or more servers added to the cloud network, fetching details of containers running on the one or more servers, and distributing the details of the containers to a plurality of agent managers. The method further includes the steps of monitoring modification of the one or more servers, identifying the addition of a new server, and collecting metrics of the new server. The details of the containers are distributed equally between the plurality of agent managers or distributed between the plurality of agent managers based on prior load at each agent manager. The method further includes the step of fetching metrics of the one or more servers. The metrics data is collected from a Virtual Machine (VM).
[0014] Other features and aspects of this invention will be apparent from the following description and the accompanying drawings. The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art, in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0016] FIG. 1 illustrates an exemplary block diagram of an environment for managing servers configured in a cloud network, according to one or more embodiments of the present invention;
[0017] FIG. 2 illustrates an exemplary block diagram of the system for managing servers configured in the cloud network, according to one or more embodiments of the present invention;
[0018] FIG. 3 illustrates a block diagram of the environment including the system for managing servers configured in the cloud network, according to one or more embodiments of the present invention;
[0019] FIG. 4 illustrates a flow diagram of a method of procuring metrics for a server, according to one or more embodiments of the present invention; and
[0020] FIG. 5 illustrates a flow diagram of a method of managing servers configured in a cloud network, according to one or more embodiments of the present invention.
[0021] Further, skilled artisans will appreciate that elements in the drawings are illustrated for simplicity and may not have necessarily been drawn to scale. For example, the flow charts illustrate the method in terms of the most prominent steps involved to help to improve understanding of aspects of the present invention. Furthermore, in terms of the construction of the device, one or more components of the device may have been represented in the drawings by conventional symbols, and the drawings may show only those specific details that are pertinent to understanding the embodiments of the present invention so as not to obscure the drawings with details that will be readily apparent to those of ordinary skill in the art having benefit of the description herein.
DETAILED DESCRIPTION OF THE INVENTION
[0022] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. It must also be noted that as used herein and in the appended claims, the singular forms “a”, “an” and “the” include plural references unless the context clearly dictates otherwise.
[0023] Various modifications to the embodiment will be readily apparent to those skilled in the art, and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure, including the definitions listed herein below, is not intended to be limited to the embodiments illustrated but is to be accorded the widest scope consistent with the principles and features described herein.
[0024] A person of ordinary skill in the art will readily ascertain that the illustrated steps detailed in the figures and here below are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0025] The present disclosure provides a method and system for managing servers configured in a cloud network. The system automatically determines performance metrics of a server that is newly added to a cloud network. The system is also called a Cloud-Native Infrastructure System (CNIS) and is configured to automatically capture metrics of servers newly added to the cloud network. Observability and monitoring components of the CNIS can detect and integrate the new server seamlessly. The CNIS utilizes automated discovery mechanisms, including network scanning, agent-based monitoring, or integration with infrastructure management tools. Once a new server is detected, the CNIS automatically starts capturing relevant metrics and performance data from the server.
[0026] As the process of capturing metrics for new servers is completely automated, the monitoring and observability capabilities of the CNIS are easily extendable to the entire infrastructure without requiring any manual intervention or setup. This allows for real-time visibility and monitoring of the entire environment, enabling efficient management and a proactive response to any issues or performance anomalies using advanced AI/ML algorithms.
[0027] In an embodiment, a user can give custom commands to fetch relevant metrics from a server. Alternatively, the server can run its own predefined set of commands. The system is configured to adapt to a user command automatically for understanding the output by leveraging Artificial Intelligence/Machine Learning (AI/ML) algorithms. The system is also configured to automatically decide, based on the underlying OS/hardware, which set of key metrics and commands needs to be executed.
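By way of a non-limiting illustration, the selection between user-supplied custom commands and an OS-dependent predefined command set could be sketched as follows; the command table and function names are hypothetical and do not form part of the claimed subject matter:

```python
import platform
import subprocess

# Hypothetical table mapping the underlying OS to a predefined set of
# key metric commands; the entries are illustrative only.
METRIC_COMMANDS = {
    "Linux": ["cat /proc/loadavg", "free -m", "df -h"],
    "Windows": ["wmic cpu get loadpercentage"],
}

def select_commands(custom_commands=None):
    """Prefer user-supplied custom commands; otherwise pick the
    predefined set for the detected operating system."""
    if custom_commands:
        return list(custom_commands)
    return METRIC_COMMANDS.get(platform.system(), [])

def run_commands(commands):
    """Fire each command and collect its raw output for the
    AI/ML-assisted output-understanding stage."""
    return {cmd: subprocess.run(cmd, shell=True, capture_output=True,
                                text=True).stdout
            for cmd in commands}
```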
[0028] FIG. 1 illustrates an exemplary block diagram of an environment 100 for managing servers configured in a cloud network 110, according to one or more embodiments of the present invention. The environment 100 includes a plurality of hosts 105 (represented as a first host 105-1, a second host 105-2, and an nth host 105-n) connected with the cloud network 110. The hosts 105 can be understood as computing devices communicating with each other in a computer network. The cloud network 110 can be understood as a Wide Area Network (WAN) hosting users and resources and allowing the two to communicate via cloud-based technologies. The cloud network 110 may consist of servers, memories, virtual routers, firewalls, and network management software. The cloud network 110 may be a public, private, hybrid, or community cloud network.
[0029] The environment 100 further includes a system 115 for managing servers configured in the cloud network 110. The system 115 is communicably coupled with the plurality of hosts 105 and the cloud network 110. The system 115 may be a computing device having a User Interface (UI), such as a laptop, general-purpose computer, desktop, personal digital assistant, tablet computer, or mainframe computer.
[0030] In some implementations, the system 115 may include, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof.
[0031] The environment 100 further includes a distributed data lake 120 communicably coupled to the system 115. The distributed data lake 120 is a data repository providing storage and computing for structured and unstructured data, such as for machine learning, streaming, or data science. The distributed data lake 120 allows users and/or organizations to ingest and manage large volumes of data in an aggregated storage solution for business intelligence or data products. The distributed data lake 120 may be implemented and utilize different technologies. For example, in one implementation, Hadoop may be deployed with the Spark processing engine and HBase, a NoSQL database that runs on top of Hadoop Distributed File System (HDFS). In another implementation, Spark may be used against data stored in Amazon Simple Storage Service (S3).
[0032] Operational and construction features of the system 115 will be explained in detail successively with respect to different figures.
[0033] FIG. 2 illustrates an exemplary block diagram of the system 115 for managing servers configured in the cloud network 110, according to one or more embodiments of the present invention.
[0034] As per the illustrated embodiment, the system 115 includes one or more processors 205, a memory 210, and an input/output interface unit 215. The one or more processors 205, hereinafter referred to as the processor 205, may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, single board computers, and/or any devices that manipulate signals based on operational instructions. It is to be noted that the system 115 may include multiple processors as per the requirement, without deviating from the scope of the present disclosure. Among other capabilities, the one or more processors 205 are configured to fetch and execute computer-readable instructions stored in the memory 210. The memory 210 may be configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory 210 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as EPROM, flash memory, and the like.
[0035] In an embodiment, the input/output (I/O) interface unit 215 includes a variety of interfaces, for example, interfaces for data input and output devices, referred to as Input/Output (I/O) devices, storage devices, and the like. The I/O interface unit 215 facilitates communication of the system 115. In one embodiment, the I/O interface unit 215 provides a communication pathway for one or more components of the system 115. Examples of such components include, but are not limited to, a backend database 220 and a distributed cache 225.
[0036] The backend database 220 is one of, but is not limited to, a centralized database, a cloud-based database, a commercial database, an open-source database, a distributed database, an end-user database, a graphical database, a Non-Structured Query Language (NoSQL) database, an object-oriented database, a personal database, an in-memory database, a document-based database, a time series database, a wide column database, a key value database, a search database, a cache database, and so forth. The foregoing examples of backend database 220 types are non-limiting and are not mutually exclusive; e.g., a database can be both commercial and cloud-based, or both relational and open-source.
[0037] The distributed cache 225 is a pool of random-access memory (RAM) of multiple networked computers into a single in-memory data store for use as a data cache to provide fast access to data. The distributed cache 225 is essential for applications that need to scale across multiple servers or are distributed geographically. The distributed cache 225 ensures that data is available close to where it’s needed, even if the original data source is remote or under heavy load.
[0038] Further, the one or more processors 205, in an embodiment, may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the one or more processors 205. In the examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the one or more processors 205 may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for one or more processors 205 may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the memory 210 may store instructions that, when executed by the processing resource, implement the one or more processors 205. In such examples, the system 115 may comprise the memory 210 storing the instructions and the processing resource to execute the instructions, or the memory 210 may be separate but accessible to the system 115 and the processing resource. In other examples, the one or more processors 205 may be implemented by electronic circuitry.
[0039] FIG. 3 illustrates a block diagram of the environment 100 including the system 115 for managing servers configured in the cloud network 110, according to one or more embodiments of the present invention. FIG. 4 illustrates a flow diagram of a method of procuring metrics for a server, according to one or more embodiments of the present invention.
[0040] In order for the system 115 to manage servers configured in the cloud network 110, the processor 205 includes an infrastructure manager 240, metric ingestion unit 245, infrastructure enrichment unit 250, infrastructure normalizer unit 255, Machine Learning/Artificial Intelligence (ML/AI) unit 260, anomaly detection engine 265, reporting and alarm engine 270, and a forecasting engine 275 communicably coupled to each other.
[0041] Each of the hosts 105 includes a plurality of Agent Managers (AM), as shown in FIG. 3 and FIG. 4. An AM is the first component that interacts with network functions on a southbound interface. Each of the hosts 105 integrates with an AM container over a Transmission Control Protocol (TCP) interface and collects all counter metric data. All processes are defined at the AM so that it can process and ingest all metrics data, i.e., details of containers, from the containers running on a Virtual Machine (VM).
[0042] Details of the containers may be expressed using different attributes, as described henceforth.
a) Container image details expressed using:
• Base image specifying a base operating system or image from which the container is derived.
• Dependencies listing all libraries, frameworks, and software packages required by the application within the container.
• Dockerfile or equivalent instructions detailing how to build the image, including commands for copying files, setting environment variables, and running setup scripts.
b) Container runtime details expressed using:
• Runtime environment providing information about the environment in which the container will execute (e.g., operating system version, kernel version).
• Resource limits such as limits on CPU, memory, and disk usage allocated to the container.
• Networking configuration details such as exposed ports and network interfaces used by the container.
c) Container configuration expressed using:
• Environment Variables that are passed to the container to configure application settings or behavior.
• Volumes indicative of mount points where data can be shared between the container and the host system or other containers.
• Execution command or script to execute when the container starts (e.g., launching the application).
d) Metadata and Labels expressed using:
• Labels that are custom key-value pairs used for organizing and querying containers (e.g., for versioning, environment type).
• Annotations providing additional metadata describing the container’s purpose, dependencies, or other relevant information.
e) Security configuration expressed using:
• User and Permissions providing user ID and group settings used by processes within the container to enforce least privilege.
f) Versioning and history expressed using:
• Image tags i.e. identifiers denoting different versions or variants of the container image.
• Image history including layers and changes introduced in each version of the container image, useful for auditing and troubleshooting.
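For illustration only, the attribute groups (a) to (f) above could be gathered into a single record; the field names in this Python sketch are hypothetical and not mandated by the disclosure:

```python
from dataclasses import dataclass, field

@dataclass
class ContainerDetails:
    # a) Container image details
    base_image: str
    dependencies: list
    dockerfile: str = ""
    # b) Container runtime details
    runtime_environment: dict = field(default_factory=dict)
    resource_limits: dict = field(default_factory=dict)
    networking: dict = field(default_factory=dict)
    # c) Container configuration
    environment_variables: dict = field(default_factory=dict)
    volumes: list = field(default_factory=list)
    command: str = ""
    # d) Metadata and labels
    labels: dict = field(default_factory=dict)
    annotations: dict = field(default_factory=dict)
    # e) Security configuration
    user: str = "nobody"
    # f) Versioning and history
    image_tag: str = "latest"

# Example record for a container derived from an Ubuntu base image.
details = ContainerDetails(base_image="ubuntu:22.04",
                           dependencies=["libssl", "python3"],
                           labels={"env": "production"})
```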
[0043] In one implementation, the metrics data can be ingested from the containers or the VM. Once received from the containers or the VM, the metrics are evaluated and the data is sent to a Broker. The AM also provides support for a set of Application Programming Interfaces (APIs) through which each of the hosts 105 can be easily provisioned.
[0044] The one or more processors 205 includes the infrastructure manager 240 configured to interact with a Graphical User Interface (GUI)/dashboard on a southbound interface and with agent managers on a northbound interface via communication protocols such as, but not limited to, Hypertext Transfer Protocol (HTTP). The infrastructure manager 240 allocates host IPs to the agent managers, and the allocation is configurable. A host can be added to or removed from the infrastructure manager 240.
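A minimal sketch of the configurable host-IP allocation described above, assuming a simple round-robin policy (the policy and names are illustrative, not prescribed by the disclosure):

```python
from itertools import cycle

def allocate_hosts(host_ips, agent_managers):
    """Allocate host IPs to agent managers in round-robin fashion;
    hosts can later be added to or removed from the allocation."""
    allocation = {am: [] for am in agent_managers}
    for am, ip in zip(cycle(agent_managers), host_ips):
        allocation[am].append(ip)
    return allocation
```

With three hosts and two agent managers, the first agent manager would receive the first and third host IPs and the second agent manager the second.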
[0045] The one or more processors 205 further includes the metric ingestion unit 245 configured to consume data/metrics from broker topics, validate the data/metrics, and create a Comma Separated Values (CSV) file including the data/metrics. The CSV file is pulled by the infrastructure enrichment unit 250 for successive processing.
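The validate-then-CSV step of the metric ingestion unit 245 could look like the following sketch; the required field names are assumed for illustration and are implementation-specific:

```python
import csv
import io

# Assumed metric schema; the actual fields depend on the deployment.
REQUIRED_FIELDS = ("host", "metric", "value", "timestamp")

def ingest(records):
    """Validate records consumed from broker topics and render the
    valid ones as CSV text for the enrichment stage to pull."""
    valid = [r for r in records
             if all(f in r and r[f] not in (None, "") for f in REQUIRED_FIELDS)]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=REQUIRED_FIELDS)
    writer.writeheader()
    writer.writerows({f: r[f] for f in REQUIRED_FIELDS} for r in valid)
    return buf.getvalue()
```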
[0046] The one or more processors 205 further includes the infrastructure enrichment unit 250 configured to fetch/pull the data/metrics created by the metric ingestion unit 245 and enrich the data/metrics for improving accuracy and reliability.
[0047] The one or more processors 205 further includes the infrastructure normalizer unit 255 configured to intelligently process incoming data, filter the data to reduce its size, and store the reduced data. The data may be stored in the distributed data lake 120 for a predefined time period, such as a year.
[0048] The one or more processors 205 further includes the ML/AI unit 260 for executing one or more ML algorithms on the metric data for training a data model. The data model may be trained to tune initial model parameters. The training may be performed iteratively several times. After training, the data model may be validated using a validation dataset containing unseen data, to evaluate model performance and tune hyperparameters. Successively, the data model may be tested using a testing dataset, for testing the real-time performance of the data model. For training, validation, and testing, the metric data may be split in a ratio of 80/10/10 or 60/20/20.
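The 80/10/10 (or 60/20/20) split mentioned above amounts to slicing the metric data into three contiguous portions, as in this illustrative sketch:

```python
def split_dataset(samples, ratios=(0.8, 0.1, 0.1)):
    """Split the metric data into training, validation, and testing
    sets according to the given ratios (e.g., 80/10/10 or 60/20/20)."""
    n = len(samples)
    train_end = round(n * ratios[0])
    val_end = train_end + round(n * ratios[1])
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]
```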
[0049] Thereafter, the ML/AI unit 260 may execute the data model to find any metric anomalies and to trigger forecasting for metrics as soon as any anomalies are detected. The ML/AI unit 260 sends the metric anomalies to the forecasting engine 275 and the reporting and alarm engine 270 to take a suitable pre-emptive action.
[0050] The one or more processors 205 further includes the anomaly detection engine 265 for receiving an anomaly request from the ML/AI unit 260 to generate and take action in a closed loop. The anomaly detection engine 265 is capable of performing network expansion in closed-loop automation.
[0051] The one or more processors 205 further includes the forecasting engine 275 for obtaining a request from the ML/AI unit 260 to take a pre-emptive action based on a threshold value. The forecasting engine 275 is capable of performing network expansion by adding new resources (servers, containers, or services) to the network based on data trends derived from the AI/ML algorithms.
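The threshold-based pre-emptive decision of the forecasting engine 275 can be sketched as follows; the action labels are hypothetical placeholders:

```python
def preemptive_action(forecast_values, threshold):
    """Request network expansion (e.g., adding a server, container, or
    service) when any forecast value is expected to breach the
    configured threshold; otherwise take no action."""
    if any(value > threshold for value in forecast_values):
        return "expand-network"
    return "no-action"
```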
[0052] The one or more processors 205 further includes the reporting and alarm engine 270 for receiving a request from the ML/AI unit 260 to generate an alarm based on a threshold value. The reporting and alarm engine 270 has the capability of performing network expansion in closed-loop automation.
[0053] FIG. 5 illustrates a flow diagram of a method 500 of managing servers configured in a cloud network, according to one or more embodiments of the present invention. For the purpose of description, the method 500 is described with reference to the embodiments as illustrated in FIG. 1 through FIG. 4 and should nowhere be construed as limiting the scope of the present disclosure.
[0054] At step 505, the method 500 includes the step of fetching, by the infrastructure manager 240 of the one or more processors 205, information related to one or more servers added to the cloud network 110. The infrastructure manager 240 interacts with a Graphical User Interface (GUI)/dashboard on a southbound interface and with the agent managers on a northbound interface via a Hypertext Transfer Protocol (HTTP) interface. The infrastructure manager 240 allocates host IPs to the agent managers, and the host IPs are configurable. The infrastructure manager 240 may add or remove a host.
[0055] At step 510, the method 500 includes the step of fetching, by the infrastructure manager 240 of the one or more processors 205, details of containers running on the one or more servers.
[0056] At step 515, the method 500 includes the step of distributing, by the infrastructure manager 240 of the one or more processors 205, the details of the containers to a plurality of agent managers. The details of the containers are distributed equally between the plurality of agent managers or distributed between the plurality of agent managers based on prior load at each agent manager.
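The load-aware alternative in step 515 could be realized as a least-loaded assignment, sketched below with hypothetical names; the equal-distribution alternative reduces to a simple round-robin over the agent managers:

```python
def distribute_by_load(container_details, prior_loads):
    """Assign each container's details to the agent manager with the
    least prior load, updating the load after each assignment."""
    loads = dict(prior_loads)
    assignment = {am: [] for am in loads}
    for detail in container_details:
        target = min(loads, key=loads.get)  # least-loaded agent manager
        assignment[target].append(detail)
        loads[target] += 1
    return assignment
```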
[0057] At step 520, the method 500 includes the step of fetching, by the plurality of agent managers, metrics of the one or more servers.
[0058] A person of ordinary skill in the art will readily ascertain that the illustrated embodiments and steps in description and drawings (FIGS.1-5) are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0059] The above described techniques of the present disclosure provide multiple advantages, including an agentless, unique architectural design that efficiently fetches container and host metrics of servers added to a cloud network. The disclosed process can be adopted as a standard technology. The disclosed method can enable collection of metrics on a mobile phone.
[0060] The present disclosure incorporates the technical advancement of automated discovery mechanisms to identify and onboard new servers. The mechanisms include network scanning, agent-based monitoring, or integration with infrastructure management tools. Once a new server is detected, the system automatically captures relevant metrics and performance data from the server. By automating the process of capturing metrics for new servers, the system ensures that the monitoring and observability capabilities are extended to the entire infrastructure without any manual configuration or setup. This allows real-time visibility and monitoring of the entire environment, enabling efficient management and a proactive response to any issues or performance anomalies using advanced AI/ML algorithms.
[0061] The present invention offers multiple advantages over the prior art, and the above listed are a few examples to emphasize some of the advantageous features. The listed advantages are to be read in a non-limiting manner.
[0062] The present invention further discloses a non-transitory computer-readable medium having stored thereon computer-readable instructions. The computer-readable instructions are executed by a processor. The processor is configured to fetch information related to one or more servers added to a cloud network. The processor is further configured to fetch details of containers running on the one or more servers. The processor is configured to distribute the details of the containers to a plurality of agent managers. The processor is configured to fetch metrics of the one or more servers.
REFERENCE NUMERALS
[0063] Environment – 100;
[0064] Hosts – 105;
[0065] Cloud network – 110;
[0066] System – 115;
[0067] Distributed data lake – 120;
[0068] One or more processors – 205;
[0069] Memory – 210;
[0070] Input/output interface unit – 215;
[0071] Database – 220;
[0072] Distributed cache – 225;
[0073] Infrastructure manager – 240;
[0074] Metric ingestion unit – 245;
[0075] Infrastructure enrichment unit – 250;
[0076] Infrastructure normalizer unit – 255;
[0077] Machine Learning/Artificial Intelligence unit – 260;
[0078] Anomaly detection engine – 265;
[0079] Report and alarm engine – 270;
[0080] Forecasting engine – 275.
CLAIMS
We Claim:
1. A method of managing servers configured in a cloud network (110), the method comprising the steps of:
fetching, by one or more processors (205), information related to one or more servers added to the cloud network (110);
fetching, by the one or more processors (205), details of containers running on the one or more servers;
distributing, by the one or more processors (205), the details of the containers to a plurality of agent managers; and
fetching, by the plurality of agent managers, metrics of the one or more servers.
2. The method as claimed in claim 1, comprising validating the metrics from broker topics and performing data enrichment.
3. The method as claimed in claim 1, comprising processing, filtering, and storing the metrics into a distributed data lake (120).
4. The method as claimed in claim 1, comprising utilizing a Machine Learning/Artificial Intelligence (ML/AI) unit (260) for detecting metric anomalies and triggering forecasting for metrics upon detecting the metric anomalies.
5. The method as claimed in claim 4, comprising taking a pre-emptive action for the metric anomalies, using a threshold value.
6. The method as claimed in claim 4, comprising generating an alarm for the metric anomalies based on an instruction received from the ML/AI unit (260).
7. The method as claimed in claim 1, wherein the details of the containers are distributed equally between the plurality of agent managers or distributed between the plurality of agent managers based on prior load at each agent manager.
8. The method as claimed in claim 1, comprising monitoring modification of the one or more servers.
9. The method as claimed in claim 1, comprising identifying addition of a new server and collecting metrics of the new server.
10. A system for managing servers configured in a cloud network (110), the system comprising:
an infrastructure manager (240) configured to:
fetch information related to one or more servers added to the cloud network (110);
fetch details of containers running on the one or more servers;
distribute the details of the containers to a plurality of agent managers; and
the plurality of agent managers configured to fetch metrics of the one or more servers.
11. The system as claimed in claim 10, wherein the metrics are collected from a Virtual Machine (VM).
12. The system as claimed in claim 10, wherein a metric ingestion unit (245) is configured to validate metrics from the broker topics and perform data enrichment.
13. The system as claimed in claim 10, comprises an infrastructure normalizer unit (255) configured to process, filter, and store the metrics into a distributed data lake (120).
14. The system as claimed in claim 13, comprises a Machine Learning/Artificial Intelligence (ML/AI) unit (260) configured to detect metric anomalies and trigger forecasting for metrics upon detecting the metric anomalies.
15. The system as claimed in claim 14, comprises a forecasting engine (275) configured to obtain a request from the ML/AI unit (260) for taking a pre-emptive action for the metric anomalies, using a threshold value.
16. The system as claimed in claim 14, comprises a reporting and alarm engine (270) configured to generate an alarm for the metric anomalies based on an instruction received from the ML/AI unit (260).
17. The system as claimed in claim 15, wherein the forecasting engine (275) has a capability of performing network expansion based on data trends from the ML/AI unit (260).
18. The system as claimed in claim 14, comprises an anomaly detection engine (265) configured to obtain an anomaly request from the ML/AI unit (260) to take an action in a closed loop.
19. The system as claimed in claim 10, wherein the details of the containers are distributed equally between the plurality of agent managers or distributed between the plurality of agent managers based on prior load at each agent manager.
20. The system as claimed in claim 10, wherein the infrastructure manager (240) is configured to monitor modification of the one or more servers.
21. The system as claimed in claim 10, wherein the infrastructure manager (240) is configured to identify addition of a new server and collect metrics of the new server.
| # | Name | Date |
|---|---|---|
| 1 | 202321047839-STATEMENT OF UNDERTAKING (FORM 3) [15-07-2023(online)].pdf | 2023-07-15 |
| 2 | 202321047839-PROVISIONAL SPECIFICATION [15-07-2023(online)].pdf | 2023-07-15 |
| 3 | 202321047839-FORM 1 [15-07-2023(online)].pdf | 2023-07-15 |
| 4 | 202321047839-FIGURE OF ABSTRACT [15-07-2023(online)].pdf | 2023-07-15 |
| 5 | 202321047839-DRAWINGS [15-07-2023(online)].pdf | 2023-07-15 |
| 6 | 202321047839-DECLARATION OF INVENTORSHIP (FORM 5) [15-07-2023(online)].pdf | 2023-07-15 |
| 7 | 202321047839-FORM-26 [03-10-2023(online)].pdf | 2023-10-03 |
| 8 | 202321047839-Proof of Right [08-01-2024(online)].pdf | 2024-01-08 |
| 9 | 202321047839-DRAWING [13-07-2024(online)].pdf | 2024-07-13 |
| 10 | 202321047839-COMPLETE SPECIFICATION [13-07-2024(online)].pdf | 2024-07-13 |
| 11 | Abstract-1.jpg | 2024-08-28 |
| 12 | 202321047839-Power of Attorney [25-10-2024(online)].pdf | 2024-10-25 |
| 13 | 202321047839-Form 1 (Submitted on date of filing) [25-10-2024(online)].pdf | 2024-10-25 |
| 14 | 202321047839-Covering Letter [25-10-2024(online)].pdf | 2024-10-25 |
| 15 | 202321047839-CERTIFIED COPIES TRANSMISSION TO IB [25-10-2024(online)].pdf | 2024-10-25 |
| 16 | 202321047839-FORM 3 [02-12-2024(online)].pdf | 2024-12-02 |
| 17 | 202321047839-FORM 18 [20-03-2025(online)].pdf | 2025-03-20 |