
Method And System For Creating Server Pools

Abstract: The present disclosure relates to a system (120) and a method (500) for creating a server pool. The system (120) includes an identification unit (225) which is configured to identify requests which require numerous resources to execute one or more tasks. The system (120) further includes a determining unit (230) configured to determine the type of resources required to execute the tasks. The system (120) further includes an assigning unit (235) which assigns the one or more resources running on one or more servers from a server pool based on the determination. The assigned resources are utilized to execute the given tasks. The system (120) further includes a training unit (240) which trains a selected Machine Learning model with data generated during execution of the one or more tasks. The system (120) further includes an adjusting unit (245) which dynamically allocates the number of servers within the server pool depending on one or more parameters using the trained model. Ref. Fig. 2


Patent Information

Application #
Filing Date
06 October 2023
Publication Number
15/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

JIO PLATFORMS LIMITED
OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA

Inventors

1. Aayush Bhatnagar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
2. Ankit Murarka
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
3. Jugal Kishore
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
4. Chandra Ganveer
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
5. Sanjana Chaudhary
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
6. Gourav Gurbani
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
7. Yogesh Kumar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
8. Avinash Kushwaha
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
9. Dharmendra Kumar Vishwakarma
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
10. Sajal Soni
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
11. Niharika Patnam
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
12. Shubham Ingle
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
13. Harsh Poddar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
14. Sanket Kumthekar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
15. Mohit Bhanwria
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
16. Shashank Bhushan
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
17. Vinay Gayki
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
18. Aniket Khade
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
19. Durgesh Kumar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
20. Zenith Kumar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
21. Gaurav Kumar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
22. Manasvi Rajani
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
23. Kishan Sahu
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
24. Sunil meena
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
25. Supriya Kaushik De
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
26. Kumar Debashish
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
27. Mehul Tilala
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
28. Satish Narayan
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
29. Rahul Kumar
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
30. Harshita Garg
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
31. Kunal Telgote
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
32. Ralph Lobo
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India
33. Girish Dange
Reliance Corporate Park, Thane - Belapur Road, Ghansoli, Navi Mumbai, Maharashtra 400701, India

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003

COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION
METHOD AND SYSTEM FOR CREATING SERVER POOLS
2. APPLICANT(S)
Name Nationality Address
JIO PLATFORMS LIMITED INDIAN OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA
3. PREAMBLE TO THE DESCRIPTION

THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.

FIELD OF THE INVENTION
[0001] The present invention relates to the field of network management and, more specifically, to a system and a method for resource allocation based on predictive models integrated into the network.
BACKGROUND OF THE INVENTION
[0002] With an increase in the number of users, network service providers have been implementing upgrades to enhance service quality so as to keep pace with such high demand. With the advancement of technology, there is a demand for telecommunication services to induct up-to-date features into the scope of provision. To enhance user experience and implement advanced monitoring mechanisms, prediction methodologies are being incorporated into network management. An advanced prediction system integrated with an AI/ML system excels in executing a wide array of algorithms and predictive tasks.
[0003] Edge-level inference hosting, also known as on-device inference hosting or edge deployment of machine learning models, refers to the practice of deploying and running machine learning models directly on edge devices or at the edge of a network.
[0004] The traditional system with integrated AI/ML technology performs predictions using a centralized server bundle, the results of which are then transferred to the edge devices of the network, such as network nodes and network performance management entities. The resources allocated, such as the number of servers assigned to train the AI/ML model, are fixed, which may cause under-utilization or over-utilization of resources.
[0005] In contemporary practice, Machine Learning (ML) at the edge requires an execution server group as the hardware resource for training models at the edge. Fixing this set of servers leads to resource underutilization: fixed server configurations may be underutilized during periods of low demand, resulting in wasted computational power and increased operational costs. Fixed server setups might also not be optimized for low-latency processing. In scenarios where low latency is critical, such as real-time object detection in autonomous vehicles or industrial automation, not having enough resources available immediately can result in delays. Further, fixed server configurations may lack built-in fault tolerance mechanisms. If a server fails, it can take time to replace it manually, resulting in downtime and service disruptions.
[0006] There is, therefore, a need for a system and method to create execution server groups corresponding to a specific use case, optimally and on demand, without consuming excessive time or bandwidth.
SUMMARY OF THE INVENTION
[0007] One or more embodiments of the present disclosure provide a method and a system for creating server pools.
[0008] In one aspect of the present invention, the system for creating the server pools is disclosed. The system includes an identification unit configured to identify one or more requests which require one or more resources, wherein the one or more requests pertain to execution of one or more tasks. The system further includes a determining unit configured to determine a type of one or more resources required to execute the one or more tasks. The system further includes an assigning unit configured to assign, from a server pool, the one or more resources running on one or more servers to execute the one or more tasks based on determining the type of the one or more resources. The system further includes a training unit configured to train a model with historic data. The system further includes an adjusting unit configured to dynamically adjust, using the trained model, the number of servers within the server pool depending on one or more learnt parameters to manage the one or more requests.
[0009] In an embodiment, the one or more resources include at least one of, Central Processing Unit (CPU), Random Access Memory (RAM), network interfaces, power supply, Operating System (OS) and cooling system.
[0010] In an embodiment, the training unit trains the model with historic data by enabling the model to learn trends/patterns from the historic data pertaining to the execution of the one or more tasks.
[0011] In an embodiment, the one or more parameters include at least one of, number of requests, type of requests, working conditions of one or more servers in the server pool, status of execution of the one or more tasks.
In an embodiment, when the number of requests exceeds a predefined threshold, the adjusting unit, using the trained model, dynamically increases the number of servers in the server pool.
[0012] In an embodiment, when the one or more servers are at least one of not in a working condition or having issues, the adjusting unit, using the trained model, dynamically increases the number of servers in the server pool.

[0013] In an embodiment, when status of execution of the one or more tasks indicates that the one or more tasks are completed, the adjusting unit, using the trained model, dynamically decreases number of servers in the server pool.
[0014] In an embodiment, upon completion of the one or more tasks, the adjusting unit releases the one or more resources.
[0015] In an embodiment, the one or more servers in the server pool have one or more capabilities including at least one of, Central Processing Unit (CPU) intensive servers, Graphical Processing Unit (GPU) accelerated servers or specialized hardware for machine learning tasks.
[0016] In an embodiment, the issues of the one or more servers include at least one of, hardware failures, network failures, software bugs, security vulnerabilities, resource exhaustion and load balancing problems.
[0017] In another aspect of the present invention, the method for creating the server pools is disclosed. The method includes the step of identifying, by the one or more processors, one or more requests which require one or more resources, wherein the one or more requests pertain to execution of one or more tasks. The method further includes the step of determining, by the one or more processors, a type of one or more resources required to execute the one or more tasks. The method further includes the step of assigning, by the one or more processors, from a server pool, the one or more resources running on one or more servers to execute the one or more tasks based on determining the type of the one or more resources. The method further includes training, by the one or more processors, a model with historic data. The method further includes dynamically adjusting, by the one or more processors, using the trained model, the number of servers within the server pool depending on one or more learnt parameters to manage the one or more requests. Upon completion of the one or more tasks, the method further includes releasing, by the adjusting unit, the one or more resources.
[0018] In another aspect of the invention, a non-transitory computer-readable medium having stored thereon computer-readable instructions is disclosed. The computer-readable instructions are executed by a processor. The processor is configured to identify one or more requests which require one or more resources to execute one or more tasks. The processor is further configured to determine a type of one or more resources required to execute the one or more tasks. The processor is further configured to assign, from a server pool, the one or more resources running on one or more servers to execute the one or more tasks based on determining the type of the one or more resources. The processor is further configured to train a selected model with data generated during execution of the one or more tasks by the one or more resources. The processor is further configured to dynamically adjust, using the trained model, the number of servers within the server pool depending on one or more parameters.
[0019] Other features and aspects of this invention will be apparent from the following description and the accompanying drawings. The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art, in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0021] FIG. 1 is an exemplary block diagram of an environment for creating server pools, according to one or more embodiments of the present invention;
[0022] FIG. 2 is an exemplary block diagram of a system for creating the server pools, according to one or more embodiments of the present invention;
[0023] FIG. 3 is an exemplary block diagram of an architecture implemented in the system of the FIG. 2, according to one or more embodiments of the present invention;
[0024] FIG. 4 is a flow diagram for creating the server pools, according to one or more embodiments of the present invention; and
[0025] FIG. 5 is a schematic representation of a method for creating the server pools, according to one or more embodiments of the present invention.
[0026] The foregoing shall be more apparent from the following detailed description of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0027] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. It must also be noted that as used herein and in the appended claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise.
[0028] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure including the definitions listed here below are not intended to be limited to the embodiments illustrated but is to be accorded the widest scope consistent with the principles and features described herein.
[0029] A person of ordinary skill in the art will readily ascertain that the illustrated steps detailed in the figures and here below are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0030] FIG. 1 illustrates an exemplary block diagram of an environment 100 for creating server pools 345 (as referred in FIG. 3), according to one or more embodiments of the present disclosure. In this regard, the environment 100 includes a User Equipment (UE) 110, a server 115, a network 105 and a system 120 communicably coupled to each other for creating the server pools 345. The server pools 345 refer to a collection of servers providing various computational resources. The resources include, but are not limited to, Central Processing Unit (CPU), Random Access Memory (RAM), network interfaces, and other hardware or software components. The servers, located in distinct locations, are dynamically assigned and adjusted based on workload demand. The present invention accesses individual servers from time to time, thereby creating a server pool 345.
[0031] As per the illustrated embodiment and for the purpose of description and illustration, the UE 110 includes, but not limited to, a first UE 110a, a second UE 110b, and a third UE 110c, and should nowhere be construed as limiting the scope of the present disclosure. In alternate embodiments, the UE 110 may include a plurality of UEs as per the requirement. For ease of reference, each of the first UE 110a, the second UE 110b, and the third UE 110c, will hereinafter be collectively and individually referred to as the “User Equipment (UE)” 110.
[0032] In an embodiment, the UE 110 is one of, but not limited to, any electrical, electronic or electro-mechanical equipment, or a combination of one or more of such devices, such as a smartphone, virtual reality (VR) devices, augmented reality (AR) devices, a laptop, a general-purpose computer, a desktop, a personal digital assistant, a tablet computer, a mainframe computer, or any other computing device.
[0033] The environment 100 includes the server 115 accessible via the network 105. The server 115 may include, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof. In an embodiment, the entity operating the server 115 may include, but is not limited to, a vendor, a network operator, a company, an organization, a university, a lab facility, a business enterprise side, a defense facility side, or any other facility that provides service.
[0034] The network 105 includes, by way of example but not limitation, one or more of a wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof. The network 105 may include, but is not limited to, a Third Generation (3G), a Fourth Generation (4G), a Fifth Generation (5G), a Sixth Generation (6G), a New Radio (NR), a Narrow Band Internet of Things (NB-IoT), an Open Radio Access Network (O-RAN), and the like.
[0035] The network 105 may also include, by way of example but not limitation, at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, etc. one or more messages, packets, signals, waves, voltage or current levels, some combination thereof, or so forth. The network 105 may further include a Voice over Internet Protocol (VoIP) network.
[0036] The environment 100 further includes the system 120 communicably coupled to the server 115 and the UE 110 via the network 105. The system 120 is configured for creating the server pools. As per one or more embodiments, the system 120 is adapted to be embedded within the server 115 or embedded as an individual entity.
[0037] Operational and construction features of the system 120 will be explained in detail with respect to the following figures.
[0038] FIG. 2 is an exemplary block diagram of the system 120 for creating the server pools 345, according to one or more embodiments of the present invention.
[0039] As per the illustrated embodiment, the system 120 includes one or more processors 205, a memory 210, a user interface 215, and a database 220. For the purpose of description and explanation, the description will be explained with respect to one processor 205 and should nowhere be construed as limiting the scope of the present disclosure. In alternate embodiments, the system 120 may include more than one processor 205 as per the requirement of the network 105. The one or more processors 205, hereinafter referred to as the processor 205 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, single board computers, and/or any devices that manipulate signals based on operational instructions.
[0040] As per the illustrated embodiment, the processor 205 is configured to fetch and execute computer-readable instructions stored in the memory 210. The memory 210 may be configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory 210 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as disk memory, EPROMs, FLASH memory, unalterable memory, and the like.
[0041] In an embodiment, the user interface 215 includes a variety of interfaces, for example, interfaces for a graphical user interface, a web user interface, a Command Line Interface (CLI), and the like. The user interface 215 facilitates communication of the system 120. In one embodiment, the user interface 215 provides a communication pathway for one or more components of the system 120. Examples of such components include, but are not limited to, the UE 110 and the database 220.
[0042] The database 220 is one of, but not limited to, a centralized database, a cloud-based database, a commercial database, an open-source database, a distributed database, an end-user database, a graphical database, a No-Structured Query Language (NoSQL) database, an object-oriented database, a personal database, an in-memory database, a document-based database, a time series database, a wide column database, a key value database, a search database, a cache database, and so forth. The foregoing examples of database 220 types are non-limiting and may not be mutually exclusive (e.g., a database can be both commercial and cloud-based, or both relational and open-source).
[0043] In order for the system 120 to create the server pools 345, the processor 205 includes one or more modules. In one embodiment, the one or more modules include, but are not limited to, an identification unit 225, a determining unit 230, an assigning unit 235, a training unit 240, and an adjusting unit 245 communicably coupled to each other for creating server pools 345.
[0044] In one embodiment, each of the identification unit 225, the determining unit 230, the assigning unit 235, the training unit 240 and the adjusting unit 245 can be used in combination or interchangeably for creating server pools 345.
[0045] The identification unit 225, the determining unit 230, the assigning unit 235, the training unit 240, and the adjusting unit 245 in an embodiment, may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processor 205. In the examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processor 205 may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processor may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the memory 210 may store instructions that, when executed by the processing resource, implement the processor 205. In such examples, the system 120 may comprise the memory 210 storing the instructions and the processing resource to execute the instructions, or the memory 210 may be separate but accessible to the system 120 and the processing resource. In other examples, the processor 205 may be implemented by electronic circuitry.
[0046] In one embodiment, the identification unit 225 is configured to identify one or more requests which require one or more resources to execute one or more tasks. The one or more requests pertain to the execution of the one or more tasks. A request represents specific operations to be performed. For example, a request for a database query includes tasks corresponding to at least one of retrieving data and processing data. The one or more tasks refer to computational or data processing activities. The one or more tasks are to be executed by the one or more allocated servers from the server pools 345 and the one or more allocated resources running on the corresponding servers. In an embodiment, the one or more tasks include, but are not limited to, operations, processes, or requests such as model training, handling network traffic, and file storage and retrieval.
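For illustration only, the following Python sketch shows one way such requests and their associated tasks might be represented and identified; the class fields and request kinds are hypothetical assumptions and do not form part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    # A computational or data-processing activity, e.g. model training,
    # handling network traffic, or file storage and retrieval.
    name: str
    kind: str  # hypothetical labels: "model_training", "ml_inference", ...

@dataclass
class Request:
    # A request represents specific operations to be performed and
    # carries the one or more tasks it initiates.
    request_id: str
    tasks: List[Task] = field(default_factory=list)

def identify_resource_requests(requests: List[Request]) -> List[Request]:
    """Keep only the requests whose tasks require computational resources."""
    resource_kinds = {"model_training", "ml_inference", "data_processing"}
    return [r for r in requests if any(t.kind in resource_kinds for t in r.tasks)]
```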
[0047] In an embodiment, the models include, but are not limited to, machine learning or statistical models trained on a dataset to analyze and recognize insights from the dataset. The model is enabled to predict outputs from new or unseen data whenever applied to the new data, based on the insights gained from the training process. The unseen data refers to the real-time data delivered to the system 120 for the model to predict outputs based on the training insights the model acquired during the training. The models include, but are not limited to, Regression Models, Decision Trees, Random Forest, Support Vector Machines (SVM), Neural Networks, K-means Clustering, Hierarchical Clustering, Principal Component Analysis, Q-Learning, Deep Q-Networks, and Generative Adversarial Networks (GANs). The insights acquired by the model from the training include, but are not limited to, patterns of resource allocation and values of performance parameters for proper execution of the one or more tasks.
[0048] In an embodiment, the one or more resources include at least one of, Central Processing Unit (CPU), Random Access Memory (RAM), network interfaces, power supply, Operating System (OS) and cooling system. In an embodiment, the one or more servers in the server pool have various capabilities including at least one of, Central Processing Unit (CPU) intensive servers, Graphical Processing Unit (GPU) accelerated servers or specialized hardware for machine learning tasks.
[0049] Upon identifying, the determining unit 230 determines a type of one or more resources required to execute the one or more tasks. The type of resources refers to the various components required to execute the one or more tasks on the servers. The components include, but are not limited to, Central Processing Unit (CPU), Random Access Memory (RAM), network interfaces, power supply, Operating System (OS) and cooling system. The type of resources further includes components configured with different memory sizes. The determining unit 230 determines which computational resources are required for the execution of the one or more incoming requests. The incoming requests include, but are not limited to, ML inference requests or data processing tasks. The ML inference requests are requests to make predictions or decisions from real-time data based on the insights gained by the models through training on a given dataset. The prediction, in an embodiment, includes at least the prediction of resource requirements so as to allocate resources that meet the requirements of the execution of the one or more tasks. The data processing requests include processing, analyzing and manipulating data utilizing one or more resources running on the servers. The tasks associated with data processing requests include, but are not limited to, processing incoming patterns, handling machine learning workloads, and analyzing trends and patterns in the data generated during the execution of tasks. For example, task A requires a CPU-intensive server with 16 CPU cores and 32 GB of RAM, while task B requires a GPU server with 8 GB of GPU memory and 64 GB of RAM. The determining unit 230 determines the type of resources required to perform the tasks, and this determination enables the system to assign the resources for the one or more tasks; the assigning unit then accesses and assigns the servers and the resources running on them to perform the given request, as illustrated in the sketch below.
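The resource-type determination for task A and task B above can be sketched as a simple lookup, assuming (hypothetically) that the determining unit 230 maps task names to resource specifications:

```python
from dataclasses import dataclass

@dataclass
class ResourceSpec:
    cpu_cores: int = 0
    ram_gb: int = 0
    gpu_mem_gb: int = 0  # zero means no GPU is required

# Hypothetical lookup; a real determining unit could instead derive the
# specification from the request type and the trained model's insights.
REQUIREMENTS = {
    "task_a": ResourceSpec(cpu_cores=16, ram_gb=32),               # CPU-intensive
    "task_b": ResourceSpec(cpu_cores=4, ram_gb=64, gpu_mem_gb=8),  # GPU server
}

def determine_resources(task_name: str) -> ResourceSpec:
    """Determine the type of resources required to execute a task."""
    return REQUIREMENTS[task_name]
```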
[0050] Upon determining the type of resources required to perform the tasks, the assigning unit 235 assigns, from the server pool 345, the one or more resources running on one or more servers to execute the one or more tasks, based on the determination of the type of the one or more resources. The resources required to perform each of the tasks are different.
[0051] In an embodiment, the server pool 345 comprises a collection of servers including, but not limited to, servers 345a, 345b and 345c. A server is a physical or virtual machine providing computing power and services in the network 105. Each server in the server pool 345 acts as a platform or infrastructure running the resources required to execute the one or more tasks. The one or more servers in the server pool have one or more capabilities including at least one of, CPU intensive servers, GPU accelerated servers or specialized hardware for machine learning tasks. The resources refer to individual components or services utilized to execute the one or more tasks. The resources running on each of the servers in the server pool 345 include, but are not limited to, hardware and software components. The hardware components include at least one of Central Processing Unit (CPU), Graphical Processing Unit (GPU), Random Access Memory (RAM), network interfaces and cooling systems. The software components include at least one of resource managers, task scheduling and load balancing. Upon understanding which resources are required to perform the given tasks, the one or more resources running on one or more servers from the server pool 345 are assigned to perform the one or more tasks. The assigned resources are utilized to execute the one or more tasks.
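A minimal sketch of the assigning step, reusing the hypothetical ResourceSpec above, might walk the pool and reserve the first server whose free capacity satisfies the determined specification; the dictionary keys are illustrative assumptions:

```python
def assign_server(pool, spec):
    """Assign, from the server pool, one server whose free resources satisfy
    the determined specification; return None if no server qualifies."""
    for server in pool:
        if (server["free_cpu"] >= spec.cpu_cores
                and server["free_ram_gb"] >= spec.ram_gb
                and server["free_gpu_mem_gb"] >= spec.gpu_mem_gb):
            # Reserve the resources on the chosen server.
            server["free_cpu"] -= spec.cpu_cores
            server["free_ram_gb"] -= spec.ram_gb
            server["free_gpu_mem_gb"] -= spec.gpu_mem_gb
            return server["server_id"]
    return None

pool = [
    {"server_id": "345a", "free_cpu": 8,  "free_ram_gb": 16, "free_gpu_mem_gb": 0},
    {"server_id": "345b", "free_cpu": 32, "free_ram_gb": 64, "free_gpu_mem_gb": 0},
]
print(assign_server(pool, determine_resources("task_a")))  # -> "345b"
```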
[0052] Thereafter, a model is trained by the training unit 240. The training unit 240 trains the model by enabling the model to learn trends or patterns from the historic data pertaining to the execution of the one or more tasks. The historic data refers to at least one of the workload characteristics related to previous executions of one or more tasks using the one or more resources running on one or more servers, wherein the one or more servers belong to a server pool. The workload characteristics include at least one of CPU and GPU requirements, requests requiring computational resources, resource utilization, working condition of the servers, latency restraints and throughput demands. The purpose of the model training in the present embodiment is to analyze new data to identify patterns in the type of resources required to execute the one or more tasks, trends in the execution of the one or more tasks and trends in the working condition of the servers. The training enables the model to predict at least one of the workload characteristics and the working conditions of the servers to manage the requests. The prediction enables the system 120 to dynamically allocate servers from the server pool 345 and the corresponding resources to manage the requests and perform the given one or more tasks.
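The specification lists regression models among the candidate model families; as a non-limiting sketch, a linear regression could be fitted on hypothetical historic workload data to predict the number of servers needed. The features, figures and the use of scikit-learn are assumptions made for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historic data: [number of requests, mean CPU utilization %]
# per observation window, with the server count that proved adequate.
X_hist = np.array([[100, 35], [150, 45], [250, 60], [320, 70], [400, 85]])
y_hist = np.array([2, 3, 4, 5, 7])

model = LinearRegression().fit(X_hist, y_hist)  # learn trends from history

# Predict the server count an incoming workload is likely to require.
needed = model.predict(np.array([[300, 75]]))
print(max(1, round(float(needed[0]))))
```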
[0053] Upon training of the model on the historic data, the adjusting unit 245 dynamically adjusts the number of servers from the server pool 345 to perform the one or more tasks. The adjusting unit 245 allocates the number of servers required to perform the one or more tasks based on one or more learnt parameters to manage the one or more requests. The one or more parameters include, but are not limited to, the type of requests, the number of requests and the status of execution of tasks. The requests refer to demands made by applications or processes for system resources in order to perform one or more tasks on the server infrastructure. Therefore, the one or more requests initiate the one or more tasks. The type and number of requests and the status of execution of the one or more tasks determine the type and number of resources, or the number of servers from the server pool 345, required to execute the requests. The type and number of requests are evaluated against the threshold or optimal value predefined using the historical data upon which the model trained and gained insights.
[0054] In an embodiment, when the number of requests exceeds a predefined threshold, the adjusting unit 245 dynamically increases the number of servers in the server pool 345. The predefined threshold refers to a specific value or optimal range of values used by the system 120 to trigger actions. The actions in the present embodiment relate to dynamically adjusting the number of servers from the server pool 345 and the resources running on each of the servers. The threshold is predefined based on several factors including, but not limited to, the number of requests, expected workload, system resource capacity, historical usage patterns and server health. The threshold value is configured by the selected model from the data used by the model during the model training.
[0055] In an embodiment, when the number of requests is greater than the threshold value, the implication is that a greater number of tasks are to be performed and the workload is high. As more tasks are to be performed, the number of resources required to perform the one or more tasks is also higher. Thus, the number of servers from the server pool 345, and the resources running on those servers, is dynamically increased by the adjusting unit 245. The number of resources is dynamically increased by adding more servers from the server pool 345 for the execution of the one or more tasks. In an embodiment, when the number of requests is less than the threshold value, fewer tasks are to be performed and the workload is low. As fewer tasks are to be performed, fewer resources are needed, and the adjusting unit 245 decommissions the servers idle for the given task. The dynamic addition and decommissioning of servers enables optimized utilization of resources, avoiding both under-utilization and over-utilization of the one or more resources during the execution of the one or more tasks.
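A minimal sketch of this threshold-based adjustment, with all limits and step sizes assumed for illustration, could be:

```python
def adjust_server_count(active, num_requests, threshold, step=1, pool_max=10):
    """Scale up when requests exceed the predefined threshold; scale down
    (decommission idle servers) when the workload falls below it."""
    if num_requests > threshold:
        return min(pool_max, active + step)   # add servers from the pool
    if num_requests < threshold:
        return max(1, active - step)          # release idle servers
    return active

print(adjust_server_count(active=4, num_requests=120, threshold=100))  # -> 5
```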
[0056] In an embodiment, when the one or more servers are not in a working condition or have issues, the adjusting unit 245 dynamically increases the number of servers in the server pool 345. The health of a server refers to its capacity to perform tasks and to run the resources for the one or more tasks effectively. The issues in a server include at least one of hardware failures, network failures, software bugs, security vulnerabilities, resource exhaustion and load balancing problems. In an embodiment, the trained model monitors how many failed requests the server is undergoing, or whether the response time of the server is increasing. An increase in failed requests or response time indicates degraded performance. The health of the servers is monitored continuously by the trained model, and failures in server performance are detected by the model through data analysis.
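One hedged way to express the failed-request and response-time checks described above (the numeric limits are hypothetical, not taken from the disclosure):

```python
def server_has_issues(failed_requests: int, avg_response_ms: float,
                      max_failures: int = 10,
                      max_response_ms: float = 500.0) -> bool:
    """Flag degraded performance: an increase in failed requests or in
    response time indicates that the server is not in a working condition."""
    return failed_requests > max_failures or avg_response_ms > max_response_ms
```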
[0057] In an embodiment, when the status of execution of the one or more tasks indicates that the one or more tasks are completed, the adjusting unit 245 dynamically decreases the number of servers in the server pool 345. As the one or more tasks are completed, the servers are no longer required. Therefore, upon completion of the one or more tasks, the one or more resources are released. In an embodiment, where one or more servers are working to perform a task, the average utilization of the working servers is monitored. In an embodiment, the average utilization of servers is the percentage of a resource, out of the total capacity of the resource, that is being used to perform the given task over a specific period. The adjusting unit 245 releases one or more of the servers whenever the average utilization of servers falls below the predefined threshold value. For example, if 10 servers are utilized with an average utilization of 40% and the predefined threshold for decreasing the number of servers is 30%, the adjusting unit 245 decreases the number of servers by 2; the same total load, spread over the remaining 8 servers, corresponds to an average utilization of 50%. Thereafter, 8 servers remain active to perform the task. In another embodiment, when the average utilization is zero, the task is completed and the servers are released. The releasing of idle servers enables the system 120 to utilize the released resources for tasks which require more servers and resources. The data on the type of servers, the number of servers, the required resources for each of the requests, the corresponding tasks and the dynamic allocation actions are stored in the database 220 for use in future predictions.
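The worked example above can be reproduced with a small utilization calculation; the 50% target utilization for the remaining servers is an assumption made to match the stated outcome:

```python
import math

def servers_to_release(n_servers: int, avg_util_pct: float,
                       target_util_pct: float = 50.0) -> int:
    """Release servers so the same total load fits on fewer servers.
    Total load: 10 servers x 40 % = 400 'server-percent'; at a 50 % target
    this fits on 8 servers, so 2 of the 10 are released."""
    total_load = n_servers * avg_util_pct
    needed = math.ceil(total_load / target_util_pct)
    return max(0, n_servers - needed)

print(servers_to_release(10, 40.0))  # -> 2, leaving 8 servers active
```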
[0058] FIG. 3 is an exemplary block diagram of an architecture 300, implemented in the system of the FIG. 2, according to one or more embodiments of the present invention.
[0059] The architecture 300 includes a data processing unit 305, an edge level training unit 310, a model compression unit 315, an AI/ML unit 320 and an execution server group 325. The AI/ML unit 320 includes a health monitoring unit 330, an auto scaling unit 335 and a logging and analytics unit 340. The AI/ML unit 320 works in close proximity to the execution server group 325, which comprises the server pool 345 and an edge server 350.
[0060] In an embodiment, the data processing unit 305 is configured to standardize a received dataset. The received dataset includes, but is not limited to, the one or more requests and the associated one or more tasks. The received dataset is the historic data. The historic data refers to data of at least one of workload characteristics and performance metrics. The workload characteristics and performance metrics are related to the one or more tasks which were previously executed using the one or more resources running on the one or more servers, wherein the one or more servers belong to a server pool. The historic data pertains to data including, but not limited to, CPU and GPU requirements, latency constraints, and throughput demands. The one or more requests initiate the execution of the one or more tasks. The received dataset further includes at least the performance of the one or more tasks utilizing one or more servers and corresponding resources to perform the given request. The received dataset utilized to train the model is raw data. The raw data contains heterogeneous formats which are not suitable for model training. To prepare the dataset for the model, processing of the data is to be done. The processing of the data includes, but is not limited to, normalization, cleaning and organizing of the data. After the pre-processing, the raw data is transformed into standardized data with an organized structure and homogeneous formats suitable for model training.
[0061] Upon processing of the data using the data processing unit 305, the data is applied to the model in the edge level training unit 310. Edge level training refers to training at locations in the network 105 where the network 105 interacts with external networks or is closer to the data sources and users. The edge level training unit 310 relies on training at the edge of the network 105 rather than depending on a cloud-based system. In the present invention, the edge level training unit 310 applies the standardized data to the model. The model gains insights into the type of requests, the number of servers and the corresponding resources required to perform a given task. The one or more resources include at least one of CPU, RAM, network interfaces, power supply, OS and cooling system. The model monitors the execution of tasks and the performance of the servers and the resources running on them. The model acquires inference on the optimal range or threshold value of each of the performance parameters essential for the optimized execution of the one or more tasks. The performance parameters include, but are not limited to, the type of requests, the number of requests, the health of servers, and the status of execution of tasks. This monitoring enables the model to detect deviations of each of the parameters in incoming or real-time data from the inferred threshold value or optimal range.
[0062] Upon edge level training of the model, the model compression unit 315 reduces the size of the trained models. The compression enables the model to run efficiently on the edge servers 350 and the devices they serve. The edge servers 350 are servers located closer to the end devices. The edge devices refer to devices located at the edge of the network 105 and include, but are not limited to, sensors, smartphones and IoT devices. The edge devices have limited computational power and storage. The compression of the model using the model compression unit 315 reduces the overloading of the system 120 and facilitates seamless execution of the model.
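A non-limiting sketch of one simple compression technique, down-casting trained weights to a lower-precision type so the model fits on constrained edge hardware (real compression units may instead use pruning, quantization or distillation):

```python
import numpy as np

def compress_weights(weights: np.ndarray) -> np.ndarray:
    """Cast 64-bit weights to 16-bit floats, shrinking memory to a quarter."""
    return weights.astype(np.float16)

weights = np.random.randn(1000, 1000)       # hypothetical trained weights
small = compress_weights(weights)
print(weights.nbytes, "->", small.nbytes)   # 8000000 -> 2000000 bytes
```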
[0063] Upon compression of the model by the model compression unit 315, the model is deployed on the edge network. The model continuously monitors the execution of the one or more tasks and the performance of the servers and corresponding resources utilized to execute the tasks in the edge network. The model detects events in which the parameters of the execution of the given task deviate beyond the predefined threshold values or optimal range. The threshold or optimal value is predefined by determining performance parameters, including but not limited to the number of requests, the health of the servers or resources, and the status of execution of tasks, which are essential for the optimized functioning of the network 105.
[0064] In an embodiment, the health monitoring unit 330 continuously monitors the working conditions or the issues of the one or more servers in use. In an embodiment, the health monitoring unit 330 finds that the health of one of the servers deteriorates due to the load of requests to be performed. The health monitoring unit 330 works alongside the edge server 350 to add one or more servers from the server pool 345. The addition of the one or more servers from the server pool 345 assists the servers in use in performing the given request. In an embodiment, when the health of one or more servers is low, the one or more unhealthy servers are decommissioned, and one or more healthy servers are added from the server pool 345 to manage the given requests.
[0065] In an embodiment, the auto scaling unit 335 continuously monitors the edge network. When the number of requests exceeds the predefined threshold, more servers and resources are required to execute the one or more received requests. The servers required in addition to the deployed servers are added from the server pool 345. The autoscaling is done by the auto scaling unit 335 working alongside the edge server 350. The automated addition of servers enables the network to facilitate successful execution of the one or more requests and the one or more tasks associated with the requests.
[0066] In an embodiment, when the status of the execution of the one or more tasks indicates that the one or more tasks are completed, the model dynamically adjusts the number of servers in the server pool 345. Upon completion of the one or more tasks, the one or more servers, and the corresponding resources running on them, are released from the task of execution. The model, trained on a dataset, is made to monitor the network 105 and detects any deviations of the performance parameters of the network 105 with respect to the threshold values. Thereafter, the model dynamically adjusts the servers executing the tasks, the servers in the server pool 345 and the resources running on the servers to perform the one or more tasks.
[0067] Each dynamic adjustment of the servers in the server pool 345, and of the corresponding resources running on each of those servers, is logged or recorded in the logging and analytics unit 340. The recorded data regarding workload patterns, resource utilization and system performance is further utilized to gain insights for forthcoming events in the network 105.
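A minimal sketch of such a log record, with an assumed JSON-lines schema:

```python
import json
import time

def log_adjustment(path: str, action: str, servers: list, workload: dict) -> None:
    """Append one record per dynamic adjustment so that workload patterns,
    resource utilization and system performance can feed future predictions."""
    entry = {"timestamp": time.time(), "action": action,
             "servers": servers, "workload": workload}
    with open(path, "a") as log:
        log.write(json.dumps(entry) + "\n")

log_adjustment("adjustments.log", "scale_up", ["345a", "345b"],
               {"requests": 320, "avg_cpu_pct": 70})
```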
[0068] FIG. 4 is a flow diagram for creating the server pool 345, according to one or more embodiments of the present invention.
[0069] At step 405, the workload is continuously monitored in the environment. The monitoring includes at least analyzing CPU and GPU requirements, latency constraints and throughput demands to perform the one or more requests. If the CPU and GPU requirements are higher, or the time to transmit data through the network is longer, the workload is considered higher.
[0070] At step 410, the incoming requests, or the computational workload requiring resources, are detected. The detection is essential to understand the number of servers and the associated number and type of resources to be assigned for fulfilling the one or more requests and the associated one or more tasks.
[0071] At step 415, based on the detected incoming workload, on-the-fly server provisioning is to be done. The required number of servers and corresponding resources are allocated dynamically from a server pool 345. The allocated resources are utilized for the performance of the one or more tasks.
[0072] At step 420, the models are trained on the data generated from the execution of one or more tasks. The model includes at least one of, machine learning models or statistical models. The model monitors the execution of tasks and gains inferences on the performance of the servers during the training. The model is enabled to predict the performance parameters of the servers with respect to a given task from the inferences acquired during the training process.
[0073] At step 425, the model dynamically allocates the servers from the server pool 345 and the corresponding resources based on the inferences gained from the training. When the incoming workload is increasing or the response time is rising, the model infers that more resources are required and scales up the number of servers to perform the given task; scaling up involves adding servers on the fly as workloads increase. When the incoming workload is decreasing, the model scales down by decommissioning idle servers among the allotted servers during low workload.
[0074] At step 430, the health of the servers is monitored by the model. When the health of a server is not in an optimal condition, the model dynamically increases the number of servers by adding servers from the server pool 345 to perform the one or more tasks.
[0075] At step 435, the performance of the system during operation is recorded by logging and analytics. The logging and analytics further record the analysis of the workload done by the model, as well as the number and type of resources utilized for executing each of the tasks.
[0076] In an embodiment, upon completion of the one or more tasks, the servers and the corresponding resources used in performing the given task are released from usage.
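The flow of FIG. 4 can be summarized as a control loop; every interface below (pool, model, monitor) is a hypothetical stand-in used only to show the ordering of steps 405-435:

```python
def creation_loop(pool, model, monitor):
    """Hedged sketch of FIG. 4: monitor, detect, provision, train, adjust."""
    while monitor.running():
        workload = monitor.sample_workload()          # step 405: monitor
        requests = monitor.detect_requests()          # step 410: detect
        active = pool.provision_for(requests)         # step 415: on-the-fly
        model.train_on(monitor.execution_data())      # step 420: train
        needed = model.predict_servers(workload)      # step 425: infer
        if needed > len(active):
            pool.scale_up(needed - len(active))       # add servers
        elif needed < len(active):
            pool.scale_down(len(active) - needed)     # decommission idle
        for server in active:                         # step 430: health
            if monitor.has_issues(server):
                pool.replace(server)
        monitor.log(workload, active)                 # step 435: logging
```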
[0077] FIG. 5 is a flow diagram of a method 500 for creating server pools 345, according to one or more embodiments of the present invention. For the purpose of description, the method 500 is described with the embodiments as illustrated in FIG. 2 and should nowhere be construed as limiting the scope of the present disclosure.
[0078] At step 505, the method 500 includes the step of identifying, using the identification unit 225, one or more requests which require one or more resources to execute one or more tasks. The one or more requests pertain to the execution of the one or more tasks. The requests refer to the initiation of one or more tasks to be performed. The identification unit 225 identifies the type of requests received by the system 120. The identification helps in determining the type of resources required to perform the given task.
[0079] At step 510, the method 500 includes the step of determining a type of one or more resources required to execute the one or more tasks. The step of determining the resources is done using the determining unit 230. The step of determining the type of resources enables the system 120 to allocate the one or more servers from the server pool 345 and the corresponding one or more resources to perform the given task.
[0080] At step 515, the method 500 includes the step of assigning the one or more resources running on one or more servers from the server pool 345 to execute the one or more tasks. The assigning unit 235 facilitates the assigning process based on the type of resources required to perform the given task. Each of the diverse tasks requires diverse types of resources for its completion. Thereafter, the resources are utilized for execution of the given tasks.
[0081] At step 520, the method 500 includes the step of training a model. The model is trained on the historic data. The historic data is at least one of workload characteristics and performance metrics related to the execution of the one or more tasks by the one or more resources. The model is trained by the training unit 240 to recognize patterns and gain insights into the performance of the one or more resources during previous executions of the one or more tasks. The insights enable the model to predict the performance of resources in one or more task executions. The predictions equip the model to dynamically allocate the one or more resources to perform the one or more tasks.
[0082] At step 525, the method 500 includes the step of dynamically adjusting the number of servers within the server pool 345 based on the insights that the model gained during the training. The adjusting unit 245 facilitates the adjusting of the servers and the corresponding resources. The dynamic adjustment is based on the type of requests, number of requests, the working condition of servers utilized to perform the one or more tasks, and the status of execution of tasks. The number of servers allotted to perform the given task is increased or decreased based on the workload and completion of tasks.
[0083] The present invention further discloses a non-transitory computer-readable medium having stored thereon computer-readable instructions. The computer-readable instructions are executed by the processor 205. The processor 205 is configured to identify one or more requests which require one or more resources to execute one or more tasks. The processor 205 is further configured to determine a type of one or more resources required to execute the one or more tasks. The processor 205 is further configured to assign, from a server pool 345, the one or more resources running on one or more servers to execute the one or more tasks based on determining the type of the one or more resources. The processor 205 is further configured to train a selected model with data generated during execution of the one or more tasks by the one or more resources. The processor 205 is further configured to dynamically adjust, using the trained model, the number of servers within the server pool 345 depending on one or more parameters.
[0084] A person of ordinary skill in the art will readily ascertain that the illustrated embodiments and steps in the description and drawings (FIGS. 1-5) are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0085] The present disclosure incorporates technical advancement in solving the problems of underutilization of resources, latency issues and limited fault tolerance. Further, the present invention improves the fault tolerance of resources performing the one or more tasks within a network. The present invention dynamically allocates resources based on workload demands. The flexibility in adding and removing resources increases scalability and optimizes resource allocation. Further, the present invention automates the process of creating and allocating servers from the server pools, without manual intervention. By automating the dynamic allocation of resources, the present invention brings economic significance, facilitating cost efficiency in handling the workload, reducing the delay in data transmission and enhancing the fault tolerance in the network. The technical and economic advancement brought about by the present invention is applicable in industries including, but not limited to, telecommunications and edge computing.
[0086] The present invention offers multiple advantages over the prior art, and the above listed are a few examples to emphasize some of the advantageous features. The listed advantages are to be read in a non-limiting manner.

REFERENCE NUMERALS

[0087] Environment- 100
[0088] Network- 105
[0089] User Equipment (UE) - 110
[0090] Server- 115
[0091] System -120
[0092] Processor- 205
[0093] Memory- 210
[0094] User Interface- 215
[0095] Database- 220
[0096] Identification unit- 225
[0097] Determining unit- 230
[0098] Assigning unit- 235
[0099] Training unit- 240
[00100] Adjusting unit- 245
[00101] Data Processing Unit- 305
[00102] Edge Level Training Unit- 310
[00103] Model Compression Unit- 315
[00104] Execution Server Group- 320
[00105] Server Pool- 325
[00106] Edge Server- 330
[00107] Health Monitoring Unit- 335
[00108] Auto Scaling Unit- 340
[00109] Logging and Analytics Unit- 345

CLAIMS
We Claim:
1. A method (500) for creating server pools (345), the method comprising the steps of:
identifying (505), by one or more processors (205), one or more requests which require one or more resources, wherein the one or more requests pertain to execution of one or more tasks;
determining (510), by the one or more processors (205), a type of the one or more resources required to execute the one or more tasks;
assigning (515), by the one or more processors (205), from a server pool, the one or more resources, running on one or more servers, to execute the one or more tasks based on determining the type of the one or more resources;
training (520), by the one or more processors (205), a model with historic data; and
dynamically adjusting (525), by the one or more processors (205), using the trained model, number of servers within the server pool (345) depending on one or more learnt parameters to manage the one or more requests.

2. The method (500) as claimed in claim 1, wherein the one or more resources include at least one of, Central Processing Unit (CPU), Random Access Memory (RAM), network interfaces, power supply, Operating System (OS) and cooling system.

3. The method (500) as claimed in claim 1, wherein the step of, training, a model with historic data, includes the step of:
enabling, by the one or more processors (205), the model to learn trends/patterns of the historic data pertaining to the execution of the one or more tasks.
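As a non-limiting illustration of such trend learning, the Python sketch below fits a simple linear trend to hypothetical historic request counts so that future load can be extrapolated. The data values and the choice of a linear model are assumptions made for exposition only, not the claimed training procedure.

from statistics import linear_regression

# Hypothetical historic data: requests observed per interval.
history = [120, 135, 150, 170, 160, 185, 200]
intervals = list(range(len(history)))

# Learn the trend of the historic data as a slope and an intercept.
slope, intercept = linear_regression(intervals, history)

# Extrapolate the expected load one interval ahead.
expected_load = slope * len(history) + intercept
print("expected load next interval: %.0f requests" % expected_load)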

4. The method (500) as claimed in claim 1, wherein the one or more parameters include at least one of, number of requests, type of requests, health of one or more servers in the server pool (345), status of execution of the one or more tasks.

5. The method (500) as claimed in claim 4, wherein when the number of requests exceeds a predefined threshold, the one or more processors, using the trained model, dynamically increase the number of servers in the server pool (345).

6. The method (500) as claimed in claim 4, wherein when at least one of the one or more servers is not in a working condition or has issues, the one or more processors, using the trained model, dynamically increase the number of servers in the server pool (345).

7. The method (500) as claimed in claim 4, wherein when the status of execution of the one or more tasks indicates that the one or more tasks are completed, the one or more processors, using the trained model, dynamically decrease the number of servers in the server pool (345).
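By way of a non-limiting illustration, the Python sketch below encodes the adjustment rules of claims 5 to 7; the threshold, the step size and the name adjust_pool_size are hypothetical, and the predicted request count is assumed to come from the trained model.

def adjust_pool_size(current: int, predicted_requests: int,
                     unhealthy_servers: int, tasks_pending: int,
                     threshold: int = 1000, step: int = 1) -> int:
    if predicted_requests > threshold:   # claim 5: demand exceeds threshold
        current += step
    if unhealthy_servers > 0:            # claim 6: cover failing servers
        current += unhealthy_servers
    if tasks_pending == 0:               # claim 7: tasks complete, shrink
        current = max(1, current - step)
    return current

For example, adjust_pool_size(5, 1200, 1, 10) returns 7: one server is added because predicted demand exceeds the threshold and one more to cover the unhealthy server, while no shrink occurs because tasks remain pending.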

8. The method (500) as claimed in claim 1, wherein the method further comprises the step of:
upon completion of the one or more tasks, releasing, by the one or more processors, the one or more resources.

9. The method (500) as claimed in claim 1, wherein the one or more servers in the server pool (345) have one or more capabilities including at least one of, Central Processing Unit (CPU) intensive servers, Graphical Processing Unit (GPU) accelerated servers or specialized hardware for machine learning tasks.

10. The method (500) as claimed in claim 6, wherein the issues of the one or more servers include at least one of, hardware failures, network failures, software bugs, security vulnerabilities, resource exhaustion and load balancing problems.

11. A system (120) for creating server pools (345), the system comprising:
an identification unit (225), configured to, identify, one or more requests which require one or more resources, wherein the one or more requests pertain to execution of one or more tasks;
a determining unit (230), configured to, determine, a type of one or more resources required to execute the one or more tasks;
an assigning unit (235), configured to, assign, from a server pool (345), the one or more resources running on one or more servers to execute the one or more tasks based on determining the type of the one or more resources;
a training unit (240), configured to, train, a model with historic data; and
an adjusting unit (245), configured to, dynamically adjust, using the trained model, number of servers within the server pool (345) depending on one or more learnt parameters to manage the one or more requests.

12. The system (120) as claimed in claim 11, wherein the one or more resources include at least one of, Central Processing Unit (CPU), Random Access Memory (RAM), network interfaces, power supply, Operating System (OS) and cooling system.

13. The system (120) as claimed in claim 11, wherein the training unit (240), trains, the model with the historic data by:
enabling, the model to learn trends/patterns of the historic data pertaining to the execution of the one or more tasks.

14. The system (120) as claimed in claim 11, wherein the one or more parameters include at least one of, number of requests, type of requests, working conditions of one or more servers in the server pool (345), status of execution of the one or more tasks.

15. The system (120) as claimed in claim 14, wherein when the number of requests exceeds a predefined threshold, the adjusting unit (245), using the trained model, dynamically increases the number of servers in the server pool (345).

16. The system (120) as claimed in claim 14, wherein when at least one of the one or more servers is not in a working condition or has issues, the adjusting unit (245), using the trained model, dynamically increases the number of servers in the server pool (345).

17. The system (120) as claimed in claim 14, wherein when the status of execution of the one or more tasks indicates that the one or more tasks are completed, the adjusting unit (245), using the trained model, dynamically decreases the number of servers in the server pool (345).

18. The system (120) as claimed in claim 11, wherein the adjusting unit (245) is further configured to:
release, upon completion of the one or more tasks, the one or more resources.

19. The system (120) as claimed in claim 11, wherein the one or more servers in the server pool (345) have one or more capabilities including at least one of, Central Processing Unit (CPU) intensive servers, Graphical Processing Unit (GPU) accelerated servers or specialized hardware for machine learning tasks.

20. The system (120) as claimed in claim 16, wherein the issues of the one or more servers include at least one of, hardware failures, network failures, software bugs, security vulnerabilities, resource exhaustion and load balancing problems.

Documents

Application Documents

# Name Date
1 202321067274-STATEMENT OF UNDERTAKING (FORM 3) [06-10-2023(online)].pdf 2023-10-06
2 202321067274-PROVISIONAL SPECIFICATION [06-10-2023(online)].pdf 2023-10-06
3 202321067274-FORM 1 [06-10-2023(online)].pdf 2023-10-06
4 202321067274-FIGURE OF ABSTRACT [06-10-2023(online)].pdf 2023-10-06
5 202321067274-DRAWINGS [06-10-2023(online)].pdf 2023-10-06
6 202321067274-DECLARATION OF INVENTORSHIP (FORM 5) [06-10-2023(online)].pdf 2023-10-06
7 202321067274-FORM-26 [27-11-2023(online)].pdf 2023-11-27
8 202321067274-Proof of Right [12-02-2024(online)].pdf 2024-02-12
9 202321067274-DRAWING [07-10-2024(online)].pdf 2024-10-07
10 202321067274-COMPLETE SPECIFICATION [07-10-2024(online)].pdf 2024-10-07
11 Abstract.jpg 2024-12-30
12 202321067274-Power of Attorney [24-01-2025(online)].pdf 2025-01-24
13 202321067274-Form 1 (Submitted on date of filing) [24-01-2025(online)].pdf 2025-01-24
14 202321067274-Covering Letter [24-01-2025(online)].pdf 2025-01-24
15 202321067274-CERTIFIED COPIES TRANSMISSION TO IB [24-01-2025(online)].pdf 2025-01-24
16 202321067274-FORM 3 [31-01-2025(online)].pdf 2025-01-31