Abstract: METHOD AND SYSTEM TO PRE-PROCESS RAW NETWORK PERFORMANCE DATA The present disclosure relates to a system (108) and a method (500) for pre-processing raw network performance data. The system (108) includes a capturing unit (210) configured to capture previously executed metadata and raw network performance data. The system (108) further includes an analyzing unit (212) configured to analyze, using a machine learning algorithm, the previously executed metadata and the raw network performance data, to predict an aggregation operation. The system (108) further includes a performing unit (214) configured to perform the aggregation of the raw network performance data using the predicted aggregation operation. Ref. FIG. 2
DESC:
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION
METHOD AND SYSTEM TO PRE-PROCESS RAW NETWORK PERFORMANCE DATA
2. APPLICANT(S)
NAME NATIONALITY ADDRESS
JIO PLATFORMS LIMITED INDIAN OFFICE-101, SAFFRON, NR. CENTRE POINT, PANCHWATI 5 RASTA, AMBAWADI, AHMEDABAD 380006, GUJARAT, INDIA
3.PREAMBLE TO THE DESCRIPTION
THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.
FIELD OF THE INVENTION
[0001] The present invention relates to the field of networking, and more particularly relates to a system and a method to pre-process raw network performance data.
BACKGROUND OF THE INVENTION
[0002] Traditionally, the execution of a user's request for network performance analysis involves loading a huge amount of raw network performance data from a distributed file system. The huge amount of raw network performance data is aggregated utilizing various aggregation operations (e.g., min, max, sum, avg, count, etc.). The loading and computation of such a huge amount of raw network performance data is a resource-intensive and time-consuming process for the computation engine.
[0003] In an illustrative example, the size of just one day's raw 4G/5G network performance data may range from 1TB to 50TB, which means processing a week's raw 4G/5G network performance data would require loading and processing 7TB to 350TB of data. Processing such a massive volume of data requires significant resource utilization and results in a considerable response time.
[0004] The time-consuming steps of importing raw data and conducting initial aggregation operations pose challenges in terms of system performance and efficiency. The computational workload associated with processing large volumes of raw network performance data places a significant strain on a computation cluster, limiting its ability to handle concurrent requests effectively. Moreover, the scalability of traditional systems is hindered by the need to process raw network performance data for each user request. As the number of users and the volume of requests grow, the system's ability to accommodate the increasing demand becomes compromised, leading to degraded performance and longer response times.
[0005] Therefore, there is a need for an improved mechanism that proactively pre-processes the massive network performance data using an Artificial Intelligence/Machine Learning (AI/ML) model. By leveraging pre-processed data, the burden on the computation engine can be reduced, leading to faster response times, improved resource utilization, and enhanced scalability.
SUMMARY OF THE INVENTION
[0006] One or more embodiments of the present disclosure provide a system and a method for pre-processing raw network performance data.
[0007] In one aspect of the present invention, a method to pre-process raw network performance data is disclosed. The method includes capturing previously executed metadata and raw network performance data. Further, the method includes analyzing, using a machine learning algorithm, the previously executed metadata and the raw network performance data, to predict an aggregation operation. Further, the method includes performing the aggregation of the raw network performance data using the predicted aggregation operation.
[0008] In an embodiment, the method further includes receiving a request from a user to access the previously executed metadata and the raw network performance data.
[0009] In an embodiment, the method further includes applying the aggregation operation, such as min, max, sum, count, average, to the raw network performance data at various bucket levels, such as Quarter, Hour, Day, Month, International Mobile Subscriber Identity (IMSI), cell.
[0010] In an embodiment, the step of applying comprises enabling summarizing and condensing the raw network performance data into meaningful and useful aggregated information.
[0011] In an embodiment, the method further includes storing the aggregated data upon performing the aggregation operations.
[0012] In an embodiment, the method further includes facilitating efficient access to the data, by creating a dedicated folder structure for each bucket level.
[0013] In an embodiment, the aggregated data at each of the bucket level is stored in the dedicated folder structure.
[0014] In an embodiment, requesting comprises retrieving the aggregated data.
[0015] In an embodiment, the method further includes applying the aggregation operation after a predefined time interval, which is determined based on a set of factors, including a number of raw data records, time of the day, traffic over a network, and a geographical location.
[0016] In another aspect of the present invention, the system for pre-processing raw network performance data is disclosed. The system includes a capturing unit, an analyzing unit, and a performing unit. The capturing unit is configured to capture previously executed metadata and raw network performance data. The analyzing unit is configured to analyze, using a machine learning algorithm, the previously executed metadata and the raw network performance data, to predict an aggregation operation. The performing unit is configured to perform the aggregation of the raw network performance data using the predicted aggregation operation.
[0017] In another aspect of the present invention, a non-transitory computer-readable medium having stored thereon computer-readable instructions that, when executed by a processor, cause the processor to perform operations is disclosed. The processor is configured to capture previously executed metadata and raw network performance data. The processor is further configured to analyze, using a machine learning algorithm, the previously executed metadata and the raw network performance data, to predict an aggregation operation. Further, the processor is configured to perform the aggregation of the raw network performance data using the predicted aggregation operation.
[0018] Other features and aspects of this invention will be apparent from the following description and the accompanying drawings. The features and advantages described in this summary and in the following detailed description are not all-inclusive, and particularly, many additional features and advantages will be apparent to one of ordinary skill in the relevant art, in view of the drawings, specification, and claims hereof. Moreover, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes and may not have been selected to delineate or circumscribe the inventive subject matter, resort to the claims being necessary to determine such inventive subject matter.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0020] FIG. 1 is an exemplary block diagram of an environment for pre-processing raw network performance data, according to various embodiments of the present disclosure;
[0021] FIG. 2 is an exemplary block diagram of a system for pre-processing the raw network performance data, according to various embodiments of the present disclosure;
[0022] FIG. 3 is an exemplary block diagram of an architecture implemented in the system of the FIG. 2, according to various embodiments of the present disclosure;
[0023] FIG. 4 is a signal flow diagram for pre-processing the raw network performance data, according to various embodiments of the present disclosure; and
[0024] FIG. 5 is a schematic representation of a method for pre-processing raw network performance data, according to various embodiments of the present disclosure.
[0025] The foregoing shall be more apparent from the following detailed description of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0026] Some embodiments of the present disclosure, illustrating all its features, will now be discussed in detail. It must also be noted that as used herein and in the appended claims, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise.
[0027] Various modifications to the embodiments will be readily apparent to those skilled in the art, and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure, including the definitions listed herein below, is not intended to be limited to the embodiments illustrated, but is to be accorded the widest scope consistent with the principles and features described herein.
[0028] A person of ordinary skill in the art will readily ascertain that the illustrated steps detailed in the figures and here below are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0029] FIG. 1 illustrates an exemplary block diagram of an environment 100 for pre-processing raw network performance data, according to various embodiments of the present disclosure. The environment 100 includes a User Equipment (UE) 102, a server 104, a network 106, and a system 108 communicably coupled to each other for pre-processing raw network performance data. The raw network performance data refers to unprocessed data that is directly captured from the network. The raw network performance data includes, but is not limited to, metrics such as call release reason (CRR) data, bandwidth usage, latency, packet loss, jitter, throughput, error rates, and other performance indicators collected from network devices and infrastructure. The pre-processing of raw network performance data refers to the series of operations and transformations applied to the raw network performance data before it is analyzed or utilized for further decision-making processes.
[0030] The CRR data refers to information recorded by telecommunications networks to indicate the cause or reason for the termination of a call. When a call ends, the network typically generates a CRR code to categorize why the call was released. In an aspect, the CRR data includes a number of CRR codes that may vary depending on the specific telecommunications standards and protocols used by the network, but common reasons for call release include normal call clearing (the call was terminated by either the calling party or the called party in a normal manner, such as hanging up the phone), user busy (the called party's line was busy, and the call could not be completed), No User Responding (the called party did not answer the call within a specified timeout period), Call Rejected (the called party explicitly rejected the incoming call), Call Dropped (the call was unexpectedly terminated due to a technical issue or loss of signal), Network Congestion (the call could not be completed due to network congestion or capacity limitations), and so on.
[0031] As per the illustrated embodiment and for the purpose of description and illustration, the UE 102 includes, but is not limited to, a first UE 102a, a second UE 102b, and a third UE 102c, and should nowhere be construed as limiting the scope of the present disclosure. In alternate embodiments, the UE 102 may include a plurality of UEs as per the requirement. For ease of reference, each of the first UE 102a, the second UE 102b, and the third UE 102c, will hereinafter be collectively and individually referred to as the “User Equipment (UE) 102”.
[0032] In an embodiment, the UE 102 is one of, but not limited to, any electrical, electronic, or electro-mechanical equipment, or a combination of one or more such devices, such as a smartphone, virtual reality (VR) devices, augmented reality (AR) devices, a laptop, a general-purpose computer, a desktop, a personal digital assistant, a tablet computer, a mainframe computer, or any other computing device.
[0033] The environment 100 includes the server 104 accessible via the network 106. The server 104 may include, by way of example but not limitation, one or more of a standalone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof. In an embodiment, the server 104 may be associated with an entity, which may include, but is not limited to, a vendor, a network operator, a company, an organization, a university, a lab facility, a business enterprise, a defense facility, or any other facility that provides service.
[0034] The network 106 includes, by way of example but not limitation, one or more of a wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof. The network 106 may include, but is not limited to, a Third Generation (3G), a Fourth Generation (4G), a Fifth Generation (5G), a Sixth Generation (6G), a New Radio (NR), a Narrow Band Internet of Things (NB-IoT), an Open Radio Access Network (O-RAN), and the like.
[0035] The network 106 may also include, by way of example but not limitation, at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, one or more messages, packets, signals, waves, voltage or current levels, or some combination thereof.
[0036] The environment 100 further includes the system 108 communicably coupled to the server 104 and the UE 102 via the network 106. The system 108 is configured for pre-processing raw network performance data. As per one or more embodiments, the system 108 is adapted to be embedded within the server 104 or embedded as an individual entity.
[0037] Operational and construction features of the system 108 will be explained in detail with respect to the following figures.
[0038] FIG. 2 is an exemplary block diagram of the system 108 for the pre-processing of raw network performance data, according to one or more embodiments of the present invention.
[0039] As per the illustrated embodiment, the system 108 includes one or more processors 202, a memory 204, a user interface 206, and a database 208. For the purpose of description and explanation, the description will be explained with respect to one processor 202 and should nowhere be construed as limiting the scope of the present disclosure. In alternate embodiments, the system 108 may include more than one processor 202 as per the requirement of the network 106. The one or more processors 202, hereinafter referred to as the processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, single board computers, and/or any devices that manipulate signals based on operational instructions.
[0040] As per the illustrated embodiment, the processor 202 is configured to fetch and execute computer-readable instructions stored in the memory 204. The memory 204 may be configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory 204 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as disk memory, EPROMs, FLASH memory, unalterable memory, and the like.
[0041] In an embodiment, the user interface 206 includes a variety of interfaces, for example, interfaces for a graphical user interface, a web user interface, a Command Line Interface (CLI), and the like. The user interface 206 facilitates communication of the system 108. In one embodiment, the user interface 206 provides a communication pathway for one or more components of the system 108. Examples of such components include, but are not limited to, the UE 102 and the database 208.
[0042] The database 208 is one of, but not limited to, a centralized database, a cloud-based database, a commercial database, an open-source database, a distributed database, an end-user database, a graphical database, a Not only Structured Query Language (NoSQL) database, an object-oriented database, a personal database, an in-memory database, a document-based database, a time series database, a wide column database, a key value database, a search database, a cache database, and so forth. The foregoing examples of database types are non-limiting and may not be mutually exclusive; e.g., a database can be both commercial and cloud-based, or both relational and open-source.
[0043] In order for the system 108 to pre-process the raw network performance data, the processor 202 includes one or more modules. In one embodiment, the one or more modules include, but are not limited to, a capturing unit 210, an analyzing unit 212, a performing unit 214, a storage unit 216, a facilitating unit 218, and a receiving unit 220 communicably coupled to each other for pre-processing the raw network performance data.
[0044] In one embodiment, each of the capturing unit 210, the analyzing unit 212, the performing unit 214, the storage unit 216, the facilitating unit 218, and the receiving unit 220 can be used in combination or interchangeably for pre-processing the raw network performance data.
[0045] The capturing unit 210, the analyzing unit 212, the performing unit 214, the storage unit 216, the facilitating unit 218, and the receiving unit 220 in an embodiment, may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processor 202. In the examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processor 202 may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processor may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the memory 204 may store instructions that, when executed by the processing resource, implement the processor. In such examples, the system 108 may comprise the memory 204 storing the instructions and the processing resource to execute the instructions, or the memory 204 may be separate but accessible to the system 108 and the processing resource. In other examples, the processor 202 may be implemented by electronic circuitry.
[0046] In an embodiment, for pre-processing the raw network performance data, the capturing unit 210 is configured to capture previously executed metadata and raw network performance data. The metadata is the information that describes and explains the data. The data refers to various types of information related to network performance and metadata. The previously executed metadata refers to the data that provides information about past network performance and the operations that have been carried out on the data. The previously executed metadata includes, but is not limited to, historical records of network performance, configurations, past aggregation operations, timestamps, and other contextual information that helps in understanding the raw network data over time. The raw network performance data refers to unprocessed or minimally processed data, i.e., raw data collected directly from the network 106. The raw network performance data includes, but is not limited to, call release reason (CRR) data, traffic metrics, Quality of Service (QoS) metrics, event logs, and resource utilization.
[0047] In particular, for pre-processing the raw network performance data, the previously executed metadata is captured from a data lake 306 (as shown in FIG. 3) and the raw network performance data from a distributed file system 308 (as shown in FIG. 3). The data lake 306 is a centralized repository that stores all the structured data, unstructured data, and historically executed metadata. The distributed file system 308 is a file system that allows data to be stored and accessed across multiple physical servers or locations.
[0048] Upon capturing the previously executed metadata and the raw network performance data, the analyzing unit 212 is configured to analyze the previously executed metadata and the raw network performance data using a machine learning algorithm to predict an aggregation operation. The machine learning algorithm is a set of computational techniques and statistical models used to enable computers to perform tasks without explicit programming for each task. The machine learning algorithms include, but are not limited to, linear regression, logistic regression, decision trees, random forest, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), K-Means clustering, hierarchical clustering, Q-learning, Deep Q-Networks (DQN), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs). The aggregation operations include at least one of min, max, sum, count, average, etc.
[0049] More specifically, the analyzing unit 212 utilizes the machine learning algorithms to predict the aggregation operations which are required for future requests. To predict the aggregation operations, the analyzing unit 212 analyzes patterns and trends in the previously executed metadata and the raw network performance data using the machine learning algorithms. The patterns refer to recurring structures, shapes, or arrangements observed in the data and represent regularities or similarities in the way data points are distributed or organized. For example, consider raw network performance data that includes metrics like data traffic volume, error rates, and response times at different times of the day. The data might show that traffic volume consistently peaks between 6 PM and 9 PM every day. This peak traffic pattern indicates a regular increase in network usage during evening hours. The trends refer to the general direction or tendency in which data points are moving over time and describe the overall movement or pattern of change in a dataset. The trends can be upward, downward, or stable, and help in understanding long-term changes in the data. For example, over several months, the data might show a steady increase in overall network traffic. This trend indicates growing network usage, possibly due to an increase in the number of users or services. Thus, by analyzing the patterns and trends in the previously executed metadata and the raw network performance data, the machine learning algorithm anticipates the specific aggregation operations that are most relevant and beneficial for upcoming requests.
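The prediction described above can be sketched in a few lines. In this illustrative, non-limiting sketch, the record layout of the previously executed metadata (metric name, hour of day, executed operation) is an assumption, and a simple frequency count stands in for the machine learning algorithm; any trained classifier could be substituted.

```python
from collections import Counter

# Hypothetical history of previously executed requests: the metric
# queried, the hour of day, and the aggregation operation executed.
HISTORY = [
    ("throughput", 18, "avg"), ("throughput", 19, "avg"),
    ("throughput", 20, "max"), ("latency", 9, "max"),
    ("latency", 10, "max"), ("error_rate", 2, "count"),
]

def predict_aggregation(history, metric, hour):
    """Predict the aggregation operation most likely needed by an
    upcoming request, by counting how often each operation was executed
    for the same metric at a similar time of day."""
    nearby = Counter(op for m, h, op in history
                     if m == metric and abs(h - hour) <= 1)
    if not nearby:  # no history near this hour: fall back to the metric alone
        nearby = Counter(op for m, h, op in history if m == metric)
    return nearby.most_common(1)[0][0]

print(predict_aggregation(HISTORY, "throughput", 18))  # avg
```

In this sketch the evening-peak pattern in the history drives the prediction, mirroring the peak-traffic example in the paragraph above.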
[0050] Upon predicting the aggregation operation, the performing unit 214 is configured to perform the aggregation of the raw network performance data using the predicted aggregation operation. The predicted aggregation operation is applied after a predefined time interval (such as 10 minutes, 60 minutes, etc.), which is determined based on a set of factors, including a number of raw data records, time of the day, traffic over a network, and a geographical location. The aggregation of the raw network performance data is performed by applying the aggregation operation, such as min, max, sum, count, average, to the raw network performance data at various bucket levels, such as quarter, hour, day, month, International Mobile Subscriber Identity (IMSI), cell. The bucket levels represent time-based intervals and/or categories of data aggregation. In case of time-based intervals, the bucket levels can range from very short intervals (such as minutes) to longer intervals (such as months). In case of categories, the bucket levels comprise IMSI-based aggregations and cell-based aggregations. Applying the aggregation operation enables summarizing and condensing the raw data into meaningful and useful aggregated information.
[0051] In an embodiment, the performing unit 214 may leverage various aggregation functions to perform the aggregation of the raw network performance data. These functions, such as min, max, sum, count, average, etc., may be applied to the raw network performance data at, for example, a 30-minute bucket level, cell level, etc. For example, the min function returns the smallest value in a dataset. If you have a dataset of numbers [3, 7, 1, 9, 4], the min function would return 1. The max function returns the largest value in a dataset. Using the same example dataset [3, 7, 1, 9, 4], the max function would return 9. The sum function calculates the sum of all values in a dataset. For the dataset [3, 7, 1, 9, 4], the sum function would return 24 (since 3 + 7 + 1 + 9 + 4 = 24). The count function returns the number of non-null values in a dataset. It is used to count the number of rows or elements in a dataset. For example, if you have a dataset [3, 7, null, 9, 4], the count function would return 4, as there are four non-null values. The average function calculates the average of all values in a dataset. For the same dataset [3, 7, 1, 9, 4], the average function would return 4.8. This process summarizes and condenses the raw data into meaningful and useful aggregated information that may be easily analyzed and processed.
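The worked examples above can be executed directly. The short sketch below uses the sample dataset [3, 7, 1, 9, 4]; treating count as skipping nulls mirrors SQL-style COUNT semantics, which is an assumption about the intended behavior.

```python
# Sample dataset from the worked examples above.
data = [3, 7, 1, 9, 4]
print(min(data))   # 1  -- smallest value
print(max(data))   # 9  -- largest value
print(sum(data))   # 24 -- total of all values

# COUNT skips null entries, as in the text's [3, 7, null, 9, 4] example.
with_null = [3, 7, None, 9, 4]
count = sum(1 for x in with_null if x is not None)
print(count)       # 4

# AVERAGE over the same dataset.
avg = sum(data) / len(data)
print(avg)         # 4.8
```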
[0052] In particular, the aggregation of the raw network performance data is performed using the predicted aggregation operation such as min, max, sum, count, average etc. at various bucket levels such as quarter, hour, day, month, International Mobile Subscriber Identity (IMSI), cell, etc. Further, by leveraging the aggregation operation, such as min, max, sum, count, average, at various bucket levels, such as quarter, hour, day, month, International Mobile Subscriber Identity (IMSI), cell, etc., the performing unit 214 summarizes and condenses the raw data into meaningful and useful aggregated information.
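The bucket-level aggregation described above may be sketched as follows. The record layout (timestamp, cell identifier, throughput value) and the choice of an hour/cell bucket are illustrative assumptions; the same grouping applies to quarter, day, month, or IMSI buckets.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw records: (timestamp, cell_id, throughput in Mbps).
RAW = [
    (datetime(2024, 1, 1, 6, 5),  "cell-A", 120.0),
    (datetime(2024, 1, 1, 6, 40), "cell-A", 80.0),
    (datetime(2024, 1, 1, 7, 10), "cell-A", 200.0),
    (datetime(2024, 1, 1, 6, 20), "cell-B", 60.0),
]

def aggregate_by_hour(records):
    """Group raw samples into (hour, cell) buckets and pre-compute the
    aggregates (min, max, sum, count, avg), so that later requests can
    be served from condensed data instead of the raw records."""
    buckets = defaultdict(list)
    for ts, cell, value in records:
        buckets[(ts.replace(minute=0, second=0), cell)].append(value)
    return {
        key: {"min": min(v), "max": max(v), "sum": sum(v),
              "count": len(v), "avg": sum(v) / len(v)}
        for key, v in buckets.items()
    }

agg = aggregate_by_hour(RAW)
print(agg[(datetime(2024, 1, 1, 6), "cell-A")]["avg"])  # 100.0
```

Note that the four raw records condense to three bucket entries; over 1TB-50TB of daily data this condensation is what reduces the load on the computation engine.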
[0053] Upon performing the aggregation of the raw network performance data, the storage unit 216 is configured to store the aggregated data. Specifically, the aggregated data obtained after performing the aggregation is stored in the distributed file system for future use. The aggregated data is typically 70-90% smaller in size compared to the raw network performance data. Further, the facilitating unit 218 is configured to facilitate efficient access to the data by creating a dedicated folder structure for each bucket level. The aggregated data at each bucket level is stored in the dedicated folder structure. The dedicated folder structure helps organize and manage the aggregated data, which enhances data accessibility and retrieval performance.
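One way to realize the dedicated per-bucket-level folder structure is sketched below. The layout (`root/<bucket_level>/<bucket_key>/agg.json`) and the JSON file format are illustrative assumptions only; a production system would write to the distributed file system in its native format.

```python
from pathlib import Path
import json
import tempfile

def store_aggregated(root, bucket_level, bucket_key, aggregates):
    """Store aggregated data under a dedicated folder per bucket level,
    e.g. root/hour/2024-01-01T06/agg.json, so that a later request for
    a given bucket can be served by a direct path lookup."""
    folder = Path(root) / bucket_level / bucket_key
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / "agg.json"
    path.write_text(json.dumps(aggregates))
    return path

root = tempfile.mkdtemp()
p = store_aggregated(root, "hour", "2024-01-01T06", {"avg": 100.0, "count": 2})
print(p.exists())  # True
```

Because the bucket level and bucket key are encoded in the path itself, retrieval requires no scan of the raw data.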
[0054] In an embodiment, the receiving unit 220 is configured to receive a request from a user to access the previously executed metadata and the raw network performance data. More specifically, the request is received from the user to access the aggregated data which is stored in the storage unit 216. The request is received from the user via the UE 102 or the user interface 206. The user is at least one of a subscriber, an administrator, and a network operator.
[0055] Therefore, the system 108 performance is improved and the system 108 can efficiently serve multiple users concurrently without sacrificing performance. Further, the system 108 can seamlessly accommodate an increasing number of users and growing demands without compromising performance or response times.
[0056] FIG. 3 is an exemplary block diagram of an architecture 300 of the system 108 for pre-processing raw network performance data, according to one or more embodiments of the present invention.
[0057] The architecture 300 includes Performance Management (PM) 302, a distributed data processing orchestrator 304, the data lake 306, the distributed file system 308, and an AI/ML model 310. Further, the distributed data processing orchestrator 304 includes a computation engine 312 and a distribution computation cluster 314. Further, the distribution computation cluster 314 includes a compute master 316 and a plurality of workers, such as worker 1, worker 2, …, worker n.
[0058] The user transmits a request to access the raw network performance data to the distributed data processing orchestrator 304 via the PM 302. The PM 302 helps the user to interrelate a set of activities, connecting the metrics, processes, and systems used to monitor and manage business performance. The constant monitoring of the performance counters and Key Performance Indicators (KPIs) of the network elements minimizes the risk of failure and improves the quality, agility, and relevance of business outcomes. The PM 302 is a platform where the user can perform all the requirements related to performance counters. In particular, the request is transmitted to the computation engine 312 for handling the execution of the data processing tasks.
[0059] Upon receiving the request, the computation engine 312 captures the previously executed metadata from the data lake 306 and the raw network performance data from the distributed file system 308.
[0060] Upon capturing the previously executed metadata and the raw network performance data, the computation engine 312 transmits the captured previously executed metadata and the raw network performance data to the distribution computation cluster 314 for predicting and performing the aggregation operation.
[0061] The distribution computation cluster 314, with the help of the compute master 316 and the plurality of workers (such as worker 1, worker 2, … worker n), analyzes the previously executed metadata and the raw network performance data to predict the aggregation operation using the AI/ML model 310. The compute master 316 coordinates the activities of the plurality of workers: it allocates tasks, monitors progress, and aggregates results. The plurality of workers (such as worker 1, worker 2, … worker n) execute the individual tasks assigned by the compute master 316. The AI/ML model 310 utilizes machine learning algorithms to make predictions about the aggregation operations that will be needed for future requests.
[0062] More specifically, the AI/ML model 310 analyzes the patterns and trends in the previously executed metadata and the raw network performance data. By analyzing these patterns and trends, the AI/ML model 310 anticipates the specific aggregation operations that will be most relevant and beneficial for upcoming requests. The aggregation operations include at least one of min, max, sum, count, average, etc.
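The specification does not disclose the internals of the AI/ML model 310; a minimal Python sketch of one possible approach is a frequency-based predictor over the previously executed metadata, where the `history` structure, its field names, and the `predict_aggregation_ops` helper are illustrative assumptions rather than part of the disclosure.

```python
from collections import Counter

def predict_aggregation_ops(metadata_log, top_k=2):
    """Predict the aggregation operations most likely needed for
    upcoming requests, based on how often each operation appeared
    in previously executed requests (hypothetical metadata format)."""
    counts = Counter(op for entry in metadata_log for op in entry["ops"])
    return [op for op, _ in counts.most_common(top_k)]

# Previously executed metadata: each entry records the aggregation
# operations a past request applied to a given KPI.
history = [
    {"kpi": "throughput", "ops": ["avg", "max"]},
    {"kpi": "throughput", "ops": ["avg"]},
    {"kpi": "latency",    "ops": ["avg", "min"]},
]
predicted = predict_aggregation_ops(history)  # "avg" ranks first here
```

A production model could, as the disclosure suggests, additionally weigh trends over time rather than raw frequency alone.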
[0063] Upon predicting the aggregation operations, the distribution computation cluster 314, with the help of the compute master 316 and the plurality of workers (such as worker 1, worker 2, … worker n), performs the aggregation of the raw network performance data using the AI/ML model 310. The predicted aggregation operation is applied after a predefined time interval (such as 10 minutes, 60 minutes, etc.), which is determined based on a set of factors, including a number of raw data records, the time of day, traffic over a network, and a geographical location. The aggregation is performed by applying the aggregation operations, such as min, max, sum, count, and average, to the raw network performance data at various bucket levels, such as quarter, hour, day, and month, which summarizes and condenses the raw data into meaningful and useful aggregated information. Thereafter, the aggregated information is stored in the distributed file system 308. To facilitate efficient access to the data, a dedicated folder structure is created for each bucket level. The dedicated folder structure helps in organizing and managing the aggregated data, which enhances data accessibility and retrieval performance.
[0064] In an embodiment, if the user transmits the request for specific aggregated information, the computation engine 312 retrieves the precomputed network performance data from the distributed file system 308. The precomputed network performance data includes the aggregated results of the network performance data.
[0065] Further, the computation engine 312 applies further computations or transformations to the precomputed data as per the user's request in order to generate the desired output. This approach eliminates the need to perform the entire aggregation process again, allowing for faster and more efficient computation by leveraging the precomputed results.
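To illustrate why the precomputed results suffice, the following sketch rolls hourly sum/count aggregates up to a daily average without re-reading the raw data; the `precomputed` structure and `daily_average` helper are assumed names for illustration only.

```python
# Precomputed hourly aggregates, as would be read back from the
# distributed file system instead of the raw records.
precomputed = {
    "2024-01-01 10": {"sum": 30.0, "count": 2},
    "2024-01-01 11": {"sum": 30.0, "count": 1},
}

def daily_average(hourly_aggs, day):
    """A further computation applied on top of precomputed results:
    combine hourly sums and counts into a daily average."""
    rows = [agg for hour, agg in hourly_aggs.items() if hour.startswith(day)]
    total = sum(agg["sum"] for agg in rows)
    count = sum(agg["count"] for agg in rows)
    return total / count

result = daily_average(precomputed, "2024-01-01")  # 60.0 / 3 samples
```

Note that sums and counts compose across buckets while averages do not, which is one reason the disclosure retains several aggregation operations per bucket.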
[0066] FIG. 4 is a signal flow diagram for pre-processing the raw network performance data, according to various embodiments of the present disclosure.
[0067] At step 402, the distributed data processing orchestrator 304 receives the request for raw network performance data from the user.
[0068] At step 404, upon receiving the request from the user, the distributed data processing orchestrator 304 captures the previously executed metadata from the data lake 306 and the raw network performance data from the distributed file system 308 with help of the computation engine 312.
[0069] At step 406, upon capturing the previously executed metadata and the raw network performance data, the distributed data processing orchestrator 304, with the help of the distribution computation cluster 314, analyzes the previously executed metadata and the raw network performance data to predict the aggregation operation using the AI/ML model 310. The AI/ML model 310 utilizes machine learning algorithms to make predictions about the aggregation operations that will be needed for future requests. More specifically, the AI/ML model 310 analyzes the patterns and trends in the previously executed metadata and the raw network performance data. By analyzing these patterns and trends, the AI/ML model 310 anticipates the specific aggregation operations that will be most relevant and beneficial for upcoming requests. The aggregation operations include at least one of min, max, sum, count, average, etc.
[0070] At step 408, the distributed data processing orchestrator 304, with the help of the distribution computation cluster 314, performs the aggregation of the raw network performance data using the AI/ML model 310. The predicted aggregation operation is applied after a predefined time interval (such as 10 minutes, 60 minutes, etc.), which is determined based on a set of factors, including a number of raw data records, the time of day, traffic over a network, and a geographical location. The aggregation is performed by applying the aggregation operations, such as min, max, sum, count, and average, to the raw network performance data at various bucket levels, such as quarter, hour, day, month, International Mobile Subscriber Identity (IMSI), and cell, which summarizes and condenses the raw data into meaningful and useful aggregated information.
[0071] At step 410, the aggregated information is stored in the distributed file system 308. The aggregated data is typically 70-90% smaller in size compared to the raw network performance data. To facilitate efficient access to the data, a dedicated folder structure is created for each bucket level. The dedicated folder structure helps in organizing and managing the aggregated data, which enhances data accessibility and retrieval performance.
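The disclosure does not specify the folder layout; one plausible sketch keys a directory on the bucket level and the bucket value, so retrieval of, say, an hourly aggregate resolves to a single path. The `bucket_path` helper and the `/data/aggregated` root are assumptions for illustration.

```python
from pathlib import Path

def bucket_path(root, bucket_level, bucket_key):
    """Build the dedicated folder path for one bucket level,
    e.g. <root>/hour/2024-01-01_10 for an hourly bucket."""
    return Path(root) / bucket_level / bucket_key.replace(" ", "_")

p = bucket_path("/data/aggregated", "hour", "2024-01-01 10")
# p resolves to /data/aggregated/hour/2024-01-01_10
```

Keeping one folder per bucket level lets a request for daily aggregates scan only the `day` subtree, which is consistent with the retrieval-performance benefit the disclosure describes.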
[0072] At step 412, the user transmits the request for specific aggregated information to the distributed data processing orchestrator 304.
[0073] At step 414, the distributed data processing orchestrator 304 with the help of the computation engine 312 retrieves the precomputed network performance data from the distributed file system 308. The precomputed network performance data includes the aggregated results of the raw network performance data. Further, the computation engine 312 applies further computations or transformations to the precomputed data as per the user's request in order to generate the desired output.
[0074] FIG. 5 is a flow diagram of a method 500 for pre-processing raw network performance data, according to various embodiments of the present disclosure. For the purpose of description, the method 500 is described with the embodiments as illustrated in FIG. 2 and should nowhere be construed as limiting the scope of the present disclosure.
[0075] At step 502, the method 500 includes the step of capturing the previously executed metadata and the raw network performance data by the capturing unit 210.
[0076] At step 504, the method 500 includes the step of analyzing, using the machine learning algorithm, the previously executed metadata and the raw network performance data to predict the aggregation operation by the analyzing unit 212. The aggregation operations include at least one of min, max, sum, count, average, etc.
[0077] At step 506, the method 500 includes the step of performing the aggregation of the raw network performance data using the predicted aggregation operation by the performing unit 214. Further, the aggregation of the raw network performance data is performed by applying the aggregation operations, such as min, max, sum, count, and average, to the raw network performance data at various bucket levels, such as quarter, hour, day, month, International Mobile Subscriber Identity (IMSI), and cell. Applying the aggregation operations at the various bucket levels enables summarizing and condensing the raw network performance data into meaningful and useful aggregated information. Upon performing the aggregation operations, the aggregated data is stored by the storage unit 216. Further, the facilitating unit 218 facilitates efficient access to the data by creating a dedicated folder structure for each bucket level. The aggregated data at each bucket level is stored in the corresponding dedicated folder structure.
[0078] In an embodiment, the request from the user is received to access the previously executed metadata and the raw network performance data by the receiving unit 220. In particular, the request from the user is received to access the aggregated data.
[0079] The present invention further discloses a non-transitory computer-readable medium having stored thereon computer-readable instructions. The computer-readable instructions are executed by the processor 202. The processor 202 is configured to capture the previously executed metadata and the raw network performance data. The processor 202 is further configured to analyze using the machine learning algorithm, the previously executed metadata and the raw network performance data, to predict the aggregation operation. The processor 202 is further configured to perform the aggregation of the raw network performance data using the predicted aggregation operation.
[0080] A person of ordinary skill in the art will readily ascertain that the illustrated embodiments and steps in description and drawings (FIGS. 1-6) are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[0081] The present invention includes a number of advantages as mentioned below:
(i) Improved Performance: By utilizing pre-processed data instead of raw 4G/5G network performance data, improved performance is achieved in the system. The time-consuming steps of importing the raw data and conducting aggregation and sorting operations are bypassed, resulting in a faster response time. This optimization allows users to obtain the desired results more quickly, enhancing overall system efficiency.
(ii) Reducing load on computation cluster: One of the key advantages of using pre-computed data directly in user request execution is the reduction in load on the computation cluster. Since the aggregated data is significantly smaller in size compared to the raw network performance data, the processing required to obtain the desired result is greatly reduced. This reduction in computational workload translates to lower strain on the computation cluster, enabling it to handle a larger volume of requests without being overwhelmed. As a result, the system can efficiently serve multiple users concurrently without sacrificing performance.
(iii) Scalability: The scalability of the system is greatly enhanced by utilizing pre-processed data. With the reduced load on the computation cluster, the system becomes more capable of handling a greater volume of requests concurrently. The optimized approach of leveraging pre-computed data allows for improved resource utilization and efficient management of computational resources. This scalability ensures that the system can seamlessly accommodate an increasing number of users and growing demands without compromising performance or response times.
[0082] The present invention offers multiple advantages over the prior art and the above-listed are a few examples to emphasize on some of the advantageous features. The listed advantages are to be read in a non-limiting manner.
REFERENCE NUMERALS
[0083] Environment – 100
[0084] User Equipment (UE) – 102
[0085] Server – 104
[0086] Network – 106
[0087] System – 108
[0088] Processor – 202
[0089] Memory – 204
[0090] User Interface – 206
[0091] Database – 208
[0092] Capturing unit – 210
[0093] Analyzing unit – 212
[0094] Performing unit – 214
[0095] Storage unit – 216
[0096] Facilitating unit – 218
[0097] Receiving unit – 220
[0098] Performance Management (PM) – 302
[0099] Distributed data processing orchestrator – 304
[00100] Data lake – 306
[00101] Distributed file system – 308
[00102] AI/ML model – 310
[00103] Computation engine – 312
[00104] Distribution computation cluster – 314
[00105] Compute master – 316
CLAIMS
We Claim:
1. A method (500) to pre-process raw network performance data, the method (500) comprising the steps of:
capturing, by one or more processors (202), previously executed metadata and raw network performance data;
analysing, by the one or more processors (202), using a machine learning algorithm, the previously executed metadata and the raw network performance data, to predict an aggregation operation; and
performing, by the one or more processors (202), the aggregation of the raw network performance data using the predicted aggregation operation.
2. The method (500) as claimed in claim 1, comprises receiving, by the one or more processors (202), a request from a user to access the previously executed metadata and the raw network performance data.
3. The method (500) as claimed in claim 1, comprises:
applying, by the one or more processors (202), the aggregation operation, such as min, max, sum, count, average, to the raw network performance data at various bucket levels, such as Quarter, Hour, Day, Month, International Mobile Subscriber Identity (IMSI), cell.
4. The method (500) as claimed in claim 3, wherein applying comprises enabling, by the one or more processors (202), summarizing and condensing the raw network performance data into meaningful and useful aggregated information.
5. The method (500) as claimed in claim 1, comprises storing, by the one or more processors (202), the aggregated data upon performing the aggregation operations.
6. The method (500) as claimed in claim 3, comprises facilitating, by the one or more processors (202), efficient access to the data, by creating a dedicated folder structure for each bucket level.
7. The method (500) as claimed in claim 6, wherein the aggregated data at each of the bucket level is stored in the dedicated folder structure.
8. The method (500) as claimed in claim 2, wherein requesting comprises retrieving, by the one or more processors (202), the aggregated data.
9. The method (500) as claimed in claim 1, comprises:
applying, by the one or more processors (202), the aggregation operation after a predefined time interval.
10. The method (500) as claimed in claim 9, wherein the predefined time interval is determined based on a set of factors, including a number of raw data records, time of the day, traffic over a network, and a geographical location.
11. A system (108) for pre-processing raw network performance data, the system (108) comprising:
a capturing unit (210) configured to capture, previously executed metadata and raw network performance data;
an analysing unit (212) configured to analyse, using a machine learning algorithm, the previously executed metadata and the raw network performance data, to predict an aggregation operation; and
a performing unit (214) configured to perform, the aggregation of the raw network performance data using the predicted aggregation operation.
12. The system (108) as claimed in claim 11, comprising a receiving unit (220) configured to receive, a request from a user to access the previously executed metadata and the raw network performance data.
13. The system (108) as claimed in claim 11, wherein the performing unit (214) is further configured to:
apply, the aggregation operation, such as min, max, sum, count, average, to the raw network performance data at various bucket levels, such as Quarter, Hour, Day, Month, International Mobile Subscriber Identity (IMSI), cell.
14. The system (108) as claimed in claim 13, wherein applying comprises enabling summarizing and condensing the raw network performance data into meaningful and useful aggregated information.
15. The system (108) as claimed in claim 11, comprises a storage unit (216) configured to store, the aggregated data upon performing the aggregation operations.
16. The system (108) as claimed in claim 13, comprises a facilitating unit (218) configured to facilitate, access to the data, by creating a dedicated folder structure for each bucket level.
17. The system (108) as claimed in claim 16, wherein the aggregated data at each of the bucket level is stored in the dedicated folder structure.
18. The system (108) as claimed in claim 12, wherein the receiving unit (220) is further configured to retrieve, the aggregated data.
19. The system (108) as claimed in claim 11, wherein the performing unit (214) is further configured to: apply the aggregation operation after a predefined time interval.
20. The system (108) as claimed in claim 19, wherein the predefined time interval is determined based on a set of factors, including a number of raw data records, time of the day, traffic over a network, and a geographical location.