Abstract: The present disclosure provides a system (200) and a method for data enrichment. The system (200) may enrich the data by performing arithmetical and string operations on two or more fields and may create other fields, such that it may index such a field in the database (220) or provide it to an external entity for further analytics. Dynamic rules and policies may be applied in real-time to incoming data that is then normalized and enriched. The system (200) includes an AI/ML engine that may receive the enrichment and normalization information. Based on the received enrichment and normalization information, the AI/ML engine may begin to suggest new or different fields, or further enrichment and normalization. The suggestion may further be supplemented by a generative AI that may be configured within the AI/ML engine. The generative AI may be configured to learn and be trained on new data sets. FIGURE 2
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See section 10; rule 13)
TITLE OF THE INVENTION
SYSTEM AND METHOD FOR DATA ENRICHMENT
APPLICANT
JIO PLATFORMS LIMITED
of Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad -
380006, Gujarat, India; Nationality : India
The following specification particularly describes
the invention and the manner in which
it is to be performed
RESERVATION OF RIGHTS
[0001] A portion of the disclosure of this patent document contains material, which is subject to intellectual property rights such as, but not limited to, copyright, design, trademark, Integrated Circuit (IC) layout design, and/or trade dress protection, belonging to Jio Platforms Limited (JPL) or its affiliates (hereinafter referred to as the owner). The owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all rights whatsoever. All rights to such intellectual property are fully reserved by the owner.
FIELD OF DISCLOSURE
[0002] The present disclosure generally relates to a field of enriching data in a communications network. In particular, the present disclosure relates to real-time enriching of data.
DEFINITION
[0003] As used in the present disclosure, the following terms are generally intended to have the meaning as set forth below, except to the extent that the context in which they are used indicates otherwise.
[0004] The expression ‘Dynamic enrichment of data’ used hereinafter in the specification refers to the process of enhancing or augmenting existing data with additional information in real-time or near-real-time. Data enrichment typically involves integrating external data sources, such as databases, or third-party services, to supplement and enhance the original dataset. The goal of dynamic data enrichment is to provide more comprehensive and valuable information for analysis, decision-making, or other purposes. This could include adding contextual information, such as demographic data, geographic data, social media activity, or historical trends, to enrich the understanding of the original dataset.
[0005] These definitions are in addition to those expressed in the art.
BACKGROUND
[0006] The following description of related art may be intended to provide background information pertaining to the field of the disclosure. This section may include certain aspects of the art that may be related to various features of the present disclosure. However, it should be appreciated that this section is used only to enhance the understanding of the reader with respect to the present disclosure, and not as an admission of prior art.
[0007] In traditional systems, data enrichment is a cumbersome task as it involves offline data retrieval, enrichment, and then transmission of the enriched data back to a database or system from where the data was initially acquired. Also, if any changes are required in the enrichment policy, the code has to be modified to cater to those changes.
[0008] There may, therefore, be a requirement in the art for an easier and simpler approach to enrich acquired data.
SUMMARY
[0009] The present disclosure discloses a system for performing real-time dynamic enrichment of data in a network. The system includes a receiving unit and a processing unit. The receiving unit is configured to obtain data from one or more data sources. The data includes a first field and a second field. The first field and the second field comprise one or more attributes. The processing unit is coupled to the receiving unit. The processing unit is configured to perform at least one operation on the first field and the second field based on a data enrichment policy and implement the data enrichment policy to derive dynamically a third field based on the first field and the second field. The third field is indicative of enriched data.
[0010] In an embodiment, to implement the data enrichment policy, the processing unit is configured to apply the at least one operation selected from arithmetical, string, concatenation, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
[0011] In an embodiment, the processing unit is configured to index the derived third field in a database and generate an indexed third field.
[0012] In an embodiment, the processing unit is configured to transmit the indexed third field to an external entity. The external entity is configured to perform one or more analytics on the indexed third field.
[0013] In an embodiment, the system includes a machine learning (ML) engine configured to receive enrichment and normalization information pertaining to the obtained data, and define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to derive the third field dynamically.
[0014] In an embodiment, the system further includes a generative Artificial Intelligence (AI) module supplemented with the ML engine.
[0015] The present disclosure discloses a method of performing real-time dynamic enrichment of data in a network. The method includes obtaining, by a receiving unit, data from one or more data sources. The data includes a first field and a second field. The first field and the second field comprise one or more attributes. The method includes performing, by a processing unit, at least one operation on the first field and the second field based on a data enrichment policy. The method includes implementing, by the processing unit, the data enrichment policy to derive dynamically a third field based on the first field and the second field, wherein the third field is indicative of enriched data.
[0016] In an embodiment, implementing the data enrichment policy comprises applying, by the processing unit, at least one operation selected from arithmetical,
string, concatenation, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
[0017] In an embodiment, the method further comprises indexing, by the processing unit, the derived third field in a database and generating an indexed third field.
[0018] In an embodiment, the method further includes transmitting, by the processing unit, the indexed third field to an external entity, wherein the external entity is configured to perform one or more analytics on the indexed third field.
[0019] In an embodiment, the method further includes implementing, by the processing unit, a Machine Learning (ML) engine to receive enrichment and normalization information pertaining to the obtained data and define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to derive the third field dynamically.
[0020] In an embodiment, the method further includes supplementing, by the processing unit, the ML engine with a generative Artificial Intelligence (AI) module.
[0021] The present disclosure discloses a user equipment (UE) communicatively coupled with a network. The coupling comprises steps of receiving a connection request, sending an acknowledgment of the connection request to the network, and transmitting a plurality of signals in response to the connection request, wherein real-time dynamic enrichment of data in a network is performed by a system. The system includes a receiving unit and a processing unit. The receiving unit is configured to obtain data from one or more data sources. The data includes a first field and a second field. The first field and the second field comprise one or more attributes. The processing unit is coupled to the receiving unit. The processing unit is configured to perform at least one operation on the first field and the second field based on a data enrichment policy and implement the data
enrichment policy to derive dynamically a third field based on the first field and the second field. The third field is indicative of enriched data.
OBJECTS OF THE INVENTION
[0022] An object of the present invention is to provide a system for real-time enrichment of data.
[0023] Another object of the present invention is to provide a system where enrichment policies may be provided in real-time.
[0024] Another object of the present invention is to provide a system for enrichment of data that is simple and quick.
BRIEF DESCRIPTION OF DRAWINGS
[0025] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes the disclosure of electrical components, electronic components or circuitry commonly used to implement such components.
[0026] FIG. 1 illustrates an exemplary network architecture in which or with which embodiments of the present disclosure may be implemented;
[0027] FIG. 2 illustrates an exemplary block diagram of a system for performing real-time dynamic enrichment of data in a network;
[0028] FIG. 3 illustrates an exemplary schematic diagram of the system for performing data enrichment;
[0029] FIG. 4A illustrates an exemplary schematic diagram depicting an operation of the system;
[0030] FIG. 4B illustrates a schematic flow diagram depicting the operation of the system for performing data enrichment; and
[0031] FIG. 5 illustrates an exemplary computer system in which or with which embodiments of the present disclosure may be implemented; and
[0032] FIG. 6 illustrates a schematic flow diagram representing a method of performing real-time dynamic enrichment of data in a network.
LIST OF REFERENCE NUMERALS
100 – Network architecture
102-1, 102-2…102-N – Users
104-1, 104-2…104-N – User Equipments
106 – Network
112 – Centralized server
200 – System
202 – Receiving Unit
204 – Memory
206 – Interface(s)
210 – Processing unit
220 – Database
252 – Data sources
254 – Destination system
256 – User interface
260 – Ingestion layer
262 – Normalization layer
262-1 to 262-N – Normalization sub-units
300 – Schematic diagram
400 – Schematic diagram
450 – Flow diagram
500 – Computer system
510 – External Storage Device
520 – Bus
530 – Main Memory
540 – Read Only Memory
550 – Mass Storage Device
560 – Communication Port
570 – Processor
DETAILED DESCRIPTION
[0033] In the following description, for the purposes of explanation, various specific details are set forth in order to provide a thorough understanding of embodiments of the present disclosure. It will be apparent, however, that embodiments of the present disclosure may be practiced without these specific details. Several features described hereafter can each be used independently of one another or with any combination of other features. An individual feature may not address all of the problems discussed above or might address only some of the problems discussed above. Some of the problems discussed above might not be fully addressed by any of the features described herein.
[0034] The ensuing description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the disclosure as set forth.
[0035] Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these
specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.
[0036] Also, it is noted that individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.
[0037] The word “exemplary” and/or “demonstrative” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “exemplary” and/or “demonstrative” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive—in a manner similar to the term “comprising” as an open transition word—without precluding any additional or other elements.
[0038] Reference throughout this specification to “one embodiment” or “an embodiment” or “an instance” or “one instance” means that a particular feature, structure, or characteristic described in connection with the embodiment is included
in at least one embodiment of the present disclosure. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[0039] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
[0040] In traditional data systems, the process of data enrichment is often burdensome, involving offline data retrieval, enrichment, and subsequent reintegration of the enriched data into the database or system. Additionally, modifying the enrichment policy typically necessitates code adjustments. To combat these challenges, the present system discloses a comprehensive solution that enables dynamic enrichment of multiple fields through the utilization of exposed Application Programming Interfaces (APIs) by a third-party system. The present system facilitates a wide range of operations, including arithmetic operations and date/time difference operations on existing attributes. Moreover, the present system empowers users to create new dynamically enriched fields by applying a variety of operations such as combination, substrings, splitting, prefix/postfix, match, and trim (both single and multiple) on existing fields. Upon inputting the post-enrichment data into the system, the present system dynamically enriches the data on the go, thus eliminating the requirement for offline data
processing and repetitive code modifications. This significantly enhances the efficiency and user-friendliness of the data enrichment process.
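By way of a non-limiting illustration, the following Python sketch shows how an enrichment requirement might be expressed as a declarative policy payload that a third-party system could submit over such an exposed API; the key names, field names, and policy identifier are assumptions of this sketch and do not form part of the disclosed system.

```python
"""
Illustrative only: a hypothetical JSON payload that a third-party system might
submit to an exposed enrichment-policy API. Key and field names are assumptions,
not part of the disclosed system.
"""
import json

# A declarative policy: changing the enrichment requires only re-provisioning
# this payload, not modifying or redeploying code.
enrichment_policy = {
    "policy_id": "concat-msisdn-region",
    "source_fields": ["msisdn", "region_code"],   # first and second field
    "operation": "concatenation",                 # could also be substring, split, trim, ...
    "separator": "-",
    "target_field": "subscriber_key",             # the dynamically derived third field
    "index_in_database": True,
}

# Serialise the policy exactly as it would travel over the exposed API.
print(json.dumps(enrichment_policy, indent=2))
```

Because the enrichment requirement travels as data rather than code, altering it amounts to re-provisioning the payload, which mirrors the elimination of repetitive code modifications described above.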
[0041] The various embodiments of the present disclosure will be explained in detail with reference to FIGS. 1 – 6.
[0042] FIG. 1 illustrates an exemplary network architecture (100) in which or with which embodiments of the present disclosure may be implemented. Referring to FIG. 1, the network architecture (100) may include one or more computing devices or user equipment (104-1, 104-2…104-N) associated with one or more users (102-1, 102-2…102-N) in an environment. A person of ordinary skill in the art will understand that the one or more users (102-1, 102-2…102-N) may be individually referred to as the user (102) and collectively referred to as the users (102). Similarly, a person of ordinary skill in the art will understand that the one or more user equipment (104-1, 104-2…104-N) may be individually referred to as the user equipment (104) and collectively referred to as the user equipment (104). A person of ordinary skill in the art will appreciate that the terms “computing device(s)” and “user equipment” may be used interchangeably throughout the disclosure. Although three user equipment (104) are depicted in FIG. 1, any number of user equipment (104) may be included without departing from the scope of the ongoing description.
[0043] In an embodiment, the user equipment (104) may include, but is not limited to, a handheld wireless communication device (e.g., a mobile phone, a smart phone, a phablet device, and so on), a wearable computer device (e.g., a head-mounted display computer device, a head-mounted camera device, a wristwatch computer device, and so on), a Global Positioning System (GPS) device, a laptop computer, a tablet computer, or another type of portable computer, a media playing device, a portable gaming system, and/or any other type of computer device with wireless communication capabilities, and the like. In an embodiment, the user equipment (104) may include, but is not limited to, any electrical, electronic, or electro-mechanical equipment, or a combination of one or more of the above devices, such as virtual reality (VR) devices, augmented reality (AR) devices, a laptop, a general-purpose computer, a desktop, a personal digital assistant, a tablet computer, a mainframe computer, or any other computing device, wherein the user equipment (104) may include one or more in-built or externally coupled accessories including, but not limited to, a visual aid device such as a camera, an audio aid, a microphone, a keyboard, and input devices for receiving input from the user (102) or the entity, such as a touch pad, a touch enabled screen, an electronic pen, and the like. A person of ordinary skill in the art will appreciate that the user equipment (104) may not be restricted to the mentioned devices and various other devices may be used.
[0044] Referring to FIG. 1, the user equipment (104) may communicate with a system (200), for example, a system for data enrichment, through a network (106). In an embodiment, the network (106) may include at least one of a Fifth Generation (5G) network, 6G network, or the like. The network (106) may enable the user equipment (104) to communicate with other devices in the network architecture (100) and/or with the system (200). The network (106) may include a wireless card or some other transceiver connection to facilitate this communication. In another embodiment, the network (106) may be implemented as, or include any of a variety of different communication technologies such as a wide area network (WAN), a local area network (LAN), a wireless network, a mobile network, a Virtual Private Network (VPN), the Internet, the Public Switched Telephone Network (PSTN), or the like.
[0045] In another exemplary embodiment, a centralized server (112) may include or comprise, by way of example but not limitation, one or more of: a stand-alone server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, or some combination thereof.
[0046] Although FIG. 1 shows exemplary components of the network architecture (100), in other embodiments, the network architecture (100) may include fewer components, different components, differently arranged components, or additional functional components than depicted in FIG. 1. Additionally, or alternatively, one or more components of the network architecture (100) may perform functions described as being performed by one or more other components of the network architecture (100).
[0047] FIG. 2 illustrates an exemplary block diagram of the system (200). The system (200) may include a receiving unit (202), a processing unit (210), and a memory (204) communicably coupled to the receiving unit (202).
[0048] The receiving unit (202) is configured to obtain data from one or more data sources. In an aspect, the one or more data sources may include databases, file systems, APIs, sensors, or any other systems capable of providing data. The obtained data includes a first field and a second field. Both the first field and the second field contain one or more attributes. Attributes represent specific characteristics or properties of the data. Therefore, within each field, there may be multiple attributes describing various aspects of the data.
[0049] The processing unit (210) is coupled to the receiving unit. The processing unit (210) is configured to perform at least one operation on the first field and the second field based on a data enrichment policy. The data enrichment policy is configured to outline rules and procedures for enhancing or augmenting the data. It specifies how the first field and the second field should be processed to derive additional information or create the enriched data. The processing unit (210) is configured to implement the data enrichment policy to derive dynamically a third field based on the first field and the second field. The third field is indicative of enriched data.
[0050] In an embodiment, to implement the data enrichment policy, the processing unit (210) is configured to apply the at least one operation selected from
arithmetical, string, concatenation, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
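As a hedged, non-limiting sketch of how the enumerated operations might be applied, the following Python snippet derives a third field from a first field and a second field; the function name, operation labels, and example record are purely illustrative assumptions.

```python
"""
A minimal sketch of applying one of the enumerated operations (arithmetical,
concatenation, substring, prefix, trim, ...) to derive a third field from a
first and a second field. All names are hypothetical.
"""
def derive_third_field(first, second, operation, **params):
    """Apply a single enrichment operation and return the derived value."""
    if operation == "arithmetical":       # e.g. sum of two numeric fields
        return float(first) + float(second)
    if operation == "concatenation":      # join the two fields with a separator
        return f"{first}{params.get('separator', '')}{second}"
    if operation == "substring":          # slice of the first field
        return str(first)[params["start"]:params["end"]]
    if operation == "prefix":             # prepend the second field to the first
        return f"{second}{first}"
    if operation == "trim":               # strip whitespace and re-join
        return f"{str(first).strip()} {str(second).strip()}"
    raise ValueError(f"unsupported operation: {operation}")

# Example record with a first field and a second field.
record = {"city": "Ahmedabad", "pincode": "380006"}
record["city_pincode"] = derive_third_field(
    record["city"], record["pincode"], "concatenation", separator="-"
)
print(record["city_pincode"])   # Ahmedabad-380006
```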
[0051] In an embodiment, the processing unit (210) is configured to index the derived third field in a database and generate an indexed third field.
[0052] In an embodiment, the processing unit (210) is configured to transmit the indexed third field to an external entity. The external entity is configured to perform one or more analytics on the indexed third field.
[0053] In an embodiment, the system includes a machine learning (ML) engine configured to receive enrichment and normalization information pertaining to the obtained data, and define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to derive the third field dynamically.
[0054] In an operative aspect, the machine learning (ML) engine is set up to receive information regarding data enrichment and normalization. This information may include details about the current techniques, algorithms, or rules used in the enrichment and normalization process. Upon receiving this information, the ML engine analyzes data for patterns, trends, or anomalies. Depending on the analysis, the ML engine may suggest adjustments to the data enrichment policy, such as refining existing rules, adding new rules, or removing ineffective ones. These adjustments are aimed at improving the quality and relevance of the enriched data. The ML engine updates the data enrichment policy to influence the dynamic derivation of the third field. This ensures that the third field, representing enriched data, reflects the most accurate and up-to-date information available. By integrating the ML engine into the system, the data processing capabilities are enhanced as it continuously learns from the enrichment and normalization processes. The system can adapt to changing data patterns and requirements by dynamically adjusting the data enrichment policy, improving the quality and relevance of the derived third field.
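A deliberately simplified, hypothetical stand-in for such an ML engine is sketched below: it consumes summary statistics about how the enrichment and normalization rules behave on incoming data and suggests removing or refining rules. The thresholds, rule names, and statistics are assumptions made only for illustration; a production engine would rely on richer learning techniques.

```python
"""
A simplified, hypothetical stand-in for the ML engine: it inspects
enrichment/normalization statistics and suggests policy adjustments.
Thresholds and rule names are assumptions for illustration only.
"""
def suggest_policy_updates(rule_stats, min_hit_rate=0.05, max_null_rate=0.20):
    """rule_stats: {rule_name: {"hit_rate": float, "null_rate": float}}."""
    suggestions = []
    for rule, stats in rule_stats.items():
        if stats["hit_rate"] < min_hit_rate:
            # The rule almost never fires on incoming data: candidate for removal.
            suggestions.append(("remove", rule))
        elif stats["null_rate"] > max_null_rate:
            # The derived field is frequently empty: candidate for refinement.
            suggestions.append(("refine", rule))
    return suggestions

observed = {
    "concat-msisdn-region": {"hit_rate": 0.92, "null_rate": 0.01},
    "split-legacy-code":    {"hit_rate": 0.01, "null_rate": 0.00},
    "date-diff-activation": {"hit_rate": 0.80, "null_rate": 0.35},
}
print(suggest_policy_updates(observed))
# [('remove', 'split-legacy-code'), ('refine', 'date-diff-activation')]
```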
[0055] In an embodiment, the system further includes a generative Artificial Intelligence (AI) module supplemented with the ML engine. The generative AI module is configured to generate new data or insights based on existing data patterns or models. Unlike traditional ML algorithms that are trained on labelled data to make predictions or classifications, generative AI focuses on creating new data samples that mimic the characteristics of the original dataset. By supplementing the ML engine with a generative AI module, the system gains additional data synthesis and augmentation capabilities. While the ML engine focuses on analyzing existing data and making predictions or recommendations, the generative AI module can generate synthetic data points or scenarios based on the learned patterns. The generative AI module can assist in the data enrichment and normalization process by generating additional data samples that complement the existing dataset. This could involve generating synthetic attributes, simulating missing data points, or creating new data distributions to augment the original dataset.
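The following toy, standard-library-only sketch illustrates the idea of generating synthetic samples that mimic an existing attribute's distribution; it is not the generative AI module itself, and the attribute name and values are invented for the example.

```python
"""
A toy, stdlib-only proxy for the generative step: learn the distribution of an
existing numeric attribute and synthesise new samples that mimic it. A real
generative AI module would be far richer; this only illustrates the idea.
"""
import random
import statistics

def synthesise(values, n_samples, seed=42):
    """Fit a normal distribution to `values` and draw synthetic samples."""
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    rng = random.Random(seed)
    return [round(rng.gauss(mu, sigma), 2) for _ in range(n_samples)]

observed_latency_ms = [12.1, 11.8, 13.0, 12.4, 12.9, 11.5]
print(synthesise(observed_latency_ms, n_samples=5))
```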
[0056] The processing unit (210) may be implemented as one or more microprocessors, microcomputers, microcontrollers, edge or fog microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that process data based on operational instructions. Among other
capabilities, the processing unit (210) may be configured to fetch and execute computer-readable instructions stored in the memory (204) of the system (200). The memory (204) may be configured to store one or more computer-readable instructions or routines in a non-transitory computer-readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory (204) may include any non-transitory storage device including, for example, volatile memory such as Random-Access Memory (RAM), or non-volatile memory such as Erasable Programmable Read-Only Memory (EPROM), flash memory, and the like.
[0057] In an embodiment, the system (200) may include an interface(s) (206). The interface(s) (206) may include a variety of interfaces, for example, interfaces for data input and output devices, referred to as I/O devices, storage devices, and the like. The interface(s) (206) may facilitate communication of the system (200). The interface(s) (206) may also provide a communication pathway for one or more components of the system (200). Examples of such components include, but are not limited to, the processing unit (210) and a database (220).
[0058] The processing unit (210) may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing unit (210). In examples described herein, such combinations of hardware and programming may be implemented in several diverse ways. For example, the programming for the processing unit (210) may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processing unit (210) may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the machine-readable storage medium may store instructions that, when executed by the processing resource, implement the processing unit (210). In such examples, the system (200) may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separate but accessible to the system (200) and the processing resource. In other examples, the processing unit (210) may be implemented by an electronic circuitry.
[0059] For example, the first field and the second field may comprise one or more attributes. In the context of data, a field refers to a single piece of data within a record or dataset. Fields are typically organized into columns in a tabular format,
with each column representing a different attribute or characteristic of the data. Attributes represent properties, characteristics, or features of an entity or object. In a dataset, attributes are typically represented by individual fields or combinations of fields that convey specific information about the data. Further, the processing unit (210) may be configured to define a data enrichment policy on a normalizer module stored in the memory (204). The processing unit (210) may also be configured to perform at least one of arithmetical operation and a string operation on the first field and the second field. Furthermore, the processing unit (210) may be configured to implement the defined data enrichment policy to dynamically derive, in real time, a third field based on the first field and the second field. Moreover, the processing unit (210) may be configured to index the derived third field in a database (220).
[0060] In an embodiment, the processing unit (210), to implement the data enrichment policy, may be configured to apply at least one of combination, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
[0061] In an embodiment, the processing unit (210) may be configured to transmit the indexed third field to an external entity, wherein the external entity is configured to perform one or more analytics on the indexed third field.
[0062] In an embodiment, the system (200) may comprise a Machine Learning (ML) engine which is configured to receive enrichment and normalization information pertaining to the data. Further, the ML engine may be configured to define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to dynamically derive, in real time, the third field.
[0063] In an embodiment, the system (200) may comprise a generative Artificial Intelligence (AI) module supplemented with the ML engine. For example, the generative AI module may be configured to learn and be trained on new data.
[0064] FIG. 3 illustrates an exemplary schematic diagram (300) of the system (200) for performing real-time dynamic enrichment of data in a network. The system (200) may be configured to receive and/or acquire data from external data sources (252). The data may be used to troubleshoot one or more faults in the communications network. The data may then be transmitted, from the system (200) to be processed by destination system or systems (254), such as macro-service engines, or other service engines for analysis and reporting. For example, the data may include, without limitations, fault management data, performance management data, configuration data, call data records, informatics data, log data, inventory data, etc. The system (200) further includes an ingestion layer (260) configured to receive the data from the sources (252). The ingestion layer (260) may be configured to acquire pertinent data from the data sources (252) based on a request provided to the system (200). The request may depend on a pre-configuration of the system (200).
[0065] The system (200) may further include a normalization layer (262) that is configured to receive the data from the ingestion layer (260). Particularly, the normalization layer (262) may be configured to collate the data received from the ingestion layer (260), and then transmit the collated data from the system (200) to the destination system (254). In some embodiments, the ingestion layer (260) may transmit the data to a storage unit, such as a data lake. In some embodiments, the normalization layer (262) may include one or more normalization sub-units (such as normalization sub-units 262-1, 262-2…262-N, shown in FIG. 3).
[0066] Furthermore, the normalization layer (262) may be configured to receive inputs relating to enrichment policy from a user interface (256).
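The sketch below illustrates, under assumed class and method names, how a normalization layer might collate records received from the ingestion layer, accept an enrichment policy provisioned at runtime (for example, from the user interface or an exposed API), and forward the collated, enriched records to a destination system.

```python
"""
An illustrative sketch (not the claimed implementation) of a normalization
layer that collates records from an ingestion layer, accepts a runtime-
provisioned enrichment policy, and forwards results to a destination system.
Class, method, and field names are hypothetical.
"""
class NormalizationLayer:
    def __init__(self):
        self.policy = None      # provisioned at runtime, e.g. from the UI/API
        self.buffer = []        # collated records awaiting forwarding

    def provision_policy(self, policy):
        """Accept or replace the enrichment policy without any code change."""
        self.policy = policy

    def ingest(self, record):
        """Receive one record from the ingestion layer, enrich it, and collate it."""
        if self.policy:
            sep = self.policy.get("separator", "")
            record[self.policy["target_field"]] = sep.join(
                str(record[f]) for f in self.policy["source_fields"]
            )
        self.buffer.append(record)

    def flush(self, destination):
        """Transmit the collated (and enriched) records to a destination system."""
        destination.extend(self.buffer)
        self.buffer.clear()

layer = NormalizationLayer()
layer.provision_policy({"source_fields": ["alarm_id", "site"],
                        "target_field": "alarm_key", "separator": ":"})
layer.ingest({"alarm_id": "LNK-17", "site": "GJ-AMD-001"})
destination_system = []
layer.flush(destination_system)
print(destination_system)
```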
[0067] FIG. 4A illustrates an exemplary schematic diagram (400) depicting an operation of the system (200) for data enrichment. The system may include the processing unit (210) and a memory (204) coupled to the processing unit (210). For example, the memory (204) may include computer-implemented instructions. The computer-implemented instructions may configure the processing unit (210) to perform the steps as involved in the present disclosure.
[0068] In operation, the processing unit (210) may be configured to receive (at step 402), from one or more data sources (252), data including a first field and a second field. At step (402), the system is configured to perform data ingestion via file systems or streams. Data ingestion via file systems or streams refers to the process of collecting and importing data from external sources into a data processing or storage system. This can be done through two primary methods, both of which are sketched in the illustrative example after this list:
• File Systems: In this method, data is ingested from files stored in file systems such as local file systems, network file systems (NFS), or distributed file systems. Data can be ingested by reading files directly from the file system and then processing them. This approach is suitable for batch processing scenarios where data is collected and processed in discrete chunks.
• Streams: In stream-based data ingestion, data is continuously ingested from real-time data streams such as message queues, event logs, or IoT device telemetry. These streams of data are processed as they are received, enabling real-time analytics, monitoring, or other applications.
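A minimal Python sketch of these two ingestion paths is shown below; the file format (newline-delimited JSON), field names, and the list standing in for a live stream are assumptions of the sketch.

```python
"""
A minimal sketch of the two ingestion paths: the file path reads newline-
delimited JSON from disk; the stream path consumes records from any iterable
(a stand-in for a message queue or telemetry stream). Names are illustrative.
"""
import json
import os
import tempfile

def ingest_from_file(path):
    """Batch ingestion: read and parse every line of an NDJSON file."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

def ingest_from_stream(stream):
    """Streaming ingestion: yield records as they arrive."""
    for message in stream:
        yield json.loads(message)

# File-system method.
tmp = tempfile.NamedTemporaryFile("w", suffix=".ndjson", delete=False)
tmp.write('{"kpi": "rrc_setup", "value": 98.7}\n')
tmp.close()
print(ingest_from_file(tmp.name))
os.remove(tmp.name)

# Stream method (a list stands in for a live queue).
for rec in ingest_from_stream(['{"kpi": "handover", "value": 97.1}']):
    print(rec)
```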
[0069] At step (404), the system is configured to provide the enrichment policy to the normalization layer. In an example, the enrichment policy is a set of guidelines or rules that dictate how data should be augmented or enriched before it is normalized. These rules can vary based on the specific requirements of the data processing pipeline and the nature of the data being processed. For example, the enrichment policy might specify that missing or incomplete data should be supplemented with additional information from external sources before normalization. The normalization layer is configured to standardize the format, structure, or values of the data to ensure consistency and compatibility across different datasets or systems. Normalization could involve tasks such as converting data to a common data model, standardizing units of measurement, or scaling numerical values to a consistent range. In an aspect, at step (404), the following steps may be performed by the system:
• Filling in missing data with values obtained from external sources.
• Enhancing data with additional contextual information, such as demographic data or historical trends.
• Standardizing or cleaning data to ensure consistency and accuracy.
• Identifying and handling outliers or anomalies in the data.
• Normalizing data formats, units, or scales to ensure uniformity across datasets.
[0070] In an aspect, by applying an enrichment policy before normalization, the system ensures that the data being processed is as complete, accurate, and standardized as possible, which improves the quality and reliability of downstream analytics or applications.
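As a non-limiting illustration of this enrich-then-normalize ordering, the sketch below fills a missing attribute from a hypothetical external lookup and then standardizes units and formats; the lookup table, field names, and units are assumptions of the sketch.

```python
"""
A non-limiting sketch of "enrich, then normalize": missing values are filled
from a hypothetical external lookup, after which units and formats are
standardized. Field names and the lookup are illustrative only.
"""
# Hypothetical external source used to fill gaps before normalization.
REGION_LOOKUP = {"380006": "Gujarat"}

def enrich(record):
    """Fill missing/contextual attributes as the enrichment policy dictates."""
    if not record.get("region"):
        record["region"] = REGION_LOOKUP.get(record.get("pincode"), "unknown")
    return record

def normalize(record):
    """Standardize units and formats so downstream systems see consistent data."""
    record["throughput_mbps"] = round(record.pop("throughput_kbps") / 1000, 3)
    record["region"] = record["region"].upper()
    return record

raw = {"pincode": "380006", "region": None, "throughput_kbps": 14250}
print(normalize(enrich(raw)))
# {'pincode': '380006', 'region': 'GUJARAT', 'throughput_mbps': 14.25}
```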
[0071] At step (406), the normalization layer is connected to an external entity that requires enriched data and the normalization layer is configured to provide the normalized data (enriched data) to the external entity. The normalization layer standardizes the enriched data, ensuring consistency and compatibility with the requirements of both internal systems and the external entity. Once the data has been normalized, it is ready to be transmitted to the external entity that requires enriched data. The data may be sent through APIs, messaging systems, or other communication channels supported by the external entity. By connecting the normalization layer to the external entity requiring enriched data, the system ensures that the data transmitted meets the entity's requirements while also benefiting from the enhanced quality and context provided by the enrichment process.
[0072] At step (408), the normalization layer (262) is connected to the database (220) and the database serves as a persistent storage mechanism for the normalized data. In an example, the database (220) may be a relational database management system (RDBMS) like MySQL, PostgreSQL, or SQL Server, or a NoSQL database like MongoDB, Cassandra, or DynamoDB, depending on the specific requirements of the application. In an aspect, the step (408) of connecting the normalization layer to the database involves:
1. Data Transformation: The normalized data from the normalization layer needs to be transformed into a format that is compatible with the database
schema. This might involve mapping the fields and attributes of the normalized data to corresponding tables and columns in the database.
2. Data Loading: Once the data has been transformed, it is loaded into the database. This could be done through database-specific mechanisms such as SQL INSERT statements, bulk data loading tools, or using database connectors provided by the normalization layer.
3. Data Integrity: Ensuring data integrity is crucial during the loading process. This involves enforcing constraints, validating data against predefined rules, and handling errors or exceptions that may occur during data loading.
4. Optimization: Depending on the volume and frequency of data being processed, optimization techniques such as indexing, partitioning, and caching may be applied to improve the performance of data loading and retrieval operations.
[0073] In an aspect, by connecting the normalization layer to the database, the system maintains a consistent and structured approach to data management, ensuring that the data stored in the database is accurate, reliable, and ready for use in various applications and analytics processes.
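The following self-contained sketch uses SQLite purely for illustration of the transform, load, and index steps described above; the table, column, and index names are assumptions, and a deployment might instead target MySQL, PostgreSQL, MongoDB, or another store as noted earlier.

```python
"""
A self-contained sketch, using SQLite only for illustration, of the
transform -> load -> index flow. Table and column names are assumptions.
"""
import sqlite3

normalized_records = [
    {"subscriber_key": "Ahmedabad-380006", "throughput_mbps": 14.25},
    {"subscriber_key": "Mumbai-400001",    "throughput_mbps": 21.80},
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE enriched_data (
                    subscriber_key TEXT,
                    throughput_mbps REAL)""")

# Data loading: map each normalized record onto the table's columns.
conn.executemany(
    "INSERT INTO enriched_data (subscriber_key, throughput_mbps) VALUES (?, ?)",
    [(r["subscriber_key"], r["throughput_mbps"]) for r in normalized_records],
)

# Optimization: index the derived (third) field for fast retrieval.
conn.execute("CREATE INDEX idx_subscriber_key ON enriched_data (subscriber_key)")
conn.commit()

print(conn.execute(
    "SELECT * FROM enriched_data WHERE subscriber_key = ?", ("Mumbai-400001",)
).fetchall())
conn.close()
```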
[0074] FIG. 4B illustrates a schematic flow diagram (450) depicting the operation of the system (200) for data enrichment. Referring now to FIGS. 4A and 4B, the system (200) may enrich data in real-time and may directly index it in the database (220). Typically, data enrichment involves offline processing on stored or indexed data, or repetitive code changes, which can be a tedious and inefficient process. A capacity for real-time enrichment may be particularly crucial when there is a requirement to send enriched data to multiple nodes or services, such as databases and other analytic tools, in real time. By enriching the data in real time, the system ensures that the most up-to-date and enriched information is readily available for immediate use across various systems and services.
[0075] The system (200) may be capable of enriching the data by performing arithmetical and string operations on two or more fields and may create other fields in real-time, such that it may index such a field in the database (220) or provide it to an external entity for further analytics. Data enrichment may involve a process of policy provisioning through the user interface (256) from exposed application program interfaces (APIs) for the enrichment requirement, which may include operation(s) required like substring, concatenation, split, etc. Once the policy is provisioned, the normalization layer (262) may enrich the stream data dynamically and may further index it to a data lake. The normalization layer (262) may provide the enriched data in the form of files/streams to any external entity for further analytics.
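A hedged sketch of this streaming behaviour is given below: records are enriched on the fly according to a provisioned policy, appended to a simple file that stands in for the data lake, and handed onward for further analytics. The policy contents, field names, and file path are illustrative assumptions.

```python
"""
An illustrative generator pipeline: records stream in, are enriched on the fly
according to the provisioned policy, are appended to a file standing in for a
data lake, and are handed onward for analytics. All names are hypothetical.
"""
import json

def enrich_stream(records, policy, lake_path):
    """Enrich each record in real time and index it to a simple file-based lake."""
    with open(lake_path, "a", encoding="utf-8") as lake:
        for record in records:
            sep = policy.get("separator", "")
            record[policy["target_field"]] = sep.join(
                str(record[f]) for f in policy["source_fields"])
            lake.write(json.dumps(record) + "\n")   # "index" into the data lake
            yield record                            # hand over for analytics

policy = {"source_fields": ["cell_id", "band"],
          "target_field": "cell_band", "separator": "/"}
incoming = [{"cell_id": "C101", "band": "n78"}, {"cell_id": "C102", "band": "n28"}]
for enriched in enrich_stream(incoming, policy, "enriched_lake.ndjson"):
    print(enriched)
```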
[0076] The schematic flow diagram (450) depicting the operation of the system (200) may involve various steps, including receiving, from one or more data sources, data including a first field and a second field. For example, the first field and the second field may comprise one or more attributes. Further, a data enrichment policy may be defined on a normalizer module stored in the memory (at step 452). In an example, the policy defines rules and guidelines for enriching data with additional information. These guidelines may specify what types of supplementary data should be added, where to obtain this data from, and how it should be integrated with the existing dataset. Furthermore, at least one of arithmetical operation and a string operation may be performed on the first field and the second field. In addition, the defined data enrichment policy may be implemented to dynamically derive, in real time, a third field based on the first field and the second field. Moreover, the derived third field may be indexed in a database.
[0077] At step (454), the data is pushed into the normalization layer via the file system method or the stream system method. With the file system method, data is pushed into the normalization layer by writing it to files stored in a file system. These files could be located in a local file system, a network file system (NFS), or a distributed file system. The normalization layer then reads these files from the file system, processes them, and applies normalization procedures as necessary. In the
stream system method, the data can be pushed into the normalization layer via a streaming data source. In this method, data is continuously pushed in real-time or near-real-time through data streams.
[0078] At step (456), the incoming data is enriched as per the enrichment policy. In an aspect, step (456) includes enriching incoming data as per the enrichment policy, which involves augmenting or enhancing the data with additional information based on predefined rules or guidelines.
[0079] At step (458), the enriched data is transmitted to the external entity requiring enriched data or to a data lake. In an aspect, step (458) includes steps of initiating the transmission of enriched data to the external entity or the data lake using the selected protocol or mechanism. Step (458) also involves securely transferring the data over the network to the designated endpoint.
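The sketch below illustrates step (458) under stated assumptions: one helper would POST an enriched record to an external analytics entity (the endpoint URL is hypothetical, so the demo does not invoke it), while the other appends the record to a newline-delimited file standing in for the data lake.

```python
"""
A hedged sketch of step (458): enriched records are either POSTed to an
external analytics entity or appended to a data-lake file. The endpoint URL is
purely hypothetical, so the demo below exercises only the file path.
"""
import json
import urllib.request

def send_to_external_entity(record, url):
    """Transmit one enriched record over HTTP (endpoint is an assumption)."""
    req = urllib.request.Request(
        url, data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req) as resp:   # not called in the demo
        return resp.status

def send_to_data_lake(record, path):
    """Append one enriched record to a newline-delimited data-lake file."""
    with open(path, "a", encoding="utf-8") as lake:
        lake.write(json.dumps(record) + "\n")

enriched = {"cell_band": "C101/n78", "throughput_mbps": 14.25}
send_to_data_lake(enriched, "analytics_feed.ndjson")
print("queued for analytics:", enriched)
```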
[0080] In an embodiment, the data enrichment policy may be implemented by applying at least one of combination, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
[0081] In an embodiment, the indexed third field may be transmitted to an external entity, wherein the external entity is configured to perform one or more analytics on the indexed third field.
[0082] In an embodiment, a Machine Learning (ML) engine may be implemented to receive enrichment and normalization information pertaining to the data, and define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to dynamically derive, in real-time, the third field.
[0083] An advantage of the system (200) may be that normalization is performed dynamically as it adapts to the incoming stream of data. The dynamic rules and policies may be applied in real-time to incoming data that is then normalized and enriched. Further, the system (200) may include an AI/ML engine that may
be configured to receive the enrichment and normalization information. Based on the received enrichment and normalization information, the AI/ML engine may begin to suggest new or different fields, or further enrichment and normalization. The suggestion may further be supplemented by a generative AI that may be configured within the AI/ML engine. The generative AI may be configured to learn and be trained on new data sets.
[0084] FIG. 5 illustrates an exemplary computer system (500) in which or with which embodiments of the present disclosure may be implemented. The computer system (500) may include an external storage device (510), a bus (520), a main memory (530), a read-only memory (540), a mass storage device (550), a communication port(s) (560), and a processor (570). A person skilled in the art will appreciate that the computer system (500) may include more than one processor and communication ports. The processor (570) may include various modules associated with embodiments of the present disclosure. The communication port(s) (560) may be any of an RS-232 port for use with a modem-based dialup connection, a 10/100 Ethernet port, a Gigabit or 10 Gigabit port using copper or fiber, a serial port, a parallel port, or other existing or future ports. The communication port(s) (560) may be chosen depending on a network, such as a Local Area Network (LAN), Wide Area Network (WAN), or any network to which the computer system (500) connects.
[0085] In an embodiment, the main memory (530) may be Random Access Memory (RAM), or any other dynamic storage device commonly known in the art. The read-only memory (540) may be any static storage device(s) e.g., but not limited to, a Programmable Read Only Memory (PROM) chip for storing static information e.g., start-up or basic input/output system (BIOS) instructions for the processor (570). The mass storage device (550) may be any current or future mass storage solution, which can be used to store information and/or instructions. Exemplary mass storage solutions include, but are not limited to, Parallel Advanced Technology Attachment (PATA) or Serial Advanced Technology Attachment
(SATA) hard disk drives or solid-state drives (internal or external, e.g., having Universal Serial Bus (USB) and/or Firewire interfaces).
[0086] In an embodiment, the bus (520) may communicatively couple the processor(s) (570) with the other memory, storage, and communication blocks. The bus (520) may be, e.g., a Peripheral Component Interconnect (PCI) / PCI Extended (PCI-X) bus, Small Computer System Interface (SCSI), USB, or the like, for connecting expansion cards, drives, and other subsystems as well as other buses, such as a front side bus (FSB), which connects the processor (570) to the computer system (500).
[0087] In another embodiment, operator, and administrative interfaces, e.g., a display, keyboard, and cursor control device may also be coupled to the bus (520) to support direct operator interaction with the computer system (500). Other operator and administrative interfaces can be provided through network connections connected through the communication port(s) (560). The components described above are meant only to exemplify various possibilities. In no way should the aforementioned exemplary computer system (500) limit the scope of the present disclosure.
[0088] FIG. 6 illustrates a schematic flow diagram representing a method (600) of performing real-time dynamic enrichment of data in a network.
[0089] At step (602), the method includes obtaining, by a receiving unit, data from one or more data sources. The data includes a first field and a second field. The first field and the second field comprise one or more attributes.
[0090] At step (604), the method includes performing, by a processing unit, at least one operation on the first field and the second field based on a data enrichment policy.
[0091] At step (606), the method includes implementing, by the processing unit, the data enrichment policy to derive dynamically a third field based on the first field and the second field, wherein the third field is indicative of enriched data.
[0092] In an embodiment, implementing the data enrichment policy comprises applying, by the processing unit, at least one operation selected from arithmetical, string, concatenation, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
[0093] In an embodiment, the method further comprises indexing, by the processing unit, the derived third field in a database and generating an indexed third field.
[0094] In an embodiment, the method further includes transmitting, by the processing unit, the indexed third field to an external entity, wherein the external entity is configured to perform one or more analytics on the indexed third field.
[0095] In an embodiment, the method further includes implementing, by the processing unit, a Machine Learning (ML) engine to receive enrichment and normalization information pertaining to the obtained data and define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to derive the third field dynamically.
[0096] In an embodiment, the method further includes supplementing, by the processing unit, the ML engine with a generative Artificial Intelligence (AI) module.
[0097] The present disclosure discloses a user equipment (UE) communicatively coupled with a network. The coupling comprises steps of receiving a connection request, sending an acknowledgment of the connection request to the network, and transmitting a plurality of signals in response to the connection request, wherein real-time dynamic enrichment of data in a network is performed by the system (200). The system includes the receiving unit and the
processing unit. The receiving unit is configured to obtain data from one or more data sources. The data includes the first field and the second field. The first field and the second field comprise one or more attributes. The processing unit is coupled to the receiving unit. The processing unit is configured to perform at least one operation on the first field and the second field based on the data enrichment policy and implement the data enrichment policy to derive dynamically the third field based on the first field and the second field. The third field is indicative of the enriched data.
[0098] The present disclosure offers technical advancements in the field of data enrichment. This advancement overcomes the limitations of existing solutions by enriching data in real-time. The disclosure involves enriching data in real-time by performing arithmetic and string operations on two or more fields, creating another field based on two or more fields, and indexing that field in the database or providing it to an external system for further analytics. This ensures that the most up-to-date and enriched information is readily available for immediate use across various systems and services.
[0099] While considerable emphasis has been placed herein on the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the disclosure. These and other changes in the preferred embodiments of the disclosure will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the disclosure and not as a limitation.
ADVANTAGES OF INVENTION
[00100] The present invention provides a system for real-time enrichment of data.
[00101] The present invention provides a system where enrichment policies may be provided in real-time.
[00102] The present invention provides a system for enrichment of data that is simple and quick.
We Claim:
1. A system (200) for performing real-time dynamic enrichment of data in a
network, the system (200) comprising:
a receiving unit (202) configured to obtain data from one or more data sources (252), the data including a first field and a second field; and a processing unit (210) coupled to the receiving unit (202) to:
perform at least one operation on the first field and the second field based on a data enrichment policy; and
implement the data enrichment policy to derive dynamically, a third field based on the first field and the second field, wherein the third field is indicative of an enriched data.
2. The system (200) as claimed in claim 1, wherein, to implement the data enrichment policy, the processing unit (210) is configured to apply the at least one operation selected from arithmetical, string, concatenation, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
3. The system (200) as claimed in claim 1, wherein the processing unit (210) is configured to index the derived third field in a database (220) and generate an indexed third field.
4. The system (200) as claimed in claim 1, wherein the processing unit (210) is configured to transmit the indexed third field to an external entity, wherein the external entity is configured to perform one or more analytics on the indexed third field.
5. The system (200) as claimed in claim 1, includes a machine learning (ML) engine configured to:
receive enrichment and normalization information pertaining to the obtained data; and
define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to dynamically derive the third field.
6. The system (200) as claimed in claim 5, further includes a generative Artificial Intelligence (AI) module supplemented with the ML engine.
7. A method (600) of performing real-time dynamic enrichment of data in a network, the method (600) comprising:
obtaining (602), by a receiving unit (202), data from one or more data sources (252), the data including a first field and a second field;
performing (604), by a processing unit (210), at least one operation on the first field and the second field based on a data enrichment policy; and
implementing (606), by the processing unit (210), the data enrichment policy to derive dynamically a third field based on the first field and the second field, wherein the third field is indicative of an enriched data.
8. The method (600) as claimed in claim 7, wherein implementing the data
enrichment policy comprises:
applying, by the processing unit (210), the at least one operation selected from arithmetical, string, concatenation, substrings, splitting, prefix/postfix, match, trim operations (single/multiple) on the first field and the second field.
9. The method (600) as claimed in claim 7, further comprising indexing, by the processing unit (210), the derived third field in a database and generating an indexed third field.
10. The method (600) as claimed in claim 7, further comprising transmitting, by the processing unit (210), the indexed third field to an external entity, wherein the
external entity is configured to perform one or more analytics on the indexed third field.
11. The method (600) as claimed in claim 7, further comprising implementing,
by the processing unit (210), a Machine Learning (ML) engine to:
receive enrichment and normalization information pertaining to the obtained data; and
define, based on the received enrichment and normalization information, one of modification and updation in the data enrichment policy to dynamically derive the third field.
12. The method (600) as claimed in claim 11, further comprising supplementing, by the processing unit (210), the ML engine with a generative Artificial Intelligence (AI) module.
13. A user equipment (UE) communicatively coupled with a network (106), said coupling comprises steps of:
receiving a connection request;
sending an acknowledgment of the connection request to the network (106); and
transmitting a plurality of signals in response to the connection request, wherein a system in the network (106) is configured to perform real-time dynamic enrichment of data as claimed in claim 1.