Abstract: A DATA PLATFORM FOR CREATING A STRUCTURED ADAPTIVE DATA PACKAGE AND A METHOD THEREOF A data platform (300) and a method (200) for creating an Adaptive Data Package (500) from data sources (303a, 303b) mapped to structured federated databases (305a, 305b) are disclosed. The platform (300) includes a user interface (350), a data source generator (335), a query engine (309), and a Machine Learning-Based Query Processor (390) for real-time query execution. The Adaptive Data Package (500) integrates metadata, query lineage, and access controls, and is enhanced by a Data Package Naming Assistant (320) and an integrator (I) for mapping technical fields to business glossary terms. Embodiments include modules (301c, 300f, 302b, 301e) for previewing, versioning, storage, and submission to a data marketplace (575), respectively. The invention enables reusable, real-time, and compliant data assets, supporting scalable and user-driven customization across cloud, hybrid, and on-premises environments.
Description:FIELD OF THE INVENTION
[0001] The present invention relates to a data package. More specifically, the present invention relates to data packages created from multiple data sources, each mapped to a corresponding structured database. The invention provides a data platform and a method for creating a structured adaptive data package.
BACKGROUND OF THE INVENTION:
[0002] A data package is a structured unit containing data, metadata, schema definitions, data lineage, and business glossary mappings. Included components typically consist of datasets, technical specifications, access rules, and documentation for interpretation and reuse. Designed for modularity and interoperability, a data package enables consistent data delivery, version control, and integration across systems. Functions supported include query execution, compliance enforcement, and scalable analytics, ensuring traceability, reusability, and governance. In some cases, data packages are referred to as “data products”.
[0003] A data source is the origin from which data is collected, such as applications, sensors, APIs, files, or external systems. These data sources are often mapped to structured databases, which are systems that store data in a predefined format using tables, rows, and columns—examples include MySQL, PostgreSQL, and Oracle. Structured databases enable efficient querying and management of well-organized data. Data packages are generated by extracting data from these sources, transforming it as needed (e.g., filtering, aggregating, enriching), and bundling it with metadata, schema definitions, and documentation. The mapping between data sources and structured databases ensures that data is consistently stored and accessible for packaging. This relationship allows data packages to be built with traceable lineage and structured content, making them suitable for analytics, reporting, and integration across systems.
[0004] Traditional data packages are typically generated through batch processing, where data is collected, transformed, and bundled at fixed intervals—daily, hourly, or even less frequently. This model introduces a time lag between data generation and data availability, meaning the contents of a data package may already be outdated by the time it is accessed. In operational environments such as logistics, fraud detection, or customer support, this delay can result in decisions being made on stale data, leading to missed opportunities, delayed responses, or incorrect actions.
[0005] The issue is further exacerbated in applications that require event-driven or real-time responsiveness, such as financial trading platforms, real-time inventory systems, or personalized digital experiences. In these scenarios, static data packages fail to capture rapidly changing conditions, such as price fluctuations, user behaviour, or system anomalies. Additionally, the infrastructure supporting traditional data packages often lacks mechanisms for continuous data ingestion, low-latency processing, or real-time validation, making it difficult to maintain data freshness. This disconnect between data velocity and data packaging leads to reduced situational awareness and diminished decision accuracy in time-sensitive contexts.
[0006] In many enterprise environments, data packages are frequently created in isolation by different teams or departments, each tailoring the structure, schema, and logic to their specific needs. This lack of coordination leads to redundant data packages that serve similar purposes but differ in format, naming conventions, and metadata. As these inconsistencies accumulate, it becomes increasingly difficult to align data across systems, resulting in fragmented data ecosystems where integration and interoperability are severely limited.
[0007] Moreover, the absence of standardized design and reuse mechanisms makes it challenging to scale data operations. When each new use case requires building a data package from scratch, development cycles become longer, and maintenance overhead increases. This also leads to storage inefficiencies, as similar datasets are duplicated across environments. In large-scale data platforms, this fragmentation hinders the ability to manage data lineage, enforce governance, and ensure consistency across analytical and operational workflows, ultimately limiting the scalability and agility of the entire data infrastructure.
[0008] The patent US8364636B2 introduces a hybrid data replication method that combines synchronous and asynchronous replication to ensure real-time data consistency across distributed systems. This inventive approach allows critical updates to be propagated instantly while handling bulk updates in the background, minimizing latency. It effectively addresses the challenge of real-time data availability in dynamic environments like finance or operations, where timely access to fresh data is essential. However, the invention is primarily focused on database-level replication and does not consider the modular structure and user-driven customization that are essential for building reusable and scalable data packages.
[0009] US20240220876A1 and US10620923B2 provide methods for data product (comprising data packages) creation. However, these methods fail to address the problems of real-time data needs, reusability, and scalability. Further, the methods do not facilitate user-driven customization in data package or data product creation.
[0010] Currently existing data platforms, systems and methods fail to address the problem of real time data needs, reusability, scalability and user-driven customization in the data package or the data product creation.
[0011] In data platforms where data packages are created from data sources mapped to structured databases, significant challenges arise related to real-time responsiveness and scalable reuse. Structured databases, while efficient for storing and querying well-defined data, often operate on batch ingestion and transformation cycles, resulting in data packages that reflect outdated snapshots rather than current operational states. This delay is problematic in time-sensitive applications such as fraud detection or inventory management, where decisions must be based on live data. Additionally, when multiple teams independently map similar data sources to structured databases without standardized schemas or reuse frameworks, it leads to redundant data packages with inconsistent logic and metadata. This fragmentation increases storage overhead, complicates governance, and limits the ability to scale data operations efficiently across the organization.
[0012] There is a need for a data platform and a method for generating data packages from structured data stored in structured databases, wherein the structured databases are mapped to corresponding data sources. Such a platform and method are required to overcome the limitations of existing systems, particularly with respect to real-time data availability, reusability, and scalability of data packages.
OBJECTS OF THE INVENTION:
[0013] An object of the present invention is to provide a data platform and method for generating (structured) data packages from structured databases mapped to data sources.
[0014] One more object of the present invention is to enable the creation of reusable, self-contained data assets that integrate real-time data with embedded query and access control mechanisms.
[0015] One more object of the present invention is to allow authorized users to independently discover, understand, and utilize data packages without technical dependency.
[0016] One more object of the present invention is to ensure compliance with organizational policies and standards through built-in governance and metadata controls.
[0017] One more object of the present invention is to support scalable and consistent data package creation across diverse use cases through modular design and user-driven customization.
SUMMARY OF THE INVENTION:
[0018] The present invention addresses the limitations of traditional data packaging systems, particularly the lack of real-time responsiveness and the inability to support scalable, reusable data assets. Existing systems rely heavily on batch processing and isolated development practices, resulting in outdated data packages and fragmented data ecosystems. The invention provides a data platform and method for generating Structured Adaptive Data Packages (SADPs) that integrate real-time data, support user-driven customization, and ensure compliance with governance standards.
[0019] In one embodiment, the data platform comprises a user interface, a data source generator, and a query engine configured to dynamically create and map data sources to structured federated databases. The platform enables users to define data source parameters, execute queries, and generate adaptive data packages enriched with metadata, access controls, and compliance attributes. The Machine Learning-Based Query Processor (MLBQ) assists in query creation using natural language inputs, while the Data Package Naming Assistant (DPNA) suggests contextually relevant names and descriptions. The integrator module links technical fields to business glossary terms, enhancing semantic clarity and usability.
[0020] Multiple embodiments of the platform support modular deployment and specialized functions. For example, a storage module caches frequently accessed data to improve performance, a preview module allows real-time validation of data packages, and a submission module facilitates approval workflows for publishing in data marketplaces. Versioning support enables iterative refinement of data packages, while cloud-native architecture ensures scalability and interoperability. Collectively, these embodiments provide a robust, flexible, and intelligent system for creating reusable, real-time data assets across diverse enterprise environments.
BRIEF DESCRIPTION OF DRAWINGS:
[0021] Figure 1 shows a block diagram of a data platform for creating an adaptive Data Package in accordance with the present invention;
[0022] Figure 2, figure 3a, figure 3b, figure 4, figure 5, figure 6, figure 7, figure 8 show schematic diagrams of various alternative embodiments of the data platform shown in figure 1; and
[0023] Figure 9 shows a flow chart of a method for creating an Adaptive Data Package in accordance with the present invention.
DETAILED DESCRIPTION OF DRAWINGS:
[0024] In a preferred embodiment of the present invention (Figure 1), a data platform (300) for creating an Adaptive Data Package (500) is provided. The data platform (300) is specifically adapted to create a structured Adaptive Data Package (500). The data platform (300) includes one or more processor(s) (p1, p2). The processor (p1/p2) is connected to one or more memory (m1, m2). The data platform (300) includes a user interface (350) and a data source generator (335). The processor (p1), the memory (m1) and the user interface (350) can be the processor, the memory and the user interface of a computing device (104). The computing device (104) is connected to the data sources (303a, 303b). More specifically, the processor (p1) is associated with one or more data source (303a, 303b) through a network (not shown) or a connection (not shown).
[0025] The user interface (350) provides a provision for a user to select the linking of the databases (305a, 305b) with the data sources (303a, 303b). The data source generator (335) is associated with the user interface (350). A user can create one or more data sources (303a, 303b) through the user interface (350). The data source generator (335) has a set of predefined instructions for creating the data source (303a, 303b) as per the inputs of the user through the user interface (350).
[0026] More specifically, the data source generator (335) is a system component configured to dynamically create the one or more data sources (303a, 303b) based on user-defined parameters. The data source generator (335) is operatively associated with the user interface (350), which serves as the medium through which users input configuration details. These details may include data source type, connection parameters, schema definitions, authentication credentials, and transformation rules. The data source generator (335) comprises a predefined set of instructions or templates that interpret the user inputs and translate them into structured data source definitions. These instructions may be implemented as configuration scripts, metadata schemas, or code generation routines, enabling the system to support a wide variety of data source types and formats.
[0027] Upon receiving input through the user interface (350), the data source generator (335) initiates a multi-step process to construct the data sources (303a, 303b). This process includes input validation, template selection, dynamic population of configuration parameters, and instantiation of the data source. The data source generator (335) may also perform connectivity checks and register the resulting data source within a system registry or catalogue. The generated data sources (303a, 303b) are then made available for downstream consumption by other system components, such as data processing engines, analytics modules, or visualization tools. The integration between the user interface (350) and the data source generator (335) ensures a seamless and user-driven mechanism for creating and managing data connectivity within the system.
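The multi-step creation process described above (input validation, template selection, dynamic population of configuration parameters, instantiation, and registration in a system registry) may be illustrated, purely by way of a non-limiting example, with the following sketch. All function names, template contents, and registry structure here are hypothetical illustrations, not the claimed implementation:

```python
# Hypothetical sketch of the data source generator (335) workflow:
# validate user input, select a template, populate parameters, and
# register the resulting data source definition in a catalogue.

TEMPLATES = {
    "postgres": {"driver": "postgresql", "port": 5432},
    "mysql": {"driver": "mysql", "port": 3306},
}

REGISTRY = {}  # system registry/catalogue of created data sources

def create_data_source(name, source_type, host, schema, credentials):
    # Step 1: input validation
    if source_type not in TEMPLATES:
        raise ValueError(f"Unsupported data source type: {source_type}")
    if not name or not host:
        raise ValueError("Name and host are required")
    # Step 2: template selection and dynamic population of parameters
    definition = dict(TEMPLATES[source_type])
    definition.update({"name": name, "host": host,
                       "schema": schema, "credentials": credentials})
    # Step 3: instantiation and registration in the catalogue
    REGISTRY[name] = definition
    return definition

ds = create_data_source("sales_ds", "postgres", "db.internal", "public", "token")
```

Once registered, the definition is available for downstream consumption by other system components, as the paragraph above describes.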
[0028] Further, the created data sources (303a, 303b) are mapped with one or more federated databases (305a, 305b), wherein the databases (305a, 305b) are structured databases. The user interface (350) enables users to configure and manage data integration settings. The user is empowered to define a one-to-one correspondence between individual data sources (303a, 303b) and their respective target databases (305a, 305b). Specifically, for each data source (such as a file, API endpoint, sensor feed, or an external system), the user can select or specify a unique destination database (e.g., 305a, 305b) where the data from that source will be stored or processed.
[0029] The mapping ensures that each data source (303a/303b) is linked to exactly one database (305a/305b), thereby avoiding ambiguity or duplication in data routing. The user interface (350) may present the functionality through dropdown menus, drag-and-drop elements, or form-based configurations, allowing users to visually associate each source with its designated database. Once established, these mappings can be stored in a configuration file or database schema (not shown) and may be used by the data platform (300) to automate data ingestion, transformation, and storage workflows.
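The one-to-one mapping constraint described above (each source linked to exactly one database, with no duplication in data routing) can be sketched, as a non-limiting illustration with hypothetical names, as follows:

```python
# Illustrative sketch of the one-to-one mapping between data sources
# (303a, 303b) and federated databases (305a, 305b). Re-linking an
# already-mapped source is rejected, avoiding ambiguity in data routing.

class SourceDatabaseMapping:
    def __init__(self):
        self._map = {}  # stored configuration: source id -> database id

    def link(self, source_id, database_id):
        # each data source maps to exactly one database
        if source_id in self._map:
            raise ValueError(f"{source_id} is already mapped")
        self._map[source_id] = database_id

    def database_for(self, source_id):
        # used by ingestion/transformation workflows to route data
        return self._map[source_id]

m = SourceDatabaseMapping()
m.link("303a", "305a")
m.link("303b", "305b")
```

In practice such a mapping would be persisted in a configuration file or database schema, as the paragraph above notes.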
[0030] The databases (305a, 305b) are structured databases. Some trade names of existing structured databases are MySQL, PostgreSQL, and Oracle Database. The processor (p1) can be connected to more than one data source (303a, 303b). Similarly, the processor (p2) is connected with more than one database (305a, 305b). The data source (303a/303b) is a cloud environment (302a) or a hybrid environment (302b) or an on-premises computing environment (302c) or combinations of these, or a data platform (302d) or a SaaS computing environment (not shown) or an Adaptive Data Package (ADP) (500p) created from the data platform (300).
[0031] The data platform (300) is configured with connectors (not shown), drivers (not shown), and APIs that enable it to interact with the various databases (305a, 305b) and the data sources (303a, 303b). The drivers, connectors, and APIs allow the data platform (300) to establish secure connections, send queries, retrieve data, and perform operations among these components. Common configurations include ODBC/JDBC drivers, REST or SOAP APIs, and native connectors for platforms like SQL Server, Oracle, MongoDB, or cloud services like AWS, Azure, and Google Cloud (trade names). These common configurations ensure seamless data integration and interoperability across diverse environments.
[0032] The data platform (300) further includes a query engine (309), a Data Package Naming Assistant (DPNA) (320) and an integrator (I).
[0033] The data platform (300) includes the user interface (350) and the query engine (309), both of which are essential for enabling dynamic, user-driven generation of adaptive data packages (ADPs) (500). The user interface (350) is configured to receive queries from users and to facilitate the mapping of data sources to structured databases (305a, 305b). This interface may be implemented as a graphical user interface (GUI), command-line interface (CLI), or web-based portal, and is adapted to accept queries in multiple programming or query languages, including but not limited to SQL, Python, and JSON-based formats.
[0034] The user interface (350) supports the establishment of one-to-one mappings between each data source and a corresponding structured database. This mapping ensures that data retrieval operations are precise and contextually aligned with the user’s intent. The interface also provides real-time feedback regarding the status of submitted queries, including validation results and execution outcomes, such as “Query valid,” “Error in query,” or “Data irrelevant.”
[0035] The query engine (309) is configured in system memory (m1/m2) and is responsible for interpreting, validating, and executing queries received from the user interface. The engine includes a parser library and a semantic interpreter capable of understanding the syntax and intent of queries across various formats. A validation module compares the query parameters against the schema and metadata of the structured databases (305a, 305b) to determine executability.
[0036] Upon successful validation, the query engine (309) transforms the input query into an executable form (q) and initiates data extraction from one or more databases. The engine supports real-time or near-real-time data retrieval, depending on system configuration and data source latency. The extracted data (501) is processed and structured into an adaptive data package (500), which includes metadata, access control policies, and compliance attributes.
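The parse-validate-execute flow of the query engine (309) described in the two paragraphs above, including the status feedback ("Query valid," "Error in query," "Data irrelevant") surfaced through the user interface (350), may be sketched as a non-limiting example. The schema, table names, and data here are hypothetical:

```python
# Minimal sketch of the query engine (309) flow: validate query
# parameters against schema metadata of the structured databases,
# then transform the validated query into an executable form.

SCHEMA = {"sales": {"region", "amount", "quarter"}}  # stand-in metadata

def validate(table, columns):
    # validation module: compare query parameters against the schema
    if table not in SCHEMA:
        return "Error in query"
    if not set(columns) <= SCHEMA[table]:
        return "Data irrelevant"
    return "Query valid"

def execute(table, columns, rows):
    # on successful validation, extract only the requested fields
    status = validate(table, columns)
    if status != "Query valid":
        raise RuntimeError(status)
    return [{c: r[c] for c in columns} for r in rows]

rows = [{"region": "EU", "amount": 100, "quarter": "Q1"}]
extracted = execute("sales", ["region", "amount"], rows)
```

A production engine would additionally involve a parser library and semantic interpreter for multiple query languages, per paragraph [0035].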
[0037] The query engine (309) also includes a data packaging module (309a) that formats the ADP (500) according to predefined schemas and ensures that the resulting package is self-contained and reusable. The query engine (309) may optionally register the ADP (500) in a data catalogue or repository for future discovery and access. This architecture supports secure, compliant, and efficient data sharing across organizational boundaries.
[0038] Further, (data platform (300g)) the query (q) can be edited by the user in the user interface (350) with one or more cycles of editing. The query (q) can be optimised to ensure the extracted data (501) retrieved aligns precisely with analytical or operational needs or any such need of the user. The query (q) can be executed by the user with the assistance from a Machine Learning-Based Query Processor (MLBQ) (390) (Figure 1) configured in the memory (m1/m2). The MLBQ (390) is an AI assisted processor with a set of instructions. The MLBQ (390) can be preconfigured in the data source (303a/303b). The MLBQ (390) can be activated by the user through the user interface (350) accordingly. The MLBQ (390) reduces manual intervention in query creation and execution. The MLBQ (390) can assist the user in query creation based on predefined instructions.
[0039] The MLBQ (390) can significantly streamline query creation by understanding user intent and translating it into structured queries. For example, a user might type a natural language request such as, "Show me the total sales for each region in the last quarter." The MLBQ (390), using natural language processing (NLP) and machine learning models trained on database schemas and query patterns, interprets this request, identifies relevant tables and fields (like sales, region, and date), and automatically generates the corresponding SQL query.
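The natural-language-to-SQL translation performed by the MLBQ (390) in the example above can be sketched in a heavily simplified, non-limiting form. A real MLBQ would use NLP and machine learning models trained on database schemas and query patterns; this keyword matching is illustrative only, and all names are hypothetical:

```python
# Hedged sketch of MLBQ (390)-style intent translation: identify relevant
# fields (sales, region) in the user's request and emit the matching SQL.

def nl_to_sql(request):
    text = request.lower()
    # in a real system, trained models map intent to tables and fields
    if "sales" in text and "region" in text:
        return ("SELECT region, SUM(sales_amount) AS total_sales "
                "FROM sales_transactions GROUP BY region;")
    raise ValueError("Intent not recognized")

sql = nl_to_sql("Show me the total sales for each region in the last quarter")
```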
[0040] The extracted data (501) from the query engine (309) according to the query (q1) forms the Adaptive Data Package (500). The Adaptive Data Package (500) may be stored in a flash memory or in a virtual memory. The query engine (309) has a structured data extractor (309b) for extracting the structured data (501b) from the data sources (303a, 303b). The structured data (501) is the relevant data. As the query engine (309) extracts and forms the Adaptive Data Package (500) from the structured databases (305a, 305b), the created Adaptive Data Package (500) can be referred to as the “structured Adaptive Data Package (SADP)”. The Adaptive Data Package (500) does not save the extracted data (501) in a memory or perform any such storage activity/service. Instead, the ADP (500) has a set mapping of the query (q), the data source (303a) and the corresponding database (305a). The Adaptive Data Package (500) is a reusable, self-contained data asset that integrates real-time data with query and access control mechanisms, enabling authorized users to independently discover, understand, and utilize data effectively, while maintaining compliance with organizational policies and standards. The ADP (500) can be used for data extraction or any such data-related activities for any data-specific operations.
[0041] The Data Package Naming Assistant (DPNA) (320) is configured in the memory (m1/m2) to suggest names and descriptions for the Adaptive Data Package (500) to the user. The DPNA (320) allows the user to assign a selected name and description to the Adaptive Data Package (500) through the user interface (350) for creating a named Adaptive Data Package (510b).
[0042] The DPNA (320) includes a sub-module with predefined instructions which enables naming of the Adaptive Data Package (500) by providing relevant suggestions or options. The DPNA (320) is an AI module or machine learning module having a set of predefined, programmed instructions.
[0043] More specifically, the DPNA (320) is stored with predefined instructions to understand the details of the extracted data (501) or any such determination parameter and relate this extracted data (501) to the prestored domain-specific taxonomies in the memory (m1). The DPNA (320) further has instructions to derive names based on its learning and understanding of the extracted data (501) and comparison with the domain-specific taxonomies. The derived names (N1) are shown as suggestions in the user interface (350). The user interface (350) may have a triggering icon (357). The functioning of the DPNA (320) is triggered when a user initiates the action of the DPNA (320) through the triggering icon (357). For example, the triggering can be a clicking operation on the triggering icon (357) and the like. The suggestions (names (N1, N2) and descriptions) are displayed as options or in any such obvious representation. For example, the DPNA (320) may auto-suggest a name such as "Quarterly Sales Report" by detecting recurring sales-related patterns in the Adaptive Data Package (500). This suggestion is also according to the query (q) executed. The DPNA (320) is adapted to relate a product name and a description of the product contextually relevant to the extracted data (501) of the Adaptive Data Package (500).
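The comparison against domain-specific taxonomies described above may be sketched as a non-limiting example. The taxonomy contents and the scoring heuristic here are hypothetical; an actual DPNA (320) would apply its learned models:

```python
# Hedged sketch of DPNA (320)-style naming: score the extracted data's
# column names against prestored domain-specific taxonomies and return
# the best-matching candidate names (N1, N2) as suggestions.

TAXONOMIES = {
    "Quarterly Sales Report": {"sales", "region", "quarter"},
    "Customer Interaction Log": {"customer", "interaction", "timestamp"},
}

def suggest_names(columns, top_n=2):
    cols = set(columns)
    # rank taxonomies by overlap with the extracted data's fields
    scored = sorted(TAXONOMIES.items(),
                    key=lambda kv: len(kv[1] & cols), reverse=True)
    # keep only taxonomies with at least one matching term
    return [name for name, terms in scored[:top_n] if terms & cols]

names = suggest_names(["sales", "region", "quarter"])
```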
[0044] The data platform (300) includes a synchronizer (not shown) with sub-modules and predefined instructions to synchronize data system(s) (not shown) and data component(s) with the Adaptive Data Package (500) by the user through the user interface (350). The synchronizer comprises data synchronization operators, file synchronization tools, distributed file systems, and real-time synchronization mechanisms. The data system (530) consists of data sources such as databases, APIs, and sensors; data pipelines for collection, cleaning, and transformation; storage solutions like cloud databases; and user interfaces including dashboards and reports.
[0045] The data system (530) operates in secure environments and integrates hardware-software components and machine learning APIs to ensure data integrity and support generation of named Adaptive Data Packages (510b). In a retail scenario, the data system includes sales databases, processing pipelines, and reporting dashboards, executing queries like "SELECT category, SUM(sales_amount) AS total_sales FROM sales_transactions GROUP BY category;". The data component (540) includes raw, processed, and derived data stored in databases or cloud storage, supporting transformation, integration, and reporting. In a CRM context, the data component (540) includes customer records and interaction logs.
[0046] The data platform (300) can be a cloud platform with the query engine (309) and the Adaptive Data Package Naming Assistant (DPNA) (320) as SaaS applications. The query validator (306), the query engine (309), and the DPNA (320) each have a set of software and hardware which enables the SaaS operations for performing their respective functions. The processor (p1), the memory (m1) and the computing device (104) can be virtual machines working in a SaaS environment.
[0047] The Query Engine (309) and the Data Package Naming Assistant (DPNA) (320) are architected with cloud-native principles. Each component (306, 309, 320) operates as an independent microservice, exposing secure APIs for interaction and supporting multi-tenancy to serve multiple users or organizations simultaneously. These services must integrate with authentication and authorization systems to ensure secure access and be deployable on scalable cloud infrastructure using containers and orchestration tools like Kubernetes (trade name). Additionally, the data platform (300) has persistent connections to various data sources (303a, 303b) through standardized drivers or connectors, and includes robust monitoring, logging, and user interface (350) integration to provide a seamless and reliable user experience.
[0048] The data platform (300) has the integrator (I) (figure 2) associated with the data platform (300). The integrator (I) of the data platform (300) enables integration of the dynamic business glossary (550) with the Adaptive Data Package (500) through the user interface (350) by the user to create the structured Adaptive Data Package (500). The integrator (I) of the data platform (300) enables the automated mapping of technical fields to the domain-specific business terminology.
[0049] The dynamic business glossary (550) is a structured collection of business terms with clear definitions which are continuously updated and retained in real time. The dynamic business glossary (550) includes terms, definitions, synonyms, acronyms, sources, links to related terms, categories, subcategories, and ownership details which are continuously updated and stored in the database (305a, 305b) in real time according to data governance conditions. The dynamic business glossary (550) also includes metadata such as the date of last revision and links to relevant documents, ensuring comprehensive understanding and easy access to information.
[0050] The dynamic business glossary(s) (550) are used in various industries to standardize terminology, improve data quality, support compliance with regulations, and enhance communication. The dynamic business glossary (550) can appear as an interactive, web-based tool integrated within a data governance platform. The dynamic business glossary (550) includes a searchable interface where users can look up business terms, definitions, synonyms, acronyms, and related documents.
[0051] The user interface (350) has the integrator (I) which is associated with an integrator icon (359). Upon triggering the activation of the integrator (I) by the user through the integrator icon (359), the integrator (I) integrates the dynamic business glossary (550) with the Adaptive Data Package (500). The dynamic business glossary (550) enables automated mapping of technical fields to domain-specific business terminology.
[0052] Automated mapping of technical fields to domain-specific business terminology involves linking technical data elements, such as database columns or fields, to specific business terms from a glossary. This process translates raw technical data into meaningful business information. For example, a database field labeled "cust_id" might be automatically mapped to the business term "Customer ID" from the glossary. This ensures that the extracted data (501) within the Adaptive Data Package (500) is not only structured but also easily understood and relevant in a business context, enhancing clarity and usability for analysis and decision-making.
[0053] The integration of the dynamic business glossary (550) enables the user to intuitively associate industry-relevant labels with technical fields to enhance clarity and usability of the data (501). For example, ambiguous column names like "col_01" or "sales_value" can be renamed to "Net Revenue" or "Gross Margin," ensuring consistency and improving stakeholder understanding across reports and analyses.
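The automated mapping performed by the integrator (I), linking technical field names such as "cust_id" to glossary terms such as "Customer ID", can be sketched as a non-limiting example. The glossary entries are drawn from the examples in the text; the function itself is an illustrative assumption:

```python
# Illustrative mapping of technical fields to business glossary terms,
# as the integrator (I) performs. Unknown fields pass through unchanged.

GLOSSARY = {
    "cust_id": "Customer ID",
    "sales_value": "Net Revenue",
    "col_01": "Gross Margin",
}

def apply_glossary(record):
    # rename technical column names to their domain-specific business terms
    return {GLOSSARY.get(field, field): value
            for field, value in record.items()}

row = apply_glossary({"cust_id": 42, "sales_value": 1000.0})
```

Applied over the extracted data (501), such renaming yields the business-readable package contents described in paragraph [0052].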
[0054] The data platform (300) (embodiment (300a)) (figure 2) includes an editing and optimizing module (307a) configured to allow a user to edit and optimize executed queries to ensure the extracted data (501) retrieved aligns precisely with analytical or operational needs. The editing and optimizing module (307a) can be configured in the data platform (300) or more specifically in the user interface (350). A user can edit the executed queries (q) through the editing and optimizing module (307a).
[0055] In one more embodiment (301a) (figure 3a) of the data platform (300), the query engine (309), the DPNA (320) and the integrator (I) are stored as separate components in the memory (m1). The data platform (301a) may have only one or two of the components among the query validator (306), the query engine (309), and the DPNA (320); the user can extract the modules which are not available at the data platform (300) from a source and store the respective module in the memory (m1). The query validator (306), the query engine (309), and the DPNA (320) can be configured in the memory (m1) one after the other based on the non-availability of the respective module. The source can be a database, a virtual memory, or a cloud memory.
[0056] In one more embodiment (301b) (figure 3b) of the data platform (300) the query engine (309), the DPNA (320) and the integrator (I) are integrated into a single data platform application (D1) and configured in the memory (m1). The single data platform application (D1) is configured with all relevant modules of the query validator (306), the query engine (309), the DPNA (320) and prestored in the computing device (104). The single data platform application (D1) also has the module for the user interface (350). A User can configure the single data platform application (D1) in the computing device (104) for configuring the data platform (301b) to create Adaptive Data Packages (500).
[0057] In one more embodiment (300b) (figure 4) of the data platform (300), the data platform (300b) includes a storage module (301b) associated with the processor (p1). The storage module (301b) stores frequently accessed data (506) in the database (305a/305b) while creating the Adaptive Data Package (500). The storage module (301b) reduces the need for repetitive queries to the database (305a/305b). The storage module (301b) can be a memory with instructions or a virtual machine which is associated with the database (305a/305b). The frequently accessed data (506) is the data associated with the Adaptive Data Package (500) which is repeatedly (at intervals of time) accessed by the user while creating the Adaptive Data Package (500) using the method (200). The duration of the time interval can be predefined in the data platform (300b). More specifically, the frequently accessed data (506) can be stored in an SSD (solid state drive) (not shown) of the computing device (104).
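The interval-based caching behaviour of the storage module can be sketched as follows. This is a minimal sketch under stated assumptions: the class name, the fetch callback, and the 300-second interval are all illustrative, not the platform's actual mechanism.

```python
import time

# Illustrative sketch of the storage module (301b/302b): data accessed
# repeatedly within a predefined time interval is served from the cache,
# reducing repeat queries to the database (305a/305b).
# Names and the interval value are assumptions.
class FrequentDataCache:
    def __init__(self, interval_seconds=300):
        self.interval = interval_seconds
        self._store = {}  # key -> (value, timestamp of last fetch)

    def get(self, key, fetch_fn):
        """Return the cached value if fetched within the interval;
        otherwise query the database via fetch_fn and cache the result."""
        entry = self._store.get(key)
        now = time.time()
        if entry is not None and now - entry[1] < self.interval:
            return entry[0]
        value = fetch_fn(key)
        self._store[key] = (value, now)
        return value

calls = []
def fetch_from_db(key):
    calls.append(key)          # records each real database hit
    return f"rows for {key}"

cache = FrequentDataCache(interval_seconds=300)
cache.get("sales_q1", fetch_from_db)
cache.get("sales_q1", fetch_from_db)   # served from cache, no second query
# len(calls) == 1
```

The second `get` within the interval never reaches the database, which mirrors the stated purpose of reducing repetitive queries.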
[0058] In one more embodiment (300c) (figure 5) of the data platform (300), the data platform (300c) includes the preview module (301c) configured to allow real-time previewing of the details of the created structured Adaptive Data Package (500) before deployment into a data destination (352) to validate conditions and verify accuracy, completeness, and alignment with the intended purpose. The preview module (301c) is activated by a preview trigger (311c) of the user interface (350).
[0059] In one more embodiment (300d) (figure 6) of the data platform (300), the data platform (300d) includes a query storage module (301d) configured to store details of the query (q, q1) while creating the Adaptive Data Package (500). The query storage module (301d) is coupled with the processor (p1). The query storage module (301d) can be a memory with instructions, a virtual machine, or a service. The stored details of the query and the Adaptive Data Package (500) in the query storage module (301d) can be used for data analytics or any such data processing activity. The query (q, q1) can be stored in an RDS (Relational Database Service) of the database (305a).
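The query storage module's record-keeping can be sketched as a timestamped log of executed queries. This is a hedged sketch: the record schema, the in-memory list, and the identifiers are assumptions for illustration, not the module's actual storage format.

```python
import datetime

# Hypothetical sketch of the query storage module (301d): each executed
# query (q, q1) is recorded with a timestamp so the history can later feed
# data analytics. The schema below is an assumption.
query_log = []

def store_query(query_text, package_id):
    """Append one query record with a UTC timestamp to the log."""
    query_log.append({
        "query": query_text,
        "package": package_id,
        "executed_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

store_query("SELECT region, SUM(amount) FROM sales GROUP BY region", "ADP-500")
# query_log now holds one record tied to package "ADP-500"
```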
[0060] In one more embodiment (300e) (figure 7) of the data platform (300), the data platform (300e) has a submission module (301e) configured in the memory (m1) to submit the created structured Adaptive Data Package (500) to an approval authority (572) for publication in the data marketplace (575). The submission module (301e) is functionally connected with the user interface (350). The approval authority (572) can be another user. The computing device (not shown) of the approval authority (572) and the user interface (350) are functionally connected. The user can coordinate with the approval authority (572) to publish the created Structured Adaptive Data Package (500) in the data marketplace (575). The data marketplace (575) can be an online platform where data providers and consumers can buy, sell, and trade data. Data marketplaces (575) are typically hosted on cloud services, making them globally accessible. Examples include Snowflake Marketplace, AWS Data Exchange, Databricks Marketplace, and Datarade Marketplace.
[0061] In a data platform (300f) (embodiment of the data platform (300)) (figure 8), each version of the Structured Adaptive Data Package Creation (SDPC) (500) is stored as an individual dataset (590a, 590b) in the database (305a). The database (305a) allows the retention of details and enables further modifications, with each subsequent version being created and stored as a separate dataset (590a, 590b).
[0062] In the database (305a), each version (v1) can be stored and each update over the previous version (v2) can be subsequently stored. The versions (v1, v2) of the Adaptive Data Package (500) can be stored as drafts, which enables the user to edit and update the drafts (current/previous Adaptive Data Packages (500)).
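The versioning behaviour described for data platform (300f) can be sketched as a store that saves each draft as an independent dataset rather than overwriting the previous one. This is a minimal sketch; the class, label scheme (v1, v2, ...), and contents are illustrative assumptions.

```python
# Illustrative sketch of version retention in data platform (300f): every
# save stores a full, independent copy (a separate dataset 590a, 590b, ...),
# so earlier drafts remain retrievable and editable.
class VersionedPackageStore:
    def __init__(self):
        self._versions = []  # each entry is an independent dataset

    def save(self, package_contents):
        """Store a new version as its own dataset; return its label."""
        self._versions.append(dict(package_contents))  # copy, not a reference
        return f"v{len(self._versions)}"

    def get(self, label):
        """Retrieve a stored version by label, e.g. 'v1'."""
        return self._versions[int(label[1:]) - 1]

store = VersionedPackageStore()
store.save({"name": "Net Revenue Package", "rows": 100})   # stored as v1
store.save({"name": "Net Revenue Package", "rows": 120})   # stored as v2
# store.get("v1")["rows"] == 100  (draft v1 retained unchanged)
```

Because `save` copies the contents, later edits to a draft never mutate an earlier version, matching the retention behaviour the embodiment describes.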
[0063] In one more embodiment of the present invention, a method (200) (figure 9) for creating the Adaptive Data Package (500) using the data platform (300) is provided. The method (200) starts at step 202. At step 210, the processors (p1/p2) are connected with the one or more data source (303a, 303b). At step 220, the query engine (309), the Data Package Naming Assistant (DPNA) (320), the integrator (I) and the data source generator (335) are configured in the memory (m1, m2).
[0064] The data platform (300) creates the different data sources (303a, 303b) mapped with the federated databases (305a, 305b). The databases (305a, 305b) are structured databases and data storage.
[0065] At step 222, the one or more data sources (303a, 303b) are created by the user through the user interface (350) using the data source generator (335). At step 230, the data sources (303a, 303b) created by the data source generator (335) are mapped to the one or more federated databases (305a, 305b) through the user interface (350). This mapping process allows the user to associate each data source with the corresponding federated database (305a/305b), enabling unified access and query execution across multiple heterogeneous data sources (303a). The user interface (350) facilitates this operation by allowing the user to select the appropriate federated database (305a, 305b) and define the mapping logic, which may include schema alignment, field matching, and transformation rules. The mapping ensures that data from the created sources (303a, 303b) can be integrated and queried in a consistent and structured manner within a federated database environment (not shown). At step 240, the query (q) is received from the user through the user interface (350) of the data platform (300).
[0066] At step 250, the received query (q) is executed by the query engine (309) for creating the Adaptive Data Package (500). At step 260, names and descriptions for the Adaptive Data Package (500) are generated and suggested by the DPNA (320). At step 270, the user, via the user interface (350), assigns the selected name (N1, N2) and description to the Adaptive Data Package (500), thereby creating the named Adaptive Data Package (510b). At step 280, the dynamic business glossary (550) is integrated with the named Adaptive Data Package (510b) through the user interface (350) by the user using the integrator (I) for enabling automated mapping of technical fields to domain-specific business terminology. The method (200) ends at step 290.
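The sequence of steps 240 through 280 can be sketched end to end as a single pipeline, under the assumption that each platform component is reduced to a plain function. Every name below (the demo source, the suggestion list, the callback) is an illustrative stand-in for the far richer query engine (309), DPNA (320), and integrator (I).

```python
# High-level sketch of method (200), steps 240-280, with each module
# reduced to a plain function. All names are illustrative assumptions.
def create_adaptive_data_package(source, query, glossary, pick_name):
    data = source(query)                        # steps 240-250: execute query
    suggestions = ["Net Revenue Package",       # step 260: DPNA-style
                   "Quarterly Sales Package"]   # name suggestions (assumed)
    name = pick_name(suggestions)               # step 270: user selects name
    package = {"name": name, "data": data}
    package["fields"] = [glossary.get(f, f)     # step 280: glossary mapping
                         for f in data["fields"]]
    return package

def demo_source(query):
    """Stand-in for the query engine (309) executing against a database."""
    return {"fields": ["col_01", "region"], "rows": [[100, "EU"]]}

adp = create_adaptive_data_package(
    demo_source,
    "SELECT * FROM sales",
    {"col_01": "Net Revenue"},
    lambda names: names[0],   # user picks the first suggested name
)
# adp["name"] == "Net Revenue Package"
# adp["fields"] == ["Net Revenue", "region"]
```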
[0067] The data package (500) created by the platform (300) using the method (200) can serve as a data product. The Adaptive Data Package (ADP) (500) generated by the data platform (300) using the method (200) can be elevated to a data product with additional structural, governance, and usability features. While the ADP (500) contains the core extracted data, transforming it into a data product requires the inclusion of metadata, access control mechanisms, compliance attributes, and discoverability features.
[0068] To make the ADP (500) a data product, the ADP (500) is registered in a data catalogue, assigned ownership, and made accessible through well-defined interfaces such as APIs or download endpoints. These elements collectively enable the data product to be reusable, traceable, and consumable by various stakeholders across the organization, thereby turning raw data into a governed, value-generating asset.
[0069] To create the Adaptive Data Package (500), a user can configure the data platform (300), which offers various specialized configurations based on specific needs. For storing frequently accessed data (506) during creation, the user may opt for data platform (300b); for previewing the created Adaptive Data Package (500), the data platform (300c) is suitable. If storing query (q) details is required, data platform (300d) can be used, while submission of the Adaptive Data Package (500) to an approval authority (572) is enabled via data platform (300e). To manage and store multiple versions or drafts (590a, 590b) of the Adaptive Data Package (500), the data platform (300f) is applicable. Additionally, query creation and execution can be assisted by the Machine Learning-Based Query Processor (MLBQ) (390) through data platform (300g). Users may also utilize methods (200) for generating Adaptive Data Packages from data sources (303a, 303b) connected to structured databases (305a, 305b).
[0070] The present invention has the advantage of enabling the creation of reusable, self-contained data assets (the Adaptive Data Package (500)) that integrate real-time data with embedded query (q) and access control mechanisms. This advantage is fulfilled by the query engine (309), which supports real-time data extraction and packaging. The data package (500) includes embedded metadata, query lineage, and access control configurations. The editing and optimizing module (307a) allows refinement of queries to ensure relevance and precision, while the storage module (302b) supports caching of frequently accessed data, enhancing reusability and performance.
[0071] The present invention has one more advantage of allowing authorized users to independently discover, understand, and utilize data packages without technical dependency. It is supported by the user interface (350), the preview module (301c), and the data package naming assistant (DPNA) (320). The user interface (350) facilitates intuitive interaction with the platform, while the preview module enables real-time inspection of the data package before deployment. The DPNA (320) assists in assigning meaningful names and descriptions, improving discoverability and comprehension for non-technical users.
[0072] The present invention also ensures compliance with organizational policies and standards through built-in governance and metadata controls. This is addressed by the integrator (I), which enables the integration of the dynamic business glossary (550) with the named Adaptive Data Package (510b). This integration allows automated mapping of technical fields to domain-specific terminology, ensuring semantic consistency and policy alignment. Additionally, the submission module (301e) supports approval workflows for publishing data packages in the data marketplace (575), reinforcing governance.
[0073] The present invention supports scalable and consistent data package creation across diverse use cases through modular design and user-driven customization. This is realized through the modular architecture of the data platform (300), which includes components such as the query engine (309), the query validator (306), the editing module (307a), and the machine learning-based query processor (MLBQ) (390). These components can be deployed independently or as an integrated application, allowing flexibility in configuration. The system (300f) ensures that each iteration of the data package (500) is stored as a separate dataset (590a, 590b), enabling traceability and iterative refinement.
Claims:CLAIMS
We Claim:
1) A data platform (300) for creating an Adaptive Data Package (500), the data platform (300) comprising:
one or more processor (p1, p2) connected to one or more memory (m1, m2);
a user interface (350);
a data source generator (335) associated with the user interface (350), wherein a user can create one or more data sources (303a, 303b) through the user interface (350) by the data source generator (335), and the created data sources (303a, 303b) are mapped with one or more federated databases (305a, 305b), wherein the databases (305a, 305b) are structured databases and the user interface (350) is adapted to receive a query (q) from the user;
a query engine (309) configured in the memory (m1/m2), adapted to receive the query (q) through the user interface (350) and to execute the query (q) for creating the Adaptive Data Package (ADP) (500); and
a data package naming assistant (DPNA) (320) configured in the memory (m1/m2) to suggest names and descriptions for the created Adaptive Data Package (500) to the user and to allow the user to assign a selected name and description to the Adaptive Data Package (510a) through the user interface (350) for creating a named Adaptive Data Package (510b);
an integrator (I) associated with the memory (m), the integrator (I) enabling integration of a dynamic business glossary (550) with the named Adaptive Data Package (510b) through the user interface (350) by the user for enabling automated mapping of technical fields to domain-specific business terminology.
2) The data platform (301a) as claimed in claim 1, wherein the query engine (309), a query validator (306), the DPNA (320) are stored as separate components in the memory (m1/m2).
3) The data platform (301b) as claimed in claim 1, wherein the query engine (309), the query validator (306), the DPNA (320) are integrated into a single data platform application (D1) and configured in the memory (m1/m2).
4) The data platform (300a) as claimed in claim 1, wherein the data platform (300a) includes an editing and optimizing module (307a) configured in the memory (m1) to allow a user to edit and optimize executed queries to ensure the extracted data (501) retrieved aligns precisely with analytical or operational needs.
5) The data platform (300b) as claimed in claim 1, the data platform (300b) includes a storage module (302b) associated with the processor (p1/p2) to store frequently accessed data (506) in the databases (305a, 305b) while creating the Adaptive Data Package (500).
6) The data platform (300c) as claimed in claim 1, the data platform (300c) includes a preview module (301c) configured in the memory (m1/m2) to allow real-time previewing of the details of the created structured Adaptive Data Package (500) before deployment into a data destination (352).
7) The data platform (300d) as claimed in claim 1, the data platform (300d) comprises a query storage module (301d) configured to store details of the query (q/q1) while creating the Adaptive Data Package (500).
8) The data platform (300e) as claimed in claim 1, the data platform (300e) comprises a submission module (301e) configured in the memory (m1, m2) to submit the created structured Adaptive Data Package (500) to an approval authority (572) for publication in a data marketplace (575).
9) The data platform (300f) as claimed in claim 1, wherein each version (v1, v2) of the Structured Adaptive Data Package Creation (SDPC) (500) is stored as an individual dataset (590a, 590b) in the databases (305a, 305b), allowing for the retention of details and enabling further modifications, with each subsequent version being created and stored as a separate dataset (590a, 590b).
10) The data platform (300g) as claimed in claim 1, wherein data platform (300g) includes a Machine Learning-Based Query Processor (MLBQ) (390) configured in the memory (m1) for assisting the user for query (q) creation.
11) The data platform (300) as claimed in claim 1, wherein the data source (303a/303b) is a cloud environment (302a) or a hybrid environment (302b) or an on-premises computing environment (302c) or combinations of these or a data platform (302d) or a SaaS computing environment or an Adaptive Data Package (500p) created from the data platform.
12) The data platform (300) as claimed in claim 1, wherein the data platform (300) is a cloud platform and the query validator (306), the query engine (309), the Adaptive Data Package Naming Assistant (DPNA) (320) are SaaS applications.
13) A method (200) for creating an Adaptive Data Package (500) using a data platform (300), the data platform (300) comprising one or more processors (p1, p2) connected to one or more memory (m1, m2); the method (200) comprises the steps of:
connecting (210) the processors (p1/p2) with one or more data source (303a, 303b);
configuring (220) a query engine (309), a data source generator (335), a Data Package Naming Assistant (DPNA) (320), an integrator (I) in the memory (m1, m2);
creating (222) one or more data sources (303a, 303b) through a user interface (350) by the data source generator (335) by the user;
mapping (230) the created data sources (303a, 303b) with one or more federated databases (305a, 305b) by the user through the user interface (350), wherein the databases (305a, 305b) are structured databases;
receiving (240) a query (q) from the user through the user interface (350) of the data platform (300);
executing (250) the received query (q) by the query engine (309) for creating the adaptive data package (500);
generating (260) suggested names and descriptions for the Adaptive Data Package (500) by the DPNA (320);
assigning (270) a selected name (N1, N2) and description to the Adaptive Data Package (500) from the suggested names and description by the user through the user interface (350), thereby creating a named Adaptive Data Package (510b); and
integrating (280) a dynamic business glossary (550) with the named Adaptive Data Package (510b) through the user interface (350) by the user using the integrator (I) for enabling automated mapping of technical fields to domain-specific business terminology.
| # | Name | Date |
|---|---|---|
| 1 | 202541055730-STATEMENT OF UNDERTAKING (FORM 3) [10-06-2025(online)].pdf | 2025-06-10 |
| 2 | 202541055730-REQUEST FOR EARLY PUBLICATION(FORM-9) [10-06-2025(online)].pdf | 2025-06-10 |
| 3 | 202541055730-POWER OF AUTHORITY [10-06-2025(online)].pdf | 2025-06-10 |
| 4 | 202541055730-FORM-9 [10-06-2025(online)].pdf | 2025-06-10 |
| 5 | 202541055730-FORM 1 [10-06-2025(online)].pdf | 2025-06-10 |
| 6 | 202541055730-DRAWINGS [10-06-2025(online)].pdf | 2025-06-10 |
| 7 | 202541055730-DECLARATION OF INVENTORSHIP (FORM 5) [10-06-2025(online)].pdf | 2025-06-10 |
| 8 | 202541055730-COMPLETE SPECIFICATION [10-06-2025(online)].pdf | 2025-06-10 |
| 9 | 202541055730-FORM 18 [01-09-2025(online)].pdf | 2025-09-01 |
| 10 | 202541055730-Proof of Right [16-09-2025(online)].pdf | 2025-09-16 |
| 11 | 202541055730-FORM-26 [16-09-2025(online)].pdf | 2025-09-16 |