Abstract: ADAPTIVE DATABASE INDEXING SYSTEM AND METHOD
An adaptive database indexing system (100) and method (300) are disclosed for dynamically balancing Hybrid Transactional/Analytical Processing (HTAP) workloads across cloud environments. The system (100) comprises a monitoring unit (102) configured to collect real-time performance data and a processing unit (104) utilizing a machine learning-based predictive cost model. The processing unit (104) analyzes workload data, predicts the impact of candidate indexes on both transactional throughput and analytical latency, and calculates a net performance benefit score using configurable weight parameters. Based on this score, the system (100) selects and executes optimal database indexing actions such as creation, modification, or removal of indexes. The predictive cost model is continuously refined by comparing actual versus expected performance outcomes.
Claims: 10, Figures: 3
Description:
BACKGROUND
Field of Invention
[001] Embodiments of the present invention generally relate to database management systems and particularly to a system and a method for adaptive database indexing in Hybrid Transactional/Analytical Processing (HTAP) environments.
Description of Related Art
[002] In recent years, the emergence of Hybrid Transactional/Analytical Processing (HTAP) workloads has created new challenges in database performance optimization. Traditional indexing strategies, designed either for Online Transaction Processing (OLTP) or Online Analytical Processing (OLAP), fall short in efficiently handling the dynamic, mixed-query patterns typical of HTAP systems. Manual tuning by database administrators (DBAs) or reliance on automated index recommendations often leads to suboptimal performance, as these methods frequently fail to consider the adverse impact of analytical optimizations on transactional throughput and vice versa. Moreover, the growing complexity and scale of cloud-based systems exacerbate the difficulty of achieving a balanced indexing strategy that adapts in real time to fluctuating workloads.
[003] There is thus a need for an improved and advanced adaptive database indexing system for dynamically balancing workloads across multiple cloud environments that can address the aforementioned limitations in a more efficient manner.
SUMMARY
[004] Embodiments in accordance with the present invention provide an adaptive database indexing system for dynamically balancing workloads across multiple cloud environments, the system comprising a monitoring unit configured to collect performance data from Hybrid Transactional/Analytical Processing (HTAP) workloads; and a processing unit in communication with the monitoring unit. The processing unit is configured to analyze the collected performance data using a machine learning-based predictive cost model; predict, for candidate indexes, an expected impact on transactional throughput and analytical query latency; calculate a net performance benefit score for each of the candidate indexes using configurable weight parameters; select optimal database indexing actions based on the calculated net performance benefit score; execute the selected optimal database indexing actions with minimal disruption to ongoing operations using online Data Definition Language (DDL) commands; and refine the machine learning-based predictive cost model over time by comparing actual performance outcomes against the predicted expected impact.
[005] Embodiments in accordance with the present invention further provide a method for adaptive database indexing. The method comprises collecting real-time workload performance data from transactional and analytical operations using a monitoring unit; analyzing the collected performance data using a machine learning-based predictive cost model; predicting, for candidate indexes, an expected impact on transactional throughput and analytical query latency; calculating a net performance benefit score for each of the candidate indexes using configurable weight parameters; selecting optimal database indexing actions based on the calculated net performance benefit score; executing the selected optimal database indexing actions with minimal disruption to ongoing operations using online Data Definition Language (DDL) commands; and refining the machine learning-based predictive cost model over time by comparing actual performance outcomes against the predicted expected impact.
[006] Embodiments of the present invention may provide a number of advantages depending on their particular configuration. First, embodiments of the present application may provide an adaptive database indexing system that is capable of intelligently balancing the conflicting requirements of transactional and analytical workloads through a machine learning-driven predictive model.
[007] Next, embodiments of the present application may provide an adaptive database indexing system that autonomously selects and executes optimal database indexing actions such as creating, modifying, or dropping indexes based on calculated net performance benefits.
[008] Next, embodiments of the present application may provide an adaptive database indexing system that may be capable of minimizing operational disruption during index changes by utilizing online and concurrency-aware Data Definition Language (DDL) execution strategies.
[009] Next, embodiments of the present application may provide an adaptive database indexing system that is configured to continuously refine the machine learning-based predictive cost model using real-time performance feedback, thereby improving decision accuracy over time.
[0010] Next, embodiments of the present application may provide an adaptive database indexing system that supports policy-driven prioritization of indexing decisions for customization based on business-defined objectives or service level targets.
[0011] These and other advantages will be apparent from the present disclosure of the embodiments described herein.
[0012] The preceding is a simplified summary to provide an understanding of some embodiments of the present invention. This summary is neither an extensive nor exhaustive overview of the present invention and its various embodiments. The summary presents selected concepts of the embodiments of the present invention in a simplified form as an introduction to the more detailed description presented below. As will be appreciated, other embodiments of the present invention are possible utilizing, alone or in combination, one or more of the features set forth above or described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] The above and still further features and advantages of embodiments of the present invention will become apparent upon consideration of the following detailed description of embodiments thereof, especially when taken in conjunction with the accompanying drawings, and wherein:
[0014] FIG. 1 depicts a block diagram of an adaptive database indexing system, according to an embodiment of the present invention;
[0015] FIG. 2 illustrates components of a processing unit of the adaptive database indexing system, according to an embodiment of the present invention; and
[0016] FIG. 3 illustrates a flowchart for a method for adaptive database indexing, according to an embodiment of the present invention.
[0017] The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word "may" is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including but not limited to. To facilitate understanding, like reference numerals have been used, where possible, to designate like elements common to the figures. Optional portions of the figures may be illustrated using dashed or dotted lines, unless the context of usage indicates otherwise.
DETAILED DESCRIPTION
[0018] The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description of the invention that the invention is not limited to these illustrated embodiments but that the invention also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood that there is no intention to limit the invention to the specific form disclosed, but, on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the scope of the invention as defined in the claims.
[0019] In any embodiment described herein, the open-ended terms "comprising", "comprises", and the like (which are synonymous with "including", "having", and "characterized by") may be replaced by the respective partially closed phrases "consisting essentially of", "consists essentially of", and the like, or the respective closed phrases "consisting of", "consists of", and the like.
[0020] As used herein, the singular forms “a”, “an”, and “the” designate both the singular and the plural, unless expressly stated to designate the singular only.
[0021] FIG. 1 depicts a block diagram of an adaptive database indexing system 100 (hereinafter referred to as the system 100), according to an embodiment of the present invention. The system 100 may be configured for adaptive index management of databases. The system 100 may employ a machine learning-based predictive cost model that may be configured to evaluate an impact of database indexing actions on both transactional and analytical performance. The system 100 may further enable automated decision-making and execution of index operations with minimal operational disruption.
[0022] According to the embodiments of the present invention, the system 100 may incorporate non-limiting hardware components to enhance processing speed and system efficiency. The system 100 may comprise a monitoring unit 102, a processing unit 104, a memory 106, and a communication interface 108.
[0023] The monitoring unit 102 may be configured to continuously collect real-time performance data from database workloads. The performance data may include, but is not limited to, a query frequency, a data modification rate, index usage statistics, a Central Processing Unit (CPU) utilization, an Input Output (I/O) latency, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the performance data, including known, related art, and/or later developed technologies. The performance data may be transmitted to the processing unit 104 for further analysis. The monitoring unit 102 may be designed to operate with minimal overhead to ensure non-intrusive data collection, and may support scalable deployment to accommodate varying numbers of databases.
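Purely by way of illustration, and not as part of the claimed subject matter, one monitoring interval of the performance data described above may be represented as a structured record. The type name, field names, and the read/write heuristic below are hypothetical; they merely mirror the metric categories listed in the preceding paragraph.

```python
from dataclasses import dataclass

@dataclass
class WorkloadSample:
    """Illustrative record for one monitoring interval (hypothetical fields)."""
    query_frequency: float     # queries observed per second
    modification_rate: float   # row modifications per second
    index_usage: dict          # index name -> scan count
    cpu_utilization: float     # fraction of CPU in use, 0.0 to 1.0
    io_latency_ms: float       # average I/O latency in milliseconds

def is_analytical_heavy(sample: WorkloadSample) -> bool:
    """Crude heuristic: reads dominate writes by a wide margin."""
    return sample.query_frequency > 10 * sample.modification_rate
```

Such a record would be populated by the monitoring unit 102 and handed to the processing unit 104; a production system would likely add timestamps and per-table breakdowns.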
[0024] In an embodiment of the present invention, the processing unit 104 may be configured to receive, analyze, and interpret the performance data collected by the monitoring unit 102. The processing unit 104 may include programming modules (as shown in FIG. 2) for executing instructions related to the output of the system 100, such as predicting the performance impact of candidate indexes, calculating net benefit scores based on configurable weight parameters, selecting optimal database indexing actions, and triggering execution routines. Embodiments of the present invention are intended to include or otherwise cover any suitable output of the processing unit 104, including known, related art, and/or later developed technologies.
[0025] The suitable output may include, but is not limited to, predicted transactional throughput degradation, analytical query latency improvements, an overall performance gain score, a prioritized list of the database indexing actions, and execution directives for online Data Definition Language (DDL) commands. In some embodiments, the output may also comprise confidence levels or probabilistic impact ranges to support explainability and decision traceability within the system 100.
[0026] The processing unit 104 may be, but is not limited to, a Programmable Logic Controller (PLC), a microprocessor, a development board, and so forth. In some embodiments, the processing unit 104 may also comprise system-on-chip (SoC) architectures, field-programmable gate arrays (FPGAs), single-board computers, embedded controllers, cloud-native orchestration engines, virtualized container hosts, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the processing unit 104, including known, related art, and/or later developed technologies. In an embodiment of the present invention, the processing unit 104 may further be explained in conjunction with FIG. 2.
[0027] The processing unit 104 may employ the machine learning-based predictive cost model to predict an expected impact of index operations on both transactional and analytical performance dimensions. The machine learning-based predictive cost model may be trained on historical workload logs. The machine learning-based predictive cost model may be refined using reinforcement learning and/or other adaptive algorithms to continuously improve accuracy. The processing unit 104 may further be configured to calculate net benefit scores and prioritize the database indexing actions based on configurable business policies or service level objectives (SLOs). In some embodiments of the present invention, a feedback loop may be implemented to compare actual system behavior against predicted outcomes for continuous model refinement.
[0028] In an embodiment of the present invention, the memory 106 may be any suitable computer-readable medium configured to store programming modules, performance data, model parameters, and indexing metadata. The memory 106 may comprise volatile memory (such as RAM), non-volatile memory (such as flash or SSD), or a combination thereof. In cloud-based deployments, the memory 106 may also represent distributed object storage or persistent volumes accessible via networked interfaces. Embodiments of the present invention are intended to include or otherwise cover any type of memory architecture, including known, related art, and/or later developed technologies.
[0029] In an embodiment of the present invention, the communication interface 108 may be configured to enable secure data exchange between the system 100 and external systems or components, such as database engines, cloud infrastructure services, or monitoring dashboards. The communication interface 108 may include, but is not limited to, Ethernet ports, wireless modules, RESTful APIs, message queues, or cloud-native connectors. The communication interface 108 may further be configured to support encryption, authentication, and/or protocol interoperability to ensure secure and reliable communication across heterogeneous environments.
[0030] FIG. 2 illustrates components of the processing unit 104 for the system 100, according to an embodiment of the present invention. In an embodiment of the present invention, the processing unit 104 may comprise computer-executable instructions in the form of programming modules including a data acquisition module 200, a data preprocessing module 202, a prediction module 204, a benefit scoring module 206, a decision module 208, and a feedback refinement module 210.
[0031] In an embodiment of the present invention, the data acquisition module 200 may be configured to receive, organize, and log real-time workload performance data from the monitoring unit 102. The data acquisition module 200 may further be configured to extract the performance parameters. Upon extracting the performance parameters, the data acquisition module 200 may be configured to generate a data processing signal and may transmit the data processing signal to the data preprocessing module 202.
[0032] In an embodiment of the present invention, the data preprocessing module 202 may be configured to be activated upon receiving the data processing signal from the data acquisition module 200.
[0033] In an embodiment of the present invention, the data preprocessing module 202 may be configured to cleanse, normalize, and structure the received performance data for analysis by subsequent modules. The data preprocessing module 202 may further be configured to perform time-based aggregation, remove outliers, handle missing values, and transform raw data into structured input features for predictive modeling. Upon preprocessing the performance data, the data preprocessing module 202 may be configured to generate a prediction signal and may transmit the generated prediction signal to the prediction module 204.
[0034] In an embodiment of the present invention, the prediction module 204 may be configured to be activated upon receiving the prediction signal from the data preprocessing module 202.
[0035] In an embodiment of the present invention, the prediction module 204 may be configured to analyze structured performance data and predict the expected impact of candidate indexes on the transactional throughput and analytical query latency. The prediction module 204 may use a machine learning-based predictive cost model, which may be trained on historical workload logs and continuously refined using reinforcement learning techniques. The prediction module 204 may also be configured to consider workload context and index characteristics to generate performance impact estimates. In an exemplary scenario of the present invention, the prediction module 204 may predict that introducing a compound index on a heavily queried table may lead to a 30% reduction in analytical query latency, while causing only a negligible 3% reduction in transactional throughput, and may relay this information to the benefit scoring module 206 for further evaluation.
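As a non-limiting sketch of the prediction step, the toy function below stands in for the machine learning-based predictive cost model; it returns an impact estimate of the same shape as the exemplary scenario above (a latency reduction paired with a throughput degradation). The function name, the feature choice (read fraction and index width), and the coefficients are all hypothetical, since the actual model would be trained on historical workload logs.

```python
from dataclasses import dataclass

@dataclass
class ImpactEstimate:
    latency_reduction: float       # fraction, e.g. 0.30 = 30% faster analytics
    throughput_degradation: float  # fraction, e.g. 0.03 = 3% slower OLTP

def predict_impact(read_fraction: float, index_width: int) -> ImpactEstimate:
    """Toy stand-in for the learned cost model: wider indexes help
    read-heavy workloads more, but cost more to maintain on writes.
    Coefficients are illustrative, not learned values."""
    latency_reduction = min(0.9, 0.15 * index_width * read_fraction)
    throughput_degradation = 0.01 * index_width * (1.0 - read_fraction)
    return ImpactEstimate(latency_reduction, throughput_degradation)
```

In the claimed system this mapping would be produced by a trained model rather than fixed coefficients; the sketch only fixes the input/output contract between the prediction module 204 and the benefit scoring module 206.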
[0036] In an embodiment of the present invention, the benefit scoring module 206 may be configured to calculate a net performance benefit score for each candidate index based on the analyzed structured performance data and the predicted expected impact by the prediction module 204. The benefit scoring module 206 may be configured to apply configurable weight parameters that may reflect business-defined policies and/or the service level objectives. The benefit scoring module 206 may further be configured to discard candidate indexes that are associated with negative net performance benefit scores.
[0037] In an exemplary scenario of the present invention, the benefit scoring module 206 may calculate a composite benefit score of +0.65 for a first candidate index. This score may be above an acceptance threshold and therefore considered favorable for implementation. In contrast, the benefit scoring module 206 may assign a score of -0.15 to a second candidate index due to a predicted high transactional cost. As a result, the benefit scoring module 206 may discard the second candidate index from further consideration.
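The scoring and discarding behavior described above may be sketched, purely by way of illustration, as a weighted trade-off. The default weights and the threshold are hypothetical placeholders for the configurable weight parameters; in practice they would be derived from business-defined policies or service level objectives, so the numeric scores below are not intended to reproduce the +0.65 and -0.15 values of the exemplary scenario.

```python
def net_benefit_score(latency_reduction: float,
                      throughput_degradation: float,
                      w_analytical: float = 0.7,
                      w_transactional: float = 0.3) -> float:
    """Weighted net benefit: analytical gain minus transactional cost.
    The default weights are illustrative, not prescribed values."""
    return (w_analytical * latency_reduction
            - w_transactional * throughput_degradation)

def filter_candidates(candidates: dict,
                      threshold: float = 0.0) -> dict:
    """Score each candidate index (name -> (latency_reduction,
    throughput_degradation)) and drop any at or below the threshold."""
    scores = {name: net_benefit_score(lat, thr)
              for name, (lat, thr) in candidates.items()}
    return {name: s for name, s in scores.items() if s > threshold}
```

A candidate with a large predicted analytical gain and a small transactional cost survives the filter, while one whose transactional cost dominates is discarded, mirroring the treatment of the first and second candidate indexes above.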
[0038] In an embodiment of the present invention, the decision module 208 may be configured to select the optimal database indexing actions based on the net performance benefit scores calculated by the benefit scoring module 206. The decision module 208 may be further configured to choose from creating new indexes, modifying existing indexes, or dropping underperforming indexes. The decision module 208 may be configured to generate an execution plan that minimizes disruption to ongoing operations by leveraging online Data Definition Language (DDL) commands and/or concurrency-aware techniques. The concurrency-aware techniques may further aid in minimizing locking conflicts and transaction delays.
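By way of a non-limiting sketch, the mapping from a selected action to an online DDL statement might look as follows. The syntax assumes a PostgreSQL-style engine, where CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY avoid long-held table locks; other engines expose different concurrency-aware mechanisms (for example, MySQL's online DDL options), and the function and identifier names here are hypothetical.

```python
def plan_statement(action: str, index: str, table: str = "",
                   columns: tuple = ()) -> str:
    """Emit an online DDL statement for one indexing action.
    Assumes PostgreSQL-style CONCURRENTLY syntax (illustrative only)."""
    if action == "create":
        cols = ", ".join(columns)
        return f"CREATE INDEX CONCURRENTLY {index} ON {table} ({cols});"
    if action == "drop":
        return f"DROP INDEX CONCURRENTLY {index};"
    if action == "modify":
        # Most engines lack an in-place modify: build a replacement
        # under a new name, then drop the old index.
        cols = ", ".join(columns)
        return (f"CREATE INDEX CONCURRENTLY {index}_new ON {table} ({cols}); "
                f"DROP INDEX CONCURRENTLY {index};")
    raise ValueError(f"unknown action: {action}")
```

The "modify" branch reflects the rebuild-then-drop pattern commonly used to change an index definition without blocking concurrent transactions.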
[0039] In an embodiment of the present invention, the feedback refinement module 210 may be configured to compare actual system performance outcomes after index deployment with the predictions made by the prediction module 204. The feedback refinement module 210 may be configured to refine the machine learning-based predictive cost model over time using real performance data. Additionally, the feedback refinement module 210 may be configured to periodically reevaluate workload conditions and initiate updates to an index structure of the databases to adapt to changing system behavior. The feedback refinement module 210 may further be configured to update the index structure based on the reevaluated workload conditions.
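A minimal sketch of one feedback-refinement step is given below. It assumes, for illustration only, that the model's latency predictions are corrected by a single multiplicative scale factor nudged toward the observed prediction-to-outcome ratio; the claimed system may instead retrain or reinforce the full predictive cost model, and the function name and learning rate are hypothetical.

```python
def refine_scale(scale: float, predicted: float, actual: float,
                 learning_rate: float = 0.1) -> float:
    """One feedback step: move a multiplicative correction factor
    toward the observed actual/predicted ratio. A real system might
    retrain the underlying model instead of scaling its output."""
    if predicted <= 0:
        return scale  # no meaningful ratio to learn from
    observed_ratio = actual / predicted
    return scale + learning_rate * (observed_ratio - scale)
```

For example, if a 30% latency reduction was predicted but only 24% was observed, the correction factor drifts below 1.0, tempering future predictions.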
[0040] The programming modules within the processing unit 104 may be implemented as modular microservices or containerized applications. These programming modules may communicate through shared memory, APIs, or asynchronous messaging interfaces and may be orchestrated using cloud-native frameworks to enable scalability, fault tolerance, and efficient resource usage.
[0041] In an exemplary scenario of the present invention, the system 100 may be deployed in a Hybrid Transactional/Analytical Processing (HTAP) workload environment supporting a large-scale enterprise application. During a high-traffic event, the monitoring unit 102 may detect spikes in both Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) workloads. The data acquisition module 200 may gather relevant performance data, which may be preprocessed by the data preprocessing module 202. The prediction module 204 may evaluate the impact of several candidate indexes. The benefit scoring module 206 may assign scores based on predicted trade-offs between transactional overhead and analytical gains. The decision module 208 may determine that creating a multi-column index may provide the highest benefit. The decision module 208 may further issue an online Data Definition Language (DDL) command to apply the change without disrupting active users. Post-deployment, the feedback refinement module 210 may analyze the actual system behavior and may update the machine learning-based predictive cost model to reflect observed results, thereby enhancing future decision accuracy.
[0042] FIG. 3 illustrates a flowchart for a method 300 for adaptive database indexing, according to an embodiment of the present invention. The method 300 may be implemented by the system 100 for managing the index structures in the Hybrid Transactional/Analytical Processing (HTAP) workloads in a dynamic and automated manner.
[0043] At step 302, the system 100 may collect real-time workload performance data from transactional and analytical operations using the monitoring unit 102. The collected data may include, but is not limited to, query frequency, data modification rate, index usage statistics, CPU utilization, and I/O latency. This data may be gathered continuously and with minimal overhead to avoid interference with database operations.
[0044] At step 304, the system 100 may analyze the collected performance data using a machine learning-based predictive cost model. This model may be trained on historical workload logs and adapted over time through reinforcement learning techniques. The purpose of this analysis is to generate actionable insights regarding the efficiency of candidate indexing strategies.
[0045] At step 306, the system 100 may predict, for the candidate indexes, the expected impact on the transactional throughput and analytical query latency. The prediction may quantify potential trade-offs, such as increased transactional overhead or improved analytical performance, thereby allowing the system to weigh benefits and drawbacks.
[0046] At step 308, the system 100 may calculate a net performance benefit score for each of the candidate indexes using configurable weight parameters. These parameters may reflect the business-defined policies or the service level objectives, enabling the system 100 to align indexing strategies with organizational priorities. The candidate indexes associated with a negative benefit score may be discarded.
[0047] At step 310, the system 100 may select the optimal database indexing actions based on the calculated net performance benefit scores. The optimal database indexing actions may include creating the new indexes, modifying the existing indexes, or dropping the underperforming indexes. The system 100 may generate the execution plan to apply these changes with the minimal disruption.
[0048] At step 312, the system 100 may execute the selected optimal database indexing actions using the online Data Definition Language (DDL) commands. The system 100 may further employ the concurrency-aware techniques to reduce the locking conflicts and to avoid the transaction delays during the execution.
[0049] At step 314, the system 100 may refine the machine learning-based predictive cost model over time by comparing actual performance outcomes against the previously predicted impacts. In addition, the system 100 may periodically reevaluate workload conditions and initiate updates to the index structure based on newly observed workload behavior, ensuring continuous adaptation and optimization.
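Steps 302 through 314 of the method 300 may be summarized, purely by way of illustration, as a single control-loop skeleton. The callables below are placeholders for the units and modules described above, injected as parameters rather than defined here; the loop merely fixes the order of operations and the rule that non-positive benefit scores are discarded.

```python
def indexing_cycle(candidates, collect, predict, score, execute, observe,
                   refine):
    """One pass of steps 302-314 (illustrative skeleton; the injected
    callables stand in for the monitoring, prediction, scoring,
    execution, and refinement units)."""
    workload = collect()                       # step 302: gather metrics
    best_name, best_score, best_est = None, 0.0, None
    for name in candidates:                    # steps 304-308
        est = predict(workload, name)
        s = score(est)
        if s > best_score:                     # non-positive scores discarded
            best_name, best_score, best_est = name, s, est
    if best_name is None:
        return None                            # nothing beneficial to do
    execute(best_name)                         # steps 310-312: online DDL
    refine(best_est, observe())                # step 314: feedback loop
    return best_name
```

In a deployment this cycle would run periodically or in response to workload shifts detected by the monitoring unit 102.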
[0050] While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
[0051] This written description uses examples to disclose the invention, including the best mode, and also to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined in the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.

CLAIMS
I/We Claim:
1. An adaptive database indexing system (100) for dynamically balancing workloads across multiple cloud environments, the system (100) comprising:
a monitoring unit (102) configured to collect performance data from Hybrid Transactional/Analytical Processing (HTAP) workloads; and
a processing unit (104) in communication with the monitoring unit (102), characterized in that the processing unit (104) is configured to:
analyze the collected performance data using a machine learning-based predictive cost model;
predict, for candidate indexes, an expected impact on transactional throughput and analytical query latency;
calculate a net performance benefit score for each of the candidate indexes using configurable weight parameters;
select optimal database indexing actions based on the calculated net performance benefit score;
execute the selected optimal database indexing actions with minimal disruption to ongoing operations using online Data Definition Language (DDL) commands; and
refine the machine learning-based predictive cost model over time by comparing actual performance outcomes against the predicted expected impact.
2. The system (100) as claimed in claim 1, wherein the performance data is selected from a query frequency, a data modification rate, index usage statistics, a Central Processing Unit (CPU) utilization, an Input Output (I/O) latency, or a combination thereof.
3. The system (100) as claimed in claim 1, wherein the processing unit (104) is configured to prioritize the database indexing actions based on business-defined policies, service level objectives (SLOs), or a combination thereof.
4. The system (100) as claimed in claim 1, wherein the machine learning-based predictive cost model is trained on historical workload logs and continuously updated using reinforcement learning techniques.
5. The system (100) as claimed in claim 1, wherein the processing unit (104) is configured to discard the candidate indexes associated with a negative net performance benefit score.
6. The system (100) as claimed in claim 1, wherein the optimal database indexing actions are selected from creating a new index, modifying an existing index, dropping an underperforming index, or a combination thereof.
7. The system (100) as claimed in claim 1, wherein the execution of the selected optimal database indexing actions employs concurrency-aware techniques to minimize locking conflicts and transaction delays.
8. The system (100) as claimed in claim 1, wherein the processing unit (104) is configured to periodically reevaluate workload conditions and update an index structure based on the reevaluated workload conditions.
9. A method (300) for adaptive database indexing, the method comprising:
collecting real-time workload performance data from transactional and analytical operations using a monitoring unit (102);
analyzing the collected performance data using a machine learning-based predictive cost model;
predicting, for candidate indexes, an expected impact on transactional throughput and analytical query latency;
calculating a net performance benefit score for each of the candidate indexes using configurable weight parameters;
selecting optimal database indexing actions based on the calculated net performance benefit score;
executing the selected optimal database indexing actions with minimal disruption to ongoing operations using online Data Definition Language (DDL) commands; and
refining the machine learning-based predictive cost model over time by comparing actual performance outcomes against the predicted expected impact.
10. The method (300) as claimed in claim 9, comprising a step of periodically reevaluating workload conditions and updating an index structure based on the reevaluated workload conditions.
Date: June 03, 2025
Place: Noida
Nainsi Rastogi
Patent Agent (IN/PA-2372)
Agent for the Applicant
| # | Name | Date |
|---|---|---|
| 1 | 202541056097-STATEMENT OF UNDERTAKING (FORM 3) [11-06-2025(online)].pdf | 2025-06-11 |
| 2 | 202541056097-REQUEST FOR EARLY PUBLICATION(FORM-9) [11-06-2025(online)].pdf | 2025-06-11 |
| 3 | 202541056097-POWER OF AUTHORITY [11-06-2025(online)].pdf | 2025-06-11 |
| 4 | 202541056097-FORM-9 [11-06-2025(online)].pdf | 2025-06-11 |
| 5 | 202541056097-FORM FOR SMALL ENTITY(FORM-28) [11-06-2025(online)].pdf | 2025-06-11 |
| 6 | 202541056097-FORM 1 [11-06-2025(online)].pdf | 2025-06-11 |
| 7 | 202541056097-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [11-06-2025(online)].pdf | 2025-06-11 |
| 8 | 202541056097-EVIDENCE FOR REGISTRATION UNDER SSI [11-06-2025(online)].pdf | 2025-06-11 |
| 9 | 202541056097-EDUCATIONAL INSTITUTION(S) [11-06-2025(online)].pdf | 2025-06-11 |
| 10 | 202541056097-DRAWINGS [11-06-2025(online)].pdf | 2025-06-11 |
| 11 | 202541056097-DECLARATION OF INVENTORSHIP (FORM 5) [11-06-2025(online)].pdf | 2025-06-11 |
| 12 | 202541056097-COMPLETE SPECIFICATION [11-06-2025(online)].pdf | 2025-06-11 |