System And Method For Advanced Data Exploration And Visualization

Abstract: The present disclosure provides a system (108) and a method for advanced data exploration and visualization. The system (108) authorizes users (102) to access a digital platform and record information in the digital platform. The system (108) automatically renews network protocols while authorizing the users (102) and generates datasets from the information. The system (108) enables Asynchronous Query Execution (AQE) to automatically query the datasets for a predetermined period. The system (108) analyzes the datasets while providing security to the datasets for the predetermined period. The system (108) generates an alert to the users (102) based on the datasets, where the alert is based on pre-defined rules configured by the users (102).

Patent Information

Application #
Filing Date
31 January 2023
Publication Number
31/2024
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application

Applicants

JIO PLATFORMS LIMITED
Office-101, Saffron, Nr. Centre Point, Panchwati 5 Rasta, Ambawadi, Ahmedabad - 380006, Gujarat, India.

Inventors

1. SRIVASTAVA, Shikhar
5061, Sobha Jasmine, Green Glen Layout, Bellandur, Bangalore - 560103, Karnataka, India.
2. VELEGA, Raghuram
Flat no. C 3401, Ashford Royale, S. Samuel Marg, Nahur (W), Mumbai - 400078, Maharashtra, India.

Specification

RESERVATION OF RIGHTS
[0001] A portion of the disclosure of this patent document contains material that is subject to intellectual property rights such as, but not limited to, copyright, design, trademark, integrated circuit (IC) layout design, and/or trade dress protection, belonging to Jio Platforms Limited (JPL) or its affiliates (hereinafter referred to as the owner). The owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all rights whatsoever. All rights to such intellectual property are fully reserved by the owner.

FIELD OF INVENTION
[0002] The embodiments of the present disclosure generally relate to a field of analytical Big Data ecosystems. More particularly, the present disclosure relates to a system and a method for advanced data exploration and visualization.

BACKGROUND
[0003] The following description of the related art is intended to provide background information pertaining to the field of the disclosure. This section may include certain aspects of the art that may be related to various features of the present disclosure. However, it should be appreciated that this section is used only to enhance the understanding of the reader with respect to the present disclosure, and not as admissions of the prior art.
[0004] In a typical analytical Big Data ecosystem, there are multiple services available to share generated data insights to an end consumer. Services may include data file sharing between systems, and data access Application Programming Interfaces (APIs) between systems or visualization tools for interactive insights. A scalable, robust, and interactive visualization tool is a required service that enables a customer to explore datasets and derive data insights. Further, existing set of enterprise visualization and business interactive tools available in the market do not offer many important features required for efficient performance. Further, these tools offer little to no flexibility in terms of optimizations/plugin support in their architecture, which is essential for any such platform operating at a big data scale for multiple tenants.
[0005] There is, therefore, a need in the art to provide a system and a method that can mitigate the problems associated with existing enterprise visualization and business interactive tools.

OBJECTS OF THE INVENTION
[0006] Some of the objects of the present disclosure, which at least one embodiment herein satisfies are listed herein below.
[0007] It is an object of the present disclosure to provide a system and a method for advanced data exploration and visualization that automatically authenticates and authorizes users in a digital platform.
[0008] It is an object of the present disclosure to provide a system and a method that automatically renew network protocols while authorizing the users.
[0009] It is an object of the present disclosure to provide a system that provides enterprise-grade security on a superset by enabling submission of an Asynchronous Query Execution (AQE) query task on a data source as the chart owner with access to all datasets/data sources, thereby fixing a security vulnerability.
[0010] It is an object of the present disclosure to provide a system that facilitates custom cache warmup strategy based on a pre-defined warmup schedule frequency tag/label.
[0011] It is an object of the present disclosure to provide a system and a method that analyzes the datasets while providing security for the datasets for the predetermined period.
[0012] It is an object of the present disclosure to provide a system and a method that generates and provides an alert to the users based on one or more pre-defined rules configured by the users for the datasets.

SUMMARY
[0013] This section is provided to introduce certain objects and aspects of the present disclosure in a simplified form that are further described below in the detailed description. This summary is not intended to identify the key features or the scope of the claimed subject matter.
[0014] In an aspect, the present disclosure relates to a system for data exploration and visualization. The system includes a processor communicatively coupled to a digital platform, and a memory operatively coupled with the processor, where said memory stores instructions which, when executed by the processor, cause the processor to authorize one or more users to access the digital platform and record information in the digital platform. The processor automatically renews network protocols while authorizing the one or more users and generates one or more datasets from the information. The processor enables Asynchronous Query Execution (AQE) to automatically query the one or more datasets for a predetermined period. The processor analyses the one or more datasets while providing security for the one or more datasets for the predetermined period. The processor generates an alert for the one or more users based on the analysis of the one or more datasets. The alert is based on one or more pre-defined rules configured by the one or more users for the one or more datasets.
[0015] In an embodiment, the processor may send a request for credentials to the one or more users for automatically querying and analyzing the one or more datasets.
[0016] In an embodiment, the processor may determine if a Structured Query Language (SQL) condition is enabled, and based on a positive determination, transmit the alert to the one or more users.
[0017] In an embodiment, in response to a negative determination, the processor may disable the alert.
[0018] In an embodiment, the processor may configure a Light-Weight Directory Access Protocol (or LDAP) that includes a profile associated with the one or more users for automatically authorizing the one or more users to access the digital platform.
[0019] In an embodiment, the processor may generate one or more customized labels associated with the profile that may enable the one or more users to filter the one or more datasets associated with the profile.
[0020] In an aspect, the present disclosure relates to a method for data exploration and visualization. The method includes authorizing, by a processor associated with a system, one or more users to access a digital platform and record information in the digital platform. The method includes automatically renewing, by the processor, network protocols while authorizing the one or more users and generating one or more datasets from the information. The method includes enabling, by the processor, AQE to automatically query the one or more datasets for a predetermined period. The method includes analyzing, by the processor, the one or more datasets while providing security for the one or more datasets for the predetermined period. The method includes generating, by the processor, an alert for the one or more users based on the analysis of the one or more datasets, where the alert is based on one or more pre-defined rules configured by the one or more users for the one or more datasets.
[0021] In an embodiment, the method may include sending, by the processor, a request for credentials to the one or more users for automatically querying and analyzing the one or more datasets.
[0022] In an embodiment, the method may include determining, by the processor, if a SQL condition is enabled and based on a positive determination, transmitting, by the processor, the alert to the one or more users.
[0023] In an embodiment, in response to a negative determination, the method may include disabling, by the processor, the alert.
[0024] In an embodiment, the method may include configuring, by the processor, a LDAP that includes a profile associated with the one or more users for automatically authorizing the one or more users to access the digital platform.
[0025] In an embodiment, the method may include generating, by the processor, one or more customized labels associated with the profile that enables the one or more users to filter the one or more datasets associated with the profile.

BRIEF DESCRIPTION OF DRAWINGS
[0026] The accompanying drawings, which are incorporated herein, and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes the disclosure of electrical components, electronic components, or circuitry commonly used to implement such components.
[0027] FIG. 1 illustrates an example network architecture (100) for implementing a proposed system (108), in accordance with an embodiment of the present disclosure.
[0028] FIG. 2 illustrates an example block diagram (200) of a proposed system (108), in accordance with an embodiment of the present disclosure.
[0029] FIG. 3 illustrates an example block diagram (300) of a custom role and permission model for enterprise access control, in accordance with an embodiment of the present disclosure.
[0030] FIG. 4 illustrates an example representation (400) of caching in Superset, in accordance with an embodiment of the present disclosure.
[0031] FIGs. 5A-5C illustrate example representations (500A, 500B, 500C) of custom cache warmup strategies, in accordance with embodiments of the present disclosure.
[0032] FIG. 6 illustrates an example block diagram (600) of an Asynchronous Query Execution (AQE) user impersonation feature, in accordance with an embodiment of the present disclosure.
[0033] FIGs. 7A-7I illustrate example representations (700A, 700B, 700C, 700D, 700E, 700F, 700G, 700H, 700I) of advanced notifications and alerts transmitted by the system (108), in accordance with embodiments of the present disclosure.
[0034] FIG. 8 illustrates an example representation (800) of a high-level scalable architecture of the system (108) for superset with caching and AQE, in accordance with an embodiment of the present disclosure.
[0035] FIG. 9 illustrates an example computer system (900) in which or with which embodiments of the present disclosure may be implemented.
[0036] The foregoing shall be more apparent from the following more detailed description of the disclosure.

DETAILED DESCRIPTION
[0037] In the following description, for the purposes of explanation, various specific details are set forth in order to provide a thorough understanding of embodiments of the present disclosure. It will be apparent, however, that embodiments of the present disclosure may be practiced without these specific details. Several features described hereafter can each be used independently of one another or with any combination of other features. An individual feature may not address all of the problems discussed above or might address only some of the problems discussed above. Some of the problems discussed above might not be fully addressed by any of the features described herein.
[0038] The ensuing description provides exemplary embodiments only and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the disclosure as set forth.
[0039] Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail to avoid obscuring the embodiments.
[0040] Also, it is noted that individual embodiments may be described as a process that is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.
[0041] The word “exemplary” and/or “demonstrative” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “exemplary” and/or “demonstrative” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as an open transition word without precluding any additional or other elements.
[0042] Reference throughout this specification to “one embodiment” or “an embodiment” or “an instance” or “one instance” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[0043] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
[0044] A digital platform such as Superset (or Apache superset) may refer to a modern data exploration and visualization platform that provides an intuitive interface to explore and visualize datasets, creates interactive dashboards and defines fine-grained, custom-rule based notifications and alerts. Further, the digital platform provides a rich set of visualizations and support for easily plugging-in custom developed charts and dashboards. The digital platform may leverage the power of existing data infrastructure without requiring yet another ingestion layer. The digital platform may integrate well with any Structured Query Language (SQL) based data-source including cloud native databases and engines at a petabyte scale. Furthermore, the digital platform may support easy data exploration through the no-code viz builder and an embedded SQL Integrated Development Environment (IDE) service.
[0045] The present disclosure discloses a modern enterprise-ready data visualization and business intelligence platform that is robust and scalable, has state-of-the-art security, and provides a notification and alerting capability that allows users to be notified or alerted based on custom user-defined rules on their respective datasets, with charts or dashboards of their choice. By augmenting the digital platform with custom security features such as Light-Weight Directory Access Protocol-Active Directory (LDAP-AD) integration, user impersonation, a custom permissions model, and Kerberos support, enterprise-grade security may be included in the digital platform. Further, usage audit reports may be generated for monitoring usage on a per-user basis. Scalability and responsiveness of visualizations (charts and/or dashboards) may be improved significantly by enabling key features of a web load balancer, a distributed cache, an asynchronous distributed task queue, and the like.
[0046] Various embodiments of the present disclosure will be explained in detail with reference to FIGs. 1-9.
[0047] FIG. 1 illustrates an example network architecture (100) for implementing a proposed system (108), in accordance with an embodiment of the present disclosure.
[0048] As illustrated in FIG. 1, the network architecture (100) may include a system (108). The system (108) may be connected to one or more computing devices (104-1, 104-2…104-N) via a network (106). It may be appreciated that the one or more computing devices (104-1, 104-2…104-N) may be individually referred to as the computing device (104) and collectively referred to as the computing devices (104). The computing devices (104) may be operated by one or more users (102-1, 102-2…102-N). It may be appreciated that the one or more users (102-1, 102-2…102-N) may be individually referred to as a user (102) and collectively referred to as the users (102). The system (108) may be configured with a digital platform that may be accessed by the users (102) for storing information.
[0049] In an embodiment, the computing devices (104) may include, but not be limited to, a mobile, a laptop, etc. Further, the computing devices (104) may include a smartphone, virtual reality (VR) devices, augmented reality (AR) devices, a general-purpose computer, desktop, personal digital assistant, tablet computer, and a mainframe computer. A person of ordinary skill in the art will appreciate that the computing devices (104) may not be restricted to the mentioned devices and various other devices may be used.
[0050] In an embodiment, the network (106) may include, by way of example but not limitation, at least a portion of one or more networks having one or more nodes that transmit, receive, forward, generate, buffer, store, route, switch, process, or a combination thereof, etc. one or more messages, packets, signals, waves, voltage or current levels, some combination thereof, or so forth. The network (106) may also include, by way of example but not limitation, one or more of a wireless network, a wired network, an internet, an intranet, a public network, a private network, a packet-switched network, a circuit-switched network, an ad hoc network, an infrastructure network, a Public-Switched Telephone Network (PSTN), a cable network, a cellular network, a satellite network, a fiber optic network, or some combination thereof.
[0051] In an embodiment, the system (108) may authorize the one or more users (102) to access the digital platform and record information in the digital platform. The system (108) may configure a LDAP that includes a profile associated with the one or more users (102) for automatically authorizing the one or more users (102) while accessing the digital platform.
[0052] In an embodiment, the system (108) may automatically renew network protocols while authorizing the one or more users (102) and generate one or more datasets from the information. Further, the system (108) may enable Asynchronous Query Execution (AQE) to automatically query the one or more datasets for a predetermined period based on the authorization.
[0053] In an embodiment, the system (108) may analyze the one or more datasets while providing security to the one or more datasets for the predetermined period. The system (108) may incorporate a user impersonation feature that requests credentials from the one or more users (102) for automatically querying and analyzing the one or more datasets. The system (108) may include caching of the one or more datasets stored in the memory (204), where the one or more datasets may be stored in a data structure of the memory (204) of a processor (202) configured with the system (108).
[0054] In an embodiment, the system (108) may generate an alert to the one or more users (102) based on the one or more datasets, where the alert may be based on one or more pre-defined rules configured by the one or more users (102) for the one or more datasets. The system (108) may determine if a Structured Query Language (SQL) condition is enabled and based on a positive determination, transmit the alert to the one or more users (102). Further, in response to a negative determination, the processor (202) may disable the alert.
[0055] In an embodiment, the system (108) may generate one or more customized labels associated with the profile that enables the one or more users (102) to filter the one or more datasets associated with the profile.
[0056] In an embodiment, the system (108) provides fault tolerant scalable data visualization and a business intelligence platform for building and realizing data insights at large scale from any datastore (e.g., SQL datastore, using SQLAlchemy library). Further, the system (108) includes enterprise grade security using LDAP/AD group authentication and authorization. Further, the system (108) provides user impersonation and AQE task execution that remove security vulnerabilities in the open-source offering. The system (108) offers seamless integration for ticket cache refresh after expiry, making services function smoothly both in interactive and AQE modes. Further, the system (108) provides a pluggable architecture for most of the components that gives flexibility to replace one service implementation (e.g. Memcached vs Redis for cache) without affecting an end-behaviour of the data visualization and a business intelligence platform.
[0057] Although FIG. 1 shows exemplary components of the network architecture (100), in other embodiments, the network architecture (100) may include fewer components, different components, differently arranged components, or additional functional components than depicted in FIG. 1. Additionally, or alternatively, one or more components of the network architecture (100) may perform functions described as being performed by one or more other components of the network architecture (100).
[0058] FIG. 2 illustrates an example block diagram (200) of a proposed system (108), in accordance with an embodiment of the present disclosure.
[0059] Referring to FIG. 2, the system (108) may comprise one or more processor(s) (202) that may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that process data based on operational instructions. Among other capabilities, the one or more processor(s) (202) may be configured to fetch and execute computer-readable instructions stored in a memory (204) of the system (108). The memory (204) may be configured to store one or more computer-readable instructions or routines in a non-transitory computer readable storage medium, which may be fetched and executed to create or share data packets over a network service. The memory (204) may comprise any non-transitory storage device including, for example, volatile memory such as random-access memory (RAM), or non-volatile memory such as erasable programmable read only memory (EPROM), flash memory, and the like.
[0060] In an embodiment, the system (108) may include an interface(s) (206). The interface(s) (206) may comprise a variety of interfaces, for example, interfaces for data input and output (I/O) devices, storage devices, and the like. The interface(s) (206) may also provide a communication pathway for one or more components of the system (108). Examples of such components include, but are not limited to, processing engine(s) (208) and a database (210), where the processing engine(s) (208) may include, but not be limited to, a data parameter engine (212), other engine(s) (214), and a notification engine (216). In an embodiment, the other engine(s) (214) may include, but not be limited to, a data management engine, an input/output engine, or the like.
[0061] In an embodiment, the processing engine(s) (208) may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing engine(s) (208). In examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processing engine(s) (208) may be processor-executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processing engine(s) (208) may comprise a processing resource (for example, one or more processors), to execute such instructions. In the present examples, the machine-readable storage medium may store instructions that, when executed by the processing resource, implement the processing engine(s) (208). In such examples, the system (108) may comprise the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separate but accessible to the system (108) and the processing resource. In other examples, the processing engine(s) (208) may be implemented by electronic circuitry.
[0062] In an embodiment, the processor (202) may receive information via the data parameter engine (212). The processor (202) may receive the information from one or more users (102) and authorize one or more users (102) to access the digital platform and record the information in the digital platform. The processor (202) may store the information in the database (210).
[0063] In an embodiment, the processor (202) may configure a LDAP that includes a profile associated with the one or more users (102) for automatically authorizing the one or more users (102) while accessing the digital platform.
[0064] In an embodiment, the processor (202) may automatically renew network protocols while authorizing the one or more users (102) and generate one or more datasets from the information. Further, the processor (202) may enable AQE to automatically query the one or more datasets for a predetermined period based on the authorization.
[0065] In an embodiment, the processor (202) may analyze the one or more datasets while providing security to the one or more datasets for the predetermined period. The processor (202) may incorporate a user impersonation feature that requests credentials from the one or more users (102) for automatically querying and analyzing the one or more datasets. The processor (202) may include caching of the one or more datasets stored in the memory (204), where the one or more datasets may be stored in a data structure of the memory (204) of a processor (202) configured with the system (108).
[0066] In an embodiment, the processor (202) may generate an alert to the one or more users (102) based on the one or more datasets, where the alert may be based on one or more pre-defined rules configured by the one or more users (102) for the one or more datasets. The processor (202) may determine if a SQL condition is enabled and based on a positive determination, transmit the alert to the one or more users (102) via the notification engine (216). Further, in response to a negative determination, the processor (202) may disable the alert.
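The conditional alerting described above can be illustrated with a minimal sketch. It assumes that "a SQL condition is enabled" means a user-defined SQL condition is configured and evaluates to a truthy value against the dataset; the disclosure does not spell this out. The function name `evaluate_alert`, the `orders` table, and the list-based notifier are hypothetical stand-ins, and SQLite is used only so that the example is self-contained.

```python
import sqlite3
from typing import Callable, Optional


def evaluate_alert(
    conn: sqlite3.Connection,
    sql_condition: Optional[str],
    notify: Callable[[str], None],
    message: str,
) -> bool:
    """Send `message` through `notify` only when the SQL condition holds.

    Returns True if the alert was transmitted, False if it was suppressed.
    """
    # No condition configured: treated here as the negative determination.
    if not sql_condition:
        return False
    row = conn.execute(sql_condition).fetchone()
    # A missing row, NULL, or zero in the first column means "condition not met".
    if row is None or not row[0]:
        return False
    notify(message)
    return True


# An in-memory table stands in for a dataset (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)", [(120.0,), (80.0,)])

sent = []  # collects transmitted alerts in place of a real notification channel
fired = evaluate_alert(
    conn, "SELECT SUM(amount) > 150 FROM orders", sent.append, "Order volume above 150"
)
skipped = evaluate_alert(
    conn, "SELECT SUM(amount) > 500 FROM orders", sent.append, "Order volume above 500"
)
```

In this sketch the first rule transmits its alert and the second is suppressed, which mirrors the positive and negative determinations in the paragraph above.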
[0067] In an embodiment, the processor (202) may generate one or more customized labels associated with the profile that enables the one or more users (102) to filter the one or more datasets associated with the profile.
[0068] FIG. 3 illustrates an example block diagram (300) of a custom role and permission model for enterprise access control, in accordance with an embodiment of the present disclosure.
[0069] As illustrated in FIG. 3, in an embodiment, the system (108) may provide authentication and authorization of users (102). The superset configured with the system (108) may offer its own set of roles and permissions, which can be configured only by a superset administrator through a user interface (UI). However, in an enterprise setting, it may be difficult for the administrator to change/add roles every time a new user is added. In order to overcome this, the system (108) may map the LDAP with internal superset roles via a custom permissions model. These roles may be synchronized every time a user (102) logs in, ensuring that if the user (102) has lost his/her access to an AD group, he/she also does not have the corresponding superset role and permission. This allows users (102) to be easily on-boarded with the digital platform based on their access to pre-configured AD groups.
[0070] Further, in an embodiment, the system (108) may include Kerberos integration for seamless interaction with data sources (e.g., Apache Hive), added to enforce data-source level security. By default, the digital platform may not handle the automatic initialization and renewal of tickets. Hence, the system (108) may include a capability to initialize and renew tickets automatically when querying a service. The system (108) may further include user impersonation, ensuring that the queries submitted to data sources are by the signed-in user itself and hence end-to-end authorization controls may be applied.
[0071] As illustrated in FIG. 3, users (102) may access the digital platform, or superset (302) configured with the system (108) and provide information to the digital platform (302). The digital platform (302) may store this information in an Active Directory (AD) (304). The system (108) may analyze this information and provide an output to the users (102) with dashboards based on an authentication and an authorization of the users (102). The dashboards may include advanced data exploration and visualization based on the analyzed information. The system (108) may determine (306) if the users (102) are present in a login group of the AD (304). Based on a positive determination (308), the system (108) may assign superset roles as per the assigned AD groups to the users (102), where an administration group may be created with complete access and a write group may be created for data sources and dashboards. Further, a read group may be created for reading dashboards, and the users (102) not assigned to any one of these groups may not be provided access to the dashboards.
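The automatic ticket initialization and renewal described above can be sketched generically. This is an illustrative assumption, not the disclosed implementation: the class `TicketCache`, its `renew` callback (which would wrap whatever Kerberos tooling a deployment uses and return the new expiry timestamp), and the safety margin are all hypothetical names introduced here.

```python
import time
from typing import Callable


class TicketCache:
    """Tracks ticket expiry and renews the ticket just before a query needs it."""

    def __init__(
        self,
        renew: Callable[[], float],  # hypothetical: obtains a ticket, returns its expiry time
        clock: Callable[[], float] = time.time,
        margin_seconds: float = 60.0,
    ) -> None:
        self._renew = renew
        self._clock = clock
        self._margin = margin_seconds
        self._expires_at = 0.0  # no ticket has been initialized yet

    def ensure_valid(self) -> bool:
        """Initialize or renew the ticket if needed; return True when a renewal ran."""
        if self._clock() + self._margin < self._expires_at:
            return False  # ticket is still comfortably valid
        self._expires_at = self._renew()
        return True


# Usage with a fake clock so the behaviour is deterministic.
now = [1000.0]
renewals = []


def fake_renew() -> float:
    renewals.append(now[0])
    return now[0] + 600.0  # pretend each ticket lasts ten minutes


tickets = TicketCache(fake_renew, clock=lambda: now[0])
initialized = tickets.ensure_valid()  # first call initializes the ticket
reused = tickets.ensure_valid()       # ticket still valid, nothing to do
now[0] += 590.0                       # move to within the margin of expiry
renewed = tickets.ensure_valid()      # ticket is renewed before it lapses
```

Calling `ensure_valid()` immediately before each query, whether interactive or asynchronous, is one way to keep ticket handling out of the user's way.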
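The login-group check and the mapping from AD groups to roles can be illustrated with a short sketch. The group names, role names, and the function `roles_for_user` are hypothetical; the disclosure names only a login group, an administration group, a write group, and a read group.

```python
from typing import Dict, Iterable, Set

# Hypothetical AD group and role names, chosen for illustration only.
LOGIN_GROUP = "superset-login"
GROUP_TO_ROLE: Dict[str, str] = {
    "superset-admin": "Admin",   # complete access
    "superset-write": "Writer",  # data sources and dashboards
    "superset-read": "Reader",   # read-only dashboards
}


def roles_for_user(ad_groups: Iterable[str]) -> Set[str]:
    """Recompute a user's roles from current AD group membership.

    Intended to run at every login, so that a role disappears as soon as
    the corresponding AD group membership is lost.
    """
    groups = set(ad_groups)
    # Users outside the login group get no roles and therefore no access.
    if LOGIN_GROUP not in groups:
        return set()
    return {role for group, role in GROUP_TO_ROLE.items() if group in groups}


reader = roles_for_user(["superset-login", "superset-read"])
no_login = roles_for_user(["superset-read"])
no_role = roles_for_user(["superset-login", "unrelated-group"])
```

Because the roles are derived from scratch on each call rather than accumulated, the sketch never has to detect a revoked membership explicitly.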
[0072] FIG. 4 illustrates an example representation (400) of caching in the digital platform, in accordance with embodiments of the present disclosure.
[0073] In an embodiment, the system (108) may generate a custom label or tag capability for end-users to define unique customized tag definitions for their dashboards from the UI. Custom tags may be pre-defined and loaded from a metadata database (DB). Tags may allow users (102) to filter their dashboards or group them together and may include a periodic cache warmup strategy implementation.
[0074] In an embodiment, the system (108) may include superset scalability where a web load balancer service may allow a cluster of superset instances to be hosted easily and to cater to heavy traffic, multi-tenant, and high concurrency scenarios. The system (108) may include a distributed cache service (e.g., a Redis cluster) that allows necessary data from data sources (databases) to be cached and avoids repetitive calls to the underlying data source.
[0075] As illustrated in FIG. 4, in an embodiment, the system (108) may receive information from the users (102). The system (108) may store information in the cache (402). The digital platform (404) configured with the system (108) may access the information from the cache (402) and analyze the information using data source (404). The system (108) may store the analyzed information in a Metastore (406).
[0076] In an embodiment, the system (108) may receive a normal request from the users (102) and process the request using memoization-based caching. Further, the system (108) may use an advanced auto-refresh (warm up) caching based on an advanced request from the users (102).
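The two request paths described above may be illustrated, under stated assumptions, by the following sketch. It uses an in-process dictionary in place of the distributed cache, keys each entry by a hash of the query text, and applies a fixed time-to-live. The class name, method names, and expiry policy are assumptions made for illustration only.

```python
import hashlib
import time


class QueryCache:
    """Memoize query results by a hash of the SQL text, with expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expiry_time, result)

    @staticmethod
    def key_for(sql):
        return hashlib.sha256(sql.encode("utf-8")).hexdigest()

    def get_or_compute(self, sql, run_query):
        """Normal request: return a fresh cached result, else run the query."""
        key = self.key_for(sql)
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        result = run_query(sql)
        self._store[key] = (now + self.ttl, result)
        return result

    def warm(self, sql, run_query):
        """Warm-up request: refresh the entry unconditionally."""
        result = run_query(sql)
        self._store[self.key_for(sql)] = (self.clock() + self.ttl, result)
        return result
```

In this sketch, `get_or_compute` corresponds to the normal request and touches the data source only on a miss or after expiry, whereas `warm` corresponds to the auto-refresh path and always re-queries so that later readers find a fresh entry.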
[0077] FIGs. 5A-5C illustrate example representations (500A, 500B, 500C) of custom cache warmup strategies, in accordance with embodiments of the present disclosure.
[0078] In an embodiment, the system (108) may use AQE via a distributed task queue (e.g., celery beat and celery worker processes) to pre-populate the cache from data sources based on custom user-defined label (or tag) values, a step also called the cache warm-up process, and thereby reduce the load on the superset process. Asynchronous execution of cache warmup queries may be based on a thread pool within a celery worker for a particular dashboard. This may let one celery worker execute multiple queries that belong to the same dashboard at a time. The pool size may be configurable, which may allow the user (102) to control the number of queries triggered at a time, thereby improving dashboard rendering performance significantly.
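A minimal sketch of the per-dashboard thread pool described above is given below, using only the Python standard library rather than an actual celery worker. The function name `warm_dashboard`, the mapping of chart identifiers to query text, and the default pool size of four are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def warm_dashboard(chart_queries, run_query, pool_size=4):
    """Run all warm-up queries for one dashboard with bounded concurrency.

    chart_queries maps a chart identifier to its SQL text. At most
    pool_size queries are in flight at once. Returns a dict mapping each
    chart identifier to its result, or to the exception that was raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = {
            chart_id: pool.submit(run_query, sql)
            for chart_id, sql in chart_queries.items()
        }
        for chart_id, future in futures.items():
            try:
                results[chart_id] = future.result()
            except Exception as exc:  # one failing chart must not stop the rest
                results[chart_id] = exc
    return results
```

A larger `pool_size` shortens the warm-up of a dashboard with many charts but places more concurrent load on the data source, which is why the value is left configurable.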
[0079] As illustrated in FIG. 5A, in an embodiment, the system (108) may receive information from the users (102). The system (108) may store information in the cache (502). The digital platform (504) configured with the system (108) may access the information from the cache (502) and use a celery beat (warm up scheduler) (504) to allocate scheduling of tasks via celery workers (506A, 506B, 506C). The scheduled tasks may be provided to a data source (508) for analysis. Further, the system (108) may store the analyzed information in a Metastore (510).
[0080] In an embodiment, the AQE and custom tags may allow a customized cache warmup strategy on the basis of different time slots, such as hourly or daily at a particular hour/epoch. This defines when the cache for a specific dashboard may be refreshed from the data source. This allows the necessary datasets for the dashboard in question to be prepared and ready before the users (102) log in. Further, a sample dashboard tag application is illustrated in FIG. 5A and dashboards with a daily refresh cache warmup strategy are illustrated in FIG. 5C. Together these improve the responsiveness of visualizations (charts and/or dashboards) and also reduce the work to be done by the superset instance and the underlying data source(s), thereby improving the experience of the user (102).
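To illustrate how tag values may drive the warm-up schedule described above, the following sketch selects the dashboards whose cache is due at a given scheduler tick. The tag vocabulary (`hourly`, and `daily@HH` for a fixed hour) and the assumption that the scheduler ticks at the top of every hour are hypothetical; the disclosure does not define a tag syntax.

```python
from datetime import datetime


def is_due(tag, now):
    """Decide whether a dashboard with this warm-up tag is due at `now`.

    Hypothetical tag vocabulary: "hourly", or "daily@HH" for a fixed hour.
    """
    if tag == "hourly":
        return now.minute == 0
    if tag.startswith("daily@"):
        hour = int(tag.split("@", 1)[1])
        return now.hour == hour and now.minute == 0
    return False  # unknown or unrelated tag: never warmed automatically


def dashboards_to_warm(dashboard_tags, now):
    """Return the sorted identifiers of dashboards whose cache is due."""
    return sorted(
        dash_id
        for dash_id, tags in dashboard_tags.items()
        if any(is_due(tag, now) for tag in tags)
    )
```

Tags that do not name a warm-up strategy are ignored by the selection, so the same tag mechanism can continue to serve filtering and grouping of dashboards.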
[0081] FIG. 6 illustrates an example block diagram (600) of an Asynchronous Query Execution (AQE) user impersonation feature, in accordance with an embodiment of the present disclosure.
[0082] In an embodiment, the AQE user impersonation feature reinforces security while executing the cache warmup AQE task for a chart. The system (108) includes a capability to execute the said AQE tasks as the respective chart owner. This approach ensures end-to-end security of the data source and the charts/dashboards. Further, this approach is different from the open-source superset offering, where AQE tasks execute using a pre-configured service account. The service account may be passed in the superset configuration with access to all the underlying data sources/datasets, which adds a security vulnerability and may lead to data leak issues. FIG. 6 illustrates the AQE user impersonation feature with Hive. As illustrated in FIG. 6, in an embodiment, the system (108) may receive information from the users (102). The system (108) may store information in the cache (602). The digital platform (604) configured with the system (108) may access the information from the cache (602) and use a celery beat (warm up scheduler) (604) to allocate scheduling of tasks via celery workers (606A, 606B, 606C). The scheduled tasks may be provided to a data source (608) for analysis. Further, the system (108) may store the analyzed information in a Metastore (610).
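The difference between executing a warm-up task under a shared service account and executing it as the chart owner may be sketched as follows. The sketch only builds connection settings and does not open a connection. The function name and the dictionary layout are assumptions made for illustration; `hive.server2.proxy.user` is the HiveServer2 proxy-user session setting.

```python
def connection_config(base_config, chart_owner, impersonate=True):
    """Return connection settings for a warm-up task on one chart.

    With impersonation, the query is submitted on behalf of the chart
    owner, so the data source applies that user's own permissions.
    The dictionary layout is an assumption of this sketch.
    """
    config = dict(base_config)  # never mutate the shared base settings
    session = dict(config.get("configuration", {}))
    if impersonate:
        session["hive.server2.proxy.user"] = chart_owner
    config["configuration"] = session
    return config
```

Because the shared base settings are copied rather than modified, concurrent warm-up tasks for charts owned by different users cannot leak one owner's identity into another task.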
[0083] FIGs. 7A-7I illustrate example representations (700A, 700B, 700C, 700D, 700E, 700F, 700G, 700H, 700I) of advanced notifications and alerts transmitted by the system (108), in accordance with embodiments of the present disclosure.
[0084] In an embodiment, the system (108) may generate advanced notifications and alerts. The system (108) may include an embedded notification engine that defines and schedules periodic (crontab based) reports. The embedded notification engine sends a custom dashboard or chart to users (102) over a configured communication channel such as Slack, email, or Short Message Service (SMS). FIG. 7A illustrates a reports summary view, FIG. 7B illustrates a report definition, and FIG. 7C illustrates a report execution view generated by the system (108) using the embedded notification engine.
[0085] In an embodiment, the system (108) may define custom SQL rule conditions and periodic checks (crontab based) to determine if the SQL condition is met. During these checks, only when the condition is true, the chart or dashboard may be sent to the users (102) over a configured communication channel such as Slack, email, or SMS. FIG. 7D illustrates an alerts summary view, FIG. 7E illustrates a sample alert definition with SQL pre-condition, and FIG. 7F illustrates a sample alert execution view generated by the system (108). Further, FIG. 7G illustrates a sample alert email generated by the system (108).
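A single periodic check of an SQL-conditioned alert, as described above, may be sketched as follows. An in-memory SQLite database stands in for the data source, and the notification channel is represented by an injected `send` callable. The alert dictionary layout, the function names, and the convention that a non-zero first value means the condition is met are assumptions of this sketch.

```python
import sqlite3


def condition_met(conn, condition_sql):
    """Run the rule's SQL and treat a non-zero first value as 'met'.

    An empty result set counts as not met.
    """
    row = conn.execute(condition_sql).fetchone()
    return bool(row) and bool(row[0])


def check_alert(conn, alert, send):
    """Evaluate one alert; call send(channel, chart) only when it fires."""
    if condition_met(conn, alert["sql"]):
        send(alert["channel"], alert["chart"])
        return True
    return False
```

A scheduler would call `check_alert` at each crontab tick; when the condition is false, nothing is sent and the alert simply waits for the next tick.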
[0086] In an embodiment, the system (108) may generate custom usage audit reports that allow for easier monitoring of superset usage on a per-user basis, as illustrated in FIG. 7H.
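As a minimal illustration of a per-user usage audit, the following sketch aggregates logged actions by user. The event layout, a pair of user and action, is hypothetical and is not taken from the disclosure.

```python
from collections import Counter


def usage_by_user(events):
    """Count logged actions per user, most active first.

    Each event is a (user, action) pair; the log layout is hypothetical.
    """
    counts = Counter(user for user, _action in events)
    return counts.most_common()
```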
[0087] Further, in an embodiment, the system (108) may include a celery flower process that monitors a backend celery task queue, celery beat, and celery worker daemon processes associated with scheduling of various tasks received from the user (102). FIG. 7I illustrates a celery flower dashboard generated by the system (108).
[0088] FIG. 8 illustrates an example representation of a high-level scalable architecture (800) of the system (108) for superset with caching and AQE, in accordance with an embodiment of the present disclosure.
[0089] With reference to FIG. 8, a high-level system architecture (800) for the proposed system (108) may include a plurality of data analysts (802-1, 802-2, … 802-N) associated with a plurality of first computing devices (104) and further associated with an authentication and authorization service module (804), which may be an out-of-the-box superset offering its own set of roles and permissions that may be configured only by a superset administrator through a superset UI. However, in an enterprise setting, it becomes difficult for an administrator to change or add roles every time a new user is added. To overcome this, LDAP groups may be mapped to the internal superset roles via a custom permissions model. These roles may be synchronized every time a user (102) logs in, ensuring that if a user (102) has lost his/her access to the authentication and authorization service module (804), he/she also does not have the corresponding superset role and permission. This allows users (102) to be easily on-boarded on the system (108) based on their access to pre-configured authentication and authorization service module groups. The authentication and authorization service module (804) may be further coupled to a Web Load Balancer Service module (806). In an embodiment, the Web Load Balancer Service module (806) allows a cluster of superset instances (808) to be hosted easily and to cater to heavy traffic, multi-tenant, high concurrency scenarios. The superset cluster (808) may send memoization cache-query responses to a distributed cache cluster (810) equipped with a Memcache. The distributed cache cluster (810) may be further coupled with supported data sources (812) that include relational databases (812-1), a distributed search (812-2), a data warehouse (812-3), and other data providers (812-4). This may allow necessary data from the data sources (databases) to be cached and may avoid repetitive calls to the underlying data source.
The distributed cache cluster (810) may be further coupled to a celery beat task scheduler (820) that provides asynchronous query execution for cache warmup. For example, AQE and custom tags may allow a customized cache warmup strategy on the basis of different time slots, such as hourly or daily at a particular hour/epoch. This may define when the cache for a specific dashboard should be refreshed from the data source. Further, this may allow the necessary datasets for the dashboard in question to be prepared and ready before the end users (814) log in. The distributed cache cluster (810) may read the schedule for cache warmup from a Metastore (818) and set task execution details. The superset cluster (808) may set the schedule for cache warmup from dashboards to the Metastore (818). The Metastore (818) may further read celery task execution logs from an observability, telemetry, and monitoring module (816).
[0090] FIG. 9 illustrates an exemplary computer system (900) in which or with which embodiments of the present disclosure may be implemented.
[0091] As shown in FIG. 9, the computer system (900) may include an external storage device (910), a bus (920), a main memory (930), a read-only memory (940), a mass storage device (950), a communication port(s) (960), and a processor (970). A person skilled in the art will appreciate that the computer system (900) may include more than one processor and communication ports. The processor (970) may include various modules associated with embodiments of the present disclosure. The communication port(s) (960) may be any of an RS-232 port for use with a modem-based dialup connection, a 10/100 Ethernet port, a Gigabit or 10 Gigabit port using copper or fiber, a serial port, a parallel port, or other existing or future ports. The communication port(s) (960) may be chosen depending on a network, such as a Local Area Network (LAN), Wide Area Network (WAN), or any network to which the computer system (900) connects.
[0092] In an embodiment, the main memory (930) may be Random Access Memory (RAM), or any other dynamic storage device commonly known in the art. The read-only memory (940) may be any static storage device(s) e.g., but not limited to, a Programmable Read Only Memory (PROM) chip for storing static information e.g., start-up or basic input/output system (BIOS) instructions for the processor (970). The mass storage device (950) may be any current or future mass storage solution, which can be used to store information and/or instructions. Exemplary mass storage solutions include, but are not limited to, Parallel Advanced Technology Attachment (PATA) or Serial Advanced Technology Attachment (SATA) hard disk drives or solid-state drives (internal or external, e.g., having Universal Serial Bus (USB) and/or Firewire interfaces).
[0093] In an embodiment, the bus (920) may communicatively couple the processor(s) (970) with the other memory, storage, and communication blocks. The bus (920) may be, e.g., a Peripheral Component Interconnect (PCI) / PCI Extended (PCI-X) bus, Small Computer System Interface (SCSI), USB, or the like, for connecting expansion cards, drives, and other subsystems as well as other buses, such as a front side bus (FSB), which connects the processor (970) to the computer system (900).
[0094] In another embodiment, operator and administrative interfaces, e.g., a display, keyboard, and cursor control device may also be coupled to the bus (920) to support direct operator interaction with the computer system (900). Other operator and administrative interfaces can be provided through network connections connected through the communication port(s) (960). Components described above are meant only to exemplify various possibilities. In no way should the aforementioned exemplary computer system (900) limit the scope of the present disclosure.
[0095] While considerable emphasis has been placed herein on the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the disclosure. These and other changes in the preferred embodiments of the disclosure will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be implemented merely as illustrative of the disclosure and not as a limitation.

ADVANTAGES OF THE INVENTION
[0096] The present disclosure provides a fault tolerant scalable data visualization and intelligence platform for building and realizing data insights at large scale from any SQL enabled datastore.
[0097] The present disclosure provides seamless integration for network authentication protocol ticket caches, making network authentication protocol enabled services function smoothly both in interactive and Asynchronous Query Execution (AQE) modes.
[0098] The present disclosure provides a pluggable architecture for most of the components while providing flexibility to replace one service implementation without affecting an end-behaviour of the data visualization and intelligence platform.
[0099] The present disclosure provides a heavily supported platform that may be used by an open-source developer community around the world, which readily incorporates new features and fixes bugs.
[00100] The present disclosure provides the data visualization and intelligence platform that is easy to maintain and monitor due to custom auditing and monitoring features.
CLAIMS:1. A system (108) for data exploration and visualization, the system (108) comprising:
a processor (202) communicatively coupled to a digital platform;
a memory (204) operatively coupled with the processor (202), wherein said memory (204) stores instructions which, when executed by the processor (202), cause the processor (202) to:
authorize one or more users (102) to access the digital platform and record information in the digital platform;
automatically renew network protocols while authorizing the one or more users (102), and generate one or more datasets from the information;
enable Asynchronous Query Execution (AQE) to automatically query the one or more datasets for a predetermined period;
analyze the one or more datasets while providing security for the one or more datasets for the predetermined period; and
generate an alert for the one or more users (102) based on the analysis of the one or more datasets, wherein the alert is based on one or more pre-defined rules configured by the one or more users (102) for the one or more datasets.
2. The system (108) as claimed in claim 1, wherein the processor (202) is to send a request for credentials to the one or more users (102) for automatically querying and analyzing the one or more datasets.
3. The system (108) as claimed in claim 1, wherein the processor (202) is to determine if a Structured Query Language (SQL) condition is enabled, and based on a positive determination, transmit the alert to the one or more users (102).
4. The system (108) as claimed in claim 3, wherein in response to a negative determination, the processor (202) is to disable the alert.
5. The system (108) as claimed in claim 1, wherein the processor (202) is to configure a Light-Weight Directory Access Protocol (LDAP) that comprises a profile associated with the one or more users (102) for automatically authorizing the one or more users (102) to access the digital platform.
6. The system (108) as claimed in claim 5, wherein the processor (202) is to generate one or more customized labels associated with the profile that enables the one or more users (102) to filter the one or more datasets associated with the profile.
7. The system (108) as claimed in claim 1, wherein the processor (202) is to prioritize caching of the one or more datasets stored in the memory (204).
8. A method for data exploration and visualization, the method comprising:
authorizing, by a processor (202) associated with a system (108), one or more users (102) to access a digital platform and record information in the digital platform;
automatically renewing, by the processor (202), network protocols while authorizing the one or more users (102) and generating one or more datasets from the information;
enabling, by the processor (202), Asynchronous Query Execution (AQE) to automatically query the one or more datasets for a predetermined period;
analyzing, by the processor (202), the one or more datasets while providing security for the one or more datasets for the predetermined period; and
generating, by the processor (202), an alert for the one or more users (102) based on the one or more datasets, wherein the alert is based on one or more pre-defined rules configured by the one or more users (102) for the one or more datasets.
9. The method as claimed in claim 8, comprising sending, by the processor (202), a request for credentials to the one or more users (102) for automatically querying and analyzing the one or more datasets.
10. The method as claimed in claim 8, comprising determining, by the processor (202), if a Structured Query Language (SQL) condition is enabled, and based on a positive determination, transmitting, by the processor (202), the alert to the one or more users (102).
11. The method as claimed in claim 10, wherein in response to a negative determination, the method comprises disabling, by the processor (202), the alert.
12. The method as claimed in claim 8, comprising configuring, by the processor (202), a Light-Weight Directory Access Protocol (LDAP) that comprises a profile associated with the one or more users (102) for automatically authorizing the one or more users (102) to access the digital platform.
13. The method as claimed in claim 12, comprising generating, by the processor (202), one or more customized labels associated with the profile that enables the one or more users (102) to filter the one or more datasets associated with the profile.

Documents

Application Documents

# Name Date
1 202321006296-STATEMENT OF UNDERTAKING (FORM 3) [31-01-2023(online)].pdf 2023-01-31
2 202321006296-PROVISIONAL SPECIFICATION [31-01-2023(online)].pdf 2023-01-31
3 202321006296-POWER OF AUTHORITY [31-01-2023(online)].pdf 2023-01-31
4 202321006296-FORM 1 [31-01-2023(online)].pdf 2023-01-31
5 202321006296-DRAWINGS [31-01-2023(online)].pdf 2023-01-31
6 202321006296-DECLARATION OF INVENTORSHIP (FORM 5) [31-01-2023(online)].pdf 2023-01-31
7 202321006296-ENDORSEMENT BY INVENTORS [31-01-2024(online)].pdf 2024-01-31
8 202321006296-DRAWING [31-01-2024(online)].pdf 2024-01-31
9 202321006296-CORRESPONDENCE-OTHERS [31-01-2024(online)].pdf 2024-01-31
10 202321006296-COMPLETE SPECIFICATION [31-01-2024(online)].pdf 2024-01-31
11 202321006296-FORM-8 [29-02-2024(online)].pdf 2024-02-29
12 202321006296-FORM 18 [08-03-2024(online)].pdf 2024-03-08
13 Abstract1.jpg 2024-04-20
14 202321006296-FER.pdf 2025-08-08
15 202321006296-FORM 3 [08-11-2025(online)].pdf 2025-11-08

Search Strategy

1 202321006296_SearchStrategyNew_E_202315041189E_20-06-2025.pdf