
Method And System For Providing Indirect Visualization Access Of A Data Lake

Abstract: A method (400) and system (100) of providing indirect and visualization access to data of a data-lake (114) is disclosed. A user request from a user to access the data of the data lake is received. A user profile from a plurality of user profiles associated with the user is determined based on a first level of authentication. The first level of authentication is based on the domain object name and the public key. A user defined function (UDF) from a plurality of predefined UDFs associated with the user profile is determined based on the first level of authentication. A portion of the data of the data-lake (114) requested in the user request is selectively rendered based on a second level of authentication. The second level of the authentication is based on the domain object name and the private key. (To be published with FIG. 1)


Patent Information

Application #
Filing Date
05 April 2024
Publication Number
18/2024
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

HCL Technologies Limited
806, Siddharth, 96, Nehru Place, New Delhi - 110019, INDIA

Inventors

1. Himanshu Dubey
A8/9, Sector -60, Noida, U.P. -201301, INDIA
2. Prathameshwar Pratap Singh
A8/9, Sector -60, Noida, U.P. -201301, INDIA
3. Yogesh Gupta
A8/9, Sector -60, Noida, U.P.- 201301, INDIA

Specification

DESCRIPTION
Technical Field
[001] This disclosure relates generally to data-lake management, and more particularly to a method and system for providing indirect visualization access of a data-lake.
BACKGROUND
[002] A data lake is a centralized repository designed to store, process, and secure large amounts of structured, semi-structured, and unstructured data. Data lakes are open format, low cost, and highly durable. They can be used for data analytics, business intelligence, and machine learning. Because data lakes are used so widely, it is crucial to control access to the private data stored in them in order to maintain its confidentiality. Current methods of accessing data lakes include setting up semantic layer applications through which data may be accessed. However, such access methods require domain knowledge of access tools and a deep understanding of the actual data housed within the data lake. These methods offer only limited ways to secure data access by a user.
[003] Existing solutions suggest utilization of a semantic layer that may control how data is accessed or queried. However, such access methodologies do not prevent access to data of the data lake, including private data. Since data of the data-lake is made visible to the semantic layer, there is an increased chance of the data being leaked or manipulated.
[004] Therefore, there is a need for a method and system for providing indirect visualization access of a data-lake.
SUMMARY OF THE INVENTION
[005] In an embodiment, a method for providing indirect and visualization access to data of a data-lake is disclosed. The method may include receiving, by a processor via a semantic schema layer (SSL), a user request from a user to access the data of the data-lake. In an embodiment, the user request may include a domain object name, a public key and a private key. The method may further include determining, by the processor via the SSL, a user profile from a plurality of user profiles associated with the user based on a first level of authentication. In an embodiment, the plurality of user profiles may correspond to a plurality of users. In an embodiment, the first level of authentication may be based on the domain object name and the public key. The method may further include determining, by the processor via a user access layer (UAL), a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile based on the first level of authentication. The method may further include selectively rendering, by the processor via the UAL, a portion of the data of the data-lake requested in the user request based on a second level of authentication. In an embodiment, the second level of authentication may be based on the domain object name and the private key. In an embodiment, the portion of the data may be retrieved from the data-lake based on the UDFs and the second level of authentication.
[006] In another embodiment, a system for providing indirect and visualization access to data of a data-lake is disclosed. The system may include a processor, and a memory coupled to the processor. The memory stores processor-executable instructions, which, on execution, cause the processor to receive a user request from a user via a semantic schema layer (SSL) to access the data of the data-lake. In an embodiment, the user request may include a domain object name, a public key, and a private key. The processor may further determine a user profile from a plurality of user profiles associated with the user via the SSL based on a first level of authentication. In an embodiment, the plurality of user profiles may correspond to a plurality of users. In an embodiment, the first level of authentication may be based on the domain object name and the public key. The processor may further determine a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile via a user access layer (UAL) based on the first level of authentication. The processor may further selectively render a portion of the data of the data-lake requested in the user request via the UAL based on a second level of authentication. In an embodiment, the second level of authentication may be based on the domain object name and the private key. In an embodiment, the portion of the data may be retrieved from the data-lake based on the UDFs and the second level of authentication.
[007] In another embodiment, a non-transitory computer-readable medium storing computer-executable instructions for providing indirect and visualization access to data of a data-lake is disclosed. The computer-executable instructions may be configured for receiving, via a semantic schema layer (SSL), a user request from a user to access the data of the data-lake. In an embodiment, the user request may include a domain object name, a public key, and a private key. The computer-executable instructions may be further configured for determining, via the SSL, a user profile from a plurality of user profiles associated with the user based on a first level of authentication. In an embodiment, the plurality of user profiles may correspond to a plurality of users. In an embodiment, the first level of authentication may be based on the domain object name and the public key. The computer-executable instructions may be further configured for determining, via a user access layer (UAL), a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile based on the first level of authentication. The computer-executable instructions may be further configured for selectively rendering, via the UAL, a portion of the data of the data-lake requested in the user request based on a second level of authentication. In an embodiment, the second level of authentication may be based on the domain object name and the private key. In an embodiment, the portion of the data may be retrieved from the data-lake based on the UDFs and the second level of authentication.
[008] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[009] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles.
[010] FIG. 1 illustrates a block diagram of a system for providing indirect visualization access of a data-lake, in accordance with an embodiment of the present disclosure.
[011] FIG. 2 illustrates a functional block-diagram of a computing device 102 for providing indirect visualization access of a data-lake, in accordance with an embodiment of the present disclosure.
[012] FIG. 3A illustrates an exemplary dataset housed in a data-lake, in accordance with an embodiment of the present disclosure.
[013] FIG. 3B illustrates an exemplary portion of the dataset as selectively rendered to a user, in accordance with an embodiment of the present disclosure.
[014] FIG. 4 illustrates a flowchart of a method of providing indirect and visualization access of a data-lake, in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION OF THE DRAWINGS
[015] Exemplary embodiments are described with reference to the accompanying drawings. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following claims. Additional illustrative embodiments are listed.
[016] The present disclosure provides a system and a method for providing indirect visualization access of a data-lake in order to overcome the issues of the conventional arts. FIG. 1 illustrates a block diagram of a system 100 for providing indirect and visualization access to data of a data-lake 114, in accordance with an embodiment of the present disclosure. The system 100 may include a computing device 102, a user authentication database 112, and the data-lake 114 communicably coupled to each other through a communication network 110. The computing device 102 may include a processor 104, a memory 106, and input/output (I/O) devices 108.
[017] In an embodiment, examples of processor(s) 104 may include, but are not limited to, an Intel® Itanium® or Itanium 2 processor(s), or AMD® Opteron® or Athlon MP® processor(s), Motorola® lines of processors, Nvidia®, FortiSOC™ system on a chip processors or other processors that can be used to execute similar functions.
[018] In an embodiment, the memory 106 may store instructions that, when executed by the processor 104, may cause the processor 104 to provide indirect and visualization access of the data-lake 114, as discussed in more detail below. In an embodiment, the memory 106 may be a non-volatile memory or a volatile memory. Examples of non-volatile memory may include, but are not limited to, a flash memory, a Read Only Memory (ROM), a Programmable ROM (PROM), Erasable PROM (EPROM), and Electrically Erasable PROM (EEPROM) memory. Further, examples of volatile memory may include, but are not limited to, Dynamic Random Access Memory (DRAM), and Static Random-Access Memory (SRAM).
[019] In an embodiment, the I/O devices 108 may include a variety of interface(s), for example, interfaces for data input and output devices, and the like. The I/O devices 108 may facilitate inputting of instructions to the computing device 102 by a user. In an embodiment, the I/O devices 108 may be wirelessly connected to the computing device 102 through wireless network interfaces such as Bluetooth®, infrared, Wi-Fi, or any other wireless communication technology known in the art. In an embodiment, the I/O devices 108 may be connected to a communication pathway for one or more components of the computing device 102 to facilitate the transmission of inputted instructions and output results of data generated by various components such as, but not limited to, processor(s) 104 and memory 106. In an embodiment, the data-lake 114 may be enabled in a cloud or may be a physical database. The data-lake 114 may store structured data, unstructured data or a combination thereof.
[020] In an embodiment, the communication network 110 may be a wired or a wireless network or a combination thereof. The network 110 can be implemented as one of the different types of networks, such as but not limited to, ethernet IP network, intranet, local area network (LAN), wide area network (WAN), the internet, Wi-Fi, LTE network, CDMA network, 5G and the like. Further, the network 110 can either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network 110 can include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.
[021] In an embodiment, the computing device 102 may receive a request to provide indirect visualization access of the data-lake 114 from a user using an I/O device 108. In an embodiment, the computing device 102 may be a computing system, including but not limited to, a smart phone, a laptop computer, a desktop computer, a notebook, a workstation, a portable computer, a handheld device or a mobile device.
[022] In an embodiment, the indirect visualization access of the data of the data lake 114, may be based on an authentication of a registered user profile by a semantic schema layer (SSL). In an embodiment, the SSL may act as an interface between a user and the data-lake 114. Further, a user profile may be registered by a user access layer (UAL) different from the SSL, based on defining a domain object name, a public key and a private key corresponding to each of the plurality of user profiles. In an embodiment, each of the plurality of user profiles may include a unique domain object name, a public key and one or more private keys. In an embodiment, the user authentication database 112 may store data corresponding to each of the plurality of user profiles generated based on the registration of one or more users.
[023] In an embodiment, the user profiles may be registered based on registration of users by the computing device 102 in order to access data of the data-lake 114. In an embodiment, the user authentication database 112 may include data of the plurality of user profiles of the users registered. In an embodiment, an administrator may register one or more registered users and may create one or more user profiles. In an embodiment, an administrator may include, but is not limited to, owner of an organization, head of a department, or a manager, etc.
[024] In an embodiment, the registration of the user profile may further include creation of a plurality of user defined functions (UDFs) associated with the user profile. In an embodiment, the plurality of UDFs associated with the user profile may be created via the UAL. In an embodiment, the UAL may be different from the SSL. In an embodiment, each of the plurality of UDFs may be stored in the data-lake 114 and may be determined via a query management layer (QML) of the data lake 114. In an embodiment, the plurality of UDFs may be created to provide indirect visualization access of the data of the data lake 114 via the QML. In an embodiment, the UDFs may be dynamically created based on the domain object name associated with the user profile and stored in the data-lake 114.
[025] Further, the computing device 102 may create, via the UAL, for each of the plurality of UDFs, a unique query, and a unique schema. In an embodiment, a corresponding query and a corresponding schema associated with the UDF may be created based on an access level from a plurality of access levels associated with the user profile. In an embodiment, a plurality of access levels may be defined based on designation of a user in an entity or as defined by the administrator. In an embodiment, the plurality of access levels may include, but are not limited to, an executive access, a supervisory access and an administrator access. In another embodiment, an access type associated with the plurality of user profiles may include a public access, a private access and a protected access.
[026] Accordingly, based on the registration of the user, a user profile may be created that may include a domain object name that may be assigned one of the plurality of access levels and an access type by an administrator via the UAL and saved in the user authentication database 112. Further, a plurality of UDFs may be created by the UAL associated with the corresponding user profile based on the access level and the access type associated with the user profile. In an embodiment, creation of each of the plurality of UDFs may include creation of a unique query and a unique schema. In an embodiment, a corresponding query and a corresponding schema associated with the UDF may be created based on the access level from the plurality of access levels associated with the user profile.
[027] Further, the creation of the user profile may include defining a public key and one or more private keys. In an embodiment, each UDF from the plurality of UDFs may be associated with a corresponding unique private key. In an embodiment, each of the plurality of UDFs may be accessed based on the corresponding unique private key. Notably, direct access to the data lake 114 may not be necessary for the creation of the user profile. In an embodiment, the memory 106 or the user authentication database 112 may save the plurality of user profiles generated based on the registration of the users. In an embodiment, the UDFs associated with each of the registered user profiles created for the registered users may be saved in the data-lake 114.
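By way of a non-limiting illustration, the registration described above may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `UserProfile`, `register_profile` and `attach_udf` are hypothetical, the in-memory `registry` dictionary stands in for the user authentication database 112, and both keys are treated as opaque random tokens because the disclosure does not specify a key generation algorithm.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserProfile:
    """Hypothetical record for one registered user profile."""
    domain_object_name: str
    public_key: str
    access_level: str   # e.g. "executive", "supervisory", "administrator"
    access_type: str    # "public", "protected" or "private"
    # One unique private key per UDF name (assumed storage layout).
    udf_private_keys: Dict[str, str] = field(default_factory=dict)


def register_profile(registry: Dict[str, UserProfile], domain_object_name: str,
                     access_level: str, access_type: str) -> UserProfile:
    """Create a profile with a unique domain object name and a fresh public key."""
    if domain_object_name in registry:
        raise ValueError("domain object name must be unique")
    profile = UserProfile(domain_object_name, secrets.token_urlsafe(16),
                          access_level, access_type)
    registry[domain_object_name] = profile
    return profile


def attach_udf(profile: UserProfile, udf_name: str) -> str:
    """Associate a UDF with the profile and return its newly generated private key."""
    if udf_name in profile.udf_private_keys:
        raise ValueError("UDF already registered for this profile")
    private_key = secrets.token_urlsafe(32)
    profile.udf_private_keys[udf_name] = private_key
    return private_key
```

Note that nothing in this sketch touches the data lake itself, which is consistent with the observation that direct access to the data lake 114 may not be necessary for creating the user profile.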
[028] In an embodiment, in order to provide indirect and visualization access of the data of the data-lake 114, the computing device 102 may perform various processing as described further. By way of an example, the computing device 102 may receive, via the SSL, a user request from a user to access the data of the data lake 114. In an embodiment, the user request may include a domain object name, a public key, and/or a private key.
[029] Further, the computing device 102 may determine, via the SSL, a user profile from a plurality of user profiles associated with the user based on a first level of authentication. In an embodiment, the plurality of user profiles may correspond to a plurality of registered users. In an embodiment, the first level of authentication may be based on the domain object name and the public key. In an embodiment, the first level of authentication may be performed to verify a user requesting the data access of the data lake 114. In an embodiment, the SSL may verify the public key input by the user based on the public key stored in the user authentication database 112 for the corresponding domain object name.
[030] Further, the computing device 102 may determine, via the UAL, one or more UDFs associated with the user profile based on the first level of authentication. Further, the computing device 102 may selectively render, via the UAL, a portion of the data of the data lake 114 requested in the user request based on a second level of authentication. In an embodiment, the second level of the authentication may be based on the domain object name and the private key. In an embodiment, the portion of the data may be retrieved from the data-lake 114 based on the UDFs and the second level of authentication. Accordingly, the computing device 102 may determine the UDF based on a comparison of the private key input by the user with the private key stored in the user authentication database 112 associated with the corresponding UDFs associated with the user profile.
[031] In an embodiment, the portion of the data may be retrieved from the data-lake 114 by establishing a connection between the UAL and the QML and by executing the corresponding query associated with the UDF. It is to be noted that the SSL may facilitate only the first level of authentication and the second level of authentication. Based on the first level of authentication, the corresponding one or more UDFs associated with the user profile stored in the data-lake 114 may be determined. Further, based on the second level of authentication the query associated with one UDF from the one or more UDFs may be executed.
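The two levels of authentication described above may be sketched as follows. This is an illustrative sketch only: `AUTH_DB`, the sample keys and the function names are hypothetical, the dictionary stands in for the user authentication database 112, and plain stored keys are assumed purely to keep the example short.

```python
import hmac
from typing import Optional

# Hypothetical stand-in for the user authentication database 112.
AUTH_DB = {
    "emp-1001": {
        "public_key": "pub-abc",
        # One private key per UDF registered for this profile.
        "udf_keys": {"sales_summary": "priv-111", "sales_detail": "priv-222"},
    }
}


def _same(a: str, b: str) -> bool:
    """Constant-time comparison of two key strings."""
    return hmac.compare_digest(a.encode(), b.encode())


def first_level_auth(domain_object_name: str, public_key: str) -> Optional[dict]:
    """Return the user profile if the name and public key match, else None."""
    profile = AUTH_DB.get(domain_object_name)
    if profile is None or not _same(profile["public_key"], public_key):
        return None
    return profile


def second_level_auth(profile: dict, private_key: str) -> Optional[str]:
    """Return the name of the UDF whose private key matches, else None."""
    for udf_name, stored_key in profile["udf_keys"].items():
        if _same(stored_key, private_key):
            return udf_name
    return None


def resolve_udf(domain_object_name: str, public_key: str,
                private_key: str) -> Optional[str]:
    """Run both levels in order; only a fully authenticated request yields a UDF."""
    profile = first_level_auth(domain_object_name, public_key)
    if profile is None:
        return None
    return second_level_auth(profile, private_key)
```

In this sketch the first level narrows the request to one profile and its candidate UDFs, and the second level selects the single UDF whose query would then be executed.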
[032] In an embodiment, the corresponding query of the UDF from the plurality of UDFs may enable selective retrieval of the portion of the data of the data-lake 114 by encrypting, masking, and hiding of the data of the data-lake 114. In an embodiment, the corresponding schema of the UDF from the plurality of UDFs may enable the selective rendering of the portion of the data of the data-lake 114 based on a data representation format. The data representation format may include selection of a format and a structure of data in each of a plurality of data fields in a table representing the portion of the data.
[033] Referring now to FIG. 2, a functional block-diagram of a computing device 102 for providing indirect visualization access of a data-lake 114 is illustrated, in accordance with an embodiment of the present disclosure. FIG. 2 is explained in conjunction with FIG. 1. The computing device 102 may include a set of components. The set of components may include a user access layer (UAL) 202, a semantic schema layer (SSL) 210, a data-lake connectivity driver 216 and a query management layer (QML) 218.
[034] The UAL 202 may include a user profile management module 204 and a user defined function management module 206. In an embodiment, the user profile management module 204 may allow registration of one or more users for providing indirect and visualization access of data of the data-lake 114. In an embodiment, the registration of a user may include defining a domain object name, a public key and a private key corresponding to a user profile. In an embodiment, each of the plurality of user profiles may include a unique domain object name, a public key and one or more private keys. In an embodiment, each of the plurality of user profiles may include a plurality of user attributes associated with each of the plurality of users. The plurality of user attributes may include, but are not limited to, at least one of employee ID, organization ID, team ID, business unit ID, user location, current designation, department, user type, access level and access type. In an embodiment, the domain object name may be user-defined or may be selected as, but not limited to, the employee ID. In an embodiment, the user authentication database 112 may store data corresponding to each of the plurality of user profiles generated based on the registration of one or more users.
[035] Accordingly, the user profile management module 204 may allow a user to register by creating a user profile based on input of the plurality of user attributes and a domain object name, a public key and/or a private key. In one embodiment, the user profile management module 204 may allow an administrator having an administrator access to assign one of the plurality of access levels and/or an access type via the UAL 202. In an embodiment, the plurality of access levels may include, but are not limited to, an executive access, a supervisory access and an administrator access. In an embodiment, an access type associated with the plurality of user profiles may include a public access, a private access and a protected access. In an embodiment, a plurality of access levels may be dynamically defined based on designation of a user in an entity or as defined by the administrator. For example, the administrator may allow access of sales data stored in the data-lake 114 to the users of an accounts team of an entity. Further, the sales data may also be shared with the marketing team and the fulfillment team of the entity. In an embodiment, the sales data may include order information and cost information, which may include personal data and credit card information of buyers. Accordingly, personal data and credit card information of the buyers may be labeled as sensitive information and may not be shared with users who belong to the marketing team and the fulfillment team of the entity. In an embodiment, an executive access may allow access of data corresponding to all members belonging to a particular department, a supervisory access may allow access of data corresponding to a plurality of departments and an administrator access may include access of data corresponding to all users. Further, the data of the data-lake 114 may be segregated as public data, private data, and protected data. Accordingly, an access type of the public access may allow access of public data, private access may allow access of private data and protected access may allow access of protected data. In an embodiment, public access may allow access of public data only by hiding, masking or encrypting the protected data and the private data. Further, the protected access may allow access to public data and the protected data by hiding, masking or encrypting the private data. In an embodiment, the private access may allow access to public data, protected data and the private data. In an embodiment, masking may be performed using techniques such as, but not limited to, data obfuscation. In an embodiment, encryption of the data may be performed using cryptographic algorithms, such as, but not limited to, AES, RSA, etc.
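The cumulative relationship between the three access types may be illustrated with the following sketch. The mapping `VISIBLE_CLASSES`, the function `visible_columns` and the column classifications used in the usage example are assumptions made for illustration and are not taken from the disclosure.

```python
from typing import Dict, List

# Data classes that each access type may see in the clear (assumed encoding
# of the public / protected / private tiers described above).
VISIBLE_CLASSES = {
    "public": {"public"},
    "protected": {"public", "protected"},
    "private": {"public", "protected", "private"},
}


def visible_columns(column_classes: Dict[str, str], access_type: str) -> List[str]:
    """Return the columns a given access type may see without hiding, masking or encryption."""
    allowed = VISIBLE_CLASSES[access_type]
    return [name for name, data_class in column_classes.items()
            if data_class in allowed]
```

For example, with a hypothetical classification `{"product_name": "public", "income_level": "protected", "email": "private"}`, a public access would see only `product_name`, a protected access would additionally see `income_level`, and a private access would see all three columns.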
[036] Further, while registering the user profile, the user defined function (UDF) management module 206 may dynamically create one or more UDFs associated with the user profile. In an embodiment, the one or more UDFs may be dynamically created based on the user profile attributes, the access level and the access type.
[037] In an embodiment, in order to define a UDF associated with the user profile a unique query and a unique schema may be defined or created. In an embodiment, the administrator may define one or more UDFs based on the user profile attributes, access level and access type.
[038] In an embodiment, the unique query may be defined using the data lake connectivity driver 216 via the query management layer 218. In an embodiment, the unique query may be selected dynamically from a plurality of predefined queries. In an embodiment, the unique query may be defined based on customization of a predefined query from the plurality of predefined queries based on the access level and access type of the user.
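Customization of a predefined query may be sketched as follows, assuming for illustration that the queries are SQL strings and that customization consists of selecting which columns are projected. The template name, the table name `customer_transactions` and the function `build_unique_query` are hypothetical; the disclosure does not prescribe a query language.

```python
import re
from typing import List

# Hypothetical catalogue of predefined queries with a column placeholder.
PREDEFINED_QUERIES = {
    "customer_transactions": "SELECT {columns} FROM customer_transactions",
}

# Only plain identifiers are accepted so that column names cannot inject SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_unique_query(template_name: str, columns: List[str]) -> str:
    """Customize a predefined query so it projects only the permitted columns."""
    if not columns:
        raise ValueError("at least one column is required")
    for column in columns:
        if not _IDENTIFIER.match(column):
            raise ValueError(f"invalid column name: {column!r}")
    template = PREDEFINED_QUERIES[template_name]
    return template.format(columns=", ".join(columns))
```

The list of permitted columns would, in such a design, be derived from the access level and access type of the user, so that two users sharing the same predefined query may still receive different unique queries.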
[039] Further, the UDF management module 206 may define the unique schema that governs how the data retrieved from the data-lake 114 using the unique query is presented. The schema may allow the portion of data retrieved to be selectively rendered based on a data representation format. In an embodiment, the data representation format may define how the portion of data may be selectively rendered to a user on a display device. In an embodiment, the data representation format defines a format and a structure of data in each of a plurality of data fields in a table representing the portion of the data to be selectively rendered. Further, the unique schema may ensure that the portion of the data to be rendered to a user is in accordance with the access level and/or the access type of the user.
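One possible encoding of such a schema is sketched below, assuming that a data representation format can be expressed as an ordered list of fields, each with a display label and a formatting function. The names `SCHEMA`, `format_currency` and `render_table` and the sample field names are hypothetical.

```python
from decimal import Decimal
from typing import Any, Callable, Dict, List, Tuple


def format_currency(value: Any) -> str:
    """Format a numeric amount as dollars with thousands separators and two decimals."""
    return f"${Decimal(str(value)):,.2f}"


# Hypothetical schema: (source field, display label, formatter), in display order.
SCHEMA: List[Tuple[str, str, Callable[[Any], str]]] = [
    ("first_name", "First Name", str.title),
    ("amount", "Amount", format_currency),
]


def render_table(rows: List[Dict[str, Any]], schema=SCHEMA) -> List[List[str]]:
    """Return a header row followed by formatted rows; fields absent from the schema are not rendered."""
    header = [label for _, label, _ in schema]
    body = [[fmt(row[field_name]) for field_name, _, fmt in schema] for row in rows]
    return [header] + body
```

Because only the fields listed in the schema are emitted, a field that is retrieved but not named in the schema never reaches the display device in this sketch.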
[040] Further, each of the UDFs may be associated with a private key that may be generated using, but not limited to, a key generation algorithm and provided to the registered user. In one embodiment, the user may be allowed to define or input a private key during registration and creation of the UDFs. It is to be noted that the UDFs associated with a user profile during registration may be saved in the data-lake 114.
[041] In an embodiment, the SSL 210 may include a user request management module 212 and a user authentication module 214. The user request management module 212 may receive a user request from a user to access the data of the data-lake 114. In an embodiment, the user request may include a domain object name, a public key, and a private key.
[042] Further, the user authentication module 214 may perform the first level of authentication based on the domain object name and the public key received from the user by the user request management module 212. Further, the user request management module 212 may determine a user profile associated with the user from a plurality of user profiles stored in the user authentication database 112 based on a success of the first level of authentication performed by the user authentication module 214. In an embodiment, the user authentication module 214 may map the domain object name and the public key received with the plurality of user profiles of the plurality of users registered by the user profile management module 204. In an embodiment, the first level of authentication may be based on verification of the mapping of the domain object name and the public key. Once the user profile is determined, the user authentication module 214 may communicate the success of the first level of authentication to the UDF management module 206 of the UAL 202. The UDF management module 206 may then determine one or more UDFs associated with the user profile based on the first level of authentication.
[043] Further, the user authentication module 214 may perform a second level of authentication based on determining a UDF from the one or more UDFs whose private key matches the private key input by the user. Based on a successful second level of authentication and determination of the UDF, the UDF management module 206 may execute the unique query of the UDF. The execution of the unique query may be performed by communicably connecting to the data-lake 114 via the data lake connectivity driver 216 and via the query management layer 218. In an embodiment, the data-lake connectivity driver 216 may facilitate a connection between the QML 218 and the data-lake 114 to access and interact with the data housed in the data lake.
[044] Further, the data rendering module 208 may selectively display a portion of the data of the data-lake 114 queried using the unique query from the data-lake 114. In an embodiment, the portion of data may be rendered based on the unique schema associated with the UDF executed.
[045] Accordingly, the data rendering module 208 may selectively render a portion of the data of the data-lake 114 requested in the user request based on a second level of authentication. In an embodiment, the second level of the authentication may be based on the domain object name and the private key. In an embodiment, the portion of the data may be retrieved from the data-lake 114 based on the UDF and the second level of authentication by establishing a connection between the UAL 202 and the QML 218 and by executing the corresponding unique query associated with the UDF. Accordingly, the SSL 210 may be integrated with any external platforms without having to worry about exposing sensitive data of the data-lake 114.
[046] In an embodiment, the execution of the corresponding unique query of the UDF from the plurality of UDFs may enable selective retrieval of the portion of the data of the data-lake 114 by encrypting, masking, and hiding of the data of the data-lake 114. In an embodiment, the corresponding schema of the UDF from the plurality of UDFs enables the data rendering module 208 to selective render the portion of the data of the data-lake 114 based on a data representation format. In an embodiment, the data representation format may include selection of a format and a structure of data in each of a plurality of data fields in a table representing the portion of the data. It should be noted that the data rendered may provide an indirect visualization access of the portion of the data of the data-lake 114. Thus, the user may not be able to alter the data of the data-lake 114. Referring now to a FIG. 3A, an exemplary dataset 300A housed in a data-lake 114 is illustrated, in accordance with an embodiment of the present disclosure. In an exemplary embodiment, the exemplary dataset 300A housed in a data-lake 114 may be customer transaction data requested by the user via the user request management module 212 for various analytical and operational purposes. As can be seen in FIG. 3A, the dataset 300A may include a plurality of columns 302-322 providing information such as first name 302, last name 304, email 306, occupation 308, income level 310, identifier 312, product name 314, product type 316, merchant name 318, description 320, amount 322. As can be seen, the dataset 300A includes columns including private data such as email 306, occupation 308, identifier 312.
[047] Referring now to FIG. 3B, an exemplary portion of dataset 300B that may be selectively rendered to a user is illustrated, in accordance with the exemplary embodiment of FIG. 3A. In continuation of the exemplary embodiment of FIG. 3A, based on the successful first level of authentication and the second level of authentication by the user authentication module 214, the UDF management module 206 may determine a UDF and execute a unique query that may enable access of the dataset 300A from the data-lake 114. The data rendering module 208 may selectively retrieve the dataset 300B based on the unique schema defined in the UDF. As can be seen in FIG. 3B, the dataset 300B may be depicted as a table that may be rendered to the user by encrypting the data of column 306, masking the data of column 308, and hiding the data of column 312 of table 300A. Further, the table 300B may be rendered based on a data representation format predefined in the UDF. As can be seen, the data of the columns and rows may be formatted and structured to present the data in a legible manner. For example, the data of column 322 may be formatted and structured to depict currency in “$”.
[048] Accordingly, the dataset 300A housed within the data-lake 114 may encapsulate customer transaction data, crucial for various analytical and operational purposes. In an embodiment, while creating the UDF for access of the dataset 300A, columns within the dataset 300A that include private data may be masked, encrypted, and hidden from users based on the access level and/or access type of the user. Accordingly, retrieving the portion of the dataset depicted in table 300B by masking, encrypting, and hiding the private data may ensure that only authorized personnel can view and analyze the data, thereby upholding data integrity and confidentiality.
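The column treatment described for FIGs. 3A and 3B can be illustrated with a minimal Python sketch. The column names, the POLICY mapping, and the helper functions are hypothetical and are not taken from the disclosure. Because the Python standard library has no cipher, a keyed HMAC digest stands in for the encryption step here; an actual embodiment would presumably use a vetted encryption scheme.

```python
import hashlib
import hmac

# Hypothetical per-column policy mirroring FIG. 3B: email (306) encrypted,
# occupation (308) masked, identifier (312) hidden.
POLICY = {
    "email": "encrypt",
    "occupation": "mask",
    "identifier": "hide",
}


def opaque_token(value: str, key: bytes) -> str:
    """Stand-in for encryption (assumption): a truncated keyed HMAC-SHA256 digest."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask(value: str) -> str:
    """Keep the first character and replace the rest with asterisks."""
    if not value:
        return value
    return value[:1] + "*" * (len(value) - 1)


def render_row(row: dict, key: bytes) -> dict:
    """Return a read-only view of one row with the policy applied."""
    rendered = {}
    for column, value in row.items():
        action = POLICY.get(column, "show")
        if action == "hide":
            continue  # hidden columns are dropped from the rendered table
        if action == "encrypt":
            rendered[column] = opaque_token(str(value), key)
        elif action == "mask":
            rendered[column] = mask(str(value))
        elif column == "amount":
            # Data representation format: show the amount as currency in "$".
            rendered[column] = f"${value:,.2f}"
        else:
            rendered[column] = value
    return rendered
```

Calling render_row on a row of the source table builds a new dictionary and leaves the input row unchanged, which is in keeping with the read-only, indirect access described above.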
[049] As will be appreciated by one skilled in the art, a variety of processes may be employed for providing indirect visualization access of a data-lake 114. For example, the exemplary system 100 and the associated computing device 102 may provide indirect visualization access of a data-lake 114 by the processes discussed herein. In particular, as will be appreciated by those of ordinary skill in the art, control logic and/or automated routines for performing the techniques and steps described herein may be implemented by the system 100 and the associated computing device 102 either by hardware, software, or combinations of hardware and software. For example, suitable code may be accessed and executed by the one or more processors on the system 100 to perform some or all of the techniques described herein. Similarly, application specific integrated circuits (ASICs) configured to perform some or all of the processes described herein may be included in the one or more processors on the system 100.
[050] Referring now to FIG. 4, a flowchart 400 of a methodology of providing indirect and visualization access to data of the data-lake 114 is illustrated, in accordance with an embodiment of the present disclosure. In an embodiment, method 400 may include a plurality of steps that may be performed by the processor 104 to provide indirect visualization access of a data-lake 114.
[051] FIG. 4 is explained in conjunction with FIGs. 1 and 2. Each step of the flowchart 400 may be executed by various modules of the computing device 102.
[052] At step 402, a user request may be received from a user to access the data of the data-lake 114. In an embodiment, the user request may include a domain object name, a public key, and a private key. Further, at step 404, a user profile from a plurality of user profiles associated with the user may be determined based on a first level of authentication. In an embodiment, the plurality of user profiles may correspond to a plurality of users. In an embodiment, the plurality of user profiles may be generated based on registration of the plurality of users. In an embodiment, the first level of authentication may be based on the domain object name and the public key.
[053] Further, at step 406, a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile may be determined based on the first level of authentication. In an embodiment, each of a plurality of UDFs associated with the user profile may be created based on the registration of the user. In an embodiment, each of the plurality of UDFs may be communicatively coupled to a query management layer (QML) of the data-lake 114 and may be stored in the data-lake 114.
[054] Further, the creation of a UDF from the plurality of UDFs may include creation of a unique query and a unique schema. In an embodiment, a corresponding query and a corresponding schema associated with the UDF may be created based on an access level from a plurality of access levels associated with the user profile.
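As a hedged illustration of how a UDF might pair a query with a schema according to an access level, consider the following Python sketch. The access level names, the column lists, the SQL-style query text, and the create_udf helper are all assumptions made for illustration; the disclosure does not specify a query language or a schema format.

```python
from dataclasses import dataclass

# Hypothetical access levels mapped to the columns each level may see.
ACCESS_LEVEL_COLUMNS = {
    "analyst": ["product_name", "product_type", "amount"],
    "auditor": ["first_name", "last_name", "merchant_name", "amount"],
}


@dataclass
class UDF:
    name: str     # identifier of the function, unique per user and domain object
    query: str    # query executed against the query management layer
    schema: dict  # column name -> display format used when rendering


def create_udf(user_id: str, domain_object: str, access_level: str) -> UDF:
    """Build a UDF whose query and schema depend on the access level.

    Identifiers are assumed to come from a trusted registration step,
    so they are interpolated into the query text directly.
    """
    columns = ACCESS_LEVEL_COLUMNS[access_level]
    query = f"SELECT {', '.join(columns)} FROM {domain_object}"
    schema = {c: ("currency" if c == "amount" else "text") for c in columns}
    return UDF(name=f"udf_{user_id}_{domain_object}", query=query, schema=schema)
```

In this sketch, two users registered with different access levels for the same domain object would receive UDFs with different queries and schemas, so neither sees columns outside its level.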
[055] Further, at step 408, a portion of the data of the data-lake 114 requested in the user request may be selectively rendered based on a second level of authentication. In an embodiment, the second level of authentication may be based on the domain object name and the private key. In an embodiment, the portion of the data may be retrieved from the data-lake 114 based on the UDFs and the second level of authentication. In an embodiment, the portion of the data may be retrieved from the data-lake 114 by establishing a connection between the UAL and the QML and by executing the corresponding query associated with the UDF.
[056] In an embodiment, the corresponding query of the UDF from the plurality of UDFs may enable selective retrieval of the portion of the data of the data-lake 114 by encrypting, masking, and hiding of the data of the data-lake 114. In an embodiment, the corresponding schema of the UDF from the plurality of UDFs may enable the selective rendering of the portion of the data of the data-lake 114 based on a data representation format. In an embodiment, the data representation format may include selection of a format and a structure of data in each of a plurality of data fields in a table representing the portion of the data.
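Steps 402 to 408 can be sketched end to end in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the PROFILES registry, the key strings, the digest comparison used for the second level of authentication, and the stubbed UDF are hypothetical, since the disclosure does not specify how the public and private keys are verified.

```python
import hashlib
import hmac


def digest(secret: str) -> str:
    """Hash a secret so the registry never stores the private key itself."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()


# Hypothetical registry created at user registration. The key is the pair
# (domain object name, public key) used by the first level of authentication.
PROFILES = {
    ("customer_transactions", "pub-123"): {
        "user": "alice",
        "private_key_digest": digest("priv-456"),
        # Stub UDF: a real one would execute its query through the QML.
        "udf": lambda: [{"product_name": "Card", "amount": "$10.00"}],
    }
}


def handle_request(domain_object: str, public_key: str, private_key: str):
    """Return the rendered rows, or None if either authentication level fails."""
    # Step 404: first level of authentication selects the user profile.
    profile = PROFILES.get((domain_object, public_key))
    if profile is None:
        return None
    # Step 406: determine the predefined UDF associated with the profile.
    udf = profile["udf"]
    # Step 408: second level of authentication gates execution and rendering.
    supplied = digest(private_key)
    if not hmac.compare_digest(profile["private_key_digest"], supplied):
        return None
    return udf()
```

The sketch returns data only through the UDF, so a caller never receives a handle to the underlying store, which mirrors the indirect access described above. A production system would likely replace the plain SHA-256 digest with a salted key derivation function or with asymmetric signature verification.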
[057] Thus, the disclosed method and system seek to overcome the technical problem associated with secure data handling in data lakes by proposing a framework for managing sensitive data while ensuring compliance with data privacy regulations and enhancing user experience. The disclosed method and system provide secure, read-only data retrieval from the data lake. By incorporating encryption, masking, and access controls, the disclosed method and system address secure data access, personalized data retrieval, enhanced data security and privacy, and controlled access and authorization.
[058] The disclosed method and system introduce a novel solution for generating a data-blind and schema-conscious semantic layer to facilitate data extraction from data lakes. This involves creating dynamic functions at the individual user level to encapsulate object definitions, access controls, and user personas. Additionally, a handshake mechanism based on public and private keys is implemented to reinforce data security and regulate access.
[059] In light of the above-mentioned advantages and the technical advancements provided by the disclosed method and system, the claimed steps as discussed above are not routine, conventional, or well understood in the art, as the claimed steps enable the solutions described above to the existing problems in conventional technologies. Further, the claimed steps bring an improvement in the functioning of the device itself as the claimed steps provide a technical solution to a technical problem.
[060] The specification has described a method and system for providing indirect visualization access of a data-lake 114. The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for the purpose of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments.
[061] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term "computer-readable medium" should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, non-volatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[062] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
Claims:CLAIMS
I/We Claim:
1. A method (400) for providing indirect and visualization access to data of a data-lake (114), the method comprising:
receiving (402), by a processor (104) and via a semantic schema layer (SSL) (210), a user request from a user to access the data of the data-lake (114), wherein the user request comprises a domain object name, a public key and a private key;
determining (404), by the processor (104) and via the SSL (210), a user profile from a plurality of user profiles associated with the user based on a first level of authentication,
wherein the plurality of user profiles corresponds to a plurality of users, and
wherein the first level of authentication is based on the domain object name and the public key;
determining (406), by the processor (104) and via a user access layer (UAL) (202), a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile based on the first level of authentication; and
selectively rendering (408), by the processor (104) and via the UAL (202), a portion of the data of the data-lake (114) requested in the user request based on a second level of authentication,
wherein the second level of authentication is based on the domain object name and the private key, and
wherein the portion of the data is retrieved from the data-lake (114) based on the UDFs and the second level of authentication.

2. The method (400) of claim 1, further comprising:
creating, by the processor (104) and via the UAL (202), each of the plurality of UDFs associated with the user profile,
wherein each of the plurality of UDFs is communicatively coupled to a query management layer (QML) (218) of the data-lake (114); and
creating, by the processor (104) and via the UAL (202), for each of the plurality of UDFs, a unique query and a unique schema.

3. The method (400) of claim 2, wherein a corresponding query and a corresponding schema associated with the UDF are created based on an access level from a plurality of access levels associated with the user profile.

4. The method (400) of claim 3, wherein the portion of the data is retrieved from the data-lake (114) by establishing a connection between the UAL (202) and the QML (218) and by executing the corresponding query associated with the UDF.

5. The method (400) of claim 3, wherein the corresponding query of the UDF from the plurality of UDFs enables selective retrieval of the portion of the data of the data-lake (114) by encrypting, masking, and hiding of the data of the data-lake (114).

6. The method (400) of claim 3, wherein the corresponding schema of the UDF from the plurality of UDFs enables the selective rendering of the portion of the data of the data-lake (114) based on a data representation format.

7. The method (400) of claim 6, wherein the data representation format comprises selection of a format and a structure of data in each of a plurality of data fields in a table representing the portion of the data.

8. The method (400) as claimed in claim 1, wherein the public key is stored within the SSL (210) and the private key is input by the user.

9. A system (100) for providing indirect and visualization access to data of a data-lake (114), comprising:
a processor (104); and
a memory (106) coupled to the processor (104), wherein the memory (106) stores processor-executable instructions, which, on execution, cause the processor (104) to:
receive a user request from a user via a semantic schema layer (SSL) (210) to access the data of the data-lake (114), wherein the user request comprises a domain object name, a public key and a private key;
determine a user profile from a plurality of user profiles associated with the user via the SSL (210) based on a first level of authentication,
wherein the plurality of user profiles corresponds to a plurality of users, and
wherein the first level of authentication is based on the domain object name and the public key;
determine a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile via a user access layer (UAL) (202) based on the first level of authentication; and
selectively render a portion of the data of the data-lake (114) requested in the user request via the UAL (202) based on a second level of authentication,
wherein the second level of authentication is based on the domain object name and the private key, and
wherein the portion of the data is retrieved from the data-lake (114) based on the UDFs and the second level of authentication.

10. The system of claim 9, wherein the processor (104) is configured to:
create, via the UAL (202), each of the plurality of UDFs associated with the user profile,
wherein each of the plurality of UDFs is communicatively coupled to a query management layer (QML) (218) of the data-lake (114); and
create a unique query and a unique schema for each of the plurality of UDFs via the UAL (202).

11. The system of claim 10, wherein a corresponding query and a corresponding schema associated with the UDF are created based on an access level from a plurality of access levels associated with the user profile.

12. The system of claim 11, wherein the portion of the data is retrieved from the data-lake (114) by establishing a connection between the UAL (202) and the QML (218) and by executing the corresponding query associated with the UDF.

13. The system of claim 11, wherein the corresponding query of the UDF from the plurality of UDFs enables selective retrieval of the portion of the data of the data-lake (114) by encrypting, masking, and hiding of the data of the data-lake (114).

14. The system of claim 11, wherein the corresponding schema of the UDF from the plurality of UDFs enables the selective rendering of the portion of the data of the data-lake (114) based on a data representation format.

15. The system of claim 14, wherein the data representation format comprises selection of a format and a structure of data in each of a plurality of data fields in a table representing the portion of the data.

16. The system of claim 9, wherein the public key is stored within the SSL (210) and the private key is input by the user.

17. A non-transitory computer-readable medium storing computer-executable instructions for providing indirect and visualization access to data of a data-lake (114), the computer-executable instructions configured for:
receiving, via a semantic schema layer (SSL) (210), a user request from a user to access the data of the data-lake (114), wherein the user request comprises a domain object name, a public key and a private key;
determining, via the SSL (210), a user profile from a plurality of user profiles associated with the user based on a first level of authentication,
wherein the plurality of user profiles corresponds to a plurality of users, and
wherein the first level of authentication is based on the domain object name and the public key;
determining, via a user access layer (UAL) (202), a user defined function (UDF) from a plurality of predefined UDFs associated with the user profile based on the first level of authentication; and
selectively rendering, via the UAL (202), a portion of the data of the data-lake (114) requested in the user request based on a second level of authentication,
wherein the second level of authentication is based on the domain object name and the private key, and
wherein the portion of the data is retrieved from the data-lake (114) based on the UDFs and the second level of authentication.

18. The non-transitory computer-readable medium of claim 17, wherein the computer-executable instructions are further configured for:
creating, via the UAL (202), each of the plurality of UDFs associated with the user profile,
wherein each of the plurality of UDFs is communicatively coupled to a query management layer (QML) (218) of the data-lake (114); and
creating, via the UAL (202), for each of the plurality of UDFs, a unique query and a unique schema.

19. The non-transitory computer-readable medium of claim 18, wherein a corresponding query and a corresponding schema associated with the UDF are created based on an access level from a plurality of access levels associated with the user profile.

20. The non-transitory computer-readable medium of claim 19, wherein the portion of the data is retrieved from the data-lake (114) by establishing a connection between the UAL (202) and the QML (218) and by executing the corresponding query associated with the UDF.

Documents

Application Documents

# Name Date
1 202411028116-STATEMENT OF UNDERTAKING (FORM 3) [05-04-2024(online)].pdf 2024-04-05
2 202411028116-REQUEST FOR EXAMINATION (FORM-18) [05-04-2024(online)].pdf 2024-04-05
3 202411028116-REQUEST FOR EARLY PUBLICATION(FORM-9) [05-04-2024(online)].pdf 2024-04-05
4 202411028116-PROOF OF RIGHT [05-04-2024(online)].pdf 2024-04-05
5 202411028116-POWER OF AUTHORITY [05-04-2024(online)].pdf 2024-04-05
6 202411028116-FORM-9 [05-04-2024(online)].pdf 2024-04-05
7 202411028116-FORM 18 [05-04-2024(online)].pdf 2024-04-05
8 202411028116-FORM 1 [05-04-2024(online)].pdf 2024-04-05
9 202411028116-FIGURE OF ABSTRACT [05-04-2024(online)].pdf 2024-04-05
10 202411028116-DRAWINGS [05-04-2024(online)].pdf 2024-04-05
11 202411028116-DECLARATION OF INVENTORSHIP (FORM 5) [05-04-2024(online)].pdf 2024-04-05
12 202411028116-COMPLETE SPECIFICATION [05-04-2024(online)].pdf 2024-04-05
13 202411028116-Power of Attorney [01-08-2024(online)].pdf 2024-08-01
14 202411028116-Form 1 (Submitted on date of filing) [01-08-2024(online)].pdf 2024-08-01
15 202411028116-Covering Letter [01-08-2024(online)].pdf 2024-08-01
16 202411028116-FER.pdf 2025-11-25

Search Strategy

1 202411028116_SearchStrategyNew_E_202411028116E_21-11-2025.pdf