Computing Device, System, And Method Therefor Achieving Efficient Data Communication In Columnar Databases

Abstract: In one implementation, the present invention discloses a computing device, a system, and a method that reduce the row construction cost in the server by pushing columns to the client. In response to a request to access the data stored in a column-store relational database, received preferably from at least one client device or an application program, the method comprises responding, by a processor of one or more computing devices, with at least one column in compressed form along with at least a meta-information associated with at least one technique used for compressing the column, wherein the column, matching at least one condition received in the request, is transmitted in the compressed form to the client device or the application program. (TO BE PUBLISHED WITH FIGURE 2&6)

Patent Information

Application #
Filing Date
16 August 2016
Publication Number
08/2018
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
cal@patentindia.com
Parent Application
Patent Number
Legal Status
Grant Date
2022-06-21
Renewal Date

Applicants

HUAWEI TECHNOLOGIES INDIA PVT. LTD.
SYNO 37, 46, 45/3, 45/4 ETC., KNO 1540, Kundalahalli Village, Bengaluru, Karnataka – 560 037, India

Inventors

1. BEHERA, Mahesh Kumar
SYNO 37, 46, 45/3, 45/4 ETC., KNO 1540, Kundalahalli Village, Bengaluru, Karnataka – 560 037, India
2. SIVAKUMAR, Kalyan
SYNO 37, 46, 45/3, 45/4 ETC., KNO 1540, Kundalahalli Village, Bengaluru, Karnataka – 560 037, India
3. RAMAMURTHI, Prasanna Venkatesh
SYNO 37, 46, 45/3, 45/4 ETC., KNO 1540, Kundalahalli Village, Bengaluru, Karnataka – 560 037, India

Specification

Claims: 1. A computing device storing data in a column-store relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns, the computing device comprising:
a processor;
a memory coupled to the processor for executing a plurality of modules present in the memory, the plurality of modules comprising:
a receiving module configured to receive at least one request to access the data stored in the column-store relational database;
a checking module configured to check if the at least one condition received in the request is being satisfied by the set of column groups, wherein the condition is being satisfied if the set of column groups or a part of the set of column groups matches the at least one condition as received in the request;
a materialization module configured to materialize the set of column groups or the part of the column satisfying the condition; and
a transmitting module configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

2. The computing device as claimed in claim 1, wherein the data stored in a column-store relational database is preferably in compressed column oriented format.

3. The computing device as claimed in claim 1, wherein the checking module is further configured to sequentially access the memory storing the column-store relational database, preferably, by applying a service management automation (SMA).

4. The computing device as claimed in claim 1, further comprises:
a decompression module configured to decompress the set of column groups if the part of the column matches the condition, wherein the part of the column is present in the set of column groups;
a filter module configured to filter the part of the column matching the condition selected from the set of column groups;
a compression module configured to compress the part of the column filtered; and
transmitting, by the transmitting module, the part of the column filtered along with a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

5. The computing device as claimed in claim 1, wherein the meta-information is transmitted separately from the set of column groups or the part of the column in the compressed form.

6. The computing device as claimed in claim 1, wherein the meta-information is embedded in the set of column groups or the part of the column in the compressed form and thereafter transmitted.

7. The computing device as claimed in claim 1, wherein the request to access the data stored in column-store relational database is received, preferably, from at least one client device, or an application program.

8. A method of transmitting data stored in a column-store relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns, the method comprising:
in response to a request received, preferably, from at least one client device, or an application program to access the data stored in the column-store relational database:
responding, by a processor of one or more computing device, with the set of column groups or a part of the set of column groups in compressed form along with at least a meta-information associated with at least one technique used for compressing the column to the request, wherein the column, matching with at least one condition received in the request, is transmitted in the compressed form to the client device or the application program.

9. The method as claimed in claim 8, wherein the data stored in a column-store relational database is preferably in compressed column oriented format.

10. The method as claimed in claim 8, wherein the meta-information is transmitted separately from the set of column groups or the part of the column in the compressed form.

11. The method as claimed in claim 8, wherein the meta-information is embedded in the set of column groups or the part of the column in the compressed form and thereafter transmitted.

12. A method of transmitting data stored in a column-store relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns, the method comprising:
receiving, by at least one processor of at least one computing device, at least one request to access the data stored in the column-store relational database;
checking, by the processor, if the at least one condition received in the request is being satisfied by the set of column groups, wherein the condition is being satisfied if the set of column groups or a part of the set of column groups matches the condition as received in the request;
materializing, by the processor, the set of column groups or the part of the column satisfying the condition;
transmitting, by the processor, the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

13. The method as claimed in claim 12, wherein the data stored in a column-store, relational database is preferably in compressed column oriented format.

14. The method as claimed in claim 12, wherein checking comprises sequential access of at least a memory storing the relational database, preferably, by applying a service management automation (SMA).

15. The method as claimed in claim 12 and claim 13, wherein if the part of the column matches the condition, the method further comprises:
decompressing, by the processor, the set of column groups, wherein the part of the column is present in the set of column groups;
filtering, by the processor, the part of the column matching the condition selected from the set of column groups;
compressing, by the processor, the part of the column filtered;
transmitting, by the processor, the part of the column filtered along with a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

16. The method as claimed in claim 12, wherein the meta-information is transmitted separately from the set of column groups or the part of the column in the compressed form.

17. The method as claimed in claim 12, wherein the meta-information is embedded in the set of column groups or the part of the column in the compressed form and thereafter transmitted.

18. The method as claimed in claim 12, comprises receiving the request to access the data stored in column-store relational database, preferably, from at least one client device, or an application program.

19. A system comprising:
at least one client device or application program to transmit at least one query to access data stored in column-store relational database;
at least one computing device storing data in a column-store, relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns, the computing device comprising:
a receiving module configured to receive the request from the client device to access the data stored in column-store relational database;
a checking module configured to check if the at least one condition received in the request is being satisfied by the set of column groups, wherein the condition is being satisfied if at least the set of column groups or a part of the set of column groups matches the condition as received in the request;
a materialization module configured to materialize the set of column groups or the part of the column satisfying the condition; and
a transmitting module configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, to the client device; and
the client device, on receipt of a response from the computing device, is configured to decompress the set of column groups or the part of the column received in the compressed form using the meta-information received from the computing device.
Description:

TECHNICAL FIELD

The present subject matter described herein, in general, relates to database technology, and more particularly, to devices, systems and methods for achieving efficient data communication in columnar databases.

BACKGROUND

Databases can perform large numbers of concurrent transactions involving corresponding data. In the traditional client-server database deployment model, the client sends a request to the server over a particular communication channel; the server processes the request and sends the result back to the client over the same channel. The server processes the request received from the client by constructing the result as rows and then sends the constructed result to the client.

A relational database management system (DBMS) presents data as a two-dimensional table of columns and rows. Based on how the data is stored, DBMSs are broadly divided into two types: column-oriented (columnar) databases and row-oriented (row-store) databases. A column-oriented DBMS stores data tables as sections of columns of data rather than as rows of data. In comparison, most relational DBMSs store data in rows. A column-oriented DBMS has advantages for data warehouses, customer relationship management (CRM) systems, library card catalogs, and other ad hoc inquiry systems where aggregates are computed over large numbers of similar data items.

Traditional DBMSs store the data on disk and load it into memory as and when required. Most of the time, the data is stored in a row-oriented format, where all the attributes of a record are stored together. With the growth in the volume of stored data and the increasing use of data for decision support systems, the row-oriented format has given way to a column-oriented format. In this format, the values of one attribute across all the records are stored together, and thus it is easy to perform aggregation and analysis on the same attribute. Compression techniques are added to reduce the working data size further. Many algorithms perform query processing directly on the compressed data to avoid the cost of decompression. Once data is scanned and aggregated, the result set is prepared and sent back to the user in row-oriented format. The result formation, which requires conversion from column to row format, has therefore become the biggest bottleneck.
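The conversion step described above can be pictured with a small sketch. It is illustrative only: the table contents and the names `columns` and `construct_rows` are assumptions made for this example and do not come from the specification. The table is held column-wise, an aggregate is computed over one attribute without touching the others, and the result is then converted to rows, which is the row construction step.

```python
# Illustrative column-oriented table: each attribute is stored contiguously.
# The data and the names used here are assumptions made for this sketch.
columns = {
    "id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "amount": [10, 20, 30, 40],
}

# Aggregation reads a single column; the other attributes are never touched.
total_amount = sum(columns["amount"])


def construct_rows(cols):
    """Convert column-oriented data into a list of row records."""
    names = list(cols)
    # zip(*...) walks every column in lockstep, so every value of every
    # column is visited once to build the row-oriented result.
    return [dict(zip(names, values)) for values in zip(*cols.values())]


rows = construct_rows(columns)
print(total_amount)
print(rows[0])
```

The aggregate touches one list, whereas `construct_rows` has to visit every value of every column, which is the extra server-side work the passage refers to.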

Generally, in a DBMS using the row-oriented (row-store) approach for storing and transmitting the data, the server constructs the result in the form of rows and then sends this result to the client. This phase of constructing the result in the form of rows is called row construction. Row construction is natural for row-store databases. However, for column stores, row construction is one of the most costly operations. As the size of the processing system increases, the time to access a memory location also increases. For example, in a very large system with hundreds of GB of physical memory distributed across many asymmetrically designed memory locations, accessing a remote memory location is many times costlier than accessing data from the local memory cache. As per the experimental data, the cost of memory access from a remote location is on the order of 10-15 times that of memory access from the L1 cache of a processor. This cost increases with the size of the system and the size of physical memory. Cache locality is therefore an important concept for in-memory systems. In-memory column stores are an architecture that takes advantage of this concept.

All the leading column-store databases support compression to accommodate more data in memory. However, they do not pass this benefit on to the user/client. With the increase in data analysis complexity, database user applications have become more complex and more tightly coupled to the data. Even though the data is stored and processed in compressed column-oriented format, all current DBMSs decompress the data to form the result in row-oriented format. This leads to extra work at the server side. It also prevents database users from taking full advantage of the compression techniques used in database servers. As the data is transferred from server to client over the network in uncompressed row-oriented format, the communication time also increases drastically. With database servers deployed in the cloud and clients connected over the internet, the cost increases further.

Similarly, column stores also exploit the fact that column information, especially in an Online Analytical Processing (OLAP) system, is heavily duplicated. There is a need to use this fact to achieve better compression in a column store as compared to a row store. There is also a need to carry the advantage of column storage through to the client side. Further, with more and more databases deployed in the cloud, the cost of data transfer between database server and client is becoming more significant. Many applications using Database as a Service (DBaaS) do a lot of computation on the data at the client side. This usage has started a new trend in which the database is used as a store and much of the computation happens on the client side. With data from the database to be transferred over the network, it is better to transfer the compressed data to the client and put logic on the client side to decompress and process the data as per the usage. There is also a need for the database to provide data to the application in a format suitable for the application to operate on the data using the latest hardware technology, such as Single Instruction Multiple Data (SIMD) and Non-Uniform Memory Access (NUMA).

SUMMARY

This summary is provided to introduce concepts related to computing device, system, and method for achieving efficient data communication in columnar databases, which are further described below in the detailed description. This summary is not intended to identify essential features of the claimed subject matter nor is it intended for use in determining or limiting the scope of the claimed subject matter.

In order to provide a technical solution to the above-mentioned technical problems in the prior art, one aspect of the present invention is to provide a computing device, a system, and a method to reduce the row construction cost at the server side by transmitting / pushing the compressed columns to the client accessing the database. Further, the computing device, the system, and the method also enable the data to be pushed in compressed format to reduce the data communication cost. Also, if the client is technically advanced enough, the compressed data can be processed at the client side, which allows the client to keep more data in system cache and thereby increases its overall performance.

Another aspect of the present invention is to provide the computing device, the system, and the method to reduce the row construction cost in the server by pushing columns to client.

Another aspect of the present invention is to provide the computing device, the system, and the method for transferring data from a database server to a connected client in compressed columnar form.

Yet another aspect of the present invention is to provide the computing device, the system, and the method for transferring data from a database server to a connected client in compressed columnar form wherein the table data is stored in compressed format and a metadata/ meta-information is stored to execute the database queries directly on the compressed data.

Yet another aspect of the present invention is to provide the computing device, the system, and the method to query the compressed data from the columnar database to form a compressed result in columnar format.

Yet another aspect of the present invention is to provide the computing device, the system, and the method in which compressed columnar data is sent to the client for further processing. No preprocessing is to be done at the server side for the result data.

Still another aspect of the present invention is to enable the client to receive the compressed columnar data and thereby operate on it using the supported metadata or meta-information.

Accordingly, in one implementation, a computing device storing data in a column-store relational database is disclosed. The column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns. The computing device comprises a processor, and a memory coupled to the processor for executing a plurality of modules present in the memory. The plurality of modules comprises a receiving module, a checking module, a materialization module, and a transmitting module. The receiving module is configured to receive at least one request to access the data stored in the column-store relational database. The checking module is configured to check if the at least one condition received in the request is being satisfied by the set of column groups, wherein the condition is being satisfied if the set of column groups or a part of the set of column groups matches the at least one condition as received in the request. The materialization module is configured to materialize the set of column groups or the part of the column satisfying the condition. The transmitting module is configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

In one implementation, a computing device storing data in a column-store, relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns is disclosed. The computing device comprises a processor, and a memory coupled to the processor for executing a plurality of modules present in the memory. The plurality of modules comprises a receiving module configured to receive at least one request to access the data stored in column-store relational database; an execution module configured to execute at least one condition received in the request; a checking module configured to check if the condition is being satisfied by the columns, wherein the condition is being satisfied if at least the set of column groups or a part of the column matches the condition as received in the request; a materialization module configured to materialize the set of column groups or the part of the column satisfying the condition; and a transmitting module configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

In one implementation, a method of transmitting data stored in a column-store, relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns is disclosed. The method comprises, in response to a request received, preferably, from at least one client device, or an application program, to access the data stored in column-store relational database: responding, by a processor of one or more computing device, with at least one column in compressed form along with at least a meta-information associated with at least one technique used for compressing the column to the request, wherein the column, matching with at least one condition received in the request, is transmitted in the compressed form to the client device or the application program.

In one implementation, a method of transmitting data stored in a column-store relational database is disclosed. The method comprises receiving, by at least one processor of at least one computing device, at least one request to access the data stored in the column-store relational database; checking, by the processor, if the at least one condition received in the request is being satisfied by the set of column groups, wherein the condition is being satisfied if the set of column groups or a part of the set of column groups matches the condition as received in the request; materializing, by the processor, the set of column groups or the part of the column satisfying the condition; and transmitting, by the processor, the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

In one implementation, a method of transmitting data stored in a column-store, relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns is disclosed. The method comprises:
• receiving, by at least one processor of at least one computing device, at least one request to access the data stored in column-store relational database;
• executing, by the processor, at least one condition received in the request;
• checking, by the processor, if the condition is being satisfied by the columns, wherein the condition is being satisfied if at least the set of column groups or a part of the column matches the condition as received in the request;
• materializing, by the processor, the set of column groups or the part of the column satisfying the condition;
• transmitting, by the processor, the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, to the client device.
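The receive, check, materialize, and transmit steps listed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: it handles a single integer column and a single condition of the form "value > threshold", uses `zlib` over JSON as a stand-in compression technique, and keeps per-block minimum and maximum values so the check can be made without decompressing. The names `store_block`, `handle_request`, and `META` are hypothetical.

```python
import json
import zlib

# Assumed meta-information describing the compression technique in use.
META = {"technique": "zlib", "encoding": "json/utf-8"}


def compress(values):
    return zlib.compress(json.dumps(values).encode("utf-8"))


def decompress(blob):
    return json.loads(zlib.decompress(blob).decode("utf-8"))


def store_block(values):
    """Store one block of a column in compressed form with min/max statistics."""
    return {"min": min(values), "max": max(values), "blob": compress(values)}


def handle_request(blocks, threshold):
    """Answer the condition 'value > threshold' with compressed column data."""
    materialized = []
    for block in blocks:
        if block["min"] > threshold:
            # The whole block matches: forward it exactly as stored.
            materialized.append(block["blob"])
        elif block["max"] > threshold:
            # Only a part matches: decompress, filter, and compress again.
            kept = [v for v in decompress(block["blob"]) if v > threshold]
            materialized.append(compress(kept))
        # Otherwise no value in the block matches and it is skipped.
    # The meta-information travels with the compressed result.
    return {"meta": META, "blocks": materialized}


stored = [store_block([1, 2, 3]), store_block([2, 5, 7]), store_block([8, 9])]
response = handle_request(stored, 3)
print([decompress(b) for b in response["blocks"]])
```

In this sketch a block that matches entirely is never decompressed on the server, while a partially matching block goes through the decompress, filter, and compress path.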

In one implementation, a system is disclosed. The system comprises at least one client device or application program to transmit at least one query to access data stored in column-store relational database, and at least one computing device storing data in a column-store relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns. The computing device further comprises a receiving module configured to receive the request from the client device to access the data stored in column-store relational database; an execution module configured to execute at least one condition received in the request; a checking module configured to check if the condition is being satisfied by the columns, wherein the condition is being satisfied if at least the set of column groups or a part of the column matches the condition as received in the request; a materialization module configured to materialize the set of column groups or the part of the column satisfying the condition; and a transmitting module configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, to the client device. The client device, on receipt of a response from the computing device, is configured to decompress the set of column groups or the part of the column received in the compressed form using the meta-information received from the computing device.

As compared to the prior-art techniques, the present invention increases the network throughput of the system by sending the data from the server to the client side in the same format as stored in the server, without any transformation. The compressed data is sent in the same format as stored in the data storage / server to the client or the consumer of the data, along with the metadata related to the sent data, to help the client in building the data it requires. Further, as compared to the available prior-art techniques, the present invention reduces the row construction cost at the server side by transmitting / pushing the compressed columns to the client accessing the database. The computing device, the system, and the method also enable the data to be pushed in compressed format to reduce the data communication cost. Also, if the client is technically advanced enough, the compressed data can be processed at the client side, which allows the client to keep more data in system cache and thereby increases its overall performance.

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS

The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to refer to like features and components.

Figure 1 illustrates an example of pushing columns to the client, in accordance with an embodiment of the present subject matter.

Figure 2 illustrates an example of pushing columns to the client along with the meta-information, in accordance with an embodiment of the present subject matter.

Figure 3 illustrates a block diagram showing various building blocks, in accordance with an embodiment of the present subject matter.

Figure 4 illustrates an example of the processing of the data pages to obtain the materialized results, in accordance with an embodiment of the present subject matter.

Figure 5 illustrates a computing device configured to transmit the set of column groups or the part of the column in compressed form and at least a meta-information associated, in accordance with an embodiment of the present subject matter.

Figure 6 illustrates a flow chart of the overall processing of the query/request, in accordance with an embodiment of the present subject matter.

Figure 7 illustrates a method of transmitting data stored in a column-store, relational database, in accordance with an embodiment of the present subject matter.

Figure 8 illustrates the processing of an exemplary query, in accordance with an embodiment of the present subject matter.

Figure 9 illustrates the effect of present invention in an in-memory column store database.

It is to be understood that the attached drawings are for purposes of illustrating the concepts of the invention and may not be to scale.

DETAILED DESCRIPTION OF THE PRESENT INVENTION

The following clearly describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are merely a part rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.

The invention can be implemented in numerous ways, including as a process, an apparatus, a system, a composition of matter, a computer readable medium such as a computer readable storage medium or a computer network wherein program instructions are sent over optical or electronic communication links. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention.

A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.

Computing devices, systems, and methods for achieving efficient data communication in columnar databases are disclosed.

While aspects are described for a computing device, a system, and a method for achieving efficient data communication in columnar databases, the present invention may be implemented in any number of different computing systems, environments, and/or configurations. The embodiments are described in the context of the following exemplary systems, apparatus, and methods.

Accordingly, in one implementation, the present invention discloses a computing device, a system, and a method that reduce the row construction cost in the server by pushing columns to the client. Further, a method is explained to push the data in compressed format to reduce the data communication cost. If the client is advanced enough, the compressed data can be processed at the client, which allows the client to keep more data in system cache and thereby increases its overall performance. The present invention operates on data stored in column-oriented format and compressed.

In one implementation, the present invention discloses computing device, system, and method to reduce the row construction cost in the server by pushing columns to client.

Referring now to figure 1, the figure illustrates an example of pushing columns to the client, in accordance with an embodiment of the present subject matter. As shown in figure 1, the columns C1 and C2 store the data the user is trying to access by firing a query or sending a request to the DBMS. The request or the query may include some condition for accessing the data stored in C1 and C2; for example, as shown on the left hand side of figure 1, the condition received is C1>3. As per the conventional techniques, shown on the left hand side of figure 1, the condition is satisfied by sending the respective row_id from C1 and the associated value from C2. A person skilled in the art will understand that this type of transmission is time consuming and requires processing at the server end, thereby increasing device overhead; and, as a large amount of data is transmitted, the network overhead also increases. However, by the implementation of the present invention, shown on the right hand side of figure 1, the same transmission of the data is achieved efficiently by sending the columns in the form of the values themselves. The comparison of the process and the effect is clearly visible in figure 1 and may be identified by the person skilled in the art.
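The contrast between the two sides of figure 1 can be sketched as below. The figure itself is not reproduced here, so the column values are assumed for illustration, and the function names `row_wise_response` and `column_wise_response` are hypothetical. Both functions answer the condition C1>3; the first builds one record per matching row, and the second returns the matching values as columns.

```python
# Assumed sample data; the actual values in figure 1 are not reproduced here.
c1 = [1, 5, 2, 7, 4]
c2 = ["a", "b", "c", "d", "e"]


def row_wise_response(c1, c2, threshold=3):
    """Conventional style: one (row_id, C1, C2) record per matching row."""
    return [
        (row_id, c1[row_id], c2[row_id])
        for row_id in range(len(c1))
        if c1[row_id] > threshold
    ]


def column_wise_response(c1, c2, threshold=3):
    """Column style: the matching values are returned column by column."""
    matching = [i for i, value in enumerate(c1) if value > threshold]
    return {
        "C1": [c1[i] for i in matching],
        "C2": [c2[i] for i in matching],
    }


print(row_wise_response(c1, c2))
print(column_wise_response(c1, c2))
```

The column-wise response keeps each attribute contiguous, so it can be compressed and shipped without building per-row records on the server.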

Referring now to figure 2, the figure illustrates an example of pushing columns to the client along with the meta-information, in accordance with an embodiment of the present subject matter. As shown in figure 2, an exemplary table having rows and columns C1 and C3 is shown. When a query having the condition C1>3 is received by the DBMS, according to the present invention, the columns (C1 and C3 in this case) are compressed, and a single block (for example, a page) that matches the condition (C1>3 in this case) is sent to the client in compressed format. Further, the present invention also enables the meta-information used at the server side (DBMS) to process the compressed data to be passed/transmitted to the client. The client may then use this meta-information to avoid decompression.
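The idea that a client can use the meta-information to work on the compressed data without decompressing it may be illustrated as follows. The sketch assumes run-length encoding as the compression technique and a meta-information record with a "technique" field; both are illustrative assumptions, since the specification does not fix a technique or a metadata layout at this point.

```python
def rle_encode(values):
    """Run-length encode a list into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def count_greater_than(payload, meta, threshold):
    """Count matching values directly on the runs, guided by the meta-information.
    The column is never expanded back into individual values."""
    if meta["technique"] != "rle":
        raise ValueError("unsupported technique: " + meta["technique"])
    return sum(length for value, length in payload if value > threshold)


# Hypothetical column content and meta-information record.
column = [1, 1, 1, 4, 4, 7, 7, 7, 7]
payload = rle_encode(column)
meta = {"column": "C1", "technique": "rle"}
matches = count_greater_than(payload, meta, 3)
```

Here the client touches three runs instead of nine values, which is the kind of saving that keeping data compressed in the client cache is meant to provide.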

In one implementation, the present invention provides two techniques for sending the meta-information to the client. The meta-information may be embedded with the actual data being transmitted, or it may be sent separately after the actual data has been transmitted. It may be understood that, in most cases, the data cannot be sent in one packet; in such cases the meta-information need not be sent with each packet. In some embodiments, the meta-information may be packed with the actual data (even if the data does not fit into one packet) if the client processing requires the meta-information. Packing the meta-information with the actual data may be useful when multiple client threads are processing the packets, so that the actual data and the meta-information are kept together to optimize the processing.
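The two delivery options described above can be sketched as a small packetizer. The packet layout (a dict with "seq", "meta" and "data"), the mode names, and the JSON serialization of the meta-information are assumptions made for illustration only; the specification does not define a wire format.

```python
import json


def packetize(payload, meta, packet_size, mode):
    """Split payload into packets and attach the meta-information.

    mode == "every_packet": meta is embedded in each data packet, so any
    client thread can process any packet on its own.
    mode == "separate": data packets carry no meta; one trailing packet
    carries the meta-information after the data has been sent.
    """
    if mode not in ("every_packet", "separate"):
        raise ValueError("unknown mode: " + mode)
    header = json.dumps(meta, sort_keys=True).encode("utf-8")
    chunks = [payload[i:i + packet_size] for i in range(0, len(payload), packet_size)]
    packets = [
        {"seq": n, "meta": header if mode == "every_packet" else None, "data": chunk}
        for n, chunk in enumerate(chunks)
    ]
    if mode == "separate":
        packets.append({"seq": len(chunks), "meta": header, "data": b""})
    return packets


def reassemble(packets):
    """Client side: order the packets, recover the meta-information and the payload."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    meta = json.loads(next(p["meta"] for p in ordered if p["meta"] is not None))
    return meta, b"".join(p["data"] for p in ordered)


payload = bytes(range(10))
meta = {"technique": "dictionary"}
embedded = packetize(payload, meta, 4, "every_packet")
separate = packetize(payload, meta, 4, "separate")
```

Embedding the meta-information in every packet costs extra bytes per packet, while sending it once means a packet cannot be interpreted until the meta-information has arrived; the choice depends on how the client consumes packets.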

In one implementation, the present invention efficiently increases the network throughput of the system by sending the data from the server to the client side in the same format as stored in the server, without any transformation. It is even more effective in cases where the server data is stored in a compressed format or in a format suitable for compression.

In one implementation, the present invention enables the compressed data to be sent to the client or the consumer of the data in the same format as stored in the data storage, and also sends the metadata related to the sent data to help the client build the data it requires. Because compressed column-major data is transmitted to the client, the network overhead is kept minimal.

Figure 3 illustrates a block diagram showing various building blocks, in accordance with an embodiment of the present subject matter. As shown in figure 3, at step 1, the client sends a query or request for accessing the data stored in the database. It may be understood by the person skilled in the art that the query or request may include conditions or criteria based on which the client is trying to access the data stored in the database. For example, the conditions may be column projections, fetch size, flags and the like. At step 2, when the server or DBMS receives the query or request from the client, it may execute the query based on the various conditions received in the query and the data available in the database. During execution, the query execution engine of the DBMS applies the condition received in the query to each target column received in the query independently. This step may use sequential access of memory and applies the service management automation (SMA) concept. At step 3, if a whole page in the columnar database matches the conditions received in the query, the compressed page is materialized without being decompressed. If only a part of the page matches the conditions, the present invention decompresses the page, fetches the matching results or values from the decompressed page, and then materializes the relevant values. At step 4, the materialized results or columns are compressed. At step 5, the compressed results are projected and then transmitted to the client in compressed form. Apart from the compressed column data, the meta-information used during the compression of the column data is also transmitted to the client. At step 6, when the client receives the compressed data from the DBMS along with the meta-information, it uses the meta-information to decompress the compressed data and uses the result for further processing.
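The six steps can be followed end to end in a minimal sketch. Everything concrete here is an assumption for illustration: the in-memory TABLE, the request fields ("projection", "filter", "fetch_size"), the use of zlib over JSON as the compression technique, and the function names server_handle and client_receive. Page handling is left out so that the request and response flow stays visible.

```python
import json
import zlib

# Hypothetical column store: each column is a plain Python list.
TABLE = {
    "pdt_id": [1, 2, 3, 4],
    "pdt_type": ["A", "B", "A", "B"],
    "sales_value": [50, 150, 300, 80],
}


def server_handle(request):
    """Steps 2 to 5: filter, materialize, compress, attach meta-information."""
    column = request["filter"]["column"]
    threshold = request["filter"]["greater_than"]
    keep = [i for i, v in enumerate(TABLE[column]) if v > threshold]
    keep = keep[: request["fetch_size"]]
    columns = {}
    for name in request["projection"]:
        materialized = [TABLE[name][i] for i in keep]
        columns[name] = zlib.compress(json.dumps(materialized).encode("utf-8"))
    meta = {"technique": "zlib", "encoding": "json"}
    return {"meta": meta, "columns": columns}


def client_receive(response):
    """Step 6: use the meta-information to select the matching decompression."""
    if response["meta"] != {"technique": "zlib", "encoding": "json"}:
        raise ValueError("unsupported compression technique")
    return {
        name: json.loads(zlib.decompress(blob))
        for name, blob in response["columns"].items()
    }


# Step 1: the client states the projection, the condition, and a fetch size.
request = {
    "projection": ["pdt_id", "pdt_type"],
    "filter": {"column": "sales_value", "greater_than": 100},
    "fetch_size": 10,
}
result = client_receive(server_handle(request))
```

The response stays column-oriented all the way to the client; rows are never assembled on the server in this sketch.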

Referring now to figure 4, the figure illustrates an example of the processing of the data pages to obtain the materialized results, in accordance with an embodiment of the present subject matter. To understand the processing shown in figure 3, consider that a query SELECT pdt_id, pdt_type FROM dimension_tab WHERE sales_value>100; is received from the client to fetch data stored in the database in the form of data pages. As shown in figure 4, assume that the table is compressed per column, typically the column “pdt_type”. The query matches block 1 completely and a part of block 2. The present invention applies the filter condition to each filter column independently. This step uses sequential access of memory and applies the SMA concept. As shown in figure 4, the filter condition in this case is (sales_value>100). If a whole page matches the condition, the compressed page is materialized without being decompressed. If only a part of the page matches, the relevant values are materialized. As shown in figure 4, the materialized result is compressed, and the compressed result is then forwarded to the client. The meta-information used for the compression is also forwarded to the client so that it can decompress the results and use them for further processing.

Referring now to figure 5, the figure 5 illustrates a computing device configured to transmit the set of column groups or the part of the column in compressed form and at least a meta-information associated, in accordance with an embodiment of the present subject matter. In one implementation, a computing device 500 storing data in a column-store, relational database 516, wherein the column-store relational database 516 comprises data in rows of at least one table and a set of column groups of one or more columns is disclosed.

Although the present subject matter is explained considering that the computing device 500 is configured to transmit the set of column groups or the part of the column in compressed form along with at least a meta-information associated therewith, it may be understood that the computing device 500 may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, and the like. It will be understood that the computing device 500 may be accessed by multiple users through one or more user devices (not shown) or applications residing on the user devices. Examples of the computing device 500 may include, but are not limited to, a portable computer, a personal digital assistant, a handheld device, a workstation, routers, and servers (DBMS). The computing device 500 is communicatively coupled to other devices (not shown) through a network (not shown); for example, there may be multiple apparatuses connected to the computing device 500 that access the database by sending a database query to fetch the result/data stored in the database.

In one implementation, the network may be a wireless network, a wired network or a combination thereof. The network can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.

In one implementation, the computing device 500 may include at least one processor 502, an input/output (I/O) interface 504, and a memory 506. The at least one processor 502 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the at least one processor 502 is configured to fetch and execute computer-readable instructions stored in the memory 506.

The I/O interface 504 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 504 may allow the computing device 500 to interact with a user directly or through the client devices (not shown). Further, the I/O interface 504 may enable the computing device 500 to communicate with other computing devices, such as web servers and external data servers (not shown). The I/O interface 504 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface 504 may include one or more ports for connecting a number of devices to one another or to another server.

The memory 506 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.

The modules include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types.

In one implementation, as shown in figure 5, the computing device 500 is disclosed. The computing device 500 may include a processor 502 and a memory 506 coupled to the processor for executing a plurality of modules present in the memory. The modules may include a receiving module 508, an execution module 510, a checking module 512, a materialization module 514, a transmitting module 518, a decompression module 520, a filter module 522, and a compression module 524. The computing device 500 may further include a database 516 storing the data in the form of tables and rows, specifically a columnar database.

In one implementation, the receiving module 508 is configured to receive at least one request to access the data stored in the column-store relational database. The execution module 510 is configured to execute at least one condition received in the request. The checking module 512 is configured to check whether the condition is satisfied by the columns, wherein the condition is satisfied if at least the set of column groups or a part of the column matches the condition as received in the request. The materialization module 514 is configured to materialize the set of column groups or the part of the column satisfying the condition. The transmitting module 518 is configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

In one implementation, the decompression module 520 is configured, if only the part of the column matches the condition, to decompress the set of column groups, wherein the part of the column is present in the set of column groups. The filter module 522 is configured to filter the part of the column matching the condition selected from the set of column groups. The compression module 524 is configured to compress the part of the column filtered, so that the part of the column filtered is transmitted along with the meta-information associated with at least one technique used for compressing the columns, in response to the request received.

In one implementation, the data stored in a column-store, relational database 516 is preferably in compressed column oriented format.

In one implementation, the checking module 512 is further configured to sequentially access the memory storing the relational database, preferably, by applying service management automation (SMA).

In one implementation, the meta-information is transmitted separately from the set of column groups or the part of the column in the compressed form.

In one implementation, the meta-information is embedded in the set of column groups or the part of the column in the compressed form and thereafter transmitted.

In one implementation, the request to access the data stored in column-store relational database is received, preferably, from at least one client device, or an application program.

In one implementation, a system is disclosed comprising at least one client device or application program to transmit at least one query to access data stored in a column-store relational database 516, and at least one computing device 500 storing data in the column-store relational database 516, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns. The computing device 500 comprises the receiving module 508 configured to receive the request from the client device to access the data stored in the column-store relational database; the execution module 510 configured to execute at least one condition received in the request; the checking module 512 configured to check whether the condition is satisfied by the columns, wherein the condition is satisfied if at least the set of column groups or a part of the column matches the condition as received in the request; the materialization module 514 configured to materialize the set of column groups or the part of the column satisfying the condition; and the transmitting module 518 configured to transmit the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, to the client device. The client device, on receipt of a response from the computing device, is configured to decompress the set of column groups or the part of the column received in the compressed form using the meta-information received from the computing device.

Figure 6 illustrates a flow chart of the overall processing of the query/request, in accordance with an embodiment of the present subject matter. In a database, the data is stored in database internal storage. The memory is divided into pages and segments. In the case of columnar databases, a group of columns (which can fit into a page, etc.) is compressed and stored together. The grouping of columns is based on the database internal implementation. In one implementation, as shown in figure 6, when a query is executed on these columns, it is possible that the whole group of columns satisfies the filter condition, or that only a part of the column satisfies the condition. This information is usually stored as metadata in the page (which stores the group of columns). If the whole group satisfies the filter condition, the present invention sends the whole compressed group directly. If the whole group does not satisfy the condition, the present invention decompresses the group and filters out the selected results. The selected results are then compressed (if required) into a resultant group, and the result is sent back to the client along with the meta-information.
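The whole-group versus partial-group decision in this flow can be sketched as follows. The sketch assumes that the page metadata holds the minimum and maximum value of the filter column, and that pages are compressed with zlib over JSON; both are illustrative assumptions, as the specification states only that the information is stored as metadata in the page. The names make_page and answer_greater_than are hypothetical.

```python
import json
import zlib


def make_page(values):
    """Compress one page of a column and keep min/max next to it as page metadata."""
    return {
        "min": min(values),
        "max": max(values),
        "blob": zlib.compress(json.dumps(values).encode("utf-8")),
    }


def answer_greater_than(pages, threshold):
    """Return the compressed result pages for the predicate value > threshold."""
    out = []
    for page in pages:
        if page["min"] > threshold:
            # Whole page matches: forward the stored bytes without decompressing.
            out.append(page["blob"])
        elif page["max"] > threshold:
            # Partial match: decompress, filter, and recompress the selected values.
            values = json.loads(zlib.decompress(page["blob"]))
            kept = [v for v in values if v > threshold]
            out.append(zlib.compress(json.dumps(kept).encode("utf-8")))
        # Otherwise no value in the page can match, so the page is skipped.
    return out


# Hypothetical sales_value pages: full match, partial match, no match.
pages = [make_page([150, 200, 120]), make_page([90, 130, 40]), make_page([10, 20])]
result = answer_greater_than(pages, 100)
decoded = [json.loads(zlib.decompress(blob)) for blob in result]
```

Only the second page is decompressed on the server in this example; the first is forwarded byte for byte and the third is never read beyond its metadata.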

Figure 7 illustrates a method of transmitting data stored in a column-store, relational database; in accordance with an embodiment of the present subject matter. The method may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.

The order in which the method is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method or alternate methods. Additionally, individual blocks may be deleted from the method without departing from the protection scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof. However, for ease of explanation, in the embodiments described below, the method may be considered to be implemented in the above described computing device 500.

Referring now to figure 7, the figure 7 illustrates a method of transmitting data stored in a column-store, relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns.

In one implementation, the method, in response to a request received, preferably, from at least one client device, or an application program, to access the data stored in column-store relational database, comprises responding, by a processor of one or more computing device, with at least one column in compressed form along with at least a meta-information associated with at least one technique used for compressing the column to the request, wherein the column, matching with at least one condition received in the request, is transmitted in the compressed form to the client device or the application program.

In one implementation, the data stored in a column-store, relational database is preferably in compressed column oriented format.

In one implementation, the meta-information is transmitted separately from the set of column groups or the part of the column in the compressed form.

In one implementation, the meta-information is embedded in the set of column groups or the part of the column in the compressed form and thereafter transmitted.

At block 702, at least one request to access the data stored in the column-store relational database is received by at least one processor of at least one computing device. The data stored in the column-store relational database is preferably in a compressed, column-oriented format.

At block 704, at least one condition received in the request is executed.

At block 706, it is checked whether the condition is satisfied by the columns. The condition is satisfied if at least the set of column groups or a part of the column matches the condition as received in the request. For this check, at least a memory storing the relational database is accessed sequentially, preferably by applying service management automation (SMA); any other technique may also be used for checking.

In one implementation, as shown in figure 6, when a query is executed on these columns, it is possible that the whole group of columns satisfies the filter condition, or that only a part of the column satisfies the condition. This information is usually stored as metadata in the page (which stores the group of columns). If the whole group satisfies the filter condition, the present invention sends the whole compressed group directly. If the whole group does not satisfy the condition, the present invention decompresses the group and filters out the selected results. The selected results are then compressed (if required) into a resultant group, and the result is sent back to the client along with the meta-information.

At block 708, the set of column groups or the part of the column satisfying the condition is materialized. In one implementation, if only the part of the column matches the condition, the set of column groups is decompressed, wherein the part of the column is present in the set of column groups. The part of the column matching the condition is filtered from the set of column groups. The part of the column filtered is then compressed. It may be understood by the person skilled in the art that the term “materialization” means copying the column data from database internal storage (pages, segments, etc.) to the result buffer. The result buffer is the storage which is used to send the data from server to client.

At block 710, the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, are transmitted in response to the request received. In one implementation, the meta-information is transmitted separately from the set of column groups or the part of the column in the compressed form. In one implementation, the meta-information is embedded in the set of column groups or the part of the column in the compressed form and thereafter transmitted.

In one implementation, a method of transmitting data stored in a column-store, relational database, wherein the column-store relational database comprises data in rows of at least one table and a set of column groups of one or more columns is disclosed. The method comprises, in response to a request received, preferably, from at least one client device, or an application program, to access the data stored in column-store relational database: responding, by a processor of one or more computing device, with at least one column in compressed form along with at least a meta-information associated with at least one technique used for compressing the column to the request, wherein the column, matching with at least one condition received in the request, is transmitted in the compressed form to the client device or the application program.

In one implementation, a method of transmitting data stored in a column-store relational database is disclosed. The method comprises receiving, by at least one processor of at least one computing device, at least one request to access the data stored in the column-store relational database; checking, by the processor, if the at least one condition received in the request is being satisfied by the set of column groups, wherein the condition is being satisfied if the set of column groups or a part of the set of column groups matches the condition as received in the request; materializing, by the processor, the set of column groups or the part of the column satisfying the condition; and transmitting, by the processor, the set of column groups or the part of the column materialized in compressed form along with at least a meta-information associated with at least one technique used for compressing the columns, in response to the request received.

The present invention enables the meta-information to be sent in two different ways: it can be embedded with the actual data, or it can be sent separately. Many times the data cannot be sent in one packet; in that case the meta-information need not be sent with each packet. In some embodiments, the meta-information may be packed with the actual data (even if the data does not fit into one packet) if the client processing so requires. This is useful when multiple client threads are processing the packets, so that the actual data and metadata are kept together to optimize the processing.

Referring now to figure 8, the figure illustrates the processing of an exemplary query, in accordance with an embodiment of the present subject matter. Consider that a query SELECT pdt_id, pdt_type, sales_value FROM dimension_tab WHERE sales_value<50; is received from the client to fetch data, shown as data pages in the leftmost block. On receipt of the query, the present invention checks whether result compression is enabled. For every column materialized, it performs dictionary-based compression and projects the compressed columns to the client, as shown in the rightmost block denoted as compressed. As shown in figure 8, the column pdt_type contains only two values; hence the compression ratio is very high in this scenario. The whole row, however, is never duplicated.
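Dictionary-based compression of a low-cardinality column such as pdt_type can be sketched as below. The column values are hypothetical, since the contents of figure 8 are not reproduced in the text, and the functions dictionary_encode and dictionary_decode are illustrative names. The dictionary plays the role of the meta-information that the client needs in order to interpret the codes.

```python
def dictionary_encode(values):
    """Replace each value by a small integer code; return (dictionary, codes)."""
    dictionary = []   # code -> original value, sent to the client as meta-information
    index = {}        # original value -> code, used only while encoding
    codes = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Client side: map the codes back to values using the dictionary."""
    return [dictionary[c] for c in codes]


# Hypothetical pdt_type column holding only two distinct values.
pdt_type = ["retail", "wholesale", "retail", "retail", "wholesale", "retail"]
dictionary, codes = dictionary_encode(pdt_type)
restored = dictionary_decode(dictionary, codes)
```

With only two distinct values, each entry in the column can be represented by a single bit plus one shared two-entry dictionary, which is why such a column compresses so well.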

In one implementation, to exhibit the technical benefit and the advantages achieved by the present invention, the present invention is implemented in an in-memory column store database and subjected to three different scenarios. In the first scenario, the data in the server is decompressed and changed to row format before sending. In the second scenario, the compressed data is sent to the client, which in turn decompresses the data before processing. In the third scenario, the decompressed data is sent to the client and the client uses the decompressed data. Figure 9 illustrates the effect of the present invention in an in-memory column store database when subjected to the above three scenarios. It may be seen from figure 9 that throughput improved by an average of 3 times in our experiment. The experiments were carried out on a system with the following configuration:
64 Bit OSCA board with 2 NUMA Nodes
Each NUMA node having 8 CPUs (HT to 16) of 2100.137 MHz
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K

Apart from what is explained above, the present invention also includes the following advantages:
• A computing device, system and method for transferring data from a database server to a connected client in compressed columnar form is disclosed. The present invention is executed on data stored in a compressed, column-oriented format.
• The present invention enables the table data to be stored in compressed format, with metadata stored so that database queries can be executed directly on the compressed data.

• The present invention queries the compressed data to form a compressed result in columnar format.
• The present invention sends the compressed columnar data to the client for further processing. No preprocessing is done at the server side for the result data.
• The present invention enables the client to receive the compressed columnar data and operates on it using the supported metadata.
• The present invention can be deployed in a DBMS. The system can be extended to any system having large amount of data movements between its key modules.
• The present invention can be applied to NoSQL databases which use compression technology to reduce the data footprint.
• The present invention can be used by a server application, which need not be a DBMS, to reduce the data communication cost and to increase the client efficiency.
• The present invention can be used not only in typical client server model but in large systems with many dependent modules. The data transfer between modules can be optimized using the method explained in the invention.
• The present invention can be used to decompress the data at server side and then compress the data in client suitable format and then send it to client.
• The present invention is applicable for all scenarios where all the records have to be fetched before the application starts processing. This is the normal case for many DB applications. The solution is more suitable for repeatable read and snapshot isolation levels where a large number of records needs to be selected. The solution is not feasible for low selectivity queries on read-committed isolation. The solution is sub-optimal for cases where the application/client processes row by row.
• The present invention reduces the row construction cost in the server by pushing columns to the client. A method is further explained for pushing the data in compressed format to reduce the data communication cost. If the client is advanced enough, it can process the compressed data directly, which allows the client to keep more data in system cache and thereby increases its overall performance.
• The present invention deals with increasing the network throughput of the system by sending the data from server to client side in the same format as stored in the server, without any transformation.
• The present invention enables compressed data to be sent to the client or the consumer of the data in the same format as stored in the data storage.
• The present invention enables the compressed column-major data to be sent to the client so that the network overhead is minimal.

A person of ordinary skill in the art may be aware that in combination with the examples described in the embodiments disclosed in this specification, units and algorithm steps may be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether the functions are performed by hardware or software depends on the particular applications and design constraint conditions of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but it should not be considered that the implementation goes beyond the scope of the present invention.

It may be clearly understood by a person skilled in the art that for the purpose of convenient and brief description, for a detailed working process of the foregoing system, apparatus, and unit, reference may be made to a corresponding process in the foregoing method embodiments, and details are not described herein again.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiment is merely exemplary. For example, the unit division is merely logical function division and may be other division in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.

When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions may be implemented in a form of a software product. The computer software product is stored in a storage medium, and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or a part of the steps of the methods described in the embodiment of the present invention. The foregoing storage medium includes: any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.

Although implementations for the computing device, system, and method therefor achieving efficient data communication in columnar databases have been described in language specific to structural features and/or methods, it is to be understood that the appended claims are not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations of the computing device, system, and method therefor achieving efficient data communication in columnar databases.

Documents

Application Documents

# Name Date
1 Power of Attorney [16-08-2016(online)].pdf 2016-08-16
2 Form 3 [16-08-2016(online)].pdf 2016-08-16
3 Form 18 [16-08-2016(online)].pdf_72.pdf 2016-08-16
4 Form 18 [16-08-2016(online)].pdf 2016-08-16
5 Drawing [16-08-2016(online)].pdf 2016-08-16
6 Description(Complete) [16-08-2016(online)].pdf 2016-08-16
7 abstract 201641027904.jpg 2016-10-24
8 Other Patent Document [16-01-2017(online)].pdf 2017-01-16
9 Correspondence by Agent_Form1_23-01-2017.pdf 2017-01-23
10 201641027904-8(i)-Substitution-Change Of Applicant - Form 6 [24-03-2018(online)].pdf 2018-03-24
11 201641027904-ASSIGNMENT DOCUMENTS [24-03-2018(online)].pdf 2018-03-24
12 201641027904-PA [24-03-2018(online)].pdf 2018-03-24
13 Correspondence by Agent_Assignment_16-04-2018.pdf 2018-04-16
14 201641027904-FER.pdf 2020-04-28
15 201641027904-CLAIMS [22-08-2020(online)].pdf 2020-08-22
16 201641027904-DRAWING [22-08-2020(online)].pdf 2020-08-22
17 201641027904-FER_SER_REPLY [22-08-2020(online)].pdf 2020-08-22
18 201641027904-OTHERS [22-08-2020(online)].pdf 2020-08-22
19 201641027904-PatentCertificate21-06-2022.pdf 2022-06-21
20 201641027904-IntimationOfGrant21-06-2022.pdf 2022-06-21

Search Strategy

1 searchstrategy201641027904_20-02-2020.pdf

ERegister / Renewals

3rd: 22 Aug 2022 (From 16/08/2018 - To 16/08/2019)
4th: 22 Aug 2022 (From 16/08/2019 - To 16/08/2020)
5th: 22 Aug 2022 (From 16/08/2020 - To 16/08/2021)
6th: 22 Aug 2022 (From 16/08/2021 - To 16/08/2022)
7th: 22 Aug 2022 (From 16/08/2022 - To 16/08/2023)
8th: 12 Jul 2023 (From 16/08/2023 - To 16/08/2024)
9th: 12 Jul 2024 (From 16/08/2024 - To 16/08/2025)
10th: 08 Jul 2025 (From 16/08/2025 - To 16/08/2026)