Abstract: Systems and methods for predicting execution time of applications executed on heterogeneous clusters are provided. The system includes a heterogeneous map and reduce (MR) simulator that is configured to (a) obtain information pertaining to (i) number of MR jobs and (ii) time of the number of MR jobs for each node type in a heterogeneous cluster, (b) obtain information pertaining to (i) number of machines, and (ii) number of the MR jobs running on the machines in the heterogeneous cluster, (c) obtain information pertaining to a dataset, (d) execute, using one or more MR jobs, at least one application with the dataset on the heterogeneous cluster based on the obtained information, and (e) predict an execution time of said one or more MR jobs executing the at least one application with the dataset on the heterogeneous cluster.
Claims:
1. A processor implemented method, comprising:
executing, by a map and reduce (MR) job profiler, at least one application with a first dataset on a first heterogeneous cluster comprising a first set of nodes;
determining, by a cluster configurator, one or more MR jobs executing said at least one application;
measuring, by a system profile collector, an execution time of said one or more MR jobs executing said at least one application with said first dataset on said first heterogeneous cluster comprising said first set of nodes;
predicting, using a linear regression technique, an execution time of one or more components of said one or more MR jobs executing said at least one application with a second dataset;
executing, by a heterogeneous MR simulator, said at least one application with said second dataset on a second heterogeneous cluster comprising a second set of nodes; and
predicting, by said heterogeneous MR simulator, an execution time of said one or more MR jobs executing said at least one application with said second dataset on said second heterogeneous cluster.
2. The processor implemented method of claim 1, wherein said first heterogeneous cluster is smaller as compared to said second heterogeneous cluster.
3. The processor implemented method of claim 1, wherein said first dataset is smaller as compared to said second dataset.
4. The processor implemented method of claim 1, further comprising:
processing, by a Hive Query Language (HiveQL) MR jobs profiler, a Hive query such that said Hive query is translated into a Directed Acyclic Graph (DAG) of one or more MR jobs;
executing said DAG of one or more MR jobs on said first dataset on said first heterogeneous cluster comprising said first set of nodes; and
predicting, by said heterogeneous MR simulator, a Hive query execution time of said DAG of said one or more MR jobs executed on said first dataset on said first heterogeneous cluster comprising said first set of nodes.
5. A system comprising:
a memory storing instructions;
an input/output interface; and
a hardware processor coupled to said memory via said input/output interface, wherein said hardware processor is configured by said instructions to execute:
a map and reduce (MR) job profiler that is configured to execute at least one application with a first dataset on a first heterogeneous cluster comprising a first set of nodes;
a cluster configurator that is configured to determine one or more MR jobs executing said at least one application;
a system profile collector that is configured to
measure an execution time of said one or more MR jobs executing said at least one application with said first dataset on said first heterogeneous cluster comprising said first set of nodes, and
predict, using a linear regression technique, an execution time of one or more components of said one or more MR jobs executing said at least one application with a second dataset; and
a heterogeneous MR simulator that is configured to
execute said at least one application with said second dataset on a second heterogeneous cluster comprising a second set of nodes, and
predict an execution time of said one or more MR jobs executing said at least one application with said second dataset on said second heterogeneous cluster.
6. The system of claim 5, wherein said first heterogeneous cluster is smaller as compared to said second heterogeneous cluster.
7. The system of claim 6, wherein said first dataset is smaller as compared to said second dataset.
8. The system of claim 6, further comprising:
a Hive Query Language (HiveQL) MR jobs profiler that is configured to
process a Hive query such that said Hive query is translated into a Directed Acyclic Graph (DAG) of one or more MR jobs,
execute said DAG of one or more MR jobs on said first dataset on said first heterogeneous cluster comprising said first set of nodes, wherein said heterogeneous MR simulator predicts a Hive query execution time of said DAG of said one or more MR jobs executed on said first dataset on said first heterogeneous cluster comprising said first set of nodes.
9. A system comprising:
a memory storing instructions;
an input/output interface; and
a hardware processor coupled to said memory via said input/output interface, wherein said hardware processor is configured by said instructions to execute:
a heterogeneous map and reduce (MR) simulator that is configured to
(a) obtain, from a MR job profiler, information pertaining to (i) number of MR jobs and (ii) time of said number of MR jobs for each node type in a heterogeneous cluster,
(b) obtain, from a cluster configurator, information pertaining to (i) number of machines, and (ii) number of said MR jobs running on said machines in said heterogeneous cluster,
(c) obtain information pertaining to a dataset,
(d) execute, using one or more MR jobs, at least one application with said dataset on said heterogeneous cluster based on said obtained information, and
(e) predict an execution time of said one or more MR jobs executing said at least one application with said dataset on said heterogeneous cluster.
Description:
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
SYSTEMS AND METHODS FOR PREDICTING EXECUTION TIME OF APPLICATIONS EXECUTED ON HETEROGENEOUS CLUSTERS
Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the embodiments and the manner in which it is to be performed.
TECHNICAL FIELD
[0001] The embodiments herein generally relate to data processing techniques, and, more particularly, to systems and methods for predicting execution time of applications executed on heterogeneous clusters.
BACKGROUND
[0002] Map-Reduce is a popular paradigm for processing big data due to the wide availability of its open source implementations such as Apache Hadoop. This framework has been tuned for optimal execution of an application on a commodity cluster of homogeneous nodes. However, most organizations have surplus unused heterogeneous infrastructure that accumulates over a period of time and may comprise machines with varying numbers of CPU cores, RAM, and disk speeds. The question that remains for an entity is whether it is efficient to set up a Hadoop cluster on the available set of heterogeneous nodes for executing its analytic workload.
SUMMARY
[0003] The following presents a simplified summary of some embodiments of the disclosure in order to provide a basic understanding of the embodiments. This summary is not an extensive overview of the embodiments. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the embodiments. Its sole purpose is to present some embodiments in a simplified form as a prelude to the more detailed description that is presented below. In view of the foregoing, an embodiment herein provides systems and methods for predicting execution time of applications executed on heterogeneous clusters.
[0004] In one embodiment, a processor implemented method is provided. The method comprising: executing, by a map and reduce (MR) job profiler, at least one application with a first dataset on a first heterogeneous cluster comprising a first set of nodes, determining, by a cluster configurator, one or more MR jobs executing the at least one application, measuring, by a system profile collector, an execution time of the one or more MR jobs executing the at least one application with the first dataset on the first heterogeneous cluster comprising the first set of nodes, predicting, using a linear regression technique, an execution time of one or more components of the one or more MR jobs executing the at least one application with a second dataset, executing, by a heterogeneous MR simulator, the at least one application with the second dataset on a second heterogeneous cluster comprising a second set of nodes, and predicting, by the heterogeneous MR simulator, an execution time of the one or more MR jobs executing the at least one application with the second dataset on the second heterogeneous cluster.
[0005] In an embodiment, the first heterogeneous cluster is smaller as compared to the second heterogeneous cluster. In another embodiment, the first dataset is smaller as compared to the second dataset.
[0006] In an embodiment, the method may further comprise processing, by a HiveQL MR jobs profiler, a Hive query such that the Hive query is translated into a Directed Acyclic Graph (DAG) of one or more MR jobs, executing the DAG of one or more MR jobs on the first dataset on the first heterogeneous cluster comprising the first set of nodes, and predicting, by the heterogeneous MR simulator, a Hive query execution time of the DAG of the one or more MR jobs executed on the first dataset on the first heterogeneous cluster comprising the first set of nodes.
[0007] In another embodiment, a system is provided. The system comprising a memory storing instructions, an input/output interface, and a hardware processor coupled to the memory via the input/output interface, wherein the hardware processor is configured by the instructions to execute: a map and reduce (MR) job profiler that is configured to execute at least one application with a first dataset on a first heterogeneous cluster comprising a first set of nodes; a cluster configurator that is configured to determine one or more MR jobs executing the at least one application; a system profile collector that is configured to measure an execution time of the one or more MR jobs executing the at least one application with the first dataset on the first heterogeneous cluster comprising the first set of nodes, and predict, using a linear regression technique, an execution time of one or more components of the one or more MR jobs executing the at least one application with a second dataset; a heterogeneous MR simulator that is configured to execute the at least one application with the second dataset on a second heterogeneous cluster comprising a second set of nodes, and predict an execution time of the one or more MR jobs executing the at least one application with the second dataset on the second heterogeneous cluster.
[0008] In an embodiment, the first heterogeneous cluster is smaller as compared to the second heterogeneous cluster. In another embodiment, the first dataset is smaller as compared to the second dataset.
[0009] In an embodiment, the system may further comprise a Hive Query Language (HiveQL) MR jobs profiler that is configured to process a Hive query such that the Hive query is translated into a Directed Acyclic Graph (DAG) of one or more MR jobs, and execute the DAG of one or more MR jobs on the first dataset on the first heterogeneous cluster comprising the first set of nodes, wherein the heterogeneous MR simulator predicts a Hive query execution time of the DAG of the one or more MR jobs executed on the first dataset on the first heterogeneous cluster comprising the first set of nodes.
[0010] In yet another embodiment, a system is provided. The system comprising a memory storing instructions, an input/output interface, and a hardware processor coupled to the memory via the input/output interface, wherein the hardware processor is configured by the instructions to execute: a heterogeneous map and reduce (MR) simulator that is configured to (a) obtain, from a MR job profiler, information pertaining to (i) number of MR jobs and (ii) time of the number of MR jobs for each node type in a heterogeneous cluster, (b) obtain, from a cluster configurator, information pertaining to (i) number of machines, and (ii) number of the MR jobs running on the machines in the heterogeneous cluster, (c) obtain information pertaining to a dataset, (d) execute, using one or more MR jobs, at least one application with the dataset on the heterogeneous cluster based on the obtained information, and (e) predict an execution time of the one or more MR jobs executing the at least one application with the dataset on the heterogeneous cluster.
[0011] In a further embodiment, one or more non-transitory machine readable information storage mediums comprising one or more instructions is provided. The one or more instructions which when executed by one or more hardware processors causes executing at least one application with a first dataset on a first heterogeneous cluster comprising a first set of nodes, determining one or more MR jobs executing the at least one application, measuring an execution time of the one or more MR jobs executing the at least one application with the first dataset on the first heterogeneous cluster comprising the first set of nodes, predicting, using a linear regression technique, an execution time of one or more components of the one or more MR jobs executing the at least one application with a second dataset, executing, by a heterogeneous MR simulator, the at least one application with the second dataset on a second heterogeneous cluster comprising a second set of nodes, and predicting, by the heterogeneous MR simulator, an execution time of the one or more MR jobs executing the at least one application with the second dataset on the second heterogeneous cluster.
[0012] In an embodiment, the one or more instructions which when executed by one or more hardware processors causes processing a Hive query such that the Hive query is translated into a Directed Acyclic Graph (DAG) of one or more MR jobs, executing the DAG of one or more MR jobs on the first dataset on the first heterogeneous cluster comprising the first set of nodes, and predicting, using the heterogeneous MR simulator, a Hive query execution time of the DAG of the one or more MR jobs executed on the first dataset on the first heterogeneous cluster comprising the first set of nodes.
[0013] In yet a further embodiment, one or more non-transitory machine readable information storage mediums comprising one or more instructions is provided. The one or more instructions which when executed by one or more hardware processors causes (a) obtaining information pertaining to (i) number of MR jobs and (ii) time of the number of MR jobs for each node type in a heterogeneous cluster, (b) obtaining information pertaining to (i) number of machines, and (ii) number of the MR jobs running on the machines in the heterogeneous cluster, (c) obtaining information pertaining to a dataset, (d) executing, using one or more MR jobs, at least one application with the dataset on the heterogeneous cluster based on the obtained information, and (e) predicting an execution time of the one or more MR jobs executing the at least one application with the dataset on the heterogeneous cluster.
[0014] It should be appreciated by those skilled in the art that any block diagram herein represents conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in a computer readable medium and so executed by a computing device or processor, whether or not such computing device or processor is explicitly shown.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The embodiments herein will be better understood from the following detailed description with reference to the drawings, in which:
[0016] FIG. 1 is a system for predicting execution time of applications executed on heterogeneous clusters according to an embodiment of the present disclosure;
[0017] FIG. 2 is a flow diagram illustrating a method for predicting execution time of applications executed on heterogeneous clusters using the system of FIG. 1 according to an embodiment of the present disclosure;
[0018] FIG. 3 is a flow diagram illustrating a method for predicting execution time of applications executed on heterogeneous clusters using a heterogeneous map and reduce (MR) simulator of the system 100 according to an embodiment of the present disclosure;
[0019] FIG. 4A-4C show results for a financial domain query executed on a heterogeneous cluster A according to an embodiment of the present disclosure;
[0020] FIGS. 5A-5C show results for a telecom domain query executed on the heterogeneous cluster A according to an embodiment of the present disclosure;
[0021] FIGS. 6A-6C show results for the financial domain query executed on a heterogeneous cluster B according to an embodiment of the present disclosure;
[0022] FIGS. 7A-7C show results for the telecom domain query executed on the heterogeneous cluster B according to an embodiment of the present disclosure;
[0023] FIGS. 8A-8C show results for the financial domain query executed on a heterogeneous cluster C according to an embodiment of the present disclosure;
[0024] FIG. 9 shows results of the same financial domain query executed on the heterogeneous cluster A according to an embodiment of the present disclosure; and
[0025] FIG. 10 depicts a graphical representation of predicted job completion time versus actual job completion time for experiments executed on different types of heterogeneous clusters with data sizes from X GB to Y GB according to an embodiment of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[0026] The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
[0027] The words “comprising,” “having,” “containing,” and “including,” and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items.
[0028] It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Although any systems and methods similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present disclosure, the preferred systems and methods are now described.
[0029] Some embodiments of this disclosure, illustrating all its features, will now be discussed in detail. The disclosed embodiments are merely exemplary of the disclosure, which may be embodied in various forms.
[0030] Before setting forth the detailed explanation, it is noted that all of the discussion below, regardless of the particular implementation being described, is exemplary in nature, rather than limiting.
[0031] Referring now to the drawings, and more particularly to FIGS. 1 through 10, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[0032] FIG. 1 is a system 100 for predicting execution time of applications executed on heterogeneous clusters according to an embodiment of the present disclosure. The system 100 comprises a memory 102, a hardware processor 104, and an input/output (I/O) interface 106. The memory 102 further includes one or more modules 108 (or modules 108). The memory 102, the hardware processor 104, the input/output (I/O) interface 106, and/or the modules 108 may be coupled by a system bus or a similar mechanism.
[0033] The memory 102 may store instructions, any number of pieces of information, and data used by a computer system, for example the system 100, to implement the functions (or embodiments) of the present disclosure. The memory 102 may include, for example, volatile memory and/or non-volatile memory. Examples of volatile memory include, but are not limited to, random access memory (RAM), dynamic random access memory, static random access memory, and the like. Examples of non-volatile memory include, but are not limited to, hard disks, magnetic tapes, optical disks, programmable read only memory, erasable programmable read only memory, electrically erasable programmable read only memory (EEPROM), flash memory, hard drives, and the like. The memory 102 may be configured to store information, data, applications, instructions or the like for enabling the system 100 to carry out various functions in accordance with various example embodiments.
[0034] Additionally or alternatively, the memory 102 may be configured to store instructions which when executed by the hardware processor 104 causes the system 100 to behave in a manner as described in various embodiments (e.g., obtaining information pertaining to number of MR jobs and time of the number of MR jobs for each node type in a heterogeneous cluster, obtaining information pertaining to (i) number of machines, and (ii) number of the MR jobs running on the machines in the heterogeneous cluster, obtaining information pertaining to a dataset, executing, using one or more MR jobs, at least one application with the dataset on the heterogeneous cluster based on the obtained information, and predicting an execution time of the one or more MR jobs executing the at least one application with the dataset on the heterogeneous cluster). The memory 102 stores information for example, information related to one or more applications, datasets, and performance and execution time of the one or more applications, and the like.
[0035] The hardware processor 104 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Further, the hardware processor 104 may comprise a multi-core architecture. Among other capabilities, the hardware processor 104 is configured to fetch and execute computer-readable instructions or modules stored in the memory 102. The hardware processor 104 may include circuitry implementing, among others, audio and logic functions associated with the communication. For example, the hardware processor 104 may include, but is not limited to, one or more digital signal processors (DSPs), one or more microprocessors, one or more special-purpose computer chips, one or more field-programmable gate arrays (FPGAs), one or more application-specific integrated circuits (ASICs), one or more computer(s), various analog to digital converters, digital to analog converters, and/or other support circuits.
[0036] The hardware processor 104 thus may also include the functionality to encode messages and/or data or information. The hardware processor 104 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the hardware processor 104. Further, the hardware processor 104 may include functionality to execute one or more software programs, which may be stored in the memory 102 or otherwise accessible to the hardware processor 104. The system 100 may execute the modules 108. The modules 108, for example, may comprise, but are not limited to a map and reduce (MR) job profiler 108A, a cluster configurator 108B, a system profile collector 108C, a heterogeneous MR simulator 108D, and a Hive Query Language (HiveQL) MR jobs profiler 108E.
[0037] FIG. 2, with reference to FIG. 1, is a flow diagram illustrating a method for predicting execution time of applications executed on heterogeneous clusters using the system 100 according to an embodiment of the present disclosure. The steps of the method of the present disclosure will now be explained with reference to the components of the system 100 as depicted in FIG. 1. At step 202, the map and reduce (MR) job profiler 108A executes at least one application with a first dataset on a first heterogeneous cluster comprising a first set of nodes. At step 204, the cluster configurator 108B determines one or more MR jobs executing the at least one application. At step 206, the system profile collector 108C measures (or collects) an execution time of the one or more MR jobs executing the at least one application with the first dataset on the first heterogeneous cluster comprising the first set of nodes. At step 208, the system 100 implements a linear regression technique to predict an execution time of one or more components of the one or more MR jobs executing the at least one application with a second dataset. In an embodiment, the one or more components comprise one or more reduce jobs of the one or more MR jobs. At step 210, the heterogeneous MR simulator 108D executes the at least one application with the second dataset on a second heterogeneous cluster comprising a second set of nodes, and predicts an execution time of the one or more MR jobs executing the at least one application with the second dataset on the second heterogeneous cluster. In an embodiment, the first heterogeneous cluster is smaller as compared to the second heterogeneous cluster. In another embodiment, the first heterogeneous cluster is a smaller heterogeneous cluster. In yet another embodiment, the second heterogeneous cluster is a larger heterogeneous cluster. Likewise, the first dataset is smaller as compared to the second dataset.
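The linear regression step at 208 can be illustrated with a small sketch. The following is a hypothetical illustration only (the dataset sizes, profiled times, and function names are not from the actual implementation): an ordinary least-squares line is fitted to reduce-phase times profiled at small dataset sizes, then extrapolated to the larger second dataset.

```python
# Sketch of step 208: fit a line to reduce-phase times profiled at small
# dataset sizes, then extrapolate to a larger second dataset.
# All numeric values below are hypothetical illustration values.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Reduce-phase times profiled on the first (smaller) dataset: GB -> seconds.
sizes_gb = [1, 2, 4, 8]
reduce_secs = [12.0, 22.0, 41.0, 80.0]

a, b = fit_line(sizes_gb, reduce_secs)

def predict_reduce_time(size_gb):
    """Extrapolate the fitted line to a new (larger) dataset size."""
    return a * size_gb + b

predicted = predict_reduce_time(100)  # second dataset of 100 GB
```

Under the uniform-data-distribution assumption stated later in the disclosure, this single fitted line per reduce component suffices; skewed distributions would require a per-partition model.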
[0038] The method may further comprise processing, by the Hive Query Language (HiveQL) MR jobs profiler 108E, a Hive query such that the Hive query is translated into a Directed Acyclic Graph (DAG) of one or more MR jobs, and executing the DAG of one or more MR jobs on the first dataset on the first heterogeneous cluster comprising the first set of nodes. The method may further comprise predicting, by using the heterogeneous MR simulator 108D, a Hive query execution time of the DAG of the one or more MR jobs executed on the first dataset on the first heterogeneous cluster comprising the first set of nodes.
[0039] FIG. 3, with reference to FIGS. 1-2, is a flow diagram illustrating a method for predicting execution time of applications executed on heterogeneous clusters using the heterogeneous MR simulator 108D of the system 100 according to an embodiment of the present disclosure. The steps of the method of the present disclosure will now be explained with reference to the heterogeneous MR simulator 108D of the system 100 as depicted in FIG. 1. At step 302, the heterogeneous map and reduce (MR) simulator 108D obtains, from the MR job profiler 108A, information pertaining to (i) number of MR jobs and (ii) time of the number of MR jobs for each node type in a heterogeneous cluster. At step 304, the heterogeneous map and reduce (MR) simulator 108D obtains, from the cluster configurator 108B information pertaining to (i) number of machines, and (ii) number of the MR jobs running on the machines in the heterogeneous cluster. At step 306, the heterogeneous map and reduce (MR) simulator 108D obtains information pertaining to a dataset. At step 308, the heterogeneous map and reduce (MR) simulator 108D executes using one or more MR jobs, at least one application with the dataset on the heterogeneous cluster based on the obtained information. At step 310, the heterogeneous map and reduce (MR) simulator 108D predicts an execution time of the one or more MR jobs executing the at least one application with the dataset on the heterogeneous cluster (e.g., a larger heterogeneous cluster).
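The inputs gathered in steps 302 through 306 can be grouped into a single structure before simulation begins. The sketch below is a hypothetical illustration of that grouping; the field names, example values, and the one-map-task-per-HDFS-block rule are assumptions for illustration, not details of the actual implementation:

```python
from dataclasses import dataclass

# Hypothetical grouping of the simulator inputs described in steps 302-306.
@dataclass
class SimulatorInputs:
    # From the MR job profiler (step 302): profiled map/reduce task times
    # (seconds) per node type, e.g. {"large": {"map": 4.0, "reduce": 9.0}}.
    task_times_by_node_type: dict
    # From the cluster configurator (step 304): machine counts per node type
    # and concurrent task slots per machine, e.g. {"large": 4, "small": 1}.
    machines_by_node_type: dict
    slots_per_machine: dict
    # Dataset description (step 306), used to derive the number of map tasks.
    dataset_size_gb: float
    hdfs_block_mb: int = 128

    def num_map_tasks(self):
        # Assumes one map task per HDFS block of the input dataset.
        return max(1, int(self.dataset_size_gb * 1024 // self.hdfs_block_mb))
```

With these inputs in hand, steps 308 and 310 amount to replaying the profiled task times over the configured slots and reporting the simulated finish time.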
[0040] The heterogeneous MR Simulator 108D manages all the discrete events in simulated time and performs the appropriate action on each event. The heterogeneous MR Simulator 108D maintains data structures similar to the Hadoop job master, such as a queue of submitted jobs, jobQ. The slot allocation technique makes a new decision when a map or reduce task completes. The embodiments of the present disclosure enable the system 100 to simulate the jobs at the task level and do not simulate details of the TaskTrackers (worker nodes). The heterogeneous MR Simulator 108D maintains a priority queue ‘Q’ for multiple (e.g., seven) event types: job arrivals and departures, map and reduce task arrivals and departures, and an event signaling the completion of the map stage. Each event is a triplet (eventTime, eventType, jobId) where eventTime is the time at which the event will occur in the simulation; eventType is one of the multiple event types; and jobId is the job index of the job with which the event is associated. The heterogeneous MR Simulator 108D triggers events and runs the corresponding event handlers, and further tracks the number of completed map and reduce tasks and the number of free slots.
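The discrete-event core described above can be sketched as follows. The actual simulator is built in Java; this Python sketch is an illustration only, models just two of the event types (map task arrivals and departures), and assumes a single job with identical map task durations:

```python
import heapq

# Minimal sketch of the simulator's event loop: a priority queue of
# (eventTime, eventType, jobId) triplets processed in simulated-time order.
# Only map arrivals/departures are modeled; the real simulator handles
# seven event types, including the map-stage-completion event.
MAP_ARRIVAL = "map_arrival"
MAP_DEPARTURE = "map_departure"

def simulate_map_stage(num_map_tasks, num_slots, map_task_secs):
    """Return the simulated completion time of the map stage."""
    q = []                       # priority queue ordered by eventTime
    pending = num_map_tasks      # map tasks not yet allocated to a slot
    free_slots = num_slots
    now = 0.0
    heapq.heappush(q, (0.0, MAP_ARRIVAL, 0))
    while q:
        now, event_type, job_id = heapq.heappop(q)
        if event_type == MAP_DEPARTURE:
            free_slots += 1      # a completed task frees its slot
        # Slot allocation makes a new decision on every event.
        while free_slots > 0 and pending > 0:
            free_slots -= 1
            pending -= 1
            heapq.heappush(q, (now + map_task_secs, MAP_DEPARTURE, 0))
    return now
```

For example, 10 map tasks on 4 slots at 5 seconds each complete in three waves, for a simulated map stage time of 15 seconds.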
[0041] In order to handle a heterogeneous cluster, the heterogeneous MR Simulator 108D collects the map and reduce task durations separately, grouped by the type of machine on which they are executed. Simulated machines are then configured with different numbers of map and reduce slots, and with their respective map and reduce execution times, based on their hardware configuration. When a task is allocated to a particular machine in the heterogeneous MR Simulator 108D, the duration of a map/reduce task that ran on the same type of machine is picked. If the dataset size has changed, then linear regression extrapolation is performed to obtain the predicted reduce task duration. This assumes a uniform data distribution; otherwise, the heterogeneous MR Simulator 108D can take in the data growth pattern and can also predict the execution time of reduce tasks for skewed data distributions. The heterogeneous MR Simulator 108D has been built in Java and executes in less than a second even for large data sizes up to 100 GB.
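The per-machine-type duration lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the profiled durations are hypothetical, a duration is drawn at random from the samples observed on the same machine type, and reduce durations scale linearly with dataset size (the uniform-distribution assumption above):

```python
import random

# Profiled map task durations (seconds), grouped by the machine type on
# which they executed. All values are hypothetical illustration values.
profiled_map_secs = {
    "large":  [3.9, 4.1, 4.0],    # e.g. 8-core machines
    "medium": [7.8, 8.2, 8.0],    # e.g. 4-core machines
    "small":  [15.7, 16.3, 16.0], # e.g. 2-core machines
}

def map_task_duration(machine_type, rng=random):
    """When a task is allocated to a machine, draw a duration observed
    on the same machine type."""
    return rng.choice(profiled_map_secs[machine_type])

def scaled_reduce_duration(base_secs, base_gb, target_gb):
    """Linear scaling of a profiled reduce duration with dataset size,
    assuming uniform data distribution."""
    return base_secs * (target_gb / base_gb)
```

A skewed-data variant would replace the single scaling factor with a per-partition growth pattern, as the paragraph above notes the simulator can accept.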
[0042] Experimental Setup:
[0043] Experiments were performed on one homogeneous and three heterogeneous clusters. Cluster A is a homogeneous cluster consisting of 10 nodes. Each node has 8 CPU cores, 16 GB RAM, and a single hard disk. A single node was used as the JobTracker and the NameNode, and the remaining 9 were used as worker nodes. In order to make the nodes heterogeneous, some of the CPU cores were artificially turned off, yielding three different types of hardware configurations: Large with 8 cores, Medium with 4 cores, and Small with 2 cores.
[0044] For running experiments with larger data sizes, Cluster B, which consisted of more powerful machines, was used. The hardware configuration of the machines in Cluster B is shown by way of example in Table I.
Table I
| Type | Small | Medium | Large |
|---|---|---|---|
| Number of Machines | 1 | 3 | 4 |
| Number of Cores | 28 | 56 | 72 |
| RAM (in GB) | 132 | 132 | 132 |
| Disk (in GB) | 1000 | 1000 | 1000 |
[0045] Cluster C is a heterogeneous cluster of 8 machines that were accumulated over time in a corporate division. The hardware configuration of the machines is as shown by way of example in Table II.
Table II
| Type | Small | Medium | Large |
|---|---|---|---|
| Number of Machines | 1 | 4 | 3 |
| Number of Cores | 4 | 16 | 48 |
| RAM (in GB) | 4 | 16 | 64 |
| Disk (in GB) | 73 | 300 | 900 |
[0046] MR framework configurations are kept the same (as default) across all executions of the applications on all heterogeneous clusters, for both training and model validation. It was assumed that there is no overlap between the map tasks and the shuffle phase of the reduce tasks; this is to reduce complexity in explaining the model; however, the simulator can easily incorporate the overlapped execution of map and reduce tasks. In a heterogeneous cluster, each machine was configured with map and reduce slots equal to the number of cores on the machine. The distributed file system's replication factor was set to 2 because the training is performed on a small cluster size. The following applications were used to validate the model described by the embodiments of the present disclosure implemented in the system 100:
[0047] The first three applications are open source MapReduce benchmarks and are included as part of the standard Hadoop distribution. The next two applications are industrial applications implemented as HiveQL queries.
1) WordCount: This application counts the frequency of words in the input dataset. Wikipedia articles were used as the input dataset.
2) TeraSort: This application sorts key/value records. The input is generated using the TeraGen application.
3) Grep: This application finds a particular regular expression search term in the input dataset. Wikipedia articles were used as the input dataset.
4) Financial domain query: This financial application is a HiveQL query which computes the total volume of shares traded for a particular company between two particular dates. It gets compiled into a single MapReduce job. Below is an illustrative example of a HiveQL query:
SELECT SUM(routed_shares_quantity * route_price)
FROM transactions_table
WHERE issue_symbol like 'XLP' AND
from_unixtime(cast ((orf_order_received_ts/1000)
as BIGINT),'yyyy-MM-dd') >= "2015-09-15" AND
from_unixtime(cast((orf_order_received_ts/1000)
as BIGINT),'yyyy-MM-dd') <= "2015-09-25"
GROUP BY orf_order_received_ts;
5) Telecom domain query: This telecommunication application is a HiveQL query which computes the total number of calling minutes grouped by month. It gets compiled into a sequence of two MapReduce jobs. Below is an illustrative example of a HiveQL query in the telecom domain:
SELECT Year, Calendar_Year_Month,
SUM(Total_Minutes) AS Total_Minutes
FROM Billing_Fact, Date_Dimension
WHERE Billing_Fact.ID = Date_Dimension.ID
GROUP BY Year, Calendar_Year_Month
ORDER BY Year, Calendar_Year_Month
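The cluster settings described in paragraph [0046] (map and reduce slots equal to the number of cores on a machine, replication factor of 2) could be expressed for a classic Hadoop 1.x deployment roughly as follows. The values shown are for an 8-core "Large" node and are illustrative only, not taken from the actual experimental configuration files:

```xml
<!-- mapred-site.xml: per-node slot counts (Hadoop 1.x property names) -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>
</property>

<!-- hdfs-site.xml: replication factor lowered for the small training cluster -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
```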
[0048] Predicting job execution time:
[0049] The embodiments of the present disclosure present experimental results of validating the model, as discussed in FIG. 2, for predicting job execution times. The embodiments of the present disclosure enable the system 100 to collect measurements for map and reduce tasks by executing the given application on a small heterogeneous cluster with small data sizes. A linear regression is then implemented to estimate map and reduce task execution times for larger data sizes for each type of hardware node in the cluster. Finally, the MR simulator (e.g., the heterogeneous MR simulator 108D) is used with these estimated map and reduce task times to predict the application execution time on larger heterogeneous clusters for larger data sizes. More specifically, a small heterogeneous Hadoop cluster containing one node of each hardware type is configured. A small dataset is used such that all the map slots in the cluster are filled. For example, if the cluster consists of 16 map slots, then a MapReduce job is run with an input dataset size of the number of map slots times the distributed file system (DFS) block size (which is the map input split size by default) = 16 * 64 MB = 1 GB. As the input dataset size increases, it was observed that the number of map tasks increases correspondingly because the DFS block size is kept constant; hence, the map task durations remain the same. On the other hand, if the number of reduce tasks is kept constant, as the input dataset size increases, the input to each reduce task increases (due to the assumption of uniform data distribution) and hence the duration of reduce tasks increases linearly. For simplicity, it was assumed that the data grows uniformly and that the data is uniformly distributed across the reduce tasks.
Otherwise, skewed data processing needs to be incorporated in the proposed MR simulator 108D with a data skew percentage, and the best and worst execution times of map and reduce tasks are collected at each type of hardware node for the given data skew percentage. Moreover, when the cluster size increases, for the same data size, more map and reduce tasks may be executed in parallel in the cluster; however, for fixed MR framework settings, the number of reduce tasks and the number of parallel threads fetching data from each node remain the same, hence a reduce task's duration depends on its input data size only.
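The training-input sizing arithmetic from paragraph [0049] can be written out as a short sketch; the function name is illustrative, and the 64 MB default block size is the value assumed in the text:

```python
DFS_BLOCK_SIZE_MB = 64  # default HDFS block size assumed in the text

def training_input_size_mb(total_map_slots, block_size_mb=DFS_BLOCK_SIZE_MB):
    """Smallest input that fills every map slot in one wave: each map task
    processes one DFS block (the default map input split size)."""
    return total_map_slots * block_size_mb

# Example from the text: 16 map slots * 64 MB = 1024 MB = 1 GB.
```

Any smaller input would leave some map slots idle during training, and any larger input would require more than one wave of map tasks.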
[0050] Let Di be the input dataset size and Ri be the reduce task duration of the ith data point. A training set of two executions of the application on two different data sizes was used to derive two data points (D1, R1) and (D2, R2). A line C0 + C1 * Di = Ri was fit, where C0 and C1 are constants whose values are derived from the training data points. This line was extrapolated to predict the reduce task durations for larger dataset sizes. This linear regression model was validated on several different applications and input dataset sizes.
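With only two training points, the fit reduces to solving for the slope and intercept directly. A minimal sketch of the two-point fit and extrapolation described above, with hypothetical function names and illustrative numbers:

```python
def fit_line(d1, r1, d2, r2):
    """Fit Ri = C0 + C1 * Di through two training points
    (dataset size, reduce task duration)."""
    c1 = (r2 - r1) / (d2 - d1)  # slope from the two measurements
    c0 = r1 - c1 * d1           # intercept
    return c0, c1

def predict_reduce_duration(c0, c1, dataset_size):
    """Extrapolate the fitted line to a larger dataset size."""
    return c0 + c1 * dataset_size

# Illustrative numbers: 1 GB -> 10 s and 2 GB -> 18 s per reduce task
# give C1 = 8, C0 = 2, so a 4 GB input predicts 2 + 8*4 = 34 s.
```

With more than two training executions, an ordinary least-squares fit over all points would serve the same role.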
[0051] One-time measurements were taken, both for training and validation of the model described by the embodiments of the present disclosure, by flushing all caches to bypass the effect of OS and file system caches.
[0052] For the experiments with increasing dataset sizes, the number of reduce tasks was kept constant across the different input dataset sizes. The linear regression technique was implemented (or fitted) using the first two data points of (input dataset size, reduce task duration). The regression line was then extrapolated for the remaining larger input dataset sizes. For the experiments with heterogeneous machine configurations, the input dataset size was kept constant and the mix of machine configurations was varied. Finally, results for predicting application execution time from a small data and cluster size to a large data and cluster size are presented.
[0053] As described above, the cluster A was artificially made heterogeneous by turning off CPU cores on some machines. A cluster was formed out of one small, one medium and one large machine. FIGS. 4A-4C, with reference to FIGS. 1 through 3, show results for a financial domain query executed on the heterogeneous cluster A according to an embodiment of the present disclosure. More particularly, FIGS. 4A-4C depict graphical representations illustrating the linear regression technique, prediction of job completion time with increasing dataset size, and for different heterogeneous machine configurations for a financial domain query executed on the heterogeneous cluster A. The number of reduce tasks is kept constant at 2 + 4 + 8 = 14 = the total number of concurrent reduce slots in the cluster as the input dataset size is increased, so that one complete wave of reduce tasks is filled. The leftmost figure (FIG. 4A) shows that linear regression is a good fit for the reduce task data points. The center figure (FIG. 4B) shows that the average prediction error is 5.8%. Next, five different clusters were formed with different mixes of small (S), medium (M) and large (L) machines, and job completion times were predicted for a 4 GB data size. The training cluster consists of 1L, 1M and 1S machines. The rightmost figure (FIG. 4C) shows that the average prediction error is 4.9%, while the maximum error is 9.7% for the 3L + 3M + 3S machine configuration.
[0054] FIGS. 5A-5C, with reference to FIGS. 1 through 4C, show results for a telecom domain query executed on the heterogeneous cluster A according to an embodiment of the present disclosure. More particularly, FIGS. 5A-5C depict graphical representations illustrating the linear regression technique, prediction of job completion time with increasing dataset size, and for different heterogeneous machine configurations for a telecom domain query executed on the heterogeneous cluster A. The number of reduce tasks is kept constant at 14. The leftmost figure (FIG. 5A) shows that linear regression is a good fit for the data points. This is because only the 'Billing_Fact' table grows with an increase in data size while the 'Date_Dimension' table size is fixed; therefore, the MR job joining these two tables results in a uniform increase in map output size, and hence the present embodiments could use linear regression for predicting reduce task durations. The center figure (FIG. 5B) shows that the average prediction error is 6.2%, while the maximum error is 9.7% for the 4GB data point with increasing data size. For different heterogeneous configurations, the input dataset size is kept constant at 4GB. The rightmost figure (FIG. 5C) shows that the average prediction error is 5.4%, while the maximum error is 9.7% for the 1L + 1M + 1S machine configuration. The error may be reduced if map and reduce tasks could be measured at each machine in the cluster (even identical ones), as a difference was observed in map task durations on distinct but identically configured machines.
[0055] In order to experiment with larger cluster and input dataset sizes, a heterogeneous cluster B with configuration as shown in Table I was used. A small machine was configured as the master node for the JobTracker and the NameNode. FIGS. 6A-6C show results for the financial domain query executed on a heterogeneous cluster B according to an embodiment of the present disclosure. More particularly, FIGS. 6A-6C depict graphical representations illustrating the linear regression technique, prediction of job completion time with increasing dataset size, and for different heterogeneous machine configurations for the financial domain query executed on the heterogeneous cluster B. The leftmost figure (FIG. 6A) shows that linear regression is a good fit for the data points. The center figure (FIG. 6B) shows that an average prediction error is 8.1%, while the maximum error is 13.4% for the 50GB data point. Four different clusters were formed with different mixes of medium and large machines and job completion times were predicted. The rightmost figure (FIG. 6C) shows that an average prediction error is 4.2%, while the maximum error is 6.1% for the 4L + 3M machine configuration.
[0056] FIGS. 7A-7C, with reference to FIGS. 1 through 6C, show results for the telecom domain query executed on the heterogeneous cluster B according to an embodiment of the present disclosure. More particularly, FIGS. 7A-7C depict graphical representations illustrating the linear regression technique, prediction of job completion time with increasing dataset size, and for different heterogeneous machine configurations for the telecom domain query executed on the heterogeneous cluster B. The leftmost figure (FIG. 7A) shows that linear regression is a good fit for the data points. The center figure (FIG. 7B) shows that the average prediction error is 9.1%, while the maximum error is 15.2% for the 30GB data point. Four different clusters were formed with different mixes of medium and large machines, and the job completion times were predicted. The rightmost figure (FIG. 7C) shows that the average prediction error is 5.6%, while the maximum error is 9.1% for the 3L + 3M machine configuration.
[0057] As a case study, a heterogeneous cluster C was used. The heterogeneous cluster C is composed of 8 machines that were accumulated over time in a corporate division, with configuration as shown in Table II. FIGS. 8A-8C, with reference to FIGS. 1 through 7C, show results for the financial domain query executed on the heterogeneous cluster C according to an embodiment of the present disclosure. More particularly, FIGS. 8A-8C depict graphical representations illustrating the linear regression technique, prediction of job completion time with increasing dataset size, and for different heterogeneous machine configurations for the financial domain query executed on the heterogeneous cluster C. The leftmost figure (FIG. 8A) shows that linear regression is a good fit for the data points. The center figure (FIG. 8B) shows that the average prediction error is 4.2%, while the maximum error is 6.5% for the 10GB data point. Three different clusters were formed with different mixes of small, medium and large machines, and job completion times were predicted. The rightmost figure (FIG. 8C) shows that the average prediction error is 4.4%, while the maximum error is 5.6% for the 2L + 2M + S machine configuration.
[0058] Scaled heterogeneous cluster:
[0059] The embodiments of the present disclosure validated the model with a simultaneous increase in both data size and heterogeneous cluster size. FIG. 9 shows results of the same financial domain query executed on the heterogeneous cluster A according to an embodiment of the present disclosure. The model was trained on 1GB and 2GB datasets on a small heterogeneous cluster comprising one large, one medium and one small machine, and job completion times for a larger input dataset size of 4GB on different heterogeneous machine configurations were predicted. More particularly, FIG. 9 depicts a graphical representation illustrating an average prediction error of 8.4%, and a maximum error of 10.3% for the 2L + 2M + 2S machine configuration. The error is slightly higher than in the other experiments because the system 100 predicts for a larger data size and for a heterogeneous mix of machines using a training dataset of smaller input data.
[0060] FIG. 10 depicts a graphical representation of predicted job completion time versus actual job completion time for experiments executed on different types of heterogeneous clusters with data sizes from 1 GB to 100 GB according to an embodiment of the present disclosure. It was observed that most of the points are clustered around the y = x line, implying that the prediction error is low in all the cases.
[0061] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0062] Unlike conventional systems and methods, which have focused on optimization and performance prediction of jobs for homogeneous clusters, the embodiments of the present disclosure implement the system 100 that predicts the performance of a given job on a set of machines in a heterogeneous cluster. The system 100 implements and executes a discrete event based Map-Reduce (MR) simulator 108D for heterogeneous environments and linear regression for estimating map and reduce task timings on larger data sizes. The proposed MR simulator 108D collects map and reduce task measurements on a small cluster with a low data volume. Based on these measurements and linear regression, the MR simulator 108D predicts the job performance for any large cluster with a mix of heterogeneous machines and any large data size. The above system 100 was tested on open source benchmarks and industrial applications in the financial and telecom domains as described above. These applications' performance has been predicted on mixes of large, medium and small configuration machines with up to 484 cores in the cluster and 100 GB as the largest data size. The proposed model provided by the embodiments of the present disclosure was observed to be accurate, with an average prediction error of less than 10% in all cases.
[0063] It is, however, to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed, including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[0064] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0065] The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks include compact disk-read only memory (CD-ROM), compact disk-read/write (CD-R/W), BLU-RAY, and DVD.
[0066] A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
[0067] Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
[0068] A representative hardware environment for practicing the embodiments may include a hardware configuration of an information handling/computer system in accordance with the embodiments herein. The system herein comprises at least one processor or central processing unit (CPU). The CPUs are interconnected via system bus to various devices such as a random access memory (RAM), read-only memory (ROM), and an input/output (I/O) adapter. The I/O adapter can connect to peripheral devices, such as disk units and tape drives, or other program storage devices that are readable by the system. The system can read the inventive instructions on the program storage devices and follow these instructions to execute the methodology of the embodiments herein.
[0069] The system further includes a user interface adapter that connects a keyboard, mouse, speaker, microphone, and/or other user interface devices such as a touch screen device (not shown) to the bus to gather user input. Additionally, a communication adapter connects the bus to a data processing network, and a display adapter connects the bus to a display device which may be embodied as an output device such as a monitor, printer, or transmitter, for example.
[0070] The preceding description has been presented with reference to various embodiments. Persons having ordinary skill in the art and technology to which this application pertains will appreciate that alterations and changes in the described structures and methods of operation can be practiced without meaningfully departing from the principle, spirit and scope.