Abstract: Achieving an optimal trade-off between accuracy and resource usage through best Machine Learning (ML) model selection is an unaddressed technical problem. Embodiments of the present disclosure provide a method and system for accommodating a computing environment based on resource-aware machine learning model selection such that an optimal trade-off between performance and resource usage is achieved, where the performance is measured in terms of accuracy. The method of the present disclosure describes resource-aware selection of models, kernels, solvers and parameters based on utilization of CPU resources and memory resources to accommodate the capacity of an on-premise server and the plethora of cloud service options. This helps in achieving the goal of minimizing cost and improving performance of machine learning processing. [To be published with FIG. 3]
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention
METHODS AND SYSTEMS FOR ACCOMMODATING A COMPUTING
ENVIRONMENT BASED ON RESOURCE-AWARE MACHINE
LEARNING MODEL SELECTION
Applicant
Tata Consultancy Services Limited, a company incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD [001] The embodiments herein generally relate to Machine Learning (ML) and, more particularly, to methods and systems for accommodating a computing environment based on resource-aware machine learning model selection.
BACKGROUND
[002] With the emergence and application of intelligent automation across industries, machine learning has become an integral part of multiple domains including healthcare, business, and retail. Traditionally, a data scientist having domain knowledge and familiarity with business requirements is required to create simple models and turn raw data into actionable business insights. However, to reduce manual intervention by the data scientist, automated machine learning (Auto-ML) frameworks have been gaining traction for automating some portions of, or even an entire, ML process. Auto-ML helps in driving a project forward with a lesser burden on the data scientist, effectively mitigating the problem stemming from a shortage of data scientists and increased ML-enabled business opportunities.
[003] Existing auto-ML frameworks are essentially resource-oblivious, as they use predefined ML models regardless of the host system configuration and cloud provisioning services, independent of the on-peak or off-peak hours in which the machine learning process is initiated. In addition, ML models and associated kernels differ by orders of magnitude in usage of computing resources such as CPU and memory, even while achieving comparable accuracy. The resource-obliviousness in existing auto-ML frameworks results in a non-negligible percentage of aborted processes due to out-of-memory and time limit exceeded (TLE) errors, overspending on cloud services and wasted computing resources. Hence, there is a need to achieve an optimal trade-off between the accuracy and the efficient resource usage of machine learning algorithms.
SUMMARY
[004] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
[005] For example, in one embodiment, a method for accommodating a computing environment based on resource-aware machine learning model selection is provided. The method includes receiving, via one or more hardware processors, a plurality of ML models and a training dataset of size F; determining, via the one or more hardware processors, (i) a set of low-resource usage ML models, (ii) a set of medium-resource usage ML models, and (iii) a set of high-resource usage ML models from the plurality of ML models; creating, via the one or more hardware processors, a resource usage unit comprising resource usage information of a host machine in a computing environment, wherein the step of creating the resource usage unit comprises: computing (i) a weighted CPU utilization C based on a real-time average CPU utilization Creal and an hourly average CPU utilization Ch for the computing environment, wherein C = λCreal + (1 - λ)Ch and wherein λ is a first adjustable parameter ranging from 0 to 1; computing a weighted memory usage M based on a current available memory size Mreal and an hourly average available memory size Mh from historical data on the host machine when the weighted CPU utilization C on the host machine is less than a predefined CPU threshold, wherein the weighted memory usage M = μMreal + (1 - μ)Mh, and wherein μ is a second adjustable parameter ranging from 0 to 1; generating a resource-aware pipeline of ML models by selecting at least one of (a) the set of low-resource usage ML models, (b) the set of medium-resource usage ML models, (c) the set of high-resource usage ML models, and (d) a combination thereof from the host machine, when at least one of a plurality of predefined conditions using an available disk size D on the host machine, the weighted CPU utilization C and the weighted memory usage M is satisfied; and creating, using the resource-aware pipeline of ML models, the resource usage unit; and selecting, via the one or more hardware processors, an optimal model as a ML model from the generated resource-aware pipeline of ML models that is resource efficient and possesses optimal accuracy with respect to the training dataset.
[006] In another aspect, a system for accommodating a computing environment based on resource-aware machine learning model selection is provided. The system comprises a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to receive a plurality of ML models and a training dataset of size F; determine (i) a set of low-resource usage ML models, (ii) a set of medium-resource usage ML models, and (iii) a set of high-resource usage ML models from the plurality of ML models; create a resource usage unit comprising resource usage information of a host machine in a computing environment, wherein the step of creating the resource usage unit comprises: computing (i) a weighted CPU utilization C based on a real-time average CPU utilization Creal and an hourly average CPU utilization Ch for the computing environment, wherein C = λCreal + (1 - λ)Ch and wherein λ is a first adjustable parameter ranging from 0 to 1; computing a weighted memory usage M based on a current available memory size Mreal and an hourly average available memory size Mh from historical data on the host machine when the weighted CPU utilization C on the host machine is less than a predefined CPU threshold, wherein the weighted memory usage M = μMreal + (1 - μ)Mh, and wherein μ is a second adjustable parameter ranging from 0 to 1; generating a resource-aware pipeline of ML models by selecting at least one of (a) the set of low-resource usage ML models, (b) the set of medium-resource usage ML models, (c) the set of high-resource usage ML models and (d) a combination thereof from the host machine, when at least one of a plurality of predefined conditions using an available disk size D on the host machine, the weighted CPU utilization C and the weighted memory usage M is satisfied; and creating, using the resource-aware pipeline of ML models, the resource usage unit; and select an optimal model as a ML model from the generated resource-aware pipeline of ML models that is resource efficient and possesses optimal accuracy with respect to the training dataset.
[007] In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions, which when executed by one or more hardware processors cause a method for accommodating a computing environment based on resource-aware machine learning model selection. The method includes receiving, via the one or more hardware processors, a plurality of ML models and a training dataset of size F; determining, via the one or more hardware processors, (i) a set of low-resource usage ML models, (ii) a set of medium-resource usage ML models, and (iii) a set of high-resource usage ML models from the plurality of ML models; creating, via the one or more hardware processors, a resource usage unit comprising resource usage information of a host machine in a computing environment, wherein the step of creating the resource usage unit comprises: computing (i) a weighted CPU utilization C based on a real-time average CPU utilization Creal and an hourly average CPU utilization Ch for the computing environment, wherein C = λCreal + (1 - λ)Ch and wherein λ is a first adjustable parameter ranging from 0 to 1; computing a weighted memory usage M based on a current available memory size Mreal and an hourly average available memory size Mh from historical data on the host machine when the weighted CPU utilization C on the host machine is less than a predefined CPU threshold, wherein the weighted memory usage M = μMreal + (1 - μ)Mh, and wherein μ is a second adjustable parameter ranging from 0 to 1; generating a resource-aware pipeline of ML models by selecting at least one of (a) the set of low-resource usage ML models, (b) the set of medium-resource usage ML models, (c) the set of high-resource usage ML models, and (d) a combination thereof from the host machine, when at least one of a plurality of predefined conditions using an available disk size D on the host machine, the weighted CPU utilization C and the weighted memory usage M is satisfied; and creating, using the resource-aware pipeline of ML models, the resource usage unit; and selecting, via the one or more hardware processors, an optimal model as a ML model from the generated resource-aware pipeline of ML models that is resource efficient and possesses optimal accuracy with respect to the training dataset.
[008] In accordance with an embodiment of the present disclosure, the plurality of predefined conditions comprises: a) deferring the process and checking CPU utilization at a subsequent time interval when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold; b) selecting the set of low-resource usage ML models when a value of the available disk size on the host machine remains below twice the size of the training dataset, D < 2 * F; c) selecting the set of medium-resource usage ML models when at least one of (i) a value of the weighted memory usage exceeds one fifth of the size of the training dataset, M > F/5, and (ii) a value of the weighted CPU utilization lies in a first predefined range, wherein the first predefined range is 80% < C < 90%; d) selecting the set of high-resource usage ML models when the value of the weighted CPU utilization lies in a second predefined range, wherein the second predefined range is 70% < C < 80%; e) selecting a combination of (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models using a plurality of CPU resources and a plurality of memory resources from the host machine when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold; and f) selecting at least one of: (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models, using CPU resources and memory resources from a cloud environment when the weighted CPU utilization C on the host machine is less than the predefined CPU threshold.
[009] In accordance with an embodiment of the present disclosure, the predefined CPU threshold is 90 percent of the weighted CPU utilization.
[0010] In accordance with an embodiment of the present disclosure, the value of the first adjustable parameter reflects a balanced consideration of current and historical CPU utilization on the host machine and is selected as 0.5.
[0011] In accordance with an embodiment of the present disclosure, the value of the second adjustable parameter reflects a balanced view of current and historical memory usage on the host machine and is selected as 0.5.
[0012] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[0014] FIG. 1 is a functional block diagram of a system for accommodating a computing environment based on resource-aware machine learning model selection, in accordance with some embodiments of the present disclosure.
[0015] FIGS. 2A and 2B (collectively referred as FIG. 2) are graphical illustrations depicting evolution of CPU utilization of two different classifiers, in accordance with some embodiments of the present disclosure.
[0016] FIG. 3 is a flow diagram illustrating the method 300 for accommodating a computing environment based on resource-aware machine learning model selection, using the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[0017] FIG. 4 is a flow diagram illustrating the step of creating a resource usage unit for the host machine, in accordance with some embodiments of the present disclosure.
[0018] FIG. 5 is a flow diagram illustrating the step of creating a resource usage unit in a computing environment, in accordance with some embodiments of the present disclosure.
[0019] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium
and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION OF EMBODIMENTS
[0020] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[0021] Referring now to the drawings, and more particularly to FIGS. 1 through 5, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[0022] FIG. 1 is a functional block diagram of a system for accommodating a computing environment based on resource-aware machine learning model selection, in accordance with some embodiments of the present disclosure.
[0023] In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware processors is configured to execute functions of one or more functional blocks of the system 100.
[0024] Referring to the components of system 100, in an embodiment, the processor(s) 104, can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the
one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, and the like.
[0025] The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular and the like. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting to a number of external devices or to another server or devices.
[0026] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
[0027] Further, the memory 102 includes modules 110 (not shown) required for execution of functions of the system 100. Furthermore, the memory 102 includes a database 108 that stores a plurality of ML models, a training dataset, a selected optimal model, the computed weighted CPU utilization (C) and weighted memory usage (M) for the plurality of ML models, a set of low-resource usage ML models, a set of medium-resource usage ML models, a set of high-resource usage ML models, the generated resource-aware pipeline of ML models, a plurality of datasets, a resource usage unit and/or the like. Further, the memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. In an embodiment, the database 108 may be external (not shown) to the system 100 and coupled to the system via the I/O interface 106. Functions of the components of the system 100 are explained in conjunction with the flow diagram of FIG. 3 and FIGS. 4 to 5.
[0028] Automated machine learning (Auto-ML) frameworks have been gaining traction for automating some portions of, or even an entire, ML process. Auto-ML helps in driving a project forward with a lesser burden on the data scientist, effectively mitigating the problem stemming from a shortage of data scientists and increased ML-enabled business opportunities. Conventional auto-ML frameworks are essentially resource-oblivious, as they use predefined ML models regardless of the host system configuration and cloud provisioning services, independent of the on-peak or off-peak hours in which the machine learning process is initiated. In addition, ML models and associated kernels differ by orders of magnitude in usage of computing resources such as CPU and memory, even while achieving comparable accuracy. The resource-obliviousness in the conventional auto-ML frameworks results in a non-negligible percentage of aborted processes due to out-of-memory and time limit exceeded (TLE) errors, overspending on cloud services and wasted computing resources. Hence, there is a need to achieve an optimal trade-off between the accuracy and the efficient resource usage of machine learning algorithms.
[0029] Embodiments of the present disclosure address the unresolved problem of resource-obliviousness in existing auto-ML frameworks by resource-aware machine learning model selection. The present disclosure describes resource-aware selection of models, kernels, solvers and parameters to accommodate the capacity of an on-premise server and the plethora of cloud service options, with a goal to minimize cost and improve performance of machine learning processing. Thus, embodiments of the present disclosure provide a method and system for accommodating a computing environment based on resource-aware machine learning model selection such that an optimal trade-off between performance and resource usage is achieved, wherein the performance is measured in terms of accuracy.
[0030] In the context of the present disclosure, the importance of resource-aware machine learning is illustrated by way of an example. In the example, the performance of two classifiers is compared in terms of elapsed time and classification accuracy by conducting an experiment. The two classifiers used in the experiment are an SVC (polynomial kernel) classifier and a linear discriminant analysis classifier, applied to classify a sample dataset (say, an iris dataset). The experiment was conducted on a device having 4 Core(s) and 8 Logical Processor(s), in which the 4 Cores (4 Logical Processors) are used for classifying the sample dataset. FIGS. 2A and 2B (collectively referred as FIG. 2) are graphical illustrations depicting evolution of CPU utilization of two different classifiers, in accordance with some embodiments of the present disclosure. As shown in FIG. 2A, the Linear Discriminant Analysis classifier reached its 43 percent CPU capacity around 5 seconds after initialization, with the peak CPU utilization lasting less than 1 second. However, as depicted in FIG. 2B, the SVC with poly kernel reached 53 percent CPU capacity around 5 seconds after initialization and sustained peak CPU utilization for more than 210 seconds. In terms of classification accuracy, it is observed from FIGS. 2A and 2B that the SVC with poly kernel achieved 98.7% classification accuracy, which was marginally better than the 98% classification accuracy of the Linear Discriminant Analysis classifier. Since the 98% classification accuracy of Linear Discriminant Analysis is good enough for most applications, the inconsequential 0.7% performance gain by the SVC with polynomial kernel is not worth the disproportionate attendant resource cost (more than 210 seconds of peak CPU utilization). Hence, the Linear Discriminant Analysis classifier is preferred over the SVC with poly kernel in practical settings, and in resource-constrained host environments in particular. The demand for computing resources such as CPU and memory varies widely among the models and kernels chosen. Thus, from FIGS. 2A and 2B, it can be concluded that the two classifiers may differ by two orders of magnitude in terms of CPU usage while having a small difference in terms of accuracy, underscoring the fundamental trade-off between resource usage and performance. Such a trade-off constitutes a basis for developing the resource-aware pipeline disclosed by the system 100 that aims at minimizing resource usage while maintaining accuracy.
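The comparison can be reproduced in spirit with the short scikit-learn sketch below. It is a minimal sketch, not the exact experiment of FIG. 2: it reports cross-validated accuracy and elapsed wall-clock time rather than CPU-utilization traces, and the absolute numbers will vary with the host machine.

```python
# Minimal sketch comparing the two classifiers from the FIG. 2 example
# on the iris dataset: elapsed time versus cross-validated accuracy.
import time

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

for name, clf in [
    ("SVC (poly kernel)", SVC(kernel="poly")),
    ("Linear Discriminant Analysis", LinearDiscriminantAnalysis()),
]:
    start = time.perf_counter()
    scores = cross_val_score(clf, X, y, cv=5)   # accuracy over 5 folds
    elapsed = time.perf_counter() - start
    print(f"{name}: accuracy={scores.mean():.3f}, elapsed={elapsed:.2f}s")
```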
[0031] FIG. 3 is a flow diagram illustrating the method 300 for accommodating a computing environment based on resource-aware machine
learning model selection, using the system of FIG. 1, in accordance with some embodiments of the present disclosure.
[0032] In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 300 by the processor(s) or one or more hardware processors 104. The steps of the method 300 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1 and the steps of flow diagram as depicted in FIG. 3. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps to be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
[0033] Referring to the steps of the method 300 in the context of learnings from the resource usage and accuracy trade-off of ML models, at step 302 of the method 300, the one or more hardware processors 104 are configured to receive a plurality of ML models and a training dataset of size F. In an embodiment, the training dataset may initially comprise a plurality of training samples n. The number of training samples may keep increasing iteratively with the training process.
[0034] Further, at step 304 of FIG. 3, the one or more hardware processors 104 are configured to determine (i) a set of low-resource usage ML models, (ii) a set of medium-resource usage ML models, and (iii) a set of high-resource usage ML models from the plurality of ML models. The step 304 is further illustrated by way of Table 1 and the following exemplary explanation.
Table 1
| Classifier | Rank | Y0 | Y1 | Resource usage | Adj. R2 |
|---|---|---|---|---|---|
| GRADIENT BOOSTING CLASSIFIER | 1 | -931.97 | 182.3 | High | 0.91 |
| MLP CLASSIFIER | 2 | 484.93 | 84.50 | High | 0.96 |
| RANDOM FOREST CLASSIFIER | 3 | 201.82 | 79.51 | High | 0.99 |
| LOGISTIC REGRESSION | 4 | 92.89 | 71.43 | High | 0.99 |
| ADABOOST CLASSIFIER | 5 | 94.47 | 68.25 | High | 0.99 |
| K NEIGHBORS CLASSIFIER | 6 | 374.97 | 67.46 | High | 0.96 |
| SVC | 7 | 109.23 | 40.17 | Medium | 0.99 |
| EXTRA TREES CLASSIFIER | 8 | 337.29 | 26.84 | Medium | 0.99 |
| PASSIVE AGGRESSIVE CLASSIFIER | 9 | 273.31 | 5.64 | Medium | 0.99 |
| QUADRATIC DISCRIMINANT ANALYSIS | 10 | 266.70 | 4.91 | Medium | 0.99 |
| LINEAR SVC | 11 | 267.05 | 1.99 | Low | 0.99 |
| LINEAR DISCRIMINANT ANALYSIS | 12 | 291.28 | 1.33 | Low | 0.99 |
| BERNOULLI NB | 13 | 279.94 | 0.57 | Low | 0.98 |
| GAUSSIAN NB | 14 | 268.49 | 0.53 | Low | 0.94 |
| COMPLEMENT NB | 15 | 273.46 | 0.20 | Low | 0.93 |
| MULTINOMIAL NB | 16 | 282.80 | 0.18 | Low | 0.97 |
[0035] Table 1 provides fit results of 16 classifiers and the classification of the plurality of ML models. Here, the classification of the plurality of ML models into the set of low-resource usage ML models, the set of medium-resource usage ML models, and the set of high-resource usage ML models is done based on the value of the slope (Y1) of an energy consumption curve of each of the plurality of ML models against a plurality of sample sizes of a randomized grid search over a hyper-parameter space. The curve is plotted on an X-Y axis. The energy consumption curve is plotted based on the value of the energy consumption (E) of each of the plurality of ML models. The median energy consumption of a machine learning model is modeled as shown in equation (1) below:
E(x) = Y0 + Y1x (1)
Here, Y0 denotes the intercept of the energy consumption curve, Y1 denotes the slope of the energy consumption curve and x denotes a sample size for hyperparameter optimization. In Table 1, data under the Adj. R2 field represent goodness-of-fit of the ML model. The higher the Adj. R2 value, the better the fit of the ML model. The Y0 and Y1 fields in Table 1 represent the fitted intercept and slope of the fit ML model. Table 1 shows that the linear model fits the data very well for all the classifiers under consideration, with high Adj. R2 values above an acceptable threshold value, which is 0.9. The slope Y1 reflects the sensitivity of the energy consumption to the sample size of hyperparameter optimization. The intercept Y0, on the other hand, can be interpreted as a warm-up energy for the ML model.
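A minimal sketch of this step-304 classification is given below. The least-squares fit of equation (1) is standard, but the slope cutoffs (4.0 and 60.0) are hypothetical values chosen only so that the Table 1 slopes reproduce the Low/Medium/High split shown above; the disclosure does not specify the cutoff values.

```python
# Sketch of step 304: fit the energy consumption curve E(x) = Y0 + Y1*x
# per classifier and bin classifiers by the fitted slope Y1.
import numpy as np

def fit_energy_curve(sample_sizes, energies):
    """Least-squares fit of equation (1); returns (Y0, Y1)."""
    y1, y0 = np.polyfit(sample_sizes, energies, deg=1)  # slope coefficient first
    return y0, y1

def resource_usage_class(y1, low_cut=4.0, high_cut=60.0):
    """Bin a classifier by slope Y1; the cutoff values are assumptions."""
    if y1 < low_cut:
        return "low"
    return "medium" if y1 < high_cut else "high"
```

With the fitted slopes of Table 1, this binning places LINEAR SVC (Y1 = 1.99) in the low-resource set and SVC (Y1 = 40.17) in the medium-resource set, matching the table.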
[0036] Furthermore, at step 306 of FIG. 3, the one or more hardware processors 104 are configured to create a resource usage unit comprising resource usage information of a host machine in a computing environment. In the present disclosure, the computing environment refers to a host machine in which a machine learning model is executed. The host machine can be an on-premise server or a virtual machine in a cloud service. The step of creating the resource usage unit is explained through the following steps:
Step 1. Initially, a weighted CPU utilization C for the computing environment is determined by computing a real-time average CPU utilization Creal and an hourly average CPU utilization Ch. Here, C = λCreal + (1 - λ)Ch, and λ is a first adjustable parameter ranging from 0 to 1. The value of the first adjustable parameter reflects a balanced consideration of current and historical CPU utilization on the host machine and is selected as 0.5 in the present disclosure.
Step 2. When the weighted CPU utilization C on the host machine is less than a predefined CPU threshold, a weighted memory usage M is computed based on a current available memory size Mreal and an hourly average available memory size Mh from historical data on the host machine. Here, the weighted memory usage M = μMreal + (1 - μ)Mh, and μ is a second adjustable parameter ranging from 0 to 1. The value of the second adjustable parameter reflects a balanced view of current and historical memory usage on the host machine and is selected as 0.5 in the present disclosure. However, if the weighted CPU utilization C on the host machine is greater than the predefined CPU threshold, the system 100 is considered to be at peak time and the process of creating the resource usage unit is deferred to a later time. Here, the predefined CPU threshold is 90 percent of the weighted CPU utilization.
Step 3. Once values of C and M are computed, a resource-aware pipeline of ML models is generated by selecting at least one of (a) the set of low-resource usage ML models, (b) the set of medium-resource usage ML models, (c) the set of high-resource usage ML models and (d) a combination thereof from the host machine, when at least one of a plurality of predefined conditions using an available disk size D on the host machine, the weighted CPU utilization C and the weighted memory usage M is satisfied. In an embodiment, the plurality of predefined conditions comprises: (a) deferring the process and checking CPU utilization at a subsequent time interval when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold; (b) selecting the set of low-resource usage ML models when a value of the available disk size on the host machine remains below twice the size of the training dataset, D < 2 * F; (c) selecting the set of medium-resource usage ML models when at least one of (i) a value of the weighted memory usage exceeds one fifth of the size of the training dataset, M > F/5, and (ii) a value of the weighted CPU utilization lies in a first predefined range, wherein the first predefined range is 80% < C < 90%; (d) selecting the set of high-resource usage ML models when the value of the weighted CPU utilization lies in a second predefined range, wherein the second predefined range is 70% < C < 80%; (e) selecting a combination of (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models using a plurality of CPU resources and a plurality of memory resources from the host machine when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold; and (f) selecting at least one of: (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models, using CPU resources and memory resources from a cloud environment when the weighted CPU utilization C on the host machine is less than the predefined CPU threshold.
Step 4. Once the resource-aware pipeline of ML models is generated, the resource usage unit is created using the resource-aware pipeline of ML models, as illustrated in the sketch following this step.
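A minimal sketch of Steps 1 through 4 is given below, assuming λ = μ = 0.5 and the 90 percent CPU threshold of the present disclosure. Real-time readings come from psutil; the hourly averages c_h and m_h are assumed to be supplied from historical data maintained elsewhere, and F, M and D are taken in the same units (e.g., bytes). The branch ordering follows the decision flow described with FIG. 4 below.

```python
# Sketch of Steps 1-4: compute weighted CPU utilization C and weighted
# memory usage M, then pick the model tiers for the resource-aware pipeline.
import shutil

import psutil

CPU_THRESHOLD = 90.0  # percent, per the present disclosure

def create_resource_usage_unit(c_h, m_h, dataset_size_f, lam=0.5, mu=0.5):
    """Return the selected model tiers, or None to defer the ML task."""
    c_real = psutil.cpu_percent(interval=1)      # real-time average CPU %
    c = lam * c_real + (1 - lam) * c_h           # Step 1: weighted CPU utilization
    if c > CPU_THRESHOLD:
        return None                              # peak time: defer to a later time

    m_real = psutil.virtual_memory().available   # current available memory (bytes)
    m = mu * m_real + (1 - mu) * m_h             # Step 2: weighted memory usage
    d = shutil.disk_usage("/").free              # available disk size D (bytes)

    # Step 3: tier selection following the FIG. 4 decision flow.
    if d < 2 * dataset_size_f:
        return ["low"]
    if m < dataset_size_f / 5:
        return ["medium"]
    if 80.0 < c < 90.0:
        return ["medium"]
    if 70.0 < c < 80.0:
        return ["high"]
    return ["low", "medium", "high"]             # lightly loaded: all tiers
```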
[0037] In an embodiment, the step 306 is further illustrated by way of FIG. 4, FIG. 5 and the following exemplary explanation.
[0038] As shown in FIG. 4, when a machine learning task is initialized, a resource component obtains a few samples of CPU utilization on the host machine, which include capturing the real-time average CPU utilization Creal and the hourly average CPU utilization Ch. Further, the weighted CPU utilization is computed using the real-time average CPU utilization Creal and the hourly average CPU utilization Ch as C = λCreal + (1 - λ)Ch. If the weighted CPU utilization C is above 90%, which means that the system is heavily loaded, then the ML task is deferred to a later time. However, if the weighted CPU utilization C is below 90%, then the resource component obtains a few samples of the available memory size on the host machine, which include Mreal and Mh (hourly available memory). Further, the weighted memory usage M is computed using the current available memory size Mreal and the hourly average available memory size Mh as M = μMreal + (1 - μ)Mh, and an available disk size D is computed. Further, as depicted in FIG. 4, after computing the weighted memory usage M, it is checked whether the available disk size D is less than twice the size of the dataset; if so, the set of low-resource usage ML models is selected. If D is larger than twice the dataset size, it is checked whether the weighted memory usage M is less than one fifth of the dataset size; if yes, the set of medium-resource usage ML models is chosen. However, if the weighted memory usage is larger than one fifth of the dataset size, it is checked whether the weighted CPU utilization C is between 80% and 90%. If yes, the set of medium-resource usage ML models is chosen. If no, it is checked whether the weighted CPU utilization is between 70% and 80%. If yes, the set of high-resource usage ML models is chosen; otherwise, all models including the low-, medium- and high-resource usage ML models are chosen. As can be seen in FIG. 5, when the process is deferred for ML task 1 on the host machine (alternatively referred as a local server), the whole procedure explained in conjunction with FIG. 4 is repeated for a ML task obtained from the cloud. The process of ML task execution repeatedly switches between the host machine and the cloud until the execution is completed.
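The host/cloud alternation of FIG. 5 can be pictured with the sketch below; the environment objects, their create_resource_usage_unit and execute methods, and the retry interval are hypothetical names introduced only for illustration.

```python
# Sketch of the FIG. 5 behavior: a deferred ML task alternates between
# environments (host machine, cloud) until one of them can accept it.
import time

def run_with_deferral(task, environments, retry_seconds=60):
    while True:
        for env in environments:                      # e.g., [host, cloud]
            tiers = env.create_resource_usage_unit()  # None signals "defer"
            if tiers is not None:
                return env.execute(task, tiers)
        time.sleep(retry_seconds)                     # all at peak; retry later
```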
[0039] Referring back to FIG. 3, at step 308, the one or more hardware processors 104 are configured to select an optimal model as a ML model from the generated resource-aware pipeline of ML models that is resource efficient and possesses optimal accuracy with respect to the training dataset. In an embodiment, the optimal model from the generated resource-aware pipeline of ML models is selected by first sequentially placing and executing each of (i) the set of the low resource-usage ML models, (ii) the set of the medium resource-usage ML models, and (iii) the set of the high resource-usage ML models in a descending order of a resource usage ranking in the resource-aware pipeline of ML models over the training dataset. Further, the accuracy of each of (i) the set of the low resource-usage ML models, (ii) the set of the medium resource-usage ML models, and (iii) the set of the high resource-usage ML models placed in the resource-aware pipeline of ML models is computed. Further, a comparison of the accuracy of each of (i) the set of the low resource-usage ML models, (ii) the set of the medium resource-usage ML models, and (iii) the set of the high resource-usage ML models placed in the resource-aware pipeline of ML models is performed to select a best ML model that has optimum or highest accuracy. In other words, the ML models in the resource-aware pipeline are evaluated in order of ranking, with the least resource-consuming ML model getting picked up first for evaluation over the training dataset. A best model from the resource-aware pipeline is then selected based on an optimum accuracy value during evaluation.
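A minimal sketch of this step-308 selection is shown below, assuming scikit-learn-style models; cross-validated accuracy stands in for whatever evaluation the framework actually applies, and the (model, resource_rank) pairing is an assumed representation of the resource-aware pipeline.

```python
# Sketch of step 308: evaluate pipeline models least-resource-first and
# keep the one with the best accuracy on the training dataset.
from sklearn.model_selection import cross_val_score

def select_optimal_model(pipeline, X, y):
    """pipeline: list of (model, resource_rank); lower rank = lighter model."""
    best_model, best_accuracy = None, -1.0
    for model, _rank in sorted(pipeline, key=lambda pair: pair[1]):
        accuracy = cross_val_score(model, X, y, cv=5).mean()
        if accuracy > best_accuracy:
            best_model, best_accuracy = model, accuracy
    return best_model, best_accuracy
```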
[0040] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[0041] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for
implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means, and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[0042] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[0043] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such
alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[0044] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[0045] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
We Claim:
1. A processor implemented method (300), comprising:
receiving (302), via one or more hardware processors, a plurality of
ML models and a training dataset of size F;
determining (304), via the one or more hardware processors, (i) a set
of low-resource usage ML models, (ii) a set of medium-resource usage ML
models, and (iii) a set of high-resource usage ML models from the
plurality of ML models;
creating (306), via the one or more hardware processors, a resource
usage unit comprising resource usage information of a host machine in a
computing environment, wherein the step of creating the resource usage unit
comprises:
computing (i) a weighted CPU utilization C based on a real-time average CPU utilization Creal and an hourly average CPU utilization Ch for the computing environment, wherein C = λCreal + (1 - λ)Ch and wherein λ is a first adjustable parameter ranging from 0 to 1;
computing a weighted memory usage M based on a current available memory size Mreal and an hourly average available memory size Mh from historical data on the host machine when the weighted CPU utilization C on the host machine is less than a predefined CPU threshold, wherein the weighted memory usage M = μMreal + (1 - μ)Mh, and wherein μ is a second adjustable parameter ranging from 0 to 1;
generating a resource-aware pipeline of ML models by selecting at least one of (a) the set of low-resource usage ML models, (b) the set of medium-resource usage ML models, (c) the set of high-resource usage ML models, and (d) a combination thereof from the host machine, when at least one of a plurality of predefined conditions using an available disk size D on the host machine, the
weighted CPU utilization C and the weighted memory usage M is
satisfied; and
creating, using the resource-aware pipeline of ML models,
the resource usage unit; and
selecting (308), via the one or more hardware processors, an optimal model as a ML model from the generated resource-aware pipeline of ML models that is resource efficient and possesses optimal accuracy with respect to the training dataset.
2. The processor implemented method as claimed in claim 1, wherein the plurality of predefined conditions comprises:
a) deferring the process and checking CPU utilization at a subsequent time interval when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold;
b) selecting the set of low-resource usage ML models when a value of the available disk size on the host machine remains below twice the size of the training dataset, D < 2 * F;
c) selecting the set of medium-resource usage ML models when at least one of (i) a value of the weighted memory usage exceeds one fifth of the size of the training dataset, M > F/5, and (ii) a value of the weighted CPU utilization lies in a first predefined range, wherein the first predefined range is 80% < C < 90%;
d) selecting the set of high-resource usage ML models when the value of the weighted CPU utilization lies in a second predefined range, wherein the second predefined range is 70% < C < 80%;
e) selecting a combination of (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models using a plurality of CPU resources and a plurality of memory resources from the host machine when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold; and
f) selecting at least one of: (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models, using CPU resources and memory resources from a cloud environment when the weighted CPU utilization C on the host machine is less than the predefined CPU threshold.
3. The processor implemented method as claimed in claim 1, wherein the predefined CPU threshold is 90 percent of the weighted CPU utilization.
4. The processor implemented method as claimed in claim 1, wherein value of the first adjustable parameter reflects a balanced consideration of current and historical CPU utilization on the host machine and is selected as 0.5.
5. The processor implemented method as claimed in claim 1, wherein value of the second adjustable parameter reflects a balanced view of current and historical memory usage on the host machine and is selected as 0.5.
6. A system (100) comprising:
a memory (102) storing instructions;
one or more Input/Output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the
one or more I/O interfaces (106), wherein the one or more hardware
processors (104) are configured by the instructions to:
receive, a plurality of ML models and a training dataset of size F;
determine (i) a set of low-resource usage ML models, (ii) a set of medium-resource usage ML models, and (iii) a set of high-resource usage ML models from the plurality of ML models;
create, a resource usage unit comprising resource usage information of a host machine in a computing environment, wherein the step of creating the resource usage unit comprises:
computing (i) a weighted CPU utilization C based on a real-time average CPU utilization Creal and an hourly average CPU utilization Ch for the computing environment, wherein C = λCreal + (1 - λ)Ch and wherein λ is a first adjustable parameter ranging from 0 to 1;
computing a weighted memory usage M based on a current available memory size Mreal and an hourly average available memory size Mh from historical data on the host machine when the weighted CPU utilization C on the host machine is less than a predefined CPU threshold, wherein the weighted memory usage M = μMreal + (1 - μ)Mh, and wherein μ is a second adjustable parameter ranging from 0 to 1;
generating a resource-aware pipeline of ML models by selecting at least one of (a) the set of low-resource usage ML models, (b) the set of medium-resource usage ML models, (c) the set of high-resource usage ML models and (d) a combination thereof from the host machine, when at least one of a plurality of predefined conditions using an available disk size D on the host machine, the weighted CPU utilization C and the weighted memory usage M is satisfied; and
creating, using the resource-aware pipeline of ML models, the resource usage unit; and
select an optimal model as a ML model from the generated resource-aware pipeline of ML models that is resource efficient and possesses optimal accuracy with respect to the training dataset.
7. The system as claimed in claim 6, wherein the plurality of predefined conditions comprises:
a) deferring the process and checking CPU utilization at a subsequent time interval when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold;
b) selecting the set of low-resource usage ML models when a value of the available disk size on the host machine remains below twice the size of the training dataset, D < 2 * F;
c) selecting the set of medium-resource usage ML models when at least one of (i) a value of the weighted memory usage exceeds one fifth of the size of the training dataset, M > F/5, and (ii) a value of the weighted CPU utilization lies in a first predefined range, wherein the first predefined range is 80% < C < 90%;
d) selecting the set of high-resource usage ML models when the value of the weighted CPU utilization lies in a second predefined range, wherein the second predefined range is 70% < C < 80%;
e) selecting a combination of (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models using a plurality of CPU resources and a plurality of memory resources from the host machine when the weighted CPU utilization C on the host machine exceeds the predefined CPU threshold; and
f) selecting at least one of: (i) the set of low-resource usage ML models, (ii) the set of medium-resource usage ML models and (iii) the set of high-resource usage ML models, using CPU resources and memory resources from a cloud environment when the weighted CPU utilization C on the host machine is less than the predefined CPU threshold.
8. The system as claimed in claim 6, wherein the predefined CPU threshold is 90 percent of the weighted CPU utilization.
9. The system as claimed in claim 6, wherein value of the first adjustable parameter reflects a balanced consideration of current and historical CPU utilization on the host machine and is selected as 0.5.
10. The system as claimed in claim 6, wherein value of the second adjustable parameter reflects a balanced view of current and historical memory usage on the host machine and is selected as 0.5.