
Method and System for Energy-Aware AutoML Framework

Abstract: Achieving an optimal trade-off between accuracy and energy consumption during best Machine Learning (ML) model selection is an unaddressed technical problem. Embodiments of the present disclosure provide a method and system for an energy-aware AutoML framework for selecting an energy-efficient ML model having an optimal trade-off between expected accuracy and energy consumption. The method formulates a Marginal Accuracy Benefit (MAB) metric as a function of energy consumption and accuracy to rank machine learning models or algorithms to generate an energy-aware pipeline, wherein the higher the MAB, the lower the energy consumption, and vice versa. The ML models in the energy-aware pipeline are evaluated in order of ranking, with the least energy-consuming ML model getting picked first for evaluation on a dataset of interest. The best model from among the energy-aware pipeline is then selected by applying a prescribed accuracy criterion during evaluation.


Patent Information

Application #
Filing Date
22 April 2022
Publication Number
43/2023
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th floor, Nariman point, Mumbai 400021, Maharashtra, India

Inventors

1. LING, Yibei
Tata Consultancy Services Limited 379 Thornall Street, Edison 08837, NJ, USA
2. KHATUA, Chitta
Tata Consultancy Services Limited, Kalinga Park, SEZ Cargo, Plot No 35, Chandaka Industrial Estate, Near Infocity, Patia, Chandrasekharpur, Bhubaneswar 751024, Odisha, India

Specification

Description: FORM 2

THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003

COMPLETE SPECIFICATION
(See Section 10 and Rule 13)

Title of invention:
METHOD AND SYSTEM FOR ENERGY-AWARE Auto-ML FRAMEWORK

Applicant
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India

Preamble to the description:
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
The embodiments herein generally relate to Machine Learning (ML) and, more particularly, to a method and system for energy-aware Automated ML (AutoML) framework for selecting energy efficient ML models.

BACKGROUND
With intelligent automation entering almost every application domain, Machine Learning (ML) is no longer an option but an essential part of every domain, from medicine and business to smart homes and smart cities. There is consequently a greater demand for data scientists than there are professionals available to fill it. However, there exists the non-ML expert, also referred to as a citizen data scientist: a person who is familiar with the nuances of the data, has the domain knowledge and insight to spot a business opportunity, and has the basic skill to create simple models and turn raw data into actionable business insights. Such users usually lack the skill to work with advanced data analytics and modeling. Automated machine learning (AutoML) frameworks have been gaining traction to mitigate this demand-supply gap by freeing up data scientists to work on more impactful and hard-to-automate tasks and by automating some portions, or even the entirety, of the ML process. AutoML empowers the citizen data scientist to drive a project forward with a lesser burden on the data scientist, effectively mitigating the problems stemming from the shortage of data scientists and the increase in ML-enabled business opportunities.
AutoML frameworks address various requirements in the ML process. One such requirement is automated best model selection. However, during best model selection, conventional Machine Learning frameworks tend to focus on the prediction accuracy aspect and hardly focus on the energy consumed by the ML model. Attempts have been made to capture the energy consumption perspective during ML model selection. For example, one of the existing works, 'What should mobile app developers do about machine learning and energy?' by Andrea McIntosh and Abram Hindle, attempts to characterize the time duration (as a measure of energy consumption) of ML algorithms on mobile devices and to characterize the accuracy of ML algorithms on different datasets. However, time duration is not always proportional to energy consumption, and the existing approach may not rightly capture energy consumption. The majority of prior works are based on techniques such as hyperparameter optimization, Bayesian optimization, and cross validation, which do not take energy consumption into account.
However, the prediction accuracy of the ML model is critical and is highly dependent on the requirements of the end application/end user. Hence there is a requirement to achieve an optimal trade-off between the accuracy and the energy consumption of the machine learning algorithms. Automating this accuracy-energy consumption trade-off in automated ML model selection is technically challenging as no metric exists for it. Currently, the accuracy-energy consumption trade-off is resolved with expert intervention.

SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
For example, in one embodiment, a method for an energy-aware Automated ML (Auto-ML) framework for selecting an energy-efficient ML model is provided. The method includes generating an energy-aware pipeline of ML models comprising a plurality of ML models for selection of a best model having an optimal trade-off between an energy consumption (E) and an accuracy (P) with respect to a dataset of interest. Generating the energy-aware pipeline of ML models comprises steps of: (a) determining the energy consumption (E) of each of the plurality of ML models by computing a CPU utilization (U) of each of the plurality of ML models for executing a ML process across a plurality of sample sizes of a randomized grid search over hyper-parameter space, wherein the CPU utilization (U) is directly proportional to the energy consumption (E) of each of the plurality of ML models; (b) plotting a curve of the energy consumption (E) of each of the plurality of ML models against the plurality of sample sizes of the randomized grid search on an X-Y axis to identify a slope (β₁) of the curve and a y-axis intercept (β₀); (c) computing an energy consumption E(x) of each of the plurality of ML models for a sample size (x) selected from among the plurality of sample sizes, wherein E(x) = β₀ + β₁x; (d) computing an accuracy P(x) of each of the plurality of ML models for the sample size (x) on the hyperparameter space, wherein P(x) = α(1 − β exp(−kx)), wherein α is an asymptotic limit of a corresponding ML model, β represents a growth range of the corresponding ML model indicative of a leeway for further improvement in the accuracy P(x), and k represents a rate indicating convergence of the corresponding ML model to the asymptotic limit; (e) determining a Marginal Accuracy Benefit MAB(x) of each of the plurality of ML models for the sample size (x) by computing an incremental change in the accuracy P(x) with respect to an incremental change in the energy consumption E(x) for the sample size (x); and (f) arranging the plurality of ML models in decreasing order of the MAB(x) to generate the energy-aware pipeline of ML models for selecting the best model, wherein the MAB(x) is inversely proportional to the energy consumption E(x).
Furthermore, the method includes selecting the best model from the energy-aware pipeline of ML models by: (a) sequentially performing hyperparameter optimization of each of the plurality of ML models arranged in MAB-descending order in the energy-aware pipeline of ML models over the dataset of interest; and (b) selecting the best model as one of (i) a ML model among the energy-aware pipeline of ML models that satisfies the prescribed accuracy criterion, and (ii) a ML model having the highest accuracy if none of the ML models in the power-efficient pipeline satisfies the prescribed accuracy.
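The pipeline-generation steps (a) through (f) above can be sketched in Python. This is a minimal illustration, assuming each model is summarized by fitted coefficients (α, β, k) of the accuracy model and the slope b1 of the linear energy model; the coefficient values and model names used here are hypothetical placeholders, not values from the disclosure:

```python
import math

def accuracy(x, alpha, beta, k):
    # Monomolecular accuracy model: P(x) = alpha * (1 - beta * exp(-k * x))
    return alpha * (1.0 - beta * math.exp(-k * x))

def mab(x, model, dx=1.0):
    # Marginal Accuracy Benefit: incremental accuracy gain per incremental
    # joule of energy over a sample-size step dx, with E(x) = b0 + b1 * x
    d_acc = (accuracy(x + dx, model["alpha"], model["beta"], model["k"])
             - accuracy(x, model["alpha"], model["beta"], model["k"]))
    d_energy = model["b1"] * dx  # slope of the linear energy model
    return d_acc / d_energy

def energy_aware_pipeline(models, x):
    # Step (f): arrange models in decreasing MAB order, so the least
    # energy-consuming model is evaluated first
    return sorted(models, key=lambda m: mab(x, m), reverse=True)
```

With two hypothetical models differing mainly in energy slope, the cheaper one (smaller b1) receives the higher MAB and is ranked first, matching the inverse proportionality between MAB and energy consumption stated above.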
In another aspect, a system for an energy-aware Automated ML (Auto-ML) framework for selecting an energy-efficient ML model is provided. The system comprises a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to generate an energy-aware pipeline of ML models comprising a plurality of ML models for selection of a best model having an optimal trade-off between an energy consumption (E) and an accuracy (P) with respect to a dataset of interest. Generating the energy-aware pipeline of ML models comprises steps of: (a) determining the energy consumption (E) of each of the plurality of ML models by computing a CPU utilization (U) of each of the plurality of ML models for executing a ML process across a plurality of sample sizes of a randomized grid search over hyper-parameter space, wherein the CPU utilization (U) is directly proportional to the energy consumption (E) of each of the plurality of ML models; (b) plotting a curve of the energy consumption (E) of each of the plurality of ML models against the plurality of sample sizes of the randomized grid search on an X-Y axis to identify a slope (β₁) of the curve and a y-axis intercept (β₀); (c) computing an energy consumption E(x) of each of the plurality of ML models for a sample size (x) selected from among the plurality of sample sizes, wherein E(x) = β₀ + β₁x; (d) computing an accuracy P(x) of each of the plurality of ML models for the sample size (x) on the hyperparameter space, wherein P(x) = α(1 − β exp(−kx)), wherein α is an asymptotic limit of a corresponding ML model, β represents a growth range of the corresponding ML model indicative of a leeway for further improvement in the accuracy P(x), and k represents a rate indicating convergence of the corresponding ML model to the asymptotic limit; (e) determining a Marginal Accuracy Benefit MAB(x) of each of the plurality of ML models for the sample size (x) by computing an incremental change in the accuracy P(x) with respect to an incremental change in the energy consumption E(x) for the sample size (x); and (f) arranging the plurality of ML models in decreasing order of the MAB(x) to generate the energy-aware pipeline of ML models for selecting the best model, wherein the MAB(x) is inversely proportional to the energy consumption E(x).
Furthermore, the one or more hardware processors are configured to select the best model from the energy-aware pipeline of ML models by: (a) sequentially performing hyperparameter optimization of each of the plurality of ML models arranged in MAB-descending order in the energy-aware (power-efficient) pipeline of ML models over the dataset of interest; and (b) selecting the best model as one of (i) a ML model among the energy-aware pipeline of ML models that satisfies the prescribed accuracy criterion, and (ii) a ML model having the highest accuracy if none of the ML models in the power-efficient pipeline satisfies the prescribed accuracy.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions which, when executed by one or more hardware processors, cause execution of a method for an energy-aware Automated ML (Auto-ML) framework for selecting an energy-efficient ML model.
The method includes generating an energy-aware pipeline of ML models comprising a plurality of ML models for selection of a best model having an optimal trade-off between an energy consumption (E) and an accuracy (P) with respect to a dataset of interest. Generating the energy-aware pipeline of ML models comprises steps of: (a) determining the energy consumption (E) of each of the plurality of ML models by computing a CPU utilization (U) of each of the plurality of ML models for executing a ML process across a plurality of sample sizes of a randomized grid search over hyper-parameter space, wherein the CPU utilization (U) is directly proportional to the energy consumption (E) of each of the plurality of ML models; (b) plotting a curve of the energy consumption (E) of each of the plurality of ML models against the plurality of sample sizes of the randomized grid search on an X-Y axis to identify a slope (β₁) of the curve and a y-axis intercept (β₀); (c) computing an energy consumption E(x) of each of the plurality of ML models for a sample size (x) selected from among the plurality of sample sizes, wherein E(x) = β₀ + β₁x; (d) computing an accuracy P(x) of each of the plurality of ML models for the sample size (x) on the hyperparameter space, wherein P(x) = α(1 − β exp(−kx)), wherein α is an asymptotic limit of a corresponding ML model, β represents a growth range of the corresponding ML model indicative of a leeway for further improvement in the accuracy P(x), and k represents a rate indicating convergence of the corresponding ML model to the asymptotic limit; (e) determining a Marginal Accuracy Benefit MAB(x) of each of the plurality of ML models for the sample size (x) by computing an incremental change in the accuracy P(x) with respect to an incremental change in the energy consumption E(x) for the sample size (x); and (f) arranging the plurality of ML models in decreasing order of the MAB(x) to generate the energy-aware pipeline of ML models for selecting the best model, wherein the MAB(x) is inversely proportional to the energy consumption E(x).
Furthermore, the method includes selecting the best model from the energy-aware pipeline of ML models by: (a) sequentially performing hyperparameter optimization of each of the plurality of ML models arranged in MAB-descending order in the energy-aware (power-efficient) pipeline of ML models over the dataset of interest; and (b) selecting the best model as one of (i) a ML model among the energy-aware pipeline of ML models that satisfies the prescribed accuracy criterion, and (ii) a ML model having the highest accuracy if none of the ML models in the power-efficient pipeline satisfies the prescribed accuracy.
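The selection steps (a) and (b) above can be sketched as a short loop. This is a minimal illustration, assuming `evaluate` is a hypothetical callable standing in for the hyperparameter-optimization step, returning a model's tuned accuracy on the dataset of interest:

```python
def select_best_model(pipeline, evaluate, prescribed_accuracy):
    # Walk the energy-aware pipeline in descending MAB order, i.e. the
    # least energy-consuming model is tried first; return the first model
    # meeting the prescribed accuracy, else the most accurate model seen.
    best_model, best_acc = None, float("-inf")
    for model in pipeline:
        acc = evaluate(model)  # hyperparameter optimization on the dataset
        if acc >= prescribed_accuracy:
            return model  # cheapest model satisfying the criterion
        if acc > best_acc:
            best_model, best_acc = model, acc
    return best_model  # fallback: highest accuracy among all models
```

Because cheaper models come first, evaluation stops at the first model that meets the accuracy criterion, so the more expensive models need never be trained at all.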
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.

BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
FIG. 1A is a functional block diagram of a system providing an energy-aware Automated Machine Learning (Auto-ML) framework for selecting energy efficient ML model, in accordance with some embodiments of the present disclosure.
FIG. 1B illustrates an architectural overview of the system of FIG. 1 for generating an energy-aware pipeline of ML models, in accordance with some embodiments of the present disclosure.
FIGS. 2A through 2B (collectively referred as FIG. 2) is a flow diagram illustrating a method providing the energy-aware Automated ML (Auto-ML) framework for selecting energy efficient ML model, using the system of FIG. 1, in accordance with some embodiments of the present disclosure.
FIGS. 3A through 3D (collectively referred as FIG. 3) are graphical illustrations depicting evolution of CPU utilization and energy consumption of a ML model for a random sample size for two different datasets, in accordance with some embodiments of the present disclosure.
FIGS. 4A and 4B (collectively referred as FIG. 4) are graphical illustrations depicting variation of energy consumption of the ML model with respect to variation in sample size for two different datasets, in accordance with some embodiments of the present disclosure.
FIGS. 5A and 5B (collectively referred as FIG. 5) are graphical illustrations depicting variation of classification accuracy of the ML model with respect to variation in sample size for two different datasets, in accordance with some embodiments of the present disclosure.
FIG. 6 depicts an energy-aware pipeline of ML models ranked based on Marginal Accuracy Benefit (MAB) to be used for best model selection, in accordance with some embodiments of the present disclosure.
FIG. 7 is a graphical illustration depicting diminishing MAB with increase in energy consumption of the ML model, in accordance with some embodiments of the present disclosure.
FIG. 8 illustrates an overview of the best model selection process from the generated energy-aware pipeline of ML models, in accordance with some embodiments of the present disclosure.
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

DETAILED DESCRIPTION OF EMBODIMENTS
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
Embodiments of the present disclosure provide a method and system for an energy-aware Automated Machine Learning (Auto-ML) framework for selecting an energy- or power-efficient ML model having an optimal trade-off between performance and energy consumption, wherein performance is measured in terms of accuracy. The method formulates a Marginal Accuracy Benefit (MAB) metric as a function of energy consumption and accuracy to rank machine learning models or algorithms to generate an energy-aware pipeline, wherein the higher the MAB, the lower the energy consumption, and vice versa. The energy consumption parameter of the MAB is captured by the method disclosed herein in terms of CPU utilization, which is generally regarded as a reliable indicator of energy consumption. The ML models in the energy-aware pipeline are evaluated in order of ranking, with the least energy-consuming ML model getting picked first for evaluation on a dataset of interest. A best model from among the energy-aware pipeline is then selected by applying a prescribed accuracy criterion during evaluation.
The advantage of using the MAB ranking, as disclosed, rather than the actual MAB value of each model, is that the MAB ranking is invariant over datasets. Thus, once the ranking is determined, it can be used for best model selection across different datasets without requiring repetition of rank determination, making the process time efficient.
Referring now to the drawings, and more particularly to FIGS. 1A through 8, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
FIG. 1A is a functional block diagram of a system providing an energy-aware Automated Machine Learning (Auto-ML) framework for selecting energy efficient ML model, in accordance with some embodiments of the present disclosure.
In an embodiment, the system 100 includes a processor(s) 104, communication interface device(s), alternatively referred as input/output (I/O) interface(s) 106, and one or more data storage devices or a memory 102 operatively coupled to the processor(s) 104. The system 100 with one or more hardware processors is configured to execute functions of one or more functional blocks of the system 100.
Referring to the components of system 100, in an embodiment, the processor(s) 104, can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, and the like.
The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular and the like. In an embodiment, the I/O interface (s) 106 can include one or more ports for connecting to a number of external devices or to another server or devices.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
Further, the memory 102 includes modules 110 (not shown) required for execution of functions of the system 100. Furthermore, the memory 102 includes a database 108 that stores the plurality of ML models, the best model selected for a dataset of interest, the computed energy consumption (E) and accuracy (P) for the plurality of ML models, the computed MAB for each ML model, the energy-aware pipeline of ML models generated based on the ranking of the MAB of each model, a plurality of datasets, the prescribed accuracy for a dataset of interest, and the like. Further, the memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure. In an embodiment, the database 108 may be external (not shown) to the system 100 and coupled to the system via the I/O interface 106. Functions of the components of the system 100 are explained in conjunction with the architectural overview in FIG. 1B, the flow diagram of FIG. 2, and FIGS. 3 to 8.
FIG. 1B illustrates an architectural overview of the system of FIG. 1 for generating an energy-aware pipeline of ML models, in accordance with some embodiments of the present disclosure, and is explained in conjunction with a method 200 of FIG. 2. The energy-aware pipeline enables selecting the best model for the dataset of interest, wherein the best model provides an optimal trade-off between an expected accuracy (one measure of performance) and an energy consumption. Measuring the energy consumption of an ML-related task is the prerequisite for building the energy-efficient ML framework. It involves measuring some observable system parameter and then relating it to the actual energy consumption of ML-related tasks. Many system parameters are correlated with the power consumption of a processor, such as the number of L1 primary cache and L2 cache references per second, the number of L2 cache misses per second, floating point instructions retired per second, and branch instructions retired per second [Chen et al., 2008; Dargie, 2014; Bircher and John, 2012; Zhang et al., 2013]. The precise measurement of L1 and L2 cache statistics requires the use of specialized utilities which are hard to come by in practice. For deep neural networks, Yang et al. [Yang et al., 2017a; Yang et al., 2017b] used the multiply-and-accumulate (MAC) operation metric for estimating the energy consumption. The most straightforward observable parameter is CPU utilization. There is a general consensus [Zhang et al., 2013; Dargie, 2014] that the power consumption of a processor grows linearly with CPU utilization; such a linear relationship has been substantiated by several experiments [Dargie, 2014] under different processor types and workloads.
In light of this extensive literature, CPU utilization is adopted herein as the metric for power consumption, and the relationship between the CPU utilization and the power consumption is modeled as:
P = P_idle + ((P_max − P_idle)(U − U_idle))/(U_max − U_idle) = P_idle + α(U − U_idle) ---(1)
where P_max and P_idle denote the power consumption when the CPU utilization of a processor is at the peak U_max and at the idle level U_idle, respectively, and α denotes the coefficient of proportionality between the CPU utilization U and the power consumption P.
Referring to the CPU utilization versus power consumption graph from [Blackburn, 2008], it is observed that when U_idle = 5% and U_max = 95%, the corresponding power consumptions are P_idle = 220 W and P_95 = 310 W, respectively. By substituting these values of P_5 and P_95 into Eq. (1), the value of α is obtained as (one watt)/(one percent utilization), meaning that a 1% increase in CPU utilization corresponds to a one watt (W) increase in power. Note that the value of α may vary with processor type; herein the value of α is chosen as 1 in the experimental study. In a discrete setting, the energy consumed over the interval [t_1, t_2] can be approximated as:
E = Σ_(i=0)^n (P_idle + α(U(t_i) − U_idle)) Δt_i ---(2)
where U(t_i) refers to the CPU utilization at the i-th discrete time instant and Δt_i = t_(i+1) − t_i, for 0 ≤ i ≤ n.
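Equations (1) and (2) can be checked numerically with a short sketch. The P_idle and P_max defaults below are the Blackburn values quoted above; the sampling format (a list of hypothetical (time, utilization) pairs) is an assumption for illustration:

```python
def power(u, p_idle=220.0, p_max=310.0, u_idle=5.0, u_max=95.0):
    # Eq. (1): power grows linearly with CPU utilization u (in percent)
    a = (p_max - p_idle) / (u_max - u_idle)  # coefficient of proportionality
    return p_idle + a * (u - u_idle)

def energy(samples):
    # Eq. (2): discrete energy approximation over (time, utilization) samples,
    # summing P(U(t_i)) * dt_i over consecutive sample intervals
    total = 0.0
    for (t0, u0), (t1, _) in zip(samples, samples[1:]):
        total += power(u0) * (t1 - t0)  # watts * seconds = joules
    return total
```

With the quoted values, α works out to (310 − 220)/(95 − 5) = 1 W per percentage point of utilization, as stated in the text.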
Energy Consumption Comparison: To get a sense of how much energy is used in an ML process, the method disclosed herein measures and compares the energy consumption of the mlpclassifier and multinomialnb classifiers on the publicly available iris and digits data sets. FIGS. 3A through 3D (collectively referred as FIG. 3) are graphical illustrations depicting evolution of CPU utilization and energy consumption of a ML model for a random sample size for two different datasets, in accordance with some embodiments of the present disclosure. Thus, FIG. 3 shows the evolution of CPU utilization and cumulative energy consumption over the training process. Observations derived from FIG. 3 are used by the method for generating the energy-aware pipeline. The multinomialnb takes 2 seconds to classify the iris data set, as opposed to 110 seconds by the mlpclassifier on the same data set. Table 1 below compares the energy and accuracy of the two classifiers:
TABLE 1:
dataset   classifier      accuracy   energy (joules)
iris      multinomialnb   95.3%      320
iris      mlpclassifier   96.0%      8207
digits    multinomialnb   90.5%      910
digits    mlpclassifier   99.8%      161435

The multinomialnb consumed 320 joules in classifying the iris data set and achieved 95.3% accuracy, as opposed to the 8207 joules with 96% accuracy of the mlpclassifier. In other words, the mlpclassifier gained only about 1% in accuracy but consumed more than 25 times the energy consumed by the multinomialnb model. As a result, the energy saved by choosing the multinomialnb model over the mlpclassifier amounts to the energy of lighting a 100 W light bulb for 78 seconds, or of reducing CO2 emissions by 0.51 gram [Carbon Trust, 2020]. As shown in Table 1, for the digits data set, the multinomialnb model achieved 91% accuracy with 337.71 joules of energy consumed on training, as opposed to the 99.8% accuracy of the mlpclassifier with 156,999.88 joules consumed on model training. Thus, at a slight sacrifice in accuracy, the energy saved by choosing the multinomialnb model over the mlpclassifier amounts to the energy of raising the temperature of 749 grams of water from 20 °C to 70 °C, or, equivalently, lighting a 100 W light bulb for 1566 seconds, a reduction of 10.4 grams in CO2 emissions. The CO2 emissions are used as a metric to quantify the negative environmental impact of machine learning and to underline the fact that the extent of such impact depends on the choice of ML models and the size of the data set.
Thus, from Table 1 it can be concluded that different ML models may differ by orders of magnitude in energy consumption while having small differences in accuracy, underscoring the fundamental trade-off between energy efficiency and performance. Such a trade-off constitutes the basis for developing the energy-aware pipeline disclosed by the system 100, which aims at minimizing the energy consumption while maintaining the prescribed accuracy.
Energy Consumption Characterization: To characterize the energy consumption, the energy-accuracy relationship of 16 classifiers is observed experimentally using the randomized grid search over the hyper-parameter space. A 5-fold cross validation, 4 concurrent jobs, and accuracy as the scoring metric constitute the default experiment setting. The detailed hyper-parameter configuration of the classifiers (ML models or algorithms) is known a priori. The experiment begins by disabling non-essential background processes to reduce their impact on the energy measurement. For each classifier, 10 replicated energy consumption and accuracy measurements are run at each sample size, and the median value over the 10 measurements is taken to represent the performance of the classifier at the given sample size. The sample size ranges from 10 to 100 with increments of 10, plus the sample sizes of 1 and 5, so the total number of runs for each classifier is 120. FIGS. 4A and 4B (collectively referred as FIG. 4) are graphical illustrations depicting variation of energy consumption of the ML model with respect to variation in sample size for two different datasets, in accordance with some embodiments of the present disclosure. The energy consumption of the mlpclassifier and multinomialnb classifiers is plotted under different sample sizes of the randomized grid search, where the vertical and horizontal axes denote the energy consumption in joules and the sample size, respectively. The distribution over the 10 trials at each sample size is plotted as a boxplot whose boxed section spans the lower and upper quartiles and includes a line at the median. The whiskers below and above each boxed section represent the position of the most extreme data point within 1.5 times the inter-quartile range of the nearest quartile. Data points outside the whiskers, marked with plus symbols, are considered outliers. Visual inspection of FIG. 4 shows that the relationship between the energy consumption and the sample size appears to be linear, despite the presence of outliers. Consequently, it is reasonable to assume that such a relation can be well approximated by a linear model as:
E(x) = β₀ + β₁x (2)
where β₀ denotes the intercept, β₁ the slope, and E(x) the median energy consumption in classifying the iris data set at the sample size x. The median is chosen over the mean because the median is widely believed to be more robust to outliers. The fit results of the 16 classifiers are tabulated in the table depicted in FIG. 6, where the data under the Adj. R² field represent the goodness-of-fit of the model; the higher the Adj. R² value, the better the fit, and the β₀ and β₁ fields represent the fitted intercept and slope of the model. The table depicted in FIG. 6 shows that the linear model fits the data very well for all the classifiers under consideration, with high Adj. R² values above the acceptable threshold of 0.9. The slope β₁ is of practical importance as it reflects the sensitivity of the energy consumption to the sample size. The intercept β₀, on the other hand, can be interpreted as the warm-up energy of a classifier. Comparing the β₁ values in FIG. 6, it can be seen that the gradient boosting and mlpclassifier classifiers represent the two extremes of this sensitivity. The former is the costliest classifier while the latter is the least costly: the two classifiers differ by two orders of magnitude in energy consumption. The remaining classifiers lie between these two extremes.
FIGS. 5A and 5B (collectively referred to as FIG. 5) are graphical illustrations depicting variation of classification accuracy of the ML model with respect to variation in sample size for two different datasets, in accordance with some embodiments of the present disclosure. The accuracy performance of the mlpclassifier and multinomial classifiers is presented in FIG. 5, where the vertical and horizontal axes denote the accuracy and the sample size, respectively. The distribution of the accuracy over the 10 independent runs at each sample size is plotted as a scatterplot in which a dedicated marker denotes the median accuracy over the 10 runs. In this particular case, the performance of these two classifiers reaches the asymptotic limit when the sample size is still very small. As a result, increasing the sample size does not lead to further improvement.
Let P(x) be the accuracy as a function of sample size x. A monomolecular model [Draper and Smith, 1998, Rawlings et al., 1998] is used to establish the relationship between the accuracy P(x) and the sample size x as follows:
P(x) = α(1 − β exp(−kx)) (3)
where α, β and k are the coefficients to be estimated. There is an intuitive explanation of the coefficients: the α value is the asymptotic accuracy limit of the algorithm, while β and k represent the growth range and growth rate, respectively. The higher the α value, the better the performance of the classifier. A large k value implies a fast convergence to the asymptotic limit, while a large value of β offers more leeway for further improvement. The model is fit using the median accuracy points over the 10 runs. The fitted curves, together with the measured data points, plotted in FIGS. 5A and 5B give a clear picture of the performance of the classifiers.
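The saturation behaviour of equation 3 can be illustrated with a short sketch. The coefficient values below are hypothetical, loosely echoing the bernoullinb-style values discussed in the text.

```python
import math

# Minimal sketch of the monomolecular accuracy model of equation 3:
# P(x) = alpha * (1 - beta * exp(-k * x)). Coefficients are illustrative.

def accuracy(x, alpha, beta, k):
    return alpha * (1.0 - beta * math.exp(-k * x))

alpha, beta, k = 0.87, 0.07, 4.47
for x in (1, 5, 10, 50):
    print(x, round(accuracy(x, alpha, beta, k), 4))
```

Running this shows the accuracy rising toward the asymptotic limit α and flattening once exp(−kx) becomes negligible, which is why increasing the sample size beyond a point yields no further improvement.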
Insightful conclusions can be drawn by comparing the α, β and k values. As an example, consider a pair of algorithms (ML models), bernoullinb and mlpclassifier. From the table of FIG. 6, it is observed that the α value for mlpclassifier is 1, as opposed to the α value of 0.87 for bernoullinb, meaning that the mlpclassifier outperforms bernoullinb in the accuracy limit. On the other hand, there is little leeway for the mlpclassifier for further improvement because its β is already very close to 0. In comparison, the β and k values for bernoullinb are 0.07 and 4.47, which suggests that an increase in the sample size could lead to further accuracy improvement.
Marginal Accuracy Benefit MAB(x) for the sample size (x): A concept from economics, called the marginal accuracy benefit (MAB hereafter), is applied for measuring the incremental change in accuracy with respect to the incremental change in energy, written as dP(x)/dE(x). By substituting the monomolecular accuracy model P(x) of equation 3 and the linear energy model E(x) of equation 2, the Marginal Accuracy Benefit is obtained as follows:
MAB(x) = dP(x)/dE(x) = αβk exp(−kx)/β̂_1 (4)
In view of equation 4, lim_(x→∞) MAB(x) → 0; the k value in fact determines how fast the MAB function decreases. It can be noted that α and k are non-negative, while β can be positive, zero or negative. The larger the k value, the faster the MAB decreases. A plot of the MAB curves as a function of the energy consumption for a few chosen classifiers is depicted in FIG. 7, where the y-axis denotes MAB and the x-axis the energy consumption on the log scale. It can be observed from FIG. 7 that the MAB(x) of a classifier generally decreases as the energy consumption increases. The MAB(x) for quadratic discriminant analysis decreases gradually as the energy consumption increases, as opposed to that for the linearsvc classifier, which drops abruptly at one energy consumption level. This means that beyond a certain energy consumption level, further increasing the sample size (and thus the energy consumption level) does not bring any accuracy benefit but wastes energy. Since E(x) = β̂_0 + β̂_1 x and MAB(x) = dP(x)/dE(x) = αβk exp(−kx)/β̂_1, the per-sample energy consumption can be approximated by the slope β̂_1, that is, E(x)/x ≈ β̂_1, and the reciprocal of the β̂_1 value can thus be thought of as the upper bound for MAB(x), that is, MAB(x) ≤ 1/β̂_1. The larger the β̂_1 value, the higher the energy consumed but the lower the marginal accuracy benefit. Thus, the per-sample energy consumption equals the reciprocal of the upper bound of the marginal accuracy benefit.
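The decay and upper bound of MAB(x) can be checked numerically with a short sketch of equation 4. All coefficient values here are hypothetical.

```python
import math

# Minimal sketch of equation 4: MAB(x) = alpha*beta*k*exp(-k*x) / b1,
# where b1 is the fitted energy slope. Coefficient values are invented.

def mab(x, alpha, beta, k, b1):
    return alpha * beta * k * math.exp(-k * x) / b1

alpha, beta, k, b1 = 0.87, 0.07, 4.47, 0.5
upper_bound = 1.0 / b1  # reciprocal of the slope bounds MAB(x) here

values = [mab(x, alpha, beta, k, b1) for x in (1, 2, 5, 10)]
assert all(v <= upper_bound for v in values)           # bounded above
assert all(a > b for a, b in zip(values, values[1:]))  # strictly decreasing
print([round(v, 6) for v in values])
```

The assertions mirror the two properties discussed in the text: MAB(x) decreases with the sample size (so with energy consumption) and, for these coefficient values, stays below the reciprocal of the energy slope.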
The two concepts are thus present in a duality: minimizing the energy consumption is equivalent to maximizing the marginal accuracy benefit. Based on this insight, the MAB upper bound 1/β̂_1 is used to rank classifiers, as shown in the table of FIG. 6, depicting the energy-aware pipeline of ML models ranked based on the Marginal Accuracy Benefit (MAB) to be used for best model selection, in accordance with some embodiments of the present disclosure. The highest ranked classifier can be thought of as the classifier with the highest marginal accuracy benefit or, equivalently, the lowest energy consumption. Thus, the MAB and energy rankings are reciprocal to each other. The primary reason for using the ranking, instead of the actual values, is that the MAB ranking is invariant over data sets.
It is to be noted that when a similar experiment is performed using the digits data set, the fitted parameters such as β̂_0, β̂_1 and α of the classifiers (ML models or algorithms) are found to differ from those in the table of FIG. 6, but the ranking of the classifiers remains intact. FIG. 6 depicts that different ML models or algorithms have different characteristics in terms of the marginal accuracy benefit: the mlpclassifier has the highest MAB ranking, while gradientboosting has the lowest MAB ranking, which corresponds to the well-known fact that substantial computing may result in a very small improvement in performance. The MAB ranking of ML algorithms as shown in FIG. 6 solidifies the fact that different ML models may differ by orders of magnitude in energy consumption, while having only small differences in accuracy.
This energy ranking of algorithms is used by the system 100 and the method 200 to minimize the energy consumption of ML process while satisfying prescribed accuracy during best model selection as depicted in FIG. 2 and FIG. 8.
FIGS. 2A through 2B (collectively referred as FIG. 2) is a flow diagram illustrating the method 200 providing the energy-aware Automated ML (Auto-ML) framework for selecting energy efficient ML model, using the system of FIG. 1A, in accordance with some embodiments of the present disclosure.
In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 200 by the processor(s) or one or more hardware processors 104. The steps of the method 200 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in FIG. 1 and the steps of flow diagram as depicted in FIG. 2. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods, and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps to be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
Referring to the steps of the method 200 in context of learnings from energy consumption and accuracy trade-off of ML models, at step 202 of the method 200, the one or more hardware processors 104 generate the energy-aware pipeline of ML models comprising a plurality of ML models for selection of a best model having an optimal trade-off between an energy consumption (E) and an accuracy (P). The energy-aware pipeline generation and selection of the best model is explained through steps 202a through 202f.
Step 202a- Initially the energy consumption (E) of each of the plurality of ML models available in the database 108 is determined by computing a CPU utilization (U) of each of the plurality of ML models for executing a ML process across a plurality of sample sizes of a randomized grid search over hyper-parameter space. As mentioned in FIG. 1B, the CPU utilization (U) is directly proportional to the energy consumption (E) of each of the plurality of ML models.
Step 202b- Once the energy consumption is computed, a curve of the energy consumption (E) of each of the plurality of ML models against the plurality of sample sizes of the randomized grid search over the hyper-parameter space is plotted on X-Y axes to identify the slope (β̂_1) of the curve and the y-axis intercept (β̂_0).
Step 202c- Once values of β̂_0 and β̂_1 are obtained, the energy consumption E(x) of each of the plurality of ML models is computed for a specific sample size (x) selected from among the plurality of sample sizes, wherein from equation 2, E(x) = β̂_0 + β̂_1 x.
Step 202d- Simultaneously, the accuracy P(x) of each of the plurality of ML models for the sample size (x) is computed on the hyperparameter space, wherein P(x) = α(1 − β exp(−kx)) as in equation 3, wherein α is the asymptotic limit of a corresponding ML model, β represents the growth range of the corresponding ML model indicative of a leeway for further improvement in the accuracy P(x), and k represents the rate indicating convergence of the corresponding ML model to the asymptotic limit.
Step 202e- Having derived P(x) and E(x), the Marginal Accuracy Benefit MAB(x) of each of the plurality of ML models for the sample size (x) is derived by computing the incremental change in the accuracy P(x) with respect to the incremental change in the energy consumption E(x) for the sample size (x). As mentioned in the FIG. 1B description, the per-sample energy consumption E(x)/x is approximated by the slope β̂_1, and the upper bound of the MAB(x) for the sample size (x) is limited by the reciprocal of the slope β̂_1. The E(x)/x and the upper bound vary with respect to a chosen dataset, whereas the decreasing order of the MAB(x) remains unchanged even with variation in the dataset. Reiterating, the advantage of using the MAB ranking, as disclosed, rather than the actual MAB value of each model, is that the MAB ranking is invariant over datasets. Thus, once the ranking is determined, it can be used for best model selection across different datasets without requiring repetition of rank determination, making it a time efficient process.
Step 202f- Based on the MAB(x), the plurality of ML models are arranged according to decreasing order of the MAB(x) to generate the energy-aware pipeline of ML models for selecting the best model, which is power efficient and still satisfies the prescribed accuracy constraints required by the end application for which the ML model is to be installed. As mentioned earlier, the MAB(x) is inversely proportional to the energy consumption E(x). Selection of the best model from the energy-aware pipeline of ML models is depicted in FIG. 8. At first, hyperparameter optimization of each of the plurality of ML models is performed sequentially, from the top to the bottom ranked ML models in descending order of MAB in the energy-aware pipeline, over the dataset of interest. Thereafter, the best model selected is one of (i) an ML model among the energy-aware pipeline of ML models that satisfies the prescribed accuracy criteria, and (ii) the ML model having the highest accuracy, if none of the ML models in the energy-aware pipeline satisfy the prescribed accuracy.
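The ordering step 202f can be sketched in a few lines: given the fitted energy slopes β̂_1 of each model (step 202b), ranking by the MAB upper bound 1/β̂_1 in descending order yields the energy-aware pipeline. The model names echo the disclosure, but the slope values below are invented for illustration.

```python
# Minimal sketch of step 202f: rank models by the MAB upper bound 1/b1
# (descending) to form the energy-aware pipeline. Slope values are invented.

fitted_slopes = {          # model name -> fitted energy slope b1
    "mlpclassifier": 0.05,
    "bernoullinb": 0.2,
    "linearsvc": 1.1,
    "gradientboosting": 5.0,
}

# A higher 1/b1 (higher MAB) means lower energy consumption: evaluate first.
pipeline = sorted(fitted_slopes, key=lambda m: 1.0 / fitted_slopes[m],
                  reverse=True)
print(pipeline)  # -> ['mlpclassifier', 'bernoullinb', 'linearsvc', 'gradientboosting']
```

Because the ranking, not the slope values, drives the ordering, the same pipeline can be reused across datasets, as the text notes.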
Efficacy Analysis of Energy-Aware Pipeline:
Lemma 1 (Hardy, Littlewood and Pólya). If a_i ≥ 0 and b_i ≥ 0, i = 1, · · ·, n are two sets of non-negative numbers, then
a_[1] b_(1) + · · · + a_[n] b_(n) ≤ a_1 b_1 + · · · + a_n b_n ≤ a_[1] b_[1] + · · · + a_[n] b_[n]
where a_[1] ≥ · · · ≥ a_[n] denotes the decreasing rearrangement and a_(1) ≤ · · · ≤ a_(n) the increasing rearrangement: oppositely ordered sums of products are minimal, similarly ordered sums are maximal.
Lemma 1, which is given in [Marshall and Olkin, 1979], has an intuitive explanation in the context of energy efficiency: interpret the symbol 'a' as the energy consumption rankings of a set of algorithms, and 'b' as the probabilities of selecting the algorithms (an algorithm is interchangeably referred to as an ML model). To get the minimum average energy consumption (or maximum average MAB), the algorithm with the lowest energy consumption ranking (equivalently, the highest MAB ranking) must be selected with the highest probability. This gives considerable insight into how to arrange algorithms in the pipeline so as to minimize the energy consumption of the ML process. The problem of minimizing the energy consumption thus reduces to the problem of assigning the probability p_i to algorithm A_i.
Definition: A pipeline is called energy-aware if the algorithms in the pipeline are arranged in increasing energy consumption ranking e = (E_(1), E_(2), ..., E_(m)), which is denoted by [A_(1), A_(2), ..., A_(m)]. Hereafter, the square/round bracket is used to indicate the pipeline/vector, respectively. Define ρ(A_i, D) to be the accuracy of executing algorithm A_i on the dataset D. The energy-aware pipeline execution flow is depicted via Pseudocode 1 and the process flow represented in FIG. 8, where the pipeline P and the counting array C are persistent variables across multiple user sessions. All the algorithms in the pipeline A_(1), A_(2), ..., A_(m) are configured in increasing energy consumption ranking. One algorithm at a time is loaded from the energy-aware pipeline and invoked against the data set D. If the obtained accuracy is better than or equal to the required accuracy, i.e., ρ(A_i, D) ≥ ρ (the prescribed accuracy), then the process ends and returns the result as shown in line 14 of Pseudocode 1. Otherwise, it fetches the next algorithm (ML model) in the pipeline and repeats the same process. The c_i is an ever-increasing persistent variable for keeping track of the number of times the corresponding algorithm A_i is invoked over multiple requests. In the case where all the algorithms in the pipeline fail to satisfy the prescribed accuracy, the best performing algorithm in the pipeline is chosen as the result of the pipeline execution, as shown in line 18 of Pseudocode 1, which is the worst-case scenario.
Pseudocode 1: Energy-aware pipeline
1: Input: data set D, required or prescribed accuracy ρ
2: Initial variables: i = 1, ρ_best = 0, k = 0
3: Persistent pipeline: P = [A_(1), A_(2), ..., A_(m)]
4: Persistent counter array: C = (c_1, c_2, ..., c_m)
5: while i ≤ m do
6: A_i ← P[i] {get an algorithm from P}
7: c_i ← c_i + 1 {increment c_i for A_i}
8: ρ_a ← ρ(A_i, D) {get the accuracy of A_i on D}
9: if ρ_a > ρ_best then
10: ρ_best ← ρ_a
11: k ← i
12: end if
13: if ρ_a ≥ ρ then
14: return A_i, ρ_best {ends if ρ_a ≥ ρ}
15: end if
16: i ← i + 1
17: end while
18: return k, ρ_best, A_(k) {return the best performing algorithm in P}
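For illustration, Pseudocode 1 can be rendered as a short runnable sketch. The accuracy function is stubbed with invented scores; in the disclosure it would correspond to hyperparameter optimization of A_i on the data set D.

```python
# Minimal runnable sketch of Pseudocode 1. Model names echo the disclosure;
# the accuracy scores and counter values are invented for illustration.

def run_pipeline(pipeline, accuracy_of, rho, counters):
    """Evaluate algorithms in increasing energy-ranking order; stop at the
    first one meeting prescribed accuracy rho, else return the best seen."""
    rho_best, k = 0.0, 0
    for i, algo in enumerate(pipeline):
        counters[i] += 1              # c_i <- c_i + 1
        rho_a = accuracy_of(algo)     # rho_a <- rho(A_i, D)
        if rho_a > rho_best:
            rho_best, k = rho_a, i
        if rho_a >= rho:              # prescribed accuracy satisfied
            return algo, rho_best
    return pipeline[k], rho_best      # worst case: best performer in P

pipeline = ["mlpclassifier", "bernoullinb", "gradientboosting"]
scores = {"mlpclassifier": 0.90, "bernoullinb": 0.95, "gradientboosting": 0.99}
counters = [0, 0, 0]

best, acc = run_pipeline(pipeline, scores.get, rho=0.93, counters=counters)
print(best, acc, counters)  # -> bernoullinb 0.95 [1, 1, 0]
```

Note that execution never reaches gradientboosting: the cheaper bernoullinb already satisfies the prescribed accuracy, which is exactly the energy saving the pipeline ordering is designed to produce.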
The following theorem 1 shows that this pipeline structure is optimal.
Theorem 1. The energy-aware pipeline is optimal in terms of energy efficiency.
Proof. Let P = [A_(1), A_(2), ..., A_(m)] be a set of algorithms in the energy-aware pipeline, e = (E_(1), E_(2), ..., E_(m)) be the corresponding energy consumption rankings, and C = (c_1, c_2, ..., c_m), where c_i is the number of times algorithm A_(i) is invoked. The sequential execution dependence in the pipeline ensures that c_i ≥ c_(i+1), i = 1, ..., m−1. Hence C = (c_1, c_2, ..., c_m) = (c_[1], c_[2], ..., c_[m]). Define p_i = c_i / (Σ_(i=1)^m c_i), where p_i is the fraction of invocations of algorithm A_i. It is apparent that (p_1, p_2, ..., p_m) = (p_[1], p_[2], ..., p_[m]). Since the algorithms in [A_(1), ..., A_(m)] are in increasing energy consumption ranking order, the two vectors (E_(1), E_(2), ..., E_(m)) and (p_[1], p_[2], ..., p_[m]) are oppositely ordered. It follows directly from Lemma 1 that the average energy consumption ranking of the energy-aware pipeline execution, Σ_(i=1)^m E_(i) p_[i], is minimal.
Theorem 1 states that energy saving of the ML process can be achieved only when an algorithm with a lower energy consumption ranking has a greater chance of being evaluated. In other words, the likelihood of an algorithm being executed is in inverse proportion to its energy consumption ranking. To achieve this objective in a practical setting, first a comprehensive empirical study is performed to derive the energy consumption rankings of the models/algorithms and to show that such an energy consumption ranking is invariant with respect to data sets; then the algorithms (ML models) are placed in ranking-ascending order in the pipeline, which guarantees that a lower-energy algorithm has a higher chance of being evaluated. For randomly selected algorithms, each algorithm has an equal probability of being executed, so the average energy consumption ranking is Ē_avg = Σ_(i=1)^m E_(i)/m, which is outperformed by the energy-aware pipeline P = [A_(1), ..., A_(m)] with Ē_opt = Σ_(i=1)^m E_(i) p_[i], as shown in Theorem 1.
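The comparison between Ē_opt and Ē_avg can be verified numerically for a small example. The invocation counts below are hypothetical; any counts that are non-increasing down the pipeline give the same conclusion.

```python
# Minimal numeric check of Theorem 1's comparison: with energy rankings
# E = (1..m) increasing and invocation fractions p decreasing down the
# pipeline, sum(E_i * p_[i]) falls below the uniform average sum(E)/m.

m = 4
energy_rank = list(range(1, m + 1))   # increasing energy consumption ranking
counts = [40, 30, 20, 10]             # c_1 >= c_2 >= ... (sequential flow)
total = sum(counts)
p = [c / total for c in counts]       # p_i, already in decreasing order

e_opt = sum(e * pi for e, pi in zip(energy_rank, p))  # energy-aware pipeline
e_avg = sum(energy_rank) / m                          # random-selection baseline

print(e_opt, e_avg)
assert e_opt < e_avg  # oppositely ordered sum is smaller, per Lemma 1
```

Here the energy-aware ordering achieves an average ranking of 2.0 against the uniform baseline of 2.5, illustrating the optimality claimed by Theorem 1 on this toy instance.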
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means, and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
Claims: We Claim:
1. A processor implemented method (200) comprising:
generating (202), by one or more hardware processors, an energy-aware pipeline of ML models comprising a plurality of ML models for selection of a best model having an optimal trade-off between an energy consumption (E) and an accuracy (P) with respect to a dataset of interest, wherein generating the energy-aware pipeline of ML models comprising:
determining the energy consumption (E) of each of the plurality of ML models by computing a CPU utilization (U) of each of the plurality of ML models for executing a ML process across a plurality of sample sizes of a randomized grid search over hyper-parameter space, wherein the CPU utilization (U) is directly proportional to the energy consumption (E) of each of the plurality of ML models (202a);
plotting a curve of the energy consumption (E) of each of the plurality of ML models against the plurality of sample sizes of the randomized grid search over hyper-parameter space on X-Y axes to identify a slope (β̂_1) of the curve and a y-axis intercept (β̂_0) (202b);
computing an energy consumption E(x) of each of the plurality of ML models for a sample size (x) selected from among the plurality of sample sizes, wherein E(x) = β̂_0 + β̂_1 x (202c);
computing an accuracy P(x) of each of the plurality of ML models for the sample size (x) on hyperparameter space, wherein P(x) = α(1 − β exp(−kx)), wherein α is an asymptotic limit of a corresponding ML model, β represents a growth range of the corresponding ML model indicative of a leeway for further improvement in the accuracy P(x), and k represents a rate indicating convergence of the corresponding ML model to the asymptotic limit (202d);
determining a Marginal Accuracy Benefit MAB(x) of each of the plurality of ML models for the sample size (x) by computing an incremental change in the accuracy P(x) with respect an incremental change in the energy consumption E(x) for the sample size (x) (202e); and
arranging the plurality of ML models according to decreasing order of the MAB(x) to generate the energy-aware pipeline of ML models for selecting the best model, wherein the MAB(x) is inversely proportional to the energy consumption E(x) (202f).

2. The method as claimed in claim 1, wherein selecting the best model from the energy-aware pipeline of ML models comprises:
sequentially performing hyperparameter optimization of each of the plurality of ML models arranged in a MAB descending order in the energy-aware pipeline of ML models over the dataset of interest; and
selecting the best model as one of (a) a ML model among the energy-aware pipeline of ML models that satisfies the prescribed accuracy criteria, and (b) a ML model having the highest accuracy if none of the ML models in the energy-aware pipeline of ML models satisfy the prescribed accuracy.

3. The method as claimed in claim 1, wherein the MAB(x) = dP(x)/dE(x) = αβk exp(−kx)/β̂_1 is calculated experimentally using the same dataset for each of the plurality of ML models, and wherein the MAB(x) order of the ML models remains unchanged across different datasets.

4. The method as claimed in claim 1, wherein per-sample energy consumption E(x)/x is approximated by the slope β̂_1, and an upper bound of the MAB(x) for the sample size (x) is limited by the reciprocal of the slope β̂_1, wherein the E(x)/x and the upper bound vary with respect to a chosen dataset, whereas the decreasing order of the MAB(x) remains unchanged even with variation in the dataset.

5. The method as claimed in claim 1, wherein the energy-aware pipeline of ML models corresponds to the arrangement of ML models in a decreasing MAB ranking, which is equivalent to an increasing energy consumption ranking.

6. A system (100) comprising:
a memory (102) storing instructions;
one or more Input/Output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more I/O interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:

generate an energy-aware pipeline of ML models comprising a plurality of ML models for selection of a best model having an optimal trade-off between an energy consumption (E) and an accuracy (P) with respect to a dataset of interest, wherein generating the energy-aware pipeline of ML models comprising:
determining the energy consumption (E) of each of the plurality of ML models by computing a CPU utilization (U) of each of the plurality of ML models for executing a ML process across a plurality of sample sizes of a randomized grid search over hyper-parameter space, wherein the CPU utilization (U) is directly proportional to the energy consumption (E) of each of the plurality of ML models;
plotting a curve of the energy consumption (E) of each of the plurality of ML models against the plurality of sample sizes of the randomized grid search over hyper-parameter space of the dataset on X-Y axes to identify a slope (β̂_1) of the curve and a y-axis intercept (β̂_0);
computing an energy consumption E(x) of each of the plurality of ML models for a sample size (x) selected from among the plurality of sample sizes, wherein E(x) = β̂_0 + β̂_1 x;
computing an accuracy P(x) of each of the plurality of ML models for the sample size (x) on hyperparameter space, wherein P(x) = α(1 − β exp(−kx)), wherein α is an asymptotic limit of a corresponding ML model, β represents a growth range of the corresponding ML model indicative of a leeway for further improvement in the accuracy P(x), and k represents a rate indicating convergence of the corresponding ML model to the asymptotic limit;
determining a Marginal Accuracy Benefit MAB(x) of each of the plurality of ML models for the sample size (x) by computing an incremental change in the accuracy P(x) with respect an incremental change in the energy consumption E(x) for the sample size (x); and
arranging the plurality of ML models according to decreasing order of the MAB(x) to generate the energy-aware pipeline of ML models for selecting the best model, wherein the MAB(x) is inversely proportional to the energy consumption E(x).

7. The system as claimed in claim 6, wherein the one or more hardware processors (104) are configured to select the best model from the energy-aware pipeline of ML models by:
sequentially performing hyperparameter optimization of each of the plurality of ML models arranged in a MAB descending order in the energy-aware pipeline of ML models over the dataset of interest; and
selecting the best model as one of (a) a ML model among the energy-aware pipeline of ML models that satisfies the prescribed accuracy criteria, and (b) a ML model having the highest accuracy if none of the ML models in the energy-aware pipeline of ML models satisfy the prescribed accuracy.

8. The system as claimed in claim 6, wherein the MAB(x) = dP(x)/dE(x) = αβk exp(−kx)/β̂_1 is calculated experimentally using the same dataset for each of the plurality of ML models, and wherein the MAB(x) order of the ML models remains unchanged across different datasets.

9. The system as claimed in claim 6, wherein per-sample energy consumption E(x)/x is approximated by the slope β̂_1, and an upper bound of the MAB(x) for the sample size (x) is limited by the reciprocal of the slope β̂_1, wherein the E(x)/x and the upper bound vary with respect to a chosen dataset, whereas the decreasing order of the MAB(x) remains unchanged even with variation in the dataset.

10. The system as claimed in claim 6, wherein the energy-aware pipeline of ML models corresponds to the arrangement of ML models in a decreasing MAB ranking, which is equivalent to an increasing energy consumption ranking.

Dated this 22nd day of April 2022
Tata Consultancy Services Limited
By their Agent & Attorney

(Adheesh Nargolkar)
of Khaitan & Co
Reg No IN-PA-1086

Documents

Application Documents

# Name Date
1 202221023926-STATEMENT OF UNDERTAKING (FORM 3) [22-04-2022(online)].pdf 2022-04-22
2 202221023926-REQUEST FOR EXAMINATION (FORM-18) [22-04-2022(online)].pdf 2022-04-22
3 202221023926-FORM 18 [22-04-2022(online)].pdf 2022-04-22
4 202221023926-FORM 1 [22-04-2022(online)].pdf 2022-04-22
5 202221023926-FIGURE OF ABSTRACT [22-04-2022(online)].jpg 2022-04-22
6 202221023926-DRAWINGS [22-04-2022(online)].pdf 2022-04-22
7 202221023926-DECLARATION OF INVENTORSHIP (FORM 5) [22-04-2022(online)].pdf 2022-04-22
8 202221023926-COMPLETE SPECIFICATION [22-04-2022(online)].pdf 2022-04-22
9 202221023926-FORM-26 [23-06-2022(online)].pdf 2022-06-23
10 202221023926-Proof of Right [29-06-2022(online)].pdf 2022-06-29
11 Abstract1.jpg 2022-07-30
12 202221023926-FER.pdf 2025-04-01
13 202221023926-FER_SER_REPLY [29-09-2025(online)].pdf 2025-09-29
14 202221023926-CLAIMS [29-09-2025(online)].pdf 2025-09-29

Search Strategy

1 SearchHistory(82)E_23-02-2024.pdf
2 NPL1E_23-02-2024.pdf