Abstract: A capacity planning tool for estimating the number of nodes required for hosting a web based application has been disclosed. The tool includes a benchmarking unit 102 which provides the "cost per CRUD operation value" based on performance profiling of sample ASP.NET applications. The tool further comprises a processing unit 114 which based on a predetermined user profile and "cost per CRUD operation value" provided by the benchmarking unit 102, computes the peak total cost of business operations value for the web based application which is further used to calculate the total number of NODES required for hosting the application.
FORM-2 THE PATENTS ACT, 1970
(39 of 1970)
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
CAPACITY PLANNING TOOL
TATA CONSULTANCY SERVICES LTD.,
an Indian Company of Nirmal Building, 9th Floor,
Nariman Point, Mumbai - 21,
Maharashtra, India.
The following specification particularly describes the invention and the manner in which it is to be performed.
This application is a patent of addition to Indian Patent Application No. 1172/MUM/2009 filed on May 4th, 2009, the entire contents of which are specifically incorporated herein by reference.
FIELD OF INVENTION
The present invention relates to the field of capacity planning.
Particularly, the present invention relates to a tool for sizing hardware capacity for the web/application servers required for typically hosting ASP.NET applications.
DEFINITION OF TERMS USED IN THE SPECIFICATION
In this specification the term 'Node' refers to a computing unit including a standalone workstation, a server, a central processing unit and the like.
In this specification the term 'Application Specific Information' refers to information including 'approximate number of concurrent users in peak usage hours', 'average expected usage profile for the web based application in terms of create, read, update and delete operations', 'average time said profile will be taking for execution' i.e. 'average user session length value', 'designated processing unit's speed' and 'designated processing unit's utilization'.
In this specification the term 'Average user session length value' refers to the average time calculated between first and last requests made by concurrent users accessing a web based application.
In this specification the term 'actual cost per CRUD operation value' refers to the total cost computed for each of the create, read, update and delete database operations, in Megacycles, at maximum requests per second for a sample web based application under performance profiling.
In this specification the term 'Atomic profile' refers to a set of individual create, read, update and delete database operations which can be performed on a web application. And the term 'Mixed profile' refers to a plurality of multiple create, read, update, delete database operations which are performed on the web application.
BACKGROUND OF THE INVENTION
Capacity planning in general refers to a process for gauging the computational requirements/ hardware resources for hosting any web or windows application. Typically, when designing a web application, development teams have to keep in mind that hundreds of thousands of users can concurrently access the application in production. This concurrent user load can place tremendous stress upon system resources like CPU, memory, disk I/O and the like causing unexpected delays in application response, and thereby resulting in a poor user experience.
Inadequate hardware sizing is one of the prime reasons for a web application to perform poorly. Applications designed with highly scalable architecture and all best practices for development can also perform poorly if the capacity planning for the hardware which hosts the application has not been performed adequately.
Sizing is typically done based on a tiered architecture with the web server and application server deployed on the same server and database deployed on a separate server. The methodology can also be extended to applications deployed on N tier architectures. The typical deployment architecture of ASP.NET applications that can be sized has been shown in FIGURE 1 of the accompanying drawings.
Conventionally, TCA (Transaction Cost Analysis) is a sizing methodology that is used for modeling Web/Application server performances. TCA is a good approach towards profiling a web application based on its workload profile, identifying the most viewed pages and then predicting the web/application server sizing based on the performance of those pages under user load. There is a proportional relationship between website throughput and user load in TCA based analysis.
TCA allows calculation of cost of transactions of a web application profile by simulating client transactions on the host server. It can determine the resource utilization under varying user load by varying the user transaction rate. Usage profile can be designed to capture anticipated user behaviour. This usage profile can be used to determine the throughput target and other important transaction parameters from which resource utilization and capacity requirements are derived.
Typically, TCA can be used to measure the cost of individual shopper operations on an E-Commerce site such as registering on the site, browsing items, searching items, and adding an item to the shopping cart, checking out and the like. The capacity of a typical E-Commerce site can be determined by consolidating the costs of individual shopper operations and then converting this cost into the total resources required by the server. Transaction Cost Analysis typically comprises the following:
• Compiling a User Profile
• Measuring the Cost of Each Operation
• Estimating the Site Capacity
• Calculating the Site Capacity
• Verifying the Site Capacity
One major shortcoming of TCA is that it can be applied only to a developed site, when the most common user profile and cost of operations are known.
The IT system integrators are required to budget the cost of hardware resources for the project well before development, at the Pre Bid Stage or RFP (Request for Proposal) stage, when the usage/load on the system is not known. When sizing has to be done beforehand at the project planning stage or RFP stage, TCA cannot be applied in its present form. To overcome this shortcoming there is a need for sizing analysis at the RFP stage itself.
There have been various attempts in the prior art for developing such tools which can be applied at the RFP stage.
Particularly, United States Patent Application 2009/0235268 discloses a method for capacity planning based on determining resource utilization as a function of workload. Workload includes different types of requests made to a system including login request, request to purchase an item and the like. The capacity planning as disclosed in this patent application can be performed using a hypothetical workload, thus can be applied at the RFP stage. The method performs the capacity planning by determining the characteristics of workload and utilization of resources for processing that workload. Further, the critical resource levels are detected for a workload and capacity is estimated based on these workloads.
United States Patent Application 2008/0221941 discloses a system and method for capacity planning for computing systems. The disclosed system receives a representative load workload of a computing system and determines the resource cost for the workload and based on the same derives the capacity planning analysis for the computing system. The disclosure also takes into account the time delay between two transactions termed as 'think time', for accurately deciding the capacity needs of the computing system. The resources include the various transactions that take place on the computing system.
The above disclosures estimate the hardware sizing based on the various transactions which can be performed by the application; hence the estimation is done at the application level, which increases the time and complexity of the system to derive a particular solution. Moreover, the disclosure provides means for verifying whether the computed hardware resource estimates are accurate.
Therefore, there is felt a need for a hardware sizing system and methodology which based on the web application requirements can perform capacity planning to meet the application needs. In addition, there is a need for a system which is less complex and accurately estimates the sizing requirements even at the RFP stage.
OBJECT OF THE INVENTION
It is an object of the present invention to provide a time efficient capacity planning and sizing tool.
It is another object of the present invention to provide an accurate tool for calculating the hardware sizing based on the web application's needs.
It is still another object of the present invention to provide a user-friendly, portable and scalable tool for capacity planning and sizing.
It is yet another object of the present invention to provide a tool for calculating hardware sizing of applications at Request For Proposal (RFP) stage as well as for developed web servers.
One more object of the present invention is to provide a tool for capacity planning of new applications when workload and the technical architecture are the only details available.
SUMMARY OF THE INVENTION
The present invention envisages a capacity planning tool for estimating the number of nodes required for hosting a web based application, wherein the tool comprises the following components:
• a benchmarking unit adapted to receive a sample web based application specific information, the benchmarking unit having:
■ recording means adapted to record all atomic transactions which can be performed by the sample web based application based on Create, Read, Update and Delete (CRUD) database operations and provide an atomic user profile and further adapted to record a mix of all transactions based on CRUD database operations and provide a mixed user profile;
■ a repository to store the atomic user profile and the mixed user profile;
■ testing means adapted to iteratively test the web application performance based on the atomic user profile and the mixed user profile respectively with increasing load and time till a maximum throughput is achieved and further adapted to generate test results in the form of 'cost of single CRUD operation value' for the atomic user profile and the mixed user profile respectively;
■ a comparator adapted to receive the 'cost of single CRUD operation value' for the atomic user profile and the mixed user profile respectively and compare the 'cost of single CRUD operation value' for the atomic user profile with the 'cost of single CRUD operation value' for the mixed user profile and generate a feedback value;
■ verification means adapted to verify if the feedback value is greater than a pre-determined value and provide a verified feedback value and further adapted to select an 'actual cost per CRUD operation value' based on the verified feedback value;
• a processing unit having:
* receiver means adapted to receive the target application specific information and the 'actual cost per CRUD operation value' from the benchmarking unit;
* collation means adapted to collate the target application specific information and provide collated information;
* first computational means adapted to process the collated information in a pre-determined manner to generate results in the form of a 'computed peak cost of business operations value';
* second computational means adapted to compute the number of nodes based on the 'computed peak cost of business operations value' and an 'average user session length value' obtained from the application specific information and provide a computed number of nodes; and
• display means to display the computed number of nodes required to host the web based application.
In accordance with the present invention, the first computational means is adapted to compute the 'computed peak cost of business operations value' as the summation of the product of a 'number of individual CRUD operation request' and the selected 'actual cost per CRUD operation value' for each of the create, read, update and delete operations, multiplied by a number of concurrent users accessing the web application.
The second computational means may increase the 'computed peak cost of business operations value' by 25% if Secure Socket Layer is implemented at a node hosting the web based application.
In addition, the second computational means is further adapted to derive a 'total node capacity required value in GHz' by dividing a peak Megacycles required value by the 'average user session length value' and still further adapted to compute the number of nodes as the division of the 'total node capacity required value in GHz' with an 'effective node value'.
Typically, the recording means is adapted to use widely adopted web application frameworks like DotNetNuke for recording atomic and mixed user profiles representing the Create, Read, Update and Delete operations using the information stored in the database of the web based application.
A method for estimating the number of nodes required for hosting a web based application is envisaged, the method comprising the following steps:
a. recording all transactions which can be performed by the web based
application as atomic and mixed user profiles in terms of create, read,
update and delete database operations;
b. storing the recorded atomic and mixed user profiles in a repository;
c. testing the web application's performance for the atomic user profile and
the mixed user profile with a step load pattern;
d. generating test results in the form of 'single cost per CRUD operation
value' for the atomic user profile and the mixed user profile;
e. comparing the 'single cost per CRUD operation value' for the atomic user
profile with the 'single cost per CRUD operation value' for the mixed user
profile and generating a feedback value;
f. verifying if the feedback value is greater than a pre-determined value;
g. selecting an 'actual cost per CRUD operation value' determined by either
the atomic user profile or the mixed user profile based on the feedback
value;
h. receiving a target application specific information;
i. collating the received application specific information;
j. processing the collated information in a pre-determined manner to
generate results in the form of 'computed peak cost of business operations
value';
k. computing the number of nodes required for hosting the web based
application based on the 'computed peak cost of business operations value';
and
l. displaying the computed number of nodes required to host the web based
application.
Typically, testing the web application's performance includes the steps of employing a predetermined load on the test node hosting the web application for a predetermined time window, executing the operations of the stored profiles, determining the node's response in requests/second, and iteratively increasing the time and load by a pre-determined value till a maximum throughput is achieved.
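The step load procedure described above can be sketched as a simple loop. This is a minimal illustration only: the callable `measure`, the default step sizes, and the stopping rule (stop as soon as requests/second no longer increases) are assumptions introduced here and are not specified in this form by the testing means.

```python
from typing import Callable, Tuple


def find_max_throughput(
    measure: Callable[[int, int], float],
    start_load: int = 10,
    load_step: int = 10,
    start_window_s: int = 60,
    window_step_s: int = 30,
    max_steps: int = 100,
) -> Tuple[int, float]:
    """Raise load and time window step by step until requests/sec stops growing.

    `measure(load, window_s)` is a hypothetical hook that runs the stored
    profile against the test node and returns the observed requests/second.
    """
    load, window_s = start_load, start_window_s
    best_load, best_rps = load, measure(load, window_s)
    for _ in range(max_steps):
        load += load_step
        window_s += window_step_s
        rps = measure(load, window_s)
        if rps <= best_rps:  # assumed stopping rule: throughput has plateaued
            break
        best_load, best_rps = load, rps
    return best_load, best_rps


def fake_server(load: int, window_s: int) -> float:
    """Stand-in for a real load test: throughput saturates at 680 requests/sec."""
    return min(4.0 * load, 680.0)


peak_load, peak_rps = find_max_throughput(fake_server)
```

In practice `measure` would wrap a real load testing tool; the stand-in `fake_server` exists only so the sketch can be run on its own.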
Preferably, the step of generating test results includes the step of selecting the actual cost of operation value as the value which is achieved as a highest request/second value by the node under test.
Furthermore, the step of computing the number of nodes for the target application includes the steps of:
a. receiving the selected actual cost per CRUD operation value;
b. determining 'total Megacycles' required by the node as a summation of
product of the 'number of individual CRUD operation request' and the
selected cost per CRUD operation value for each of the create, read,
update and delete operations and the number of concurrent users;
c. deriving the total node capacity required value in GHz by dividing the
peak Megacycles required value by the average user session length;
d. deriving an 'effective node value' as the product of the nominated node
speed and nominated utilization divided by 100; and
e. computing the number of nodes as the division of the determined 'total node
capacity required value in GHz' with the 'effective node value'.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
Other aspects of the invention will become apparent by consideration of the accompanying drawings and their description stated below, which is merely illustrative of a preferred embodiment of the invention and does not limit in any way the nature and scope of the invention.
FIGURE 1 illustrates conventional reference architecture for web applications;
FIGURE 2 illustrates a schematic of the capacity planning tool in accordance with the present invention;
FIGURE 3 illustrates a sample screenshot of a typical GUI and output provided by the capacity planning tool in accordance with the present invention;
FIGURE 4 illustrates a sample screenshot of typical input in the form of user profile to
the capacity planning tool in accordance with the present invention;
FIGURE 5 is a flowchart showing the steps for estimating the hardware requirements for
a web based application in accordance with the present invention;
FIGURE 6 illustrates the workload modeling for various user profiles and scenarios in
accordance with the present invention;
FIGURE 7 illustrates a graph showing Request/Sec Vs User Load for a read operation
for an application in accordance with the present invention;
FIGURE 8 illustrates a graph showing Requests in Application queue Vs User Load for a read
operation for an application in accordance with the present invention;
FIGURE 9 illustrates a graph showing Requests Execution Time Vs User Load for a
read operation for an application in accordance with the present invention;
FIGURE 10 illustrates a graph showing Processor Time Vs User Load for a read
operation for an application in accordance with the present invention;
FIGURE 11 illustrates a graph showing Request Execution Time Vs User Load for a
create operation for an application in accordance with the present invention;
FIGURE 12 illustrates a graph showing Requests in Application queue Vs User Load for
a create operation for an application in accordance with the present invention;
FIGURE 13 illustrates a graph showing Request/ Sec Vs User Load for a create
operation for an application in accordance with the present invention;
FIGURE 14 illustrates a graph showing Processor Time Vs User Load for a create
operation for an application in accordance with the present invention;
FIGURE 15 illustrates a graph showing Request/Sec Vs User Load for an update
operation for an application in accordance with the present invention;
FIGURE 16 illustrates a graph showing Requests Execution Time Vs User Load for an
update operation for an application in accordance with the present invention;
FIGURE 17 illustrates a graph showing Processor Time Vs User Load for an update
operation for an application in accordance with the present invention;
FIGURE 18 illustrates a graph showing Request Execution Time Vs User Load for a
delete operation for an application in accordance with the present invention;
FIGURE 19 illustrates a graph showing Requests in Application queue Vs User Load for
a delete operation for an application in accordance with the present invention;
FIGURE 20 illustrates a graph showing Request/ Sec Vs User Load for a delete
operation for an application in accordance with the present invention; and
FIGURE 21 illustrates a graph showing Processor Time Vs User Load for a delete
operation for an application in accordance with the present invention.
DETAILED DESCRIPTION
The invention will now be described with reference to the accompanying drawings which do not limit the scope and ambit of the invention. The description provided is purely by way of example and illustration. Although the invention has been tested using specific web application frameworks such as DotNetNuke (DNN) and the transactions are recorded using tools like Fiddler, it is envisaged that the invention can be tested and data can be gathered using other web applications with or without modification or alteration of the invention.
The systems disclosed in the prior art provide hardware sizing mechanisms, but these mechanisms work on the principle of tracking the resource utilization for each of the transactions performed by an application whose hardware sizing is to be determined. This methodology involves extensive computational processing and may not provide accurate estimation at the request for proposal stage.
Thus, the present invention envisages a capacity planning tool which can be applied at the request for proposal stage and can accurately estimate the hardware requirements of the application based on the application profile modeled on CRUD (Create, Read, Update and Delete) database operations. Particularly, this tool can be used for sizing capacity of nodes for the web/application servers which are required for hosting ASP.NET applications, typically those with standard architectures involving browsers/web clients 10 for the users 12, which use a Wide Area Network (WAN) 14 to connect to the web server/application server 16 and database server 18 as shown in FIGURE 1 of the accompanying drawings.
The present invention overcomes the shortcomings of the prior art and the TCA approach with a 'cost per CRUD operation' approach, which enhances the TCA analysis for sizing ASP.NET applications at the Request for Proposal (RFP) stage itself.
Specifically, the methodology of the present invention is to select a workload profile and further select a benchmark for testing the performance of multi tiered enterprise grade web applications.
In accordance with the present invention, the proposed capacity planning tool involves two basic steps which are as follows:
1. Selecting a workload profile:
A workload profile consists of an aggregate mix of users performing various operations. For example, for a load of 200 concurrent users, the profile might indicate that 20 percent of users perform order placement, 30 percent add items to a shopping cart, while 50 percent browse the product catalog. This helps in identifying and optimizing the areas that consume an unusually large proportion of server resources.
2. Selecting a Benchmark:
Specifically, in the .NET world there is no standard benchmark that can be used for testing the performance of multi tiered enterprise grade web applications. Benchmarks like Nperf were evaluated; however, they are still in the early phases of development and are not applicable to load tests for web applications. Based on these findings, a dedicated performance benchmark has been developed in accordance with the present invention, which can be used by projects to determine load and to perform the estimations needed for effective hardware sizing.
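The workload profile example given under step 1 above (200 concurrent users split 20/30/50 across three operations) reduces to simple arithmetic, sketched below. The function name, the dictionary layout and the requirement that the percentages sum to 100 are assumptions made for illustration.

```python
from typing import Dict


def users_per_operation(concurrent_users: int, mix_pct: Dict[str, int]) -> Dict[str, int]:
    """Split a concurrent user load across operations by percentage share."""
    if sum(mix_pct.values()) != 100:
        raise ValueError("workload mix percentages must sum to 100")
    return {
        name: round(concurrent_users * pct / 100)
        for name, pct in mix_pct.items()
    }


# Figures from the workload profile example in the text.
profile = users_per_operation(
    200,
    {"order placement": 20, "add to cart": 30, "browse catalog": 50},
)
```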
In accordance with the present invention, the 'cost per CRUD operation' approach envisaged by the present invention takes into account the basic data access operations (CRUD Operations) which are part of every data driven application developed using ASP.NET and have maximum impact on the application performance. These operations are typical ADO.NET calls to a database, since ADO.NET calls have the maximum processing overhead.
These operations include:
• Create - A typical database insert operation which creates a record in a table.
• Read - A typical database select operation.
• Update - A typical database update operation.
• Delete - A typical database delete operation.
• Batch Update and Batch Delete - A typical batch operation implies a state where the number of queries to be performed is large enough to grow the page response time beyond acceptable limits; as a result, such operations are normally done using Ajax calls or under a separate thread, and the page response is sent back.
Based on this methodology, the cost of the aforementioned individual operations in Megacycles at maximum requests per second (throughput) can be calculated.
As create, read, update, and delete are atomic operations and any generic ASP.NET application can be broken down to these atomic operations at the base level, the cost per CRUD operation methodology proves very useful.
Referring to the accompanying drawings, FIGURE 2 shows a schematic of the capacity planning tool, represented by reference numeral 100 in accordance with the present invention. The capacity planning tool for estimating the number of nodes required for hosting a web based application comprises a benchmarking unit 102 for performing sample web application profiling. This benchmarking unit 102 provides the actual cost per CRUD operation by profiling sample ASP.NET applications.
In accordance with the present invention, the benchmarking unit 102 conducts the profiling tests on the web application using a web application framework like DotNetNuke (DNN). DNN is an open source portal building application infrastructure powered by a strong built-in framework for content management. This application framework is chosen to benchmark data intensive web applications, since DNN creates every page from the information stored in a database and is a highly data intensive application.
The benchmarking unit 102 performs the sample web application profiling by receiving the sample web based application specific information from users through a graphical interface and processes this information to determine the 'actual cost per CRUD operation value' by employing a recording means 104. The recording means 104 records all atomic transactions which can be performed by the sample web based application based on Create, Read, Update and Delete (CRUD) database operations and provides an atomic user profile and further records a mix of all transactions based on CRUD database operations and provides a mixed user profile. The recording means 104 typically uses tools like Fiddler and VSTS Test Edition for recording the user profiles. These recorded atomic and mixed user profiles are stored in a repository 106 for further use by testing means 108. The testing means 108 iteratively tests the web application performance based on the stored atomic user profile and the stored mixed user profile respectively with increasing load and time till a maximum throughput is achieved and further generates test results in the form of 'cost of single CRUD operation value' for the atomic user profile and the mixed user profile respectively.
The 'cost of single CRUD operation value' for the atomic user profile and the mixed user profile respectively are received by the comparator 110, which compares the 'cost of single CRUD operation value' for the atomic user profile with the 'cost of single CRUD operation value' for the mixed user profile and generates a feedback value. The feedback value is calculated as the 'cost of single CRUD operation value' for the atomic user profile (calculated cost) divided by the 'cost of single CRUD operation value' for the mixed user profile (actual cost). Further, a verification means 112 verifies if the feedback value is greater than a pre-determined value and provides a verified feedback value and selects an 'actual cost per CRUD operation value' based on the verified feedback value.
In accordance with this invention, if the feedback value is greater than 1, it indicates that the calculated cost of operations is greater than the actual cost obtained by the benchmarking unit 102. So, if the calculated cost of operations is used for sizing, then a bigger size than what is actually required is selected and proposed by the present invention.
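The comparison performed by the comparator 110 and the check performed by the verification means 112 can be sketched as follows. The function names and the sample cost figures are hypothetical; the default threshold of 1 follows the text above, and the sketch deliberately stops short of encoding a selection rule, since the text does not state one in full.

```python
def feedback_value(atomic_cost_mc: float, mixed_cost_mc: float) -> float:
    """Ratio of the calculated (atomic profile) cost to the actual (mixed profile) cost."""
    if mixed_cost_mc <= 0:
        raise ValueError("mixed profile cost must be positive")
    return atomic_cost_mc / mixed_cost_mc


def calculated_cost_oversizes(feedback: float, threshold: float = 1.0) -> bool:
    """True when sizing from the calculated cost would exceed what is actually required."""
    return feedback > threshold


# Hypothetical costs in Megacycles, chosen only to exercise the functions.
fb = feedback_value(atomic_cost_mc=12.0, mixed_cost_mc=10.0)
oversized = calculated_cost_oversizes(fb)
```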
The 'actual cost of CRUD operation value' is then given to a processing unit 114 of the system 100 to determine the number of nodes required for hosting the sample ASP.NET application.
The processing unit 114 comprises a receiver means 116 to receive target application specific information (the application for which sizing is to be performed), single cost per operation value and the 'actual cost of CRUD operation value' from the benchmarking unit 102. This information includes mandatory details including approximate number of concurrent users in peak usage hours, average expected usage profile for the web based application in terms of create, read, update and delete operations, average session time the profile will be taking for execution, designated processing unit's speed and designated processing unit's utilization. A collation means 118 receives this application specific information and collates this information and sends it to the first computational means 120. The first computational means 120 processes this collated information in a pre-determined manner to generate results in the form of a 'computed peak cost of business operations value'. The first computational means 120 computes the 'computed peak cost of business operations value' based on the following equations and key terms:
Actual Cost Per single CRUD Operation (in Megacycles): This is a predetermined value in Megacycles which is received from the benchmarking unit 102 for individual CRUD operations.
No of CRUD Operations in the average user profile: This is an input to the receiving means 116 in terms of no of Create, Read, Update and Delete Operations performed by an average user for different business operations in a typical session in the target application.
Further, Computed Peak Cost of Business Operations in MegaCycles is computed as: (Cost of Single Create Operation * No of Create Operations + Cost of Single Read Operation * No of Read Operations + Cost of Single Update Operation * No of Update Operations + Cost of Single Delete Operation * No of Delete Operations)* No of Concurrent Users
1 GHz = 1000 Megacycles; and
Usage Profile: The various profiles created to calculate costs of individual operations like create, read, update and delete.
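The peak cost computation defined above can be expressed as a short function. This is a sketch under stated assumptions: the function name and the dictionary keys are introduced here for illustration, while the sample costs and operation counts are the figures used in the worked example of this specification (500 concurrent users).

```python
from typing import Mapping

OPERATIONS = ("create", "read", "update", "delete")


def peak_cost_megacycles(
    cost_per_op: Mapping[str, float],
    ops_per_session: Mapping[str, int],
    concurrent_users: int,
) -> float:
    """Sum of (single operation cost * operation count) over CRUD, times concurrent users."""
    per_user = sum(cost_per_op[op] * ops_per_session[op] for op in OPERATIONS)
    return per_user * concurrent_users


costs = {"create": 25.558, "read": 11.534, "update": 20.593, "delete": 32.972}
counts = {"create": 4, "read": 50, "update": 4, "delete": 5}
peak_cost = peak_cost_megacycles(costs, counts, concurrent_users=500)
```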
The key terms, notations and details of cost per operation approach are listed as shown in TABLE 1
Notation | Definition | Comment
Ncu | Number of users concurrently using service i at peak time. | 100 users using http can be expressed as Ncu
CPR | Cost Per Request. The cost of a request is CPU cycles needed to execute that request. | A request is a simple http get or post.
CPO | Cost Per Operation - An operation is a CRUD operation like create, read, update and delete. | An operation can have more than one request.
CSO | Cost of Single Operation - The cost of single operation is CPU cycles required to execute a single CRUD operation. | Some operations like delete and update are accompanied by the cost of a read operation. To calculate the single operation cost the read operation's cost has to be subtracted.
Ox (C,R,U,D) | Count of operations in a profile | O23(c) means 23 Create operations.
TABLE 1: Keywords, notations and details of cost per operation approach
Second computational means 122 uses the 'computed peak cost of business operations value' to compute the number of nodes required for hardware sizing.
The following computations are performed by the second computational means 122 to arrive at the number of nodes:
The number of Nodes required for a given number of concurrent users (Ncu) is calculated as:
Total Node Mega Cycles Required:
COSTcpu = (Σ CSOi * Oi) * Ncu, i.e. COSTcpu = (CSOc * Oc + CSOr * Or + CSOu * Ou + CSOd * Od) * Ncu
Average Session Time in seconds = S seconds
CPU Required ICPU = COSTcpu / (S * 1000) GHz
Target Node Size = X GHz
Target Node Utilization = Y % (In percentage)
Effective Node Available (ECPU) = X*Y/100
Therefore, the number of Nodes required (rounded to the nearest integer) = ICPU/ECPU
If SSL is implemented at the web server, its cost is taken as 25% extra.
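The node count computation above can be sketched in a few lines. The function and parameter names are illustrative. Two choices are assumptions rather than statements of the specification: rounding is implemented as round half up, and the result is floored at one node so that a small load never yields zero nodes.

```python
import math


def nodes_required(
    cost_megacycles: float,
    session_seconds: float,
    node_speed_ghz: float,
    node_utilization_pct: float,
    ssl_enabled: bool = False,
) -> int:
    """Number of nodes = ICPU / ECPU, rounded to the nearest integer."""
    if ssl_enabled:
        cost_megacycles *= 1.25  # SSL cost is taken as 25% extra
    icpu_ghz = cost_megacycles / (session_seconds * 1000)  # 1 GHz = 1000 Megacycles
    ecpu_ghz = node_speed_ghz * node_utilization_pct / 100
    ratio = icpu_ghz / ecpu_ghz
    # Assumptions: round half up, and never report fewer than one node.
    return max(1, math.floor(ratio + 0.5))


nodes_500_users = nodes_required(463082, 600, 2.2, 70)
nodes_5000_users = nodes_required(4630820, 600, 2.2, 70)
nodes_5000_users_ssl = nodes_required(4630820, 600, 2.2, 70, ssl_enabled=True)
```

A more conservative variant would round up with `math.ceil` instead of rounding to the nearest integer; that is a design choice outside what the text prescribes.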
In accordance with the present invention, the methodology used by the benchmarking unit 102 for calculating the cost of CRUD operations is described below:
Under maximum throughput conditions, for- Read Operation Profile with 10 Read
Requests , cost per read request (CPR) is calculated as:
CPR = (2200 x 4 x 0.8913)/680 = 11.534 Mega Cycles
where,
Processor Speed in GHz = 2.2
Number of Processor Cores = 4
Percentage Node Utilization in this profile test = 89.13 %
Maximum Requests per sec (Throughput) at IIS = 680
CPR = 11.534 Mega Cycles
For Read Operations, CPR = CPO = CSO, since a read operation can have only one request. The same applies to a create operation as well.
For Update and Delete Operations,
CSO = CPO - CPORead
Update and delete operations are almost always preceded by a read operation, and hence the effective cost of an update or delete operation is calculated by subtracting the cost of the read operation.
Using the above equations the cost of each operation can be calculated as shown below:
CSOr = 11.534 Mega Cycles
CSOc = 25.558 Mega Cycles
CSOu = 20.593 Mega Cycles
CSOd = 32.972 Mega Cycles
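The benchmarking arithmetic above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the update CPO of 32.127 Mega Cycles is back-computed here from the stated CSOu and CSOr values rather than being a figure given in the specification.

```python
def cost_per_request(speed_mhz, cores, utilization, max_rps):
    """Mega Cycles consumed per request at the point of maximum throughput."""
    return speed_mhz * cores * utilization / max_rps


def cost_of_single_operation(cpo, cpo_read=0.0):
    """CSO = CPO - CPORead; pass cpo_read=0 for read and create operations."""
    return cpo - cpo_read


# Read profile figures from the text: 2.2 GHz, 4 cores, 0.8913 utilization, 680 requests/sec.
cpr_read = cost_per_request(speed_mhz=2200, cores=4, utilization=0.8913, max_rps=680)
print(round(cpr_read, 3))  # 11.534

# Assumed update CPO (back-computed, not from the text) minus the read cost.
cso_update = cost_of_single_operation(cpo=32.127, cpo_read=11.534)
print(round(cso_update, 3))  # 20.593
```

For read and create operations the subtraction is skipped, which matches CPR = CPO = CSO as stated above.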
In accordance with the present invention, the methodology used by the second computational means 122 for calculating the number of Nodes required for an application where five hundred concurrent users access the application for a particular usage profile is described below:
The number of Nodes required for 500 users can be calculated as follows, for a user profile with the following mix of CRUD operations:
Oc = 4
Or = 50
Ou = 4
Od = 5
Total Node Mega Cycles Required: COSTcpu = (Σ CSOi * Oi) * Ncu
COSTcpu = (CSOc * Oc + CSOr * Or + CSOu * Ou + CSOd * Od) * Ncu
COSTcpu = (25.558*4 + 11.534*50 + 20.593*4 + 32.972*5) * 500 = 463082 Mega Cycles
Average Session Time in seconds = 600 seconds
CPU Required ICPU = COSTcpu / (S * 1000) GHz = 463082/(600*1000) = 0.7718 GHz
Target Node Size = X GHz = 2.2 GHz
Target Node Utilization = Y % (in percentage) = 70
Effective Node Available (ECPU) = X*Y/100
ECPU = 2.2*70/100 = 1.54 GHz
Number of Nodes Required (rounded to the nearest integer) = ICPU/ECPU = 0.501, which rounds to 1
If SSL is implemented at the web server, its cost is taken as 25% extra.
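The worked example can be reproduced with a short Python sketch. The function and variable names are hypothetical. Two details are assumptions rather than statements of the specification: rounding is done half-up with a minimum of one node, and the 25% SSL overhead is applied to the total Mega Cycles before dividing by the session time.

```python
import math

# Cost of Single Operation values (Mega Cycles) as stated in the text.
CSO = {"create": 25.558, "read": 11.534, "update": 20.593, "delete": 32.972}


def nodes_required(profile, users, session_s, node_ghz, utilization_pct, ssl=False):
    """Return (COSTcpu in Mega Cycles, ICPU in GHz, ECPU in GHz, node count)."""
    cost_mc = sum(CSO[op] * count for op, count in profile.items()) * users
    if ssl:
        cost_mc *= 1.25  # SSL cost taken as 25% extra
    icpu = cost_mc / (session_s * 1000)      # GHz required
    ecpu = node_ghz * utilization_pct / 100  # GHz usable per node
    # Assumption: round half-up, and never report fewer than one node.
    nodes = max(1, math.floor(icpu / ecpu + 0.5))
    return cost_mc, icpu, ecpu, nodes


profile = {"create": 4, "read": 50, "update": 4, "delete": 5}
cost_mc, icpu, ecpu, nodes = nodes_required(profile, users=500, session_s=600,
                                            node_ghz=2.2, utilization_pct=70)
print(round(cost_mc), round(icpu, 4), round(ecpu, 2), nodes)  # 463082 0.7718 1.54 1
```

Passing `ssl=True` raises the required capacity to about 0.965 GHz, which still fits on a single 2.2 GHz node at 70% utilization.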
The display means 124 displays the number of nodes, as calculated by the second computational means 122, required to host the web based application.
In accordance with the preferred embodiment of the present invention, Microsoft Excel is used as the graphical interface and mathematical tool to implement the invention. In the services industry there are a number of proposals to size for, and quite often the sizing expert needs to provide capacity estimates within stringent deadlines. The use of Excel facilitates portability as well as usability.
Typically, the read module of the present invention will help users to create a profile and get the calculated number of NODEs.
In accordance with the present invention, FIGURE 3 of the accompanying drawings represents the screenshot of the capacity sizing tool in accordance with the invention. FIGURE 4 of the accompanying drawings shows typical inputs provided to the invention in terms of user profile.
According to the invention, users have to fill up the following fields in the MS Excel based capacity planning tool:
1. Approximate Number of Concurrent Users in Peak Hours - Sizing is always to be done for the peak usage load hence the approximate number of concurrent users in peak usage hours is taken.
2. Application Usage Profile - The average expected usage profile of the application in terms of business operations and resultant CRUD operations carried out by a user in a typical session. Based on this profile the total cost of business operations is calculated. The number of operations has a major impact on the CPU cycles used. Hence this profile should be as close as possible to the actual deployment.
3. Average User Session Length in Seconds.
4. Default NODE size used under production - The target NODE size which will be deployed. Sizes like 2.2 GHz or 3 GHz for each core are normally used for deployments.
5. Target NODE Utilization - The effective NODE utilization to be used at the deployment servers.
To use the tool users are required to follow the following steps:
1. For the target application, create a profile of the pages which will include all the business operations performed in the user session and the corresponding number of CRUD operations. The number of CRUD operations has a major impact on the NODE cycles used. Hence this profile should be as close as possible to the actual deployment.
2. Identify the peak usage requirements including concurrent users and the total capacity of processor to be used.
3. Specify the average session time a user will be using the application.
4. Based on the profile, cost of individual CRUD operations and number of concurrent users the total cost of business operations during peak usage is calculated.
5. Based on the average session time the NODE capacity required is calculated.
6. If the web application works across a secure channel (https) additional processor usage is factored into the calculation. This is typically 25 %.
7. In addition any other factor that may lead to additional NODE utilization can be factored in.
8. The following terms in the invention are explained as given below:
a. Default NODE size used under production: Typically, this is the target NODE size which will be deployed. Sizes like 2.2 GHz or 3 GHz for each core are normally used for deployments.
b. Target NODE Utilization: Effective NODE that needs to be used at the deployment servers.
9. Based on the above steps the Number of NODE's required is calculated.
Referring to FIGURE 5, a method for estimating the number of nodes required for hosting a web based application has been illustrated; the method comprises the following steps:
■ recording all transactions which can be performed by a sample web based application as atomic and mixed user profiles in terms of create, read, update and delete database operations, 1000;
■ storing the recorded atomic and mixed user profiles in a repository, 1002a and 1002b;
■ testing the web application's performance for said atomic user profile and said mixed user profile with a step load pattern, 1004;
■ generating test results in the form of 'single cost per CRUD operation value' for said atomic user profile and said mixed user profile, 1006;
■ comparing said 'single cost per CRUD operation value' for said atomic user profile with said 'single cost per CRUD operation value' for said mixed user profile and generating a feedback value, 1008;
■ verifying if the feedback value is greater than a pre-determined value, 1010;
■ selecting an 'actual cost per CRUD operation value' determined by either the atomic user profile or the mixed user profile based on the feedback value, 1012;
■ receiving a target application specific information, 1014;
■ collating said received application specific information, 1016;
■ processing said collated information in a pre-determined manner to generate results in the form of 'computed peak cost of business operations value', 1018;
■ computing the number of nodes required for hosting the web based application based on the 'computed peak cost of business operations value', 1020; and
■ displaying the computed number of nodes required to host the web based application, 1022.
EXPERIMENTAL DETAILS
The test environment comprised the following:
Web/Application Server details as shown in TABLE 2
Machine AMD Opteron Processor (848) x 4
Processor 2.20 GHz
RAM 7.83 GB
Storage 40 GB
OS Windows Server 2003 Enterprise Edition SP1
IIS 6.0
.NET Framework 2.0
TABLE 2
Database Server details as shown in TABLE 3
Machine Intel Xeon Processor x 4
Processor 3.20 GHz
RAM 1GB
Storage 70 GB
OS Windows Server 2003 Enterprise Edition SP1
Database SQL Server Enterprise 2005 SP1
Network 1 GBPS Ethernet connectivity on each Server
TABLE 3
The tools used are described in TABLE 4
Tool Name | Description
Visual Studio Team System 2008 - Tester Edition | This is the latest Visual Studio released by Microsoft Corporation for Application Lifecycle Management. The Tester Edition was used for conducting the tests.
.NET CLR Profiler | The CLR profiler from RedGate Inc, called the ANTS profiler, was used to profile the applications for checking performance issues.
SQL Server Profiler | The SQL Profiler which comes with SQL Server was used to profile the database for any performance issues.
TABLE 4
DotNetNuke: DotNetNuke (DNN) 4.8.1 was taken to perform the tests. DNN is portal building software from DotNetNuke Corporation and has been built on a strong extensible framework. It is built on top of the IBuySpy portal building framework published by Microsoft.
The store module of DNN, which is a full-fledged shopping cart application, was chosen to perform the tests. DNN was modified to include a delete operation, where a screen was made to delete a record from the cart table in the database. The step load pattern was used in steps of 30 seconds, and each test was performed for a duration of 2 hours 30 minutes.
The testing methodology was as follows:
1. User profiles of atomic Create, Read, Update and Delete operations were recorded using the Fiddler tool as VSTS web tests.
2. Step load for 2 hours 30 minutes, with initial load of 5 users and step count of 5 users on a step increment time of 30 seconds was done.
3. The processor utilization where the request/sec was maximum was picked up from the performance counters recorded at the web server end.
4. A user profile having a mix of 70% read, 10% create, 10% update, and 10% delete was also tested on a step load with the same run settings.
5. Load Tests were performed on these mixed user profiles and Cost of Single Operation was calculated.
6. A comparison was done between the observed results and the mathematically calculated results and a feedback value was calculated.
7. This feedback value validates and verifies the Capacity Modeling Methodology of the present invention.
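The step load described in step 2 can be sketched as a simple schedule generator. This is a minimal illustration, not the load test tool's own implementation; the function name and the optional `max_users` cap are hypothetical additions.

```python
def step_load(initial_users=5, step_users=5, step_seconds=30,
              duration_seconds=9000, max_users=None):
    """Return (start_second, user_count) pairs for a step load pattern.

    Defaults follow the text: 5 initial users, 5 more users every
    30 seconds, for 2 hours 30 minutes (9000 seconds).
    """
    schedule = []
    for start in range(0, duration_seconds, step_seconds):
        users = initial_users + step_users * (start // step_seconds)
        if max_users is not None:
            users = min(users, max_users)  # hypothetical cap, not from the text
        schedule.append((start, users))
    return schedule


schedule = step_load()
print(schedule[:3])   # [(0, 5), (30, 10), (60, 15)]
print(len(schedule))  # 300
```

Without a cap, the uncapped schedule reaches 1500 users in its final step, so a real run would normally bound the user count.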
In the test, a feedback value greater than 1 was achieved, which indicates that the calculated Cost of Operations is greater than the cost actually observed. Hence, if the calculated Cost of Operations is used for sizing, the resulting size is larger than what is actually required, which errs on the safe side.
Moreover, sizing has to hold for an extended period of time during which the database may grow considerably; keeping that in mind, the methodology of the present invention is a suitable fit, since the sizing is done using the feedback value.
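The specification does not give a formula for the feedback value. The sketch below assumes it is the ratio of the mathematically calculated cost of the mixed profile to the cost observed in the mixed-profile load test, which is consistent with a value greater than 1 meaning that the calculated cost exceeds the observed one. The function names are hypothetical, and the observed cost of 14.0 Mega Cycles is invented purely for illustration.

```python
# Atomic Cost of Single Operation values (Mega Cycles) from the text.
ATOMIC_CSO = {"create": 25.558, "read": 11.534, "update": 20.593, "delete": 32.972}


def calculated_mixed_cost(mix):
    """Weighted average cost per operation for a mix given as fractions."""
    return sum(ATOMIC_CSO[op] * share for op, share in mix.items())


def feedback_value(calculated, observed):
    """Assumed definition: calculated cost divided by observed cost."""
    return calculated / observed


mix = {"read": 0.70, "create": 0.10, "update": 0.10, "delete": 0.10}
calculated = calculated_mixed_cost(mix)
observed = 14.0  # hypothetical measurement, not a figure from the specification
print(round(calculated, 4))                        # 15.9861
print(feedback_value(calculated, observed) > 1.0)  # True
```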
Workload Profile:
Typically, a workload profile consists of atomic create operations, atomic read operations, atomic update operations or atomic delete operations. Along with this, a mixed operational profile was also taken, which had a mix of all the four operations in various percentages as seen in Table 6.
User Profile | Percentage | Simultaneous Users
Browse | 70% | 250/500/750/1000
Insert | 10% | 250/500/750/1000
Update | 10% | 250/500/750/1000
Delete | 10% | 250/500/750/1000
Table 6
In this scenario 70% of browse operations were performed and 10% each of other operations were performed. Similarly other operations were increased to 70% in the operation mix and tests were performed.
Workload Modeling:
Figure 6 of the accompanying drawings shows the workload modeling for various profiles and scenarios. The results of the workload modeling can be seen in Table 7.
Operation | A | B | C | D
Percentage | 25% | 25% | 25% | 25%
Percentage | 70% | 10% | 10% | 10%
Percentage | 10% | 70% | 10% | 10%
Percentage | 10% | 10% | 70% | 10%
Percentage | 10% | 10% | 10% | 70%
Table 7
Benchmarking Results:
DotNetNuke test results:
The test results for DNN in Figure 7 of the accompanying drawings show a graphical representation of 2 point moving average of number of requests/sec with increase in user load on the server for a read operation.
The test results for DNN in Figure 8 of the accompanying drawings show a graphical representation for read operation of requests in application queue Vs user load.
The test results for DNN in Figure 9 of the accompanying drawings show a graphical representation of increase in Request Execution Time with increasing User Load for read operations on the server.
The test results for DNN in Figure 10 of the accompanying drawings show a graphical representation of 2 point moving average of User Load for read operations on the server.
The test results for DNN in Figure 11 of the accompanying drawings show a graphical representation of increase in request execution time with increase in User Load for create operations on the server.
The test results for DNN in Figure 12 of the accompanying drawings show a graphical representation of increase in requests in application queue with increase in User Load for create operations on the server.
The test results for DNN in Figure 13 of the accompanying drawings show a graphical representation of 3 point moving average in requests per second with increase in User Load for create operations on the server.
The test results for DNN in Figure 14 of the accompanying drawings show a graphical representation of the 3 point moving average of processor utilization with increase in User Load for create operations on the server.
The test results for DNN in Figure 15 of the accompanying drawings show a graphical representation of 2 point moving average of requests per second with increase in User Load for update operations on the server.
The test results for DNN in Figure 16 of the accompanying drawings show a graphical representation of increase in request execution time with increase in User Load for update operations on the server.
The test results for DNN in Figure 17 of the accompanying drawings show a graphical representation of 2 point moving average in processor utilization with increase in User Load for update operations on the server.
The test results for DNN in Figure 18 of the accompanying drawings show a graphical representation of 50 point moving average in requests per second with increase in User Load for delete operations on the server.
The test results for DNN in Figure 19 of the accompanying drawings show a graphical representation of increase in requests in application queue with increase in User Load for delete operations on the server.
The test results for DNN in Figure 20 of the accompanying drawings show a graphical representation of increase in request execution time with increase in User Load for delete operations on the server.
The test results for DNN in Figure 21 of the accompanying drawings show a graphical representation of 50 point moving average in processor utilization with increase in User Load for delete operations on the server.
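The moving averages plotted in these figures, and the selection of the processor utilization at the point of maximum requests/sec, can be sketched as follows. The helper names are hypothetical and the sample series are invented for illustration; only the peak pair (680 requests/sec at 0.8913 utilization) echoes a figure from the specification.

```python
def moving_average(values, window):
    """Simple n-point moving average; the result has len(values) - window + 1 points."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def utilization_at_peak(requests_per_sec, cpu_utilization):
    """Return (peak requests/sec, utilization recorded at that same sample)."""
    peak = max(range(len(requests_per_sec)), key=requests_per_sec.__getitem__)
    return requests_per_sec[peak], cpu_utilization[peak]


# Hypothetical samples taken at increasing user load.
rps = [120, 310, 540, 680, 650, 600]
cpu = [0.21, 0.48, 0.74, 0.8913, 0.93, 0.95]

print(moving_average(rps, 2))         # [215.0, 425.0, 610.0, 665.0, 625.0]
print(utilization_at_peak(rps, cpu))  # (680, 0.8913)
```

Note that utilization keeps rising after throughput peaks in this sample, which is why the utilization is read at the throughput maximum rather than at the end of the run.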
Test Results Analysis:
The results obtained with DotNetNuke were used to test 2 live applications.
Project A: The sizing was validated with 24.83% errors. Details of the sizing and usage profile can be seen in Table 8.
Transaction | Select | Insert | Update | Delete
Registering Ticket | 170 | 16 | 11 | 5
Common Queue | 20 | 1 | 0 | 0
My Queue | 100 | 1 | 6 | 0
History | 90 | 1 | 2 | 0
Resolving Ticket | 300 | 14 | 64 | 10
Updating Ticket | 250 | 10 | 17 | 5
Table 8
The production setup for performing the sizing is as follows:
1 Intel Xeon 3.0 GHz Dual Core Processor
Processor Utilization: 90%
Users: 200 concurrent users
Average Session Time: 300 seconds
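A usage profile such as Table 8 has to be reduced to the four counts Oc, Or, Ou and Od before the sizing formula can be applied. The sketch below sums the columns, assuming each transaction occurs once per user session and that Select maps to read and Insert to create; both mappings are assumptions, as is reading the garbled Insert entry for "My Queue" as 1. The names are hypothetical.

```python
# Rows of Table 8 as (Select, Insert, Update, Delete) counts.
TABLE_8 = {
    "Registering Ticket": (170, 16, 11, 5),
    "Common Queue": (20, 1, 0, 0),
    "My Queue": (100, 1, 6, 0),  # Insert count read as 1 (assumption)
    "History": (90, 1, 2, 0),
    "Resolving Ticket": (300, 14, 64, 10),
    "Updating Ticket": (250, 10, 17, 5),
}


def crud_totals(table):
    """Sum each column to get per-session CRUD operation counts."""
    reads, creates, updates, deletes = (sum(col) for col in zip(*table.values()))
    return {"create": creates, "read": reads, "update": updates, "delete": deletes}


totals = crud_totals(TABLE_8)
print(totals)  # {'create': 43, 'read': 930, 'update': 100, 'delete': 20}
```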
Project B: The sizing was validated with 30.75% errors on the upper side. Details of the sizing and usage profile can be seen in Table 9.
Transaction | Select | Insert | Update | Delete
Get catalog | 400 | 10 | 4 | 0
Download Pictures | 40 | 10 | 4 | 0
Upload content | 0 | 10 | 0 | 0
Publish Pictures | 45 | 120 | 20 | 0
Publish Audio | 45 | 200 | 20 | 0
Publish Video | 45 | 120 | 20 | 0
Download Audio | 40 | 10 | 4 | 0
Download Video | 40 | 10 | 4 | 0
Device Search - Audio | 140 | 0 | 4 | 0
Device Search - File | 200 | 0 | 4 | 0
Device Search - Folder | 60 | 0 | 4 | 0
Desktop Slideshow | 40 | 4 | 6 | 0
Desktop Thumbnail | 40 | 4 | 6 | 0
Table 9
The production setup for performing the sizing is as follows:
4 Intel Xeon E5345® 2.33 GHz Dual Core Processor
Processor Utilization: 80%
Users: 75 concurrent users
Average Session Time: 300 seconds
TECHNICAL ADVANTAGES
The technical advancements of the present invention include providing:
• a time efficient capacity planning and sizing tool;
• an accurate tool for calculating the hardware sizing based on the web application's needs;
• a user-friendly, portable and scalable tool for capacity planning and sizing;
• a tool for calculating hardware sizing of applications at request for proposal (RFP) stage as well as developed web servers;
• a tool for capacity planning of new applications when workload and the technical architecture are the only details available; and
• a tool that leverages the data access or CRUD operations for modeling the business operations in a typical user session in the application for the capacity planning.
While considerable emphasis has been placed herein on the components and component parts of the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the invention. These and other changes in the preferred embodiment as well as other embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
CLAIMS
We claim:
1. A capacity planning tool for estimating the number of nodes required for hosting a web based application, said tool comprising:
• a benchmarking unit adapted to receive a sample web based application specific information, said benchmarking unit having:
■ recording means adapted to record all atomic transactions which can be performed by the sample web based application based on Create, Read, Update and Delete (CRUD) database operations and provide an atomic user profile and further adapted to record a mix of all transactions based on CRUD database operations and provide a mixed user profile;
■ a repository to store said atomic user profile and said mixed user profile;
■ testing means adapted to iteratively test the web application performance based on said atomic user profile and said mixed user profile respectively with increasing load and time till a maximum throughput is achieved and further adapted to generate test results in the form of 'cost of single CRUD operation value' for said atomic user profile and said mixed user profile respectively;
■ a comparator adapted to receive said 'cost of single CRUD operation value' for said atomic user profile and said mixed user profile respectively and compare said 'cost of single CRUD operation value' for said atomic user profile with said 'cost of single CRUD operation value' for said mixed user profile and generate a feedback value;
■ verification means adapted to verify if said feedback value is greater than a pre-determined value and provide a verified feedback value and further adapted to select an 'actual cost per CRUD operation value' based on said verified feedback value;
• a processing unit having:
■ receiver means adapted to receive a target application specific information and said 'actual cost per CRUD operation value' from said benchmarking unit;
■ collation means adapted to collate said target application specific information and provide collated information;
■ first computational means adapted to process said collated information in a pre-determined manner to generate results in the form of a 'computed peak cost of business operations value';
• second computational means adapted to compute the number of nodes based on said 'computed peak cost of business operations value' and an 'average user session length value' obtained from said application specific information and provide a computed number of nodes; and
• display means to display said computed number of nodes required to host the web based application.
2. The system as claimed in claim 1, wherein said first computational means is adapted to compute the 'computed peak cost of business operations value' as the summation of product of a 'number of individual CRUD operation request' and the selected 'actual cost per CRUD operation value' for each of the create, read, update and delete operations and the number of concurrent users accessing the web application.
3. The system as claimed in claim 1, wherein said second computational means is adapted to increase the 'computed peak cost of business operation value' by 25% if Secure Socket Layer is implemented at a node, hosting the web based application.
4. The system as claimed in claim 1, wherein said second computational means is further adapted to derive a 'total node capacity required value in GHz' by dividing a peak Megacycles required value by said 'average user session length value' and still further adapted to compute the number of nodes as the division of the 'total node capacity required value in GHz' with an 'effective node value'.
5. The system as claimed in claim 1, wherein said recording means is adapted to use application frameworks like DotNetNuke for recording atomic and mixed user profiles representing the Create, Read, Update and Delete operations using the information stored in the database of the web based application.
6. A method for estimating the number of nodes required for hosting a web based application, said method comprising the following steps:
a. recording all transactions which can be performed by a sample web based application as atomic and mixed user profiles in terms of create, read, update and delete database operations;
b. storing the recorded atomic and mixed user profiles in a repository;
c. testing the web application's performance for said atomic user profile and said mixed user profile with a step load pattern;
d. generating test results in the form of 'single cost per CRUD operation value' for said atomic user profile and said mixed user profile;
e. comparing said 'single cost per CRUD operation value' for said atomic user profile with said 'single cost per CRUD operation value' for said mixed user profile and generating a feedback value;
f. verifying if the feedback value is greater than a pre-determined value;
g. selecting an 'actual cost per CRUD operation value' determined by either the atomic user profile or the mixed user profile based on the feedback value;
h. receiving a target application specific information;
i. collating said received application specific information;
j. processing said collated information in a pre-determined manner to generate results in the form of 'computed peak cost of business operations value';
k. computing the number of nodes required for hosting the web based application based on the 'computed peak cost of business operations value'; and
l. displaying the computed number of nodes required to host the web based application.
7. The method as claimed in claim 6, wherein the step of testing the web application's performance includes the steps of employing a predetermined load on the test node hosting the web application for a predetermined time window, executing the operations of the stored profiles, determining the node's response in requests/second, and iteratively increasing the time and load by a pre-determined value till a maximum load is achieved.
8. The method as claimed in claim 6, wherein the step of generating test results includes the step of selecting the actual cost of operation value as the value which is achieved as a highest request/second value by the node under test.
9. The method as claimed in claim 6, wherein the step of computing the number of nodes for the target application includes the steps of:
a. receiving the selected actual cost per CRUD operation value;
b. determining 'total Megacycles' required by the node as a summation of product of the 'number of individual CRUD operation request' and the selected cost per CRUD operation value for each of the create, read, update and delete operations and the number of concurrent users;
c. deriving the total node capacity required value in GHz by dividing the peak Megacycles required value by the average user session length;
d. deriving an 'effective node value' as the product of the nominated node speed and nominated utilization divided by 100; and
e. computing the number of nodes as the division of said determined 'total node capacity in MegaCycles' value with said 'effective node value'.