Abstract: Identification of workload depended faults in an Information Technology (IT) system is a challenge because faults caused by disruptive workload patterns show non-deterministic behavior. Systems and methods of the present disclosure facilitate ascertaining firstly whether a failed transaction is due to workload dependent factor. It further enables identifying a subset of Uniform Resource Locators (URLs) that may be causing failure of a particular URL, which can be further useful in reproducing or analyzing causes for faulty behavior. Systems and methods of the present disclosure also enable deriving a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern using only historic Hypertext Transfer Protocol (HTTP) access log entries and the derived WFM can further enable predicting a failure probability for a new workload vector.
, Description:FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
SYSTEMS AND METHODS FOR DETECTION AND PREDICTION OF WORKLOAD DEPENDENT FAULTS
Applicant:
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
The disclosure herein generally relates to analysis of online transaction failures, and particularly to systems and methods for detection and prediction of workload dependent faults.
BACKGROUND
For any Information Technology (IT) based solution, platform and/or system, even after thorough and rigorous system testing during development stage as well as during deployment stage, few defects or bugs, especially mandelbugs, remain undetected. Reasons for these residual bugs could be due to faults in hardware or in software or a combination of both. A mandelbug is a software bug whose behavior appears to be chaotic or even non-deterministic and hard to reproduce in practice. Due to complex propagation mechanism of mandelbugs, problem at one node may cause and show error at some other node. It may be even possible that errors due to mandelbugs appear with a considerable time lag after the occurrence of problems/faults adding one more degree of difficulty in identification or even in detection of a bug. Furthermore, during development stage, often individual applications and/or services are tested independently. But in a live production environment, all applications and services are hosted on the same physical machine and they share common infrastructure. This multi-tenant environment may give rise to different sets of problems which were not detected at the development stage. Moreover, the overall system behavior highly depends upon the workload pattern served by the application. This is primarily because a single workload pattern comprises several service classes and each service class may have different service demands. Also the existence of collinear relations between the service classes adds one more degree of variation in service demands. If there is any change in the workload pattern, in the presence of such bugs, the behavior of the application also may be different.
Accordingly, with regards to instances of online transaction failures, determining a root cause is a difficult task. It could either be a bug associated with a specific Hypertext Transfer Protocol (HTTP) (web) request alone or the result of an undesirable state created by previous HTTP transactions. The latter case, referred to as a workload dependent fault, is relatively difficult to diagnose or even reproduce.
SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
In an aspect, there is provided a processor implemented method comprising: scanning historic Hypertext Transfer Protocol (HTTP) access log entries pertaining to successful transactions and failed transactions to generate a master list (U) of unique Uniform Resource Locators (URLs) associated thereof; and a set of erroneous Uniform Resource Locators (URLs) corresponding to the failed transactions having an erroneous status code beginning with 5; and iteratively performing the following steps for each of the erroneous URLs: identifying a disruptive workload pattern (U?) as a subset of the master list (U) of the unique URLs; and deriving a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U?).
In another aspect, there is provided a system comprising: one or more data storage devices operatively coupled to the one or more processors and configured to store instructions configured for execution by the one or more processors to: scan historic Hypertext Transfer Protocol (HTTP) access log entries pertaining to successful transactions and failed transactions to generate a master list (U) of unique Uniform Resource Locators (URLs) associated thereof; and a set of erroneous Uniform Resource Locators (URLs) corresponding to the failed transactions having an erroneous status code beginning with 5; and iteratively performing the following steps for each of the erroneous URLs: identify a disruptive workload pattern (U?) as a subset of the master list (U) of the unique URLs; and derive a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U?).
In an embodiment of the present disclosure, the one or more hardware processors are further configured to identify the disruptive workload pattern (U?) by: converting the historic HTTP access log entries associated with each of the erroneous URLs and having a semi-structured structure into structured workload patterns in the form of workload tuples, wherein elements of each of the workload tuples include a failure probability (y) and a workload vector (X) representing frequencies (xi) of each of the unique URLs from the master list within a pre-configured time interval (dT); dividing the workload tuples into a training set and a validation set wherein each of the training set and the validation set are characterized by workload vectors corresponding to the failed transactions and the successful transactions; analyzing each of the identified erroneous URLs using the training set and the validation set by iteratively applying a genetic algorithm for a plurality of mathematical models configured to evaluate a fitness function associated thereof, the genetic algorithm comprising: generating a random population of a first set of chromosomes, wherein the chromosomes are subsets of the unique URLs (U) and correspond to a potential disruptive workload pattern; randomly identifying a pre-defined number of pairs from the first set of chromosomes as parent chromosomes; iteratively performing the following steps for a pre-defined number of generations: reproducing offspring chromosomes based on a recombination of the parent chromosomes, wherein the recombination includes crossover and mutation methods; evaluating a fitness measure of the offspring chromosomes in each of the generations and the parent chromosomes in a zeroth generation of the pre-defined number of generations based on each of the plurality of mathematical models using a subset of the workload vectors from the training set by performing one of: deriving a plurality of statistical measures using the validation set; or using a weighted nearest neighbor method using the validation set; selecting, for a next iteration, a second set of chromosomes constituting one or more of the parent chromosomes and one or more of the offspring chromosomes based on the evaluated fitness measure; until the evaluated fitness measure for the validation set is 100% or a maximum number of generations is reached; identifying one chromosome of the second set of chromosomes having a highest fitness measure in a last iteration and an associated mathematical model; and identifying the disruptive workload pattern (U?) corresponding to the identified chromosome.
In an embodiment of the present disclosure, the one or more hardware processors are further configured to derive the Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U?) based on the plurality of statistical measures.
In an embodiment of the present disclosure, the one or more hardware processors are further configured to predict a failure probability y of each of the identified URLs for a new workload vector based on the identified disruptive workload pattern (U?) and the corresponding workload dependent fault model (WFM).
In yet another aspect, there is provided a computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to: scan historic Hypertext Transfer Protocol (HTTP) access log entries pertaining to successful transactions and failed transactions to generate a master list (U) of unique Uniform Resource Locators (URLs) associated thereof; and a set of erroneous Uniform Resource Locators (URLs) corresponding to the failed transactions having an erroneous status code beginning with 5; and iteratively performing the following steps for each of the erroneous URLs: identify a disruptive workload pattern (U?) as a subset of the master list (U) of the unique URLs; and derive a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U?).
In an embodiment of the present disclosure, the computer readable program further causes the computing device to identify the disruptive workload pattern (U?) by: converting the historic HTTP access log entries associated with each of the erroneous URLs and having a semi-structured structure into structured workload patterns in the form of workload tuples, wherein elements of each of the workload tuples include a failure probability (y) and a workload vector (X) representing frequencies (xi) of each of the unique URLs from the master list within a pre-configured time interval (dT); dividing the workload tuples into a training set and a validation set wherein each of the training set and the validation set are characterized by workload vectors corresponding to the failed transactions and the successful transactions; analyzing each of the identified erroneous URLs using the training set and the validation set by iteratively applying a genetic algorithm for a plurality of mathematical models configured to evaluate a fitness function associated thereof, the genetic algorithm comprising: generating a random population of a first set of chromosomes, wherein the chromosomes are subsets of the unique URLs (U) and correspond to a potential disruptive workload pattern; randomly identifying a pre-defined number of pairs from the first set of chromosomes as parent chromosomes; iteratively performing the following steps for a pre-defined number of generations: reproducing offspring chromosomes based on a recombination of the parent chromosomes, wherein the recombination includes crossover and mutation methods; evaluating a fitness measure of the offspring chromosomes in each of the generations and the parent chromosomes in a zeroth generation of the pre-defined number of generations based on each of the plurality of mathematical models using a subset of the workload vectors from the training set by performing one of: deriving a plurality of statistical measures using the validation set; or using a weighted nearest neighbor method using the validation set; selecting, for a next iteration, a second set of chromosomes constituting one or more of the parent chromosomes and one or more of the offspring chromosomes based on the evaluated fitness measure; until the evaluated fitness measure for the validation set is 100% or a maximum number of generations is reached; identifying one chromosome of the second set of chromosomes having a highest fitness measure in a last iteration and an associated mathematical model; and identifying the disruptive workload pattern (U?) corresponding to the identified chromosome.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the embodiments of the present disclosure, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles.
FIG.1 illustrates an exemplary block diagram of a system for detection and prediction of workload dependent faults, in accordance with an embodiment of the present disclosure.
FIG.2 is an exemplary high level flow chart illustrating a computer implemented method for detection and prediction of workload dependent faults, in accordance with an embodiment of the present disclosure.
FIG.3 is an exemplary flow diagram illustrating the computer implemented method of FIG.2, in accordance with an embodiment of the present disclosure.
FIG.4 is an exemplary illustration of derivation of workload vectors, in accordance with an embodiment of the present disclosure.
It should be appreciated by those skilled in the art that any block diagram herein represent conceptual views of illustrative systems embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudo code, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computing device or processor, whether or not such computing device or processor is explicitly shown.
DETAILED DESCRIPTION
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope and spirit being indicated by the following claims.
A perceived failure in an Information Technology (IT) system could be either due to implementation dependent factors (faulty code, incorrect input, unhandled exceptions, etc.), due to environment dependent factors (wrong configuration, temporary unavailability of third party services etc.) or could be due to workload dependent factors (e.g. disruptive workload pattern). Failures caused by implementation dependent factors and environment related factors are relatively easy to identify and may be easy to fix due to their monotonous and deterministic behavior. As against this, faults arising due to workload dependent factors like a disruptive workload pattern, show non-deterministic behavior. Traditionally, the problem of workload dependent fault was analyzed based on concurrency related aspects. Online transaction failures may be associated with a specific Hypertext Transfer Protocol (HTTP) request alone or the result of an undesirable state created by previous HTTP transactions. The later scenario being workload dependent faults are addressed by the present disclosure. Systems and methods of the present disclosure enable identifying such disruptive workload pattern or detecting a specific set of HTTP transactions (or URLs) causing an internal error on the access of a specific URL using only web server logs as input. Particularly, the present disclosure enables identifying a disruptive workload pattern, deriving workload dependent fault model corresponding to the identified disruptive workload pattern and predicting occurrence of a failure for a new or unknown workload pattern.
In the context of the present disclosure, the expression “workload” refers to frequency of each unique functionality (URL) served by a server over a given period of time.
Referring now to the drawings, and more particularly to FIGS. 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and method.
FIG.1 illustrates an exemplary block diagram of a system 100 for detection and prediction of workload dependent faults, in accordance with an embodiment of the present disclosure. In an embodiment, the system 100 includes one or more processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the one or more processors 104. The one or more processors 104 that are hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, graphics controllers, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) are configured to fetch and execute computer-readable instructions stored in the memory. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, one or more modules (not shown) of the system 100 can be stored in the memory 102.
In an embodiment, the system 100 comprises one or more data storage devices or memory 102 operatively coupled to the one or more processors 104 and is configured to store instructions configured for execution of steps of the method 200 by the one or more processors 104.
FIG.2 is an exemplary high level flow chart and FIG.3 is an exemplary flow diagram illustrating a computer implemented method for detection and prediction of workload dependent faults, in accordance with an embodiment of the present disclosure. The steps of the method 200 of FIG.3 will now be explained in detail with reference to the components of the system 100 of FIG.1 and the high level flow chart of FIG.2. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
Any Hypertext Transfer Protocol (HTTP) request, processed by a server, gets logged in the HTTP access log. By default, access logs are recorded in a common log format (CLF). CLF collects forward IP address, timestamp of the request, HTTP request method, requested resource, Uniform Resource Locator (URL), response status code and response size in bytes. The response status code is of particular importance to the present disclosure. The first digit of the status code specifies one of following standard classes of responses: the request resulted in a successful response (codes beginning in 2), a redirection (codes beginning in 3), an error caused by the client (codes beginning in 4), or an error in the server (codes beginning in 5). In accordance with the present disclosure, primary focus is on the HTTP access log entries having a status code beginning with 5.
Thus, in accordance with an embodiment of the present disclosure, the one or more processors 104 are configured to firstly scan, at step 202, historic Hypertext Transfer Protocol (HTTP) access log entries pertaining to successful transactions and failed transactions. As illustrated in FIG.2, as part of a pre-preprocessing exercise, semi-structured HTTP access log entries (Input A) are converted into structured workload patterns. Further, using a genetic algorithm (GA), a specific set of URLs (Output A) or a disruptive workload pattern is identified based on which a Workload dependent Fault Model (WFM) may be derived which may be further employed for performing a what-if analysis on a futuristic workload pattern (Input B) for predicting whether it may be disruptive or not (Output B).
Once the historic HTTP access log entries are scanned, a master list (U) of unique URLs is generated at step 202a. From the master list (U), a set of erroneous Uniform Resource Locators (URLs) corresponding to the failed transactions having an erroneous status code beginning with 5 is also generated at step 202b.
In accordance with an embodiment of the present disclosure, the one or more processors 104 are configured to iteratively perform, at step 204, a plurality of steps for each of the erroneous URLs identified at step 202b. The plurality of steps include identifying a disruptive workload pattern (U^') as a subset of the master list (U) of the unique URLs at step 204a and deriving a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U^') at step 204b. The steps 204a and 204b will now be explained in detail hereinafter. It may be noted that for ease of explanation, the entire method may be described with reference to one erroneous URL. However, the described method needs to be iteratively repeated and the Workload dependent Fault Model (WFM) is derived for each of the erroneous URLs independently.
In accordance with an embodiment of the present disclosure, the step 204a of identifying the disruptive workload pattern (U^') comprises firstly converting the historic HTTP access log entries associated with each of the erroneous URLs and having a semi-structured structure into structured workload patterns in the form of workload tuples. Accordingly, after selecting one particular URL from the set of erroneous URLs, a workload tuple is calculated for each occurrence (successful as well as failed classes corresponding to successful and failed transactions) of this particular URL. Elements of each of the workload tuples include a failure probability (y) and a workload vector (X) representing frequencies (xi) of each of the unique URLs from the master list within a pre-configured time interval (dT) before the occurrence of a particular instance (occurrence) under consideration. The time interval (dT) maybe pre-configured based on empirical observations.
FIG.4 is an exemplary illustration of derivation of workload vectors, in accordance with an embodiment of the present disclosure. In the exemplary illustration there are four unique URLs (Page1.jsp, Page2.jsp, Counter1.jpg, Logo2.gif) and only one URL (Page2.jsp) is logged with an erroneous status code of 500. So the focus of the analysis is on Page2.jsp. Three workload tuples are calculated corresponding to the three occurrences of Page2.jsp. The length of each workload vector is 5 considering count of successful execution of Page2.jsp, count of erroneous execution of Page2.jsp and the frequency counts of the remaining 3 URLs. In the illustrated example, the time interval (dT) is considered 1 second. After identifying the workload vectors for all instances of the selected URL from the historic HTTP access logs, by comparing workload vectors of successful transactions with the workload vectors of failed transactions, a Workload dependent Fault Model (WFM) is derived for predicting a failure probability y of the selected URL for an unknown or new workload vector.
The computed workload tuples are divided into a training set and a validation set wherein each of the training set and the validation set are characterized by workload vectors corresponding to both the failed transactions and the successful transactions. Each of the identified erroneous URLs are then analyzed using the training set and the validation set by iteratively applying a genetic algorithm (GA) for a plurality of mathematical models configured to evaluate a fitness function associated thereof. It may be noted that for a single problem, using a plurality of mathematical models may not produce a better result than when a single mathematical model is used. However, when evaluated across many problems, using a plurality of mathematical models may typically produce much better results, on an average, than a single model in isolation. In accordance with the present disclosure, the mathematical models are derived only once and can be recalibrated for any new set of workload vectors.
In accordance with an embodiment of the present disclosure, the plurality of mathematical models may include one or more regression based models such as Partial Least Square (PLS), Multiple Linear Regression (MLR) and Logistic Regression model (LRM) and projection based models such as Principal Component Model (PCM). PLS and MLR are linear models in which the failure probability y is expressed as:
y=ß_0+ß_1 x_1+?…+ß_n x_n, wherein ß = (ß0, ß1,….., ßn) is a (n +1) dimensional parameter vector representing regression coefficients and xn represents frequencies of each unique URLs calculated while deriving workload vectors (X).
LRM estimates a failure probability y as:
y=1/(1+e^(-(ß_0+ß_1 x_1+?…+ß_n x_n)) )
In the Project based approach, samples of workload tuples are projected into an n-dimensional space and a cluster of two independent classes is identified, wherein the classes are successful transactions (y = 0) and failed transactions (y = 1). But instead of using the n-dimensional space, a first few principal components (typically 2 or 3) are calculated and workload tuples are projected onto the principal component space to preserve relationships among variables in a lower dimensionality space. To predict the failure probability y, the workload tuples may be projected onto this principal component space and proximity to one of pre-defined clusters may be estimated. For mathematic representation, let K denote the number of workload tuples belonging to failed transactions (y = 1). For calculating the failure probability y of an unknown workload vector, K nearest workload tuples of a known group membership (class of transaction) in the principal component space are identified. Then the failure probability y of the unknown sample may be calculated as: ?_(i=1)^K¦y_i/K.
It may be noted that in either cases of regression based approaches or projection based approach, if the entire set of workload vectors is considered to derive the mathematical model, there may be excessive amount of noise leading to an inaccurate model. Hence, only a disruptive workload pattern (U^' | U^'?U) is considered to reduce likelihood of chance classification and thereby improve the accuracy as well as efficiency of the derived WFM. In accordance with the present disclosure, the problem of identifying the disruptive workload pattern (U^') as a subset of the master list (U) of the unique URLs is viewed as an optimization problem and the genetic algorithm is applied to identify the disruptive workload pattern (U^') for a selected erroneous URL.
In accordance with an embodiment of the present disclosure, the genetic algorithm comprises firstly generating a random population of a first set of chromosomes, wherein the chromosomes are subsets of the unique URLs (U) and correspond to a potential disruptive workload pattern. A pre-defined number of pairs from the first set of chromosomes are randomly identified as parent chromosomes. A new generation of offspring chromosomes is derived by manipulating the parent chromosomes by mathematical operators that model biological processes of reproduction such as recombination methods of crossover and mutation. Fit solutions based on evaluated fitness measures of the parents in the zeroth generation and offspring chromosomes in each of the generations are retained for a next generation and entire process is repeated for a pre-defined number of generations or until the evaluated fitness measure for the validation set is 100%.
In accordance with an embodiment of the present disclosure, the fitness measure may be evaluated based on each of the plurality of mathematical models using a subset of the workload vectors from the training set. For evaluating the fitness measure, a plurality of statistical measures corresponding to the regression based models may be derived using the validation set or a weighted nearest neighbor method may be used on the validation set corresponding to the projection based models. For calculating fitness of an individual chromosome, in an embodiment, firstly one of the three regression based models is derived using a subset Y_T | X_TM^' of the workload tuples from the training set, wherein YT represents a vector containing a group membership of the training set and X_TM^' represents a corresponding workload matrix containing frequency counts of only the URLs which are encoded by the particular chromosome under consideration (a subset).
The fitness of a particular chromosome may be calculated as
Fitness (Accuracy)= (TP+TN)/(TP+FP+FN+TN)
wherein, TP ? True Positive, TN ? True Negative, FP ? False Positive, FN ? False Negative.
Since failure transactions are typically far less compared to successful transactions, the training set tends to be imbalanced. Accordingly, TP << TN and fitness may be biased with higher values of TN. Hence a plurality of statistical measures such as sensitivity (S), precision (P), F1 score (F) and arithmetic mean of sensitivity and precision (M) are used as a measure of fitness corresponding to the regression in accordance with an embodiment of the present disclosure.
Sensitive or True Positive Ratio (TPR) = TP / (TP+FN)
Precision or Positive Predictive Value (PPV) = TP / (TP+FP)
F1 score = (2*TP) / (2*TP+FP+FN)
For PCM, first few principal components (typically two or three) are extracted from the workload matrix X_TM^'. The fitness of an individual chromosome depends on the quality of the cluster generated in the principal component space. Smaller the distance between any two workload tuples of the same class compared to the distance between any two workload tuples of the opposite classes, better is the cluster quality. To assess and quantify the cluster quality for each chromosome, identify K nearest neighbors of each workload tuple in the principal component space, wherein K denotes the number of workload tuples belonging to failed transactions (y = 1). Then, from the K neighbors, the number of workload tuples ki having the same class membership as that of workload tuple i are identified.
The fitness of an individual chromosome may be calculated as follows.
Fitness= ?_¦(For all data points having @y=0 (success))¦?k_i/K+ ?_¦(For all data points having @y= 1 (failure))¦k_i/K?
As explained herein above, this expression maybe biased as the failure transactions are typically far less compared to successful transactions. To remove this bias, let W0 and W1 denote weights for the successful transactions and the failed transactions respectively. Then in accordance with an embodiment of the present disclosure, the fitness may be calculated as:
Fitness= ?_success¦?W_0 k_i/K?+?_failure¦?W_1 k_i/K?
In accordance with the present disclosure, a second set of chromosomes constituting one or more of the parent chromosomes and one or more of the offspring chromosomes based on the evaluated fitness measure are selected for a next iteration. Once the iterative evaluation of fitness measures is completed, one chromosome from the second set of chromosomes having a highest fitness measure in a last iteration and an associated mathematical model are identified thereby identifying the disruptive workload pattern (U^') corresponding to the identified chromosome.
In an embodiment, the first set and the second set of chromosomes comprise n number of chromosomes and the pre-defined number of pairs from the first set of chromosomes as parent chromosomes is n/2.
In accordance with an embodiment of the present disclosure, the step 204b of deriving a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U^') is based on the plurality of statistical measures.
In accordance with an embodiment of the present disclosure, the one or more processors 104 are configured to predict, at step 206, a failure probability y of each of the identified URLs for a new workload vector based on the identified disruptive workload pattern (U^') and the corresponding workload dependent fault model.
EXPERIMENTAL ANALYSIS
The systems and methods of the present disclosure were validated using HTTP access logs obtained from five diverse live projects. Altogether there were about 15 different URLs across the five logs which were executed successfully most of the time but occasionally they were logged with an erroneous status code of 500. Table 1 below enlists details about the 15 different URLs considered for the experimental analysis.
Table 1:
Source URL # of Success # of Failure Training Set Validation Set Prediction Set
Log 1 URL 1 1,309 57 341
(14:327) 341
(14:327) 684
(29:655)
URL 2 2,086 57 535
(14:521) 535
(14:521) 1073
(29:1044)
URL 3 9,502 43 2010
(10:2000) 1010
(10:1000) 6525
(23:6502)
Log 2 URL 4 35,006 13 2003
(3:2000) 1003
(3:1000) 32013
(7:32006)
URL 5 36,988 48 2012
(12:2000) 1012
(12:1000) 34012
(24:33988)
Log 3 URL 6 16,617 898 2224
(224:2000) 1224
(224:1000) 14067
(450:13617)
URL 7 43,241 249 2062
(62:2000) 1062
(62:1000) 40366
(125:40241)
Log 4 URL 8 3,507 156 915
(39:876) 915
(39:876) 1833
(78:1755)
URL 9 23,614 232 2058
(58:2000) 1058
(58:1000) 20730
(116:20614)
URL 10 8,611 923 2230
(230:2000) 1230
(230:1000) 6074
(463:5611)
URL 11 9,162 152 2038
(38:2000) 1038
(38:1000) 6238
(76:6162)
URL 12 2,323 32 588
(8:580) 588
(8:580) 1179
(16:1163)
Log 5 URL 13 6,796 872 1917
(218:1699) 1218
(218:1000) 4533
(436:4097)
URL 14 18,794 106 2026
(26:2000) 1026
(26:1000) 15848
(54:15794)
URL 15 13,285 244 2061
(61:2000) 1061
(61:1000) 10407
(122:10285)
For performing workload dependent fault analysis, each URL was analyzed independently. In accordance with the present disclosure, workload vectors for each occurrence of the URL under investigation were to be derived from the historical HTTP access log. For example, for URL1, 1366 workload vectors having a length of 654 (unique URLs) were derived by considering dT value of 15 seconds. Then these workload vectors were divided into training set, validation set and prediction set. Table 1 above provides details of these divisions as well as their composition within a set. First number in each column represents the total number of workload vectors for this set. First number in the bracket denotes the number of workload vectors pertaining to the failed transactions whereas the second number inside the bracket denotes the number of workload vectors pertaining to the successful transactions.
Each of the erroneous URLs was analyzed independently using the plurality mathematical models. When applying the genetic algorithm explained above, the regression based models such as PLS, MLR and LRM were derived turn by turn using the four statistical measures (sensitivity (S), precision (P), F1 score (F) and arithmetic mean of sensitivity and precision (M)) as an objective function for fitness evaluation. The Projection based model, PCM, was analyzed using K nearest neighbor approach as explained above. For instance, in the analysis of URL#1, PLS model with F1 score as a fitness measure was selected first. The genetic algorithm was then applied using 341 training workload vectors having length 654 (total unique URLs), at each generation, for each chromosome (a unique URL subset). The PLS model was built using the URL subset encoded by the chromosome. Then using the built PLS model and the 341 workload vectors from the validation set, group membership of each workload vector (0 or 1) in the validation set was predicted. By comparing the derived group memberships against the known group memberships, the fitness of the chromosome is evaluated as the overall F1 score of the PLS model. The same process was repeated 13 times using the different models and different objective functions each time. At the end of each run of for a specific URL, the genetic algorithm identified a subset of URLs which could be responsible for causing failure for that particular URL (status code 5XX). For instance, for URL#1, the applied genetic algorithm using PLS and F1 score as an evaluation measure, identified a subset of 7 URLs out of the 654 unique URLs. After identifying the best subset of URLs and the corresponding WFM in each case, a final validation was done using the prediction set. Table 2 below summarizes the best outcome of each combination.
Table 2:
URL # Model Fitness Objective Sensitivity Precision F1 Score Accuracy
URL 1 PLS F 100.0% 100.0% 100.0% 100.00%
MLR X 100.0% 100.0% 100.0% 100.00%
LRM F 100.0% 82.9% 90.6% 99.12%
PCM 100.0% 90.6% 95.1% 99.56%
URL 2 PLS F 100.0% 100.0% 100.0% 100.00%
MLR F 100.0% 100.0% 100.0% 100.00%
LRM P 100.0% 76.3% 86.6% 99.16%
PCM 100.0% 100.0% 100.0% 100.00%
URL 3 PLS F 100.0% 100.0% 100.0% 100.00%
MLR F 100.0% 100.0% 100.0% 100.00%
LRM F 100.0% 63.9% 78.0% 99.80%
PCM 100.0% 100.0% 100.0% 100.00%
URL 4 PLS F 100.0% 100.0% 100.0% 100.00%
MLR P 100.0% 100.0% 100.0% 100.00%
LRM P 42.9% 5.2% 9.2% 99.82%
PCM 100.0% 100.0% 100.0% 100.00%
URL 5 PLS F 100.0% 100.0% 100.0% 100.00%
MLR F 100.0% 100.0% 100.0% 100.00%
LRM S 100.0% 57.1% 72.7% 99.95%
PCM 100.0% 100.0% 100.0% 100.00%
URL 6 PLS F 92.0% 70.8% 80.0% 98.53%
MLR S 94.7% 64.1% 76.4% 98.13%
LRM F 91.8% 66.6% 77.2% 98.27%
PCM 93.3% 38.4% 54.4% 94.99%
URL 7 PLS F 93.6% 22.8% 36.7% 99.00%
MLR F 93.6% 22.8% 36.7% 99.00%
LRM F 23.2% 28.2% 25.4% 99.58%
PCM 93.6% 24.3% 38.6% 99.08%
URL 8 PLS F 51.3% 87.0% 64.5% 97.60%
MLR F 71.8% 86.2% 78.3% 98.31%
LRM F 78.2% 74.4% 76.3% 97.93%
PCM 78.2% 78.2% 78.2% 98.15%
URL 9 PLS F 0.0% 0.0% 0.0% 99.44%
MLR F 0.0% 0.0% 0.0% 99.44%
LRM F 50.9% 41.3% 45.6% 99.32%
PCM 55.2% 22.9% 32.3% 98.71%
URL 10 PLS F 1.9% 25.0% 3.6% 92.08%
MLR S 4.3% 25.0% 7.4% 91.72%
LRM F 18.6% 51.2% 27.3% 92.44%
PCM 44.3% 37.8% 40.8% 90.19%
URL 11 PLS F 1.3% 25.0% 2.5% 98.75%
MLR F 1.3% 25.0% 2.5% 98.75%
LRM X 2.6% 9.1% 4.1% 98.49%
PCM 0.0% 0.0% 0.0% 98.78%
URL 12 PLS F 0.0% 0.0% 0.0% 98.64%
MLR P 0.0% 0.0% 0.0% 98.64%
LRM X 18.8% 2.2% 3.9% 87.36%
PCM 0.0% 0.0% 0.0% 98.64%
URL 13 PLS F 2.1% 37.5% 3.9% 90.25%
MLR S 2.5% 21.2% 4.5% 89.72%
LRM S 0.9% 33.3% 1.8% 90.29%
PCM 0.0% 0.0% 0.0% 90.38%
URL 14 PLS F 0.0% 0.0% 0.0% 99.66%
MLR P 1.9% 12.5% 3.2% 99.62%
LRM F 0.0% 0.0% 0.0% 99.66%
PCM 0.0% 0.0% 0.0% 99.66%
URL 15 PLS S 5.7% 14.0% 8.1% 98.48%
MLR S 4.1% 17.9% 6.7% 98.65%
LRM F 2.5% 15.8% 4.3% 98.70%
PCM 5.7% 2.7% 3.6% 96.44%
From Table 2 above, following inferences were drawn:
For the first five URLs, at least one model was able to give sensitivity, precision, F1 score and thereby accuracy of 100%. This clearly concludes that for these five URLs, faults were workload dependent and using these models occurrence of workload dependent failure for any unseen workload pattern accurately can be predicted.
For the next set of five URLs, the analysis was partially successful in identifying workload dependent faults. In each case at least one mathematical model was able to generate either sensitivity or precision greater than 50%.
For the last five URLs, none of the mathematical models were able to predict accurately, indicating the faults were caused by non-workload related factors.
Table 3 below summarizes the analyses for the assessment of efficiency of the four selected models and the fitness objective functions.
Table 3a: Sensitivity or TPR
URL PCM PLS MLR LMR BoM
S P F X S P F X S P F X
URL#1 100.0% 79.3% 100.0% 100.0% 79.3% 79.3% 20.7% 89.7% 100.0% 100.0% 0.0% 100.0% 0.0% 100.0%
URL#2 100.0% 100.0% 93.1% 100.0% 96.6% 100.0% 82.8% 100.0% 96.6% 100.0% 100.0% 100.0% 0.0% 100.0%
URL#3 100.0% 100.0% 69.6% 100.0% 100.0% 100.0% 73.9% 100.0% 100.0% 100.0% 0.0% 100.0% 100.0% 100.0%
URL#4 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 28.6% 42.9% 0.0% 0.0% 100.0%
URL#5 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 0.0% 0.0% 0.0% 100.0%
URL#6 93.3% 90.7% 34.7% 92.0% 91.3% 94.7% 11.1% 94.4% 93.8% 28.9% 0.0% 91.8% 57.1% 94.7%
URL#7 93.6% 93.6% 12.8% 93.6% 56.0% 93.6% 18.4% 93.6% 88.8% 7.2% 0.0% 23.2% 4.8% 93.6%
URL#8 78.2% 51.3% 15.4% 51.3% 30.8% 51.3% 51.3% 71.8% 51.3% 74.4% 65.4% 78.2% 74.4% 78.2%
URL#9 55.2% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 50.9% 0.0% 55.2%
URL#10 44.3% 1.7% 0.9% 1.9% 1.7% 4.3% 0.4% 3.5% 1.3% 2.4% 0.0% 18.6% 0.0% 44.3%
URL#11 0.0% 1.3% 0.0% 1.3% 0.0% 1.3% 1.3% 1.3% 1.3% 0.0% 0.0% 0.0% 2.6% 2.6%
URL#12 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 18.8% 18.8%
URL#13 0.0% 1.6% 0.2% 2.1% 1.2% 2.5% 0.7% 2.1% 1.2% 0.9% 0.0% 0.2% 0.0% 2.5%
URL#14 0.0% 0.0% 0.0% 0.0% 0.0% 1.9% 1.9% 1.9% 1.9% 0.0% 0.0% 0.0% 0.0% 1.9%
URL#15 5.7% 5.7% 3.3% 4.9% 3.3% 4.1% 4.1% 3.3% 4.1% 0.0% 0.0% 2.5% 1.6% 5.7%
Average 58.0% 48.4% 35.3% 49.8% 44.0% 48.9% 31.1% 50.8% 49.3% 36.2% 13.9% 37.7% 17.3% 59.8%
In a Table 3b: Precision or PPV
URL PCM PLS MLR LMR BoM
S P F X S P F X S P F X
URL#1 90.6% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 38.7% 0.0% 82.9% 0.0% 100.0%
URL#2 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 59.2% 76.3% 60.4% 0.0% 100.0%
URL#3 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 53.5% 0.0% 63.9% 60.5% 100.0%
URL#4 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 87.5% 100.0% 100.0% 5.2% 0.0% 0.0% 100.0%
URL#5 100.0% 100.0% 96.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 57.1% 0.0% 0.0% 0.0% 100.0%
URL#6 38.4% 76.8% 66.4% 70.8% 71.0% 64.1% 73.5% 72.2% 71.9% 69.9% 0.0% 66.6% 28.2% 76.8%
URL#7 24.3% 22.8% 45.7% 22.8% 41.7% 22.8% 67.7% 22.8% 37.5% 90.0% 0.0% 28.2% 60.0% 90.0%
URL#8 78.2% 87.0% 85.7% 87.0% 100.0% 87.0% 87.0% 86.2% 87.0% 84.1% 86.4% 74.4% 100.0% 100.0%
URL#9 22.9% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 41.3% 0.0% 41.3%
URL#10 37.8% 22.2% 50.0% 25.0% 27.6% 25.0% 25.0% 28.6% 40.0% 55.0% 0.0% 51.2% 0.0% 55.0%
URL#11 0.0% 25.0% 0.0% 25.0% 0.0% 25.0% 25.0% 25.0% 25.0% 0.0% 0.0% 0.0% 9.1% 25.0%
URL#12 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 2.2% 2.2%
URL#13 0.0% 14.9% 14.3% 37.5% 27.8% 21.2% 100.0% 22.0% 41.7% 33.3% 0.0% 33.3% 0.0% 100.0%
URL#14 0.0% 0.0% 0.0% 0.0% 0.0% 3.2% 12.5% 3.2% 3.2% 0.0% 0.0% 0.0% 0.0% 12.5%
URL#15 2.7% 14.0% 100.0% 50.0% 40.0% 17.9% 11.4% 100.0% 11.4% 0.0% 0.0% 15.8% 0.2% 100.0%
Average 46.3% 50.8% 57.2% 54.5% 53.9% 51.1% 60.1% 56.5% 54.5% 42.7% 11.2% 34.5% 17.3% 73.5%
Table 3c: F1 score
URL PCM PLS MLR LMR BoM
S P F X S P F X S P F X
URL#1 95.1% 88.5% 100.0% 100.0% 88.5% 88.5% 34.3% 94.6% 100.0% 55.8% 0.0% 90.6% 0.0% 100.0%
URL#2 100.0% 100.0% 96.4% 100.0% 98.3% 100.0% 90.6% 100.0% 98.3% 74.4% 86.6% 75.3% 0.0% 100.0%
URL#3 100.0% 100.0% 82.1% 100.0% 100.0% 100.0% 85.0% 100.0% 100.0% 69.7% 0.0% 78.0% 75.4% 100.0%
URL#4 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 93.3% 100.0% 44.4% 9.2% 0.0% 0.0% 100.0%
URL#5 100.0% 100.0% 98.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 72.7% 0.0% 0.0% 0.0% 100.0%
URL#6 54.4% 83.2% 45.6% 80.0% 79.9% 76.4% 19.3% 81.8% 81.4% 40.9% 0.0% 77.2% 37.7% 83.2%
URL#7 38.6% 36.7% 20.0% 36.7% 47.8% 36.7% 28.9% 36.7% 52.7% 13.3% 0.0% 25.4% 8.9% 52.7%
URL#8 78.2% 64.5% 26.1% 64.5% 47.1% 64.5% 64.5% 78.3% 64.5% 78.9% 74.5% 76.3% 85.3% 85.3%
URL#9 32.3% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 45.6% 0.0% 45.6%
URL#10 40.8% 3.2% 1.7% 3.6% 3.3% 7.4% 0.9% 6.2% 2.5% 4.6% 0.0% 27.3% 0.0% 40.8%
URL#11 0.0% 2.5% 0.0% 2.5% 0.0% 2.5% 2.5% 2.5% 2.5% 0.0% 0.0% 0.0% 4.1% 4.1%
URL#12 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 3.9% 3.9%
URL#13 0.0% 2.9% 0.5% 3.9% 2.2% 4.5% 1.4% 3.8% 2.2% 1.8% 0.0% 0.5% 0.0% 4.5%
URL#14 0.0% 0.0% 0.0% 0.0% 0.0% 2.4% 3.2% 2.4% 2.4% 0.0% 0.0% 0.0% 0.0% 3.2%
URL#15 3.6% 8.1% 6.4% 9.0% 6.1% 6.7% 6.0% 6.4% 6.0% 0.0% 0.0% 4.3% 0.4% 9.0%
Average 49.5% 46.0% 38.4% 46.7% 44.9% 46.0% 35.8% 47.1% 47.5% 30.4% 11.4% 33.4% 14.4% 55.5%
Table 3d: Accuracy
URL PCM PLS MLR LMR BoM
S P F X S P F X S P F X
URL#1 99.6% 99.1% 100.0% 100.0% 99.1% 99.1% 96.6% 99.6% 100.0% 93.3% 95.8% 99.1% 95.8% 100.0%
URL#2 100.0% 100.0% 99.8% 100.0% 99.9% 100.0% 99.5% 100.0% 99.9% 98.1% 99.2% 98.2% 97.3% 100.0%
URL#3 100.0% 100.0% 99.9% 100.0% 100.0% 100.0% 99.9% 100.0% 100.0% 99.7% 99.7% 99.8% 99.8% 100.0%
URL#4 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 99.8% 100.0% 99.9% 100.0%
URL#5 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 100.0% 99.9% 99.9% 99.9% 100.0%
URL#6 95.0% 98.8% 97.4% 98.5% 98.5% 98.1% 97.0% 98.7% 98.6% 97.3% 96.8% 98.3% 94.0% 98.8%
URL#7 99.1% 99.0% 99.7% 99.0% 99.6% 99.0% 99.7% 99.0% 99.5% 99.7% 99.7% 99.6% 99.7% 99.7%
URL#8 98.2% 97.6% 96.3% 97.6% 97.1% 97.6% 97.6% 98.3% 97.6% 98.3% 98.1% 97.9% 98.9% 98.9%
URL#9 98.7% 99.4% 99.4% 99.4% 99.4% 99.4% 99.4% 99.4% 99.4% 99.4% 99.4% 99.3% 99.4% 99.4%
URL#10 90.2% 92.1% 92.4% 92.1% 92.2% 91.7% 92.3% 92.0% 92.3% 92.4% 92.4% 92.4% 92.3% 92.4%
URL#11 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.8% 98.5% 98.8%
URL#12 98.6% 98.6% 98.6% 98.6% 98.6% 98.6% 98.6% 98.6% 98.6% 98.6% 98.6% 96.4% 87.4% 98.6%
URL#13 90.4% 89.7% 90.3% 90.3% 90.2% 89.7% 90.5% 89.9% 90.3% 90.3% 90.4% 90.4% 90.3% 90.5%
URL#14 99.7% 99.7% 99.7% 99.7% 99.7% 99.5% 99.6% 99.5% 99.5% 99.7% 99.7% 99.7% 99.7% 99.7%
URL#15 96.4% 98.5% 98.9% 98.8% 98.8% 98.7% 98.5% 98.9% 98.5% 98.8% 98.8% 98.7% 89.2% 98.9%
Average 97.6% 98.1% 98.1% 98.2% 98.1% 98.0% 97.9% 98.2% 98.2% 97.6% 97.8% 97.9% 96.1% 98.4%
By focusing on the experiments wherein faults were workload dependent (first five URLs), it may be noted that Tables 3a and 3b demonstrate multiple approaches with equivalent performances, whereas Tables 3c and 3d illustrate that PLS regression model and F1 score as a fitness objective function for the genetic algorithm perform better. When all the 15 experiments are considered, the projection based approach PCM yields better sensitivity (Table 3a) and F1 score (Table 3c); MRL with precision as fitness objection function yields better precision; whereas MLR with mean of sensitivity and precision yields marginally better accuracy. It may also be noted that success rate for LRM is lower compared to any other model. This may be due to the fact that LRM is an iterative algorithm and took relatively long to converge. Therefore, the value of maximum iterations was reduced to 15 to get a faster output. Within 15 iterations, the model may not have converged to an optimal value. Furthermore, it may also be noted that on an average, performance of a plurality of mathematical models is always better than when a single mathematical model is used.
Traditionally, the problem of workload dependent fault was analyzed only from the concurrency related aspects. In accordance with the present disclosure, systems and methods of the present disclosure facilitate workload dependent fault analysis for a web-based IT system. If any failure is reported in an HTTP access log (status code of 5XX), for a specific URL, then the systems and methods of the present disclosure help in ascertaining whether it is due to workload dependent factor. If it is, then the present disclosure further enables identifying a subset of URLs that may be causing failure of a particular URL. This information could be useful in reproducing or analyzing causes for faulty behavior which can ultimately help in determining proper correction or prevention mechanism. It is imperative to note that the systems and methods of the present disclosure only require HTTP access logs as an input. Although this would be very helpful for any IT system in production, this approach may also be applied at all major phases of software development life cycle.
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software modules located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various modules described herein may be implemented in other modules or combinations of other modules. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being in
Claims:
A processor implemented method (200) comprising:
scanning historic Hypertext Transfer Protocol (HTTP) access log entries pertaining to successful transactions and failed transactions (202) to generate
a master list (U) of unique Uniform Resource Locators (URLs) associated thereof (202a); and
a set of erroneous Uniform Resource Locators (URLs) corresponding to the failed transactions having an erroneous status code beginning with 5 (202b); and
iteratively performing the following steps for each of the erroneous URLs (204):
identifying a disruptive workload pattern (U^') as a subset of the master list (U) of the unique URLs (204a); and
deriving a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U^') (204b).
The processor implemented method of claim 1, wherein the step of identifying the disruptive workload pattern (U^') comprises:
converting the historic HTTP access log entries associated with each of the erroneous URLs and having a semi-structured structure into structured workload patterns in the form of workload tuples, wherein elements of each of the workload tuples include a failure probability (y) and a workload vector (X) representing frequencies (xi) of each of the unique URLs from the master list within a pre-configured time interval (dT);
dividing the workload tuples into a training set and a validation set wherein each of the training set and the validation set are characterized by workload vectors corresponding to the failed transactions and the successful transactions;
analyzing each of the identified erroneous URLs using the training set and the validation set by iteratively applying a genetic algorithm for a plurality of mathematical models configured to evaluate a fitness function associated thereof, the genetic algorithm comprising:
generating a random population of a first set of chromosomes, wherein the chromosomes are subsets of the unique URLs (U) and correspond to a potential disruptive workload pattern;
randomly identifying a pre-defined number of pairs from the first set of chromosomes as parent chromosomes;
iteratively performing the following steps for a pre-defined number of generations:
reproducing offspring chromosomes based on a recombination of the parent chromosomes, wherein the recombination includes crossover and mutation methods;
evaluating a fitness measure of the offspring chromosomes in each of the generations and the parent chromosomes in a zeroth generation of the pre-defined number of generations based on each of the plurality of mathematical models using a subset of the workload vectors from the training set by performing one of:
deriving a plurality of statistical measures using the validation set; or
using a weighted nearest neighbor method on the validation set;
selecting, for a next iteration, a second set of chromosomes constituting one or more of the parent chromosomes and one or more of the offspring chromosomes based on the evaluated fitness measure;
until the evaluated fitness measure for the validation set is 100% or a maximum number of generations is reached;
identifying one chromosome of the second set of chromosomes having a highest fitness measure in a last iteration and an associated mathematical model; and
identifying the disruptive workload pattern (U^') corresponding to the identified chromosome.
The processor implemented method of claim 2, wherein the step of deriving the Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U^') is based on the plurality of statistical measures.
The processor implemented method of claim 2, wherein the first set and the second set of chromosomes have n number of chromosomes and the pre-defined number of pairs from the first set of chromosomes as parent chromosomes is n/2.
The processor implemented method of claim 2, wherein the plurality of mathematical models include one or more of regression based models selected from the group consisting of Partial Least Square (PLS), Multiple Linear Regression (MLR) and Logistic Regression model (LRM) and projection based models being a Principal Component Model (PCM), and wherein the plurality of statistical measures correspond to the regression based models and the weighted nearest neighbor method corresponds to the projection based models.
The processor implemented method of claim 5, wherein the plurality of statistical measures include sensitivity (S), precision (P), F1 score (F) and arithmetic mean of sensitivity and precision (M).
The processor implemented method of claim 3 further comprising predicting a failure probability y of each of the identified URLs for a new workload vector based on the identified disruptive workload pattern (U^') and the corresponding workload dependent fault model (WFM) (206).
A system (100) comprising:
one or more data storage devices (102) operatively coupled to one or more hardware processors (104) and configured to store instructions configured for execution by the one or more hardware processors to:
scan historic Hypertext Transfer Protocol (HTTP) access log entries pertaining to successful transactions and failed transactions to generate
a master list (U) of unique Uniform Resource Locators (URLs) associated thereof; and
a set of erroneous Uniform Resource Locators (URLs) corresponding to the failed transactions having an erroneous status code beginning with 5; and
iteratively performing the following steps for each of the erroneous URLs:
identify a disruptive workload pattern (U^') as a subset of the master list (U) of the unique URLs; and
derive a Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U^').
The system of claim 8, wherein the one or more hardware processors are further configured to identify the disruptive workload pattern (U^') by:
converting the historic HTTP access log entries associated with each of the erroneous URLs and having a semi-structured structure into structured workload patterns in the form of workload tuples, wherein elements of each of the workload tuples include a failure probability (y) and a workload vector (X) representing frequencies (xi) of each of the unique URLs from the master list within a pre-configured time interval (dT);
dividing the workload tuples into a training set and a validation set wherein each of the training set and the validation set are characterized by workload vectors corresponding to the failed transactions and the successful transactions;
analyzing each of the identified erroneous URLs using the training set and the validation set by iteratively applying a genetic algorithm for a plurality of mathematical models configured to evaluate a fitness function associated thereof, the genetic algorithm comprising:
generating a random population of a first set of chromosomes, wherein the chromosomes are subsets of the unique URLs (U) and correspond to a potential disruptive workload pattern;
randomly identifying a pre-defined number of pairs from the first set of chromosomes as parent chromosomes;
iteratively performing the following steps for a pre-defined number of generations:
reproducing offspring chromosomes based on a recombination of the parent chromosomes, wherein the recombination includes crossover and mutation methods;
evaluating a fitness measure of the offspring chromosomes in each of the generations and the parent chromosomes in a zeroth generation of the pre-defined number of generations based on each of the plurality of mathematical models using a subset of the workload vectors from the training set by performing one of:
deriving a plurality of statistical measures using the validation set; or
using a weighted nearest neighbor method using the validation set;
selecting, for a next iteration, a second set of chromosomes constituting one or more of the parent chromosomes and one or more of the offspring chromosomes based on the evaluated fitness measure;
until the evaluated fitness measure for the validation set is 100% or a maximum number of generations is reached;
identifying one chromosome of the second set of chromosomes having a highest fitness measure in a last iteration and an associated mathematical model; and
identifying the disruptive workload pattern (U^') corresponding to the identified chromosome.
The system of claim 9, wherein the one or more hardware processors are further configured to derive the Workload dependent Fault Model (WFM) corresponding to the identified disruptive workload pattern (U^') based on the plurality of statistical measures.
The system of claim 10, wherein the one or more hardware processors are further configured to predict a failure probability y of each of the identified URLs for a new workload vector based on the identified disruptive workload pattern (U^') and the corresponding workload dependent fault model (WFM).
| Section | Controller | Decision Date |
|---|---|---|
| # | Name | Date |
|---|---|---|
| 1 | 201721036175-IntimationOfGrant13-03-2024.pdf | 2024-03-13 |
| 1 | 201721036175-STATEMENT OF UNDERTAKING (FORM 3) [11-10-2017(online)].pdf | 2017-10-11 |
| 2 | 201721036175-REQUEST FOR EXAMINATION (FORM-18) [11-10-2017(online)].pdf | 2017-10-11 |
| 2 | 201721036175-PatentCertificate13-03-2024.pdf | 2024-03-13 |
| 3 | 201721036175-Written submissions and relevant documents [07-03-2024(online)].pdf | 2024-03-07 |
| 3 | 201721036175-FORM 18 [11-10-2017(online)].pdf | 2017-10-11 |
| 4 | 201721036175-FORM 1 [11-10-2017(online)].pdf | 2017-10-11 |
| 4 | 201721036175-Correspondence to notify the Controller [21-02-2024(online)].pdf | 2024-02-21 |
| 5 | 201721036175-FORM-26 [21-02-2024(online)].pdf | 2024-02-21 |
| 6 | 201721036175-US(14)-HearingNotice-(HearingDate-23-02-2024).pdf | 2024-01-11 |
| 6 | 201721036175-DRAWINGS [11-10-2017(online)].pdf | 2017-10-11 |
| 7 | 201721036175-COMPLETE SPECIFICATION [11-10-2017(online)].pdf | 2017-10-11 |
| 7 | 201721036175-CLAIMS [09-09-2020(online)].pdf | 2020-09-09 |
| 8 | 201721036175-FORM-26 [22-11-2017(online)].pdf | 2017-11-22 |
| 8 | 201721036175-COMPLETE SPECIFICATION [09-09-2020(online)].pdf | 2020-09-09 |
| 9 | 201721036175-Proof of Right (MANDATORY) [23-11-2017(online)].pdf | 2017-11-23 |
| 9 | 201721036175-FER_SER_REPLY [09-09-2020(online)].pdf | 2020-09-09 |
| 10 | 201721036175-OTHERS [09-09-2020(online)].pdf | 2020-09-09 |
| 10 | ABSTRACT 1.jpg | 2018-08-11 |
| 11 | 201721036175-FER.pdf | 2020-03-09 |
| 11 | 201721036175-ORIGINAL UNDER RULE 6 (1A)-FORM 1,26-241117.pdf | 2018-08-11 |
| 12 | 201721036175-FER.pdf | 2020-03-09 |
| 12 | 201721036175-ORIGINAL UNDER RULE 6 (1A)-FORM 1,26-241117.pdf | 2018-08-11 |
| 13 | 201721036175-OTHERS [09-09-2020(online)].pdf | 2020-09-09 |
| 13 | ABSTRACT 1.jpg | 2018-08-11 |
| 14 | 201721036175-FER_SER_REPLY [09-09-2020(online)].pdf | 2020-09-09 |
| 14 | 201721036175-Proof of Right (MANDATORY) [23-11-2017(online)].pdf | 2017-11-23 |
| 15 | 201721036175-COMPLETE SPECIFICATION [09-09-2020(online)].pdf | 2020-09-09 |
| 15 | 201721036175-FORM-26 [22-11-2017(online)].pdf | 2017-11-22 |
| 16 | 201721036175-CLAIMS [09-09-2020(online)].pdf | 2020-09-09 |
| 16 | 201721036175-COMPLETE SPECIFICATION [11-10-2017(online)].pdf | 2017-10-11 |
| 17 | 201721036175-DRAWINGS [11-10-2017(online)].pdf | 2017-10-11 |
| 17 | 201721036175-US(14)-HearingNotice-(HearingDate-23-02-2024).pdf | 2024-01-11 |
| 18 | 201721036175-FORM-26 [21-02-2024(online)].pdf | 2024-02-21 |
| 19 | 201721036175-FORM 1 [11-10-2017(online)].pdf | 2017-10-11 |
| 19 | 201721036175-Correspondence to notify the Controller [21-02-2024(online)].pdf | 2024-02-21 |
| 20 | 201721036175-Written submissions and relevant documents [07-03-2024(online)].pdf | 2024-03-07 |
| 20 | 201721036175-FORM 18 [11-10-2017(online)].pdf | 2017-10-11 |
| 21 | 201721036175-REQUEST FOR EXAMINATION (FORM-18) [11-10-2017(online)].pdf | 2017-10-11 |
| 21 | 201721036175-PatentCertificate13-03-2024.pdf | 2024-03-13 |
| 22 | 201721036175-STATEMENT OF UNDERTAKING (FORM 3) [11-10-2017(online)].pdf | 2017-10-11 |
| 22 | 201721036175-IntimationOfGrant13-03-2024.pdf | 2024-03-13 |
| 1 | searchstrategy_27-02-2020.pdf |