Abstract: SYSTEM AND METHOD FOR LARGE LANGUAGE MODEL BASED AUTOMATED TEST INPUT GENERATION FOR WEB APPLICATIONS. Existing techniques for automated generation of test data for testing web applications need detailed requirement documents. The present disclosure receives a plurality of textual documents and extracts context. The extracted context is rephrased by implementing a plurality of rules and by passing the extracted context along with a first set of prompts to a Large Language Model (LLM). A program, a validator and a first set of constraints are generated for the extracted context, and test data is generated by running the generated program. Ranking is assigned to the test data and the test data with the highest ranking is selected. The generated program is statically refined by calling a mathematical library function on the highest ranked test data to generate structural information and modifying the language of the second set of prompts passed to the LLM. The generated program is dynamically refined by passing feedback generated by executing the highest ranked test data on a web application and refining the response obtained. [To be published with FIG. 2]
Description: FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
SYSTEM AND METHOD FOR LARGE LANGUAGE MODEL BASED AUTOMATED TEST INPUT GENERATION FOR WEB APPLICATIONS
Applicant
Tata Consultancy Services Limited
A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
Preamble to the description:
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD
The disclosure herein generally relates to the field of automated test data generation, and, more particularly, to a system and method for large language model based automated test input generation for web applications.
BACKGROUND
Web applications are prevalent and considered the mainstay of information systems for organizations. At the same time, web applications are becoming more complex and costlier to develop and test. Employees, customers, and/or business partners rely on these information systems to accomplish their business processes and tasks. Accordingly, users of these web applications assume that these systems are error-free and reliable. Automation testing is imperative to assure regression testing, off-load repetitive tasks from test engineers, and keep the pace between test engineers and developers.
Automated test data generation is a technique to create different tests in an automated fashion, apply them, and then record and summarize the collected data. Current approaches, including functional test generation and structural test generation, need detailed requirement documents and code respectively, without which these approaches cannot be applied. Further, the only techniques that can be applied with minimum information are random generation and Large Language Model (LLM) based approaches. Random generation may work in trivial cases, but it produces a lot of infeasible data for complex applications. Web applications are often complex with multiple screens, and random generation fails to even reach many internal screens. Current Large Language Model (LLM) based approaches also produce a lot of vague and incorrect data, which makes it very difficult to ensure coverage.
The test data can be created through requirements, code, or similar inputs. Functional test case generation deals with capturing the requirements in formal languages and applying techniques to generate data from these formal notations. Some examples of such techniques include Random Test case Generation (RTG), Model-Based Testing (MBT), and Expressive Decision Tables (EDT) based testing (EBT). However, these techniques cannot work in the absence of good requirement documents, which are rarely available. Structural testing is related to the internal design and implementation of the software. Structural test case generation takes code as input and tries to generate test cases by applying methods such as path coverage or Modified Condition Decision Coverage (MCDC). But it is very difficult to get full access to code and, in such cases, structural test case generation cannot be used. In cases where only the web interface (executable) and some preliminary textual documents are available, none of the above techniques can be applied. For such cases, using Random Test Generation (RTG) or a Large Language Model (LLM) for generating test data is an option. But the results are often incorrect and vague, and ensuring coverage is a challenge. Hence, one cannot rely only on randomness or a Large Language Model (LLM) to generate the test cases from minimum information.
As mentioned above, many web-testing projects lack specification documents and other related documents required for testing. Even if they are available, they are often neither detailed nor up to date. In such cases, it is very difficult to automatically generate test data.
SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a method for large language model based automated test input generation for web applications is provided. The method includes receiving, via one or more hardware processors, a plurality of textual documents and extracting context related to each field comprised in the plurality of textual documents; rephrasing, via the one or more hardware processors, the extracted context by: (i) implementing a plurality of rules to obtain a rephrased context having a meaning identical to the extracted context; and (ii) passing each extracted context along with a first set of prompts to a Large Language Model (LLM) to obtain a set of rephrased contexts having a meaning identical to the extracted context; generating, via one or more hardware processors, a program, a validator and a first set of constraints for each extracted context, the rephrased context and the set of rephrased contexts by passing a second set of prompts to the Large Language Model (LLM); generating, via one or more hardware processors, one or more test data by running the generated program; assigning, via one or more hardware processors, ranking to the one or more test data, wherein the ranking is assigned based on a number of validators which are successfully validated and selecting the one or more test data with highest ranking; statically refining, via one or more hardware processors, the generated program using a static refinement engine by: (i) calling a mathematical library function on the highest ranked one or more test data to generate structural information pertaining to the highest ranked one or more test data for the Large Language Model (LLM); and (ii) modifying language of the second set of prompts passed to the Large Language Model (LLM) based on the structural information generated; executing, via one or more hardware processors, the highest ranked one or more test data on a web application and receiving feedback from the web application; and dynamically refining, via one or more hardware processors, each generated program using a dynamic refinement engine by: (i) passing the feedback to the Large Language Model (LLM) with a third set of prompts, wherein the Large Language Model (LLM) takes content from the feedback and provides: a) a response if there is an error message; b) a field corresponding to the error message; and c) type of a second set of constraints being violated in the error message; and (ii) refining the program for the field corresponding to the error message dynamically based on the error message received from the feedback by comparing the first set of constraints with the second set of constraints using the dynamic refinement engine.
In another aspect, there is provided a system for large language model based automated test input generation for web applications. The system comprises: a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces, wherein the one or more hardware processors are configured by the instructions to: receive a plurality of textual documents and extract context related to each field comprised in the plurality of textual documents; rephrase the extracted context by: (i) implementing a plurality of rules to obtain a rephrased context having a meaning identical to the extracted context; and (ii) passing each extracted context along with a first set of prompts to a Large Language Model (LLM) to obtain a set of rephrased contexts having a meaning identical to the extracted context; generate a program, a validator and a first set of constraints for each extracted context, the rephrased context and the set of rephrased contexts by passing a second set of prompts to the Large Language Model (LLM); generate one or more test data by running the generated program; assign ranking to the one or more test data, wherein the ranking is assigned based on a number of validators which are successfully validated, and select the one or more test data with highest ranking; statically refine the generated program using a static refinement engine by: (i) calling a mathematical library function on the highest ranked one or more test data to generate structural information pertaining to the highest ranked one or more test data for the Large Language Model (LLM); and (ii) modifying language of the second set of prompts passed to the Large Language Model (LLM) based on the structural information generated; execute the highest ranked one or more test data on a web application and receive feedback from the web application; and dynamically refine each generated program using a dynamic refinement engine by: (i) passing the feedback to the Large Language Model (LLM) with a third set of prompts, wherein the Large Language Model (LLM) takes content from the feedback and provides: a) a response if there is an error message; b) a field corresponding to the error message; and c) type of a second set of constraints being violated in the error message; and (ii) refining the program for the field corresponding to the error message dynamically based on the error message received from the feedback by comparing the first set of constraints with the second set of constraints using the dynamic refinement engine.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions which, when executed by one or more hardware processors, cause receiving, via one or more hardware processors, a plurality of textual documents and extracting context related to each field comprised in the plurality of textual documents; rephrasing the extracted context by: (i) implementing a plurality of rules to obtain a rephrased context having a meaning identical to the extracted context; and (ii) passing each extracted context along with a first set of prompts to a Large Language Model (LLM) to obtain a set of rephrased contexts having a meaning identical to the extracted context; generating a program, a validator and a first set of constraints for each extracted context, the rephrased context and the set of rephrased contexts by passing a second set of prompts to the Large Language Model (LLM); generating one or more test data by running the generated program; assigning ranking to the one or more test data, wherein the ranking is assigned based on a number of validators which are successfully validated and selecting the one or more test data with highest ranking; statically refining the generated program using a static refinement engine by: (i) calling a mathematical library function on the highest ranked one or more test data to generate structural information pertaining to the highest ranked one or more test data for the Large Language Model (LLM); and (ii) modifying language of the second set of prompts passed to the Large Language Model (LLM) based on the structural information generated; executing the highest ranked one or more test data on a web application and receiving feedback from the web application; and dynamically refining each generated program using a dynamic refinement engine by: (i) passing the feedback to the Large Language Model (LLM) with a third set of prompts, wherein the Large Language Model (LLM) takes content from the feedback and provides: a) a response if there is an error message; b) a field corresponding to the error message; and c) type of a second set of constraints being violated in the error message; and (ii) refining the program for the field corresponding to the error message dynamically based on the error message received from the feedback by comparing the first set of constraints with the second set of constraints using the dynamic refinement engine.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
FIG. 1 illustrates an exemplary system for large language model based automated test input generation for web applications, according to some embodiments of the present disclosure.
FIG. 2 is a functional block diagram of the system for large language model based automated test input generation for web applications, according to some embodiments of the present disclosure.
FIGS. 3A and 3B are flow diagrams illustrating the steps involved in the method for large language model based automated test input generation for web applications, according to some embodiments of the present disclosure.
FIGS. 4A and 4B are block diagrams illustrating the method for large language model based automated test input generation for web applications, according to some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
The present disclosure provides a system and method for large language model based automated test input generation for web applications. The present disclosure enables Large Language Model (LLM) based automated generation of strings for testing web applications from natural language documents. The system and method of the present disclosure generate a program, a validator and a first set of constraints using the Large Language Model (LLM). One or more test data are generated by running the generated program, and the validator validates the generated one or more test data. Further, the result of the validation by an ensemble of validators is used to rank the generated one or more test data, and the highest ranked one or more test data is selected as the valid test data. The present disclosure performs static refinement to generate structural properties of the highest ranked one or more test data by calling a mathematical library function and modifying the language of the prompts passed to the Large Language Model (LLM) based on the structural information generated. Further, the present disclosure performs dynamic refinement on the generated program by passing feedback generated by executing the highest ranked one or more test data on a web application and refining the response obtained using a dynamic refinement engine.
Referring now to the drawings, and more particularly to FIG. 1 through FIG. 4B, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
FIG. 1 illustrates an exemplary system 100 for large language model based automated test input generation for web applications, according to some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, one or more data storage devices or memory 102 operatively coupled to the one or more processors 104. The one or more processors 104 that are hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, graphics controllers, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) are configured to fetch and execute computer-readable instructions stored in the memory. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
The I/O interface device(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface device(s) can include one or more ports for connecting a number of devices to one another or to another server.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, one or more modules (not shown) of the system 100 can be stored in the memory 102.
FIG. 2, with reference to FIG. 1, illustrates a functional block diagram of the system for large language model based automated test input generation for web applications, according to some embodiments of the present disclosure. In an embodiment, the system 200 includes an input module 202, a context extractor 204, a pre-processor 206, a rephraser 208, a Large Language Model (LLM) 210, a prompt synthesizer 212, a test data selector 214, a static refinement engine 216, a graphical user interface 218, a feedback retriever 220 and a dynamic refinement engine 222.
FIGS. 3A and 3B are flow diagrams illustrating a processor implemented method for large language model based automated test input generation for web applications using the system of FIG. 1, according to some embodiments of the present disclosure. Steps of the method of FIGS. 3A and 3B shall be described in conjunction with the components of FIG. 2. At step 302 of the method 300, the one or more hardware processors 104 receive a plurality of textual documents and extract context related to each field comprised in the plurality of textual documents. The plurality of textual documents can be any natural language documents including user manuals, frequently asked questions (FAQ) documents, user stories, user requirement documents and the like, which are represented by the input module 202. A context is a block of text relevant to the web form elements for which data needs to be generated. The context extractor 204 takes the plurality of textual documents as input and extracts the text related to each field. Further, the extraction of the context can be done manually or through string pattern matching. A Large Language Model (LLM) connector establishes the connection with the given Large Language Model (LLM) 210 based on the given parameters (such as degree of randomness, model, version). Once the connection is established, the Large Language Model (LLM) connector fires the prompt and stores the response received for further processing. The Large Language Model (LLM) connector is internally used by multiple components or modules to fire prompts and get updated prompts, i.e., updated responses.
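By way of illustration only, a minimal sketch of context extraction through string pattern matching is given below; the field name, sentence window, and helper name are assumptions of this sketch and not part of the disclosure.
```python
import re

def extract_context(document: str, field_name: str, window: int = 2) -> str:
    """Return the block of sentences around the first mention of field_name.

    Hypothetical pattern-matching extractor: it splits the document into
    sentences and keeps a window of sentences around the matching sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", document)
    for i, sentence in enumerate(sentences):
        if re.search(re.escape(field_name), sentence, re.IGNORECASE):
            start, end = max(0, i - window), min(len(sentences), i + window + 1)
            return " ".join(sentences[start:end])
    return ""

# Usage: extract the context for the "Password" field from a user manual.
manual = ("The registration page asks for a username and a password. "
          "Password should be alphanumeric. Minimum length of 8 characters "
          "and maximum of 20 characters.")
print(extract_context(manual, "Password"))
```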
At step 304 of the method 300, the one or more hardware processors 104 rephrase the extracted context by:
implementing a plurality of rules to obtain a rephrased context having a meaning identical to the extracted context; and
passing each extracted context along with a first set of prompts to a Large Language Model (LLM) to obtain a set of rephrased contexts having a meaning identical to the extracted context.
The pre-processor 206 implements a plurality of rules to rephrase the context while keeping a meaning identical to the extracted context. The purpose of obtaining the rephrased context is to generate multiple programs and validators to be used for refinement. The rephraser 208 passes each extracted context along with a first set of prompts to the Large Language Model (LLM) 210 to obtain a set of rephrased contexts having a meaning identical to the extracted context. The difference between the pre-processor 206 and the rephraser 208 is that the rephraser 208 uses the Large Language Model (LLM) 210 to rephrase the extracted context, whereas the pre-processor 206 uses rules; an illustrative sketch of the LLM-based rephrasing is provided after the use case below.
Use case for the rephraser 208:
Original context C1: Password should be alphanumeric. Minimum length of 8 characters and maximum of 20 characters. The password should contain at least one special character and at least one alphabet in capital.
Rephrased context C2 by method 1 (implementing a plurality of rules to obtain a rephrased context having a meaning identical to the extracted context): The password must be a combination of letters and numbers, with a minimum length of 8 characters and a maximum length of 20 characters. The password must include at least one special character and at least one uppercase letter.
Rephrased contexts by method 2 (passing each extracted context along with a first set of prompts to the Large Language Model (LLM) 210 to obtain a set of rephrased contexts having a meaning identical to the extracted context):
C3: The password must be a combination of letters and numbers, with a minimum length of 8 characters and a maximum length of 20 characters. The password should include at least one special character and one uppercase letter.
C4: The password must be a combination of letters and numbers, with a minimum length of 8 characters and a maximum length of 20 characters. The password should also include at least one special character and at least one uppercase letter.
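A minimal sketch of how the rephraser 208 could drive the Large Language Model (LLM) 210 with the first set of prompts follows; the prompt wording and the `query_llm` connector callable are assumptions of this sketch, not the disclosed prompts.
```python
from typing import Callable

# Hypothetical wording for the first set of prompts.
REPHRASE_PROMPT = (
    "Rephrase the following requirement so that its meaning stays identical "
    "but the wording differs:\n{context}"
)

def rephrase_with_llm(context: str, query_llm: Callable[[str], str],
                      n_variants: int = 2) -> list[str]:
    """Fire the first set of prompts to obtain rephrased contexts (e.g., C3
    and C4 in the use case above). query_llm stands in for the LLM connector
    and returns the response text for a prompt."""
    return [query_llm(REPHRASE_PROMPT.format(context=context))
            for _ in range(n_variants)]
```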
At step 306 of the method 300, the one or more hardware processors 104 generate a program, a validator and a first set of constraints for each extracted context, the rephrased context and the set of rephrased contexts by passing a second set of prompts to the Large Language Model (LLM) 210. The validator is a function generated from each context, wherein the validator implements a plurality of constraints comprised in it. If any constraint comprised in the validator fails, the validator should return false. The prompt synthesizer 212 is responsible for creating the prompts to be fired. In the first iteration of the loop, the second set of prompts is based on the contexts and generates constraints, programs, and validators from the Large Language Model (LLM) 210. In a subsequent loop iteration, the prompt synthesizer 212 interacts with the static refinement engine 216 or the dynamic refinement engine 222 and receives structural information that can be used to repair an incorrect input, leading to repair of the generated program. Based on the received structural information about the one or more test data, the prompt synthesizer 212 creates prompts to get information from the Large Language Model (LLM) 210 regarding a possible cause for any error which might occur in the generated program and to fix the generated program.
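For illustration, the second set of prompts could be synthesized along the following lines; the template wording and the connector callable are assumptions of this sketch rather than the disclosed prompt format.
```python
from typing import Callable

# Hypothetical wording for the second set of prompts.
SECOND_PROMPT = (
    "The field '{field}' is described as follows:\n{context}\n"
    "1. List the constraints on this field.\n"
    "2. Write a Python program that generates and prints one value "
    "satisfying all the constraints.\n"
    "3. Write a Python validator that prints False if any constraint is "
    "violated and True otherwise."
)

def generate_artifacts(field: str, contexts: list[str],
                       query_llm: Callable[[str], str]) -> list[str]:
    """Fire the second set of prompts once per context (the original, the
    rule-based rephrasing, and the LLM rephrasings), collecting one
    constraints/program/validator response per context."""
    return [query_llm(SECOND_PROMPT.format(field=field, context=c))
            for c in contexts]
```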
Use case for generating constraints, programs, validators, and test data for each context (C1 to C4):
Table 1 depicts the constraints generated for each context (C1 to C4).
| # | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| 1 | Minimum length of 8 characters | Combination of letters and numbers | Combination of letters and numbers | Combination of letters and numbers |
| 2 | Maximum length of 20 characters | Minimum length of 8 characters | Minimum length of 8 characters | Minimum length of 8 characters |
| 3 | Should be alphanumeric | Maximum length of 20 characters | Maximum length of 20 characters | Maximum length of 20 characters |
| 4 | Should contain at least one special character | At least one special character | At least one special character | At least one special character |
| 5 | Should contain at least one alphabet in capital | At least one uppercase letter | At least one uppercase letter | At least one uppercase letter |
Table 1
Table 2 to Table 5 depict the programs generated for each context (C1 to C4).
C1 original program C1P1
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+{}:<>?|[];',./`~"
    password = []
    password.append(random.choice(string.ascii_uppercase))
    password.append(random.choice(string.digits + string.ascii_lowercase + special_chars))
    password += random.choices(string.digits + string.ascii_lowercase + special_chars, k=6)
    random.shuffle(password)
    return ''.join(password)

if __name__ == '__main__':
    password = generate_password()
    print(password)
```
Table 2
C2 original program C2P1
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+-=[]{}|;:,.<>/?"
    password = ""
    while True:
        # Builds 24 characters per attempt: 8 letters, 8 digits, 8 specials.
        for i in range(8):
            password += random.choice(string.ascii_letters)
            password += random.choice(string.digits)
            password += random.choice(special_chars)
        password = ''.join(random.sample(password, len(password)))
        if (any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in special_chars for c in password)):
            break
        password = ""
    return password

if __name__ == "__main__":
    password = generate_password()
    print(password)
```
Table 3
C3 original program C3P1
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+-=[]{}|;:,.<>/?"
    password = ""
    while True:
        password = ''.join(random.choice(string.ascii_letters + string.digits + special_chars) for i in range(random.randint(8, 20)))
        if (any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in special_chars for c in password)):
            break
    return password

def main():
    password = generate_password()
    print(password)

if __name__ == "__main__":
    main()
```
Table 4
C4 original program C4P1
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+"
    password = ""
    while True:
        password = ''.join(random.choice(string.ascii_letters + string.digits + special_chars) for i in range(random.randint(8, 20)))
        if (any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in special_chars for c in password)):
            break
    return password

def main():
    password = generate_password()
    print(password)

if __name__ == "__main__":
    main()
```
Table 5
Table 6 to Table 9 depict the validators generated for each context (C1 to C4).
C1 validator C1V
```python
import re
import sys

def validate_password(password):
    if len(password) < 8 or len(password) > 20:
        return False
    if not re.search("[a-z]", password):
        return False
    if not re.search("[A-Z]", password):
        return False
    if not re.search("[0-9]", password):
        return False
    if not re.search("[_@$]", password):
        return False
    return True

if __name__ == "__main__":
    password = sys.argv[1]
    print(validate_password(password))
```
Table 6
C2 validator C2V
```python
import re
import sys

def validate_password(password):
    if len(password) < 8 or len(password) > 20:
        return False
    if not re.search("[a-z]", password):
        return False
    if not re.search("[A-Z]", password):
        return False
    if not re.search("[0-9]", password):
        return False
    if not re.search("[!@#$%^&*()_+-={};:'\"<>,.?/|\\]", password):
        return False
    return True

if __name__ == "__main__":
    password = sys.argv[1]
    print(validate_password(password))
```
Table 7
C3 validator C3V
```python
import re
import sys

def validate_password(password):
    if len(password) < 8 or len(password) > 20:
        return False
    if not re.search("[a-z]", password):
        return False
    if not re.search("[A-Z]", password):
        return False
    if not re.search("[0-9]", password):
        return False
    if not re.search("[!@#$%^&*()_+-={};:'\"|,.<>/?`~]", password):
        return False
    return True

if __name__ == "__main__":
    password = sys.argv[1]
    print(validate_password(password))
```
Table 8
C4 validator C4V
```python
import re
import sys

def validate_password(password):
    if len(password) < 8 or len(password) > 20:
        return False
    if not re.search("[a-z]", password):
        return False
    if not re.search("[A-Z]", password):
        return False
    if not re.search("[0-9]", password):
        return False
    if not re.search("[!@#$%^&*()_+-={};':\"\\|,.<>?]", password):
        return False
    return True

if __name__ == "__main__":
    password = sys.argv[1]
    print(validate_password(password))
```
Table 9
At step 308 of the method 300, the one or more hardware processors 104 generate one or more test data by running the generated program. In an embodiment, "test data" may be referred to as a "string", and the terms are used interchangeably in the present disclosure. Table 10 depicts the one or more test data generated by running the generated programs.
| Context | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| Test data | >Q.wo{ft | /-y0)Z]a46O_0!48)W-Jc36R | @EGKxurf5>E3VlJ | 9z!D5Z6Lmmy0V |
Table 10
At step 310 of the method 300, the one or more hardware processors 104 assign ranking to the one or more test data, wherein the ranking is assigned based on a number of validators which are successfully validated. The one or more test data with the highest ranking are selected. The test data selector 214 takes the first set of constraints, the program, the validator, and the one or more test data as input for each extracted and rephrased context, wherein the one or more test data is generated by running the generated program and not by the Large Language Model (LLM) 210. Further, the test data selector 214 assigns a ranking to the one or more test data and selects the one or more test data with the highest ranking. The ranking for the one or more test data is assigned based on the result of executing the one or more test data of each generated program on a plurality of validators having the context set. The test data selector 214 assigns a corresponding ranking to the one or more test data depending on whether the validator returns true or false. Further, the test data selector 214 assigns a different ranking in case the validator does not return an output or gives an error. The highest ranking is assigned to the one or more test data if the highest number of validators are successfully validated for the corresponding one or more test data. Based on this, the one or more test data is ranked, and the best one or more test data with the highest ranking is selected for each context and executed on the available web application.
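An illustrative sketch of this ranking is given below, assuming the generated programs and validators are stored as standalone Python scripts that print a value and True/False respectively (as in the tables above); the helper names are hypothetical.
```python
import subprocess
import sys

def run_script(path: str, *args: str) -> str:
    """Run a generated program or validator script and capture its stdout."""
    result = subprocess.run([sys.executable, path, *args],
                            capture_output=True, text=True, timeout=10)
    return result.stdout.strip()

def rank_test_data(generator_paths: list[str],
                   validator_paths: list[str]) -> list[tuple[str, int]]:
    """Generate one test datum per program and rank it by the number of
    validators that print True; a crash, timeout, or empty output counts
    as a failed validation."""
    ranked = []
    for gen in generator_paths:
        datum = run_script(gen)
        score = 0
        for val in validator_paths:
            try:
                if run_script(val, datum) == "True":
                    score += 1
            except (subprocess.TimeoutExpired, OSError):
                pass  # validator error: contributes no weight
        ranked.append((datum, score))
    # Highest ranked test data first.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```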
Table 11 depicts the ranks assigned to the one or more test data based on the number of validators which are successfully validated.
| Context | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| Test data | >Q.wo{ft | /-y0)Z]a46O_0!48)W-Jc36R | @EGKxurf5>E3VlJ | 9z!D5Z6Lmmy0V |
| Weight | 1 | 0 | 2 | 2 |
Table 11
At step 312 of the method 300, the one or more hardware processors 104 statically refine the generated program using a static refinement engine by:
calling a mathematical library function on the highest ranked one or more test data to generate structural information pertaining to the highest ranked one or more test data for the Large Language Model (LLM); and
modifying language of the second set of prompts passed to the Large Language Model (LLM) based on the structural information generated.
In an embodiment of the present disclosure, the inputs to the static refinement engine 216 are the first set of constraints, the program, the validator, and the one or more test data with the corresponding ranking. Herein, the refinement is called static because it does not use any feedback from the web application; it uses the structural data already available and creates prompts along with the prompt synthesizer 212 to refine the programs. The static refinement engine 216 initially tries to understand the structural information of the one or more test data through a plurality of precise prompts. The static refinement engine 216 comprises two types of prompts: precise and generic. The plurality of precise prompts is fired using library functions that give precise answers about the structural information of the one or more test data. A plurality of library functions can be used to obtain the structural information of the one or more test data, for example, "what is the length of the string?" or "does it contain special characters?". Based on this structural information of the one or more test data, the static refinement engine 216 creates a plurality of generic prompts. For example, if the one or more test data contains special characters, the static refinement engine 216 creates the generic prompt "Is it allowed in the given context?". However, the response of the Large Language Model (LLM) 210 to such a generic prompt may be vague, wherein a combination of the plurality of generic prompts and the precise prompts makes the understanding of the structural information of the one or more test data better.
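A minimal sketch of the precise structural queries, computed with ordinary library functions, is shown below; the helper name and the choice of string.punctuation as the special-character set are assumptions of this sketch.
```python
import string

def structural_info(test_datum: str) -> dict:
    """Precise structural facts about a test datum, mirroring the precise
    prompts of the static refinement engine."""
    return {
        "length": len(test_datum),
        "has_upper": any(c.isupper() for c in test_datum),
        "has_lower": any(c.islower() for c in test_datum),
        "has_digit": any(c.isdigit() for c in test_datum),
        "has_special": any(c in string.punctuation for c in test_datum),
    }

# Example from the use case below: the C2 test datum has length 24.
info = structural_info("/-y0)Z]a46O_0!48)W-Jc36R")
print(info["length"])  # 24
# The length then feeds a generic prompt such as the updated prompt shown
# in the use case ("The total length of Password ... is 24. ...").
```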
Use case for static refinement based on precise mathematical functions and approximate queries:
Extracting precise information about the one or more test data using library functions, which include a plurality of mathematical functions:
For example, the C2 test data has length = 24; it contains special characters, uppercase letters, and lowercase letters.
Now, using the above information, a sample updated prompt for the one or more test data is “The total length of Password as per one of the outputs of the above python program is 24. Is this length as per the allowed range or value according to the above constraints of Password? If so, answer with string "YES" only, and do not give any other explanation. Otherwise, please provide complete python program with the corrected length.”
Further, using a plurality of updated prompts, a first set of refined programs and test data are generated after the static refinement, as depicted below.
Table 12 to Table 15 depict the first set of refined programs and the test data generated after static refinement.
C1 updated program C1P2
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+{}:<>?|[];',./`~"
    password = []
    password.append(random.choice(string.ascii_uppercase))
    password.append(random.choice(string.digits + string.ascii_lowercase + special_chars))
    password += random.choices(string.digits + string.ascii_lowercase + special_chars, k=6)
    random.shuffle(password)
    return ''.join(password)

print(generate_password())
```
Table 12
C2 updated program C2P2
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+-=[]{}|;:,.<>/?"
    password = ""
    while True:
        for i in range(8):
            password += random.choice(string.ascii_letters)
            password += random.choice(string.digits)
            password += random.choice(special_chars)
        password = ''.join(random.sample(password, len(password)))
        if (any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in special_chars for c in password)):
            break
        password = ""
    # Truncated to the maximum allowed length of 20 after static refinement.
    return password[:20]

if __name__ == "__main__":
    print(generate_password())
```
Table 13
C3 updated program C3P2
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+-=[]{}|;:,.<>/?"
    password = ""
    while True:
        password = ''.join(random.choice(string.ascii_letters + string.digits + special_chars) for i in range(random.randint(8, 20)))
        if (any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in special_chars for c in password)):
            break
    return password

print(generate_password())
```
Table 14
C4 updated program C4P2
```python
import random
import string

def generate_password():
    special_chars = "!@#$%^&*()_+"
    password = ""
    while True:
        password = ''.join(random.choice(string.ascii_letters + string.digits + special_chars) for i in range(random.randint(8, 20)))
        if (any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in special_chars for c in password)):
            break
    return password

print(generate_password())
```
Table 15
Table 16 depicts the ranks assigned to the one or more test data generated by the first set of refined programs.
| Context | C1 | C2 | C3 | C4 |
|---|---|---|---|---|
| Test data | ](eg.`Iu | [1<1,XNMHrM*1v(=34[3 | jera_+FQKw/q1 | LxbgkNMOoH^wx4 |
| Ranks | 0 | 2 | 3 | 2 |
Table 16
At step 314 of the method 300, the one or more hardware processors 104 execute the highest ranked one or more test data on a web application and receive feedback from the web application. The feedback retriever 220 takes the final test data, i.e., the highest ranked one or more test data. For example, username = X and password = Y form a collection of two highest ranked test data generated by the test data selector 214 and executed on the web application. The feedback retriever 220 stores the feedback in the form of HyperText Markup Language (HTML), JavaScript, TypeScript, images, screenshots, or combinations thereof. Further, the stored feedback is passed to the dynamic refinement engine 222.
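An illustrative sketch of such a feedback retriever is given below, assuming a Selenium WebDriver is available; the URL and element IDs are hypothetical placeholders, not part of the disclosure.
```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def execute_and_capture(url: str, field_values: dict[str, str]) -> str:
    """Submit the highest ranked test data on the web form and return the
    resulting page source as HTML feedback. The element IDs (the field
    names and 'submit') are hypothetical placeholders."""
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        for field_id, value in field_values.items():
            driver.find_element(By.ID, field_id).send_keys(value)
        driver.find_element(By.ID, "submit").click()
        return driver.page_source  # HTML feedback, possibly with error text
    finally:
        driver.quit()

# Usage with the ranked data from the example above:
# html = execute_and_capture("https://example.test/register",
#                            {"username": "X", "password": "jera_+FQKw/q1"})
```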
Use case for executing the one or more test data on the web application:
Referring to Table 16, it is observed that the test data for the refined C3 program has the highest ranking. Hence, jera_+FQKw/q1 is selected and executed on the web application. After execution, feedback which may or may not contain an error is obtained, and this feedback is passed to the dynamic refinement engine 222 for the dynamic refinement.
At step 316 of the method 300, the one or more hardware processors 104 dynamically refine each generated program using a dynamic refinement engine by:
passing the feedback to the Large Language Model (LLM) with a third set of prompts, wherein the Large Language Model (LLM) takes content from the feedback and provides:
a response if there is an error message;
a field corresponding to the error message; and
type of a second set of constraints being violated in the error message; and
refining the program for the field corresponding to the error message dynamically based on the error message received from the feedback by comparing the first set of constraints with the second set of constraints using the dynamic refinement engine.
In an embodiment of the present disclosure, the inputs to the dynamic refinement engine 222 are the first set of constraints, the program, the validator, the one or more test data with the corresponding ranking, and the feedback received from the web application after executing the highest ranked one or more test data. The feedback can be in any form, including HTML, JavaScript and the like. Further, the feedback is processed and passed to the Large Language Model (LLM) 210 with the third set of prompts. The Large Language Model (LLM) 210 takes the content from the feedback and provides a response, wherein the response contains information on whether there is any error message or not. Based on the information about the error message, the dynamic refinement engine 222 again interacts with the Large Language Model (LLM) 210 to refine the generated program based on the relevant error message.
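A minimal sketch of this interaction follows; the third-prompt wording and the response handling are illustrative simplifications, not the disclosed format.
```python
from typing import Callable, Optional

# Hypothetical wording for the third set of prompts.
THIRD_PROMPT = (
    "The following content was returned by the web application after "
    "submitting the generated data:\n{feedback}\n"
    "Is there an error message? If so, state the error message, the field "
    "it corresponds to, and the type of constraint being violated."
)

def dynamic_refine(feedback: str,
                   first_constraints: dict[str, list[str]],
                   query_llm: Callable[[str], str]) -> Optional[str]:
    """Pass the feedback with the third set of prompts and identify which
    field's program needs refinement."""
    response = query_llm(THIRD_PROMPT.format(feedback=feedback))
    if "no error" in response.lower():
        return None  # nothing to refine
    for field in first_constraints:
        if field.lower() in response.lower():
            # The violated constraint reported by the LLM (second set) is
            # compared against the constraints generated earlier (first set)
            # to decide how the program for this field should be repaired.
            return field
    return None
```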
Use case for Dynamic refinement:
Consider “ABCD1234E” is executed for PAN number
Following is a sample segment of feedback received: