
Methods And Systems For Automated Management Of Application Elements And Generation Of Smart Dataset

Abstract: The disclosure herein generally relates to the field of automated testing of software applications to ensure software quality assurance. The present disclosure provides methods and systems for automated management of the application elements and generation of a smart dataset, that address the technical problems with the automated process. The present disclosure makes use of both an artificial intelligence technique and a form element extraction technique to automatically identify and extract the application elements present in the web application. Further, the smart data and the smart datasets (support input data) associated with the application elements are also generated automatically, using smart data generation techniques. With the help of the extracted application elements, the smart data and the smart datasets, the test scenarios are quickly and automatically generated.


Patent Information

Application #:
Filing Date: 28 September 2021
Publication Number: 13/2023
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Email: kcopatents@khaitanco.com
Parent Application:

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th Floor, Nariman Point Mumbai Maharashtra India 400021

Inventors

1. ARORA, Nishtha
Tata Consultancy Services Limited Sahyadri Park, Plot No. 2, 3, Rajiv Gandhi Infotech Park, Phase III, Hinjawadi-Maan, Pune Maharashtra India 411057
2. SUHANE, Gautam
Tata Consultancy Services Limited Sahyadri Park, Plot No. 2, 3, Rajiv Gandhi Infotech Park, Phase III, Hinjawadi-Maan, Pune Maharashtra India 411057
3. PANDIAN, Meenatchi
Tata Consultancy Services Limited SIRUSERI TECHNO PARK, Plot No. 1/G1, SIPCOT IT Park, Siruseri, Navalur Post, Kancheepuram Dist., Chennai Tamil Nadu India 603103
4. ATHMANATHAN, Rajesh
Tata Consultancy Services Limited SIRUSERI TECHNO PARK, Plot No. 1/G1, SIPCOT IT Park, Siruseri, Navalur Post, Kancheepuram Dist., Chennai Tamil Nadu India 603103

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
METHODS AND SYSTEMS FOR AUTOMATED MANAGEMENT OF APPLICATION ELEMENTS AND GENERATION OF SMART DATASET
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description:
The following specification particularly describes the invention and the
manner in which it is to be performed.

CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
[001] The present application is a patent of addition of Indian patent application no. 202021016457, filed on April 16, 2020. The entire contents of the aforementioned application are incorporated herein by reference.
TECHNICAL FIELD
[002] The disclosure herein generally relates to the field of automated testing of software applications to ensure software quality assurance, and, more particularly, to methods and systems for automated management of application elements present in the software applications and generation of smart dataset for the application elements, to implement automated testing of the software applications.
BACKGROUND
[003] Software applications, especially web applications, are designed to accommodate a multitude of transactions, where each transaction often requires the performance of a significant number of functions. Testing of the software applications (hereinafter referred to as ‘software testing’) plays a crucial role in ensuring the conformance of developed software applications with their requirements. Creating test scenarios for the software testing is a manual, thought-through process of identifying the possible tests that need to be performed on the software application to confirm that the software application works as per the predefined business requirements. To exercise the test scenarios, test engineers need to pass various test scenarios (positive and negative test data) with the intention of ensuring that all possible combinations are covered, and the failures are reported for fixes. Exhaustive testing procedures are enforced by functional safety standards, which mandate that each requirement be covered in the test scenarios (test data). Test engineers need to identify all the representative test execution scenarios from requirements, determine the runtime conditions that trigger these scenarios, and finally provide the test input data that satisfy these conditions.

[004] Currently, the test scenarios are manually written by a software developer or a tester based on their knowledge of the software application and their understanding of the requirement specifications. Execution of the test scenarios helps to ensure that defects have not been introduced or uncovered in unchanged areas of the codebase as a result of a modification. Today, providers of the software applications and/or their corresponding services are faced with the problem of having large sets of test scenarios (or test data) that are written manually based on users' knowledge, and with the need to automate these test scenarios to function within any one of a number of industry-standard automation tools.
[005] Therefore, the focus has shifted to automation, that is, converting these test scenarios into automation scripts using industry tools available in the market and executing them repeatedly and frequently with various combinations of test scenarios. However, with these approaches, the software industry still struggles with problems such as exhaustiveness of test scenarios, probability of the combination of test data selected, automation script maintenance (as only a unique ID is used for reference), coverage of test scenarios, and challenges in selecting the right test scenarios for execution based on time, business need, urgency, etc. Also, automating the manual test cases is both time-consuming and effort-intensive.
[006] In case of web applications, there may be several application elements such as a text box, list box, buttons, radio buttons, check box and so on. Without the support input data of the application elements, automatic generation of test scenarios covering the application elements may be a much more challenging and time-consuming process. Further, missing some of the application elements in the test scenario affects the overall quality of the software application.
[007] In agile implementations, every feature needs to be delivered as consumable by the end of every sprint. Every sprint requires the quality of the software to be tested so that it does not break in the real production environment. In this situation, testing becomes a big challenge: how much to test and what to test within the given time to ensure a higher level of quality. Selecting the right level of test scenarios to ensure a higher degree of quality, or taking an informed risk by knowing the quality of the software product before release, is also challenging. At present, no automated technology is available to generate the test scenarios for the completeness of the software testing. Current market tools and products in test automation require an automation tester or engineer to configure the test scripts. Configuring the test scripts also requires them to recognize the application elements present in the web application every time. Moreover, creating the test scenarios for various validations is also a time-consuming task.
SUMMARY
[008] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
[009] In an aspect, a processor-implemented method for automated management of application elements and generation of a smart dataset for the application elements is provided. The method includes the steps of: receiving a start uniform resource locator (URL) associated with a start web page of a web application to be crawled and adding the start URL to a URL dictionary; extracting one or more successive URLs present in the start web page of the web application, using a code-based approach, and adding the one or more successive URLs to the URL dictionary; identifying one or more unique URLs out of the one or more successive URLs and the start URL present in the URL dictionary; assigning a unique ID to each of the one or more unique URLs; passing a screenshot of the web page associated with each unique URL of the one or more unique URLs, to an artificial intelligence-based pipeline, to extract first application elements data associated with each unique URL, wherein the first application elements data comprises at least one of: (i) a label name for each of one or more first application elements, (ii) an element type for each of the one or more first application elements, (iii) an element default value for each of the one or more first application elements, and (iv) clickable XY-coordinates for each of the one or more first application elements; passing each unique URL of the one or more unique URLs, to a form element extractor, to extract second application elements data associated with each unique URL, wherein the second application elements data comprises at least one of: (i) the label name for each of one or more second application elements, (ii) the element type for each of the one or more second application elements, (iii) the element default value for each of the one or more second application elements, (iv) the clickable XY-coordinates for each of the one or more second application elements, (v) an element identification number (ID) for each of the one or more second application elements, (vi) master options for each of the one or more second application elements, and (vii) a command type for each of the one or more second application elements; integrating the first application elements data and the second application elements data associated with each unique URL, to obtain unique application elements data associated with each unique URL, wherein the unique application elements data associated with each unique URL comprises at least one of: (i) the label name for each of one or more unique application elements, (ii) the element type for each of the one or more unique application elements, (iii) the element default value for each of the one or more unique application elements, (iv) the clickable XY-coordinates for each of the one or more unique application elements, (v) the element identification number (ID) for each of the one or more unique application elements, (vi) the master options for each of the one or more unique application elements, and (vii) the command type for each of the one or more unique application elements; extracting validation data for each of the one or more unique application elements present in each unique URL, using a validation data extraction model; generating (i) smart data for each of the one or more unique application elements present in each unique URL, and (ii) one or more smart datasets for each unique URL; and performing exception handling while navigating from one unique URL to another unique URL of the one or more unique URLs present in the web application, based on the validation data associated with each of the one or more unique application elements.
[010] In another aspect, a system for automated management of application elements and generation of a smart dataset for the application elements is provided. The system includes: a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to: receive a start uniform resource locator (URL) associated with a start web page of a web application to be crawled and add the start URL to a URL dictionary; extract one or more successive URLs present in the start web page of the web application, using a code-based approach, and add the one or more successive URLs to the URL dictionary; identify one or more unique URLs out of the one or more successive URLs and the start URL present in the URL dictionary; assign a unique ID to each of the one or more unique URLs; pass a screenshot of the web page associated with each unique URL of the one or more unique URLs, to an artificial intelligence-based pipeline, to extract first application elements data associated with each unique URL, wherein the first application elements data comprises at least one of: (i) a label name for each of one or more first application elements, (ii) an element type for each of the one or more first application elements, (iii) an element default value for each of the one or more first application elements, and (iv) clickable XY-coordinates for each of the one or more first application elements; pass each unique URL of the one or more unique URLs, to a form element extractor, to extract second application elements data associated with each unique URL, wherein the second application elements data comprises at least one of: (i) the label name for each of one or more second application elements, (ii) the element type for each of the one or more second application elements, (iii) the element default value for each of the one or more second application elements, (iv) the clickable XY-coordinates for each of the one or more second application elements, (v) an element identification number (ID) for each of the one or more second application elements, (vi) master options for each of the one or more second application elements, and (vii) a command type for each of the one or more second application elements; integrate the first application elements data and the second application elements data associated with each unique URL, to obtain unique application elements data associated with each unique URL, wherein the unique application elements data associated with each unique URL comprises at least one of: (i) the label name for each of one or more unique application elements, (ii) the element type for each of the one or more unique application elements, (iii) the element default value for each of the one or more unique application elements, (iv) the clickable XY-coordinates for each of the one or more unique application elements, (v) the element identification number (ID) for each of the one or more unique application elements, (vi) the master options for each of the one or more unique application elements, and (vii) the command type for each of the one or more unique application elements; extract validation data for each of the one or more unique application elements present in each unique URL, using a validation data extraction model; generate (i) smart data for each of the one or more unique application elements present in each unique URL, and (ii) one or more smart datasets for each unique URL; and perform exception handling while navigating from one unique URL to another unique URL of the one or more unique URLs present in the web application, based on the validation data associated with each of the one or more unique application elements.
[011] In yet another aspect, there is provided a computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to: receive a start uniform resource locator (URL) associated with a start web page of a web application to be crawled and add the start URL to a URL dictionary; extract one or more successive URLs present in the start web page of the web application, using a code-based approach, and add the one or more successive URLs to the URL dictionary; identify one or more unique URLs out of the one or more successive URLs and the start URL present in the URL dictionary; assign a unique ID to each of the one or more unique URLs; pass a screenshot of the web page associated with each unique URL of the one or more unique URLs, to an artificial intelligence-based pipeline, to extract first application elements data associated with each unique URL, wherein the first application elements data comprises at least one of: (i) a label name for each of one or more first application elements, (ii) an element type for each of the one or more first application elements, (iii) an element default value for each of the one or more first application elements, and (iv) clickable XY-coordinates for each of the one or more first application elements; pass each unique URL of the one or more unique URLs, to a form element extractor, to extract second application elements data associated with each unique URL, wherein the second application elements data comprises at least one of: (i) the label name for each of one or more second application elements, (ii) the element type for each of the one or more second application elements, (iii) the element default value for each of the one or more second application elements, (iv) the clickable XY-coordinates for each of the one or more second application elements, (v) an element identification number (ID) for each of the one or more second application elements, (vi) master options for each of the one or more second application elements, and (vii) a command type for each of the one or more second application elements; integrate the first application elements data and the second application elements data associated with each unique URL, to obtain unique application elements data associated with each unique URL, wherein the unique application elements data associated with each unique URL comprises at least one of: (i) the label name for each of one or more unique application elements, (ii) the element type for each of the one or more unique application elements, (iii) the element default value for each of the one or more unique application elements, (iv) the clickable XY-coordinates for each of the one or more unique application elements, (v) the element identification number (ID) for each of the one or more unique application elements, (vi) the master options for each of the one or more unique application elements, and (vii) the command type for each of the one or more unique application elements; extract validation data for each of the one or more unique application elements present in each unique URL, using a validation data extraction model; generate (i) smart data for each of the one or more unique application elements present in each unique URL, and (ii) one or more smart datasets for each unique URL; and perform exception handling while navigating from one unique URL to another unique URL of the one or more unique URLs present in the web application.
[012] In an embodiment, the first application elements data associated with each unique URL is extracted by: detecting one or more regions of interest (ROIs) associated with one or more first application elements present in the screenshot of the web page associated with each unique URL, using a trained ROI detection model; detecting (i) a contour and (ii) clickable XY-coordinates for each of the one or more ROIs, using a contour detection technique; detecting the element type for each of the one or more ROIs, using a trained element type detection model; and extracting (i) the label name and (ii) the element default value, for each of the one or more ROIs, using an optical character recognition model.
[013] In an embodiment, the validation data for each of one or more unique application elements present in each unique URL, is extracted using the validation data extraction model, by: generating one or more intents for each of the one or more unique application elements present in each unique URL, based on the label name associated with the unique application element, using an intent generation algorithm; extracting an intent related information for each of the one or more unique application elements present in each unique URL, based on the one or more intents generated for the unique application element, using a parsing algorithm; and extracting the validation data for each of the one or more unique application elements present in each unique URL, by classifying the intent related information associated with the unique application element, using a text classification model.
[014] In an embodiment, the smart data for each of the one or more unique application elements present in each unique URL, is generated by: generating the smart data for each unique application element, based on a custom input provided for the unique application element; and generating the smart data for each unique application element, based on the element type associated with the unique application element.
[015] In an embodiment, the one or more smart datasets for each unique URL are generated based on possible combinations of the one or more unique application elements present in each unique URL.
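By way of a non-limiting illustration, the generation of smart datasets from possible combinations may be sketched as follows. The sketch assumes that "possible combinations" can be read as a Cartesian product over the candidate smart data values of each element; the disclosure does not fix this interpretation, and the function name, labels and values shown are hypothetical.

```python
from itertools import product


def generate_smart_datasets(smart_data):
    """Build one dataset per combination of candidate values.

    smart_data maps an element label to its list of candidate values.
    Treating "possible combinations" as a Cartesian product is an
    assumption made for this sketch only.
    """
    labels = list(smart_data)
    value_lists = [smart_data[label] for label in labels]
    return [dict(zip(labels, values)) for values in product(*value_lists)]


# Hypothetical smart data for two elements on one page.
smart_data = {
    "Username": ["alice", ""],
    "Age": ["30", "-1", "abc"],
}
smart_datasets = generate_smart_datasets(smart_data)
```

With two candidate values for one element and three for the other, the sketch yields six datasets. A full product grows quickly with the number of elements, so a practical implementation would likely prune or sample the combinations.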
[016] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS

[017] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[018] FIG. 1 is an exemplary block diagram of a system for automated management of application elements and generation of smart dataset, in accordance with some embodiments of the present disclosure.
[019] FIG. 2A and FIG. 2B illustrate exemplary flow diagrams of a processor-implemented method for automated management of application elements and generation of smart dataset, in accordance with some embodiments of the present disclosure.
[020] FIG. 3 is a flowchart describing generation of smart data for each of the application elements based on element type, in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[021] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
[022] The Applicant has addressed some of the discussed technical challenges present in automated testing in Indian patent application no. 202021016457, filed on April 16, 2020. In that application, the Applicant discussed a method and system for automated generation of test scenarios and automation of scripts. However, automated identification and management of the application elements present in the web application, and generation of the smart data (the support input data) associated with the application elements, are not explicitly disclosed therein.

[023] The present disclosure herein provides methods and systems for automated management of the application elements and generation of a smart dataset, that address the technical problems with the automated process. The present disclosure makes use of both an artificial intelligence technique and a form element extraction technique to automatically identify and extract the application elements present in the web application. Further, the smart data and the smart datasets (support input data) associated with the application elements are also generated automatically, using smart data generation techniques. With the help of the extracted application elements, the smart data and the smart datasets, the test scenarios are quickly and automatically generated.
[024] Referring now to the drawings, and more particularly to FIG. 1 through FIG. 3, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary systems and/or methods.
[025] FIG. 1 is an exemplary block diagram of a system 100 for automated management of the application elements and generation of the smart dataset, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 includes or is otherwise in communication with one or more hardware processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the one or more hardware processors 104. The one or more hardware processors 104, the memory 102, and the I/O interface(s) 106 may be coupled to a system bus 108 or a similar mechanism.
[026] The I/O interface(s) 106 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like, as well as interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a plurality of sensor devices, a printer and the like. Further, the I/O interface(s) 106 may enable the system 100 to communicate with other devices, such as web servers and external databases.

[027] The I/O interface(s) 106 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite. For this purpose, the I/O interface(s) 106 may include one or more ports for connecting a number of computing systems or devices with one another or to another server computer.
[028] The one or more hardware processors 104 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In the context of the present disclosure, the expressions ‘processors’ and ‘hardware processors’ may be used interchangeably. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, portable computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
[029] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, the memory 102 includes a plurality of modules 102a and a repository 102b for storing data processed, received, and generated by one or more of the plurality of modules 102a. The plurality of modules 102a may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types.
[030] The plurality of modules 102a may include programs or computer-readable instructions or coded instructions that supplement applications or functions performed by the system 100. The plurality of modules 102a may also be used as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the plurality of modules 102a can be used by hardware, by computer-readable instructions executed by the one or more hardware processors 104, or by a combination thereof. In an embodiment, the plurality of modules 102a can include various sub-modules (not shown in FIG. 1). Further, the memory 102 may include information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure.
[031] The repository 102b may include a database or a data engine. Further, the repository 102b, amongst other things, may serve as a database or include a plurality of databases for storing the data that is processed, received, or generated as a result of the execution of the plurality of modules 102a. Although the repository 102b is shown internal to the system 100, it will be noted that, in alternate embodiments, the repository 102b can also be implemented external to the system 100, where the repository 102b may be stored within an external database (not shown in FIG. 1) communicatively coupled to the system 100. The data contained within such external database may be periodically updated. For example, data may be added into the external database and/or existing data may be modified and/or non-useful data may be deleted from the external database. In one example, the data may be stored in an external system, such as a Lightweight Directory Access Protocol (LDAP) directory and a Relational Database Management System (RDBMS). In another embodiment, the data stored in the repository 102b may be distributed between the system 100 and the external database.
[032] The system 100 is configured to record details of the web application at a micro level (page level), indexed based on the page navigation. The system 100 is further configured to create a mind map or a tree, using a traversal algorithm, for creating the necessary test scenarios based on page flows or page navigation. At the same time, the system 100 captures all the underlying screen properties (of the screens of the web application) and the application elements. Further, the system 100 is configured to generate the test scenarios and the test scripts automatically, using the screen properties, the application elements, and the smart data and smart datasets associated with the application elements.
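By way of a non-limiting illustration, deriving page flows from a navigation tree may be sketched as follows. The disclosure does not name the traversal algorithm; a depth-first traversal is assumed here, and the navigation map, page names and function name are hypothetical.

```python
def enumerate_page_flows(navigation, start):
    """List page flows as paths from the start page to a dead end.

    navigation maps a page to the pages reachable from it. A page is
    never revisited within one flow, so cyclic links terminate.
    Depth-first traversal is an assumption of this sketch.
    """
    flows = []

    def visit(page, path):
        next_pages = [p for p in navigation.get(page, []) if p not in path]
        if not next_pages:
            # No unvisited page is reachable: the path is a complete flow.
            flows.append(path)
            return
        for next_page in next_pages:
            visit(next_page, path + [next_page])

    visit(start, [start])
    return flows


# Hypothetical navigation map of a small web application.
navigation = {
    "home": ["login", "register"],
    "login": ["dashboard"],
    "register": ["login"],
    "dashboard": [],
}
page_flows = enumerate_page_flows(navigation, "home")
```

Each resulting flow is a candidate skeleton for a test scenario, to be filled with the application elements and smart data captured for the pages along the path.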
[033] Referring to FIG. 2A and FIG. 2B, components and functionalities of the system 100 are described in accordance with an example embodiment of the present disclosure. For example, FIG. 2A and FIG. 2B illustrate exemplary flow diagrams of a processor-implemented method 200 for automated management of the application elements and generation of the smart dataset, in accordance with some embodiments of the present disclosure. Although steps of the method 200 including process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any practical order. Further, some steps may be performed simultaneously, or some steps may be performed alone or independently.
[034] Initially, at step 202 of the method 200, the one or more hardware processors 104 of the system 100 are configured to receive a start uniform resource locator (URL) associated with a start web page of the web application from which the application elements are to be crawled. The start web page may be a main web page or a host web page of the web application. Further, the start web page of the web application defines a starting location from which the web application is to be tested. The start URL associated with the start web page is added to a URL dictionary. In an embodiment, the URL dictionary may be present in the repository 102b of the system 100.
[035] At step 204 of the method 200, the one or more hardware processors 104 of the system 100 are configured to extract one or more successive URLs present in the start web page of the web application received at step 202 of the method 200. The one or more successive URLs are associated with the hyperlinks present in the start web page of the web application. In an embodiment, a code-based approach is used to extract the one or more successive URLs present in the start web page. In an embodiment, the code-based approach may be a software script or a program configured to identify the hypertext markup language (HTML) tags present in the HTML code associated with the start web page, and extract the HTML tags associated with the hyperlinks. The extracted one or more successive URLs are added to the URL dictionary.
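The code-based approach of step 204 can be illustrated with a minimal sketch that relies only on the Python standard library. The class name `LinkExtractor`, the function `extract_successive_urls`, the sample HTML, and the `example.com` URL are assumptions made for illustration and are not taken from the disclosure; the URL dictionary is modelled here as a plain list.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only hyperlink tags contribute successive URLs.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_successive_urls(start_url, html):
    """Return every hyperlink target found in the HTML of the start page."""
    parser = LinkExtractor(start_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for illustration.
sample_html = (
    '<a href="/login">Login</a> '
    '<a href="cart.html">Cart</a> '
    '<a href="/login">Sign in</a>'
)
start_url = "https://example.com/index.html"

# The start URL is added first, followed by the extracted successive URLs.
url_dictionary = [start_url] + extract_successive_urls(start_url, sample_html)
```

Duplicates are deliberately kept at this stage, since the description removes them in a separate step.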
[036] At step 206 of the method 200, the one or more hardware processors 104 of the system 100 are configured to identify one or more unique URLs out of the one or more successive URLs and the start URL present in the URL dictionary. Each URL is compared to each other URL among the one or more successive URLs and the start URL present in the URL dictionary to identify the one or more unique URLs. The duplicate or redundant URLs (listed more than once) among the one or more successive URLs and the start URL are removed in this step. At step 208 of the method 200, the one or more hardware processors 104 of the system 100 are configured to assign a unique identification number (ID) to each of the one or more unique URLs identified at step 206 of the method 200, for easy identification and use.
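Steps 206 and 208 can be sketched as follows. The description compares each URL with every other URL; this sketch swaps in a hash-based de-duplication (`dict.fromkeys`), which gives the same set of unique URLs for exact string matches. The function name `assign_unique_ids` and the `URL_<n>` ID format are assumptions, since the disclosure does not specify either.

```python
def assign_unique_ids(url_dictionary):
    """Drop repeated URLs (keeping first-seen order) and give each an ID."""
    unique_urls = list(dict.fromkeys(url_dictionary))
    # The "URL_<n>" format is hypothetical; any unique identifier would do.
    return {f"URL_{index}": url for index, url in enumerate(unique_urls, start=1)}


# Hypothetical URL dictionary containing one repeated entry.
crawled = [
    "https://example.com/index.html",
    "https://example.com/login",
    "https://example.com/cart.html",
    "https://example.com/login",
]
unique_urls_by_id = assign_unique_ids(crawled)
```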
[037] At step 210 of the method 200, the one or more hardware processors 104 of the system 100 are configured to pass a screenshot of the web page associated with each unique URL of the one or more unique URLs, to an artificial intelligence-based pipeline, to extract first application elements data associated with each unique URL. The first application elements data is associated with the one or more application elements present in the corresponding web page. The one or more application elements include a text box, list box, radio button, checkbox, and so on, that are present in the corresponding web page. More specifically, the first application elements data includes at least one of: (i) a label name for each of one or more first application elements, (ii) an element type for each of the one or more first application elements, (iii) an element default value for each of the one or more first application elements, and (iv) a clickable XY-coordinates for each of the one or more first application elements.
[038] The label name for each of one or more first application elements is the label name defined for the corresponding application element. An example for the label name is: USERNAME. The element type for each of the one or more first application elements is the type of the application element such as text box, list box, radio button, and so on. The element default value for each of the one or more first application elements is the default value available for the corresponding application element. For example, the element default value of the label name ‘USERNAME’ is: John_123. The default value may not be available for some or all of the one or more first application elements. The clickable XY-coordinates for each of the one or more first application elements are the X-coordinate and the Y-coordinate that define the clickable region of the corresponding application element.
[039] In an embodiment, an artificial intelligence-based pipeline is used to extract the first application elements data associated with each unique URL, from the corresponding screenshot of the web page. The artificial intelligence-based pipeline includes a trained region of interest (ROI) detection model, a contour detection technique, a trained element type detection model, and an optical character recognition model.
[040] Firstly, one or more regions of interest (ROIs) associated with one or more first application elements present in the screenshot of the web page associated with each unique URL, are detected, using the trained ROI detection model of the artificial intelligence-based pipeline. In an embodiment, a RETINANET model is trained using a plurality of training image data. The plurality of training image data are labelled screenshots of the web pages with 80 classes of ROIs. In an embodiment, a labeling tool such as a LabelImg tool may be used for labelling the screenshots and to obtain the labelled screenshots. For example, the LabelImg tool generates an extensible markup language (XML) representation of the labelled screenshot, which is then parsed using XML parsing code to obtain the coordinates for all ROIs. Further, all ROIs obtained are labelled with the defined classes. The web pages are collected from various web applications from different domains such as E-commerce websites, social media websites, business websites, and so on. The plurality of training image data is preprocessed before being passed to the RETINANET model for the training. The preprocessing includes re-sizing the images (labelled screenshots) to a predefined input image size and augmenting the images. In an embodiment, the predefined input image size may be 800×1333 pixels. Augmenting the images involves horizontal flipping of the images. The backbone model used in the RETINANET model may be ResNet50 and the RETINANET model is trained with the plurality of training image data at learning rates of [2.5e-06, 0.000625, 0.00125, 0.0025, 0.00025, 2.5e-05], to obtain the trained ROI detection model.
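The annotation parsing and the horizontal-flip augmentation described above can be sketched with the Python standard library. The sketch assumes the Pascal VOC XML layout that LabelImg writes by default (`object`, `name`, `bndbox`); the sample annotation, the class names, and the function names `parse_annotation` and `flip_horizontally` are hypothetical. Flipping an image horizontally mirrors each ROI, so the new `xmin` is the image width minus the old `xmax`, and vice versa.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout written by LabelImg.
SAMPLE_XML = """
<annotation>
  <filename>login_page.png</filename>
  <size><width>1333</width><height>800</height><depth>3</depth></size>
  <object>
    <name>textbox</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>90</ymax></bndbox>
  </object>
  <object>
    <name>button</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>180</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotation(xml_text):
    """Return the image width and the list of labelled ROIs in one annotation."""
    root = ET.fromstring(xml_text.strip())
    image_width = int(root.findtext("size/width"))
    rois = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rois.append({
            "label": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return image_width, rois


def flip_horizontally(roi, image_width):
    """Mirror an ROI so that it matches a horizontally flipped image."""
    flipped = dict(roi)
    flipped["xmin"] = image_width - roi["xmax"]
    flipped["xmax"] = image_width - roi["xmin"]
    return flipped


image_width, rois = parse_annotation(SAMPLE_XML)
flipped_rois = [flip_horizontally(roi, image_width) for roi in rois]
```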
[041] Next, a contour and clickable XY-coordinates for each of the one or more ROIs are detected using the contour detection technique of the artificial intelligence-based pipeline. The contour defines the boundary for each of the one or more ROIs. In an embodiment, the contour detection technique employs a contour detection algorithm, such as a computer vision-based algorithm, to detect the contour and the clickable XY-coordinates for each of the one or more ROIs.
[042] Further, the element type for each of the one or more ROIs is detected using the trained element type detection model of the artificial intelligence-based pipeline. In an embodiment, the trained element type detection model is obtained by training a fully convolutional network (FCN) with training cropped image data. The training cropped image data includes a plurality of cropped images (ROIs) labelled with classes such as the text box, the dropdown, the radio button, the checkbox, and the button. The FCN includes 2-dimensional (2-D) convolutional layers (Conv2D) with dropout layers and batch normalization layers. The activation function used is the ReLU activation.
[043] Lastly, the label name and the element default value, for each of the one or more ROIs, are detected using the optical character recognition model of the artificial intelligence-based pipeline. The optical character recognition model recognizes the text associated with each of the one or more ROIs to detect the label name and the element default value.
[044] At step 212 of the method 200, the one or more hardware processors 104 of the system 100 are configured to pass each unique URL of the one or more unique URLs identified at step 206 of the method 200, to a form element extractor, to extract second application elements data associated with each unique URL. The second application elements data is associated with the one or more application elements present in the corresponding web page. The one or more application elements include a text box, list box, radio button, checkbox, and so on, that are present in the corresponding web page. More specifically, the second application elements data includes at least one of: (i) the label name for each of one or more second application elements, (ii) the element type for each of the one or more second application elements, (iii) the element default value for each of the one or more second application elements, (iv) the clickable XY-coordinates for each of the one or more second application elements, (v) an element identification number (ID) for each of the one or more second application elements, (vi) master options for each of the one or more second application elements, and (vii) a command type for each of the one or more second application elements.
[045] The label name for each of one or more second application elements is the label name defined for the corresponding application element. An example for the label name is: USERNAME. The element type for each of the one or more second application elements is the type of the application element such as text box, list box, radio button, and so on. The element default value for each of the one or more second application elements is the default value available for the corresponding application element. For example, the element default value of the label name ‘USERNAME’ is: John_123. The default value may not be available for some or all of the one or more second application elements. The clickable XY-coordinates for each of the one or more second application elements are the X-coordinate and the Y-coordinate that define the clickable region of the corresponding application element.
[046] The element identification number (ID) for each of the one or more second application elements defines the application element identification number of the corresponding application element. For example, possible application element identification numbers for the textbox are txt_1, txt_2, txt_3, and so on. The master options for each of the one or more second application elements define the list of options when the application element is a list box or a dropdown element. The command type for each of the one or more second application elements defines the type of the command associated with the corresponding application element.
[047] In an embodiment, the form element extractor includes a software script or a program configured to identify the hypertext markup language (HTML) tags present in the HTML code associated with the web page of each unique URL. More specifically, the HTML code associated with the web page of each unique URL is parsed through the form element extractor to extract the second application elements data associated with each unique URL.
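A form element extractor of this kind can be sketched with the standard-library HTML parser. This is an illustration under stated assumptions, not the disclosed implementation: the class name `FormElementExtractor`, the dictionary keys, and the sample form are hypothetical, and only the element ID, element type, default value, and master options are collected (label names, clickable XY-coordinates, and command types are left out for brevity).

```python
from html.parser import HTMLParser


class FormElementExtractor(HTMLParser):
    """Collects <input> and <select> elements from the HTML of one page."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._current_select = None  # the <select> whose options are being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input":
            self.elements.append({
                "element_id": attrs.get("id"),
                "element_type": attrs.get("type", "text"),
                "default_value": attrs.get("value"),
                "master_options": [],
            })
        elif tag == "select":
            self._current_select = {
                "element_id": attrs.get("id"),
                "element_type": "dropdown",
                "default_value": None,
                "master_options": [],
            }
            self.elements.append(self._current_select)
        elif tag == "option" and self._current_select is not None:
            # Each option value becomes one entry of the master options.
            self._current_select["master_options"].append(attrs.get("value"))

    def handle_endtag(self, tag):
        if tag == "select":
            self._current_select = None


# Hypothetical form markup, used only for illustration.
sample_form = """
<form>
  <input id="txt_1" type="text" value="John_123">
  <select id="sel_1">
    <option value="IN">India</option>
    <option value="US">USA</option>
  </select>
  <input id="rad_1" type="radio">
</form>
"""

extractor = FormElementExtractor()
extractor.feed(sample_form)
second_application_elements = extractor.elements
```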
[048] At step 214 of the method 200, the one or more hardware processors 104 of the system 100 are configured to integrate the first application elements data obtained at step 210 of the method 200, and the second application elements data obtained at step 212 of the method 200, associated with each unique URL, to obtain a unique application elements data associated with each unique URL. The integration helps in identifying all unique application elements present in the corresponding web page, along with the unique application elements data, and removing all the duplicate or redundant application elements and the corresponding unique application elements data. Further, the integration ensures that no application element is missed, and the corresponding application element data is also not missed.
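The integration of step 214 can be pictured as a merge of two record lists. The disclosure does not state how an element found by the artificial intelligence-based pipeline is matched with the same element found by the form element extractor; the sketch below assumes, purely for illustration, that a case-insensitive label name is the matching key. The function name `integrate` and the field names are likewise hypothetical.

```python
def integrate(first_elements, second_elements):
    """Merge two element lists into one record per unique label name."""
    merged = {}
    for element in first_elements + second_elements:
        # Assumption: elements with the same label (ignoring case) are the same.
        key = element["label_name"].strip().lower()
        record = merged.setdefault(key, {})
        for field, value in element.items():
            # Keep the first non-empty value seen for each field.
            if record.get(field) in (None, "") and value not in (None, ""):
                record[field] = value
    return list(merged.values())


# Hypothetical outputs of steps 210 and 212 for one unique URL.
first_data = [
    {"label_name": "USERNAME", "element_type": "text box", "xy": (120, 45)},
]
second_data = [
    {"label_name": "username", "element_type": "text box", "element_id": "txt_1"},
    {"label_name": "Country", "element_type": "dropdown", "element_id": "sel_1"},
]
unique_elements = integrate(first_data, second_data)
```

In this sketch the element seen by both sources appears once, carrying the coordinates from the first source and the element ID from the second, while the element seen by only one source is still retained.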
[049] The unique application elements data associated with each unique URL includes at least one of: (i) the label name for each of one or more unique application elements, (ii) the element type for each of the one or more unique application elements, (iii) the element default value for each of the one or more unique application elements, (iv) the clickable XY-coordinates for each of the one or more unique application elements, (v) the element identification number (ID) for each of the one or more unique application elements, (vi) the master options for each of the one or more unique application elements, and (vii) the command type for each of the one or more unique application elements.
[050] At step 216 of the method 200, the one or more hardware processors 104 of the system 100 are configured to extract validation data for each of the one or more unique application elements present in each unique URL, obtained at step 214 of the method 200, using a validation data extraction model. The validation data for each of the one or more unique application elements defines the specific rules applied to the particular application element or field. For example, a password must be at least 8 characters long and must contain lowercase, uppercase and special characters. The validation data for each of the one or more unique application elements may be present in various application support documents such as user manuals, user guides, codes, and so on. The validation data extraction model includes an intent generator, a parser, and a text classification model.
[051] Firstly, one or more intents for each of the one or more unique application elements present in each unique URL are generated by the intent generator, based on the label name associated with the unique application element. The one or more intents for each of the one or more unique application elements define the synonyms or alternative names for the corresponding unique application element. For example, the one or more intents for the application element having the label name ‘username’ are: user, user ID, employee ID, student ID, login id, and so on. The intent generator employs an intent generation algorithm to fetch the one or more intents for each of the one or more unique application elements. For every intent, there may be a lot of information present in various application support documents, but only the relevant validation data is to be extracted for each intent. Hence, intent related information for each of the one or more unique application elements present in each unique URL is extracted from the application support documents, based on the one or more intents generated for the unique application element, using a parsing algorithm present in the parser. The intent related information may be one or more paragraphs that are present in the application support documents and associated with the one or more intents corresponding to each of the one or more unique application elements.
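The intent-driven parsing step can be sketched as a simple paragraph filter. The disclosure does not describe the intent generation algorithm or the parsing algorithm; here the intents come from a hand-written, hypothetical synonym table and a paragraph is kept when it mentions any intent as a case-insensitive substring. The names `INTENT_TABLE`, `generate_intents`, and `extract_intent_related_information`, and the sample paragraphs, are assumptions.

```python
# Hypothetical synonym table standing in for the intent generation algorithm.
INTENT_TABLE = {
    "username": ["user", "user ID", "employee ID", "student ID", "login id"],
}


def generate_intents(label_name):
    """Return the label itself plus any known alternative names."""
    key = label_name.strip().lower()
    return [key] + INTENT_TABLE.get(key, [])


def extract_intent_related_information(intents, paragraphs):
    """Keep the paragraphs that mention at least one intent."""
    lowered = [intent.lower() for intent in intents]
    return [
        paragraph
        for paragraph in paragraphs
        if any(intent in paragraph.lower() for intent in lowered)
    ]


# Hypothetical paragraphs from an application support document.
support_paragraphs = [
    "The login ID should be of 6 characters.",
    "Passwords must be at least 8 characters long.",
    "Contact support for billing questions.",
]
related = extract_intent_related_information(
    generate_intents("USERNAME"), support_paragraphs
)
```

Substring matching is deliberately crude; the retained paragraphs would still need to be classified as validation or non-validation information afterwards.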
[052] Lastly, the text classification model is used to extract the validation data, from the application support documents, for each of the one or more unique application elements present in each unique URL, by classifying the intent related information associated with the unique application element. In an embodiment, the text classification model is obtained by training an ensemble of supervised machine learning models, such as a logistic regression, a support vector machine, a random forest classifier, and a Naïve Bayes classifier, with the training data. The training data includes a plurality of validation data and non-validation data, and associated classes, namely ‘validation information’ and ‘non-validation information’.
[053] At step 218 of the method 200, the one or more hardware processors 104 of the system 100 are configured to generate smart data for each of the one or more unique application elements present in each unique URL, and one or more smart datasets for each unique URL. The smart data for each unique application element is the support input data for the associated application element. In an embodiment, the smart data for each unique application element is generated based on the validation data associated with the corresponding application element, using a smart data generator. For example, the smart data for the application element ‘username’ having the validation data ‘the username should be of 6 characters wherein the first and last characters are capital’ are: ArtrcD, UrhhkR, KtyrtW, and so on.
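The username example above can be reproduced with a small generator. This is a sketch for that single rule only, not the smart data generator of the disclosure. The rule says only that the first and last characters are capital; choosing lowercase letters for the middle characters is an assumption made to match the listed samples. The function name `generate_username` and the pattern `USERNAME_RULE` are hypothetical.

```python
import random
import re
import string

# Rule from the example: 6 characters, first and last characters capital.
# Lowercase middle characters are an assumption based on the listed samples.
USERNAME_RULE = re.compile(r"[A-Z][a-z]{4}[A-Z]")


def generate_username(rng, length=6):
    """Build one candidate value that satisfies the example rule."""
    first = rng.choice(string.ascii_uppercase)
    middle = "".join(rng.choice(string.ascii_lowercase) for _ in range(length - 2))
    last = rng.choice(string.ascii_uppercase)
    return first + middle + last


rng = random.Random(0)  # seeded so that repeated runs give the same values
smart_data = [generate_username(rng) for _ in range(3)]

# Every generated value is checked against the rule before it is used.
all_valid = all(USERNAME_RULE.fullmatch(value) for value in smart_data)
```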
[054] If the user (end user such as the tester) provides a custom input data for one unique application element, then the smart data for that application element is generated based on the user provided custom input data using the smart data generator. The smart data generator is a software routine or a program that generates the smart data using one or more of natural language processing techniques, regular expressions, and so on. Further, the smart data for each unique application element is generated based on the element type of the corresponding application element. FIG. 3 is a flowchart describing a generation of smart data for each of the application elements based on element type, in accordance with some embodiments of the present disclosure. As shown in FIG. 3, if the element type is a text type, then the smart data generator generates the smart data based on the label name extracted by the label based data generator and using the corresponding validation data. If the element type is a list type such as dropdown, checkbox, radio button, then the smart data generator generates the smart data based on the choice (out of the list types) extracted by the choice based data generator, using the corresponding validation data. Similarly, if the element type is an email, a password or a URL, then the smart data generator generates the smart data based on the element type extracted by the type based data generator, using the corresponding validation data.

[055] Further, one or more smart datasets for each unique URL are generated based on all possible combinations of the unique application elements and their choices (as in the case of a dropdown element) for each unique URL. For example, if the application screen (URL) contains one textbox, one dropdown and two radio buttons, and the dropdown has five choices, then the total number of possible smart datasets for the application screen is 10 (5×2, where 5 is the total number of dropdown choices and 2 is the total number of radio buttons).
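The count in this example follows from a Cartesian product of the choice-bearing elements, which can be sketched with `itertools.product`. The choice labels, the radio button names, and the textbox value are hypothetical; the textbox contributes a single value and therefore does not multiply the number of combinations.

```python
from itertools import product

# Hypothetical screen: one textbox, one dropdown with five choices,
# and two radio buttons, as in the example above.
textbox_value = "ArtrcD"
dropdown_choices = ["choice_1", "choice_2", "choice_3", "choice_4", "choice_5"]
radio_buttons = ["radio_1", "radio_2"]

# One smart dataset per (dropdown choice, radio button) combination.
smart_datasets = [
    {"textbox": textbox_value, "dropdown": choice, "radio": radio}
    for choice, radio in product(dropdown_choices, radio_buttons)
]

dataset_count = len(smart_datasets)  # 5 x 2 combinations
```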
[056] Further, the system 100 is configured to traverse from one unique URL to another unique URL among the one or more unique URLs. If an error is encountered while traversing between the unique URLs, the one or more hardware processors 104 of the system 100 are configured to perform exception handling, through an exception handler.
[057] The present disclosure automatically extracts the application elements present in each web page of the web application, and generates the smart data associated with each application element and the smart datasets associated with each web page. Hence the test scenarios and the test scripts may be automatically generated accurately in quick time, and the software quality assurance may be achieved. Further, the present disclosure uses both the AI based pipeline and the form element extractor for extracting the application elements present in the web page, hence all the application elements may be accurately extracted.
[058] The present disclosure navigates the entire web application logically and autonomously, without any human intervention, and provides the complete application flow along with various combinations of data and property information of every screen label, all of which may be used for autonomously automating the web application. Further, the present disclosure understands the screen, extracts the attributes of all application elements, and performs relevant actions on actionable elements like buttons, links, etc., just like a human would.
[059] The present disclosure addresses the technical problems present in the Indian patent application no. 202021016457, with the automated process, including automated identification and management of the application elements present in the web application and automated generation of the smart data (the support input data) associated with the application elements and the smart datasets of each web page. Hence the test scenarios are easily and accurately generated and automation of test scripts is achieved for the entire web application, with the use of the extracted application elements of each web page, the smart data of each application element and the smart datasets of each web page.
[060] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[061] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[062] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[063] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[064] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[065] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

We Claim:
1. A processor-implemented method (200) for automated management of application elements and generation of a smart dataset for the application elements, the method comprising the steps of:
receiving, via one or more hardware processors, a start uniform resource locator (URL) associated with a start web page of a web application to be crawled and adding the start URL to a URL dictionary (202);
extracting, via the one or more hardware processors, one or more successive URLs present in the start web page of the web application, using a code-based approach, and adding the one or more successive URLs to the URL dictionary (204);
identifying, via the one or more hardware processors, one or more unique URLs out of the one or more successive URLs and the start URL present in the URL dictionary (206);
assigning, via the one or more hardware processors, a unique ID to each of the one or more unique URLs (208);
passing, via the one or more hardware processors, a screenshot of the web page associated with each unique URL of the one or more unique URLs, to an artificial intelligence-based pipeline, to extract first application elements data associated with each unique URL, wherein the first application elements data comprises at least one of: (i) a label name for each of one or more first application elements, (ii) an element type for each of the one or more first application elements, (iii) an element default value for each of the one or more first application elements, and (iv) a clickable XY-coordinates for each of the one or more first application elements (210);
passing, via the one or more hardware processors, each unique URL of the one or more unique URLs, to a form element extractor, to extract second application elements data associated with each unique URL, wherein the second application elements data comprises at least one of: (i) the label name for each of one or more second application elements, (ii) the element type for each of the one or more second application elements, (iii) the element default value for each of the one or more second application elements, (iv) the clickable XY-coordinates for each of the one or more second application elements, (v) an element identification number (ID) for each of the one or more second application elements, (vi) master options for each of the one or more second application elements, and (vii) a command type for each of the one or more second application elements (212);
integrating, via the one or more hardware processors, the first application elements data and the second application elements data associated with each unique URL, to obtain a unique application elements data associated with each unique URL, wherein the unique application elements data associated with each unique URL comprises at least one of: (i) the label name for each of one or more unique application elements, (ii) the element type for each of the one or more unique application elements, (iii) the element default value for each of the one or more unique application elements, (iv) the clickable XY-coordinates for each of the one or more unique application elements, (v) the element identification number (ID) for each of the one or more unique application elements, (vi) the master options for each of the one or more unique application elements, and (vii) the command type for each of the one or more unique application elements (214);
extracting, via the one or more hardware processors, validation data for each of the one or more unique application elements present in each unique URL, using a validation data extraction model (216); and
generating, via the one or more hardware processors, (i) a smart data for each of the one or more unique application elements present in each unique URL, and (ii) one or more smart datasets for each unique URL, based on the validation data associated with each of the one or more unique application elements (218).

2. The method as claimed in claim 1, wherein passing the screenshot of the web page associated with each unique URL, to the artificial intelligence-based pipeline, to extract the first application elements data associated with each unique URL, further comprising:
detecting one or more region of interests (ROIs) associated with one or more first application elements present in the screenshot of the web page associated with each unique URL, using a trained ROI detection model;
detecting (i) a contour and (ii) a clickable XY-coordinates for each of the one or more ROIs, using a contour detection technique;
detecting the element type for each of the one or more ROIs, using a trained element type detection model; and
extracting (i) the label name and (ii) the element default value, for each of the one or more ROIs, using an optical character recognition model.
3. The method as claimed in claim 1, wherein extracting the validation data for each of one or more unique application elements present in each unique URL, using the validation data extraction model, further comprising:
generating one or more intents for each of the one or more unique application elements present in each unique URL, based on the label name associated with the unique application element, using an intent generation algorithm;
extracting an intent related information for each of the one or more unique application elements present in each unique URL, based on the one or more intents generated for the unique application element, using a parsing algorithm; and
extracting the validation data for each of the one or more unique application elements present in each unique URL, by classifying the intent related information associated with the unique application element, using a text classification model.

4. The method as claimed in claim 1, wherein generating the smart data for each of the one or more unique application elements present in each unique URL, further comprising at least one of:
generating the smart data for each unique application element, based on a custom input provided for the unique application element; and
generating the smart data for each unique application element, based on the element type associated with the unique application element.
5. The method as claimed in claim 1, wherein the one or more smart datasets for each unique URL, is generated based on possible combinations of the one or more unique application elements present in each unique URL.
6. The method as claimed in claim 1, further comprising performing an exception handling while navigating from one unique URL to another unique URL of the one or more unique URLs present in the web application.
7. A system (100) for automated management of application elements and generation of a smart dataset for the application elements, the system (100) comprising:
a memory (102) storing instructions;
one or more Input/Output (I/O) interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more I/O interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive a start uniform resource locator (URL) associated with a start web page of a web application to be crawled and add the start URL to a URL dictionary;
extract one or more successive URLs present in the start web page of the web application, using a code-based approach, and add the one or more successive URLs to the URL dictionary;
identify one or more unique URLs out of the one or more successive URLs and the start URL present in the URL dictionary;
assign a unique ID to each of the one or more unique URLs;
pass a screenshot of the web page associated with each unique URL of the one or more unique URLs, to an artificial intelligence-based pipeline, to extract first application elements data associated with each unique URL, wherein the first application elements data comprises at least one of: (i) a label name for each of one or more first application elements, (ii) an element type for each of the one or more first application elements, (iii) an element default value for each of the one or more first application elements, and (iv) clickable XY-coordinates for each of the one or more first application elements;
pass each unique URL of the one or more unique URLs, to a form element extractor, to extract second application elements data associated with each unique URL, wherein the second application elements data comprises at least one of: (i) the label name for each of one or more second application elements, (ii) the element type for each of the one or more second application elements, (iii) the element default value for each of the one or more second application elements, (iv) the clickable XY-coordinates for each of the one or more second application elements, (v) an element identification number (ID) for each of the one or more second application elements, (vi) master options for each of the one or more second application elements, and (vii) a command type for each of the one or more second application elements;
integrate the first application elements data and the second application elements data associated with each unique URL, to obtain a unique application elements data associated with each unique URL, wherein the unique application elements data associated with each unique URL comprises at least one of: (i) the label name for each of one or more unique application elements, (ii) the element type for each of the one or more unique application elements, (iii) the element default value for each of the one or more unique application elements, (iv) the clickable XY-coordinates for each of the one or more unique application elements, (v) the element identification number (ID) for each of the one or more unique application elements, (vi) the master options for each of the one or more unique application elements, and (vii) the command type for each of the one or more unique application elements;
extract validation data for each of the one or more unique application elements present in each unique URL, using a validation data extraction model; and
generate (i) smart data for each of the one or more unique application elements present in each unique URL, and (ii) one or more smart datasets for each unique URL, based on the validation data associated with each of the one or more unique application elements.
8. The system as claimed in claim 7, wherein the one or more hardware processors (104) are configured to pass the screenshot of the web page associated with each unique URL, to the artificial intelligence-based pipeline, to extract the first application elements data associated with each unique URL, by:
detecting one or more regions of interest (ROIs) associated with one or more first application elements present in the screenshot of the web page associated with each unique URL, using a trained ROI detection model;
detecting (i) a contour and (ii) clickable XY-coordinates for each of the one or more ROIs, using a contour detection technique;
detecting the element type for each of the one or more ROIs, using a trained element type detection model; and
extracting (i) the label name and (ii) the element default value, for each of the one or more ROIs, using an optical character recognition model.
9. The system as claimed in claim 7, wherein the one or more hardware processors (104) are configured to extract the validation data for each of one or more unique application elements present in each unique URL, using the validation data extraction model, by:
generating one or more intents for each of the one or more unique application elements present in each unique URL, based on the label name associated with the unique application element, using an intent generation algorithm;
extracting intent-related information for each of the one or more unique application elements present in each unique URL, based on the one or more intents generated for the unique application element, using a parsing algorithm; and
extracting the validation data for each of the one or more unique application elements present in each unique URL, by classifying the intent-related information associated with the unique application element, using a text classification model.
10. The system as claimed in claim 7, wherein the one or more hardware processors (104) are configured to generate the smart data for each of the one or more unique application elements present in each unique URL, by at least one of:
generating the smart data for each unique application element, based on a custom input provided for the unique application element; and
generating the smart data for each unique application element, based on the element type associated with the unique application element.
11. The system as claimed in claim 7, wherein the one or more hardware processors (104) are configured to generate the one or more smart datasets for each unique URL, based on possible combinations of the one or more unique application elements present in each unique URL.
12. The system as claimed in claim 7, wherein the one or more hardware processors (104) are further configured to perform exception handling while navigating from one unique URL to another unique URL of the one or more unique URLs present in the web application.
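
The integration step recited in claim 7 (combining the first application elements data from the screenshot pipeline with the second application elements data from the form element extractor) can be illustrated with a minimal sketch. The claims do not state how records are matched, so matching on a normalised label name is an assumption here, and the names `ElementRecord`, `label_key` and `integrate` are hypothetical and not taken from the specification.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ElementRecord:
    """Hypothetical record holding the seven fields listed in claim 7."""
    label: str
    element_type: Optional[str] = None
    default_value: Optional[str] = None
    xy: Optional[Tuple[int, int]] = None
    element_id: Optional[str] = None
    master_options: List[str] = field(default_factory=list)
    command_type: Optional[str] = None


def label_key(label: str) -> str:
    # Assumption: two records describe the same element when their labels
    # match after lower-casing and collapsing whitespace.
    return " ".join(label.lower().split())


def integrate(first: List[ElementRecord],
              second: List[ElementRecord]) -> List[ElementRecord]:
    """Merge screenshot-derived (first) and form-derived (second) records."""
    merged: Dict[str, ElementRecord] = {}
    # Form-derived records carry the most fields, so they seed the result.
    for record in second:
        merged[label_key(record.label)] = copy.deepcopy(record)
    for record in first:
        key = label_key(record.label)
        if key not in merged:
            # Element seen only in the screenshot: keep it as-is.
            merged[key] = copy.deepcopy(record)
            continue
        target = merged[key]
        # Fill only the fields the form-derived record left empty.
        for name in ("element_type", "default_value", "xy"):
            if getattr(target, name) is None:
                setattr(target, name, getattr(record, name))
    return list(merged.values())


screenshot_elements = [
    ElementRecord("First Name", "textbox", xy=(120, 80)),
    ElementRecord("Submit", "button", xy=(200, 300)),
]
form_elements = [
    ElementRecord("first name", "textbox", element_id="fname",
                  command_type="type"),
]
for record in integrate(screenshot_elements, form_elements):
    print(record)
```

In this sketch the form-derived record wins on conflicts and the screenshot-derived record only fills gaps; the opposite priority would be equally consistent with the claim wording.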

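Claims 4, 5, 10 and 11 describe generating smart data from a custom input or from the element type, and building smart datasets from possible combinations. The sketch below is one possible reading, under stated assumptions: "possible combinations" is interpreted as the Cartesian product of candidate values per element, the per-type candidate values are invented placeholders, and the validation data of claim 7 is not modelled. The names `CANDIDATES_BY_TYPE`, `smart_data` and `smart_datasets` are hypothetical.

```python
from itertools import product
from typing import Any, Dict, List, Optional

# Assumption: placeholder candidate values per element type. The
# specification does not list concrete values.
CANDIDATES_BY_TYPE: Dict[str, List[Any]] = {
    "textbox": ["sample text", ""],
    "checkbox": [True, False],
}


def smart_data(element: Dict[str, Any],
               custom_inputs: Optional[Dict[str, List[Any]]] = None) -> List[Any]:
    """Return candidate input values for a single application element."""
    custom_inputs = custom_inputs or {}
    label = element["label"]
    # A custom input for the element takes priority (claims 4 and 10).
    if label in custom_inputs:
        return list(custom_inputs[label])
    # Elements with master options (for example a dropdown) reuse them.
    if element.get("master_options"):
        return list(element["master_options"])
    # Otherwise fall back to values chosen by element type.
    return list(CANDIDATES_BY_TYPE.get(element["element_type"], [""]))


def smart_datasets(elements: List[Dict[str, Any]],
                   custom_inputs: Optional[Dict[str, List[Any]]] = None
                   ) -> List[Dict[str, Any]]:
    """Build one dataset per combination of candidate values on a page."""
    labels = [element["label"] for element in elements]
    value_lists = [smart_data(element, custom_inputs) for element in elements]
    return [dict(zip(labels, combo)) for combo in product(*value_lists)]


form = [
    {"label": "Name", "element_type": "textbox"},
    {"label": "Subscribe", "element_type": "checkbox"},
    {"label": "Country", "element_type": "dropdown",
     "master_options": ["India", "Japan", "Kenya"]},
]
for dataset in smart_datasets(form, {"Name": ["Asha"]}):
    print(dataset)
```

A full Cartesian product grows quickly with the number of elements, so a practical system would likely prune or sample combinations; the claims do not say which strategy is used.
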
Documents

Application Documents

# Name Date
1 202123044013-STATEMENT OF UNDERTAKING (FORM 3) [28-09-2021(online)].pdf 2021-09-28
2 202123044013-REQUEST FOR EXAMINATION (FORM-18) [28-09-2021(online)].pdf 2021-09-28
3 202123044013-PROOF OF RIGHT [28-09-2021(online)].pdf 2021-09-28
4 202123044013-FORM 18 [28-09-2021(online)].pdf 2021-09-28
5 202123044013-FORM 1 [28-09-2021(online)].pdf 2021-09-28
6 202123044013-FIGURE OF ABSTRACT [28-09-2021(online)].jpg 2021-09-28
7 202123044013-DRAWINGS [28-09-2021(online)].pdf 2021-09-28
8 202123044013-DECLARATION OF INVENTORSHIP (FORM 5) [28-09-2021(online)].pdf 2021-09-28
9 202123044013-COMPLETE SPECIFICATION [28-09-2021(online)].pdf 2021-09-28
10 202123044013-FORM-26 [21-10-2021(online)].pdf 2021-10-21
11 Abstract1.jpg 2022-02-14
12 202123044013-FER.pdf 2023-10-13
13 202123044013-FER_SER_REPLY [23-02-2024(online)].pdf 2024-02-23
14 202123044013-CLAIMS [23-02-2024(online)].pdf 2024-02-23

Search Strategy

1 SEARCH_STRATEGY_121023E_12-10-2023.pdf