
Populating A Graphical User Interface Based Application Using A Guided Natural Language Input

Abstract: The disclosed system (110) and method (400) facilitate populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input. The method (400) includes parsing (402) elements of the GUI based application being executed on the computing device (108) and determining (404) properties of the parsed elements of the GUI based application. Further, the method (400) includes receiving (406) the NL input corresponding to at least one of the parsed elements and extracting (408) parameters from the received NL input. The method (400) includes associating (410) the extracted parameters with at least one of the parsed elements of the GUI based application. Responsive to a successful association, the method (400) includes converting (412) the extracted parameters to a text input. Subsequently, the elements of the GUI based application are populated with the text input.


Patent Information

Application #: 202511003529
Filing Date: 15 January 2025
Publication Number: 14/2025
Publication Type: INA
Invention Field: COMPUTER SCIENCE

Applicants

Newgen Software Technologies Limited
E- 44/13, Okhla Phase-2, New Delhi-110020

Inventors

1. Puja Lal
House No. 9M, Ruby M Tower, Olympia Opaline Sequel, Navalur, OMR, Chennai- 603103
2. Sanjay Pandey
House No - 703, Tower - i, Supertech Ecociti, Sector 137, Noida, U. P.- 201304
3. Lal Chandra
H. No. 525, Sector 30, Faridabad, Haryana- 121003
4. Sonia Wadhwa
E-1701, R G Residency, Sector 120, Noida, U.P – 201301
5. Swapnil Pandey
T2-701, Ace divino, Greater Noida West, Sector-1, Greater Noida-201306

Specification

Description: POPULATING A GRAPHICAL USER INTERFACE BASED APPLICATION USING A GUIDED NATURAL LANGUAGE INPUT

FIELD OF THE INVENTION
[0001] The embodiments of the present disclosure generally relate to a field of voice-based communications with computing devices, and specifically to a system and a method for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input.

BACKGROUND OF THE INVENTION
[0002] The following description of the related art is intended to provide background information pertaining to the field of the disclosure. This section may include certain aspects of the art that may be related to various features of the present disclosure.

[0003] As the World Wide Web (WWW) continues to evolve, users generally interact with forms available on computing devices, through various mechanisms, to submit and receive information. Several existing solutions address the challenges of form filling on the computing devices. One such solution is an auto-fill feature, which automatically populates form fields with previously entered data or user profile information. While convenient, this approach relies on pre-existing data and may not work well with more complex or dynamic forms.

[0004] Another approach involves using third-party form filling applications for mobile devices that offer features like data synchronization across devices along with advanced algorithms for form completion. However, the need for the users to install and manage additional software limits their widespread adoption.

[0005] Another solution is a responsive form design, where websites and applications optimize form layouts for mobile devices by adjusting elements for smaller screens. Although this solution improves usability, it does not fully resolve challenges of manual input and navigation.

[0006] While the above-mentioned solutions provide some improvement for form filling, they generally fall short of delivering a seamless and efficient form-filling experience on mobile devices, especially for lengthy or complex forms.

[0007] There is, therefore, a need in the art for an improved system and method to efficiently populate a Graphical User Interface (GUI) based form using a voice-based input.

OBJECTS OF THE INVENTION
[0008] Some of the objects of the present disclosure, which at least one embodiment herein satisfies are listed herein below.
[0009] It is an object of the present disclosure to provide a system and a method for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input.
[0010] It is an object of the present disclosure to provide a system and a method for performing semantic analysis on voice input.
[0011] It is an object of the present disclosure to provide a system and a method that offers a voice-guided conversational interface for more intuitive and user-friendly form filling.
[0012] It is an object of the present disclosure to provide a system and a method that uses voice input, which is typically faster and more convenient than typing, thus enabling quick and efficient form filling.
[0013] It is an object of the present disclosure to provide a system and a method that uses an intelligent form parsing engine for minimizing typographical errors and inaccuracies by accurately interpreting spoken input and matching it to correct form fields.
[0014] Yet another object of the present disclosure is to provide a system and a method that adjusts its responses based on user input and context, thus offering personalized guidance and feedback during the form filling.

SUMMARY OF THE INVENTION
[0015] In an aspect, the present disclosure relates to a method for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input. The method may include parsing one or more elements of the GUI based application that is being executed on a computing device. Further, the method may include determining one or more properties of the parsed one or more elements of the GUI based application. The method may include receiving the NL input corresponding to at least one of the parsed one or more elements. Furthermore, the method may include extracting one or more parameters from the received NL input and associating the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application. Responsive to a successful association, the method may include converting the extracted one or more parameters to a text input. The text input may be populated in the corresponding one or more elements of the GUI based application.
[0016] In an embodiment, the one or more elements of the GUI based application may include a layout of the GUI based application, a count of the one or more elements, and a label corresponding to each of the one or more elements.
[0017] In an embodiment, the determined one or more properties of the parsed one or more elements of the GUI based application may correspond to an input type that is acceptable for entry in the one or more elements. The input type may be at least one of an alphabet type, a numeric type and an alpha-numeric type.
[0018] In an embodiment, the method may include receiving the NL input via a spoken utterance.
[0019] In an embodiment, the method may include displaying the text input populated in the corresponding one or more elements of the GUI based application on a display screen of the computing device.
[0020] In an embodiment, the method may include determining a relationship between the one or more parameters of the NL input and the parsed one or more elements of the GUI based application based on semantic analysis.
[0021] In an embodiment, the method may perform the semantic analysis that includes word segmentation, syntactic analysis, named entity recognition and keyword extraction.
[0022] In an embodiment, the syntactic analysis may be performed using multiple parsers.
[0023] In an embodiment, the GUI based application may be a web based graphical form.
[0024] In an embodiment, the method may include identifying at least one unpopulated element of the GUI based application and generating a response comprising a prompt to receive the NL input corresponding to the at least one unpopulated element, where the response is either a verbal response or a written response.
[0025] In an aspect, the present disclosure relates to a system for populating a GUI based application using a guided NL input. The system may include one or more processors associated with a computing device, a memory operatively coupled to the one or more processors, wherein the memory comprises processor-executable instructions, which on execution, cause the one or more processors to parse one or more elements of the GUI based application being executed on the computing device. The one or more processors may be configured to determine one or more properties of the parsed one or more elements of the GUI based application. Further, the one or more processors may receive the NL input corresponding to at least one of the parsed one or more elements. Furthermore, the one or more processors may be configured to extract one or more parameters from the received NL input and associate the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application. Responsive to a successful association, the one or more processors may be configured to convert the extracted one or more parameters to a text input. The text input may be populated in the corresponding one or more elements of the GUI based application.
[0026] In an aspect, the present disclosure relates to a non-transitory computer-readable medium comprising processor-executable instructions that may cause a processor to parse one or more elements of a GUI based application being executed on a computing device. The one or more processors may determine one or more properties of the parsed one or more elements of the GUI based application. Further, the one or more processors may receive the NL input corresponding to at least one of the parsed one or more elements and extract one or more parameters from the received NL input. Furthermore, the one or more processors may associate the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application. In response to a successful association, the one or more processors may convert the extracted one or more parameters to a text input. The text input may be populated in the corresponding one or more elements of the GUI based application.

BRIEF DESCRIPTION OF DRAWINGS
[0027] The accompanying drawings, which are incorporated herein and constitute a part of this disclosure, illustrate exemplary embodiments of the disclosed methods and systems, in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure.

[0028] FIG. 1 illustrates an exemplary block diagram representation of a network architecture implementing a proposed system for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input, in accordance with an embodiment of the present disclosure.
[0029] FIG. 2 illustrates exemplary functional units of the proposed system, in accordance with an embodiment of the present disclosure.
[0030] FIG. 3A illustrates an exemplary representation of a blank GUI based application running on a computing device and awaiting instructions from a user to enable activation of the device’s microphone, in accordance with an embodiment of the present disclosure.
[0031] FIG. 3B illustrates an exemplary representation of a populated GUI based application based on a guided NL input, in accordance with an embodiment of the present disclosure.
[0032] FIG. 4 is a flow diagram depicting a proposed method for populating the GUI based application using the guided NL input, in accordance with an embodiment of the present disclosure.
[0033] FIG. 5 illustrates an exemplary computer system in which or with which embodiments of the present disclosure may be utilized in accordance with embodiments of the present disclosure.
[0034] The foregoing shall be more apparent from the following more detailed description of the disclosure.

DETAILED DESCRIPTION OF THE INVENTION

[0035] In the following description, for the purposes of explanation, various specific details are set forth in order to provide a thorough understanding of embodiments of the present disclosure. It will be apparent, however, that embodiments of the present disclosure may be practiced without these specific details. Several features described hereafter can each be used independently of one another or with any combination of other features. An individual feature may not address all of the problems discussed above or might address only some of the problems discussed above. Some of the problems discussed above might not be fully addressed by any of the features described herein.
[0036] The ensuing description provides exemplary embodiments only and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the disclosure as set forth.
[0037] Also, it is noted that individual embodiments may be described as a process that is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.
[0038] The word “exemplary” and/or “demonstrative” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “exemplary” and/or “demonstrative” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as an open transition word without precluding any additional or other elements.
[0039] Reference throughout this specification to “one embodiment” or “an embodiment” or “an instance” or “one instance” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
[0040] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
[0041] The primary objective of this disclosure is to address the time-consuming and cumbersome nature of form filling on mobile devices. Typically, traditional approaches require users to manually type on the small screens of electronic devices to fill a form, which wastes time. In addition, navigating a long form makes the form-filling process more difficult, often causing the users to abandon their tasks. The disclosure simplifies the form-filling process by introducing a voice-guided form-filling system. By utilizing voice-based technology, the users may speak the field names present on the form along with their corresponding values, thus removing the need for typing in the forms and making navigation in the form smoother through conversational interaction. The disclosed system and method enhance the usability of mobile form filling through voice commands, thereby improving user experience and productivity.
[0042] Various embodiments of the present disclosure will be explained in detail with reference to FIGS. 1-5.
[0043] FIG. 1 illustrates an exemplary block diagram representation of a network architecture 100 implementing a proposed system 110 for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input, according to embodiments of the present disclosure. The network architecture 100 may include the system 110, a computing device 108, a centralized server 118, and a decentralized database 120. The system 110 may be communicatively connected to the centralized server 118, and the decentralized database (or node(s)) 120, via a communication network 106. The centralized server 118 may include, but is not limited to, a stand-alone server, a remote server, a cloud computing server, a dedicated server, a rack server, a server blade, a server rack, a bank of servers, a server farm, hardware supporting a part of a cloud service or system, a home server, hardware running a virtualized server, one or more processors executing code to function as a server, one or more machines performing server-side functionality as described herein, at least a portion of any of the above, some combination thereof, and the like. The communication network 106 may be a wired communication network or a wireless communication network. The wireless communication network may be any wireless communication network capable of transferring data between entities of that network such as, but not limited to, a carrier network including a circuit-switched network, a public switched network, a Content Delivery Network (CDN) network, a Long-Term Evolution (LTE) network, a New Radio (NR) network, a Global System for Mobile Communications (GSM) network and a Universal Mobile Telecommunications System (UMTS) network, the Internet, intranets, Local Area Networks (LANs), Wide Area Networks (WANs), mobile communication networks, combinations thereof, and the like.
[0044] The system 110 may be implemented by way of a single device or a combination of multiple devices that may be operatively connected or networked together. For example, the system 110 may be implemented by way of a standalone device such as the centralized server 118 (and/or a decentralized server or node(s)), and the like, and may be communicatively coupled to the computing device 108. In another example, the system 110 may be implemented in/associated with the computing device 108. In yet another example, the system 110 may be implemented in/associated with respective electronic devices 104-1, 104-2, …, 104-N (individually referred to as electronic device 104, and collectively referred to as electronic devices 104), associated with one or more users 102-1, 102-2, …, 102-N (individually referred to as the user 102, and collectively referred to as the users 102). In such a scenario, the system 110 may be replicated in each of the electronic devices 104. The users 102 may be users of, but are not limited to, an electronic commerce (e-commerce) platform, a hyperlocal platform, a super-mart platform, a media platform, a service providing platform, a social networking platform, a messaging platform, a bot processing platform, a virtual assistance platform, an Artificial Intelligence (AI) based platform, and the like. In some instances, the user 102 may include an entity, an administrator, or a speaker who is in conversation with the electronic device 104. The computing device 108 may be at least one of an electrical, an electronic, and an electromechanical device. The computing device 108 may include, but is not limited to, a mobile device, a smartphone, a Personal Digital Assistant (PDA), a tablet computer, a phablet computer, a wearable device, a Virtual Reality/Augmented Reality (VR/AR) device, a laptop, a desktop, a server, and the like. The system 110 may be implemented in hardware or a suitable combination of hardware and software. The system 110 or the centralized server 118 or the decentralized database 120 may be associated with entities (not shown). The entities may include, but are not limited to, an e-commerce company, a company, an outlet, a manufacturing unit, an enterprise, a facility, an organization, an educational institution, a secured facility, and the like.
[0045] Further, the system 110 may include a processor 112, an Input/Output (I/O) interface 114, and a memory 116. The Input/Output (I/O) interface 114 of the system 110 may be used to receive user inputs from the one or more electronic devices 104 associated with the one or more users 102. The processor 112 may be operatively coupled to the computing device 108. The processor 112 may be coupled with the memory 116. The memory 116 may store one or more instructions that are executable by the processor 112 to populate the form using the voice input.
[0046] In an embodiment, the system 110 may populate the GUI based application using the guided NL input. The GUI based application may be a web based graphical form (also, referred to hereinafter as a web form with graphical content or a form). As may be noted, in some scenarios, the system 110 may function as a browser for filling input fields of the form, in response to the received guided NL input via a spoken utterance (also, referred to hereinafter as speech, spoken input, speech input or speech utterance). The guided NL input may be provided by the user 102 through a microphone of the electronic device 104. The guided NL input may be a spoken language that is distinct from constructed or formal languages, such as those used in computer programming or logical analysis. As may be appreciated, the speech input may be provided in any language, and may be processed by the Natural Language Processing (NLP) model accordingly. The NLP model is designed to recognize, interpret, and analyze the speech input, regardless of the language of the speech input. The NLP model leverages advanced linguistic models to break down structure and meaning of the speech input, converting the speech input into text and extracting relevant information. This allows the system 110 to handle multilingual inputs, making it adaptable for global users, whether they are interacting in English, Spanish, Mandarin, or any other language. Further, the NLP model is equipped to manage different accents, dialects, and variations in pronunciation, ensuring accurate processing and comprehension of the received speech input.
[0047] The system 110 may parse one or more elements of the form. The one or more elements of the form may include a layout of the form, a count of one or more input fields, and a label corresponding to each of the one or more input fields. The layout of the form may be, for example, a single page form, a multi-page form, an expandable form, a folded form and the like. The one or more input fields may be, for example, a name input field for the user 102 to input his name, a phone number input field for entering his phone number, an address input field for entering his address, an email input field for entering his email address, and the like. The input fields may represent, for example, text entry fields for which the user 102 may provide relevant input via the speech input. In addition, each of the exemplary input fields may have a corresponding label. For example, the label for the name input field may be ‘Name’, the label for the phone number input field may be ‘Phone Number’, and so on.
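As a concrete illustration of this parsing step, the short sketch below recovers the input fields and their labels from a web form. It assumes the form is plain HTML and uses the BeautifulSoup library; the disclosure does not prescribe any particular parsing library, and the form markup is hypothetical.

```python
# Illustrative only: parsing a hypothetical HTML form with BeautifulSoup
# to recover the count of input fields and the label for each field.
from bs4 import BeautifulSoup

html = """
<form>
  <label for="name">Name</label><input id="name" type="text">
  <label for="phone">Phone Number</label><input id="phone" type="tel">
  <label for="address">Address</label><input id="address" type="text">
</form>
"""

soup = BeautifulSoup(html, "html.parser")
fields = soup.find_all("input")

print("count of input fields:", len(fields))        # 3
for field in fields:
    label = soup.find("label", attrs={"for": field["id"]})
    print(field["id"], "->", label.get_text() if label else "<no label>")
```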
[0048] The system 110 may determine one or more properties of the parsed one or more elements of the form and receive the speech input corresponding to at least one of the parsed one or more elements. The determined one or more properties of the parsed one or more elements of the GUI based application may correspond to an input type that is acceptable for entry in the one or more elements. The input type may be at least one of an alphabet type, a numeric type and an alpha-numeric type. In continuation of the previous example, the property of the input field ‘name’ is the alphabet type, the property of the input field ‘phone number’ is the numeric type, the property of the input field ‘address’ is the alpha-numeric type, and the like.
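One plausible way to encode these properties is a mapping from each parsed field to its acceptable input type, checked before any value is populated. The sketch below is a minimal stand-in for the disclosed determination step; the field names and rules are illustrative.

```python
# Minimal sketch: each parsed field maps to the input type it accepts
# (alphabet, numeric, or alpha-numeric), and candidate values are checked
# against that type before population. Mapping is illustrative.
FIELD_TYPE = {"name": "alphabet", "phone number": "numeric", "address": "alpha-numeric"}

def accepts(field: str, value: str) -> bool:
    kind = FIELD_TYPE.get(field, "alpha-numeric")
    compact = value.replace(" ", "")
    if kind == "alphabet":
        return compact.isalpha()
    if kind == "numeric":
        return compact.isdigit()
    return compact.isalnum()

print(accepts("name", "James"))             # True
print(accepts("phone number", "James"))     # False: numeric field
print(accepts("address", "9M Ruby Tower"))  # True: alpha-numeric
```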
[0049] In an embodiment, the system 110 may extract one or more parameters from the received NL input and associate the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application. It may be noted that the one or more parameters are extracted from the NL input using the Natural Language Processing (NLP) model, which combines the power of computational linguistics with Machine Learning (ML) algorithms and deep learning. The NLP model uses two primary types of analysis: syntactic analysis and semantic analysis. The syntactic analysis focuses on determining the meaning of a word, phrase, or sentence by parsing its syntax and applying pre-defined grammatical rules. The semantic analysis, on the other hand, builds on the syntactic output to interpret the meaning of words within the context of the sentence.
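As one purely illustrative example of such extraction, the sketch below uses spaCy, one of many NLP toolkits that combine syntactic and semantic analysis; the disclosure does not name a specific library or model.

```python
# Illustrative parameter extraction with spaCy (library and model are
# assumptions; any comparable NLP toolkit could be substituted).
import spacy

nlp = spacy.load("en_core_web_sm")  # install: python -m spacy download en_core_web_sm
doc = nlp("My name is James and my phone number is 6373662778")

# Semantic side: named entities supply candidate parameter values.
print([(ent.text, ent.label_) for ent in doc.ents])     # e.g. [('James', 'PERSON'), ...]

# Syntactic side: noun tokens expose the field labels being dictated.
print([tok.text for tok in doc if tok.pos_ == "NOUN"])  # e.g. ['name', 'phone', 'number']
```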
[0050] In an embodiment, the system 110 may determine a relationship between the one or more parameters of the NL input and the parsed one or more elements of the GUI based application based on the semantic analysis. The semantic analysis may include word segmentation, the syntactic analysis, named entity recognition and keyword extraction. In addition, the syntactic analysis may be performed using multiple parsers.
[0051] To perform the syntactic and the semantic analysis, word parsing may be performed. Dependency parsing, which is a form of word parsing, focuses on the relationships between words and identifies elements like nouns and verbs. On the other hand, constituency parsing creates a parse tree (or syntax tree), which is a rooted and ordered representation of a sentence’s syntactic structure. These parse trees form a foundation for language translation and speech recognition mechanisms.
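The dependency relationships described above can be made concrete with the same toolkit; the sketch below prints each token's grammatical relation to its head, again as an illustration rather than the disclosed parser.

```python
# Illustrative dependency parse: each token attaches to its syntactic head,
# exposing the noun/verb relationships mentioned above (spaCy assumed).
import spacy

nlp = spacy.load("en_core_web_sm")
for tok in nlp("My name is James"):
    print(f"{tok.text:>6} --{tok.dep_}--> {tok.head.text}")
# 'name' is typically the nominal subject (nsubj) of 'is',
# and 'James' its attribute (attr).
```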
[0052] Further, the speech recognition mechanism or a speech-to-text mechanism involves accurately converting a spoken language into written text. The complexity of the speech recognition mechanism arises from the natural way in which people speak: rapidly, blending words together, and using different emphases and intonations.
[0053] Further, responsive to a successful association, the system 110 may convert the extracted one or more parameters to a text input. The text input may be populated in the corresponding one or more elements of the GUI based application. In addition, the text input populated in the corresponding one or more elements of the GUI based application may be displayed on a display screen of the computing device 108.
[0054] In further continuation of the above-mentioned example, the NLP model may be used to extract the one or more parameters from the speech input received from the user 102. For example, from the received speech input ‘My name is James’, the NLP model may extract the relevant parameters ‘name’ and ‘James’ and successfully associate them with the relevant element, i.e., the input field ‘name’. On successful association, the parameter ‘James’ may be converted from a speech form into a text form. The parameter ‘James’ is then populated and displayed in the input field ‘name’ of the form as text.
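A minimal sketch of this association step follows. The pattern matching is hypothetical and merely stands in for the NLP model's association logic described above.

```python
# Hypothetical association sketch: an extracted label is matched against
# the parsed field labels; on success the value becomes the text input.
import re

form = {"name": "", "phone number": "", "address": ""}

def associate(utterance: str):
    for label in form:
        m = re.search(rf"\b{label}\b(?:\s+is)?[:\s]+(.+)", utterance, re.I)
        if m:
            return label, m.group(1).strip()
    return None

result = associate("My name is James")
if result:                      # successful association
    label, value = result
    form[label] = value         # spoken parameter populated as text
print(form)                     # {'name': 'James', 'phone number': '', ...}
```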
[0055] In some implementations, the system 110 may include data and modules. As an example, the data may be stored in the memory 116 configured in the system 110. In an embodiment, the data may be stored in the memory in the form of various data structures. Additionally, the data may be organized using data models, such as relational or hierarchical data models.
[0056] In an embodiment, the data stored in the memory 116 may be processed by the modules of the system 110. The modules may be stored within the memory. In an example, the modules, communicatively coupled to the processor configured in the system, may also be present outside the memory and be implemented as hardware. As used herein, the term modules refers to an Application-Specific Integrated Circuit (ASIC), an electronic circuit, a processor (shared, dedicated, or group) and the memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that provide the described functionality.
[0057] Further, the system 110 may also include other units such as a display unit, an input unit, an output unit, and the like; however, the same are not shown in FIG. 1 for the purpose of clarity. Also, only a few units are shown in FIG. 1; however, the system 110 or the network architecture 100 may include multiple such units, or any number of such units as would be obvious to a person skilled in the art or as required to implement the features of the present disclosure. The system 110 may be a hardware device including the processor 112 executing machine-readable program instructions to populate the GUI based application using the guided NL input.
[0058] Execution of the machine-readable program instructions by the processor 112 may enable the system 110 to receive the NL input and populate the form. The “hardware” may comprise a combination of discrete components, an integrated circuit, an application-specific integrated circuit, a field-programmable gate array, a digital signal processor, or other suitable hardware. The “software” may comprise one or more objects, agents, threads, lines of code, subroutines, separate software applications, two or more lines of code, or other suitable software structures operating in one or more software applications or on one or more processors. The processor 112 may include, for example, but is not limited to, microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuits, and any devices that manipulate data or signals based on operational instructions, and the like. Among other capabilities, the processor 112 may fetch and execute computer-readable instructions in the memory 116 operationally coupled with the system 110 for performing tasks such as data processing, input/output processing, and/or any other functions. Any reference to a task in the present disclosure may refer to an operation that is being performed or that may be performed on data.
[0059] FIG. 2 illustrates, at 200, exemplary functional units of the proposed system 110, in accordance with an exemplary embodiment of the present disclosure. The system 110 may include the one or more processor(s) 112. The one or more processor(s) 112 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, logic circuitries, and/or any devices that manipulate data based on operational instructions. Among other capabilities, the one or more processor(s) 112 are configured to fetch and execute computer-readable instructions stored in a memory 204. The memory 204 may store one or more computer-readable instructions or routines, which may be fetched and executed to create or share the data units over a network service. The memory 204 may include any non-transitory storage device including, for example, volatile memory such as RAM, or non-volatile memory such as EPROM, flash memory, and the like.
[0060] In an embodiment, the system 110 may also include an interface(s) 114. The interface(s) 114 may include a variety of interfaces, for example, interfaces for data input and output devices, referred to as I/O devices, storage devices, and the like. The interface(s) 114 may facilitate communication with various other devices coupled to the one or more processor(s) 112. The interface(s) 114 may also provide a communication pathway for one or more components of the one or more processor(s) 112. Examples of such components include, but are not limited to, processing engine(s) 208 and database 210.
[0061] In an embodiment, the processing engine(s) 208 may be implemented as a combination of hardware and programming (for example, programmable instructions) to implement one or more functionalities of the processing engine(s) 208. In examples described herein, such combinations of hardware and programming may be implemented in several different ways. For example, the programming for the processing engine(s) 208 may be processor executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the processing engine(s) 208 may include a processing resource (for example, one or more processors), to execute such instructions.
[0062] In the present examples, the machine-readable storage medium may store instructions that, when executed by the processing resource, implement the processing engine(s) 208. In such examples, the processor(s) 112 may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separate but accessible to the system 110 and the processing resource. In other examples, the processing engine(s) 208 may be implemented by electronic circuitry. The database 210 may include data that is either stored or generated as a result of functionalities implemented by any of the components of the processing engine(s) 208. In an embodiment, the processing engine(s) 208 may include a parsing unit 212, a determination unit 214, a receiving unit 216, an extracting unit 218, an associating unit 220, a converting unit 222, and other unit(s) 224. The other unit(s) 224 may implement functionalities that supplement applications/functions performed by the system 110. In another embodiment, the system 110, through the other unit(s) 224, may manage populating the form using the speech input.
[0063] The parsing unit 212 may parse one or more elements of the form that may be executed on the computing device 108. The one or more elements of the form may include a layout of the form, a count of one or more input fields and a label corresponding to each of the one or more input fields.
[0064] The determination unit 214 may determine one or more properties of the parsed one or more elements of the form. The determined one or more properties of the parsed one or more elements of the form may correspond to an input type that is acceptable for entry in the one or more elements. The input type may be at least one of an alphabet type, a numeric type and an alpha-numeric type.
[0065] The receiving unit 216 may receive the speech input corresponding to at least one of the parsed one or more elements. As may be appreciated, the speech input may be received from the user 102 who may dictate into a telephone, handheld recorder, a short-range communication system, a microphone of the electronic device, or other device.
[0066] The extracting unit 218 may extract one or more parameters from the received speech input. It is to be noted that a speech recognition model may be used for processing the received speech input. The speech input may correspond to the text entry with respect to multiple fields present on the form. As may be appreciated, speech recognition is an interdisciplinary field that is focused on creating methods and technologies that facilitate recognizing and converting the speech input into the text input. The speech recognition model is also referred to as an automatic speech recognition (ASR) model, a computer speech recognition model, or a speech-to-text (STT) model. The speech recognition model provides high-quality speech recognition and may be configured to convert the received speech input with respect to the input fields to the relevant text entry. While using the speech recognition model, audio data may be derived from the user’s speech input and each of the spoken words present in the speech input may be transcribed into a corresponding word in a text format. This allows the user 102 to dictate field names and values present on the form, thus streamlining the form-filling process.
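By way of a hedged example, the sketch below captures microphone audio and transcribes it with the SpeechRecognition package; the disclosure does not tie the ASR/STT model to any particular engine, so this choice is an assumption.

```python
# Illustrative speech-to-text step using the SpeechRecognition package
# (an assumption; any ASR engine could stand in for it).
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:                  # device microphone
    recognizer.adjust_for_ambient_noise(source)  # calibrate for noise
    audio = recognizer.listen(source)

try:
    text = recognizer.recognize_google(audio)    # hosted ASR backend
    print("transcribed:", text)                  # e.g. "name is James"
except sr.UnknownValueError:
    print("unintelligible input; prompt the user to repeat")
```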
[0067] The associating unit 220 may associate the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application.
[0068] In response to a successful association, the converting unit 222 may convert the extracted one or more parameters to the text input, which may be populated in the corresponding one or more elements of the form. The form may then be displayed on the display screen of the computing device 108. As may be appreciated, the NLP model may be used to convert the speech input into the text input by using the speech-to-text mechanism. As an example and not by way of limitation, when the user input comprises the speech input, the speech input may be received at the ASR model. The ASR model may allow the user to dictate and have the speech input transcribed as written text or issue commands that are recognized as such by the system 110. In addition, output of the ASR model may be sent to a Natural Language Understanding (NLU) model. In some cases, the NLU model may perform a Named Entity Resolution (NER) to determine an intent, a slot, or a domain of the speech input. Each of the domains may have specially configured components to perform various steps of NLU operations. For example, each domain may be associated with a particular language model and/or grammar database, a particular set of intents/commands, and a particular personalized lexicon. Also, domain-specific gazetteers may include domain-indexed lexical information associated with a particular user and/or a device. For example, a user’s study-domain lexical information might include book titles, author names, and the like, whereas a user’s contact-list lexical information might include names of contacts.
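In their simplest possible form, the domain-indexed lexicons (gazetteers) mentioned above might look like the sketch below; the domains and entries are invented for illustration.

```python
# Hypothetical gazetteers: per-domain lexicons consulted during entity
# resolution, as described above. Domains and entries are illustrative.
GAZETTEERS = {
    "contact-list": {"james", "john smith"},
    "study": {"clean code", "design patterns"},
}

def resolve_domain(entity: str):
    """Return the first domain whose lexicon contains the entity, if any."""
    for domain, lexicon in GAZETTEERS.items():
        if entity.lower() in lexicon:
            return domain
    return None

print(resolve_domain("James"))   # 'contact-list'
```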
[0069] Further, the ASR model employs a Large Language Model (LLM) that is trained on vast amounts of text data and employs sophisticated algorithms to generate human-like text from the speech input. The LLM tackles syntactic ambiguity present in the received speech input by performing contextual understanding and evaluating statistical patterns to determine how a sentence is put together. Further, the LLM tackles semantic constraints by determining what the words in a sentence mean and how they fit together. This helps the LLM understand tricky sentences by figuring out which meaning makes the most sense. For example, if the speech input received for the form is, say, “designation is date Scientist”, the LLM may correct it to “Designation is Data Scientist,” thus enhancing transcription accuracy.
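A hedged sketch of such LLM-based correction is shown below. The disclosure does not identify a model or an API; the OpenAI client is used here only as one readily available example, and the model name is an assumption.

```python
# Illustrative LLM correction of an ASR transcript. The OpenAI client and
# model name are assumptions, not part of the disclosure.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def correct_transcript(transcript: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[
            {"role": "system",
             "content": "Fix likely speech-recognition errors in this "
                        "form-filling utterance. Return only the corrected text."},
            {"role": "user", "content": transcript},
        ],
    )
    return resp.choices[0].message.content

print(correct_transcript("designation is date Scientist"))
# expected: "Designation is Data Scientist"
```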
[0070] While performing the semantic analysis, the LLM analyzes the grammatical structure of sentences, including the arrangement of words, phrases, and clauses, to determine relationships between independent terms in a specific context. The process of semantic analysis begins by studying and analyzing the dictionary definitions and meanings of individual words (also referred to as lexical semantics) and by determining the relationship between words in a sentence, to provide a clear understanding of the context of the form. For example, ‘Blackberry is known for its sweet taste’ may directly refer to the fruit, but ‘I got a blackberry’ may refer to the fruit or a Blackberry product. As such, understanding the context of the form is vital in the semantic analysis and requires additional information to assign a correct meaning to the whole sentence or language.
[0071] In addition, the employed LLM may perform the syntactic analysis by drawing the exact or dictionary meaning from the text and checking the text for meaningfulness against the rules of formal grammar. For example, a phrase like “hot ice-cream” would be rejected by the semantic analyzer. The syntactic analysis, or parsing process, may be defined as a process of analyzing strings of symbols in the NL that conform to the rules of formal grammar.
[0072] Parsers may be used to implement the parsing process. A parser may be defined as a software component designed to take input data (for example, text) and produce a structural representation of that input after checking it for correct syntax as per formal grammar. The system 110 employs a text parsing algorithm that is powered by the LLM to conduct both the semantic analysis and the syntactic analysis of the corrected text.
[0073] A syntactic structure of the text is highly valuable while filling the form. The parser derives dictionary meanings of the words from the text and determines whether a sequence of tokens conforms to a specific grammar. By conducting the syntactic analysis, the parser examines individual words in the text and determines their structure using underlying grammar rules. In essence, the parser takes an input string along with a set of grammar rules and generates a parse tree based on the input string.
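For illustration, the toy grammar below drives NLTK's chart parser to produce exactly such a parse tree from an input string and a set of grammar rules; the grammar is invented and far smaller than anything a production parser would use.

```python
# Toy parse-tree construction with NLTK: an input string plus grammar
# rules yield a parse tree, as described above. Grammar is illustrative.
import nltk

grammar = nltk.CFG.fromstring("""
  S  -> NP VP
  NP -> 'name' | 'James'
  VP -> V NP
  V  -> 'is'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("name is James".split()):
    print(tree)   # (S (NP name) (VP (V is) (NP James)))
```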
[0074] The parser works in tandem with the NLP model to ensure that the corrected speech input is matched with the appropriate input fields, thus facilitating accurate data entry and mapping. In addition, the parser may interpret the speech input, match it with the appropriate input fields, and automatically populate the input fields. This helps to eliminate the need for manual navigation and input in the forms.
[0075] FIG. 3A illustrates an exemplary representation 300 of a blank GUI based application 302 running on the computing device 108 and awaiting instructions from the user 102 to enable activation of the device’s microphone, in accordance with an embodiment of the present disclosure. With reference to FIG. 3A, the blank form 302 may be presented to the user 102 on execution of the website on which the form 302 is hosted. By way of an example, the form 302 represents an account opening claimer form. It may be noted that the form 302 is not limited to the mentioned exemplary type and can have multiple different representations, such as a feedback form, an admission form, a withdrawal form, an invitation form, and the like. As is shown, initially, each of the input fields of the form 302 is blank. Further, to receive the user speech input, the computing device 108 generates a request, for example as a pop-up window, which is to be approved by the user 102 to activate the microphone of the computing device 108. Once the user 102 accepts the request, the microphone of the computing device 108 is activated and the blank form 302 is ready to receive the user speech input.
[0076] A form-filling process may begin when the user 102 activates a speech input feature within a form-filling application or on the website hosting the form, typically by selecting an option or enabling the microphone, and then provides the speech input. FIG. 3B illustrates an exemplary representation 320 of a populated GUI based application 322 based on the received NL input, in accordance with an embodiment of the present disclosure. As is illustrated, the user 102 provides the speech input with respect to the input fields of the form 322. The received speech input may be converted to text after syntactic and semantic validation. The text may then be filled in each of the respective input fields of the form 322.
[0077] By way of an example, the user 102 may provide verbal instructions, via the speech input, for filling the form. In the speech input, the user 102 may specify both the labels corresponding to the input fields and the values for the input fields, such as “Name: John Smith”, “Gender: Male”, “Date of Birth: 09/07/1988”, “Contact Number: 6373662778”, and the like. The system 110 may then process the speech input using a form parsing engine, which analyzes the structure of the form 322 and maps the received speech input to the appropriate input fields. Further, the system 110 may validate the accuracy of the speech input, prompting the user 102 for clarification if any errors or discrepancies are detected. In case no discrepancies are found, the system 110 may convert the received speech input into text and populate the text in the appropriate input field. After all required input fields are correctly filled, the system 110 may either automatically submit the form 322 or save the entered data, thus streamlining the entire form-filling process and reducing manual effort.
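A minimal sketch of this dictation format follows: each spoken “Label: value” segment is split, validated against a per-field rule, and either populated or flagged for clarification. The validation rules are illustrative stand-ins for the disclosed form parsing engine.

```python
# Hypothetical label/value dictation handling with per-field validation.
import re

VALIDATORS = {
    "Name": re.compile(r"^[A-Za-z ]+$"),
    "Date of Birth": re.compile(r"^\d{2}/\d{2}/\d{4}$"),
    "Contact Number": re.compile(r"^\d{10}$"),
}

def fill(utterances):
    form, unclear = {}, []
    for u in utterances:
        label, _, value = (part.strip() for part in u.partition(":"))
        if label in VALIDATORS and VALIDATORS[label].match(value):
            form[label] = value          # validated: populate as text
        else:
            unclear.append(label or u)   # prompt the user for clarification
    return form, unclear

form, unclear = fill(["Name: John Smith", "Date of Birth: 09/07/1988",
                      "Contact Number: 6373662778"])
print(form)      # all three fields validated and populated
print(unclear)   # []
```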
[0078] With respect to FIG. 3B, the form 322 has the input fields with associated titles. The titles as shown are, for example, Gender at 324, Name at 326, Date of Birth at 328, Contact Number at 330, ID Card Number at 332, ID Card Address at 324 and Mailing Address at 326. In addition, the form 322 has a submit option 328 for submitting the form, a save option 330 for saving the form, and a previous option 332 and a next option 334 for navigating the form 322 to a previous page or a subsequent page, respectively.
[0079] FIG. 4 is a flow diagram 400 depicting a proposed method for populating the GUI based application using the guided NL input, in accordance with an embodiment of the present disclosure. At step 402, the method includes parsing one or more elements of the GUI based application. The GUI based application may be executed on the computing device. The one or more elements include a layout of the GUI based application, a count of one or more input fields, and a label corresponding to each of the one or more input fields.
[0080] At step 404, the method determines one or more properties of the parsed one or more elements of the GUI based application. The one or more properties include an input type that is acceptable for entry in the one or more elements. The input type is at least one of an alphabet type, a numeric type and an alpha-numeric type.
[0081] The method, at step 406, receives the NL input corresponding to at least one of the parsed one or more elements and, at step 408, extracts one or more parameters from the received NL input. The method, at step 410, associates the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application. Further, a relationship is determined between the one or more parameters of the NL input and the parsed one or more elements of the GUI based application, based on semantic analysis. Responsive to a successful association, the method, at step 412, converts the extracted one or more parameters to the text input. The text input is populated in the corresponding one or more elements of the GUI based application.
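To tie the steps together, the sketch below walks a single utterance through a toy version of steps 402-412; every component is a deliberately simplified stand-in for the fuller mechanisms described above.

```python
# Toy end-to-end walk-through of steps 402-412 on an in-memory form.
import re

def parse_elements():                        # step 402: field labels
    return ["name", "phone number"]

def determine_properties(elements):          # step 404: acceptable input types
    return {"name": str.isalpha, "phone number": str.isdigit}

def extract(nl_input):                       # steps 406-408: label and value
    m = re.match(r"(?i)(?:my\s+)?(.+?)\s+is\s+(.+)", nl_input)
    return (m.group(1).lower(), m.group(2).strip()) if m else (None, None)

def populate(nl_input, form, checks):        # steps 410-412: associate, convert
    label, value = extract(nl_input)
    if label in form and checks[label](value.replace(" ", "")):
        form[label] = value                  # populated as text input
    return form

elements = parse_elements()
form = dict.fromkeys(elements, "")
checks = determine_properties(elements)
print(populate("My phone number is 6373662778", form, checks))
```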
[0082] Those skilled in the art would appreciate that embodiments of the present disclosure provide a voice recognition interface that enables the users to interact with forms through natural language commands. These commands allow the user to verbally dictate labels along with relevant values for the input fields. The disclosed system and method uses an intelligent form parsing engine to analyze the form’s structure and identify the individual input fields and their labels. On receiving the relevant speech input, the speech input is converted to text and the values are filled in the input fields of the form. By leveraging the NLP model, the system interprets the speech input and matches it to the corresponding input fields. This setup supports conversational interaction and guides the users through the form-filling process in a smooth and intuitive manner. The system also adapts its responses in real time based on user input and context, thus offering feedback and assistance resulting in improved usability and reduced errors.
[0083] FIG. 5 illustrates an exemplary computer system 500 in which or with which embodiments of the present disclosure may be implemented. As shown in FIG. 5, the computer system 500 may include an external storage device 510, a bus 520, a main memory 530, a read-only memory 540, a mass storage device 550, communication port(s) 560, and a processor 570. A person skilled in the art will appreciate that the computer system 500 may include more than one processor and communication ports. The processor 570 may include various modules associated with embodiments of the present disclosure. The communication port(s) 560 may be any of an RS-232 port for use with a modem-based dialup connection, a 10/100 Ethernet port, a Gigabit or 10 Gigabit port using copper or fiber, a serial port, a parallel port, or other existing or future ports. The communication port(s) 560 may be chosen depending on a network, such as a Local Area Network (LAN), a Wide Area Network (WAN), or any network to which the computer system 500 connects. The main memory 530 may be random access memory (RAM), or any other dynamic storage device commonly known in the art. The read-only memory 540 may be any static storage device(s), e.g., but not limited to, Programmable Read Only Memory (PROM) chips for storing static information, e.g., start-up or BIOS instructions for the processor 570. The mass storage device 550 may be any current or future mass storage solution, which can be used to store information and/or instructions. Exemplary mass storage devices 550 include, but are not limited to, Parallel Advanced Technology Attachment (PATA) or Serial Advanced Technology Attachment (SATA) hard disk drives or solid-state drives (internal or external, e.g., having Universal Serial Bus (USB) and/or Firewire interfaces), one or more optical discs, and Redundant Array of Independent Disks (RAID) storage, e.g., an array of disks.
[0084] The bus 520 communicatively couples the processor 570 with the other memory, storage, and communication blocks. The bus 520 may be, e.g., a Peripheral Component Interconnect (PCI)/PCI Extended (PCI-X) bus, Small Computer System Interface (SCSI), USB, or the like, for connecting expansion cards, drives, and other subsystems, as well as other buses, such as a front side bus (FSB), which connects the processor 570 to the computer system 500.
[0085] Optionally, operator and administrative interfaces, e.g. a display, keyboard, joystick, and a cursor control device, may also be coupled to the bus 520 to support direct operator interaction with the computer system 500. Other operator and administrative interfaces can be provided through network connections connected through the communication port(s) 560. Components described above are meant only to exemplify various possibilities. In no way should the aforementioned exemplary computer system 500 limit the scope of the present disclosure.
[0086] While the foregoing describes various embodiments of the invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof. The scope of the invention is determined by the claims that follow. The invention is not limited to the described embodiments, versions or examples, which are included to enable a person having ordinary skill in the art to make and use the invention when combined with information and knowledge available to the person having ordinary skill in the art.
[0087] Since many modifications, variations, and changes in detail can be made to the described preferred embodiments of the invention, it is intended that all matters in the foregoing description and shown in the accompanying drawings be interpreted as illustrative and not in a limiting sense. Thus, the scope of the invention should be determined by the appended claims and their legal equivalents.
ADVANTAGES OF THE PRESENT DISCLOSURE
[0088] The present disclosure provides a system and method for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input.
[0089] The present disclosure provides a system and method for performing semantic analysis on voice input.
[0090] The present disclosure provides a system and method that offers a voice-guided conversational interface for more intuitive and user-friendly form filling.
[0091] The present disclosure provides a system and method that uses voice input, which is typically faster and more convenient than typing, thus enabling quick and efficient form filling.
[0092] The present disclosure provides a system and method that uses an intelligent form parsing engine for minimizing typographical errors and inaccuracies by accurately interpreting spoken input and matching it to the correct form fields.
[0093] The present disclosure provides a system and method that adjusts its responses based on user input and context, thus offering personalized guidance and feedback during form filling.
Claims: CLAIMS

We Claim:
1. A method (400) for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input, the method (400) comprising:
parsing (402), by one or more processors (112) of a computing device (108), one or more elements of the GUI based application, where the GUI based application is executed on the computing device (108);
determining (404), by the one or more processors (112), one or more properties of the parsed one or more elements of the GUI based application;
receiving (406), by the one or more processors (112), the NL input corresponding to at least one of the parsed one or more elements;
extracting (408), by the one or more processors (112), one or more parameters from the received NL input;
associating (410), by the one or more processors (112), the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application; and
responsive to a successful association, converting (412), by the one or more processors (112), the extracted one or more parameters to a text input, where the text input is populated in the corresponding one or more elements of the GUI based application.
2. The method (400) as claimed in claim 1, wherein the one or more elements of the GUI based application comprise a layout of the GUI based application, a count of one or more input fields, and a label corresponding to each of the one or more input fields.
3. The method (400) as claimed in claim 1, wherein the determined one or more properties of the parsed one or more elements of the GUI based application correspond to an input type that is acceptable for entry in the one or more elements, and where the input type is at least one of an alphabet type, a numeric type and an alpha-numeric type.
4. The method (400) as claimed in claim 1, further comprising:
receiving, by the one or more processors (112), the NL input via a spoken utterance.
5. The method (400) as claimed in claim 1, further comprising:
displaying, by the one or more processors (112), the text input populated in the corresponding one or more elements of the GUI based application on a display screen of the computing device (108).
6. The method (400) as claimed in claim 1, further comprising:
determining, by the one or more processors (112), a relationship between the one or more parameters of the NL input and the parsed one or more elements of the GUI based application based on semantic analysis.
7. The method (400) as claimed in claim 6, wherein the semantic analysis comprises word segmentation, syntactic analysis, named entity recognition and keyword extraction.
8. The method (400) as claimed in claim 7, wherein the syntactic analysis is performed using multiple parsers.
9. The method (400) as claimed in claim 1, wherein the GUI based application is a web based graphical form.
10. The method (400) as claimed in claim 1, further comprising:
identifying, by the one or more processors (112), at least one unpopulated element of the GUI based application; and
generating, by the one or more processors (112), a response comprising a prompt to receive the NL input corresponding to the at least one unpopulated element, where the response is either a verbal response or a written response.
11. A system (110) for populating a Graphical User Interface (GUI) based application using a guided Natural Language (NL) input, the system (110) comprising:
one or more processors (112) associated with a computing device (108); and
a memory (116) operatively coupled to the one or more processors (112), wherein the memory (116) comprises processor-executable instructions, which on execution, cause the one or more processors (112) to:
parse one or more elements of the GUI based application, where the GUI based application is executed on the computing device (108);
determine one or more properties of the parsed one or more elements of the GUI based application;
receive the NL input corresponding to at least one of the parsed one or more elements;
extract one or more parameters from the received NL input;
associate the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application; and
responsive to a successful association, convert the extracted one or more parameters to a text input, where the text input is populated in the corresponding one or more elements of the GUI based application.
12. The system (110) as claimed in claim 11, wherein the one or more elements of the GUI based application comprise a layout of the GUI based application, a count of one or more input fields, and a label corresponding to each of the one or more input fields.
13. The system (110) as claimed in claim 11, wherein the determined one or more properties of the parsed one or more elements of the GUI based application correspond to an input type that is acceptable for entry in the one or more elements, and where the input type is at least one of an alphabet type, a numeric type and an alpha-numeric type.
14. The system (110) as claimed in claim 11, wherein the one or more processors are further configured to receive the NL input via a spoken utterance.
15. The system (110) as claimed in claim 11, wherein the one or more processors are further configured to display the text input populated in the corresponding one or more elements of the GUI based application on a display screen of the computing device (108).
16. The system (110) as claimed in claim 11, wherein the one or more processors are further configured to determine a relationship between the one or more parameters of the NL input and the parsed one or more elements of the GUI based application based on semantic analysis.
17. The system (110) as claimed in claim 16, wherein the semantic analysis comprises word segmentation, syntactic analysis, named entity recognition and keyword extraction.
18. The system (110) as claimed in claim 17, wherein the syntactic analysis is performed using multiple parsers.
19. The system (110) as claimed in claim 11, wherein the GUI based application is a web based graphical form.
20. A non-transitory computer-readable medium comprising processor-executable instructions that cause a processor (112) to:
parse one or more elements of a Graphical User Interface (GUI) based application, where the GUI based application is executed on a computing device (108);
determine one or more properties of the parsed one or more elements of the GUI based application;
receive a guided Natural Language (NL) input corresponding to at least one of the parsed one or more elements;
extract one or more parameters from the received NL input;
associate the extracted one or more parameters with at least one of the parsed one or more elements of the GUI based application; and
responsive to a successful association, convert the extracted one or more parameters to a text input, where the text input is populated in the corresponding one or more elements of the GUI based application.

Documents

Application Documents

# Name Date
1 202511003529-FORM 1 [15-01-2025(online)].pdf 2025-01-15
2 202511003529-DRAWINGS [15-01-2025(online)].pdf 2025-01-15
3 202511003529-COMPLETE SPECIFICATION [15-01-2025(online)].pdf 2025-01-15
4 202511003529-FORM-9 [25-01-2025(online)].pdf 2025-01-25
5 202511003529-FORM-5 [15-02-2025(online)].pdf 2025-02-15
6 202511003529-FORM 18 [24-02-2025(online)].pdf 2025-02-24
7 202511003529-FORM-26 [28-03-2025(online)].pdf 2025-03-28