Abstract: The invention relates to a method and system [102] for enabling a context based dynamic dialogue, by receiving a first natural language input from a user device, analysing the first natural language input to generate a plurality of semantic representations, selecting a dynamic state machine architecture including a first sub-graph capable of generating a response to the first natural language input, receiving a second natural language input from the user device, analysing the second natural language input to generate a plurality of semantic representations, dynamically expanding the dynamic state machine architecture by adding to the first sub-graph a second sub-graph capable of generating a response to the second natural language input, and generating a natural language response to the second natural language input using the expanded dynamic state machine architecture.
TECHNICAL FIELD
The present invention relates broadly to a dialogue system, and more particularly, to management of a dialogue between a human and a machine based on dynamic state machines.
BACKGROUND
Conventionally, dialogue managers have been built to cater to single, command-like utterances received from a user. However, with advancements in speech recognition and dialogue design technologies, speech-enabled applications on computing devices now handle complex queries in human-machine interactions. In general, building dialogue management systems for such applications involves building domain content, grammar, etc. to control the sequence of interactions between a machine and a user. Some dialogue management systems known in the art are implemented through state machines, for example, based on a Markov decision process or a mixed dialogue system. In operation, a state machine model uses states and transitions between states to control the dialogue between a user and a dialogue manager. Other implementations of dialogue management systems are scaled across multiple machines.
A human-machine conversation on such dialogue management systems runs on fixed predefined routes allowing for a small number of variations. However, human conversations tend to flow naturally and move to new contexts abruptly. There can be references to new topics that were not previously part of an interaction. In order to manage a smooth and fluid dialogue with a user, it is essential to share information between interactions. Dialogue managers need to keep track of the state of the interaction for each session or for each user and gather information as the dialogue between the machine and the user proceeds. The information gathered is used to perform actions in response to a user's query/instruction in a dialogue. As multiple interactions between the machine and the user can result in multiple flows, it becomes essential to retain earlier information while gathering new information.
However, dialogue managers have traditionally failed to retain contexts when engaged in complex conversations. None of the existing solutions provides a viable method to re-use context information in an efficient dialogue management system. In view of the above shortcomings, there exists a need for solutions directed towards addressing these problems and capable of dynamically connecting information available from previous interactions with new information that was previously not present within the interaction/dialogue flow. Thus, it is desired to have novel and improved solutions which overcome the problems associated with the prior art and provide an efficient and improved dialogue between the user and the system.
SUMMARY
The following presents a simplified summary of some embodiments of the disclosure in order to provide a basic understanding of the embodiments. This summary is not an extensive overview of the embodiments. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the embodiments. Its sole purpose is to present some embodiments in a simplified form as a prelude to the more detailed description that is presented below.
In view of the foregoing, an embodiment herein provides a processor-implemented method for enabling a context based dynamic dialogue between a user and a computing device. Initially, a first natural language input is received from a user device. The method comprises analysing the first natural language input to generate a plurality of semantic representations of the first natural language input. Further, a dynamic state machine architecture is selected based on the plurality of semantic representations, wherein the dynamic state machine architecture includes a first sub-graph that is capable of generating a response to the first natural language input. Further, a second natural language input is received, which is analysed to generate a plurality of semantic representations of
the second natural language input. On receiving the plurality of semantic representations of the second natural language input, the dynamic state machine architecture is dynamically expanded based on the second natural language input, wherein dynamically expanding the dynamic state machine includes adding a second sub-graph to the first sub-graph in the dynamic state machine architecture. Furthermore, the method comprises generating a natural language response to the second natural language input using the expanded dynamic state machine architecture and transmitting the natural language response to the user device.
In another aspect, an embodiment of the invention provides a system for enabling a context based dynamic dialogue. The system comprises at least one processor and at least one tangible, non-transitory memory coupled with the processor. The system further comprises a receiving module coupled to the processor, the receiving module configured to receive a first natural language input from a user device. The system further comprises a natural language processing module configured to analyse the first natural language input to generate a plurality of semantic representations of the first natural language input. Further, the system comprises a dialogue management module configured to select a dynamic state machine architecture based on the plurality of semantic representations, the dynamic state machine architecture including a first sub-graph capable of generating a response to the first natural language input. The dialogue management module is further configured to receive a second natural language input from the user device and analyse the second natural language input to generate semantic representations of the second natural language input. The dialogue management module is also configured to expand the dynamic state machine architecture dynamically based on the second natural language input, wherein dynamically expanding the dynamic state machine includes adding a second sub-graph to the first sub-graph in the dynamic state machine architecture. Finally, the dialogue management module is configured to generate a natural language response to the second natural language input using the
expanded dynamic state machine architecture and transmit the natural language response to the user device.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that disclosure of such drawings includes disclosure of electrical components or circuitry commonly used to implement such components. Connections between various components have not been shown in the drawings for the sake of clarity, and all components in the drawings shall be presumed to be connected to each other unless explicitly stated otherwise in the disclosure herein.
Figure 1 illustrates an exemplary network environment in which a system for enabling a context based dynamic state dialogue manager is implemented, in accordance with an embodiment of the present disclosure.
Figure 2 illustrates an exemplary system for enabling a context based dynamic state dialogue manager, in accordance with an embodiment of the present disclosure.
Figure 3 illustrates an exemplary method for enabling a context based dynamic state dialogue manager, in accordance with an embodiment of the present disclosure.
Figures 4A and 4B illustrate a set of nodes in a flow graph representing exemplary dynamic state machine models for enabling context based exemplary dialogue in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
Exemplary embodiments are described with reference to the accompanying
drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the spirit and scope of the disclosed embodiments. It is intended that the following detailed description be considered as explanatory only, with the true scope and spirit being indicated by the following claims. Several features described hereafter can each be used independently of one another or with any combination of other features. However, any individual feature may not address all of the problems discussed above.
A system and method for implementing a context based dynamic state dialogue manager are described. The dialogue manager encompassed by the present invention is based on a dynamic state machine architecture, wherein the flow of conversations is managed by dynamically adding nodes capable of handling a given utterance to the state machine/graph. In some embodiments, when a user provides a first utterance in a natural language by way of text or speech, the present system analyzes the input and generates semantic representations of the first utterance. Further, a dynamic state machine architecture is selected based on the semantic representations such that the dynamic state machine architecture includes a first sub-graph capable of generating a response to the first utterance in a dialogue. As the dialogue proceeds, i.e. as the system receives a second utterance from the user, the system analyzes the second utterance so as to generate the semantic representations for the second utterance. The dynamic state machine architecture is then expanded based on the semantic representations for the second utterance, such that the dynamically expanded dynamic state machine includes a second sub-graph including a second node capable of generating a response to the second utterance. The resulting response, generated in the form of semantic representations, is converted into natural language and transmitted to the user device. The system may implement
dynamic addition of nodes for any type of interaction, such as a short, command-like conversation where the user provides all the information in one utterance, or a step-by-step interaction which drills down one node at a time.
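The dynamic expansion described above can be illustrated with a minimal sketch, assuming a simple list-based representation of sub-graphs; the class and method names below are illustrative assumptions and not the disclosed implementation:

```python
# Minimal sketch: a dialogue graph that grows as new utterances arrive.
# Names and structure are illustrative assumptions, not the patented system.

class SubGraph:
    def __init__(self, name, respond):
        self.name = name
        self.respond = respond  # callable: utterance -> response text

class DynamicStateMachine:
    def __init__(self):
        self.sub_graphs = []  # sub-graphs attached so far in this dialogue

    def expand(self, sub_graph):
        # Dynamically attach a sub-graph capable of handling a new utterance.
        self.sub_graphs.append(sub_graph)

    def handle(self, utterance):
        # The most recently attached sub-graph drives the current response.
        return self.sub_graphs[-1].respond(utterance)

dsm = DynamicStateMachine()
dsm.expand(SubGraph("event_booking", lambda u: "Here are upcoming shows"))
first = dsm.handle("book tickets to a standup show")

# A new context arrives mid-dialogue; the machine is expanded, not reset.
dsm.expand(SubGraph("cab_booking", lambda u: "Here are cab options"))
second = dsm.handle("I want a cab to get there")
```

The key property sketched here is that the first sub-graph remains attached when the second is added, so earlier context survives the expansion.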
Referring now to the drawings, and more particularly to Figures 1 through 4, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and method.
Figure 1 illustrates a network implementation of an exemplary architecture (100) including a system for implementing a dynamic state dialogue manager (hereinafter referred to as system (102)), in accordance with an embodiment of the present disclosure. More particularly, Figure 1 illustrates a system (102), and one or more user devices (104) connected to the system (102) over a network (106), wherein the connection may include a physical connection (such as a wired/wireless connection), a logical connection (such as through logical gates of a semiconducting device), other suitable connections, or a combination of such connections, as may be obvious to a skilled person. In one embodiment, the network (106) may be a wireless network, a wired network, or a combination thereof. The network may be implemented as one of the different types of networks, such as an intranet, local area network (LAN), wide area network (WAN), the internet, etc. The network may either be a dedicated network or a shared network. The shared network may represent an association of the different types of networks that use a variety of protocols (e.g., Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc.) to communicate with one another. Further, the network (106) may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, etc.
Although the present disclosure is explained considering that the system (102) is
implemented on a server, it is appreciated that the system (102) may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a phablet, a portable electronic device and the like. In one embodiment, the system (102) may be implemented in a cloud-based environment. It is also appreciated that the system (102) may be accessed by multiple users through one or more user devices 104-1, 104-2, . . . 104-n, collectively referred to as user devices (104) hereinafter, or applications residing on the user devices (104). The user devices (104) may be equipped with a speech recognition utility to identify the words spoken by the user. Examples of the user devices (104) may be electronic devices including, but not limited to, a portable computer, a tablet computer, a personal digital assistant, a handheld device, a cellular phone, a wireless device and a workstation. In one implementation, the user device (104) may be used to send an input in a natural language to the system (102).
Figure 2 illustrates a block diagram of system (102) for implementing a dynamic state dialogue manager according to an embodiment of the present disclosure. The system (102) comprises a hardware processor (202), an input/output (I/O) communication interface (204) and memory (206) coupled to the hardware processor (202). Although the exemplary block diagram and the associated description refers to a hardware processor, an input/output interface and a memory, it may be understood that one or more memory units, one or more hardware processors, and/or one or more interfaces may be comprised in the system (102). The hardware processor (202), the input/output (I/O) interface (204), the memory (206), and/or the modules may be coupled by a system bus or a similar mechanism.
The hardware processor (202) may be a single processing unit or a number of units, all of which could also include multiple computing units. The hardware processor (202) may be a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor, a plurality of microprocessors,
one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits, Field Programmable Gate Array circuits, any other type of integrated circuit, etc. The processor may perform signal coding, data processing, input/output processing, and/or any other functionality that enables the working of the system according to the present disclosure. Among other capabilities, the hardware processor (202) fetches and executes computer-readable instructions and data stored in the memory (206). The hardware processor (202) manages the execution of the dynamic state machine elements representing the functionality of the system (102) for carrying out the context based dynamic dialogue.
The I/O interface (204) may include a variety of software and hardware interfaces, for example, interface for peripheral device(s), such as a keyboard, a mouse, an external memory, and a printer. Further, the I/O interface (204) may enable the system (102) to communicate with other computing devices, such as web servers, mobile devices and external data repositories in the network environment (100). The I/O interface (204) may facilitate multiple communications within a wide variety of protocols and networks, such as a network, including wired networks, e.g., LAN, cable, etc., and wireless networks, e.g., WLAN, cellular, satellite, etc. The I/O interface (204) may include one or more ports for connecting the system (102) to a number of computing devices.
The memory (206) may store instructions, any number of pieces of information, and data used by a computer system, for example the system (102), to implement the functions of the system (102). The memory (206) may include, for example, volatile memory and/or non-volatile memory. Examples of volatile memory may include, but are not limited to, volatile random access memory (RAM). The non-volatile memory may additionally or alternatively comprise an electrically erasable programmable read only memory (EEPROM), flash memory, hard drive, or the like. Some examples of the volatile memory include, but are not limited to, random access memory, dynamic random access memory, static random access memory,
and the like. Some examples of the non-volatile memory include, but are not limited to, hard disks, magnetic tapes, optical disks, programmable read only memory, erasable programmable read only memory, electrically erasable programmable read only memory, flash memory, and the like. The memory (206) may be configured to store information, data, instructions or the like for enabling the system (102) to carry out various functions in accordance with various example embodiments.
The hardware processor (202) is communicatively coupled to modules (208) which may include hardware units, routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types. In one implementation, the modules (208) include a receiving module (212), a natural language processing module (214), a dialogue management module (216) and other modules (218). The other modules (218) may include means that supplement applications and functions of the system (102) as described in various embodiments. The invention encompasses that the hardware processor (202) may include the modules (208).
The data (210), inter alia, serves as a repository for storing data processed, received, and generated by one or more of the module(s) (208). The data (210) may include a system database (220), and other data (not shown in the figure) generated as a result of the tasks performed by one or more modules in the module(s) (208). The system database (220) includes all rules and grammar required for understanding the user utterance and the associated context in a dialogue. The system database (220) may be continuously updated with new rules and grammar generated continuously based on the natural language inputs received from the user.
The invention encompasses handling interactions of varied domains and sharing context within the multiple interactions. In one embodiment, a user may use the user device (104) to access the system (102) via the I/O interface (204). The
operation of the system (102) is further described in detail in connection with Figures 2 to 4.
The system (102) is used for implementing a context based dynamic state dialogue manager. To implement the context based dynamic state dialogue manager, the system (102) may receive a first natural language input from a user device. In one embodiment, the receiving module (212) is configured to receive the first natural language input from the user device (104) in real-time. For example, the first natural language input is an utterance provided to a speech enabled application on the user device. The first natural language input may be related to one of a query that the user seeks a response to, a transaction or an instruction to perform an action on the user device (104). The receiving module (212) is also configured to receive one or more values from the system database (220), wherein the system database (220) includes rules and grammar required for understanding the user utterance and the associated context in a dialogue.
Further, the system comprises the natural language processing module (214). The natural language processing module (214) is configured to analyze the first natural language input to generate a plurality of semantic representations. The natural language processing module (214) is configured to understand the context of the first natural language input based on one or more rules fetched from the system database (220). The natural language processing module (214) is further configured to transmit the plurality of semantic representations to the dialogue management module (216) for further processing.
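A rule-based analysis step of this kind can be sketched as follows; the keyword rules shown are invented placeholders standing in for the grammar held in the system database (220), not the actual rules of the disclosure:

```python
# Hedged sketch: a natural language processing step mapping an utterance to
# semantic representations (intent plus tokens). The RULES table is a
# placeholder assumption for the grammar stored in the system database.

RULES = {
    "recharge": {"intent": "recharge_account"},
    "book": {"intent": "make_booking"},
}

def analyse(utterance):
    """Return a list of semantic representations matching the utterance."""
    tokens = utterance.lower().split()
    representations = []
    for keyword, frame in RULES.items():
        if keyword in tokens:
            # Attach the tokens so downstream nodes can extract slot values.
            representations.append({**frame, "tokens": tokens})
    return representations

reps = analyse("I want to book a cab")
```

In a production system the rules would be fetched from the database and would also carry the contextual cues described above; this sketch only shows the shape of the module's output.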
The dialogue management module (216) is configured to receive the semantic representations from the natural language processing module (214) and process the semantic representations through a dynamic state machine architecture. The dynamic state machine architecture comprises a plurality of information nodes, wherein each of the plurality of information nodes represents a node in one of a plurality of flow graphs for controlling the dialogue between the user and the
system, and wherein each of the plurality of flow graphs includes a plurality of sub-graphs. In an embodiment, the dynamic state machine architecture includes the plurality of nodes as a cluster or collection of graph cliques, such that each clique contains one or more connected nodes leading to completion of a specified task as per the user input. Each clique is a directed hierarchical flow graph, where sub-tasks are managed within a branch or sub-graph. The plurality of nodes form a network through which data can be passed, and each interaction between a user and the system (102) is designed with one or more of the plurality of nodes. In one embodiment, one or more nodes belonging to different cliques are not connected. For instance, a node belonging to a clique related to recharging a user's account is not connected to any node in a clique related to booking a cab.
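The clique structure described above can be sketched minimally as follows, assuming a simple adjacency-list representation; the task names and class layout are illustrative assumptions:

```python
# Sketch of a cluster of cliques: each clique is a directed hierarchical flow
# graph whose connected nodes complete one task, and nodes of different
# cliques share no edges. Names are illustrative assumptions.

class Clique:
    def __init__(self, task):
        self.task = task
        self.edges = {}  # parent node -> list of child nodes

    def add_edge(self, parent, child):
        self.edges.setdefault(parent, []).append(child)

    def nodes(self):
        # Collect every node appearing as a parent or a child.
        found = set(self.edges)
        for children in self.edges.values():
            found.update(children)
        return found

recharge = Clique("recharge_account")
recharge.add_edge("collect_number", "confirm_amount")

cab = Clique("book_cab")
cab.add_edge("collect_destination", "show_options")

# Nodes belonging to different cliques are not connected.
disjoint = recharge.nodes().isdisjoint(cab.nodes())
```

The disjointness check mirrors the example in the text: the recharge clique and the cab-booking clique form separate hierarchical flow graphs within the same cluster.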
Each node in the plurality of nodes in the dynamic state machine architecture includes a plurality of variables, at least one status, at least one runner, at least one emitter and at least one trigger. The plurality of variables store information at the node/state level and may include one or more dependent variables. For example, a variable 'person information' may include one or more dependent variables including, but not limited to, the name, age and gender of a person. Further, the present invention encompasses adding variables on-the-fly based on how the dialogue between the user and the machine proceeds; for example, the number of variables to be added may depend on the number of people engaged in a dialogue. The at least one runner is the computing unit of the node which performs processing of information gathered from the natural language processing module (214), including the semantic representations. The at least one runner includes the actions that the node performs once the at least one status of the node changes, wherein the at least one status of a node, for instance, checks whether gathering of the information required for the node to operate is completed or not. Further, the at least one emitter emits responses or queries to the user device in order to receive required information or confirmation. For instance, a node for recommending travel, upon getting booking preferences like price, days of stay
and place confirmation, computes the best hotel options for the user. This computation is performed by the at least one runner. The hotel options are emitted as output from the node. The responses from the at least one emitter are presented to the user as hotel booking options that the user can select from. Furthermore, the at least one trigger refers to an event which may initiate a transition from one state to another. For instance, the at least one trigger included in a node related to collecting information regarding recharge of an account is the user expressing an intent to recharge the account via user utterances, which are interpreted by the natural language processing module (214).
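The node elements just described (variables, status, runner, emitter, trigger) can be sketched as a single class, using the hotel-recommendation instance from the text; the field names and the required-variable list are assumptions for illustration only:

```python
# Illustrative sketch of a node carrying the elements described above.
# Field names and the required-variable list are assumptions.

class Node:
    def __init__(self, required, trigger_intent):
        self.variables = {name: None for name in required}
        self.trigger_intent = trigger_intent  # event that activates this node

    def trigger(self, intent):
        # A trigger fires when the interpreted intent matches this node.
        return intent == self.trigger_intent

    def status(self):
        # Status: complete once every required variable has been gathered.
        return all(v is not None for v in self.variables.values())

    def runner(self):
        # Runner: computes a result from the gathered variables.
        return {"options": [f"hotel near {self.variables['place']}"]}

    def emitter(self):
        # Emitter: asks for missing information, or emits the runner's result.
        if not self.status():
            missing = next(k for k, v in self.variables.items() if v is None)
            return f"Please provide {missing}"
        return self.runner()

node = Node(required=["price", "days", "place"], trigger_intent="book_hotel")
prompt = node.emitter()   # status incomplete: queries for the first missing variable
node.variables.update(price=100, days=2, place="Bangalore")
result = node.emitter()   # status complete: the runner computes hotel options
```

Note how the status change is what switches the emitter from querying the user to emitting the computed hotel options, matching the node behaviour described above.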
The dialogue management module (216) is configured to identify a first sub-graph capable of generating a response to the first natural language input using a clique identification routine. The clique identification routine is used to identify the first sub-graph and the graph/clique capable of moving the conversation forward based on the plurality of semantic representations. The clique identification routine implemented by the dialogue management module (216) analyzes the one or more nodes within the hierarchy of each clique (i.e. children or siblings of the clique) in the cluster of cliques to identify the nodes most capable of handling the first natural language input. In one embodiment, the identification of the nodes most capable of generating a response to the first natural language input is based on the context of the first natural language input. Further, the dialogue management module (216) is configured to select a dynamic state machine architecture such that the dynamic state machine architecture includes the first sub-graph capable of generating a response to the first natural language input, wherein the first sub-graph is associated with the identified node. The response may be converted to a natural language and transmitted to the user device.
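A minimal sketch of such a clique identification routine follows: it walks the nodes within each clique's hierarchy and returns the first clique and node whose declared intents cover the input's semantic representation. The clique/node tables and the first-match policy are assumptions; the disclosure does not specify a particular matching rule:

```python
# Hedged sketch of a clique identification routine. The CLIQUES and
# NODE_INTENTS tables are invented placeholders for the cluster of cliques.

CLIQUES = {
    "event_booking": {"root": ["pick_show", "pay"]},
    "cab_booking": {"root": ["collect_destination", "show_cabs"]},
}
NODE_INTENTS = {
    "pick_show": {"book_event"},
    "pay": {"payment"},
    "collect_destination": {"book_cab"},
    "show_cabs": {"book_cab"},
}

def identify(intent):
    """Return (clique, node) for the first node able to handle the intent."""
    for clique, hierarchy in CLIQUES.items():
        for children in hierarchy.values():   # walk the clique's hierarchy
            for node in children:
                if intent in NODE_INTENTS[node]:
                    return clique, node
    return None  # no capable node found anywhere in the cluster

match = identify("book_cab")
```

The sub-graph associated with the returned node would then be included in the selected dynamic state machine architecture.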
Further, as the dialogue with the user proceeds, the system (102) is configured to receive a second natural language input via the receiving module (212) in real-time. The receiving module (212) transmits the second natural language input to the natural language processing module (214), wherein the natural language processing module (214)
generates semantic representations corresponding to the second natural language input to be analysed by the dialogue management module (216). The natural language processing module (214) is also configured to understand the context of the second natural language input based on one or more rules fetched from the system database (220).
The dialogue management module (216) is further configured to dynamically expand the dynamic state machine architecture by adding a new node to the dynamic state machine to control the flow of conversation based on the semantic representations of the second natural language input. In one embodiment, the clique identification routine is configured to analyze the one or more nodes in the hierarchy of the first sub-graph and/or the first graph/clique to identify the node capable of generating a response to the second natural language input. In the event none of the nodes in the first flow graph are identified to be capable of handling the second natural language input, the clique identification routine analyzes all of the plurality of nodes in the plurality of flow graphs, including all sub-graphs and siblings, capable of handling the second natural language input. In the event the at least one node capable of handling the utterance is found in a second sub-graph, i.e. a sub-graph different from the first sub-graph, the second sub-graph of the identified node is dynamically attached to the current node in the first sub-graph as a sibling and forms a part of the expanded dynamic state machine architecture. The present disclosure further encompasses processing the newly added sibling before moving to it in an interaction. The adding of nodes includes transfer of shared and dependent variables from the second node to the first node in the first sub-graph, wherein the shared and dependent variables include contextual information associated with the second node. In an embodiment, the dialogue management module (216) is configured to compare the context of the second natural language input and the first natural language input and analyze the plurality of flow graphs in the event the contexts of the first natural language input and the second natural language input are different. In one
embodiment, the second node capable of handling the second natural language input is present in the first sub-graph. Further, the dialogue management module (216) is configured to share contextual information from the first sub-graph to the second sub-graph by sharing variables. The plurality of nodes in the dynamic state machine are enabled to store the information of a session or an interaction ongoing between the user and the system (102) through context variables. When the user initiates a new interaction, the current variables are passed on as context and the shared variables receive the values passed on. In an exemplary embodiment, during an interaction relating to booking tickets for an event, when the user initiates an interaction relating to booking a cab, the dialogue management system (102) is configured to receive the values from the context variables passed from the interaction relating to booking of the event. The following shows exemplary responses to user utterances processed by the dialogue manager system (102):
User: I want to book tickets to a standup show in Vapour, Bangalore
DMS: Sure, below are the upcoming shows in Vapour, Bangalore
User: book the show at 8:00 p.m.
DMS: Here are the payment details
User: I want to book a cab to get there
DMS: Here are the cab options at 7:30 p.m. to Vapour, Indira Nagar, Bangalore.
As shown above, during an interaction regarding event booking with the system (102), the user provides information for the 'Event Name' variable, 'Event Time' variable, 'Location' variable, 'City' variable and 'Domain' variable for booking an event. As the subsequent input received from the user relates to initiating a new interaction regarding booking a cab service, the system (102) identifies the node capable of handling the subsequent input. The system (102) is configured to connect the current context variables, i.e. the 'Event Time' variable, the 'Location' variable, the 'City' variable and the 'Domain' variable, with the graph nodes in the graph related to booking a cab. As the 'Cab booking graph' has a node
which gathers information for 'Destination', the 'Location' variable is shared with the graph related to cab booking and the node is filled with the current context. Further, the hierarchical flow graph after receiving the destination is considered and attached as a part of the current conversation. The system (102) then responds to the user such that the user can either change the variable or use the same to continue with the booking. The dialogue may proceed with the user providing an input with references to event booking, cab booking or any other new reference, as the context continues to be attached to the newly generated nodes.
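The context sharing walked through above can be sketched as follows: when the cab-booking graph is attached, variables already gathered during the event-booking interaction are passed on, pre-filling the cab graph's matching nodes. The mapping of 'Location' to 'Destination' follows the example in the text; the variable values and mapping table are assumptions:

```python
# Sketch of context-variable sharing between two attached graphs. Values and
# the SHARED mapping are illustrative assumptions based on the example above.

event_context = {
    "Event Name": "Standup show",
    "Event Time": "8:00 p.m.",
    "Location": "Vapour, Indira Nagar",
    "City": "Bangalore",
    "Domain": "events",
}

# Shared-variable mapping: cab-graph variable -> event-graph variable.
SHARED = {"Destination": "Location", "City": "City"}

def attach_with_context(new_graph_vars, context, shared):
    """Fill the newly attached graph's variables from the current context."""
    filled = dict(new_graph_vars)
    for target, source in shared.items():
        if target in filled and source in context:
            filled[target] = context[source]
    return filled

cab_vars = {"Destination": None, "City": None, "Pickup Time": None}
cab_vars = attach_with_context(cab_vars, event_context, SHARED)
```

After attachment, 'Destination' and 'City' are already filled from the event interaction, so the cab node only needs to query the user for the remaining variable ('Pickup Time' here), which is what lets the dialogue continue without re-asking known information.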
In an embodiment, when the dialogue management module (216) identifies multiple nodes capable of generating a response to the second natural language input, the dialogue management module (216) is capable of constructing a disambiguation node. The disambiguation node is capable of handling the second natural language input, identifying the best node from the afore-mentioned multiple nodes, and generating a response.
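A disambiguation step of this kind can be sketched minimally; the disclosure does not specify how the best node is chosen, so the score-based selection below is purely an assumption:

```python
# Hedged sketch of a disambiguation node: when several candidate nodes could
# handle an input, pick the best one. Scoring by a numeric confidence value
# is an assumption, not the disclosed selection criterion.

def disambiguation_node(candidates):
    """candidates: list of (node_name, score) pairs for the ambiguous input.

    Returns the name of the highest-scoring candidate node.
    """
    best, _ = max(candidates, key=lambda pair: pair[1])
    return best

best = disambiguation_node([("book_cab", 0.4), ("book_event", 0.9)])
```

In practice the constructed node could also present the candidates back to the user as a clarifying question instead of choosing silently.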
The dialogue management module (216) is further configured to generate a response based on the dynamically expanded dynamic state machine architecture. The response generated in the form of semantic representations is converted into a corresponding natural language output by the natural language processing module (214). The natural language output is then transmitted to the user device (104). In one embodiment, the user device (104) may carry out a transaction, display the output or perform an action based on the output, such as initiating the booking of a movie ticket.
Fig. 3 illustrates a method 300 for implementing a context based dynamic state dialogue manager, according to one embodiment of the present subject matter. The method 300 may be implemented in a variety of computing systems in several different ways. For example, the method 300, described herein, may be implemented using the system (102), as described above.
The order in which the method 300 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method, or an alternative method. Additionally, individual blocks may be deleted from the method without departing from the spirit and scope of the subject matter described herein. Furthermore, the methods can be implemented in any suitable hardware, software, firmware, or a combination thereof. It will be understood that even though the method 300 is described with reference to the system (102), the description may be extended to other systems as well.
The method 300 initiates at step 302, wherein a user provides a first natural language input through a user device. The first natural language input may include text communicated in a written form, speech communicated by voice, or a combination thereof. In a preferred embodiment, the natural language input received from the user is in the form of an utterance, i.e. a speech input.
At step 304, the first natural language input is analyzed to generate a plurality of semantic representations of the first natural language input and to deduce the context of the first natural language input.
At step 306, a dynamic state machine architecture is selected based on the plurality of semantic representations generated in the previous step. The dynamic state machine architecture is selected by identifying a first state/node capable of handling the first natural language input, i.e. generating a response to the first natural language input. The first node most capable of generating a response to the first natural language input is identified using a clique identification routine, wherein the clique identification routine analyzes the one or more nodes within the hierarchy of each clique in the cluster of cliques. Further, the first sub-graph with which the first node is associated is selected to be included in the dynamic state machine architecture.
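The clique identification routine can be sketched as an exhaustive scan over the nodes in each clique's hierarchy, keeping the best-scoring node. This is a minimal sketch under stated assumptions: the clique data layout, the match function, and all names are illustrative, and "most capable" is approximated by a numeric score.

```python
def identify_first_node(cliques, semantic_reps, match):
    """Scan every node of every clique; return (best_node, its_clique)."""
    best, best_clique, best_score = None, None, float("-inf")
    for clique in cliques:
        for node in clique["nodes"]:  # nodes within the clique's hierarchy
            score = match(node, semantic_reps)
            if score > best_score:
                best, best_clique, best_score = node, clique, score
    return best, best_clique

# Illustrative cluster of cliques and a toy match function.
cliques = [
    {"name": "recharge", "nodes": ["Recharge Initiator", "Plans Exploring"]},
    {"name": "cab", "nodes": ["Cab Initiator", "Destination Collector"]},
]

def keyword_match(node, semantic_reps):
    # Count how many semantic tokens appear in the node name.
    return sum(1 for rep in semantic_reps if rep in node.lower())

first_node, first_clique = identify_first_node(cliques, ["recharge"], keyword_match)
```

The sub-graph (here, the clique) containing the winning node is then included in the dynamic state machine architecture.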
As the dialogue with the user proceeds, at step 308, a second natural language
input is received from the user device in real-time. At step 310, the second natural language input is analyzed so as to generate a plurality of corresponding semantic representations. Further, the context of the second natural language input is also understood based on one or more rules retrieved from the database.
At step 312, a second node capable of generating a response to the second natural language input is identified. In the event none of the nodes in the first sub-graph is found to be capable of handling the second natural language input, the second node is identified by analyzing all nodes in the plurality of flow graphs, including all sub-graphs and siblings. The dynamic state machine architecture is then dynamically expanded by adding a second sub-graph, associated with the second node, to the first sub-graph in the dynamic state machine architecture. The adding of nodes includes a transfer of shared and dependent variables from the second node to the first node in the first sub-graph, wherein the shared and dependent variables include contextual information of the second node. The invention encompasses receiving multiple user utterances from the user between the first and the second natural language inputs.
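Step 312 can be sketched as a two-phase search followed by attachment and variable transfer. The dict-based sub-graph layout, the `can_handle` predicate, and the sibling list are all assumptions made for illustration; only the overall control flow (try the active sub-graph first, then all graphs, then attach and transfer context) follows the description above.

```python
def expand_architecture(active, all_subgraphs, can_handle, user_input):
    """Find a capable node; attach its sub-graph as a sibling if needed."""
    # Phase 1: try the nodes already in the active (first) sub-graph.
    for node in active["nodes"]:
        if can_handle(node, user_input):
            return active, node
    # Phase 2: analyze all nodes in all flow graphs, including siblings.
    for subgraph in all_subgraphs:
        if subgraph is active:
            continue
        for node in subgraph["nodes"]:
            if can_handle(node, user_input):
                # Dynamically expand: attach the second sub-graph and
                # transfer shared/dependent variables carrying the second
                # node's contextual information into the first sub-graph.
                active["siblings"].append(subgraph)
                active["variables"].update(subgraph["variables"])
                return subgraph, node
    return None, None

# Example: a cab request arriving mid-recharge.
recharge = {"nodes": ["Plans Exploring"], "siblings": [],
            "variables": {"Phone Number": "9876543210"}}
cab = {"nodes": ["Cab Initiator"], "siblings": [],
       "variables": {"Domain": "Cab"}}
handles = lambda node, text: node.split()[0].lower() in text.lower()
chosen_sg, chosen_node = expand_architecture(
    recharge, [recharge, cab], handles, "book a cab please")
```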
At step 314, a response to the second natural language input is generated using the expanded dynamic state architecture. Generating said response includes converting the semantic representations to the corresponding natural language response. Finally, the natural language response is transmitted to be displayed on the user device.
Figures 4A and 4B illustrate exemplary dynamic state machine models for enabling a context based dialogue in accordance with an embodiment of the present invention. In the present example, figure 4A is in the context of a recharge transaction, and figure 4B is in the context of a cab booking arising in the middle of an interaction related to a recharge transaction.
Referring to Figure 4A, a set of predefined nodes in a state machine defining a hierarchy in an interaction relating to a recharge transaction is represented. Figure
4A illustrates state machine nodes including a ‘Recharge Initiator’ node, a ‘Required Params Selector’ node, a ‘Plans Exploring’ node, a ‘Plan Selection’ node, a ‘Summary’ node and a ‘Payment’ node. The ‘Recharge Initiator’ node includes variables for Greetings and Pleasantries; the ‘Required Params Selector’ node includes variables for storing information about the phone number and operator etc.; a ‘Plans Exploring’ node includes variables for information on plans for the operator such as an internet data plan, price for the plans; the ‘Plan Selection’ Node includes information for selecting a plan; a ‘Summary’ node includes variables for summary of overall information processed so far in the current state machine and the ‘Payment’ node includes variables for information for facilitating payment for the recharge. The data flows through the flow graph as an interaction between the user and the device proceeds and a response is generated for the user based on action taken by any node capable of handling a user input.
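The hierarchy of Figure 4A can be written out declaratively; the sketch below is illustrative only, with node and variable names taken from the description above and the dict/`next`-pointer representation chosen as an assumption.

```python
# Illustrative encoding of the Figure 4A recharge flow graph.
RECHARGE_FLOW = {
    "Recharge Initiator": {"vars": ["Greetings", "Pleasantries"],
                           "next": "Required Params Selector"},
    "Required Params Selector": {"vars": ["Phone Number", "Operator"],
                                 "next": "Plans Exploring"},
    "Plans Exploring": {"vars": ["Plan", "Price"],
                        "next": "Plan Selection"},
    "Plan Selection": {"vars": ["Selected Plan"], "next": "Summary"},
    "Summary": {"vars": ["Overall Summary"], "next": "Payment"},
    "Payment": {"vars": ["Payment Info"], "next": None},
}

def walk(flow, start):
    """Yield node names in the order data flows through the graph."""
    node = start
    while node is not None:
        yield node
        node = flow[node]["next"]

order = list(walk(RECHARGE_FLOW, "Recharge Initiator"))
```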
Referring to figure 4B, a context-based dialogue using dynamic state machines is illustrated using a set of nodes for a dialogue regarding a recharge transaction and availing a cab service. As shown in figure 4B, where the user is already deep in the graph related to a recharge transaction, for example exploring the plans for the recharge transaction, and the subsequent input from the user relates to an unrelated domain, such as a request for booking a cab service, the sub-graph associated with the nodes relating to the booking of a cab is dynamically added as siblings to the nodes relating to the recharge transaction, since no node in the recharge transaction graph is identified as capable of performing the action of booking a cab in response. This dynamic adding of the cab booking flow allows the user to book a cab and return to the recharge once the cab is booked.
As may be apparent from the above description of disclosed dynamic state dialogue manager, this technique provides an efficient way of driving conversation by maintaining a connection of a current context with a context of a previous interaction. It will be appreciated by those skilled in the art that the system and
method described hereinabove results in a significant technical advancement, since it helps in flexible real-time response delivery to a user despite any abrupt changes in domain/context during an interaction. Although the aforementioned description includes reference to receiving natural language input from a single user, it will be apparent to those skilled in the art that the systems and methods described herein are capable of handling simultaneous inputs from multiple users and providing appropriate responses to the same.
Although implementations for methods and systems for enabling a context based dynamic dialogue, have been described in language specific to structural features and/or methods, it is appreciated that the appended claims are not limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations for enabling a context based dynamic state dialogue.
1. A processor-implemented method for enabling a context based dynamic
dialogue, the method comprising:
- receiving a first natural language input from a user device;
- analysing the first natural language input to generate a plurality of
semantic representations of the first natural language input;
- selecting a dynamic state machine architecture based on the plurality of semantic representations, the dynamic state machine architecture including a first sub-graph capable of generating a response to the first natural language input;
- receiving a second natural language input from the user device;
- analysing the second natural language input to generate a plurality of semantic representations of the second natural language input;
- dynamically expanding the dynamic state machine architecture based on the semantic representation of the second natural language input, wherein dynamically expanding the dynamic state machine architecture includes adding a second sub-graph to the first sub-graph in the dynamic state machine architecture;
- generating a natural language response to the second natural language input using the expanded dynamic state architecture; and
- transmitting the natural language response to the user device.
2. The processor-implemented method as claimed in claim 1, wherein the
dynamic state machine architecture comprises a plurality of node
elements, wherein each of the plurality of node elements represents a
node in one of a plurality of flow graphs for controlling the dialogue
between the user and the system, and wherein each of the plurality of flow
graphs includes a plurality of sub-graphs.
3. The processor-implemented method as claimed in claim 1, wherein
dynamically expanding the dynamic state machine architecture includes
identifying at least one new node associated with the second sub-graph,
wherein the at least one new node is capable of generating a response to
the second natural language input.
4. The processor-implemented method as claimed in claim 1, further comprising constructing a disambiguation node when more than one node capable of generating a response to the second natural language input is identified.
5. The processor-implemented method as claimed in claim 1, wherein dynamically expanding the dynamic state machine architecture further comprises sharing contextual information from the first sub-graph to the second sub-graph.
6. A system (102) for enabling a context-based dynamic dialogue, the system (102) comprising:
at least one processor (202);
at least one tangible, non-transitory memory (206) coupled with the at least one processor (202);
- a receiving module (212), coupled to the at least one processor (202), configured to receive a first natural language input from a user device (104);
- a natural language processing module (214) coupled to the at least one processor (202), configured to analyse the first natural language input to generate a plurality of semantic representations of the first natural language input;
- a dialogue management module (216) coupled to the at least one
processor (202), configured to
select a dynamic state machine architecture based on the plurality of semantic representations, the dynamic state machine architecture including a first sub-graph capable of generating a response to the first natural language input, and
expand the dynamic state machine architecture based on a second natural language input, wherein dynamically expanding the dynamic state machine architecture includes adding a second sub-graph to the first sub-graph in the dynamic state machine architecture, and wherein the second natural language input is received by the receiving module,
generate a natural language response to the second natural language input using the expanded dynamic state architecture, and transmit the natural language response to the user device.
7. The system as claimed in claim 6, wherein the dynamic state machine
architecture comprises a plurality of node elements, wherein each of the
plurality of node elements represents a node in one of a plurality of flow
graphs for controlling the dialogue between the user and the system, and
wherein each of the plurality of flow graphs includes a plurality of sub-graphs.
8. The system as claimed in claim 6, wherein the dialogue management
module (216) is configured to dynamically expand the dynamic state
machine architecture by identifying at least one new node associated with
the second sub-graph, wherein the at least one new node is capable of
responding to the second natural language input.
9. The system as claimed in claim 6, wherein the dialogue management
module (216) is further configured to construct a disambiguation node when more than one node capable of responding to the second natural language input is identified.
10. The system as claimed in claim 6, wherein dynamically expanding the dynamic state machine architecture by the dialogue management module (216) further comprises sharing information from a first node associated with a first sub-graph to the at least one new node associated with the second sub-graph.