
A Controller For A Conversational System And Method For Operating The Same

Abstract: The conversational system 100 facilitates contextual conversation with a user 136. The conversational system 100 comprises the controller 110 interfaced with an input means 120 and an output means 118. The controller 110 is configured to receive conversational input through the input means 120 and determine, through a context module 116, a context of the conversational input using a first dataset comprising at least one of user data, a user profile and conversation history stored in a memory element 106. The context module 116 is any one of a rule-based model and a learning-based model. The controller 110 provides conversational output based on the processed input through the output means 118, characterized in that the controller 110 is configured to determine the context of the input using a second dataset in addition to the first dataset. The second dataset relates to environmental or situational data of the user 136. Figure 1


Patent Information

Application #
Filing Date
01 September 2023
Publication Number
10/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

Bosch Global Software Technologies Private Limited
123, Industrial Layout, Hosur Road, Koramangala, Bangalore – 560095, Karnataka, India
Robert Bosch GmbH
Postfach 30 02 20, D-70442, Stuttgart, Germany

Inventors

1. Khalpada Purvish
C/O Gordhan Vallabh, 2183, Shree Gokul Niwas, Near Statue of Gandhi, Aazad Chowk, Kapadwanj, Kheda, Gujarat – 387620, India
2. Karthikeyani Shanmuga Sundaram
3/58, AKG Nagar, Ponnalamman durai, Sethumadai(Po), Pollachi(Tk), Coimbatore – 642133, Tamilnadu, India
3. Swetha Shankar Ravisankar
Tower 4, 304 Salarpuria Sattva Cadenza Apartments, Near Nandi Toyota Office, Kudlu Gate Signal, Hosur Main Road, Bengaluru – 560068, Karnataka, India
4. Arvind Devarajan Sankruthi
P-207, Purva Bluemont, Trichy Road, Singanallur, Coimbatore – 641005, Tamilnadu, India

Specification

Description: Complete Specification:
The following specification describes and ascertains the nature of this invention and the manner in which it is to be performed.
Field of the invention:
[0001] The present invention relates to a controller for a conversational system and method for operating the conversational system.

Background of the invention:
[0002] Existing solutions often use the term "environment" in a loose sense: they refer to metadata, such as the channel or timestamp of a message, as the environment. A few methods discuss using external sensors to answer a user's queries; for example, if the user asks, "What is the temperature?", the system reads the thermal sensors and responds. Such systems use the sensors as a source of knowledge in a passive manner, not actively.

[0003] The majority of existing devices/systems today deliver command-driven conversations, especially in automobiles, homes, or the like. For example, consider the following typical scenario of such command-driven conversational systems.
User: I am feeling very hot.
System: Okay! AC temperature reduced to 21 degrees.

[0004] Such systems are more voice-controlled systems than conversational systems. A voice-controlled system translates physical controls like dials, buttons, etc. into vocal controls. Such translations, being very passive in participation, do not leverage the true potential of a conversational system. The majority of such production systems consider only the current command. A more advanced conversational agent considers the previous conversation, or conversation history, as additional context. However, limiting the context to the current command and the previous conversation history is not sufficient.

[0005] One of the biggest hurdles of conversation is establishment of the context. A shared, common context is the foundation of any conversation; in fact, conversation between two entities is only feasible when a shared context exists. When conversing with fellow human beings, the context is consciously and/or subconsciously identified. For example, depending on the context, "it is a bad day" can mean "I tried my best to present the idea, but the boss harshly criticized it", or "my kid is constantly throwing tantrums", or "it has been raining severely", or "a universally beloved, noble person died", and so on. When a person says this to a friend, the friend subconsciously derives the context of the statement and responds in an appropriate manner. Although previous conversation with the friend can help approximate the context, that is not the only thing the friend considers; the environmental and situational context allows the friend to better understand the statement. This is one of the shortcomings of existing conversational systems.

[0006] The patent literature US2018052925 discloses sensor-based context augmentation of search queries. A computing device and method are usable to augment search queries with data obtained from sensors. The computing device comprises a processor configured to receive, from a query source, a search query comprising a query concept. The processor is further configured to determine a context of the query concept expressed in the query, determine a response to the query, validate the context of the query using at least one sensor, and transmit the response to the query source.

Brief description of the accompanying drawings:
[0007] An embodiment of the disclosure is described with reference to the following accompanying drawings,
[0008] Fig. 1 illustrates a block diagram of a controller for a conversational system, according to an embodiment of the present invention, and
[0009] Fig. 2 illustrates a method of operating the conversational system, according to the present invention.

Detailed description of the embodiments:
[0010] Fig. 1 illustrates a block diagram of a controller for a conversational system, according to an embodiment of the present invention. The conversational system 100 facilitates contextual conversation with a user 136. The conversational system 100 comprises the controller 110 interfaced with an input means 120 and an output means 118. The controller 110 is configured to receive conversational input through the input means 120 and determine, through a context module 116, a context of the conversational input using a first dataset comprising at least one of user data, a user profile and conversation history stored in a memory element 106 of the controller 110. The context module 116 is any one of a rule-based model and a learning-based model. The controller 110 provides conversational output based on the processed input through the output means 118, characterized in that the controller 110 is configured to determine the context of the input using a second dataset in addition to the first dataset. The second dataset relates to environmental or situational data of the user 136. For ease of understanding, the first dataset can be considered primary data and the second dataset secondary data. Further, it is to be noted that Automatic Speech Recognition or Speech-to-Text conversion is also performed, but it is not explained here as it is state of the art.
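
By way of a non-limiting illustration only, the control flow described above may be sketched in Python; the class and function names (Controller, simple_context_module, etc.) are hypothetical stand-ins and do not form part of the claimed subject matter.

    # Illustrative sketch of the controller flow of Fig. 1 (hypothetical names).
    class Controller:
        def __init__(self, context_module, memory):
            self.context_module = context_module  # rule-based or learning-based
            self.memory = memory                  # stands in for memory element 106

        def handle(self, text, second_dataset):
            # First dataset: user data, user profile and conversation history.
            first_dataset = {
                "user_profile": self.memory.get("user_profile", {}),
                "history": self.memory.get("history", []),
            }
            # Characterizing step: the context is determined from the first
            # AND the second (environmental/situational) dataset.
            context = self.context_module(text, first_dataset, second_dataset)
            self.memory.setdefault("history", []).append((text, context))
            return f"[{context}] {text}"

    def simple_context_module(text, first_dataset, second_dataset):
        # Toy rule standing in for context module 116.
        if second_dataset.get("outside_temp_c", 20) < 10:
            return "cold-environment"
        return "neutral"

    controller = Controller(simple_context_module, {})
    print(controller.handle("I am feeling cold", {"outside_temp_c": 4}))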

[0011] The conversational input refers to dialogue between two or more humans, dialogue between humans and the conversational system 100, or a query or question to humans or the conversational system 100. Further, the input means is at least one of a microphone, a keyboard on a touch screen, or a conventional keyboard.

[0012] It is important to understand some aspects of Artificial Intelligence (AI) technology and AI-based devices, which can be explained as follows. Depending on the architecture of the implementation, AI devices may include many components. One such component is an AI model or AI module; different modules are described later in this disclosure. The AI model can be defined as a reference or inference set of data which uses different forms of correlation matrices. Using these AI models and their data, correlations can be established between different types of data to arrive at some logical understanding of the data. A person skilled in the art would be aware of the different types of AI models, such as linear regression, naïve Bayes classifiers, support vector machines, neural networks and the like. It must be understood that this disclosure is not specific to the type of model being executed and can be applied to any AI module irrespective of the AI model being executed. A person skilled in the art will also appreciate that the AI model may be implemented as a set of software instructions, a combination of software and hardware, or any combination of the same.

[0013] Some of the typical tasks performed by AI systems are classification, clustering, regression, etc. The majority of classification tasks depend upon labelled datasets; that is, the datasets are labelled manually in order for a neural network to learn the correlation between labels and data. This is known as supervised learning. Some of the typical applications of classification are face recognition, object identification, gesture recognition, voice recognition, etc. In a regression task, the model is trained on labelled datasets where the target labels are numeric values. Some of the typical applications of regression are weather forecasting, stock price prediction, house price estimation, energy consumption forecasting, etc. Clustering or grouping is the detection of similarities in the inputs; clustering techniques do not require labels to detect similarities.

[0014] According to an embodiment of the present invention, the second dataset is selected based on availability from a group of data sources 102 comprising facial expressions/emotions extracted from a camera 122, physiological parameters from a wearable device 124 worn by the user 136, external data comprising weather data, environmental parameters and traffic data from a map-based service provider via an internet-connected database 126, conversation history 128, a calendar entry or a to-do list obtained from a smart device 130, a location from a satellite-based navigation system 132, vehicle data from built-in sensors 134 of the vehicle, and the like. The database 126 comprising user data is updated regularly as and when the user changes any preferences or performs any activity.
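
As a non-limiting sketch of the availability-based selection described above, the following Python snippet assumes each data source exposes a read function that may fail when the source is absent; the source names mirror paragraph [0014], while the getter functions are illustrative stand-ins.

    # Assemble the second dataset from whichever sources are currently available.
    def collect_second_dataset(sources):
        dataset = {}
        for name, getter in sources.items():
            try:
                dataset[name] = getter()   # e.g. read a sensor or a web API
            except Exception:
                pass                       # source unavailable: skip it
        return dataset

    sources = {
        "facial_emotion": lambda: "neutral",        # camera 122
        "heart_rate_bpm": lambda: 72,               # wearable device 124
        "weather": lambda: {"uv_index": "high"},    # internet database 126
        "location": lambda: (12.93, 77.62),         # navigation system 132
    }
    print(collect_second_dataset(sources))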

[0015] In accordance with an embodiment of the present invention, the controller 110 is provided with the necessary signal detection, acquisition and processing circuits. The controller 110 comprises an input interface 104 and output interfaces 108 having pins or ports, the memory element 106 such as Random Access Memory (RAM) and/or Read Only Memory (ROM), an Analog-to-Digital Converter (ADC) and a Digital-to-Analog Converter (DAC), clocks, timers, counters and at least one processor (capable of implementing machine learning) connected with each other and to other components through communication bus channels. The memory element 106 is pre-stored with logics or instructions or programs or applications or modules/models and/or threshold values/ranges, reference values and predefined/predetermined criteria/conditions, which is/are accessed by the at least one processor as per the defined routines. The internal components of the controller 110 are not explained, being state of the art, and the same must not be understood in a limiting manner. The controller 110 may also comprise communication units such as transceivers to communicate through wireless or wired means such as Global System for Mobile Communications (GSM), 3G, 4G, 5G, Wi-Fi, Bluetooth, Ethernet, serial networks, and the like. The controller 110 is implementable in the form of a System-in-Package (SiP) or System-on-Chip (SoC) or any other known type. Examples of the controller 110 include, but are not limited to, a microcontroller, microprocessor, microcomputer, etc.

[0016] Further, the processor may be implemented as any or a combination of one or more microchips or integrated circuits interconnected using a parent board, hardwired logic, software stored in the memory element 106 and executed by a microprocessor, firmware, an application specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA). The processor is configured to exchange and manage the processing of various AI models.

[0017] According to an embodiment of the present invention, the controller 110 comprises an aggregator module 112 to collect the first dataset and the second dataset of the user 136. The aggregator module 112 is connected to the data sources 102 and sensors and continuously collects the data for processing. According to the present invention, a featurizer module 114 is configured to receive the input from the aggregator module 112 and derive/extract feature vectors of the first dataset and the second dataset. The featurizer module 114 vectorizes the aggregated data if the conversational system 100 is an end-to-end learning-based model, or produces a value matrix for a rule-based model. Similar components are used at the time of building/training the conversational system 100, where either the rules that allow inference of the context are deployed, or data samples from the system, along with the sensory information, are collected to train a learning model.
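
The division of labour between the two modules can be sketched as follows; this is a simplified assumption of how aggregation and featurization might be wired, not the actual implementation.

    # The aggregator gathers raw data; the featurizer turns it into a feature
    # vector (learning-based model) or a value matrix (rule-based model).
    def aggregate(first_dataset, second_dataset):
        return {**first_dataset, **second_dataset}

    def featurize(aggregated, mode="learning"):
        if mode == "learning":
            # Flatten numeric readings into one vector for an end-to-end model.
            return [float(v) for v in aggregated.values()
                    if isinstance(v, (int, float))]
        # Rule-based: keep named values so that rules can match on them.
        return dict(aggregated)

    agg = aggregate({"history_len": 4}, {"outside_temp_c": 4.0, "heart_rate": 72})
    print(featurize(agg, mode="learning"))   # [4.0, 4.0, 72.0]
    print(featurize(agg, mode="rule"))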

[0018] According to the present invention, the value matrix is elaborated here just for the purpose of understanding; it is otherwise known to a person skilled in the art. All machine learning or computational algorithms work on numbers, so any kind of input is internally represented in the form of numbers. For example, an image is represented as a matrix of pixels, where every pixel is an (R, G, B) value, like (255, 10, 128), representing a colour in the RGB colour space. Depending on the task/domain, there are various ways to convert traditionally non-numeric data (like an image) to a set/vector/matrix of numbers. For natural language, one such way is to create a dictionary of all words and assign a unique number to each one of them. For example, one can build a dictionary like (a, 0), (an, 1), …, (elephant, 519), …, (hungry, 2137), …, (is, 4489), … and then use it to convert the sentence "an elephant is hungry" to the vector [1, 519, 4489, 2137].
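
The dictionary example above can be reproduced in a few lines of Python; the index values are the ones given in the example and are otherwise arbitrary.

    # Word-to-index mapping from paragraph [0018].
    vocab = {"a": 0, "an": 1, "elephant": 519, "hungry": 2137, "is": 4489}

    def vectorize(sentence, vocab):
        # Map each word to its dictionary index (unknown words are skipped here).
        return [vocab[w] for w in sentence.lower().split() if w in vocab]

    print(vectorize("an elephant is hungry", vocab))   # [1, 519, 4489, 2137]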

[0019] Instead of converting such raw values to vectors, the output of a machine learning algorithm can be used to create higher-order vectors. For example, let us assume, for the sake of argument, that "an elephant is hungry" carries a sad emotion. The emotion value (the output of an ML-based emotion classifier) is then used to represent one dimension of the sentence, instead of the raw vector of the sentence itself. In other words, the outputs from other devices/algorithms are used to create an aggregated vector.
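
A minimal sketch of such an aggregated vector follows; the emotion scale, the toy classifier and the chosen dimensions are illustrative assumptions.

    # Use the outputs of other models (e.g. an emotion classifier) as dimensions
    # of an aggregated vector, instead of raw word indices.
    EMOTIONS = {"sad": -1.0, "neutral": 0.0, "happy": 1.0}

    def toy_emotion_classifier(sentence):
        return "sad" if "hungry" in sentence else "neutral"

    def aggregated_vector(sentence, outside_temp_c, face_count):
        emotion = EMOTIONS[toy_emotion_classifier(sentence)]
        # One dimension per upstream device/algorithm output.
        return [emotion, outside_temp_c, float(face_count)]

    print(aggregated_vector("an elephant is hungry", 4.0, 1))   # [-1.0, 4.0, 1.0]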

[0020] According to the present invention, the working of the featurizer module 114 is explained as follows. For face detection, for example, the featurizer module 114 builds a vector of facial features such as face location, face count and face recognition labels (like who is in the frame), etcetera, and not the individual components of the face. As another example, for auditory sensors, it featurizes the emotion of the voice, the recognized speaker label, the number of voices, etcetera.
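
A sketch of such face-level featurization is given below; the detections are assumed to come from an upstream face detector (not shown), and the feature layout is illustrative.

    # Build face-level features (count, labels, locations), not raw pixels.
    def face_features(detections):
        # detections: list of {"label": str, "box": (x, y, w, h)}
        labels = sorted(d["label"] for d in detections)
        centers = [(x + w / 2, y + h / 2)
                   for d in detections
                   for (x, y, w, h) in [d["box"]]]
        return {"face_count": len(detections), "labels": labels,
                "centers": centers}

    print(face_features([{"label": "driver", "box": (40, 30, 100, 100)}]))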

[0021] This featurized vector or value matrix brings environmental context to the conversational system 100. The controller 110 of the conversational system 100 selects (rule-based model) or generates (end-to-end learning model) an appropriate response to the user query with environmental context. The rule-based model and the end-to-end learning model are concepts known to a person skilled in the art.
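
For the rule-based variant, response selection over the value matrix may be sketched as follows; the rules themselves are illustrative and echo the in-car AC example given later in this description.

    # A rule-based model selects the first response whose condition matches
    # the value matrix; an end-to-end model would generate text instead.
    RULES = [
        (lambda v: v.get("user_says") == "hot" and v.get("uv_index") == "high",
         "Reduced AC to 21 degrees. UV index is high; close the sunroof too?"),
        (lambda v: v.get("user_says") == "hot",
         "I have set AC to 18 degrees and increased the fan speed."),
    ]

    def select_response(values):
        for condition, response in RULES:
            if condition(values):
                return response
        return "Sorry, I did not get that."

    print(select_response({"user_says": "hot", "uv_index": "high"}))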

[0022] The output of the featurizer module 114 is stored in the memory element 106. For future queries, the history of these environmental contexts is referred to in determining the present/current context. Often, the current environmental context alone is not sufficient; hence, the history of environmental contexts is stored and referred to as a context feature, similar to the history of conversational context.
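
A bounded history buffer is one simple way to realize this; the buffer size and structure below are assumptions for illustration.

    # Keep a bounded history of past environmental contexts and expose it as a
    # feature for the next query (e.g. "chillier than the user is used to").
    from collections import deque

    class ContextHistory:
        def __init__(self, maxlen=50):
            self.buffer = deque(maxlen=maxlen)

        def store(self, context):
            self.buffer.append(context)

        def as_feature(self):
            return list(self.buffer)

    history = ContextHistory()
    history.store({"outside_temp_c": 24})
    history.store({"outside_temp_c": 4})
    print(history.as_feature())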

[0023] According to an embodiment of the present invention, the controller 110 is part of at least one of an infotainment unit of the vehicle, a smartphone, a wearable device 124 or a cloud computer. Alternatively, the conversational system 100 is at least one of the infotainment unit of the vehicle, the smartphone, the wearable device 124, the cloud computer, a smart speaker, a smart display and the like. In other words, the controller 110 is part of an internal device of the vehicle or part of an external device connected to the vehicle through known wired or wireless means as described earlier.

[0024] An example conversation with the conversational system 100 is provided to explain the present invention. The same must not be understood in a limiting manner.

[0025] The controller 110 enables the conversational system 100 to trigger personalized conversation based on the environmental conditions. Consider the following scenarios, in which the conversational system 100 triggers a conversation with environmental context.

Without environmental context:
System 100: … (no conversation is triggered)
Some other day…
System 100: … (no conversation is triggered)

With environmental context:
System 100: <Notices the environment is much chillier than where the user lives, or than what the user is used to.>
System 100: Hey! It's chilly outside. I am feeling cold already. I hope you have your sweatshirts on.
User: Oh yes. It's unusually chilly here. I have my sweatshirts on.
Some other day…
System 100: Hey! Shall we increase the AC temperature, as you have a cold?
User: All right.

[0026] Such personalization makes the conversational system 100 perceivably more intelligent. The controller 110 is not limited to triggering the conversation; it also helps in understanding the user 136 better. For example, Mr. John Doe is driving on a hot sunny day. Consider the following scenario.
Without environmental context:
User: I am feeling very hot.
System 100: I have set AC to 18 degrees and increased the fan speed.
Some other day…
User: I am feeling very hot.
System 100: I have set AC to 18 degrees and increased the fan speed.

With environmental context:
User: I am feeling very hot.
System 100: I have reduced the AC temperature to 21 degrees. You know, today the UV index is high, and it is a bright sunny day. Shall I also close the sunroof and the window blinds?
Some other day…
User: I am feeling very hot.
System 100: I have set AC to 18 degrees and increased the fan speed.

[0027] Here, the decision-making is based not only on what the user 136 is asking and a predefined configuration (like a reduction of 5 degrees in AC temperature), but also on environmental factors. Such conversations are closer to human-like conversations. The second dataset is not limited to environmental context or weather conditions; it extends to every sensor that allows the conversational system 100 to sense the environment the user 136 is in, like location, emotion, surrounding people (friends, family, co-workers), in-car sensors like vehicle speed, and supplementary sensors like the heart-rate sensor on the user's wearables, along with data from other applications like the calendar (for example, if the user 136 has an interview scheduled, the conversational system 100 may give a pep talk along with reducing the AC temperature), as sketched below.
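
The calendar example can be sketched as below; the event structure and the wording of the pep talk are hypothetical.

    # Combine a climate action with a pep talk when the calendar shows an
    # imminent interview (illustrative only).
    def respond_to_hot(calendar_events):
        actions = ["AC temperature reduced."]
        if any(e.get("type") == "interview" for e in calendar_events):
            actions.append("Good luck with your interview today; "
                           "you are well prepared!")
        return " ".join(actions)

    print(respond_to_hot([{"type": "interview", "time": "15:00"}]))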

[0028] Although all the examples above relate to temperature, the invention is not limited to the use of a thermal sensor. The scenarios below consider location, maps, other service APIs, etc. The invention argues for the aggregation of such sensors to better understand the environment the user 136 is in, as well as the user's situation. Such aggregation of environmental context leads to more personalized and sounder conversations. Below is a table comparing the responses of different conversational devices with the controller 110 of the present invention.

[0029] In the table, Device A, Device B, Device C, Device D and Device E are all different types of existing or conventional digital assistants either standalone or in a vehicle.

User 136: I am tired.
Device A: Maybe you've been working too hard, you deserve to rest.
Device B: Listen to me. Put down this device right now and take a nap. I will wait here.
Device C: I wish you sweet slumber.
Device D: I understand, taking breaks and getting enough rest is important for maintaining physical and mental health.
Device E: It is advisable to prioritize sleep, eat well and engage in physical activity to boost energy levels.
Controller 110: Hey! I understand a lot of traffic can be tiresome. Good news is it is only till the next 500 meters. While you wait, do you want me to connect with your girlfriend?

User 136: I am nervous.
Device A: You are not alone in how you feel. I read on the internet that anxiety affects 40 million adults in the US alone. The good news is that it's highly treatable. If you think you need medical help, please reach out to your doctor.
Device B: I am sorry you're going through this. I've heard that taking your mind off things can help. Take a break and find something that makes you smile.
Device C: Sorry, I don't know that one.
Device D: Being nervous is a natural reaction to stress and can be common in situations that are new or challenging. Here are a few ways to calm nerves:
• Take slow, deep breaths
• Practice positive self-talk
• Engage in physical activity
• Use visualization techniques
• Try to focus on the present moment
Remember, everyone gets nervous at times, and it is a normal part of life.
Device E: Here is what I found on the web.
Controller 110: Hey! It is okay to be nervous before the presentation. However, I know you have worked a lot on it. Do not worry. I know you've got this.

With the present invention, the controller 110 delivers a perceivably more intelligent and personalized response. Considering environmental and situational context is a natural, real-life extension of considering only the previous history as context.

[0030] Fig. 2 illustrates a method of operating the conversational system, according to the present invention. The method comprises a plurality of steps, of which a step 202 comprises receiving conversational input from the input means 120. A step 204 comprises determining, by the context module 116, the context of the conversational input using the first dataset comprising user data, user profile and conversation history stored in the memory element 106. The context module 116 is any one of a rule-based model and a learning-based model. A step 206 comprises providing conversational output based on the determined context through the output means 118. The method is characterized by a step 208, in which the context of the conversational input is determined using the second dataset in addition to the first dataset. The second dataset relates to environmental or situational data of the user 136. The determined context is stored in the memory element 106 and accessed as the second dataset in subsequent conversational input from the user 136.

[0031] According to the method, the second dataset is selected based on availability and is selected from the group of data source 102 comprising a facial expression/emotions extracted from a camera 122, physiological parameter from a wearable device 124 worn by the user 136, external data comprising weather data, environmental parameters, traffic data from the map based service provider from the internet connected database 126, conversation history 128, the calendar entry, the to-do list obtained from the smart device 130, the location through the satellite based navigation system 132, vehicle data from built-in sensors 134 of the vehicle, and the like.

[0032] According to the present invention, the method comprises the aggregator module 112 for collecting the first dataset and the second dataset of the user 136. Further, the method comprises the featurizer module 114 for receiving the input from the aggregator module 112 and extracting feature vectors of the first dataset and the second dataset.

[0033] According to an embodiment of the present invention, the conversational system 100 is preferably used in a vehicle to provide more convenience to the driver or passengers. The conversational system 100 may also be referred to as a digital companion or virtual companion, which is more than a digital assistant in that the conversational system 100 is able to extract/derive and give more information for a detected or asked query.

[0034] According to the present invention, the controller 110 and the method encapsulate the environmental context and the history of environmental contexts in the conversational system 100. The environmental context is aggregated from various sensors and sources when deployed. When a user 136 gives a query to the conversational system 100, it considers context from these sources and populates or selects an appropriate response. The input of the conversational system 100 is thus changed from only the user query and history to the environmental information along with them. The present invention provides an active companion that is able to predict conflicting/bad situations and derive solutions. For example, the present invention avoids a situation where a tyre-pressure warning light has to start blinking: the data is observed, a prediction is made as to what can go wrong, and the system tries to minimize the problem, instead of checking values only after something has already gone wrong.

[0035] It should be understood that the embodiments explained in the description above are only illustrative and do not limit the scope of this invention. Many such embodiments and other modifications and changes in the embodiment explained in the description are envisaged. The scope of the invention is only limited by the scope of the claims.
Claims: We claim:
1. A controller (110) for a conversational system (100), said conversational system (100) facilitates contextual conversation with a user (136), said conversational system (100) comprises said controller (110) interfaced with an input means (120) and an output means (118), said controller (110) configured to,
receive conversational input through said input means (120),
determine, through a context module (116), a context of said conversational input using a first dataset comprising user data, user profile and conversation history stored in a memory element (106), said context module (116) is any one of a rule based model and a learning based model, and
provide conversational output, based on said determined context, through said output means (118), characterized in that, said controller (110) configured to
determine said context of said input using a second dataset in addition to said first dataset, said second dataset relates to environmental or situational data of said user (136).

2. The controller (110) as claimed in claim 1, wherein a determined context is stored in a database and accessed as second dataset in subsequent conversational input from said user (136).

3. The controller (110) as claimed in claim 1, wherein said second dataset is selected based on availability and is selected from a group of data sources (102) comprising a facial expression/emotions extracted from a camera (122), physiological parameter from a wearable device (124) worn by said user (136), external data comprising weather data, environmental parameters, traffic data from a map based service provider from an internet connected database (126), conversation history (128), a calendar entry, a to-do list obtained from a smart device (130), a location through a satellite based navigation system (132), vehicle data from built-in sensors (134) of a vehicle, and the like.

4. The controller (110) as claimed in claim 1 comprises an aggregator module (112) to collect said first dataset and said second dataset of said user (136).

5. The controller (110) as claimed in claim 4 comprises a featurizer module (114) to receive said input from said aggregator module (112) and derive feature vectors of said first dataset and said second dataset.

6. A method for operating a conversational system (100), said conversational system (100) facilitates contextual conversation with a user (136), said method comprising the steps of:
receiving conversational input from an input means (120),
determining, by a context module (116), a context of said conversational input using a first dataset comprising user data, user profile and conversation history stored in a memory element (106), said context module (116) is any one of a rule based model and a learning based model, and
providing conversational output, based on said determined context, through an output means (118), characterized by,
determining said context of said conversational input using a second dataset in addition to said first dataset, said second dataset relates to environmental or situational data of said user (136).

7. The method as claimed in claim 6, wherein a determined context is stored in said memory element (106) and accessed as second dataset in subsequent conversational input from said user (136).

8. The method as claimed in claim 6, wherein said second dataset is selected based on availability and is selected from a group of data source (102) comprising a facial expression/emotions extracted from a camera (122), physiological parameter from a wearable device (124) worn by said user (136), external data comprising weather data, environmental parameters, traffic data from a map based service provider from an internet connected database (126), conversation history (128), a calendar entry, a to-do list obtained from a smart device (130), a location through a satellite based navigation system (132), vehicle data from built-in sensors (134) of a vehicle, and the like.

9. The method as claimed in claim 6 comprises an aggregator module (112) to collect said first dataset and said second dataset of said user (136).

10. The method as claimed in claim 9 comprises a featurizer module (114) for receiving said input from said aggregator module (112) and deriving/extracting feature vectors of said first dataset and said second dataset.

Documents

Application Documents

# Name Date
1 202341058662-POWER OF AUTHORITY [01-09-2023(online)].pdf 2023-09-01
2 202341058662-FORM 1 [01-09-2023(online)].pdf 2023-09-01
3 202341058662-DRAWINGS [01-09-2023(online)].pdf 2023-09-01
4 202341058662-DECLARATION OF INVENTORSHIP (FORM 5) [01-09-2023(online)].pdf 2023-09-01
5 202341058662-COMPLETE SPECIFICATION [01-09-2023(online)].pdf 2023-09-01
6 202341058662-Power of Attorney [29-08-2024(online)].pdf 2024-08-29
7 202341058662-Form 1 (Submitted on date of filing) [29-08-2024(online)].pdf 2024-08-29
8 202341058662-Covering Letter [29-08-2024(online)].pdf 2024-08-29