FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
SYSTEM AND METHOD PROVIDING MULTI-MODALITY INTERACTION OVER VOICE CHANNEL BETWEEN COMMUNICATION DEVICES
Applicant
TATA Consultancy Services Limited A company Incorporated in India under The Companies Act, 1956
Having address:
Nirmal Building, 9th Floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.
FIELD OF THE INVENTION
The present invention relates to a system and method facilitating a multi-modality interaction between communication devices. More particularly, the present invention relates to converting the modality of data transmission between communication devices over a communication path.
BACKGROUND OF THE INVENTION
Interactive Voice Response (IVR) systems are widely deployed by companies to automate customer interactions over telephone calls in areas such as travel inquiry and booking, phone banking and trading, and telecommunication service provision. Enterprises increasingly use IVR to reduce the cost of general sales, service, inquiry and support calls to and from the company. The benefits of IVR systems include the handling of large call volumes, reduction in cost and improved customer experience.
These automated telephone IVR systems facilitate answering customer calls by interacting with the callers. These interactions are enabled by presenting the caller with a voice menu through voice prompts. The caller is then expected to select a choice by pressing a key on the telephone keypad or by speaking a command into the telephone. Generally, the information required by the caller is also played back to the caller as a voice prompt.
This kind of interface is referred to as a Voice User Interface (VUI). Unlike a Graphical User Interface (GUI), where the user can scan the screen repeatedly and view the information without explicitly remembering anything, the information played in voice menus and voice prompts has to be listened to and remembered by the user. This increases the cognitive load on the user. For example, if a user calls a Telephone Banking IVR system and asks for his last five transactions, the system will play the transaction details to the caller one after another, and the user has to both listen to and remember this lengthy information containing the date and amount of each transaction. Further, with lengthy voice menus and lengthy voice prompts, the user spends more time obtaining the information, and the lengthier transaction is more expensive for both the service provider and the user. These problems with the IVR voice user interface degrade the user experience during an IVR call.
Various solutions have been proposed to enhance the IVR interaction by presenting the user with a visual menu, similar to the voice menu, on his mobile phone. One method proposes a text-enhanced voice menu system which provides a voice menu to the caller and additionally sends the menu information in text form during a voice communication, where an enhanced telephone is used. This system separately stores the audio information for producing the voice menu and the text information for producing the text version of the voice menu. According to this method, voice and text are communicated to the user over two different corresponding paths.
Another method provides an Interactive Voice and Video Response (IVVR) system, which adds a video picture to the IVR system. This technology improves the user experience and helps users reach the needed information more quickly than a recorded voice menu alone. However, the IVVR system requires the user to make a 3G video call from a 3G mobile phone.
One of the disclosed methods includes sending text information from one mobile to another by transmitting via radio a series of Dual-Tone Multi-Frequency (DTMF) tones corresponding to the selected characters. This transmission occurs through the voice channel. However, this method covers text transfer from mobile to mobile and does not cover text transfer from an IVR system to a mobile.
Most of these approaches present the menu text to the user via the data channel only, while some require advanced technologies such as a 3G video call and a 3G mobile phone in order to obtain a visual display of the text menu.
Therefore, there is a need for a system and method which provides the user with a visual display of the text menu over a normal mobile phone call while interacting with an IVR system. There is also a need to reduce the lengthy and time-consuming repeated interactions with the user. The system and method should reduce the cognitive load on the user, thereby improving the user experience by providing the necessary information, in both voice and text form, with less effort and in comparatively less time.
OBJECTS OF THE INVENTION
It is the primary object of the invention to provide a system and method for a multi-modality interaction between communication devices.
It is another object of the invention to provide the system and method for selecting the modality of the data to be transmitted by analyzing the parameters associated with the corresponding communication device.
It is another object of the invention to select the language of the data to be transmitted by analyzing the parameters associated with the corresponding communication device.
It is yet another object of the invention to provide an embodiment in which the first communication device selects the modality of the data to be transmitted by the interactive system.
SUMMARY OF THE INVENTION
The present invention provides a system facilitating enhanced multi-modality interaction between communication devices. The system comprises of an interactive system configured to transmit a data to one or more communication devices. The interactive system further comprises of an intelligent module to determine and later select a modality for the data to be transmitted by analyzing one or more parameters associated with the data and the corresponding communication device and a first converter configured to change format of the data to be transmitted into signal, as per the selected modality. The system further comprises of a first communication device requesting to initiate an
interactive session with the interactive system. The first communication device further comprises of a switching module configured to receive and identify the modality in order to play the data thus transmitted and a second converter configured to reconvert the transmitted signals into the data. The system facilitates the enhanced multi-modality interaction such that the converted data is transmitted over a single preset voice communication path by the interactive system.
The present invention also provides a method facilitating enhanced multi-modality interaction between communication devices. The method comprises of steps of transmitting data from an interactive system to one or more communication devices. The transmitting further comprises of steps of determining and later selecting a modality for the data to be transmitted by analyzing one or more parameters associated with the data and the corresponding communication device and changing the format of data to be transmitted into signal. The method further comprises of initiating an interactive session between a first communication device and the interactive system. The initiating further comprises of steps of receiving and identifying the modality for playing the data thus transmitted and reconverting the signals into the data. The method facilitates the enhanced multi-modality interaction such that the converted data is transmitted over a single preset voice communication path by the interactive system.
BRIEF DESCRIPTION OF DRAWINGS
Figure 1 illustrates the system architecture facilitating an enhanced multi-modality interaction between a plurality of communication devices in accordance with an embodiment of the invention.
Figure 2 illustrates the multi-modality interaction between an IVR system and a user through his cell phone in accordance with an exemplary embodiment of the invention.
DETAILED DESCRIPTION
Some embodiments of this invention, illustrating its features, will now be discussed:
The words "comprising", "having", "containing", and "including", and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items.
It must also be noted that, as used herein and in the appended claims, the singular forms "a", "an", and "the" include plural references unless the context clearly dictates otherwise. Although any systems, methods, apparatuses, and devices similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present invention, the preferred systems and parts are now described. In the following description, reference is made to numerous embodiments for the purpose of explanation and understanding; the intent is not to limit the scope of the invention.
One or more components of the invention are described as modules for the understanding of the specification. For example, a module may include a self-contained component in a hardware circuit comprising logic gates, semiconductor devices, integrated circuits or any other discrete components. The module may also be a part of any software programme executed by any hardware entity, for example a processor. The implementation of a module as a software programme may include a set of logical instructions to be executed by the processor or any other hardware entity. Further, a module may be incorporated with the set of instructions or a programme by means of an interface.
The disclosed embodiments are merely exemplary of the invention, which may be embodied in various forms.
The present invention relates to a system and method facilitating enhanced multi-modality interaction between communication devices or a plurality of communication devices. The system and method facilitate the conversion of data from one modality into another by analyzing various parameters associated with the corresponding communication device which is receiving the data. The converted data is transmitted over a preset voice communication path in a network.
In accordance with an embodiment, referring to figure 1, the system (100) comprises of an interactive system (102) configured to transmit a data to one or more communication devices over a preset voice communication path. The interactive system (102) further comprises of an intelligent module (104) configured to determine and select a modality for the data to be transmitted and a first converter (106) for converting the data modality. The system (100) further comprises of a first communication device (108) requesting to initiate an interactive session with the interactive system (102). The first communication device (108) further includes a switching module (110) and a second converter (112) configured to reconvert the transmitted signals into the data.
In a communication network, one or more communication devices (the first communication device) request an interaction with the interactive system (102). The interactive system (102) may include but is not limited to an IVR system, and the corresponding first communication device (108) may include a cell phone. A user, through his first communication device (108), requests an interaction with the interactive system (102) and obtains the required information in a particular data modality. In the beginning, by way of specific example, the IVR system plays a voice menu to which the user listens in order to choose the best/relevant option from the voice menu thus played.
The interactive system (IVR system) transmits the data (in the form of a signal) in a continuous manner to the user of the first communication device (108). The interactive system (102) is also capable of changing the modality of the data before transmission. To this end, the interactive system (102) further comprises of the intelligent module (104) which selects the modality of the data to be transmitted to the first communication device.
The intelligent module (104) determines the modality of data to be transmitted to the corresponding communication device (first communication device) and later selects it. The interactive system (102) transmits the data over a preset voice communication path which includes but is not limited to a voice path. The modality of data transmitted further comprises of data in a voice menu and data in a text menu.
When the intelligent module (104) selects the modality as text menu, the data is converted into a DTMF sequence and transmitted via the voice channel to the user's communication device.
The intelligent module (104) selects the modality of the data to be transmitted by analyzing one or more parameters associated with the first communication device (108).
The parameters include but are not limited to the length of the data (number of words played in a particular time, number of words to be transmitted in a text menu), the cognitive ability of the user, the location of the user, the response time of the user, the education of the user and usage statistics. By analyzing these parameters, the intelligent module (104) sets a threshold value and changes the modality of the data if the cognitive load (of the data transmission on the user) exceeds this threshold value. For example, if the voice menu plays 7 options for around 3 minutes and the intelligent module (104) has set a threshold value (say, the maximum time taken by the options should be less than 2 minutes), then, when this threshold value is crossed, the intelligent module (104) selects the modality of the data as the text menu and the data is transmitted as text to the user of the first communication device (108).
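By way of a non-limiting sketch, the threshold rule described above may be expressed as follows; the specific numbers, names, and the use of estimated playback time as a proxy for cognitive load are illustrative assumptions, not part of the specification:

```python
# Illustrative sketch only: the intelligent module's threshold rule, using
# estimated voice-menu playback time as an assumed proxy for cognitive load.
VOICE = "voice_menu"
TEXT = "text_menu"
THRESHOLD_SECONDS = 120  # assumed limit: menus longer than 2 minutes switch to text


def select_modality(estimated_play_time_seconds: float) -> str:
    """Pick the modality: switch to text when the load proxy crosses the threshold."""
    if estimated_play_time_seconds > THRESHOLD_SECONDS:
        return TEXT
    return VOICE


# A 7-option menu taking about 3 minutes (180 s) crosses the 2-minute threshold:
print(select_modality(180))  # -> text_menu
print(select_modality(90))   # -> voice_menu
```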
Further, the intelligent module (104) is also configured to determine the amount of data which is to be converted into a different modality and transmitted to the first communication device (108). The intelligent module (104) either selects the complete data or only a part of the data for changing its modality. This selection is further dependent on the parameters analyzed by the intelligent module (104) that are associated with the corresponding communication devices.
The interactive system (102) further comprises of a first converter (106) communicating with the intelligent module (104) to change the format of the data to be transmitted as per the selected modality. By way of specific example, the first converter may include but is not limited to an audio signal encoder (DTMF encoder). The DTMF encoder changes the signals of the data to be transmitted into a sequence of DTMF (Dual-Tone Multi-Frequency) digits in order to change the modality of the data from voice menu to text menu.
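A minimal sketch of this encoding step, assuming the standard letter layout of a telephone keypad (as used by T9); the function name and the passthrough of non-letter characters are illustrative choices:

```python
# Illustrative sketch: map each letter of a text menu to the digit of the
# telephone key that carries it, yielding a DTMF digit sequence.
T9_KEYS = {
    '2': "ABC", '3': "DEF", '4': "GHI", '5': "JKL",
    '6': "MNO", '7': "PQRS", '8': "TUV", '9': "WXYZ",
}
LETTER_TO_DIGIT = {letter: digit for digit, letters in T9_KEYS.items()
                   for letter in letters}


def encode_text_to_dtmf(text: str) -> str:
    """Map each letter to its keypad digit; non-letters pass through unchanged."""
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in text.upper())


print(encode_text_to_dtmf("DEPOSIT"))  # -> 3376748
```

This reproduces the specification's example in which 'DEPOSIT' becomes the DTMF sequence '3376748'.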
The interactive system (102) further comprises of a language selection module (114) configured to transmit the data in a language desired by the user or selected by the intelligent module (104) after analyzing the parameters associated with the first communication device (108). By way of specific example, when the user of the first communication device (108) initiates the interaction with the interactive system (102), the intelligent module (104) starts analyzing the parameters associated with the first communication device (108) in order to select the suitable language for the data to be transmitted. After analyzing the region (location) from where the user has called, the intelligent module (104) selects the respective language. For example, if a user calls from Maharashtra, the intelligent module (104) selects Marathi as the language for transmitting the data and also determines the modality of the data which will be transmitted to the first communication device (108).
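The region-to-language selection described above may be sketched as a simple lookup; the mapping entries beyond the Maharashtra/Marathi example and the English default are illustrative assumptions:

```python
# Illustrative sketch: the language selection module maps the caller's
# region to a preferred language, with an assumed English fallback.
REGION_LANGUAGE = {
    "Maharashtra": "Marathi",   # example from the specification
    "Tamil Nadu": "Tamil",      # assumed additional entries
    "West Bengal": "Bengali",
}
DEFAULT_LANGUAGE = "English"  # assumed fallback


def select_language(caller_region: str) -> str:
    """Return the language for the caller's region, defaulting to English."""
    return REGION_LANGUAGE.get(caller_region, DEFAULT_LANGUAGE)


print(select_language("Maharashtra"))  # -> Marathi
```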
The data transmitted by the interactive system (102) is now received by the user of the first communication device (108). The first communication device (108) is provided with a switching module (110) in order to support the modality of the data thus transmitted by the interactive system (102).
The switching module (110) receives the data in the form of signals and identifies its modality (voice menu or text menu). Upon identification, the signals are reconverted into the data by the second converter (112) present in the first communication device (108). By way of specific example, the second converter (112) further comprises of a decoder that obtains the text data by means of a predictive text module such as the T9 predictive text module. The decoder converts the DTMF digits back into the text data, which is then displayed on the first communication device (108). This text data may be stored in the user's first communication device (108) for future reference.
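Because a T9 digit sequence is ambiguous (each key carries 3-4 letters), the decoding step described above can be sketched as matching the received digits against a vocabulary of expected menu words; the vocabulary, function names, and raw-digit fallback are illustrative assumptions:

```python
# Illustrative sketch: the second converter's decoder resolves a received
# DTMF digit sequence back to text by comparing it against the encodings
# of an assumed vocabulary of expected menu words (a simple stand-in for
# a T9 predictive text module).
T9_KEYS = {
    '2': "ABC", '3': "DEF", '4': "GHI", '5': "JKL",
    '6': "MNO", '7': "PQRS", '8': "TUV", '9': "WXYZ",
}
LETTER_TO_DIGIT = {letter: digit for digit, letters in T9_KEYS.items()
                   for letter in letters}

VOCABULARY = ["DEPOSIT", "WITHDRAWAL", "BALANCE", "TRANSFER"]  # assumed menu words


def word_to_digits(word: str) -> str:
    """Encode a vocabulary word into its keypad digit sequence."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word)


def decode_dtmf_to_text(digits: str) -> str:
    """Return the first vocabulary word whose encoding matches the digits."""
    for word in VOCABULARY:
        if word_to_digits(word) == digits:
            return word
    return digits  # fall back to the raw digits if no word matches


print(decode_dtmf_to_text("3376748"))  # -> DEPOSIT
```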
The data is transmitted by the interactive system (102) to the first communication device (108) over the voice channel in a radio frequency range.
In accordance with an embodiment, the first communication device (108) further comprises of a selection module (116) which is configured to provide the user an option for selecting the modality of the data to be transmitted by the interactive system (102).
By way of specific example, when the user selects the modality of the data to be transmitted as text menu, his selection is again verified by the intelligent module (104) present in the IVR system. The intelligent module (104) will only accept the modality selection of the user by analyzing the parameters associated with the first communication device (108). The intelligent module (104) will accept the request of transmitting the data in text if the cognitive load exceeds a threshold value (in terms of voice menu length, number of options played etc). If the cognitive load is below the threshold value, the intelligent module (104) will not act on the user's request.
Further, the intelligent module (104) will automatically select the modality of the data to be changed when the cognitive load again exceeds the threshold value, and will start transmitting the data as a text menu.
The user after receiving the data in the desired modality (voice or text) selects a relevant option. For example, in case of voice menu, the user selects the option either by speaking or by pressing a button provided in the first communication device (108). In case of text menu, the user may select his option again either by speaking or by pressing any button. Because of the text data, it will be easier for the user to remember all the options transmitted by the IVR system which will reduce the chances of requesting for a call repetition by the user. Moreover, the user may refer to the menu thus transmitted via SMS for his future reference.
BEST MODE/EXAMPLE FOR WORKING OF THE INVENTION
The system and method for facilitating an enhanced multi-modality interaction between communication devices may be illustrated by the working example stated in the following paragraphs; the process is not restricted to the said example only. With respect to an exemplary embodiment, as shown in Figure 2, a typical voice call between the user and the Telephone IVR with the proposed system will iterate through the following steps:
1. The IVR system has text information in the voice menu. It encodes the relevant text information into a sequence of DTMF digits.
a. Relevant text information implies that it is not necessary that the entire text in the voice menus is converted to the text menu. Instead, minimal text representing the information will be converted to the text menu. The selection module identifies how much data has to be converted before transmission. For the said example of a 'statement of account transactions', the prompt can be efficiently represented as the following text menu:
Withdrawal XX.XX. Deposit XX.XX. Withdrawal XX.XX. Withdrawal XX.XX. Deposit XX.XX.
b. If the voice menus and their text are not in the English language, then the text menus will be transliterated into the English language by way of the language selection module.
c. The DTMF encoder will encode the text menu into a DTMF sequence using a T9 predictive text module. For example, the keyword 'DEPOSIT' would be converted to the DTMF sequence '3376748', derived using T9 from the text on the 9 keys of a telephone keypad as shown in the figure below.
2. The IVR sends the sequence of DTMF digits to the user through the same voice call channel.
3. The user's mobile phone detects whether a DTMF digit sequence is sent by the IVR; if a DTMF sequence is detected instead of a voice prompt, it decodes the DTMF sequence into a digit sequence.
4. The digit sequence is then converted back into text using a T9 predictive text module. Additionally, if the IVR menu is non-English, the text is transliterated back into the respective language.
5. The text, which is the voice menu, is then displayed on the user's mobile phone screen.
6. The user then responds through Speech or key press input, as supported by the Telephone IVR.
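The steps above may be sketched end to end as follows; the vocabulary of menu words is an illustrative assumption, and the transliteration of steps 1b and 4 is omitted for brevity:

```python
# Illustrative end-to-end sketch of steps 1-5: the IVR side encodes a menu
# keyword into a T9/DTMF digit sequence, the digits travel over the voice
# channel, and the handset side resolves them back to text against an
# assumed vocabulary of expected menu words before display.
T9_KEYS = {'2': "ABC", '3': "DEF", '4': "GHI", '5': "JKL",
           '6': "MNO", '7': "PQRS", '8': "TUV", '9': "WXYZ"}
LETTER_TO_DIGIT = {letter: digit for digit, letters in T9_KEYS.items()
                   for letter in letters}
VOCABULARY = ["DEPOSIT", "WITHDRAWAL"]  # assumed menu words


def ivr_encode(word: str) -> str:
    """IVR side (step 1c): text keyword -> DTMF digit sequence."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper())


def handset_decode(digits: str) -> str:
    """Handset side (steps 3-4): DTMF digits -> text, via vocabulary match."""
    for word in VOCABULARY:
        if ivr_encode(word) == digits:
            return word
    return digits  # fall back to raw digits if no word matches


# Roundtrip: step 2 sends the digits over the voice channel; step 5 displays.
print(handset_decode(ivr_encode("DEPOSIT")))  # -> DEPOSIT
```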
WE CLAIM:
1. A system facilitating enhanced multi-modality interaction between communication devices, the system comprising:
an interactive system configured to transmit a data to one or more communication devices, the interactive system comprising:
an intelligent module to determine and later select a modality for the data to be transmitted by analyzing one or more parameters associated with the data and the corresponding communication device,
a first converter configured to change format of the data to be transmitted into signal, as per the selected modality;
a first communication device requesting to initiate an interactive session with the interactive system, the first communication device comprising:
a switching module configured to receive and identify the modality in order to play the data thus transmitted;
a second converter configured to reconvert the transmitted signals into the data;
such that the converted data is transmitted over a preset communication path by the interactive system.
2. The system as claimed in claim 1, wherein the interactive system may include but is not limited to an IVR (Interactive Voice Response) system.
3. The system as claimed in claim 1, wherein the modality for transmitting data further comprises a voice menu and a text menu.
4. The system as claimed in claim 1, wherein the parameters associated with the data and the corresponding communication device may include but are not limited to length of data (number of words played in a particular time, number of words to be transmitted in a text menu), cognitive ability of the user, location of a user, response time of a user, education of a user and usage statistics.
5. The system as claimed in claim 1, wherein the first communication device may include but is not limited to a mobile phone.
6. The system as claimed in claim 1, wherein the signal further comprises of any audio signal but not limited to DTMF (Dual-Tone-Multi-Frequency) signals.
7. The system as claimed in claim 1, wherein the first converter further comprises of an audio signal encoder but not limited to DTMF encoder.
8. The system as claimed in claim 1, wherein the second converter further comprises of a decoder and optionally any predictive text module.
9. The system as claimed in claim 1, wherein the communication path further comprises of a voice communication path.
10. The system as claimed in claim 1, wherein the interactive system further comprises of a language selection module configured to transmit the data into a language as desired by the user.
11. The system as claimed in claim 1, wherein the first communication device further comprises of a selection module to allow a user for selecting the modality of the data to be transmitted.
12. A method facilitating enhanced multi-modality interaction between communication devices, the method comprising steps of:
transmitting data from an interactive system to one or more communication devices, the transmitting further comprising steps of:
determining and later selecting a modality for the data to be transmitted by analyzing one or more parameters associated with the data and the corresponding communication device;
changing the format of data to be transmitted into signal;
initiating an interactive session between a first communication device and the interactive system, the initiating further comprising steps of:
receiving and identifying the modality for playing the data thus transmitted;
reconverting the signals into the data;
such that the converted data is transmitted over a preset communication path by the interactive system.
13. The method as claimed in claim 12, wherein the modality for transmitting the data further comprises a voice menu and a text menu.
14. The method as claimed in claim 12, wherein the parameters associated with the data and the corresponding communication device may include but are not limited to length of data (number of words played in a particular time, number of words transmitted in a text), location of a user, response time of a user, education of a user and usage statistics.
15. The method as claimed in claim 12, wherein the data is converted into DTMF (Dual-Tone-Multi-Frequency) signals.
16. The method as claimed in claim 12, wherein the method further comprises of transmitting a data into a language as desired by the user.
17. The method as claimed in claim 12, wherein the method further comprises of allowing a user to select the modality of the data to be transmitted.