System For Rapid Prototyping Of Speech Recognition Applications In Different Languages

Abstract: A system and method for porting existing speech recognition solutions in a source language to a target language are disclosed. The system envisaged by the present invention enables porting of a working speech recognition solution in the source language to a working system in the target language, thus shortening the development process and reusing existing speech recognition solution components to recognise multiple languages.

Patent Information

Application #
Filing Date
19 May 2009
Publication Number
48/2010
Publication Type
INA
Invention Field
ELECTRONICS
Status
Email
Parent Application
Patent Number
Legal Status
Grant Date
2019-03-06
Renewal Date

Applicants

TATA CONSULTANCY SERVICES LIMITED
NIRMAL BUILDING, 9TH FLOOR, NARIMAN POINT, MUMBAI-400021, MAHARASHTRA, INDIA.

Inventors

1. KOPPARAPU SUNIL KUMAR
TCS INNOVATION LABS YANTRA PARK, SDC 5, ODC G, OPP: VOLTAS HRD TRAINING CENTER, SUBHASH NAGAR, POKHRAN ROAD NO. 2, THANE (W)-400601, MAHARASHTRA, INDIA.
2. SHEIKH IMRAN AHMED
TCS INNOVATION LABS YANTRA PARK, SDC 5, ODC G, OPP: VOLTAS HRD TRAINING CENTER, SUBHASH NAGAR, POKHRAN ROAD NO. 2, THANE (W)-400601, MAHARASHTRA, INDIA.
3. PHARANDE AMOL SITARAM
TCS INNOVATION LABS YANTRA PARK, SDC 5, ODC G, OPP: VOLTAS HRD TRAINING CENTER, SUBHASH NAGAR, POKHRAN ROAD NO. 2, THANE (W)-400601, MAHARASHTRA, INDIA.

Specification

FORM-2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2006
PROVISIONAL SPECIFICATION
(See section 10 and rule 13)


SYSTEM FOR RAPID PROTOTYPING OF SPEECH RECOGNITION APPLICATIONS IN DIFFERENT LANGUAGES
TATA CONSULTANCY SERVICES LTD.,
an Indian company,
of Nirmal Building, 9th floor,
Nariman Point, Bombay - 400 021
The following specification particularly describes the nature of the invention.


FIELD OF THE INVENTION
The present invention relates to the field of speech recognition and translation.
BACKGROUND OF THE INVENTION AND PRIOR ART
Today, interactive technologies play a key role in improving customer service. Interactive technologies such as IVR (Interactive Voice Response) accept a spoken user input or request and provide pre-recorded or dynamically generated output in response.
Typically, IVR applications use speech recognition systems to recognize words or sequences of words spoken by a person and convert them to machine-readable form. These systems are usually built for one source language, for instance English, because of the wide acceptance of that language and the availability of information and resources in it. However, with the increasing acceptance of speech-based solutions in countries where the native language differs from the source language, there is a need to convert an existing speech-based application working in the source language to a target language.
Typically, speech recognition based applications require:
(i) a speech recognition (SR) engine with acoustic models for acoustic recognition;
(ii) a pronunciation lexicon of the words which have to be recognized; and
(iii) a language model.

These three components work in tandem to convert the speech to text. Converting a speech recognition based solution from a source language to a target language needs these three components to be 'ported' to the target language.
Although the acoustic models are tuned for a particular language, the source acoustic models can be used to recognize speech in another language with reasonable accuracy if the other two components, namely the lexicon and the language grammar, are addressed adequately in the target language.
Essentially, converting a speech recognition application from one language to another requires creating a new pronunciation lexicon for the target language, containing every word to be recognized by the speech application together with its phonemic pronunciation, and a language grammar for the target language.
Making these modifications to port a speech recognition application from a source language to a target language requires effort equivalent to building an entirely new application.
Therefore, there is a need for a system which will enable an existing application to be quickly ported and/or modified to work in multiple target languages by reusing the speech recognition engine of the existing application.

OBJECTS OF THE INVENTION
It is an object of the present invention to provide a system for enabling an existing application to be quickly ported to work in another target language.
It is another object of the present invention to provide a system for accurate source to target language transliterations and translations of a speech solution.
It is yet another object of the present invention to provide a system which will automatically generate source language phonemic pronunciations of target language words.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
The invention will now be described in relation to the accompanying
drawings, in which:
Figure 1 illustrates the overview of the proposed system and its interface
with an existing speech recognition application; and
Figure 2 illustrates the process of generating the target language lexicons.
DESCRIPTION OF THE INVENTION
Referring to the drawings, there is disclosed a system for enabling a speech recognition system in a target language by using a speech recognition system of the source language. The accompanying drawings do not limit the scope and ambit of the invention and are provided purely by way of example and illustration.

The present invention envisages a speech recognition system in a target language. Particularly, the system envisaged by the present invention enables porting of any existing speech recognition application in a source language to a target language, thus minimizing the development process and reusing existing speech recognition applications to recognize and translate multiple languages.
The preferred embodiment of the present invention describes the conversion of an English language speech recognition solution to an Indian language (Hindi) speech recognition solution.
Referring to the drawings, Figure 1 shows the overview of an existing SR application and its interface with the system envisaged by the present invention.
Existing speech recognition applications are built of one or more call flow units, represented by block 10 of Figure 1. Each call flow unit 10 comprises modules for performing the following functions:
(i) prompting the users to speak their requests, represented by block 12 of Figure 1;
(ii) recognizing the user's speech request, represented by block 14 of Figure 1;
(iii) processing the recognized text to answer the user, represented by block 16 of Figure 1; and
(iv) providing a response to the user, represented by block 18 of Figure 1.
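The four modules of a call flow unit can be pictured with a minimal Python sketch. This is an illustration only, not the applicant's implementation: the class name `CallFlowUnit`, the toy `LEXICON` and `ANSWERS` tables, and the use of a phoneme string as a stand-in for audio are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CallFlowUnit:
    """One call flow unit (block 10) made of four pluggable modules."""
    prompt: Callable[[], str]          # block 12: ask the user to speak
    recognize: Callable[[str], str]    # block 14: speech -> recognized text
    process: Callable[[str], str]      # block 16: recognized text -> answer
    respond: Callable[[str], str]      # block 18: answer -> response to user

    def run(self, utterance: str) -> List[str]:
        """Run one turn and return everything the system said."""
        said = [self.prompt()]
        text = self.recognize(utterance)
        answer = self.process(text)
        said.append(self.respond(answer))
        return said


# Hypothetical stand-ins: "speech" is a phoneme string rather than audio,
# and the answer table is invented purely for illustration.
LEXICON = {"g ow l d": "Gold"}
ANSWERS = {"Gold": "information for Gold"}

unit = CallFlowUnit(
    prompt=lambda: "Which item?",
    recognize=lambda speech: LEXICON.get(speech, "<unknown>"),
    process=lambda text: ANSWERS.get(text, "no data"),
    respond=lambda answer: f"Response: {answer}",
)

transcript = unit.run("g ow l d")
```

In this picture only the `recognize` module touches speech data; `process` sees text alone, which is why the text side of the application can stay untouched when the spoken language changes.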


There are two types of data that need processing: (a) the speech (acoustic) data and (b) the textual data. The speech data is used at the point of interaction with the user, while the textual data is used internally to process the information extracted from the speech data.
The invention achieves the objective of converting an existing speech application to a target language by the following steps:
1. keeping the textual data, and its processing by the existing application, unchanged;
2. using the same textual data representation (in source/English language) of the speech data, irrespective of the target language; and
3. modifying the source SR resources, i.e. the phoneme lexicon and grammar, to process the target language.
In accordance with the present invention, the call flow units 10, along with the application data represented by block 22 of Figure 1, remain unchanged in the target language. The modification performed by the present invention is represented by block 20 of Figure 1. The present invention modifies the phoneme lexicon and the grammar so that any existing SR (Speech Recognition) application can be ported to a target language efficiently.
In accordance with the present invention, the lexicon and grammar modifications are performed by the Lexicon Modification Engine (LME), represented by block 24 of Figure 1, and the Grammar Modification Engine, represented by block 26 of Figure 1.

Figure 2 shows the detailed procedure for generating the source language phoneme pronunciation lexicon for the target language words in the application.
Lexicon Modification Engine:
Lexicon modification is the main and often the only required modification to an existing SR application. In accordance with the present invention, the Lexicon Modification Engine (LME) 24 automatically creates the pronunciation lexicon, represented by block 28 of Figure 1, for the target language equivalents of the words to be recognized, which are present in the source language. LME 24 includes a source to target word dictionary, which is a database of words for translating each word in the source lexicon to the target grapheme. LME 24 receives data/words from Application Data 22 and gives the pronunciations to Phoneme Lexicon 28.
The LME 24 takes each word in the source lexicon, represented by step 102 of Figure 2, and determines its translation from the source language to the target language, in the target grapheme, using the word dictionary represented by block 100 of Figure 2. For example, gold is converted to सोना, represented by step 108 of Figure 2; a transliteration is then done to convert सोना to the source language form sona, represented by step 110 of Figure 2. The pronunciation is determined from sona as "s ow n aa" using a grapheme to phoneme conversion in the source language, represented by step 112 of Figure 2 and as seen in Table 1. Performing the grapheme to phoneme conversion in the source language for a word in the target language speeds up lexicon creation in the target language. The 'Phoneme Lexicon' 28 provides the pronunciations to the speech recognition module 14 of the call flow unit 10.

                              English               Hindi
Grammar phrase                <Gold>                <Gold>
Lexicon entry                 /g/ow/l/d/            /s/ow/n/aa/
Application asks for input    (grammar unchanged)   (grammar unchanged)
User speaks                   /gold/                /sonaa/
SR output and Process input   'Gold'                'Gold'
Table 1
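The rows of Table 1 can be read as a lookup in which only the lexicon entry differs between languages while the grammar token stays 'Gold'. The following is a minimal sketch under that reading; the dictionaries, the `recognize` helper and the `process_input` function are hypothetical names, and a real SR engine would score acoustic models rather than match phoneme tuples exactly.

```python
from typing import Dict, Optional, Tuple

Phonemes = Tuple[str, ...]

# Lexicon entries from Table 1: different pronunciations, same grammar token.
ENGLISH_LEXICON: Dict[Phonemes, str] = {("g", "ow", "l", "d"): "Gold"}
HINDI_LEXICON: Dict[Phonemes, str] = {("s", "ow", "n", "aa"): "Gold"}


def recognize(phonemes: Phonemes, lexicon: Dict[Phonemes, str]) -> Optional[str]:
    """Return the grammar token for a phoneme sequence, or None if unknown."""
    return lexicon.get(phonemes)


def process_input(token: str) -> str:
    """Stand-in for the unchanged application logic, which only sees text."""
    return f"processing {token}"


english_result = recognize(("g", "ow", "l", "d"), ENGLISH_LEXICON)
hindi_result = recognize(("s", "ow", "n", "aa"), HINDI_LEXICON)
```

Both calls yield the same token, so `process_input` receives identical text whichever language the user spoke.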
If the source language word has a translation in the source to target language dictionary 100, the target language word is fetched, as represented by the path referenced by numeral 106 in Figure 2. Otherwise, the path represented by reference numeral 104 in Figure 2 is taken, where the source language word is transliterated to the target language, on the assumption that it is spoken in the target language in the same way as in the source language (e.g. proper nouns).
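The lexicon generation flow of Figure 2 (dictionary lookup or transliteration fallback, transliteration back to the source script, then source language grapheme to phoneme conversion) can be sketched in Python as follows. Every table here is a toy assumption covering only the characters needed for the example: the word-level fallback table, the character maps and the letter-to-phoneme rule are hypothetical and far simpler than a real transliterator or grapheme to phoneme converter.

```python
from typing import Dict, List

# Block 100: source to target word dictionary (single toy entry, "gold").
WORD_DICTIONARY: Dict[str, str] = {"gold": "सोना"}

# Hypothetical word-level table for path 104 (words with no translation,
# e.g. proper nouns, are transliterated into the target script).
SOURCE_TO_TARGET_TRANSLIT: Dict[str, str] = {"tata": "टाटा"}

# Toy character maps; real systems need full, context-aware rules.
DEVANAGARI_TO_LATIN: Dict[str, str] = {
    "स": "s", "ो": "o", "न": "n", "ा": "a", "ट": "t",
}
LETTER_TO_PHONEME: Dict[str, str] = {
    "s": "s", "o": "ow", "n": "n", "a": "aa", "t": "t",
}


def to_target_grapheme(source_word: str) -> str:
    """Step 102: take a source word; use path 106 if a translation exists,
    otherwise path 104 (transliteration)."""
    word = source_word.lower()
    if word in WORD_DICTIONARY:
        return WORD_DICTIONARY[word]        # path 106 -> step 108
    return SOURCE_TO_TARGET_TRANSLIT[word]  # path 104


def transliterate_to_source(target_word: str) -> str:
    """Step 110: write the target language word in the source script."""
    return "".join(DEVANAGARI_TO_LATIN[ch] for ch in target_word)


def grapheme_to_phoneme(latin_word: str) -> List[str]:
    """Step 112: toy source language grapheme to phoneme conversion."""
    return [LETTER_TO_PHONEME[ch] for ch in latin_word]


def build_lexicon_entry(source_word: str) -> List[str]:
    """Source word -> source language phonemes of its target language form."""
    target = to_target_grapheme(source_word)
    return grapheme_to_phoneme(transliterate_to_source(target))


gold_entry = build_lexicon_entry("Gold")
```

With these toy tables, `gold_entry` is the phoneme list for "sona", matching the Hindi lexicon entry shown in Table 1. A word absent from both tables raises `KeyError` in this sketch; a fuller version would handle that case explicitly.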
In accordance with another aspect of the present invention, source to target to source transliteration is done because direct grapheme to phoneme conversion of the source word in the source language may not give the correct phoneme sequence for the pronunciation of the target language word.

Grammar Modification Engine:
Grammar modification is generally not required for an existing menu-driven speech application because the application expects only a word or a small sequence of words as the input from the user. These words are addressed by the source-target word dictionary to create the pronunciation lexicon 28.
Grammar modification (source-to-target) is required in cases where the speech application system is expected to handle free speech queries. The grammar creation for the target language is achieved by source to target language translation followed by a transliteration to source language.
The Grammar Modification Engine 26 performs these translations and transliterations and gives the output as grammar represented by block 30.
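The two-stage grammar porting described above (translate each grammar phrase to the target language, then transliterate it to the source script) can be sketched as below. This is a deliberately simplified illustration: the `Grammar` type, the `port_grammar` function and the one-entry lookup tables are assumptions, and a real free-speech grammar would need phrase-level translation that accounts for word order.

```python
from typing import Callable, Dict, List

Grammar = Dict[str, List[str]]  # rule name -> alternative phrases


def port_grammar(
    grammar: Grammar,
    translate: Callable[[str], str],
    transliterate: Callable[[str], str],
) -> Grammar:
    """Translate every phrase to the target language, then write it in the
    source script so the source language SR resources can process it."""
    return {
        rule: [transliterate(translate(phrase)) for phrase in phrases]
        for rule, phrases in grammar.items()
    }


# Toy lookup tables standing in for real translation and transliteration.
TRANSLATIONS = {"gold": "सोना"}
TRANSLITERATIONS = {"सोना": "sona"}

source_grammar: Grammar = {"commodity": ["gold"]}
target_grammar = port_grammar(
    source_grammar,
    translate=lambda phrase: TRANSLATIONS[phrase],
    transliterate=lambda phrase: TRANSLITERATIONS[phrase],
)
```

Passing the translator and transliterator as callables keeps the porting step independent of how either is implemented.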
If recorded prompts are used in the existing application, then a similar database of prompts in the target language is created and the application points to this database for prompts and responses to the user.
The technical advancements of the present invention include:
• providing a system for multilingual speech recognition;
• providing a system for enabling an existing application to be quickly ported to work in another language;
• providing a system for accurate source to target language transliterations and translations;
• providing a system which will generate source language phonemic pronunciations of target language words;
• providing a system which minimizes the effort of porting an existing source language application to a target language, which would otherwise be equivalent to designing a new application in the target language; and
• providing a system which reuses the original application and business logic.
While considerable emphasis has been placed herein on the particular features of this invention, it will be appreciated that various modifications can be made, and that many changes can be made in the preferred embodiments without departing from the principles of the invention. These and other modifications in the nature of the invention or the preferred embodiments will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.

MOHAN DEWAN Of R.K.DEWAN&CO. APPLICANT'S PATENT ATTORNEY

Documents

Orders

Section Controller Decision Date

Application Documents

# Name Date
1 1263-MUM-2009-FORM 5(03-05-2010).pdf 2010-05-03
1 1263-MUM-2009-RELEVANT DOCUMENTS [28-09-2023(online)].pdf 2023-09-28
2 1263-MUM-2009-FORM 2(TITLE PAGE)-(03-05-2010).pdf 2010-05-03
2 1263-MUM-2009-RELEVANT DOCUMENTS [26-09-2022(online)].pdf 2022-09-26
3 1263-MUM-2009-RELEVANT DOCUMENTS [30-09-2021(online)].pdf 2021-09-30
3 1263-mum-2009-form 2(03-05-2010).pdf 2010-05-03
4 1263-MUM-2009-RELEVANT DOCUMENTS [29-03-2020(online)].pdf 2020-03-29
4 1263-MUM-2009-DRAWING(03-05-2010).pdf 2010-05-03
5 1263-MUM-2009-IntimationOfGrant06-03-2019.pdf 2019-03-06
5 1263-MUM-2009-DESCRIPTION(COMPLETE)-(03-05-2010).pdf 2010-05-03
6 1263-MUM-2009-PatentCertificate06-03-2019.pdf 2019-03-06
6 1263-MUM-2009-CORRESPONDENCE(03-05-2010).pdf 2010-05-03
7 1263-MUM-2009-CORRESPONDENCE(23-6-2010).pdf 2018-08-10
7 1263-MUM-2009-CLAIMS(03-05-2010).pdf 2010-05-03
8 1263-MUM-2009-CORRESPONDENCE(5-2-2010).pdf 2018-08-10
8 1263-MUM-2009-ABSTRACT(03-05-2010).pdf 2010-05-03
9 1263-MUM-2009-CORRESPONDENCE(IPO)-(FER)-(5-1-2016).pdf 2018-08-10
9 1263-MUM-2009-FORM 18(30-11-2010).pdf 2010-11-30
10 1263-MUM-2009-CORRESPONDENCE(30-11-2010).pdf 2010-11-30
10 1263-MUM-2009-Correspondence-120216.pdf 2018-08-10
11 1263-mum-2009-correspondence.pdf 2018-08-10
11 OTHERS [23-03-2016(online)].pdf 2016-03-23
12 Examination Report Reply Recieved [23-03-2016(online)].pdf 2016-03-23
13 1263-mum-2009-description(provisional).pdf 2018-08-10
13 Description(Complete) [23-03-2016(online)].pdf 2016-03-23
14 1263-mum-2009-drawing.pdf 2018-08-10
14 Correspondence [23-03-2016(online)].pdf 2016-03-23
15 1263-MUM-2009-FORM 1(5-2-2010).pdf 2018-08-10
15 Claims [23-03-2016(online)].pdf 2016-03-23
16 1263-mum-2009-form 1.pdf 2018-08-10
16 Abstract [23-03-2016(online)].pdf 2016-03-23
17 1263-MUM-2009-Written submissions and relevant documents (MANDATORY) [06-11-2017(online)].pdf 2017-11-06
17 1263-mum-2009-form 2(title page).pdf 2018-08-10
18 1263-MUM-2009-PETITION UNDER RULE 137 [06-11-2017(online)]_8.pdf 2017-11-06
19 1263-mum-2009-form 2.pdf 2018-08-10
19 1263-MUM-2009-PETITION UNDER RULE 137 [06-11-2017(online)].pdf 2017-11-06
20 1263-mum-2009-form 26.pdf 2018-08-10
20 abstract1.jpg 2018-08-10
21 1263-MUM-2009-FORM 3(23-6-2010).pdf 2018-08-10
21 1263-MUM-2009_EXAMREPORT.pdf 2018-08-10
22 1263-MUM-2009-Form 3-120216.pdf 2018-08-10
22 1263-MUM-2009-HearingNoticeLetter.pdf 2018-08-10
23 1263-mum-2009-form 3.pdf 2018-08-10

ERegister / Renewals

3rd: 15 May 2019

From 19/05/2011 - To 19/05/2012

4th: 15 May 2019

From 19/05/2012 - To 19/05/2013

5th: 15 May 2019

From 19/05/2013 - To 19/05/2014

6th: 15 May 2019

From 19/05/2014 - To 19/05/2015

7th: 15 May 2019

From 19/05/2015 - To 19/05/2016

8th: 15 May 2019

From 19/05/2016 - To 19/05/2017

9th: 15 May 2019

From 19/05/2017 - To 19/05/2018

10th: 15 May 2019

From 19/05/2018 - To 19/05/2019

11th: 15 May 2019

From 19/05/2019 - To 19/05/2020

12th: 02 May 2020

From 19/05/2020 - To 19/05/2021

13th: 30 Apr 2021

From 19/05/2021 - To 19/05/2022

14th: 02 May 2022

From 19/05/2022 - To 19/05/2023

15th: 19 Apr 2023

From 19/05/2023 - To 19/05/2024

16th: 01 May 2024

From 19/05/2024 - To 19/05/2025

17th: 13 May 2025

From 19/05/2025 - To 19/05/2026