Abstract: A tool for automatic generation of a pronunciation lexicon, said tool comprising: basis storing means holding a set of words (the basis) such that no word in the basis can be constructed by joining other words in the basis; a database of proper nouns such that any entry in said database can be constructed by joining a plurality of words from the basis; optimization means adapted to keep the number of basis words used for constructing a name in said database to a minimum; and means for joining a plurality of basis words to form a phonetic transcription.
FORM -2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
PROVISIONAL
Specification
(See Section 10 and rule 13)
SPEECH RECOGNITION AND TEXT TO SPEECH CONVERSION
TATA CONSULTANCY SERVICES LTD.,
an Indian Company of Bombay House, 24, Sir Homi Mody Street, Mumbai 400 001
Maharashtra, India
Field of Innovation
This invention relates to the field of speech recognition and text to speech conversion.
This invention envisages a tool for the automatic, language-independent creation of a pronunciation lexicon for proper nouns. These proper nouns are typically names of persons and places.
Background
Text to Speech (TTS) synthesis is an automated encoding process which converts a sequence of symbols (text) conveying linguistic information into an acoustic waveform (speech). A TTS system transforms text into speech after generating a phonetic transcription from the text as an intermediate step. Automatic grapheme-to-phoneme (G2P) transcription based on certain rules is possible for words other than proper nouns, but generating phonetic transcriptions of proper nouns involves human effort.
In general, a one-to-one correspondence between the orthographic representation of a word and its pronunciation is absent, more so for proper nouns. In this situation, the default 'character to phone' mapping, which maps a sequence of characters into a sequence of phones, has to be modified according to some predefined rules. The pronunciation lexicon is created by identifying a set of rules that modify the 'default mapping' of the characters based on the 'context' in which a particular phoneme occurs; this rule set is language dependent.
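The idea of a default mapping overridden by context rules can be illustrated with a minimal sketch. The phone symbols, the table DEFAULT_PHONE, the rule table CONTEXT_RULES and the function to_phones are all hypothetical and chosen only for illustration; they are not taken from this specification, and the only context considered here is the following character.

```python
# Hypothetical default 'character to phone' table (illustrative symbols only).
DEFAULT_PHONE = {"c": "k", "a": "a", "t": "t", "e": "e", "n": "n", "i": "i"}

# Hypothetical context rules: (character, following character) -> phone.
CONTEXT_RULES = {("c", "e"): "s", ("c", "i"): "s"}


def to_phones(word):
    """Map each character to a phone, letting a context rule override the default."""
    phones = []
    for i, ch in enumerate(word):
        following = word[i + 1] if i + 1 < len(word) else ""
        default = DEFAULT_PHONE.get(ch, ch)  # unknown characters pass through
        phones.append(CONTEXT_RULES.get((ch, following), default))
    return phones


print(to_phones("cat"))   # default mapping applies to every character
print(to_phones("cent"))  # the rule for 'c' before 'e' overrides the default
```

A rule table of this kind is tied to one language, which is the limitation the paragraph above describes.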
A general-purpose rule set cannot handle proper names. As a result, a proper noun lexicon cannot be generated using this set of rules, and building one requires manual effort. Thus there is a need for a tool for creating a lexicon for proper names automatically.
Objects of the invention
It is an object of this invention to provide a tool for creating a pronunciation lexicon.
Another object of the invention is to provide a pronunciation lexicon for proper names.
Yet another object of this invention is to provide a tool for creating a pronunciation lexicon which is automatic.
A further object of this invention is to provide a tool for creating a pronunciation lexicon which is independent of language.
One more object of this invention is to provide a tool which reduces the mundane task of transcribing proper names manually.
Summary of the Invention:
In accordance with this invention there is provided a tool for automatic generation of a pronunciation lexicon, said tool comprising:
a basis having a set of words such that no word in the basis can be constructed by joining other words in the basis;
a database of proper nouns such that any entry in the said database can be constructed by joining a plurality of words from the basis;
optimization means to keep the number of basis words used for constructing a name in said database to a minimum; and
means for joining a plurality of basis words to form a phonetic transcription.
Brief Description of the Accompanying Drawings:
Fig 1 illustrates a high level view of the automatic pronunciation lexicon generator in accordance with this invention;
Fig 2 illustrates a graphical relation between the size of the basis |B| and the total number of joins |J| in accordance with this invention; and
Fig 3 illustrates the flow chart for the method in accordance with this invention.
Detailed Description of the Accompanying Drawings:
In accordance with this invention there is provided a tool for automatic generation of a pronunciation lexicon for proper names.
Figure 1 shows a high level view of the present invention. The automatic pronunciation lexicon generator takes the proper names database as input and creates the pronunciation lexicon. The construction of the automatic pronunciation lexicon generator is described as follows.
In vector algebra, among other things, a basis is defined as a set of linearly independent vectors that span the entire vector space. This definition of basis is used by analogy in the present invention. Here, the complete set of entries in the proper nouns database plays the role of the vector space, while the basis is a set of words such that one can construct any entry in the database by joining one or more words from the basis. Further, no word in the basis can be constructed by joining other words in the basis.
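The basis condition stated above can be checked mechanically. The following is a minimal sketch under the assumption that "joining" means plain string concatenation; the function names is_constructible and satisfies_basis_condition are hypothetical and are not the isBasis() and makeBasis() functions of Figure 3.

```python
def is_constructible(word, pieces):
    """Return True if word is a concatenation of one or more strings in pieces."""
    # reachable[k] is True when the prefix word[:k] can be built from pieces.
    reachable = [True] + [False] * len(word)
    for end in range(1, len(word) + 1):
        for start in range(end):
            if reachable[start] and word[start:end] in pieces:
                reachable[end] = True
                break
    return reachable[len(word)]


def satisfies_basis_condition(words):
    """Return True if no word can be built by joining the other words."""
    words = set(words)
    return not any(is_constructible(w, words - {w}) for w in words)


print(satisfies_basis_condition({"ram", "esh", "su"}))      # independent words
print(satisfies_basis_condition({"ram", "esh", "ramesh"}))  # "ramesh" = "ram" + "esh"
```

In the second call the set fails the condition because one word is the join of two others, which is the string analogue of a linearly dependent vector.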
The invention optimizes the number of words in the basis such that the number of basis words used to construct a name in the database is the least. The optimizing scheme is described in further detail below.
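As an illustration of keeping the number of basis words per name to a minimum, the sketch below finds a shortest segmentation of a name by dynamic programming and then joins the transcriptions of the pieces. It is a sketch under stated assumptions and not the optimizing scheme of this specification: the basis words, the phone strings in BASIS_LEXICON and the function names min_segmentation and transcribe are all hypothetical.

```python
# Hypothetical basis words with hypothetical phone strings (illustration only).
BASIS_LEXICON = {
    "su": "s u",
    "resh": "r e sh",
    "sur": "s u r",
    "e": "e",
    "sh": "sh",
}


def min_segmentation(name, basis):
    """Return a shortest list of basis words whose join equals name, or None."""
    # best[k] holds a shortest segmentation of the prefix name[:k], if any.
    best = [None] * (len(name) + 1)
    best[0] = []
    for end in range(1, len(name) + 1):
        for start in range(end):
            piece = name[start:end]
            if best[start] is not None and piece in basis:
                candidate = best[start] + [piece]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[len(name)]


def transcribe(name, lexicon):
    """Join the transcriptions of the basis words in a shortest segmentation."""
    pieces = min_segmentation(name, lexicon.keys())
    if pieces is None:
        return None  # the name cannot be built from this basis
    return " ".join(lexicon[p] for p in pieces)


print(min_segmentation("suresh", BASIS_LEXICON.keys()))
print(transcribe("suresh", BASIS_LEXICON))
```

With this toy basis the name "suresh" can be built either as "su" + "resh" or as "sur" + "e" + "sh"; the search keeps the two-word construction.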
sequence, are appended to the basis (Bm in Figure 3) if the new word to be added satisfies the basis condition (checked by the two functions isBasis() and makeBasis() in Figure 3). This process is repeated for all the proper nouns in the database.
This construction process is repeated until no significant growth (e) in the basis is observed between two successive iterations.
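The stopping rule can be sketched as a loop that sweeps the database repeatedly and halts once the basis size changes by no more than a threshold between two successive sweeps. The per-sweep step below (toy_pass) is only a stand-in, since the candidate-generation steps are not reproduced in the text above; it is not the isBasis()/makeBasis() logic of Figure 3. All names, the eps default and the max_iters guard are assumptions.

```python
def is_constructible(word, pieces):
    """Return True if word is a concatenation of one or more strings in pieces."""
    reachable = [True] + [False] * len(word)
    for end in range(1, len(word) + 1):
        for start in range(end):
            if reachable[start] and word[start:end] in pieces:
                reachable[end] = True
                break
    return reachable[len(word)]


def toy_pass(basis, names):
    """One sweep over the database (stand-in for the procedure not shown here)."""
    basis = set(basis)
    for name in names:
        # Assumption: an uncovered name is simply added whole.
        if not is_constructible(name, basis):
            basis.add(name)
    # Restore the basis condition: drop words buildable from the others,
    # checking longer words first.
    for word in sorted(basis, key=len, reverse=True):
        if is_constructible(word, basis - {word}):
            basis.discard(word)
    return basis


def build_basis(names, eps=0, max_iters=100):
    """Repeat sweeps until the basis grows by at most eps between two sweeps."""
    basis, sizes = set(), []
    for _ in range(max_iters):
        basis = toy_pass(basis, names)
        sizes.append(len(basis))
        if len(sizes) >= 2 and abs(sizes[-1] - sizes[-2]) <= eps:
            break
    return basis, sizes


basis, sizes = build_basis(["ramesh", "ram", "esh", "suresh", "su"])
print(sorted(basis), sizes)
```

In this toy run the first sweep adds every name and then drops "ramesh" because it is the join of "ram" and "esh"; the second sweep adds nothing, so the loop stops.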
Advantages:
• An efficient, fast and automatic procedure for creating a pronunciation lexicon for proper nouns.
• Removes the manual effort of transcribing proper names.
• The process helps a text to speech synthesis system to pronounce proper nouns reliably and correctly.
• The transcription problem reduces to one of minimizing the constructed cost function.
• The invention is not restricted to the proper nouns of any particular language.
• The proposed procedure is applicable to any database of proper nouns.
While considerable emphasis has been placed herein on the specific structure of the preferred embodiment, it will be appreciated that many alterations can be made and that many modifications can be made in the preferred embodiment without departing from the principles of the invention. These and other changes in the preferred embodiment as well as other embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the invention and not as a limitation.
Dated this 21st day of August 2008.
| # | Name | Date |
|---|---|---|
| 1 | 1766-MUM-2008-FORM 1(18-09-2008).pdf | 2008-09-18 |
| 1 | 1766-MUM-2008-RELEVANT DOCUMENTS [28-09-2023(online)].pdf | 2023-09-28 |
| 2 | 1766-MUM-2008-RELEVANT DOCUMENTS [26-09-2022(online)].pdf | 2022-09-26 |
| 2 | 1766-MUM-2008-CORRESPONDENCE(18-09-2008).pdf | 2008-09-18 |
| 3 | OTHERS [18-05-2016(online)].pdf | 2016-05-18 |
| 3 | 1766-MUM-2008-RELEVANT DOCUMENTS [30-09-2021(online)].pdf | 2021-09-30 |
| 4 | Examination Report Reply Recieved [18-05-2016(online)].pdf | 2016-05-18 |
| 4 | 1766-MUM-2008-IntimationOfGrant27-02-2020.pdf | 2020-02-27 |
| 5 | Description(Complete) [18-05-2016(online)].pdf | 2016-05-18 |
| 5 | 1766-MUM-2008-PatentCertificate27-02-2020.pdf | 2020-02-27 |
| 6 | Correspondence [18-05-2016(online)].pdf | 2016-05-18 |
| 6 | 1766-MUM-2008-Written submissions and relevant documents [06-02-2020(online)].pdf | 2020-02-06 |
| 7 | Claims [18-05-2016(online)].pdf | 2016-05-18 |
| 7 | 1766-MUM-2008-ORIGINAL UR 6(1A) FORM 26-290120.pdf | 2020-01-30 |
| 8 | Abstract [18-05-2016(online)].pdf | 2016-05-18 |
| 8 | 1766-MUM-2008-FORM-26 [24-01-2020(online)].pdf | 2020-01-24 |
| 9 | abstract1.jpg | 2018-08-09 |
| 9 | 1766-MUM-2008-HearingNoticeLetter-(DateOfHearing-27-01-2020).pdf | 2019-12-31 |
| 10 | 1766-mum-2008-abstract(18-8-2009).pdf | 2018-08-09 |
| 10 | 1766-MUM-2008_EXAMREPORT.pdf | 2018-08-09 |
| 11 | 1766-mum-2008-power of attorney.pdf | 2018-08-09 |
| 11 | 1766-mum-2008-claims(18-8-2009).pdf | 2018-08-09 |
| 12 | 1766-mum-2008-correspondence(18-8-2009).pdf | 2018-08-09 |
| 12 | 1766-MUM-2008-GENERAL POWER OF ATTORNEY-160516.pdf | 2018-08-09 |
| 13 | 1766-MUM-2008-CORRESPONDENCE(4-11-2010).pdf | 2018-08-09 |
| 13 | 1766-mum-2008-form 5(18-8-2009).pdf | 2018-08-09 |
| 14 | 1766-MUM-2008-CORRESPONDENCE-160516.pdf | 2018-08-09 |
| 14 | 1766-mum-2008-form 3.pdf | 2018-08-09 |
| 15 | 1766-mum-2008-correspondence.pdf | 2018-08-09 |
| 15 | 1766-mum-2008-form 2.pdf | 2018-08-09 |
| 16 | 1766-mum-2008-description(complete)-(18-8-2009).pdf | 2018-08-09 |
| 17 | 1766-mum-2008-form 2(title page).pdf | 2018-08-09 |
| 18 | 1766-mum-2008-description(provisional).pdf | 2018-08-09 |
| 18 | 1766-mum-2008-form 2(title page)-(18-8-2009).pdf | 2018-08-09 |
| 19 | 1766-mum-2008-drawing(18-8-2009).pdf | 2018-08-09 |
| 19 | 1766-mum-2008-form 2(18-8-2009).pdf | 2018-08-09 |
| 20 | 1766-mum-2008-drawing.pdf | 2018-08-09 |
| 20 | 1766-MUM-2008-FORM 18(4-11-2010).pdf | 2018-08-09 |
| 21 | 1766-mum-2008-form 1.pdf | 2018-08-09 |
| 21 | 1766-mum-2008-form 13(18-8-2009).pdf | 2018-08-09 |