Abstract: This disclosure relates generally to text-to-speech synthesis and more particularly to a system and method for rendering textual messages using a customized natural voice. In one embodiment, a system for rendering textual messages using a customized natural voice is disclosed, comprising a processor and a memory communicatively coupled to the processor. The memory stores processor instructions which, on execution, cause the processor to receive present textual messages and at least one of previous textual messages, a response to the previous textual messages, or a receiver’s context. The processor further predicts a final emotional state of the sender’s customized natural voice based on an intermediate emotional state and the receiver’s context. The processor further synthesizes the sender’s customized natural voice based on the predicted final emotional state, and on voice samples and voice parameters associated with the at least one sender. FIG. 2
Claims: WE CLAIM
1. A method of rendering one or more present textual messages using customized natural voice of at least one sender, the method comprising:
receiving, by a customized voice synthesizer, the one or more present textual messages from the at least one sender and at least one of: one or more previous textual messages from the at least one sender, a response to the one or more previous textual messages, or a receiver’s context;
predicting, by the customized voice synthesizer, a final emotional state of the sender’s customized natural voice based on an intermediate emotional state and the receiver’s context, wherein the intermediate emotional state is based on emotions associated with at least one of the one or more present textual messages, the one or more previous textual messages, the response to the one or more previous textual messages, or the receiver of the one or more present textual messages; and
synthesizing, by the customized voice synthesizer, the sender’s customized natural voice based on the predicted final emotional state of the sender’s customized natural voice, and on voice samples and voice parameters associated with the at least one sender.
2. The method as claimed in claim 1, further comprising generating a voice dataset, wherein the voice dataset comprises at least one of the voice parameters and the voice samples associated with the at least one sender.
3. The method as claimed in claim 2, wherein the voice samples are determined from at least one of voice calls, videos, audios, social sites, public domains or previously built databases.
4. The method as claimed in claim 1, wherein the voice parameters comprise at least one of pitch, rate, quality of voice, amplitude, style of speaking, tone, user’s pronunciation, prosody or pauses taken between sentences.
5. The method as claimed in claim 1, wherein the receiver’s context comprises at least one of receiver’s location, receiver’s state, receiver’s health condition or receiver’s preferences.
6. The method as claimed in claim 1, further comprising summarizing the one or more present textual messages based on at least one of content of the one or more present textual messages and the receiver’s preferences.
7. The method as claimed in claim 1, wherein the final emotional state of the sender’s customized natural voice is predicted using deep learning techniques.
8. The method as claimed in claim 1, wherein predicting the final emotional state of the sender’s customized natural voice comprises:
determining, by a deep neural network, an intermediate emotional vector that is associated with the intermediate emotional state;
assigning, by the deep neural network, weightages to the intermediate emotional vector and the receiver’s context based on at least one of: a time lapse between receiving the one or more previous textual messages and receiving the one or more present textual messages, a time lapse between receiving the response to the one or more previous textual messages and receiving the one or more present textual messages, an overall emotion associated with the one or more present textual messages, and the receiver associated with the one or more present textual messages; and
predicting, by the deep neural network, a final emotional vector based on the intermediate emotional vector and the weightages, wherein the final emotional vector is associated with the final emotional state of the sender’s customized natural voice.
9. The method as claimed in claim 1, wherein the customized natural voice is synthesized using deep learning techniques.
10. A system for rendering one or more present textual messages using customized natural voice of at least one sender, the system comprising:
a processor;
a memory communicatively coupled to the processor, wherein the memory stores processor-executable instructions which, on execution, cause the processor to:
receive the one or more present textual messages from the at least one sender and at least one of: one or more previous textual messages from the at least one sender, a response to the one or more previous textual messages, or a receiver’s context;
predict a final emotional state of the sender’s customized natural voice based on an intermediate emotional state and the receiver’s context, wherein the intermediate emotional state is based on emotions associated with at least one of the one or more present textual messages, the one or more previous textual messages, the response to the one or more previous textual messages, or the receiver of the one or more present textual messages; and
synthesize the sender’s customized natural voice based on the predicted final emotional state of the sender’s customized natural voice, and on voice samples and voice parameters associated with the at least one sender.
11. The system as claimed in claim 10, wherein the processor is further configured to generate a voice dataset, wherein the voice dataset comprises at least one of the voice parameters and the voice samples associated with the at least one sender.
12. The system as claimed in claim 11, wherein the voice samples are determined from at least one of voice calls, videos, audios, social sites, public domains or previously built databases.
13. The system as claimed in claim 10, wherein the voice parameters comprise at least one of pitch, rate, quality of voice, amplitude, style of speaking, tone, user’s pronunciation, prosody or pauses taken between sentences.
14. The system as claimed in claim 10, wherein the receiver’s context comprises at least one of receiver’s location, receiver’s state, receiver’s health condition or receiver’s preferences.
15. The system as claimed in claim 10, wherein the processor is further configured to summarize the one or more present textual messages based on at least one of content of the one or more present textual messages or the receiver’s preferences.
16. The system as claimed in claim 10, wherein the final emotional state of the sender’s customized natural voice is predicted using deep learning techniques.
17. The system as claimed in claim 10, wherein the processor is configured to predict the final emotional state of the sender’s customized natural voice by:
determining an intermediate emotional vector that is associated with the intermediate emotional state;
assigning weightages to the intermediate emotional vector and the receiver’s context based on at least one of: a time lapse between receiving the one or more previous textual messages and receiving the one or more present textual messages, a time lapse between receiving the response to the one or more previous textual messages and receiving the one or more present textual messages, an overall emotion associated with the one or more present textual messages, and the receiver associated with the one or more present textual messages; and
predicting a final emotional vector based on the intermediate emotional vector and the weightages, wherein the final emotional vector is associated with the final emotional state of the sender’s customized natural voice.
18. The system as claimed in claim 10, wherein the customized natural voice is synthesized using deep learning techniques.
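The weighted emotional-vector prediction recited in claims 8 and 17 can be sketched in plain Python. This is a minimal illustrative sketch, not the claimed deep-neural-network implementation: the emotion label set, the per-emotion scores, the exponential half-life weighting of the time lapse, and all function names are assumptions introduced for illustration.

```python
import math

# Hypothetical emotion label set; the claims do not enumerate one.
EMOTIONS = ["neutral", "happy", "sad", "angry"]

def intermediate_vector(message_scores, history_scores):
    """Combine per-emotion scores of the present message and of the
    previous messages/responses into an intermediate emotional vector
    (stand-in for the deep-neural-network step of claim 8)."""
    return [(m + h) / 2.0 for m, h in zip(message_scores, history_scores)]

def time_lapse_weight(seconds_elapsed, half_life=3600.0):
    """Assign a weightage that decays with the time lapse between the
    previous and present messages: older context counts for less.
    The exponential half-life form is an illustrative assumption."""
    return math.exp(-seconds_elapsed * math.log(2) / half_life)

def predict_final_vector(inter_vec, context_vec, seconds_elapsed):
    """Blend the intermediate emotional vector with a receiver-context
    vector using the time-lapse weightage, then normalise so the final
    emotional vector sums to 1."""
    w = time_lapse_weight(seconds_elapsed)
    final = [w * i + (1.0 - w) * c for i, c in zip(inter_vec, context_vec)]
    total = sum(final)
    return [f / total for f in final]
```

For example, a message scored as mostly "happy" against a neutral receiver context keeps its happy bias when the messages are close in time, but drifts toward the context vector as the time lapse grows, which mirrors the weightage assignment of claims 8 and 17.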
Dated this 31st day of March, 2017
Swetha SN
Of K&S Partners
Agent for the Applicant
Description: TECHNICAL FIELD
This disclosure relates generally to text-to-speech synthesis and more particularly to a system and method for rendering textual messages using a customized natural voice.
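The overall flow the disclosure describes — receive a textual message, predict a final emotional state, then synthesize speech from the sender's voice dataset — can be sketched as follows. Every name, field, and default here is an illustrative assumption; the sketch returns a rendering plan rather than audio, since the actual deep-learning synthesiser is outside its scope.

```python
from dataclasses import dataclass, field

@dataclass
class VoiceDataset:
    """Hypothetical per-sender voice parameters and samples
    (cf. the voice dataset of claims 2-4)."""
    pitch: float = 1.0       # relative pitch multiplier
    rate: float = 1.0        # speaking-rate multiplier
    amplitude: float = 1.0   # loudness multiplier
    samples: list = field(default_factory=list)  # recorded voice samples

def synthesize(text: str, dataset: VoiceDataset, final_emotion: dict) -> dict:
    """Stand-in for the deep-learning synthesiser: combine the text,
    the dominant predicted emotion, and the sender's voice parameters
    into a rendering plan."""
    dominant = max(final_emotion, key=final_emotion.get)
    return {"text": text, "emotion": dominant,
            "pitch": dataset.pitch, "rate": dataset.rate,
            "amplitude": dataset.amplitude}
```

A downstream vocoder would then consume such a plan together with the stored voice samples to produce the customized natural voice.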
| # | Name | Date |
|---|---|---|
| 1 | Power of Attorney [31-03-2017(online)].pdf | 2017-03-31 |
| 2 | Form 5 [31-03-2017(online)].pdf | 2017-03-31 |
| 3 | Form 3 [31-03-2017(online)].pdf | 2017-03-31 |
| 4 | Form 18 [31-03-2017(online)].pdf_67.pdf | 2017-03-31 |
| 5 | Form 18 [31-03-2017(online)].pdf | 2017-03-31 |
| 6 | Form 1 [31-03-2017(online)].pdf | 2017-03-31 |
| 7 | Drawing [31-03-2017(online)].pdf | 2017-03-31 |
| 8 | Description(Complete) [31-03-2017(online)].pdf_68.pdf | 2017-03-31 |
| 9 | Description(Complete) [31-03-2017(online)].pdf | 2017-03-31 |
| 10 | PROOF OF RIGHT [21-06-2017(online)].pdf | 2017-06-21 |
| 11 | Correspondence by Agent_Form 1_23-06-2017.pdf | 2017-06-23 |
| 12 | Abstract_201741011632.jpg | 2017-06-30 |
| 13 | 201741011632-REQUEST FOR CERTIFIED COPY [07-06-2018(online)].pdf | 2018-06-07 |
| 14 | 201741011632-Response to office action (Mandatory) [11-06-2018(online)].pdf | 2018-06-11 |
| 15 | 201741011632-PETITION UNDER RULE 137 [06-04-2021(online)].pdf | 2021-04-06 |
| 16 | 201741011632-Information under section 8(2) [06-04-2021(online)].pdf | 2021-04-06 |
| 17 | 201741011632-FORM 3 [06-04-2021(online)].pdf | 2021-04-06 |
| 18 | 201741011632-FER_SER_REPLY [06-04-2021(online)].pdf | 2021-04-06 |
| 19 | 201741011632-FER.pdf | 2021-10-17 |
| 20 | 201741011632-PatentCertificate06-06-2023.pdf | 2023-06-06 |
| 21 | 201741011632-IntimationOfGrant06-06-2023.pdf | 2023-06-06 |
| 22 | 201741011632-PROOF OF ALTERATION [11-09-2023(online)].pdf | 2023-09-11 |
| 1 | searchstrategyE_05-10-2020.pdf | |