Abstract: The present invention relates to a technique for generating an animated video message. The technique includes receiving a recorded video message from a user via a user device and segregating audio and video components from the recorded message. The technique includes processing the audio component to extract phonemes associated with the recorded message and determining user facial motion data based on the extracted phonemes. Thereafter, the technique includes processing the video component to extract at least one of facial information and emotion information of the user and generating a graphical representation of the user based on the extracted user facial information and the emotion information. Moreover, the technique includes integrating the user facial motion data with the graphical representation of the user to generate an animated representation of the user and integrating the audio component with the animated representation of the user to generate the animated video message.
[0001] The present disclosure generally relates to message communications. More
particularly but not exclusively, the present disclosure relates to generating an
animated video message based on a user-recorded video.
BACKGROUND
[0002] Messaging services/applications allow users to communicate without being
physically present at the same location. The messaging services allow users to
communicate via a number of communication mechanisms, such as telephony, email,
multimedia messaging, and instant messaging. One or more of these communication
mechanisms allow a user to record and send video messages to another user.
[0003] However, such conventional messaging platforms/applications fail to provide
any modification and/or customization of a recorded video message and instead send
the message as originally recorded. Therefore, such platforms/applications fail to
provide an interactive user experience. Accordingly, there exists a need in the art for
a solution which overcomes the above-mentioned problems.
SUMMARY
[0004] Exemplary aspects are directed to a system and method for generating an
animated video message. The system may provide animated audio and video effects to
a recorded video message, thereby enhancing the user's experience and interest while
communicating via video messages.
[0005] In an exemplary aspect, the present disclosure describes a method for
generating an animated video message. The method includes receiving a recorded
video message from a user via a user device. Further, the method includes
segregating audio and video components from the recorded video message. The
method also includes processing the audio component to extract one or more
phonemes associated with the recorded video message. Moreover, the method
includes determining user facial motion data based on the extracted one or more
phonemes and processing the video component to extract at least one of facial
information and emotion information of the user. Further, the method includes
generating a graphical representation of the user based on the extracted user facial
information and the emotion information and integrating the user facial motion data
with the graphical representation of the user to generate an animated representation
of the user. The method then includes integrating the audio component with the
animated representation of the user to generate the animated video message.
[0006] According to an aspect, the present disclosure relates to a method of generating
an animated video message, wherein the user facial motion data comprises at least
mouth movements of the user.
[0007] According to another aspect, the present disclosure relates to a method of
generating an animated video message, wherein the user facial information comprises
positional information and motion information associated with one or more facial
components, wherein the one or more facial components comprise eyes, nose, ears,
lips, eyebrows, and head.
[0008] According to yet another aspect, the present disclosure relates to a method of
generating an animated video message, wherein integrating the audio component with
the animated representation of the user comprises synchronizing the audio component
of the user recorded message with the animated graphical representation of the user
based on time stamps stored over a blockchain network.
[0009] In an exemplary aspect, the present disclosure describes a device for
generating an animated video message. The device includes a memory and a
processor coupled to the memory. The processor is configured to receive a recorded
video message of a user via a user device. The processor is also configured to segregate
audio and video components from the recorded video message. The processor is
further configured to process the audio component to extract one or more
phonemes associated with the recorded video message. Further, the processor is
configured to determine user facial motion data based on the extracted one or more
phonemes. The processor is also configured to process the video component to extract
at least one of facial information and emotion information of the user and generate a
graphical representation of the user based on the extracted user facial information and
the emotion information. Moreover, the processor is configured to integrate the user
facial motion data with the graphical representation of the user to generate an
animated representation of the user and integrate the audio component with the
animated representation of the user to generate the animated video message.
[0010] According to an aspect, the present disclosure relates to a device for
generating an animated video message, wherein the user facial motion data comprises
at least mouth movements of the user.
[0011] According to another aspect, the present disclosure relates to a device for
generating an animated video message, wherein the user facial information comprises
positional information and motion information associated with one or more facial
components, wherein the one or more facial components comprise eyes, nose, ears,
lips, eyebrows, and head.
[0012] According to yet another aspect, the present disclosure relates to a device for
generating an animated video message, wherein integrating the audio component with
the animated representation of the user comprises synchronizing the audio component
of the user recorded message with the animated graphical representation of the user
based on time stamps stored over a blockchain network.
[0013] The foregoing summary is illustrative only and is not intended to be in any
way limiting. In addition to the illustrative aspects, embodiments, and features
described above, further aspects, embodiments, and features will become apparent by
reference to the drawings and the following detailed description.
OBJECTIVES OF THE INVENTION
[0014] An object of the present invention is to provide a system and method for
generating an animated video message based on a recorded video message.
[0015] Another object of the present invention is to provide a system and method to
enhance the user experience while communicating via video messaging.
[0016] Yet another object of the present invention is to provide a system and method
for interactive video messaging.
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] The accompanying drawings, which are incorporated in and constitute a part
of this disclosure, illustrate exemplary embodiments and, together with the
description, serve to explain the disclosed embodiments. In the figures, the left-most
digit(s) of a reference number identifies the figure in which the reference number first
appears. The same numbers are used throughout the figures to reference like features
and components. Some embodiments of system and/or methods in accordance with
embodiments of the present subject matter are now described, by way of example
only, and with reference to the accompanying figures, in which:
[0018] Fig. 1 illustrates an exemplary diagram of a system for implementing message
communication using an animated video message, in accordance with an embodiment
of the present disclosure.
[0019] Fig. 2 illustrates a block diagram of a user device, in accordance with an
embodiment of the present disclosure.
[0020] Fig. 3 is a flow diagram illustrating the process of generating an animated
video message, in accordance with an embodiment of the present disclosure.
[0021] It should be appreciated by those skilled in the art that any block diagrams
herein represent conceptual views of illustrative systems embodying the principles of
the present subject matter. Similarly, it will be appreciated that any flow charts, flow
diagrams, state transition diagrams, pseudo code, and the like represent various
processes which may be substantially represented in computer readable medium and
executed by a computer or processor, whether or not such computer or processor is
explicitly shown.
DETAILED DESCRIPTION
[0022] In the present document, the word “exemplary” is used herein to mean
“serving as an example, instance, or illustration.” Any embodiment or
implementation of the present subject matter described herein as “exemplary” is not
necessarily to be construed as preferred or advantageous over other embodiments.
[0023] While the disclosure is susceptible to various modifications and alternative
forms, specific embodiments thereof have been shown by way of example in the
drawings and will be described in detail below. It should be understood, however, that
it is not intended to limit the disclosure to the particular forms disclosed, but on the
contrary, the disclosure is to cover all modifications, equivalents, and alternatives
falling within the scope of the disclosure.
[0024] The terms “comprises”, “comprising”, “include(s)”, or any other variations
thereof, are intended to cover a non-exclusive inclusion, such that a setup, system or
method that comprises a list of components or steps does not include only those
components or steps but may include other components or steps not expressly listed
or inherent to such setup or system or method. In other words, one or more elements
in a system or apparatus preceded by “comprises… a” does not, without more
constraints, preclude the existence of other elements or additional elements in the
system or apparatus.
[0025] The phrase “recorded video message”, “video recorded message” and/or “user
recorded message” may be used interchangeably throughout the description.
[0026] In the following detailed description of the embodiments of the disclosure,
reference is made to the accompanying drawings that form a part hereof, and in which
are shown by way of illustration specific embodiments in which the disclosure may
be practiced. These embodiments are described in sufficient detail to enable those
skilled in the art to practice the disclosure, and it is to be understood that other
embodiments may be utilized and that changes may be made without departing from
the scope of the present disclosure. The following description is, therefore, not to be
taken in a limiting sense.
[0027] The present invention will be described herein below with reference to the
accompanying drawings. In the following description, well known functions or
constructions are not described in detail since they would obscure the description with
unnecessary detail.
[0028] The present invention relates to a technique for generating an animated video
message based on a recorded video message. The technique involves a server that is
accessible by user devices over a network. The server may be configured to provide
a messaging platform to enable transmission of the generated video message between
users. The technique includes receiving a recorded video message from a user via a
user device and segregating audio and video components from the recorded message.
The technique includes processing the audio component to extract phonemes
associated with the recorded message and determining user facial motion data based
on the extracted phonemes. Thereafter, the technique includes processing the video
component to extract at least one of facial information and emotion information of
the user and generating a graphical representation of the user based on the extracted
user facial information and the emotion information. Moreover, the technique
includes integrating the user facial motion data with the graphical representation of
the user to generate an animated representation of the user and integrating the audio
component with the animated representation of the user to generate the animated
video message. Accordingly, the technique provides an interactive and effective video
messaging communication.
[0029] Fig. 1 illustrates an exemplary environment/system 100 for implementing
message communication via animated video message. The system 100 includes a
plurality of user devices 102a-102n (interchangeably referred to as “the user device
102”), a server 106, and a network 104 connecting the user device 102 and the server
106.
[0030] The user devices 102a-102n may be communicably coupled to each other via
the network 104. The user device 102 may enable a user to communicate with other
users via any suitable communication means such as, but not limited to, calling,
messaging, and so forth. According to an exemplary embodiment, the communication
may be in the form of an exchange of one or more of, but not limited to, text,
audio, video, emojis, stickers, animations, images, audio-visual media, etc.
Examples of the user device 102 may include any suitable communication device
such as, but not limited to, smartphone, mobile phone, laptop, tablet, portable
communication device and so forth. In an exemplary embodiment, the user device
102 may include a memory and a processor, communicably coupled to each other and
configured to perform the desired functionality of the user device 102. In alternative
embodiments, the user device 102 may include any additional component required to
perform the desired functionality of the user device 102, in accordance with the
embodiments of the present disclosure.
[0031] The server 106 may be configured to enable a messaging platform resident on
the user device 102 to allow communication among the user devices 102. The
messaging platform may implement one or more user interfaces to generate an
animated video message to be transmitted by a user to another user. The server 106
may include a memory unit (not shown) and a processing unit (not shown) configured
to implement the desired functionality of the server 106. Additionally, the server 106
may include any suitable component required to perform the desired functionality of
the server 106, in accordance with the embodiments of the present disclosure. The server
106 may remain operatively connected to the one or more user devices 102a-102n to
receive and forward the communication from and to the user devices 102. In some
embodiments, the server 106 may be configured to process a message received from
a source user device before transmitting it to a destination user device.
[0032] The network 104 may include a data network such as, but not restricted to, the
Internet, Local Area Network (LAN), Wide Area Network (WAN), Metropolitan Area
Network (MAN), etc. In certain embodiments, the network 104 may include a
wireless network, such as, but not restricted to, a cellular network and may employ
various technologies including Enhanced Data rates for Global Evolution (EDGE),
General Packet Radio Service (GPRS), Global System for Mobile Communications
(GSM), Internet protocol Multimedia Subsystem (IMS), Universal Mobile
Telecommunications System (UMTS) etc. In other embodiments, the network 104
may include or otherwise cover networks or subnetworks, each of which may include,
for example, a wired or wireless data pathway.
[0033] Fig. 2 illustrates a block diagram of the user device 102 (hereinafter referred
as “the device 102”), in accordance with an embodiment of the present disclosure. The
device 102 includes a transceiver 202, an I/O interface 204, a camera 206, a
microphone 208, a memory 210, a processor 212, and one or more units. Each of said
components of the device 102 may be communicably coupled to each other.
[0034] The transceiver 202 may be configured to enable communication between the
device 102 and the server 106 (shown in Fig. 1). Further, the transceiver 202 may also
be configured to enable communication between the device 102 and the other user
devices 102. The transceiver 202 may be configured to enable transmission or
reception of data from and at the device 102. In some embodiments, the transceiver
202 may include communication devices such as, but not limited to, antennas,
modulators, demodulators and so forth. In an exemplary embodiment, the transceiver
202 may be configured to receive data from the server 106 to implement the messaging
platform at the device 102.
[0035] The I/O interface 204 may enable a user to interact with the device 102. The
I/O interface 204 may include, but is not limited to, a mouse, a pointer, a keyboard, a
touch screen, a display, a graphical user interface and/or any other combination of
input and output devices of a computing system. In some embodiments, the user may
provide user input via the I/O interface 204. The I/O interface 204 may be configured
to present one or more user interfaces required to implement the functionality of the
messaging platform provided by the server 106.
[0036] The camera 206 may enable a user to record a video message which the user
intends to send to another user. The camera 206 may be positioned at any suitable
portion of the user device 102. For instance, the camera 206 may be positioned either
on a front side or a back side of the device 102. In some embodiments, the device 102
may include multiple cameras 206, placed suitably as per the requirement. The
camera 206 may be any suitable mobile camera such as, but not limited to,
standard/main camera, ultra-wide camera, telephoto or periscope zoom camera,
macro camera, monochrome camera, and depth sensor or 3D ToF sensor. The user of
the device 102 may access the camera 206 to record a video message.
[0037] The device 102 may also include the microphone 208 configured to enable the
user to record an audio. In an exemplary embodiment, the microphone 208 may be
operatively coupled to the camera 206 and be activated simultaneously with the camera
206. Therefore, the microphone 208 may capture the audio while the camera 206 is
capturing a video. In an exemplary embodiment, the camera 206 and microphone 208
may act as an integrated unit which is configured to produce a video recorded message
including audio and video content.
[0038] The device 102 may include the memory 210. In an embodiment, the memory
210 may be configured to store data relating to the messaging platform. In an
exemplary embodiment, the memory 210 may store the recorded video message. The
memory 210 may also include one or more graphical components required to generate
a graphical image. The graphical images may correspond to at least one of, but not
limited to, a sticker, an emoji, an emoticon and so forth. The graphical components
may include at least one of a background image, a graphical representation of a user
and textual data. In addition, the memory 210 may include, but is not restricted to, a
Random Access Memory (RAM) unit and/or a non-volatile memory unit such as a
Read Only Memory (ROM), optical disc drive, magnetic disc drive, flash memory,
Electrically Erasable Read Only Memory (EEPROM), and so forth.
[0039] The device 102 may include the processor 212 which may be communicably
coupled to the memory 210 to perform one or more desired functionality. The
processor 212 may be configured to receive the recorded video message. The
processor 212 may be configured to segregate audio and video components from the
recorded video message. The processor 212 may then process the audio component
to extract audio information included in the recorded video message. Examples of the
audio information include, but are not restricted to, pitch, amplitude, frequency,
phoneme information, and/or other language-related information. In an exemplary
embodiment, the processor 212 may process the audio component to extract one or
more phonemes associated with the recorded video message. Thereafter, the processor
212 may determine user facial motion data based on the extracted one or more
phonemes. The user facial motion data includes at least mouth movements of the user.
In some embodiments, the user facial motion data may correspond to positional
information of various muscles and facial parts, such as the jaw, lips, and tongue,
during speech.
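By way of illustration only, the segregation and phoneme steps described above may be sketched as follows in Python. The sketch assumes the ffmpeg command-line tool is installed; extract_phonemes() is a hypothetical stand-in for whatever phoneme recognizer or forced aligner an implementation chooses, since the disclosure does not name one, and the viseme table is likewise an illustrative assumption.

```python
import subprocess

def segregate(recorded_message: str) -> tuple[str, str]:
    """Split a recorded video message into its audio and video components."""
    audio, video = "audio.wav", "video_only.mp4"
    # -vn drops the video stream, keeping only the audio track as 16-bit PCM.
    subprocess.run(["ffmpeg", "-y", "-i", recorded_message, "-vn",
                    "-acodec", "pcm_s16le", audio], check=True)
    # -an drops the audio stream, keeping only the video track.
    subprocess.run(["ffmpeg", "-y", "-i", recorded_message, "-an",
                    "-c:v", "copy", video], check=True)
    return audio, video

def extract_phonemes(audio_path: str) -> list[tuple[str, float, float]]:
    """Hypothetical recognizer: returns (phoneme, start_s, end_s) tuples."""
    raise NotImplementedError("plug in a phoneme recognizer / forced aligner")

# Illustrative phoneme-to-mouth-pose table; a real system would use a
# fuller viseme inventory.
VISEME_MAP = {"AY": "open_wide", "L": "tongue_up", "AH": "open_mid",
              "V": "lip_bite", "Y": "spread", "UW": "rounded"}

def facial_motion_data(phonemes: list[tuple[str, float, float]]):
    """Map timed phonemes to timed mouth poses, keeping the time stamps
    so the animation can later be synchronized with the audio."""
    return [(VISEME_MAP.get(p, "neutral"), start, end)
            for p, start, end in phonemes]
```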
[0040] The processor 212 may also be configured to process the video component of
the video recorded message to extract at least one of facial information and emotion
information of the user. The user facial information comprises positional information
and motion information associated with one or more facial components, wherein the
one or more facial components comprise eyes, nose, ears, lips, eyebrows, and head.
The processor 212 then generates a graphical representation of the user based on the
extracted user facial information and emotion information. The graphical
representation of the user comprises at least one of stickers, emojis, and so forth. In
some embodiments, the processor 212 may utilize the one or more graphical
components stored in the memory 210 to generate the graphical representation of the
user. Embodiments cover, or are intended to cover, any suitable means required by
the processor 212 to generate the graphical representation of the user. The processor
212 may integrate the user facial motion data with the graphical representation of the
user to generate an animated representation of the user. The animated representation
of the user may be 2-Dimensional or 3-Dimensional in nature. The animated
representation may be a sequence of graphical representations of the user with different
facial characteristics to represent the user facial motion, as determined from the video
recorded message. The processor 212 may then integrate the audio component with
the animated representation of the user to generate the animated video message. In
some embodiments, the processor 212 may identify time stamp information associated
with audio and video component of the video recorded message. The time stamp
information may be stored over a blockchain network. The processor 212 then
synchronizes the audio component of the user recorded message with the animated
graphical representation of the user based on time stamps stored over the blockchain
network.
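A minimal sketch of this integration step is given below, under the assumption that each mouth pose maps to a pre-rendered frame of the user's graphical representation (the frames/<pose>.png asset layout is hypothetical) and that ffmpeg is available for rendering and muxing. Synchronization is shown via the shared time stamps only; where those stamps are persisted (e.g., over a blockchain network, as described above) is outside the sketch.

```python
import subprocess

FPS = 25  # frames per second of the generated animation

def animated_representation(motion_data, total_s, out_list="frames.txt"):
    """Write an ffmpeg concat list with one emoji frame per video frame,
    chosen by which (pose, start_s, end_s) interval the frame falls in."""
    with open(out_list, "w") as f:
        for i in range(int(total_s * FPS)):
            t = i / FPS
            pose = next((p for p, s, e in motion_data if s <= t < e), "neutral")
            f.write(f"file 'frames/{pose}.png'\nduration {1 / FPS}\n")
    return out_list

def animated_video_message(frame_list, audio_path, out="animated.mp4"):
    """Render the frame sequence and mux the original audio component
    back in; the shared time base keeps mouth poses and speech in sync."""
    subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0",
                    "-i", frame_list, "-i", audio_path,
                    "-c:v", "libx264", "-pix_fmt", "yuv420p",
                    "-c:a", "aac", "-shortest", out], check=True)
    return out
```

Driving the renderer from the phoneme time stamps, rather than re-analyzing the audio, makes the final step a simple muxing of the unmodified audio component with the frame sequence.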
[0041] In some embodiments, the processor 212 may use one or more neural networks
(not shown) and/or any suitable technique such as machine learning, artificial
intelligence and so forth to perform one or more desired functionalities.
[0042] The device 102 may also include one or more units configured to perform one
or more operations of the processor 212. In an embodiment, the processor 212 may
be operatively coupled to the units. The operations and/or functions of the processor
212 and the units may be performed interchangeably and/or in combination with each
other. In an exemplary embodiment, the device 102 may include a segregation unit
214, an audio component processing unit 216, a facial motion determination unit 218,
a video component processing unit 220, a graphical representation generation unit
222 and an integration unit 224. The segregation unit 214 may be configured to
segregate audio and video components from the recorded video message. The audio
component processing unit 216 may be configured to process the audio component
to extract at least one or more phonemes associated with the recorded video message.
The facial motion determination unit 218 may be configured to determine user facial
motion data based on the extracted one or more phonemes. The video component
processing unit 220 may be configured to process the video component to
extract at least one of facial information and emotion information of the user. The
graphical representation generation unit 222 may be configured to generate graphical
representation of the user based on the extracted user facial information and the
emotion information. The integration unit 224 may be configured to integrate the user
facial motion data with the graphical representation of the user to generate an
animated representation of the user. Further, the integration unit 224 may be
configured to integrate the audio component with the animated representation of
the user to generate the animated video message.
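A structural sketch of how these units might be composed is shown below; each unit is a stub whose run() method stands in for the processing described above, and the class and function names are illustrative, not taken from the disclosure.

```python
class SegregationUnit:                        # cf. unit 214
    def run(self, message):
        """Return (audio, video) components of the recorded message."""
        raise NotImplementedError

class AudioComponentProcessingUnit:           # cf. unit 216
    def run(self, audio):
        """Return the one or more phonemes extracted from the audio."""
        raise NotImplementedError

class FacialMotionDeterminationUnit:          # cf. unit 218
    def run(self, phonemes):
        """Return user facial motion data derived from the phonemes."""
        raise NotImplementedError

class VideoComponentProcessingUnit:           # cf. unit 220
    def run(self, video):
        """Return (facial_info, emotion_info) extracted from the video."""
        raise NotImplementedError

class GraphicalRepresentationGenerationUnit:  # cf. unit 222
    def run(self, facial_info, emotion_info):
        """Return a graphical representation (e.g., emoji) of the user."""
        raise NotImplementedError

class IntegrationUnit:                        # cf. unit 224
    def run(self, motion, graphic, audio):
        """Animate the graphic with the motion data, then add the audio."""
        raise NotImplementedError

def generate_animated_video_message(message):
    """Wire the units together in the order described above."""
    audio, video = SegregationUnit().run(message)
    phonemes = AudioComponentProcessingUnit().run(audio)
    motion = FacialMotionDeterminationUnit().run(phonemes)
    facial_info, emotion_info = VideoComponentProcessingUnit().run(video)
    graphic = GraphicalRepresentationGenerationUnit().run(facial_info,
                                                          emotion_info)
    return IntegrationUnit().run(motion, graphic, audio)
```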
[0043] Each of the segregation unit 214, the audio component processing unit 216,
the facial motion determination unit 218, the video component processing unit 220,
the graphical representation generation unit 222 and the integration unit 224 may be
implemented by any suitable combination of various hardware and software
components.
[0044] In an alternative embodiment, one or more of the functionalities of the
processor 212 and/or the device 102 may be performed by the server 106 (shown in
Fig. 1). The server 106 may include any suitable components required to perform the
above-mentioned functionalities. In some embodiments, the generated animated
video message may be directly transmitted from a source user device 102 to a
destination user device 102. In an alternative embodiment, the animated video
message may be transmitted from a source user device 102 to a destination user
device 102 via the server 106, wherein the server 106 may be configured to either
directly transmit the generated animated video message or process the recorded video
message to generate the animated video message to be transmitted to the destination
user device 102.
[0045] Fig. 3 discloses a flowchart of an exemplary method 300 for generating an
animated video message based on a recorded video message. This flowchart is
provided for illustration purposes, and embodiments are intended to include or
otherwise cover any methods or procedures for generating the animated video
message. Fig. 3 is described in reference to Figs. 1-2.
[0046] At step 302, the method comprises receiving a recorded video message from
a user. The recorded video message may include audio and video components
representing information which a user of a source user device wants to communicate
to another user of a destination user device.
[0047] At step 304, the method includes segregating audio and video components
from the recorded video message. Further, at step 306, the method comprises
processing the audio component to extract one or more phonemes associated
with the recorded video message. Thereafter, at step 308, the method comprises
determining user facial motion data based on the extracted one or more phonemes.
The user facial motion data comprises at least mouth movements of the user.
[0048] At step 310, the method comprises processing the video component to extract
at least one of facial information and emotion information of the user. Further, at step
312, the method comprises generating a graphical representation of the user based on
the extracted user facial information and the emotion information. The user facial
information comprises positional information and motion information associated with
one or more facial components, wherein the one or more facial components comprise
eyes, nose, ears, lips, eyebrows, and head.
[0049] At step 314, the method comprises integrating the user facial motion data with
the graphical representation of the user to generate an animated representation of the
user. Thereafter at step 316, the method comprises integrating the audio component
with the animated representation of the user to generate the animated video message.
In some embodiments, the method comprises synchronizing the audio component of
the recorded video message with the animated graphical representation of the user
based on time stamps stored over a blockchain network.
[0050] The illustrated steps are set out to explain the exemplary embodiments shown,
and it should be anticipated that ongoing technological development will change the
manner in which particular functions are performed. These examples are presented
herein for purposes of illustration, and not limitation. Further, the boundaries of the
functional building blocks have been arbitrarily defined herein for the convenience
of the description. Alternative boundaries can be defined so long as the specified
functions and relationships thereof are appropriately performed.
[0051] The above-explained embodiments may be better understood from the
following example:
A user wishes to say “I LOVE YOU” to someone. Accordingly, the user
records a video message saying the phrase “I LOVE YOU” with the desired
expression. However, such a conventional recorded video message may not be able to
make the user's communication interactive. Therefore, the processor 212 may take the
recorded video message as an input and process it in the manner disclosed above to
generate an animated video message. Specifically, the processor 212 may identify the
phonemes associated with phrase “I LOVE YOU” and determine the corresponding
facial motion data. For example, the processor 212 may identify “opening of a mouth
for pronouncing ‘I’”, “lowering the chin while opening the mouth for pronouncing
‘LOVE’”, and “a tightening of lips for pronouncing ‘YOU’”. The processor 212 may also
extract facial information and emotional information of the user from the video
component of the video recorded message. The processor 212 may then generate a
graphical representation of the user based on the extracted facial and emotional
information. For example, the processor 212 may generate an emoji of the user with
an expression of “happiness”, which may be determined based on the video content.
Now the processor 212 may integrate the user facial motion data with the graphical
representation of the user to generate an animated graphical representation, e.g., an
animated emoji. Further, the processor 212 may integrate the audio component with
the animated representation of the user to generate the animated video message, i.e.,
an animated emoji of the user saying “I LOVE YOU”.
In some embodiments, the technique as discussed above may also reduce the overall
size of the video message. Further, the method and system as disclosed enhance the
user experience while sharing video messages and make the communication
interesting and interactive.
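To make the example concrete, the sketch below pairs an ARPAbet-style phoneme transcription of “I LOVE YOU” (a reasonable transcription assumed here, not taken from the disclosure) with the mouth motions the paragraph describes.

```python
# ARPAbet-style phonemes for "I LOVE YOU" (an assumed transcription).
PHRASE_PHONEMES = ["AY", "L", "AH", "V", "Y", "UW"]

# Mouth motions corresponding to the paragraph's description.
MOUTH_MOTION = {
    "AY": "open the mouth wide",             # pronouncing "I"
    "L":  "raise the tongue to the ridge",   # onset of "LOVE"
    "AH": "lower the chin, mouth open",      # vowel of "LOVE"
    "V":  "touch upper teeth to lower lip",  # end of "LOVE"
    "Y":  "spread the lips slightly",        # onset of "YOU"
    "UW": "tighten and round the lips",      # pronouncing "YOU"
}

for phoneme in PHRASE_PHONEMES:
    print(f"{phoneme:>2}: {MOUTH_MOTION[phoneme]}")
```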
[0052] In this manner, the system and method described in the present disclosure may
generate an animated video message based on a recorded video message. The system
and method enable efficient and accurate display of the emotions and expressions
of a user in an interactive and interesting way.
[0053] Alternatives (including equivalents, extensions, variations, deviations, etc., of
those described herein) will be apparent to persons skilled in the relevant art(s) based
on the teachings contained herein. Such alternatives fall within the scope and spirit of
the disclosed embodiments.
[0054] Furthermore, one or more computer-readable storage media may be utilized in
implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information
or data readable by a processor may be stored. Thus, a computer-readable storage
medium may store instructions for execution by one or more processors, including
instructions for causing the processor(s) to perform steps or stages consistent with the
embodiments described herein. The term “computer-readable medium” should be
understood to include tangible items and exclude carrier waves and transient signals,
i.e., are non-transitory. Examples include random access memory (RAM), read-only
memory (ROM), volatile memory, non-volatile memory, hard drives, CD ROMs,
DVDs, flash drives, disks, and any other known physical storage media.
[0055] Suitable processors include, by way of example, a general-purpose processor,
a special purpose processor, a conventional processor, a digital signal processor
(DSP), a plurality of microprocessors, one or more microprocessors in association
with a DSP core, a controller, a microcontroller, Application Specific Integrated
Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) circuits, any other type
of integrated circuit (IC), and/or a state machine.
[0056] Although the present invention has been described in considerable detail with
reference to figures and certain preferred embodiments thereof, other versions are
possible. Therefore, the scope of the present invention should not be limited to the
description of the preferred versions contained herein.
Referral Numerals:
Reference Number    Description
100                 ENVIRONMENT
102a-102n           USER DEVICES
104                 NETWORK
106                 SERVER
202                 TRANSCEIVER
204                 I/O INTERFACE
206                 CAMERA
208                 MICROPHONE
210                 MEMORY
212                 PROCESSOR
214                 SEGREGATION UNIT
216                 AUDIO COMPONENT PROCESSING UNIT
218                 FACIAL MOTION DETERMINATION UNIT
220                 VIDEO COMPONENT PROCESSING UNIT
222                 GRAPHICAL REPRESENTATION GENERATION UNIT
224                 INTEGRATION UNIT
300                 METHOD
302-316             METHOD STEPS
We Claim:
1. A method for generating an animated video message, comprising:
receiving a recorded video message from a user, via a user device;
segregating audio and video components from the recorded video message;
processing the audio component to extract one or more phonemes associated
with the recorded video message;
determining user facial motion data based on the extracted one or more
phonemes;
processing the video component to extract at least one of facial information
and emotion information of the user;
generating a graphical representation of the user based on the extracted user
facial information and the emotion information;
integrating the user facial motion data with the graphical representation of the
user to generate an animated representation of the user; and
integrating the audio component with the animated representation of the user
to generate the animated video message.
2. The method as claimed in claim 1, wherein the user facial motion data comprises
at least mouth movements of the user.
3. The method as claimed in claim 1, wherein the user facial information comprises
positional information and motion information associated with one or more facial
components, wherein the one or more facial components comprise eyes, nose, ears,
lips, eyebrows, and head.
4. The method as claimed in claim 1, wherein integrating the audio component
with the animated representation of the user comprises:
synchronizing the audio component of the user recorded message with the
animated graphical representation of the user based on time stamps stored over a
blockchain network.
5. A device for generating an animated video message, comprising:
a memory; and
a processor coupled to the memory, the processor configured to:
receive a recorded video message of a user, via a user device;
segregate audio and video components from the recorded video
message;
process the audio component to extract one or more phonemes
associated with the recorded video message;
determine user facial motion data based on the extracted one or more
phonemes;
process the video component to extract at least one of facial
information and emotion information of the user;
generate a graphical representation of the user based on the extracted
user facial information and the emotion information;
integrate the user facial motion data with the graphical representation
of the user to generate an animated representation of the user; and
integrate the audio component with the animated representation of
the user to generate the animated video message.
6. The device as claimed in claim 5, wherein the user facial motion data comprises
at least mouth movements of the user.
7. The device as claimed in claim 5, wherein the user facial information comprises
positional information and motion information associated with one or more facial
components, wherein the one or more facial components comprise eyes, nose, ears,
lips, eyebrows, and head.
8. The device as claimed in claim 5, wherein integrating the audio component
with the animated representation of the user comprises:
synchronizing the audio component of the user recorded message with the
animated graphical representation of the user based on time stamps stored over a
blockchain network.