A Method For Indexing Recording And Playback Of Pre Recorded Multiplex Data And The System Thereof



Abstract

The invention provides a method for streaming of prerecorded data across a network. The method includes receiving at a server, at least one data component from at least one participant over a network for preprocessing; retrieving independently temporal information of the data received; multiplexing the preprocessed data by synchronizing with the temporal information to obtain an indexed multiplex data; and streaming the indexed multiplex data to a plurality of users. A system for streaming of prerecorded data across a network is also provided.

Information

Application ID 2321/MUM/2009
Invention Field ELECTRONICS
Date of Application 2009-10-06
Publication Number 05/2012

Applicants

Name Address Country Nationality
GREAT SOFTWARE LABORATORY PVT LTD. VISHWAKALYAN, S.NO. 149/3, OFF. ITI ROAD, PUNE - 411 007. MAHARASHTRA - INDIA. India India

Inventors

Name Address Country Nationality
CHETAN J. VAITY VISHWAKALYAN, S.NO.149/3, OFF. ITI ROAD, PUNE - 411 007. MAHARASHTRA -INDIA. India India
PRATIK PRADHAN VISHWAKALYAN, S. NO. 149/3, OFF. ITI ROAD, PUNE - 411 007, MAHARASHTRA - INDIA. India India
PUSHYAMITRA NAVARE VISHWAKALYAN, S.NO. 149/3, OFF. ITI ROAD, PUNE - 411 007, MAHARASHTRA - INDIA. India India

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
& THE PATENTS RULES, 2003
COMPLETE SPECIFICATION
[See section 10]
A METHOD FOR INDEXING RECORDING AND PLAYBACK OF PRE-RECORDED MULTIPLEX DATA AND A SYSTEM THEREOF;
GREAT SOFTWARE PRIVATE LIMITED, A COMPANY INCORPORATED UNDER THE COMPANIES ACT 1956, WHOSE ADDRESS IS VISHWAKALYAN, S.No. 149/3, Off.ITI ROAD, PUNE-411 007, MAHARASHTRA, INDIA.

THE FOLLOWING SPECIFICATION PARTICULARLY DESCRIBES THE NATURE OF THIS INVENTION AND THE MANNER IN WHICH IT IS TO BE PERFORMED.


FIELD OF THE INVENTION
The present invention generally relates to virtual conferencing systems in their various forms, such as virtual online meetings, virtual classrooms and virtual presentations. More particularly, embodiments of the invention relate to a system and a method for indexing, recording and playback of pre-recorded multiplex data.
PRIOR ART
Classroom education is effective when the target audience has a common place of assembly. Distance education, or distance learning, is an alternate mode of education wherein the target audience is dispersed across distinct locations. Normally, distance education occurs through the dispatch of study materials by a preferred mode of postal service, such as book post, registered post or courier. The dispatch of study materials is often augmented by the conduct of contact classes wherein, at a predetermined schedule, the target audience is assembled at a predesignated location for the delivery of lectures.
However, attendance during such interactions is always low due to logistics. Hence, virtual conferences are a preferred mode of learning for distance learning or virtual meetings. Virtual conferencing is a fast emerging mode for interaction and collaboration of participants in real-time, irrespective of their geographical location. Participants can connect to a virtual conferencing server through individual computers or use a common computer to collaborate in real-time with the other participants in the conference. Participants use various media like text messages, audio, video, slides and shared documents to discuss and illustrate their points in the conference.
The recordings of such virtual conferences are a useful resource and can be used for educational purposes so as to reach a larger audience. For instance, a participant who could not attend the virtual conference when it took place could access the recording of the conference and view it to catch up with the rest of the class. The content owners could also make the recording available over the internet as a learning resource in its own right. The recording of a virtual conference is also useful because it serves as an excellent record of what happened during the session. In other words, it can serve a "minutes of meeting" purpose.
Traditionally, the approach to recording a virtual conference session is to capture the proceedings on the screen as a video. This very simple and generic method can be employed regardless of the kind of media used in the conference. It has the advantage that the video file thus produced is extremely portable and can be played by generic video players.
However, the approach has the following shortcomings:
a. The resulting recording is of a larger size than required, as it is essentially a motion capture of the screen for the duration of the virtual conference.
For example, referring to FIG. 8, Area A in the figure is the screen area where the presenter's video is shown during the virtual conference. Area B is the screen area where text messages for chat are displayed. Area C represents the screen area where presentation slides appear. In a screen capture recording, Areas B and C will also be captured as part of the recording video, although they do not change as often as the presenter's video in Area A. This results in a large recording and, hence, a greater bandwidth expense for streaming the conference recording to its viewers.
b. The resulting recording is a "flat" video with no machine-understandable information about the media components involved.
The screen capture video of a virtual conference does not have any structure. For example, a video of the conference in FIG. 8 has no information about the fact that Area B is an area to display text messages. Indeed, it does not have any knowledge of the various media present in the virtual conference. The screen video just captures all the activity happening on the screen without any intelligence about the media being manipulated.
c. The resulting recording has no machine-understandable information about the contents of the conference.
It is very obvious to a human eye that the slide in FIG. 8 is about "Temples of India", but it is very difficult to derive this from the screen capture video recording of the conference. A lot of such content information present in the various media components of the conference, like presentation slides, chat, questions etc., is lost in the "opaque" video recording.


Without any machine-understandable structure in the recording, the recording has to be annotated manually with tags to provide indexing and search. For example, say the presenter talked about the "Temples of India" using the slide shown in FIG. 8 at the 20th minute of the conference. To allow the viewer to seek directly to this topic, a tag with the text "Temples of India" needs to be added to the video at a time stamp of 20 minutes. Such manual tagging is a time-consuming and tedious process and, in general, it is not scalable.
Hence, there is a need for a method that optimizes the recording of the events without compromising on the quality of the media captured. Further, the recorded files, when played back, should be capable of being searched through to retrieve the desired content. In addition, the transmission of the pre-recorded data should be real-time and independent of the accessible bandwidth at the receiving end.
SUMMARY OF THE INVENTION
One aspect of the invention includes a method for streaming of prerecorded data across a network. The method of streaming includes receiving, at a server, data from a participant over a network for preprocessing. The method also includes the step of independently retrieving temporal information of the data received. The preprocessed data is synchronized with the temporal information to obtain an indexed multiplex data. Further, the indexed multiplex data is streamed to a plurality of users.
Another aspect of the invention includes a system for streaming of prerecorded data across a network. The system includes a media server configured for receiving data from a participant, retrievably storing the data and transmitting the same. A recording creation server is configured to process the received data and create a time-synchronized indexed multiplex data. Further, a streaming server is provided for seeking and playing the indexed multiplexed data to a plurality of participants.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a virtual conferencing session according to an embodiment of the invention.
FIG.2 illustrates a schematic workflow for multiplex recording generation according to an embodiment of the invention.


FIG.3 illustrates the streaming of the multiplexed recording files stored in the storage server according to an embodiment of the invention.
FIG.4 shows a representative multiplexed media in a container format according to an embodiment of the invention.
FIG.5 shows the structure of the index according to an example of the invention.
FIG.6 shows an exploded view of the index file and the location of the corresponding entries in a representative multiplexed recording file according to an embodiment of the invention.
FIG.7 shows a representative tagged media container file format according to an embodiment of the invention.
FIG.8 shows motion capture of a screen according to a method known in the prior art.
DETAILED DESCRIPTION OF INVENTION
Various embodiments of the invention provide a method for indexing recording and playback of pre-recorded multiplex data. Further, the invention also provides a system for indexing recording and playback of pre-recorded multiplex data. In any interactive virtual conference, various modes of interactions are normally employed. For the purpose of description of this embodiment, the various modes of interaction shall be collectively referred to hereinafter as a media component. Examples of the media components commonly used in a virtual conference include audio, video, text messaging, also referred to as chat, presentation slides, software whiteboard, desktop sharing or application sharing and media file sharing.
Some of the media components are "mixed" while others are individual streams from each participant. For instance, the audio media component consists of the combined voices of all participants in the virtual conference. On the other hand, the video media component cannot be "mixed" and the individual video media streams retain their identity. Further, there are also media components that have "history" associated with them. For instance, the instantaneous chat messages will not make sense unless the previous messages are also displayed.


FIG. 1 illustrates a virtual conferencing session according to an embodiment of the invention. The virtual conferencing system 100 comprises a virtual conference client application (not shown) which runs on the computer of each of a plurality of participants 109. A controlling server 103 is configured to handle call setup and manage business logic from a plurality of inputs 106. At least one media server 101 is configured to receive at least one media component from at least one participant and transmit the same. A storage server 105 stores user content and a plurality of media component recordings. The controlling server 103 and the media server 101 provide an integrated output that includes temporal information of media recordings 107.
A recording creation server is configured to process the media component recordings and create a multiplexed recording. A recording streaming server streams the multiplexed recordings to the participants. A recording player client application is provided to run on the computer of each participant.
FIG. 2 illustrates a schematic workflow for multiplex recording generation according to an embodiment of the invention. The storage server 105 transmits the media component recordings to a multiplex recording creation server 201. The recording creation server 201 comprises a metadata extraction unit 203 and a preprocessing unit 205. The metadata extraction unit 203 is responsible for analysis of the media component recordings and gathering of textual information. The preprocessing unit 205 is responsible for conversion of the media component recordings into formats suitable for incorporation into the multiplexed recording. The textual data is extracted by the metadata extraction unit 203, and the multiplexing unit 207 achieves the integration of the audio data and the video data. Examples of this textual information are the text on the presentation slides and the names and descriptions of media files used during the virtual conference.
The multiplex recording generation unit 201 comprises a metadata extraction unit 203 and a preprocessing unit 205. A multiplexing unit 207 is configured to receive information simultaneously from the preprocessing unit 205, the metadata extraction unit 203 and the temporal information of media recordings 107. The multiplexing unit provides an output 209, which includes a multiplexed recording of the various media components with an index attached. The multiplexed recording files 209 are then stored in a storage server for streaming at a later instance.
FIG.3 illustrates the streaming of the multiplexed recording files stored in the storage server according to an embodiment of the invention. In an example of the invention, multiplexed recording files from a plurality of distinct conferences are stored in the storage server 301. The multiplexed recording files are then streamed to a plurality of participants 303 through a recording streaming server 302 configured to transmit the recording files.
FIG. 4 shows a representative multiplexed media in a container format 400, according to an embodiment of the invention. The container format 400 stores the multiplexed recording files in an interleaved manner, and each container format 400 is characterized by a unique index 401. The index is stored in the initial portion of the multiplexed recording file. The container format 400 also includes a plurality of media components along with time tags subsequent to the position of the index 401. Each time tag has a start value 402 and an end value 404. In an example of the invention, a media component 403 is stored within a time frame range of tAs and tAe, a media component 405 is stored within a time frame range of tBs and tBe, and a media component 407 is stored within a time frame range of tCs and tCe.
FIG. 5 shows the structure of the index 500 according to an example of the invention. The index comprises a table of events, which includes information about the type of content 503 transmitted and the corresponding time stamps 501 at which the transmission of a specific type of media component occurred. The table entries also have metadata details 505 about the event.
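The event table just described can be sketched as a simple data structure. The following is an illustrative Python sketch, not the patent's actual format; the field names, units and example entries are assumptions for clarity.

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    timestamp_ms: int     # time stamp (501): when the event occurred
    content_type: str     # type of content (503), e.g. "Slide Transition"
    metadata: str         # metadata details (505) describing the event

# The index is then a time-ordered table of such entries.
index = [
    IndexEntry(1_200_000, "Slide Transition", "Temples of India"),
    IndexEntry(1_260_000, "Chat Message", "Amravati Stupa"),
]
```

Each entry ties a point in time to a machine-readable description of what happened, which is what later enables search and seeking in the recording.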
FIG. 6 shows an exploded view of the index file 500 and the location of the corresponding entries 603, 605 and 607 in a representative multiplexed recording file 600. The index 500, stored at a location 601, includes a plurality of time-stamped entries. Each entry in the table of the index file has a distinct location in the multiplexed recording file. The multiplexed recording file 600 also has unique tags included to distinguish between the various media components.
FIG. 7 shows a representative tagged media container file format according to an embodiment of the invention. Each tag comprises a tag header 701 followed by the tag data 703. Further, the tag header includes a plurality of attributes 705. In an example of the invention, the attributes in a tag header would include information regarding the time stamp, media type and media codec.
Each of the figures shall now be referred to in detail to explain the method of indexing, recording and playback of the pre-recorded media.
In an example of the invention, user content comprises attributes of the user like name, description, photograph, etc. This user content is added to the system and stored in the storage server when the users are provisioned in the system. This aspect is mentioned here because the process of recording generation may use some attributes of the users participating in the virtual conference. The presenter has a presentation which has 10 slides. In the virtual conference, whenever he shows Slide 1, the time (TS1) is noted by the controlling server 103 in 107. When the presenter shows Slide 2, the time (TS2) is noted, and so on. All the slides are also stored on the storage server 105. As shown in FIG. 2, the text mentioned on the slides is extracted 203 and correlated with the times from 107. An index entry is made with timestamp TS1, content type as "Slide Transition" and data as the text extracted from Slide 1. In this manner, index entries for all slide transitions are created and added to the index tag 401 of the multiplexed recording file as shown in FIG. 4.
All the media component recordings are converted in the preprocessing unit 205 such that the data contained in them is time stamped. For some media, like video, this may already be the case and no preprocessing may be necessary. For media like slides shown during the conference, the temporal information 107 is used for this purpose. Next, the multiplexing unit 207 initializes a time-counter to zero and checks all media component recordings, one at a time, for any data at zero time. If a media component recording has some data at that timestamp, it is read and added as a tag to the multiplexed recording file as shown in FIG. 4. The timestamp is written in the tag header as shown in FIG. 7. It is possible that two or more media component recordings have some data at the same timestamp. In this case, multiple tags with the same timestamp are created in the multiplexed recording file. Once this process is done for all media component recordings, the time-counter is incremented and the same process is repeated. At the end of this process, the multiplexed file contains all the media components in an interleaved fashion as shown in FIG. 4.
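The time-counter loop above can be sketched in Python. This is a minimal illustration under the assumption that each preprocessed component recording is a mapping from timestamps to raw data; for brevity it visits only timestamps at which some component has data, which produces the same interleaved output as advancing the counter one tick at a time.

```python
from dataclasses import dataclass

@dataclass
class Tag:
    timestamp: int    # written into the tag header, as in FIG. 7
    media_type: str   # identifies the media component
    data: bytes       # the component's data at this timestamp

def multiplex(recordings):
    """Interleave media component recordings into one time-ordered tag list.

    `recordings` maps an (assumed) media-type name to {timestamp: data}.
    """
    tags = []
    # Visit timestamps in increasing order, mimicking the time-counter.
    for t in sorted({ts for rec in recordings.values() for ts in rec}):
        # Components with data at the same timestamp each get their own
        # tag, so several tags may share one timestamp.
        for media_type, rec in recordings.items():
            if t in rec:
                tags.append(Tag(t, media_type, rec[t]))
    return tags
```

For example, `multiplex({"video": {0: b"v0"}, "chat": {0: b"hi", 30: b"bye"}})` yields tags at timestamps 0, 0 and 30, interleaved in the manner of FIG. 4.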

The method of seeking and playing back a particular instance of the multiplexed recording files includes the steps of:
d. clicking on a particular index item that needs to be sought by a participant;
e. retrieval of the corresponding instance by the recording player client through a lookup search of the timestamp tx associated with the item in the index table;
f. streaming of the particular instance from the associated timestamp tx, through instructions received by the recording streaming server from the recording player client; and
g. playback of the media components from time tx.
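Steps d to g amount to an index lookup followed by a search over the interleaved tags. A minimal Python sketch, assuming the index has been parsed into (timestamp, label) pairs and the tag timestamps are available as a sorted list (both layouts are illustrative, not the patent's format):

```python
import bisect

def seek(index, clicked_item, tag_timestamps):
    """Return (tx, position): the timestamp of the clicked index item and
    the first position in the multiplexed tag list to resume playback from.

    `index` is a list of (timestamp, label) pairs; `tag_timestamps` is the
    sorted list of tag timestamps in the multiplexed recording.
    """
    tx = index[clicked_item][0]                   # step e: index table lookup
    pos = bisect.bisect_left(tag_timestamps, tx)  # steps f-g: stream from tx
    return tx, pos
```

Because the tags are stored in timestamp order, the binary search locates the resume point without scanning the whole recording.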
In an embodiment of the invention, the textual metadata which can be extracted from some of the media component recordings includes:
h. data from presentation slides, such as the filename of the presentation slides file, the name of the participant who loaded this file in the virtual conference, and the text on the slides along with slide transition times;
i. text messages sent by the participant, along with the time stamp;
j. video: the name of the participant who starts broadcasting a video, along with the time of broadcasting; and
k. audio: the speech-to-text transcript of the audio.
Industrial Application:
The Client application runs on the computer of every participant of the virtual conference. This application has the ability to capture various media components and also deliver them in order to offer a rich multimedia interactive experience to the participant. The Controlling Server communicates with the participant's Client application to establish the call and setup media component connections of the participant's Client with the Media Server. See Fig 1. In addition, the Controlling Server maintains temporal information about the various media interactions and events, which occur during the virtual conference. Examples of such temporal information are:
Participant A starts his video at time t1; Participant B shows slide S1 at time t2; Participant B sends text message C1 at time t3.

The Media Server manages the flow of media network traffic from and to the various participant Client applications. It saves each media component session as a file or a set of files on the Storage Server. The Storage Server stores the media component recording files from the Media Server. It also stores the multiplexed recordings generated by the Recording Creation Server. The Recording Creation Server performs three tasks in the process of creating the multiplexed recording as shown in FIG.2.
The Recording Creation Server first extracts textual content from the media component recording files. For instance, the textual content includes the text on the presentation slides, the text in the chat messages, the markers stored by the participants, and the audio transcript obtained from speech-to-text processing of the audio media component recording. The Recording Creation Server then pre-processes the media component recording files and transforms them to a format suitable for combining into a multiplexed file. Some of the transformations performed include re-sampling of the audio file and changing the resolution of the video file.
After preprocessing, the media component recording files are multiplexed into a single file in a time synchronous fashion as shown in Fig 2. A media container file format stores the time synchronized multiplexed recording file as shown in FIG.4. To achieve correct time alignment of the media components, the temporal information stored by the controlling server is also utilized. The multiplexed recording file thus generated, is stored on the Storage Server. The Recording Streaming Server is responsible for accepting connections from the participants and streaming the multiplexed recording file from the storage server. Note that every participant accessing the server may be accessing the recording of a different virtual conference. See Fig 3.
The Recording Player Client application is capable of connecting to the Recording Streaming Server. The content that is received by the Recording Player Client is the multiplexed file. The Recording Player de-multiplexes the individual media components and displays them in a time synchronous manner to the user. The index data is also received in the beginning and the Recording Player Client displays it in a manner suitable for the user to search and seek to a particular point in the virtual conference recording.


A container file format is used to identify and interleave different media component data. This file format defines a unit called a tag, which contains the data for a particular time period for a particular media component. The tag also has information identifying the media component and the timestamp corresponding to the start of the media stored in the tag. This information is stored in the initial portion of each tag, referred to herein as the tag header, as shown in FIG. 7.
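As an illustration of such a tag layout, the sketch below packs a tag header carrying the timestamp, media-type and codec attributes ahead of the tag data. The byte widths and field order are hypothetical; the specification does not define a concrete binary layout.

```python
import struct

# Hypothetical header layout: 4-byte timestamp (ms), 1-byte media-type id,
# 1-byte codec id, 4-byte data length -- all big-endian.
TAG_HEADER = struct.Struct(">IBBI")

def write_tag(timestamp_ms, media_type, codec, data):
    """Serialize one tag: header (FIG. 7, 701) followed by tag data (703)."""
    return TAG_HEADER.pack(timestamp_ms, media_type, codec, len(data)) + data

def read_tag(buf, offset=0):
    """Parse a tag; returns (timestamp, media_type, codec, data, next_offset)."""
    ts, mt, codec, length = TAG_HEADER.unpack_from(buf, offset)
    start = offset + TAG_HEADER.size
    return ts, mt, codec, buf[start:start + length], start + length
```

Returning `next_offset` lets a player walk the interleaved tags one after another, which is how a demultiplexer would traverse the container.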
As the index is stored in the initial portion of the multiplexed recording file, the Recording Player Client receives this index at the beginning, as it starts receiving the content from the Recording Streaming Server. The Client stores this index and maintains it while the user is playing the conference recording. The metadata content of the index entries is displayed by the Recording Player Client in a sequential manner so that the user sees a number of these entries at once while watching the recording. When the user clicks on any entry, the Recording Player Client seeks to the timestamp associated with that index entry. In this manner, the index display offers a quick way to seek to contextually interesting portions of the virtual conference recording.
The textual metadata extracted from the media component recordings is used to create index entries. In addition to this, the participants of the virtual conference have the ability to enter markers at particular times during the conference. For instance, when the presenter speaks about "the Amravati Stupa", one of the attendees may enter the text "Amravati Stupa" as a marker for reference. Such text markers can also be considered as temporal metadata of the virtual conference. The individual media component recordings are dealt with independently. During the recording generation, there is access to the media component recording files. This enables a thorough extraction of metadata from these media component recordings. Such extracted textual metadata is incorporated in the index which helps in giving contextual and meaningful structure and seeking ability in the recording of the virtual conference.
Further, the recording player client offers a user-friendly interface, which enables the participant to view and experience pre-recorded multiplexed media content in real time without loss of information. All the media components and related events are displayed to the participant. A seek bar is provided to assist the participant in seeking to a particular time in the recording. An index of the recording, which occurred during the virtual conference, is shown as a list of texts, each depicting a distinct event.
Certain media components like text messages, which have history associated with them, are shown in context. Hence, the entire content of such media is stored in the initial portion of the multiplexed file so that the Recording Player Client gets all the content whenever it starts playing the recording. Later when the Recording Player Client seeks to a time tx, it displays all the historical content for that media also. For example, in the case of text messages, all the text messages which have been made until time tx are displayed.
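The history behaviour for such components can be illustrated with a small sketch: after a seek to time tx, every message with a timestamp up to tx is redisplayed. The (timestamp, text) tuple layout is an assumption for illustration.

```python
def chat_history_at(messages, tx):
    """Return the chat messages a player should display after seeking to tx.

    `messages` is a list of (timestamp, text) pairs for the whole conference,
    available up front because history media are stored in the initial
    portion of the multiplexed file.
    """
    return [text for t, text in messages if t <= tx]
```

For example, seeking to tx = 50 in a conference with messages at times 10, 40 and 90 displays the first two messages, so the chat pane stays in context.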
The invention as described in detail herein and as illustrated by the drawings provides a method for streaming of prerecorded data across a network. The method enables real-time streaming of multiplexed data, independent of the bandwidth capability at the end user, without compromising on the quality of the data streamed.
The foregoing description of the invention has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.


CLAIMS
1. A method for streaming of prerecorded data across a network, the method comprising:
receiving at a server, at least one data component from at least one participant over a network for preprocessing;
retrieving independently temporal information of the data received;
multiplexing the preprocessed data by synchronizing with the temporal information to obtain an indexed multiplex data; and
streaming the indexed multiplex data to a plurality of users.
2. The method according to claim 1, wherein the data received includes audio data, video data and textual data.
3. The method according to claim 1, wherein the preprocessing of the data includes separation of textual data from the audio and/or visual data to enable indexing of the data.
4. The method according to claim 1, wherein the indexing of the data comprises generation of at least one time-stamped entry corresponding to at least one type of data.
5. The method according to claim 1, wherein the step of streaming of the indexed multiplex data includes a method of seeking and playing back of the indexed multiplexed data.
6. The method of seeking and playing back a particular instance of the multiplexed data across a network, the method including the steps of:
a. clicking on a particular index item from an index table that needs to be sought by at least one participant;
b. retrieving of the corresponding instance by a recording player client through a lookup search of the timestamp associated with the item in the index table;
c. streaming of the particular instance from the associated timestamp, through instructions received by the recording streaming server from the recording player client; and
d. playback of the media components from a preselected time.
7. A system for streaming of prerecorded data across a network, the system comprising:
i. at least one media server configured for receiving at least one data from at least one participant, retrievably storing the data and transmitting the same;
ii. a recording creation server configured to process the received data from the media server and create a time synchronized indexed multiplex data; and
iii. a streaming server coupled to the recording creation server for seeking and playing the indexed multiplexed data to a plurality of participants.
8. The system according to claim 7, wherein the data received includes audio data, video data and textual data.
9. The system according to claim 7, wherein the recording creation server comprises:
i. a preprocessing unit configured for analyzing the audio data and/or video data and creating a format capable of streaming;
ii. a metadata extraction unit for extracting the textual data to create a time-stamped tag associated with the audio data and/or video data; and
iii. a multiplexing unit for integrating the time-stamped textual data along with the audio data and/or video data to create an indexed multiplex data.
10. The system according to claim 7, wherein the streaming server includes a recording player client for seeking and playing back the indexed multiplex data to a plurality of participants across a network.

Documents

Name Date
2321-MUM-2009-POWER OF ATTORNEY(16-10-2009).pdf 2009-10-16
2321-MUM-2009-FORM 5(16-10-2009).pdf 2009-10-16
2321-MUM-2009-CORRESPONDENCE(16-10-2009).pdf 2009-10-16
2321-MUM-2009-FORM 1(16-10-2009).pdf 2009-10-16
abstract1.jpg 2018-08-10
2321-mum-2009-form 5.pdf 2018-08-10
Other Patent Document [05-10-2016(online)].pdf 2016-10-05
2321-mum-2009-form 2.doc 2018-08-10
2321-mum-2009-form 2(title page).pdf 2018-08-10
2321-MUM-2009-FORM 18(17-1-2011).pdf 2018-08-10
2321-mum-2009-form 2.pdf 2018-08-10
2321-mum-2009-form 13(12-1-2011).pdf 2018-08-10
2321-mum-2009-form 1.pdf 2018-08-10
2321-mum-2009-drawing.pdf 2018-08-10
2321-mum-2009-description(complete).doc 2018-08-10
2321-mum-2009-correspondence.pdf 2018-08-10
2321-mum-2009-description(complete).pdf 2018-08-10
2321-MUM-2009-CORRESPONDENCE(12-4-2012).pdf 2018-08-10
2321-MUM-2009-CORRESPONDENCE(17-1-2011).pdf 2018-08-10
2321-MUM-2009-CORRESPONDENCE(12-1-2011).pdf 2018-08-10
2321-mum-2009-claims.doc 2018-08-10
2321-mum-2009-claim.pdf 2018-08-10
2321-mum-2009-abstract.pdf 2018-08-10
2321-mum-2009-abstract.doc 2018-08-10
2321-MUM-2009-FER.pdf 2018-08-21
2321-MUM-2009-OTHERS [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-FORM-26 [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-FORM 13 [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-FER_SER_REPLY [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-FORM 13 [20-02-2019(online)]-1.pdf 2019-02-20
2321-MUM-2009-COMPLETE SPECIFICATION [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-DRAWING [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-CLAIMS [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-ABSTRACT [20-02-2019(online)].pdf 2019-02-20
2321-MUM-2009-ORIGINAL UR 6(1A) FORM 26-250219.pdf 2019-06-25
2321-MUM-2009-US(14)-HearingNotice-(HearingDate-01-04-2020).pdf 2020-03-04
2321-MUM-2009-US(14)-HearingNotice-(HearingDate-14-07-2020).pdf 2020-06-19
2321-MUM-2009-Correspondence to notify the Controller [30-03-2020(online)].pdf 2020-03-30
2321-MUM-2009-Correspondence to notify the Controller [13-07-2020(online)].pdf 2020-07-13
2321-MUM-2009-US(14)-ExtendedHearingNotice-(HearingDate-15-07-2020).pdf 2020-07-14
2321-MUM-2009-Response to office action [14-07-2020(online)].pdf 2020-07-14
2321-MUM-2009-PatentCertificate18-05-2021.pdf 2021-05-18
2321-MUM-2009-Written submissions and relevant documents [29-07-2020(online)].pdf 2020-07-29
2321-MUM-2009-IntimationOfGrant18-05-2021.pdf 2021-05-18
