Abstract: Disclosed is a method for interactive streaming of 3D video, the method comprising: obtaining a 3D model of an environment represented in the 3D video; determining starting point(s) of the 3D video from which interactive streaming is to be enabled; generating video segment(s) corresponding to the starting point(s) using the 3D model, wherein the video segment(s) comprise(s) images representing views of the environment from location(s) in the 3D model that correspond to the starting point(s); receiving an input indicative of a starting point from which the 3D video is to be streamed to a client device; and sending the video segment corresponding to the starting point to the client device and initializing streaming of the 3D video from the starting point, wherein the video segment is to be played at the client device until the 3D video begins to play from the starting point at the client device. FIG. 1
Description: TECHNICAL FIELD
The present disclosure relates to methods for interactive streaming of three-dimensional (3D) videos. The present disclosure also relates to systems for interactive streaming of 3D videos.
BACKGROUND
With technological advancements, extended-reality technology has become increasingly popular in various fields such as entertainment, real estate, training, simulators, navigation, and the like. Nowadays, streaming of three-dimensional (3D) videos is in demand for users to experience 3D virtual environments. Such 3D virtual environments can be experienced by using XR devices (such as virtual-reality (VR) devices) associated with the users. The streaming of the 3D videos is highly useful in applications such as XR tourism, XR gaming, XR video-conferencing, XR online education, and the like.
However, the existing technologies for streaming of three-dimensional (3D)
videos are associated with certain limitations and complications. The existing technologies for streaming of 3D videos require considerable processing resources at a streaming source, which need to remain activated at all times in order to provide a seamless streaming experience and 3D viewing experience. Thus, the streaming of the 3D videos is extremely power-intensive and cost-intensive.
Moreover, when the processing resources are activated based on user demand or user requirement, the streaming of the 3D videos is performed with considerable latency/delay, and the 3D viewing experience of the user is compromised. Further latency in the streaming is introduced by a communication network between the streaming source and a streaming destination, which is undesirable.
Therefore, in light of the foregoing discussion, there exists a need to overcome the aforementioned drawbacks associated with the existing technologies for streaming of 3D videos.
SUMMARY
The present disclosure seeks to provide a method for interactive streaming of three-dimensional (3D) videos. The present disclosure also seeks to provide a system for interactive streaming of 3D videos. An aim of the present disclosure is to provide a solution that overcomes, at least partially, the problems encountered in the prior art.
In a first aspect, an embodiment of the present disclosure provides a method for interactive streaming of a three-dimensional (3D) video, the method comprising:
- obtaining a 3D model of an environment represented in the 3D video;
- determining at least one starting point of the 3D video from which the interactive streaming of the 3D video is to be enabled;
- generating at least one video segment corresponding to the at least one starting point using the 3D model, wherein the at least one video segment comprises a plurality of images representing a plurality of views of the
environment from at least one location in the 3D model that corresponds to the at least one starting point;
- receiving an input indicative of a given starting point from which the 3D video is to be streamed to a client device; and
- sending a given video segment corresponding to the given starting point to
the client device and initializing streaming of the 3D video from the given starting point, wherein the given video segment is to be played at the client device until the 3D video begins to play from the given starting point at the client device.
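By way of a non-limiting illustration, the steps of the first aspect can be sketched in Python. All names here (`StreamingServer`, `Segment`, `render_views`, and so on) are illustrative assumptions and do not form part of the disclosure:

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    starting_point: float                        # timestamp (seconds) in the 3D video
    images: list = field(default_factory=list)   # views rendered from the 3D model

class StreamingServer:
    def __init__(self, model_3d, starting_points):
        self.model_3d = model_3d
        # Pre-generate one video segment per enabled starting point.
        self.segments = {
            sp: Segment(sp, images=self.render_views(sp)) for sp in starting_points
        }

    def render_views(self, starting_point):
        # Placeholder: views of the environment from the location in the
        # 3D model that corresponds to the starting point.
        return [f"view_{starting_point}_{angle}" for angle in (0, 90, 180, 270)]

    def handle_request(self, starting_point):
        # Receive an input indicating a starting point, send the matching
        # pre-generated segment, then initialize streaming from that point.
        segment = self.segments[starting_point]
        self.initialize_streaming(starting_point)
        return segment  # played at the client until streaming begins

    def initialize_streaming(self, starting_point):
        print(f"streaming initialized from t={starting_point}s")

server = StreamingServer(model_3d="museum.obj", starting_points=[0, 300, 600, 900])
segment = server.handle_request(300)
print(len(segment.images))  # 4 pre-rendered views bridge the streaming latency
```

In this sketch, the pre-generated segment is returned immediately while streaming is initialized, mirroring how the segment masks the start-up latency at the client device.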
Optionally, the method further comprises:
- generating reprojection information indicative of at least one of: a rotation,
a translation, to be applied to one or more of the plurality of images in the given video segment; and
- sending the reprojection information to the client device, wherein the reprojection information is to be utilized at the client device for generating
additional images that are to be added to the given video segment prior to playing the given video segment at the client device.
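A minimal sketch of how such reprojection information might be applied at the client device follows. Real reprojection would warp image pixels; this illustration only transforms the camera pose (position and viewing direction) attached to an image, and all function and field names are assumptions:

```python
import math

def rotate_yaw(direction, degrees):
    """Rotate a 3D view direction (x, y, z) about the vertical (y) axis."""
    rad = math.radians(degrees)
    x, y, z = direction
    return (x * math.cos(rad) + z * math.sin(rad),
            y,
            -x * math.sin(rad) + z * math.cos(rad))

def reproject(pose, rotation_deg, translation):
    # Apply the rotation and translation from the reprojection information
    # to an existing view's pose, yielding the pose of an additional image.
    position, direction = pose
    new_position = tuple(p + t for p, t in zip(position, translation))
    new_direction = rotate_yaw(direction, rotation_deg)
    return (new_position, new_direction)

# Generate a view halfway between two captured views by rotating 45 degrees
# and stepping slightly forward (illustrative values).
pose = ((0.0, 1.7, 0.0), (0.0, 0.0, 1.0))   # camera at eye height, facing +z
new_pose = reproject(pose, rotation_deg=45.0, translation=(0.0, 0.0, 0.1))
print(new_pose)
```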
Optionally, the method further comprises sending the at least one video segment corresponding to the at least one starting point of the 3D video to a cache memory
of a content delivery network, wherein the client device is communicably coupled to the content delivery network,
and wherein the step of receiving the input indicative of the given starting point from which the 3D video is to be streamed to the client device and the step of sending the given video segment corresponding to the given starting point to the
client device and initializing streaming of the 3D video from the given starting point are implemented by a server of the content delivery network.
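The content-delivery-network arrangement can be illustrated as follows; the `CdnEdgeServer` class and its fields are hypothetical stand-ins for a CDN server and its cache memory:

```python
class CdnEdgeServer:
    def __init__(self):
        self.cache = {}                     # cache memory: starting point -> segment

    def store(self, starting_point, segment):
        # Segments pre-generated by the origin server are pushed into the cache.
        self.cache[starting_point] = segment

    def handle_request(self, starting_point):
        # Served from the cache, with no round trip to the origin server.
        return self.cache.get(starting_point)

edge = CdnEdgeServer()
for sp in (0, 300, 600, 900):               # one segment per starting point
    edge.store(sp, f"segment@{sp}s")

print(edge.handle_request(600))             # -> segment@600s
```

Because the segments are already cached at the edge, the CDN server can answer the starting-point request and begin playback-bridging immediately.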
Optionally, in the method, the 3D video is a virtual-reality (VR) video.
Optionally, in the method, the step of obtaining the 3D model of the environment represented in the 3D video comprises receiving the 3D model of the environment from a device at which the 3D video is created or from a device at which the 3D model is stored.
Optionally, in the method, the 3D model is in the form of one of: a 3D polygonal mesh, a 3D point cloud, a voxel-based model, a mathematical 3D surface model, a 3D grid.
Optionally, in the method, the step of determining the at least one starting point of the 3D video comprises:
- creating a 3D grid indicating spatial structuring of the environment represented in the 3D video, wherein the 3D grid comprises a plurality of cells;
- creating a 3D textual matrix of the environment, wherein information
pertaining to each cell of the plurality of cells is saved in the 3D textual matrix;
- selecting an actual start point of the 3D video as a default starting point;
- enabling streaming of the 3D video from the default starting point and collecting user interaction data corresponding to a time period of the streaming;
- analysing the user interaction data for predicting probabilities of a user interacting with the plurality of cells in future; and
- identifying the at least one starting point, based on the probabilities that are predicted, wherein the at least one location in the 3D model that corresponds
to the at least one starting point maps to at least one cell amongst the plurality of cells.
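A simple illustration of the probability-based identification step above, with invented event data and an invented threshold:

```python
from collections import Counter

def predict_starting_points(interaction_events, threshold=0.25):
    # Count interaction events per grid cell and normalise into probabilities
    # of a user interacting with each cell in future.
    counts = Counter(event["cell_id"] for event in interaction_events)
    total = sum(counts.values())
    probabilities = {cell: n / total for cell, n in counts.items()}
    # Cells whose predicted probability meets the threshold are promoted
    # to starting points of the 3D video.
    return sorted(cell for cell, p in probabilities.items() if p >= threshold)

events = [{"cell_id": c} for c in ["A1", "A1", "B2", "A1", "C3", "B2", "A1", "B2"]]
print(predict_starting_points(events))  # A1: 0.5, B2: 0.375, C3: 0.125
```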
Optionally, in the method, the at least one starting point is at least one of: an actual start point of the 3D video, a mid-point of the 3D video, an intermediate point along a length of the 3D video.
Optionally, in the method, the step of generating the at least one video segment corresponding to the at least one starting point using the 3D model comprises:
- identifying, in the 3D model, the at least one location corresponding to the at least one starting point;
- controlling at least one virtual camera for capturing the plurality of images
representing the plurality of views of the environment from the at least one location in the 3D model, by adjusting a height and/or a viewing direction of the at least one virtual camera at the at least one location; and
- utilising the plurality of images captured from the at least one location for producing the at least one video segment.
Optionally, in the method, the step of utilising the plurality of images captured from the at least one location for producing the at least one video segment comprises re-ordering the plurality of images, based on the plurality of views represented therein, to facilitate a smooth viewing experience of the at least one video segment.
Optionally, in the method, the plurality of images representing the plurality of views of the environment from the at least one location collectively cover a 360-degree view of the environment from the at least one location.
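The capture of views collectively covering 360 degrees can be sketched as follows; the camera parameters (`height`, `yaw_deg`) are illustrative stand-ins for the height and viewing-direction adjustments of the at least one virtual camera described above:

```python
def capture_segment_views(location, height=1.7, num_views=8):
    # Sweep a virtual camera at the identified location through evenly
    # spaced viewing directions so the images collectively cover 360 degrees.
    step = 360.0 / num_views
    images = []
    for i in range(num_views):
        yaw = i * step                      # viewing direction about the vertical axis
        images.append({
            "location": location,
            "height": height,               # adjustable, per the method
            "yaw_deg": yaw,
        })
    return images

views = capture_segment_views(location=(12.0, 0.0, -4.5))
print([v["yaw_deg"] for v in views])        # 0.0, 45.0, ..., 315.0 — full coverage
```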
In a second aspect, an embodiment of the present disclosure provides a system for interactive streaming of a three-dimensional (3D) video, the system comprising at least one server configured to:
- obtain a 3D model of an environment represented in the 3D video;
- determine at least one starting point of the 3D video from which the interactive streaming of the 3D video is to be enabled;
- generate at least one video segment corresponding to the at least one starting point using the 3D model, wherein the at least one video segment comprises a plurality of images representing a plurality of views of the
environment from at least one location in the 3D model that corresponds to the at least one starting point;
- receive an input indicative of a given starting point from which the 3D video is to be streamed to a client device; and
- send a given video segment corresponding to the given starting point to the
client device and initialize streaming of the 3D video from the given starting point, wherein the given video segment is to be played at the client device until the 3D video begins to play from the given starting point at the client device.
Optionally, the at least one server is configured to:
- generate reprojection information indicative of at least one of: a rotation, a
translation, to be applied to one or more of the plurality of images in the given video segment; and
- send the reprojection information to the client device, wherein the reprojection information is to be utilized at the client device to generate additional images that are to be added to the given video segment prior to playing the given
video segment at the client device.
Optionally, the at least one server is configured to send the at least one video segment corresponding to the at least one starting point of the 3D video to a cache memory of a content delivery network, wherein the client device is communicably coupled to the content delivery network,
and wherein a server of the content delivery network is configured to receive the input indicative of the given starting point from which the 3D video is to be streamed to the client device, and send the given video segment corresponding to the given starting point to the client device and initialize streaming of the 3D
video from the given starting point.
Optionally, in the system, the 3D video is a virtual-reality (VR) video.
Optionally, when obtaining the 3D model of the environment represented in the 3D video, the at least one server is configured to receive the 3D model of the environment from a device at which the 3D video is created or from a device at
which the 3D model is stored.
Optionally, in the system, the 3D model is in the form of one of: a 3D polygonal mesh, a 3D point cloud, a voxel-based model, a mathematical 3D surface model, a 3D grid.
Optionally, when determining the at least one starting point of the 3D video, the at
least one server is configured to:
- create a 3D grid indicating spatial structuring of the environment represented in the 3D video, wherein the 3D grid comprises a plurality of cells;
- create a 3D textual matrix of the environment, wherein information pertaining to each cell of the plurality of cells is saved in the 3D textual matrix;
- select an actual start point of the 3D video as a default starting point;
- enable streaming of the 3D video from the default starting point and collect user interaction data corresponding to a time period of the streaming;
- analyse the user interaction data for predicting probabilities of a user interacting with the plurality of cells in future; and
- identify the at least one starting point, based on the probabilities that are predicted, wherein the at least one location in the 3D model that corresponds to the at least one starting point maps to at least one cell amongst the plurality of cells.
Optionally, in the system, the at least one starting point is at least one of: an actual start point of the 3D video, a mid-point of the 3D video, an intermediate point along a length of the 3D video.
Optionally, when generating the at least one video segment corresponding to the
at least one starting point using the 3D model, the at least one server is configured to:
- identify, in the 3D model, the at least one location corresponding to the at least one starting point;
- control at least one virtual camera for capturing the plurality of images
representing the plurality of views of the environment from the at least one location in the 3D model, by adjusting a height and/or a viewing direction of the at least one virtual camera at the at least one location; and
- utilise the plurality of images captured from the at least one location for producing the at least one video segment.
Optionally, when utilising the plurality of images captured from the at least one location for producing the at least one video segment, the at least one server is configured to re-order the plurality of images, based on the plurality of views represented therein, to facilitate a smooth viewing experience of the at least one video segment.
Optionally, in the system, the plurality of images representing the plurality of views of the environment from the at least one location collectively cover a 360-degree view of the environment from the at least one location.
Embodiments of the present disclosure substantially eliminate or at least partially address the aforementioned problems in the prior art, and enable real-time or near-real-time interactive streaming of 3D videos without any latency/delay, in a power-efficient and cost-efficient manner.
Additional aspects, advantages, features and objects of the present disclosure would be made apparent from the drawings and the detailed description of the
illustrative embodiments construed in conjunction with the appended claims that follow.
It will be appreciated that features of the present disclosure are susceptible to being combined in various combinations without departing from the scope of the
present disclosure as defined by the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
The summary above, as well as the following detailed description of illustrative embodiments, is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the present disclosure, exemplary
constructions of the disclosure are shown in the drawings. However, the present disclosure is not limited to specific methods and instrumentalities disclosed herein. Moreover, those skilled in the art will understand that the drawings are not to scale. Wherever possible, like elements have been indicated by identical numbers.
Embodiments of the present disclosure will now be described, by way of example only, with reference to the following diagrams wherein:
FIG. 1 illustrates steps of a method for interactive streaming of a three-dimensional (3D) video, in accordance with an embodiment of the present disclosure;
FIGs. 2A and 2B illustrate exemplary environments in which a system for interactive streaming of a three-dimensional (3D) video is used, in accordance with different embodiments of the present disclosure; and
FIG. 3A illustrates an exemplary layout of an environment represented in a three-dimensional (3D) video, while FIGs. 3B, 3C and 3D illustrate exemplary
images representing different views of the environment from different perspectives with respect to a starting point of the 3D video, in accordance with an embodiment of the present disclosure.
In the accompanying drawings, an underlined number is employed to represent an item over which the underlined number is positioned or an item to which the underlined number is adjacent. A non-underlined number relates to an item identified by a line linking the non-underlined number to the item. When a
number is non-underlined and accompanied by an associated arrow, the non-underlined number is used to identify a general item at which the arrow is pointing.
DETAILED DESCRIPTION
The following detailed description illustrates embodiments of the present
disclosure and ways in which they can be implemented. Although some modes of carrying out the present disclosure have been disclosed, those skilled in the art would recognize that other embodiments for carrying out or practising the present disclosure are also possible.
The present disclosure provides a method and a system for interactive streaming
of a three-dimensional (3D) video. The method and the system enable real-time or near-real-time interactive streaming of the 3D video without any latency/delay, in a power-efficient and cost-efficient manner. Herein, a given video segment corresponding to a given starting point of the 3D video is generated by at least one server, and is subsequently played at a client device (and thus, is shown to a user
of the client device) until the 3D video is streamed and begins to play from the given starting point at the client device. In such a case, processing resources of the at least one server need not remain activated at all times to provide a 3D viewing experience to the user. Such a manner of interactive streaming of the 3D video is relatively simple and effectively masks a latency involved with streaming in a
manner that is imperceptible to the user, by playing the given video segment at the client device until the 3D video starts playing from the given starting point at the client device. Thus, the viewing experience of said user is not compromised in the time that processing resources of the at least one server are being activated for streaming the 3D video and the 3D video is being sent to the client device.
Moreover, the given video segment is immersive and interactive in nature, so the user's viewing experience is rather enhanced in said time. The method and the system are simple, support providing a real-time interactive streaming experience, and can be implemented with ease.
Referring to FIG. 1, illustrated are steps of a method for interactive streaming of a three-dimensional (3D) video, in accordance with an embodiment of the present disclosure. At step 102, a 3D model of an environment represented in the 3D video is obtained. At step 104, there is determined at least one starting point of the 3D video from which the interactive streaming of the 3D video is to be enabled.
At step 106, at least one video segment corresponding to the at least one starting point is generated using the 3D model, wherein the at least one video segment comprises a plurality of images representing a plurality of views of the environment from at least one location in the 3D model that corresponds to the at least one starting point. At step 108, an input is received, the input being
indicative of a given starting point from which the 3D video is to be streamed to a client device. At step 110, a given video segment corresponding to the given starting point is sent to the client device and streaming of the 3D video from the given starting point is initialized, wherein the given video segment is to be played at the client device until the 3D video begins to play from the given starting point
at the client device.
The aforementioned steps are only illustrative and other alternatives can also be provided where one or more steps are added, one or more steps are removed, or one or more steps are provided in a different sequence without departing from the scope of the claims herein. Each of these steps is described later in more detail.
Referring to FIGs. 2A and 2B, illustrated are exemplary environments in which a system 200 for interactive streaming of a three-dimensional (3D) video is used, in accordance with different embodiments of the present disclosure. The system 200 comprises at least one server (depicted as a server 202).
With reference to FIG. 2A, the at least one server 202 is communicably coupled to a client device 204. The at least one server 202 is configured to:
- obtain a 3D model of an environment represented in the 3D video;
- determine at least one starting point of the 3D video from which the
interactive streaming of the 3D video is to be enabled;
- generate at least one video segment corresponding to the at least one starting point using the 3D model, wherein the at least one video segment comprises a plurality of images representing a plurality of views of the environment from at least one location in the 3D model that corresponds to the at
least one starting point;
- receive an input indicative of a given starting point from which the 3D video is to be streamed to the client device 204; and
- send a given video segment corresponding to the given starting point to the client device 204 and initialize streaming of the 3D video from the given starting
point, wherein the given video segment is to be played at the client device 204 until the 3D video begins to play from the given starting point at the client device 204.
With reference to FIG. 2B, optionally, the at least one server 202 is communicably coupled to at least one content delivery network (depicted as a
content delivery network 206), wherein the content delivery network 206 is communicably coupled to the client device 204. In other words, the at least one server 202 is communicably coupled to the client device 204, optionally via the content delivery network 206. Optionally, the content delivery network 206 comprises a plurality of servers (depicted as servers 208 and 210) and a plurality
of cache memories (depicted as cache memories 212 and 214) associated with the plurality of servers 208 and 210. The plurality of cache memories 212 and 214 may be a part of data repositories (not shown) associated with the plurality of servers 208 and 210, respectively.
It may be understood by a person skilled in the art that FIGs. 2A and 2B include simplified exemplary environments of using the system 200 for the sake of
clarity, which should not unduly limit the scope of the claims herein. It is to be understood that the specific implementations of the system 200 are provided as examples, and are not to be construed as limiting them to specific numbers or types of servers, client devices, content delivery networks, and/or cache
memories. The person skilled in the art will recognize many variations, alternatives, and modifications of embodiments of the present disclosure. For the sake of simplicity, the content delivery network 206 is shown to be communicably coupled to only a single client device 204 in FIG. 2B, but in reality, the content delivery network 206 is communicably coupled to a plurality of client
devices.
Throughout the present disclosure, the term "three-dimensional video" refers to a sequence of images representing 3D visual content. Such visual content may, for example, be represented using colour information, depth information, transparency information, lighting information, and the like, of an environment of
which the 3D video is made. Notably, the 3D video conveys the depth information, in addition to the colour information, as compared to a two-dimensional (2D) video. Thus, when the 3D video is shown to a user, the user can beneficially visually perceive depth within the environment represented in the 3D video (i.e., depths of objects or their parts present in said environment).
The 3D video is shown to a user of the client device 204. Optionally, the client device 204 is a virtual-reality (VR) device. The VR device could, for example, be a VR head-mounted display (HMD) device, a pair of VR glasses, a VR console, and the like. The VR device could also be an extended-reality (XR) device that is capable of showing VR videos. Optionally, the client device 204 is a 3D video-compatible television, a 3D video-compatible computer, a 3D video-compatible smartphone, or similar. Client devices capable of showing 3D videos are well-known in the art.
Optionally, the 3D video is a virtual-reality (VR) video. The term "virtual-reality video" refers to a computer-generated video that represents a virtual environment.
Optionally, the VR video can represent a fully computer-generated virtual environment that is not created using any real-world content, i.e., only computer-generated objects (namely, digital objects) are present in said virtual environment. In an example, the VR video may be a VR adventure park video in which a user
can virtually experience a roller coaster ride. Furthermore, optionally, the VR video can represent a virtual environment that is created based on real-world content, i.e., real-world content is captured using camera(s) at a real-world location for being shown at another real-world location. Objects represented in the real-world location can be understood to be virtual objects, from a perspective of
that other real-world location. In an example, the VR video may represent a virtual tour of the real-world location, such as a museum, a waterfall, a shopping mall, and the like. This virtual tour is to be shown at other real-world locations, for example, at a user's home, a user's workplace, and the like. Moreover, optionally, the VR video can also represent a virtual environment that
is created based on real-world content and computer-generated objects. It will be appreciated that the user can view the VR video from different perspectives, for example, by changing an orientation of his/her head, by changing his/her gaze within the virtual environment, or similar. The VR video can be useful in various fields such as entertainment, education, marketing, training, simulators,
navigation, and the like. VR videos are well-known in the art.
Throughout the present disclosure, the term "interactive streaming" of the 3D video refers to streaming of the 3D video from the at least one server 202 to the client device 204 in a manner that a latency of such streaming is masked by provision of an interactive visual experience until the 3D video begins to play at
the client device 204. Such interactive streaming can be useful in terms of improving user engagement for streaming-based applications such as virtual-reality tourism, virtual-reality gaming, video-conferencing, online education, and the like.
Throughout the present disclosure, the term "three-dimensional model" of the
environment refers to a data structure comprising detailed information pertaining
to a 3D space of the environment. Such detailed information is indicative of at least one of: features of objects or their parts present in the environment, shapes and sizes of the objects or their parts, position and orientation of the objects or their parts, colour and depth information of the objects or their parts, lighting
within the environment. It will be appreciated that the 3D model of the environment is already present at a time of generating the 3D video.
Optionally, in the method, the step of obtaining the 3D model of the environment represented in the 3D video comprises receiving the 3D model of the environment from a device at which the 3D video is created or from a device at which the 3D
model is stored. In this regard, the device at which the 3D video is created may comprise only a processor, or may comprise both a processor and at least one camera. The processor of such a device can be communicably coupled to the at least one server 202. In some implementations, when the 3D video is a fully computer-generated 3D video, the aforesaid device comprises only the processor.
In other implementations, when the 3D video represents a virtual environment that is created based on real-world content, or based on the real-world content and the computer-generated content, the aforesaid device comprises both the processor and the at least one camera. Examples of the at least one camera include, but are not limited to, a Red-Green-Blue (RGB) camera, a Red-Green-Blue-Alpha (RGB-A) camera, a Red-Green-Blue-Depth (RGB-D) camera, a stereo camera, and an infrared (IR) camera. The device at which the 3D video is created could be a computing device. Examples of the computing device include, but are not limited to, a laptop, a desktop, a tablet, a workstation, and a console. Furthermore, upon its generation, the 3D model of the environment could be stored at a device.
Optionally, in this regard, the 3D model is stored at a data repository communicably coupled to the at least one server 202. This data repository is also communicably coupled to the device at which the 3D video is created. The data repository could, for example, be implemented as a memory of the at least one server 202, a memory of the computing device, a removable memory, a cloud-based database, or similar.
Optionally, the 3D model is in the form of one of: a 3D polygonal mesh, a 3D point cloud, a voxel-based model, a mathematical 3D surface model, a 3D grid. It will be appreciated that the aforesaid forms of the 3D model are simple, reliable, and accurate in terms of representing the environment that is represented in the 3D
video. Such forms of the 3D model effectively incorporate 3D visual details of the environment represented in the 3D video. Advantageously, this facilitates generating the at least one video segment with high accuracy, as the plurality of images captured, using the 3D model, from the at least one location in the 3D model have an accurate and comprehensive 3D visual representation throughout
their fields of view. Furthermore, the aforesaid forms of the 3D model are well-known in the art, and can be implemented with ease.
Notably, the interactive streaming of the 3D video is enabled (namely, started) from the at least one starting point of the 3D video. In this regard, there could be different starting points of the 3D video which correspond to different junctures
(namely, different timestamps) in the 3D video. These different junctures can be used by a user to jump or navigate to important and relevant portions in the 3D video. It will be appreciated that since visual content of the 3D video is already well-defined after generating the 3D video, the different junctures in the 3D video can then be ascertained easily for determining the at least one starting point of the
3D video. In an example, for a 20-minute 3D video of a virtual tour of a museum, there could be four starting points corresponding to entries of four different rooms of the museum, the starting points lying at intervals of 5 minutes from each other. In this example, the four starting points are: 0 minutes, 5 minutes, 10 minutes, and 15 minutes.
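The museum example can be expressed as data; the snap-to-nearest helper is an addition for illustration only and is not described in the disclosure:

```python
VIDEO_LENGTH_S = 20 * 60                     # a 20-minute 3D video
ROOMS = ["Room 1", "Room 2", "Room 3", "Room 4"]
INTERVAL_S = VIDEO_LENGTH_S // len(ROOMS)    # rooms entered every 5 minutes

# One enabled starting point (in seconds) per room entry.
starting_points = {room: i * INTERVAL_S for i, room in enumerate(ROOMS)}

def nearest_starting_point(requested_s):
    """Snap a requested timestamp to the closest enabled starting point."""
    return min(starting_points.values(), key=lambda sp: abs(sp - requested_s))

print(starting_points["Room 3"])             # 600 s = 10 minutes
print(nearest_starting_point(700))           # 600 — jump to Room 3's entry
```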
Optionally, in the method, the step of determining the at least one starting point of the 3D video comprises:
- creating a 3D grid indicating spatial structuring of the environment represented in the 3D video, wherein the 3D grid comprises a plurality of cells;
- creating a 3D textual matrix of the environment, wherein information
pertaining to each cell of the plurality of cells is saved in the 3D textual matrix;
- selecting an actual start point of the 3D video as a default starting point;
- enabling streaming of the 3D video from the default starting point and collecting user interaction data corresponding to a time period of the streaming;
- analysing the user interaction data for predicting probabilities of a user
interacting with the plurality of cells in future; and
- identifying the at least one starting point, based on the probabilities that are predicted, wherein the at least one location in the 3D model that corresponds to the at least one starting point maps to at least one cell amongst the plurality of cells.
10 In this regard, it will be appreciated that the aforesaid steps are performed prior to actually implementing interactive streaming of the 3D video. Optionally, when creating the 3D grid, the at least one server 202 is configured to digitally create a 3D data structure representing a 3D space of the environment and to digitally divide the 3D data structure into a plurality of 3D sub-units of the 3D data
15 structure, wherein the 3D data structure is the 3D grid and the plurality of 3D sub- units are the plurality of cells of the 3D grid, and wherein each cell represents a portion of the 3D space of the environment. The plurality of cells constitute a 3D grid structure of the environment. Each cell is specified with its corresponding location in the 3D model and a unique identifier of the cell. Such a corresponding
20 location could be represented using a coordinate system, for example, such as a Cartesian coordinate system. Furthermore, the term "textual matrix" refers to a data structure that is capable of storing information pertaining to each cell in text format. Such information may comprise a portion of the detailed information in the 3D model that corresponds to each cell. The 3D textual matrix of the
25 environment may also store analytics information pertaining to each cell.
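As an illustrative sketch of the 3D grid and textual matrix described above (the function and class names such as `create_3d_grid` are hypothetical; the disclosure does not prescribe a concrete data layout), each cell could carry a unique identifier, a Cartesian location, and text-format information:

```python
from dataclasses import dataclass, field

@dataclass
class Cell:
    cell_id: str                 # unique identifier of the cell
    origin: tuple                # Cartesian coordinates of the cell in the 3D model
    info: dict = field(default_factory=dict)  # text-format information about the cell

def create_3d_grid(extent=(2, 2, 2), cell_size=1.0):
    """Digitally divide the environment's 3D space into a grid of cells."""
    grid = {}
    for x in range(extent[0]):
        for y in range(extent[1]):
            for z in range(extent[2]):
                cell_id = f"cell_{x}_{y}_{z}"
                grid[(x, y, z)] = Cell(
                    cell_id, (x * cell_size, y * cell_size, z * cell_size)
                )
    return grid

def create_textual_matrix(grid):
    """Store text-format information for each cell, keyed by its identifier."""
    return {cell.cell_id: {"location": cell.origin, **cell.info}
            for cell in grid.values()}

grid = create_3d_grid()
matrix = create_textual_matrix(grid)
print(len(grid))                          # 8 cells for a 2x2x2 grid
print(matrix["cell_0_0_0"]["location"])  # (0.0, 0.0, 0.0)
```

Analytics information per cell (such as predicted interaction probabilities) could likewise be stored under additional keys of the textual matrix.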
Since videos are typically streamed from their beginning, the actual start point of the 3D video is selected as the default starting point. Once the 3D video is streamed from the actual start point, visual content of the 3D video is displayed at client devices of multiple users, and the multiple users may start interacting with
30 the environment represented in the 3D video. These multiple users receive the 3D
video in a conventionally-streamed manner, and their interaction data is utilised for eventually enabling interactive streaming for the user of the client device 204. The term "user interaction data" refers to information pertaining to how the multiple users of the client devices interact with and behave in the environment
5 represented in the 3D video during the time period of the streaming of the 3D video. Optionally, the user interaction data comprises at least one of: a location visited in the environment, time spent at a location visited in the environment, a theme that is selected, an environmental effect that is selected, a type and/or a characteristic of an object that is selected, a placement of objects, of/by a given
10 user.
Furthermore, by optionally analysing the user interaction data, the at least one server 202 could correctly predict the probabilities of the user interacting with the plurality of cells in the future. Optionally, when analysing the user interaction data for predicting the probabilities, the at least one server 202 is configured to employ
15 at least one data processing algorithm.
It will be appreciated that the at least one (identified) starting point may or may not include the default starting point (i.e., the actual start point of the 3D video). Optionally, the at least one starting point comprises N starting points whose locations in the 3D model map to cells having N highest probabilities from
20 amongst the plurality of cells. Optionally, in this regard, N is greater than or equal to 1. Alternatively, optionally, the at least one starting point comprises all starting points whose locations in the 3D model map to cells having a probability that is greater than a predefined threshold. Optionally, in this regard, the predefined threshold lies in a range of 0.51 to 0.9. More optionally, the predefined threshold
25 lies in a range of 0.7 to 0.9. It will be appreciated that the predefined threshold defines a minimum required probability of user interaction for a cell, so as to be able to identify a starting point corresponding to the cell. In an example, the 3D grid may comprise four cells C1, C2, C3, and C4, and probabilities of the user interacting with the four cells C1-C4 in future may be 0.6, 0.33, 0.75, and 0.48,
respectively. In such a case, when the predefined threshold is 0.55, two starting points corresponding to the cells C1 and C3, respectively, are identified.
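The threshold-based and top-N selection described above can be sketched as follows, using the example probabilities for cells C1-C4 (the function names are illustrative only):

```python
def identify_starting_points(cell_probabilities, threshold=0.55):
    """Return cells whose predicted interaction probability exceeds the threshold."""
    return [cell for cell, p in cell_probabilities.items() if p > threshold]

def top_n_starting_points(cell_probabilities, n):
    """Return the N cells having the N highest predicted probabilities."""
    return sorted(cell_probabilities, key=cell_probabilities.get, reverse=True)[:n]

# Example from the text: probabilities of user interaction with cells C1-C4
probabilities = {"C1": 0.6, "C2": 0.33, "C3": 0.75, "C4": 0.48}

print(identify_starting_points(probabilities))   # ['C1', 'C3']
print(top_n_starting_points(probabilities, 2))   # ['C3', 'C1']
```

Starting points would then be identified for the 3D-model locations mapping to the selected cells.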
Optionally, the at least one starting point is at least one of: an actual start point of the 3D video, a mid-point of the 3D video, an intermediate point along a length of
5 the 3D video. In this regard, the actual start point of the 3D video could be the default starting point from which the interactive streaming of the 3D video is to be enabled. It will be appreciated that different intermediate points along the length of the 3D video may be different important junctures (i.e., timestamps) in the 3D video from which the interactive streaming of the 3D video could be
10 started. Prior to streaming the 3D video at the client device 204, the user of the client device 204 may choose a given timestamp from amongst multiple timestamps as the given starting point. Optionally, the 3D video is split into smaller-sized chunks and the smaller-sized chunks are encoded and then stored at the at least one server 202. In this regard, the intermediate point is optionally an
15 actual start point of a given smaller-sized chunk of the 3D video. In an example, a 10-minute 3D video may be stored at the at least one server 202 in the form of ten 1-minute chunks. In such a case, there could be 10 starting points, which are the actual start points of the ten 1-minute chunks. The user could begin to stream from the start point of any of these ten 1-minute chunks.
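The chunk-based starting points in the example above can be computed as follows (a minimal sketch; the function name is hypothetical):

```python
def chunk_start_points(video_duration_s, chunk_duration_s):
    """Start points (in seconds) of equally-sized chunks of the 3D video."""
    return list(range(0, video_duration_s, chunk_duration_s))

# A 10-minute video split into 1-minute chunks yields ten starting points
starts = chunk_start_points(600, 60)
print(len(starts))   # 10
print(starts[:3])    # [0, 60, 120]
```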
20 Upon determining the at least one starting point, the at least one server 202 generates the at least one video segment using the 3D model. Throughout the present disclosure, the term "video segment" refers to a video that represents views of the environment from the at least one location in the 3D model that corresponds to the at least one starting point. It will be appreciated that different
25 video segments are generated for different starting points in the 3D video, since different starting points may correspond to different locations in the 3D model.
Optionally, in the method, the step of generating the at least one video segment corresponding to the at least one starting point using the 3D model comprises:
- identifying, in the 3D model, the at least one location corresponding to the at least one starting point;
- controlling at least one virtual camera for capturing the plurality of images representing the plurality of views of the environment from the at least one
5 location in the 3D model, by adjusting a height and/or a viewing direction of the at least one virtual camera at the at least one location; and
- utilising the plurality of images captured from the at least one location for producing the at least one video segment.
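The capture step above could be sketched as follows, assuming a caller-supplied `render` callable standing in for the actual renderer of the 3D model (all names here are illustrative, not part of the disclosure):

```python
import itertools

def capture_segment_images(location, heights, viewing_directions, render):
    """Capture one image per (height, viewing direction) camera pose at `location`.

    The virtual camera's two-dimensional location stays fixed; only its
    height and viewing direction vary, as described in the text.
    """
    images = []
    for height, direction in itertools.product(heights, viewing_directions):
        pose = {"position": (location[0], location[1], height),
                "direction": direction}
        images.append(render(pose))
    return images

# A stub renderer that simply records the pose it was asked to draw
images = capture_segment_images(
    location=(4.0, 2.5),
    heights=[1.6, 1.8],
    viewing_directions=["north", "east", "south", "west"],
    render=lambda pose: pose,
)
print(len(images))  # 2 heights x 4 directions = 8 images
```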
Optionally, when identifying the at least one location corresponding to the at least
10 one starting point, the at least one server 202 is configured to: identify an image in the 3D video that corresponds to the at least one starting point; and map, to the 3D model, a view of the environment that is represented in the image, for determining a viewpoint of the view as the at least one location corresponding to the at least one starting point. Thus, the at least one server 202 can easily and accurately
15 identify the at least one location in the view of the environment.
Furthermore, the "virtual camera" is not a physical camera, but a virtual entity that is controllable to capture a given image of the environment from a certain viewpoint and viewing direction (which are as indicated by a camera pose). In other words, the at least one virtual camera is a software entity that can be
20 controlled (by the at least one server 202) to be arranged in a particular location, and to capture the plurality of images by adjusting camera poses of the at least one virtual camera at said location, in a manner similar to how any physical camera can be used. The term "pose" refers to position and/or orientation. It will be appreciated that adjusting the at least one virtual camera at different heights
25 and/or different viewing directions enables capturing the plurality of views of the environment from different perspectives at the at least one location. A two-dimensional location of the at least one virtual camera stays the same while capturing the plurality of images from any location in the 3D model, but the height of the at least one virtual camera can optionally vary. Beneficially, a
30 comprehensive and realistic multi-perspective view of the environment could be
provided in the at least one video segment (that is produced by utilising the plurality of images). This would potentially provide an immersive and seemingly- realistic and interactive viewing experience to the user whilst the 3D video is being streamed from the given starting point.
5 Optionally, when utilising the plurality of images for producing the at least one video segment, the at least one server 202 is configured to create a sequence of the plurality of images. Optionally, in this regard, the at least one server 202 is configured to employ at least one image processing algorithm. Image processing algorithms (for example, such as image sharpening, image brightening, colour
10 correction, and the like) are well-known in the art.
Optionally, in the method, the step of utilising the plurality of images captured from the at least one location for producing the at least one video segment comprises re-ordering the plurality of images, based on the plurality of views represented therein, to facilitate a smooth viewing experience of the at least one
15 video segment. In this regard, it may be that the plurality of images are captured by randomly adjusting the at least one virtual camera at different heights and/or towards different viewing directions, and thus a captured sequence of the plurality of images may collectively represent a staggered and discontinuous view of the environment. If this captured sequence of the plurality of images is shown to the
20 user, a viewing experience of the user could be unrealistic and non-immersive. Therefore, when utilising the plurality of images for producing the at least one video segment, the plurality of images are optionally beneficially re-arranged (namely, re-ordered) in a specific sequence that would represent a logical and continuous view of the environment. This facilitates in providing a cohesive and
25 smooth viewing experience of the at least one video segment to the user. Such re- ordering can be done by re-arranging the plurality of images in the specific sequence in a logical way, for example by placing images with similar views of the environment next to each other in the specific sequence, or similar.
In an example, a sequence of the plurality of images captured by the at least one virtual camera may represent, for example, a top-right view, a straight view, a bottom-right view, a top-left view, and a bottom-left view of the environment. In such an example, the aforesaid images can be re-ordered to provide, for example,
5 a clockwise spiral visual trajectory. Upon re-ordering, the sequence of the plurality of images may be: the straight view, the top-right view, the bottom-right view, the bottom-left view, the top-left view.
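The re-ordering in this example can be sketched as a simple sort against the desired spiral order (names are illustrative; an actual implementation would order images by their associated camera poses):

```python
# Desired clockwise-spiral order of view labels (from the example above)
SPIRAL_ORDER = ["straight", "top-right", "bottom-right", "bottom-left", "top-left"]

def reorder_for_spiral(captured_views):
    """Re-order captured views so that they follow the spiral trajectory."""
    rank = {view: i for i, view in enumerate(SPIRAL_ORDER)}
    return sorted(captured_views, key=lambda view: rank[view])

captured = ["top-right", "straight", "bottom-right", "top-left", "bottom-left"]
print(reorder_for_spiral(captured))
# ['straight', 'top-right', 'bottom-right', 'bottom-left', 'top-left']
```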
Optionally, the plurality of images representing a plurality of views of the environment from the at least one location collectively cover a 360-degree view of
10 the environment from the at least one location. Typically, when users view 3D environments from a particular location, they view different perspectives of the 3D environments, for example, by changing an orientation of his/her head. Therefore, the plurality of images are captured in a manner that they collectively cover the 360-degree view of the environment from the at least one location.
15 Advantageously, this facilitates in providing a complete view of the environment to the user from every direction from the at least one location. Thus, a viewing experience of the user would be realistic and immersive, when the at least one video segment is shown to the user. In an example, the user can see what is in front of him/her, what is behind him/her, what is to his/her left and right, and what is
20 above and below him/her, in the 360-degree view of the environment from the at least one location.
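One simple way to obtain viewing directions that collectively cover a 360-degree view at a location is to space them evenly around the horizon (a sketch only; the disclosure does not mandate any particular sampling of directions):

```python
def panoramic_directions(num_views):
    """Evenly-spaced horizontal viewing directions (in degrees) covering 360 degrees."""
    step = 360 / num_views
    return [i * step for i in range(num_views)]

print(panoramic_directions(8))  # [0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0]
```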
Notably, once the at least one video segment is generated, the at least one server
202 receives the input indicative of the given starting point from the client device
204. Such an input may be provided by the user using a button, a touch screen, a
25 microphone, or the like, of the client device 204. In some implementations, when multiple starting points are determined in the 3D video, an input indicative of only one starting point at a time is received by the at least one server 202 from the client device 204. Subsequently, the given video segment corresponding to the given starting point is sent by the at least one server 202 to
30 the client device 204, and is to be played at the client device 204 until the 3D
video begins to play from the given starting point at the client device 204. Advantageously, in this way, the user of the client device 204 experiences interactive streaming of the 3D video and does not perceive any latency/delay of the streaming. This is because until the 3D video begins to play from the given
5 starting point at the client device 204, the given video segment corresponding to the given starting point would be shown to the user of the client device 204. Thus, the viewing experience of said user is not compromised in the time that processing resources of the at least one server 202 are being activated for streaming the 3D video and the 3D video is being sent to the client device 204. Moreover, the given
10 video segment is immersive and interactive in nature, so the user's viewing experience is rather enhanced in said time.
It will be appreciated that streaming of the 3D video from the given starting point is performed by the at least one server 202. This means that the at least one server
202 is a streaming source for the client device 204 (which is a streaming
15 destination). Notably, "activating" processing resources of the at least one server 202 could mean one or more of: increasing a number of servers that are engaged in streaming, increasing a number of processing threads handled by each core of a server engaged in streaming, and the like.
It will also be appreciated that initializing streaming of the 3D video from the
20 given starting point could mean any of:
- activating a processor of the at least one server 202 from its de-activated state for performing the streaming of the 3D video from the given starting point,
- controlling said processor to pause at least one currently ongoing task being performed by the processor for performing the streaming of the 3D video
25 from the given starting point,
- controlling said processor for performing the streaming of the 3D video from the given starting point, in addition to at least one currently ongoing task being performed by said processor.
Optionally, the method further comprises encoding the given video segment prior to sending the given video segment to the client device 204. The technical benefit of such encoding is that minimal storage resources, transmission time, and bandwidth are required by the at least one server 202 to send the given
5 (encoded) video segment to the client device 204. Optionally, upon receiving the given (encoded) video segment, the processor of the client device 204 is configured to decode the given video segment, prior to playing the given video segment at the client device 204.
It will be appreciated that when the user of the client device 204 optionally wants
10 to stream the 3D video from a new starting point that was not predefined earlier, a new video segment corresponding to the new starting point could be generated at the client device 204. The generation of the new video segment is performed by the processor of the client device 204. In this regard, when the user was earlier present at the at least one cell corresponding to the at least one location in the 3D
15 model and looked around, images that were rendered at that time can be utilised to generate the new video segment. Such images were rendered during the user's movement in the environment at the at least one location previously, and generating the new video segment using such previously-rendered images enables providing a dynamic immersive streaming experience without
20 computationally overburdening the processor of the client device 204.
Optionally, the method further comprises:
- generating reprojection information indicative of at least one of: a rotation, a translation, to be applied to one or more of the plurality of images in the given video segment; and
25 - sending the reprojection information to the client device, wherein the reprojection information is to be utilized at the client device for generating additional images that are to be added to the given video segment prior to playing the given video segment at the client device.
In this regard, since camera poses from which the plurality of images are captured are already known accurately, a change in viewpoint and view direction for reprojection can be determined accurately, based on said camera poses. Thus, the reprojection information is optionally generated based on the change in viewpoint
5 and view direction, and on a number of additional images that are required to be generated at the client device 204. The change in viewpoint corresponds to the translation to be applied, and the change in view direction corresponds to the rotation to be applied. In the reprojection information, the translation may be specified in terms of units of length, new coordinates, or similar, whereas the
10 rotation may be specified in terms of degrees, radians, or similar. The number of additional images can be known, for example, based on a difference between an actual frame rate of the given video segment and a required frame rate of the given video segment. For example, when the actual frame rate of the given video segment having a 5-second duration is 40 frames per second (FPS) and the
15 required frame rate is 60 FPS, 20 additional frames would be generated (using the reprojection information) per second of the given video segment. In this example, a total of 100 additional images would be generated at the client device 204. It will be appreciated that the additional images (representing additional views of the environment) can be generated at the client device 204, by the processor of the
20 client device 204, using the reprojection information, for example, by employing an interpolation technique or an extrapolation technique.
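The frame-count arithmetic in this example can be expressed as follows (the function name is hypothetical):

```python
def additional_images_needed(actual_fps, required_fps, duration_s):
    """Number of extra images to generate via reprojection at the client.

    Computed from the per-second shortfall between the required and the
    actual frame rate, over the segment's duration.
    """
    per_second = required_fps - actual_fps
    return per_second * duration_s

# Example from the text: a 5-second segment at 40 FPS upsampled to 60 FPS
print(additional_images_needed(actual_fps=40, required_fps=60, duration_s=5))  # 100
```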
Optionally, some of the additional images are generated and added to an initial portion of the given video segment prior to playing the given video segment at the client device 204, and remaining images of the additional images are generated
25 and added to a later portion of the given video segment while the initial portion is being played at the client device 204.
It will be appreciated that sending the reprojection information to the client device 204 facilitates in saving space at a data repository associated with the at least one server 202, and in saving transmission and processing resources of the at least one
30 server 202. This is so because the at least one server 202 is able to include fewer
images in the at least one video segment, since additional images can be generated using the reprojection information. Moreover, the given video segment that is generated by adding the additional images to the given video segment would be highly realistic and information-rich. This would potentially improve the user's
5 viewing experience when the given video segment is shown to the user at the client device 204.
Optionally, the method further comprises sending the at least one video segment corresponding to the at least one starting point of the 3D video to a cache memory of a content delivery network, wherein the client device is communicably coupled
10 to the content delivery network,
and wherein the step of receiving the input indicative of the given starting point from which the 3D video is to be streamed to the client device and the step of sending the given video segment corresponding to the given starting point to the client device and initializing streaming of the 3D video from the given starting
15 point are implemented by a server of the content delivery network.
The term "content delivery network" refers to a group of geographically distributed and interconnected servers that are employed to deliver content to client devices based on their geographical locations. Such content could, for example, be videos, webpages, audio files, text files, and the like. Depending on a
20 geographical location of a given client device (for example, the client device 204), a server (for example, the server 208) of the content delivery network 206 that is nearest to said geographical location of the given client device may be selected to deliver the content to the given client device.
It will be appreciated that there could be a scenario in which the at least one server
25 202 and the client device 204 are located considerably far from each other, and thus receiving the given video segment by the client device 204 from the at least one server 202 is associated with considerable network latency and bandwidth issues. Therefore, in such a case, the content delivery network 206 is optionally employed between the at least one server 202 and the client device 204. Thus,
instead of receiving the given video segment directly from the at least one server 202, the client device 204 can easily receive the given video segment from the content delivery network 206, as and when required in real time or near-real time (i.e., without any latency/delay) and in a bandwidth-efficient manner.
5 Optionally, the at least one video segment is sent from the at least one server 202 to the cache memory of the content delivery network 206, at a time when the number of streaming requests is low. The cache memory of the content delivery network 206 comprises the cache memories 212 and 214. Upon receiving the at least one video segment from the at least one server 202, the content delivery
10 network 206 stores the at least one video segment in its cache memory, and upon receiving the input indicative of the given starting point from the client device 204, the given video segment corresponding to the given starting point can be sent to the client device 204. It will be appreciated that the at least one video segment could be sent to and saved at one or more of the cache memories 212 and 214.
15 This potentially reduces transmission resources and time of the at least one server 202 when the given video segment is streamed to the client device 204. Moreover, employing the content delivery network 206 is particularly beneficial when the at least one server 202 has to serve a plurality of client devices simultaneously. It will be appreciated that the content delivery network 206 is communicably
20 coupled to the plurality of client devices. The term "cache memory" refers to a memory that can be accessed quickly and efficiently, as compared to a main memory (namely, a random access memory (RAM)). It will be appreciated that the aforementioned steps of receiving the input and sending the given video segment and initializing streaming of the 3D video are implemented by the server
25 (comprising the servers 208 and 210) of the content delivery network 206 in a similar manner as performed by the at least one server 202 of the system 200 that is described earlier.
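A minimal sketch of the edge-server caching behaviour described above (the class and method names are illustrative, not part of the disclosure):

```python
class CdnEdgeServer:
    """Sketch of a content-delivery-network edge server serving cached segments."""

    def __init__(self):
        self.cache = {}  # starting point -> encoded video segment

    def preload(self, starting_point, segment):
        """Origin server pushes segments to the edge cache ahead of demand."""
        self.cache[starting_point] = segment

    def handle_request(self, starting_point):
        """Serve the cached segment for the requested starting point, if present."""
        return self.cache.get(starting_point)

edge = CdnEdgeServer()
edge.preload("5min", b"encoded-segment-bytes")
print(edge.handle_request("5min"))  # b'encoded-segment-bytes'
```

Pre-loading the cache while streaming demand is low is what lets the edge serve the given video segment without a round trip back to the origin server.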
Referring to FIGs. 3A, 3B, 3C and 3D, FIG. 3A illustrates an exemplary layout of an environment represented in a three-dimensional (3D) video, while FIGs. 3B-
30 3D illustrate exemplary images representing different views of the environment
from different perspectives with respect to a starting point of the 3D video, in accordance with an embodiment of the present disclosure. With reference to FIG. 3A, the environment is shown to be a museum having three rooms 302a-c of different sizes. The room 302a is a first room of the museum, and has an entrance
5 gate 304 of the museum. The room 302a also has two paintings 306a-b arranged on a wall of the room 302a, and one painting 306c arranged on another wall of the room 302a. The room 302a is shown to be connected to the room 302b via a passage 308a. The room 302b is a second room of the museum. The room 302b has two sculptures 310a-b arranged in two different corners of the room 302b,
10 one painting 306d arranged on a wall of the room 302b, and a window 312 on another wall of the room 302b. The room 302b is shown to be connected to the room 302c via a passage 308b. The room 302c is a third room of the museum, and has an exit gate 314 of the museum. The room 302c has one sculpture 310c arranged in a central region of the room 302c, three paintings 306e-g arranged on
15 a wall of the room 302c, and two paintings 306h-i arranged on another wall of the room 302c.
It will be appreciated that a trajectory (shown using a dashed line with arrows) from the entrance gate 304 to the exit gate 314 represents a complete path along which a virtual tour of the museum (represented in the 3D video) is provided, as
20 the 3D video, to the user of the client device 204. As an example, a location 'X' in the room 302b may correspond to a starting point of the 3D video from which interactive streaming of the 3D video is to be enabled. At least one virtual camera may be employed for capturing a plurality of images (for example, such as depicted in FIGs. 3B-3D) from the location 'X', the plurality of images
25 representing different views of the museum. The plurality of images are utilized for producing a video segment that is to be played at the client device 204 until the 3D video begins to play at the client device 204, from the starting point.
With reference to FIG. 3B, there is shown a first image representing a view of the museum from a first perspective at the location 'X'. The first image is shown to
30 represent objects, for example, such as a wall 316a of the room 302b, the passage
308a, portions of the two paintings 306a-b arranged on a wall of the room 302a, and some floor space of the rooms 302a-b.
With reference to FIG. 3C, there is shown a second image representing a view of the museum from a second perspective at the location 'X', the second perspective
5 being different from the first perspective. The second image is shown to represent objects, for example, such as walls 316a and 316b of the room 302b, the sculpture 310a arranged in a corner of the room 302b, the painting 306d, a portion of the passage 308a, a portion of the painting 306a in the room 302a, and some floor space of the rooms 302a-b.
10 With reference to FIG. 3D, there is shown a third image representing a view of the museum from a third perspective at the location 'X', the third perspective being different from both the first and second perspectives. The third image is shown to represent objects, for example, such as a wall 316c of the room 302b, the sculpture 310b arranged in a corner of the room 302b, the window 312, and some
15 floor space of the room 302b.
FIGs. 3A-3D are merely examples, which should not unduly limit the scope of the claims herein. The person skilled in the art will recognize many variations, alternatives, and modifications of embodiments of the present disclosure.
The present disclosure also relates to the system 200 as described above. Various
20 embodiments and variants disclosed above, with respect to the aforementioned method, apply mutatis mutandis to the system 200.
Optionally, the system 200 further comprises a data repository communicably coupled to the at least one server 202, wherein the at least one server 202 is configured to store, at the data repository, at least one of: the 3D model of the
25 environment, the 3D video, the at least one video segment, a list of identified starting points from which the interactive streaming of the 3D video is to be enabled. Optionally, the at least one server 202 is further configured to store, at
the data repository, at least one of: the reprojection information, the 3D textual matrix of the environment.
Optionally, the at least one server 202 is configured to:
- generate reprojection information indicative of at least one of: a rotation, a
5 translation, to be applied to one or more of the plurality of images in the given video segment; and
- send the reprojection information to the client device 204, wherein the reprojection information is to be utilized at the client device 204 to generate additional images that are to be added to the given video segment prior to playing
10 the given video segment at the client device 204.
Optionally, the at least one server 202 is configured to send the at least one video segment corresponding to the at least one starting point of the 3D video to a cache memory of a content delivery network 206, wherein the client device 204 is communicably coupled to the content delivery network 206,
15 and wherein a server of the content delivery network 206 is configured to receive the input indicative of the given starting point from which the 3D video is to be streamed to the client device 204, and send the given video segment corresponding to the given starting point to the client device 204 and initialize streaming of the 3D video from the given starting point.
20 Optionally, in the system 200, the 3D video is a virtual-reality (VR) video.
Optionally, when obtaining the 3D model of the environment represented in the 3D video, the at least one server 202 is configured to receive the 3D model of the environment from a device at which the 3D video is created or from a device at which the 3D model is stored.
25 Optionally, in the system 200, the 3D model is in form of one of: a 3D polygonal mesh, a 3D point cloud, a voxel-based model, a mathematical 3D surface model, a 3D grid.
Optionally, when determining the at least one starting point of the 3D video, the at least one server 202 is configured to:
- create a 3D grid indicating spatial structuring of the environment represented in the 3D video, wherein the 3D grid comprises a plurality of cells;
5 - create a 3D textual matrix of the environment, wherein information pertaining to each cell of the plurality of cells is saved in the 3D textual matrix;
- select an actual start point of the 3D video as a default starting point;
- enable streaming of the 3D video from the default starting point and collect user interaction data corresponding to a time period of the streaming;
10 - analyse the user interaction data for predicting probabilities of a user interacting with the plurality of cells in future; and
- identify the at least one starting point, based on the probabilities that are predicted, wherein the at least one location in the 3D model that corresponds to the at least one starting point maps to at least one cell amongst the plurality of
15 cells.
Optionally, in the system 200, the at least one starting point is at least one of: an actual start point of the 3D video, a mid-point of the 3D video, an intermediate point along a length of the 3D video.
Optionally, when generating the at least one video segment corresponding to the at least one starting point using the 3D model, the at least one server 202 is configured to:
- identify, in the 3D model, the at least one location corresponding to the at least one starting point;
- control at least one virtual camera for capturing the plurality of images representing the plurality of views of the environment from the at least one location in the 3D model, by adjusting a height and/or a viewing direction of the at least one virtual camera at the at least one location; and
- utilise the plurality of images captured from the at least one location for producing the at least one video segment.
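A minimal sketch of the virtual-camera control step follows. The pose representation, the `eye_height` value, and the choice of equal yaw steps are illustrative assumptions; sweeping the viewing direction in equal steps is one simple way for the captured images to collectively cover 360 degrees around the location.

```python
import math

def plan_camera_poses(location, eye_height=1.6, num_views=8):
    # location is an (x, y, z) point in the 3D model that corresponds to a
    # starting point; eye_height and num_views are illustrative parameters.
    x, y, z = location
    poses = []
    for i in range(num_views):
        # Sweep the viewing direction in equal yaw steps so the captured
        # images collectively cover 360 degrees around the location.
        yaw = 2.0 * math.pi * i / num_views
        poses.append({
            "position": (x, y, z + eye_height),  # camera raised to eye height
            "view_dir": (math.cos(yaw), math.sin(yaw), 0.0),
            "yaw_deg": math.degrees(yaw),
        })
    return poses

poses = plan_camera_poses((2.0, 3.0, 0.0))
print(len(poses), [round(p["yaw_deg"]) for p in poses])
```

With eight views the poses are spaced 45 degrees apart; each pose would then be rendered against the 3D model to produce one image of the video segment.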
Optionally, when utilising the plurality of images captured from the at least one location for producing the at least one video segment, the at least one server 202 is configured to re-order the plurality of images, based on the plurality of views represented therein, to facilitate a smooth viewing experience of the at least one video segment.
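The re-ordering step can be sketched as follows; the per-image yaw metadata is a hypothetical representation of "the plurality of views represented therein", introduced only for illustration.

```python
def reorder_images(images):
    # Hypothetical metadata: each captured image records the yaw (degrees)
    # of the view it represents. Sorting by yaw makes the segment pan
    # smoothly around the location instead of jumping between views.
    return sorted(images, key=lambda img: img["yaw_deg"])

frames = [{"yaw_deg": 270.0, "id": "c"},
          {"yaw_deg": 0.0, "id": "a"},
          {"yaw_deg": 90.0, "id": "b"}]
print([f["id"] for f in reorder_images(frames)])
# → ['a', 'b', 'c']
```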
Optionally, in the system 200, the plurality of images representing a plurality of views of the environment from the at least one location collectively cover a 360- degree view of the environment from the at least one location.
Modifications to embodiments of the present disclosure described in the foregoing are possible without departing from the scope of the present disclosure as defined by the accompanying claims. Expressions such as "including", "comprising", "incorporating", "have", "is" used to describe and claim the present disclosure are intended to be construed in a non-exclusive manner, namely allowing for items, components or elements not explicitly described also to be present. Reference to the singular is also to be construed to relate to the plural.
CLAIMS
What is claimed is:
1. A method for interactive streaming of a three-dimensional (3D) video, the method comprising:
- obtaining a 3D model of an environment represented in the 3D video;
- determining at least one starting point of the 3D video from which the interactive streaming of the 3D video is to be enabled;
- generating at least one video segment corresponding to the at least one starting point using the 3D model, wherein the at least one video segment comprises a plurality of images representing a plurality of views of the environment from at least one location in the 3D model that corresponds to the at least one starting point;
- receiving an input indicative of a given starting point from which the 3D video is to be streamed to a client device; and
- sending a given video segment corresponding to the given starting point to the client device and initializing streaming of the 3D video from the given starting point, wherein the given video segment is to be played at the client device until the 3D video begins to play from the given starting point at the client device.
2. A method as claimed in claim 1, further comprising:
- generating reprojection information indicative of at least one of: a rotation, a translation, to be applied to one or more of the plurality of images in the given video segment; and
- sending the reprojection information to the client device, wherein the reprojection information is to be utilized at the client device for generating additional images that are to be added to the given video segment prior to playing the given video segment at the client device.
3. A method as claimed in claim 1 or 2, further comprising sending the at least one video segment corresponding to the at least one starting point of the 3D video to a cache memory of a content delivery network, wherein the client device is communicably coupled to the content delivery network, and wherein the step of receiving the input indicative of the given starting point from which the 3D video is to be streamed to the client device and the step of sending the given video segment corresponding to the given starting point to the client device and initializing streaming of the 3D video from the given starting point are implemented by a server of the content delivery network.
4. A method as claimed in any of claims 1-3, wherein the 3D video is a virtual-reality (VR) video.
5. A method as claimed in any of claims 1-4, wherein the step of obtaining the 3D model of the environment represented in the 3D video comprises receiving the 3D model of the environment from a device at which the 3D video is created or from a device at which the 3D model is stored.
6. A method as claimed in any of claims 1-5, wherein the step of determining the at least one starting point of the 3D video comprises:
- creating a 3D grid indicating spatial structuring of the environment represented in the 3D video, wherein the 3D grid comprises a plurality of cells;
- creating a 3D textual matrix of the environment, wherein information pertaining to each cell of the plurality of cells is saved in the 3D textual matrix;
- selecting an actual start point of the 3D video as a default starting point;
- enabling streaming of the 3D video from the default starting point and collecting user interaction data corresponding to a time period of the streaming;
- analysing the user interaction data for predicting probabilities of a user interacting with the plurality of cells in future; and
- identifying the at least one starting point, based on the probabilities that are predicted, wherein the at least one location in the 3D model that corresponds
to the at least one starting point maps to at least one cell amongst the plurality of cells.
7. A method as claimed in any of claims 1-6, wherein the step of generating the at least one video segment corresponding to the at least one starting point using the 3D model comprises:
- identifying, in the 3D model, the at least one location corresponding to the at least one starting point;
- controlling at least one virtual camera for capturing the plurality of images representing the plurality of views of the environment from the at least one location in the 3D model, by adjusting a height and/or a viewing direction of the at least one virtual camera at the at least one location; and
- utilising the plurality of images captured from the at least one location for producing the at least one video segment.
8. A method as claimed in claim 7, wherein the step of utilising the plurality of images captured from the at least one location for producing the at least one video segment comprises re-ordering the plurality of images, based on the plurality of views represented therein, to facilitate a smooth viewing experience of the at least one video segment.
9. A method as claimed in any of claims 1-8, wherein the plurality of images representing a plurality of views of the environment from the at least one location collectively cover a 360-degree view of the environment from the at least one location.
10. A system for interactive streaming of a three-dimensional (3D) video, the system comprising at least one server configured to:
- obtain a 3D model of an environment represented in the 3D video;
- determine at least one starting point of the 3D video from which the interactive streaming of the 3D video is to be enabled;
- generate at least one video segment corresponding to the at least one starting point using the 3D model, wherein the at least one video segment comprises a plurality of images representing a plurality of views of the environment from at least one location in the 3D model that corresponds to the at least one starting point;
- receive an input indicative of a given starting point from which the 3D video is to be streamed to a client device; and
- send a given video segment corresponding to the given starting point to the client device and initialize streaming of the 3D video from the given starting point, wherein the given video segment is to be played at the client device until the 3D video begins to play from the given starting point at the client device.