
Streaming An Event In A Virtual Environment

Abstract: Disclosed is a system for streaming an event in a virtual environment. The system may receive event data comprising venue data and broadcast data for an event. The broadcast data may comprise a live stream of the event. The system may generate a 3D digital twin of a venue where the event is taking place. Further, the system may identify one or more event objects in the venue. The system may also track movement of the one or more event objects across the live stream of the event. The system may then render a 3D digital twin of the venue based on a viewer position received from a user. The system may recreate the movement of the one or more event objects in the rendered 3D digital twin. [To be published with Figure 1]


Patent Information

Application #
202421018941
Filing Date
15 March 2024
Publication Number
15/2024
Publication Type
INA
Invention Field
COMMUNICATION
Status
Parent Application

Applicants

Quidich Innovation Labs Pvt. Ltd.
No 6, Keytuo Kondivita Rd, M.I.D.C, Andheri East, Mumbai Maharashtra India

Inventors

1. KULSHRESHTHA, Rahat
404, Rosa Alba, Next to Nahar International School, Chandivali, Andheri East 400072
2. MEHTA, Gaurav
601, Chester Supreme Pallacio Near Pancard Clubs, Baner
3. CHAUDHARY, Manuyash
1197 Sector -3 Rohtak Haryana India 124001
4. MOHAMMED TM, Thaha
Thaha Mohammed TM Thottivalappil Mangadavath, Nannamukku South 679575

Specification

Description:

FORM 2

THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003

COMPLETE SPECIFICATION
(See Section 10 and Rule 13)

Title of invention:
STREAMING AN EVENT IN A VIRTUAL ENVIRONMENT

APPLICANT:
Quidich Innovation Labs Pvt. Ltd.
Having address as:
No 6, Keytuo, Kondivita Rd, M.I.D.C, Andheri East, Mumbai, 400059

The following specification describes the invention and the manner in which it is to be performed.
PRIORITY INFORMATION
[001] This patent application does not take priority from any application.
TECHNICAL FIELD
[002] The present subject matter described herein, in general, pertains to multimedia technology and digital content creation and more specifically, pertains to streaming a live event in a virtual environment.
BACKGROUND
[003] How events are witnessed and consumed changes dramatically when new technologies emerge. With the advent of virtual environments and the increasing need for immersive experiences, there exists a burgeoning interest in mixing live broadcast data with virtual spaces to create dynamic, interactive, and engaging events. This convergence has led to the creation of numerous platforms seeking to bridge the gap between physical presence and digital immersion.
[004] Existing systems for broadcasting events in virtual environments struggle to offer real-time, high-quality experiences in which live broadcast data is flawlessly synchronised with virtual elements. These systems suffer from latency, lack of synchronisation, and restricted interactivity, which degrades the overall user experience.
SUMMARY
[005] Before the present systems and methods are described, it is to be understood that this application is not limited to the particular systems and methodologies described, as there can be multiple possible embodiments which are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present application. This summary is provided to introduce concepts related to systems and methods for streaming an event in a virtual environment and the concepts are further described below in the detailed description. This summary is not intended to identify essential features of the claimed subject matter nor is it intended for use in determining or limiting the scope of the claimed subject matter.
[006] In one implementation, a system for streaming an event in a virtual environment is disclosed. The system may comprise a processor and a memory coupled to the processor. The processor may be configured to execute program instructions stored in the memory. The system may receive event data comprising venue data and broadcast data. The broadcast data may include a live stream of an event and the venue data may include details about a venue where the event is taking place. Further, the system may generate a three dimensional digital twin of the venue based on the venue data. The three dimensional digital twin may be generated using at least one of computer vision, photogrammetry, and computer graphics techniques based on the event data. The three dimensional digital twin of the venue may be generated in a virtual environment. Furthermore, the system may detect one or more event objects present in the venue in the broadcast data. Subsequently, the system may track movement of the one or more event objects using a machine learning algorithm. The movement may correspond to a change in position of the one or more event objects and a change in pose of the one or more event objects. Further, the system may receive a viewer position to render the 3D digital twin of the venue based on the viewer’s position. Subsequently, the system may recreate the movement of the one or more event objects in the rendered 3D digital twin of the venue using a transformation model. The rendered 3D digital twin may be displayed to a user on multimedia devices including at least one of a mobile, a tablet, a Virtual Reality (VR) Headset, a television (TV), and an Augmented Reality (AR) Headset.
[007] In another implementation, a method for streaming an event in a virtual environment is disclosed. In order to stream an event in a virtual environment, initially, event data comprising broadcast data and venue data is received. The broadcast data may include a live stream of an event and the venue data may include details about a venue where the event is taking place. Further, a 3D digital twin of the venue may be generated based on the venue data comprising at least one of a name of the venue, measurements of the venue, one or more images of the venue, a location of the venue, and weather data corresponding to the location of the venue. Subsequently, one or more event objects present in the venue may be detected in the event data. Further, movement of the one or more event objects may be tracked in the event data. The movement may correspond to at least one of a change in position coordinates of the one or more event objects and a change in pose of the one or more event objects. Furthermore, a viewer position may be received to render the 3D digital twin of the venue based on the viewer position. Further, the movement of the one or more event objects may be recreated in the rendered 3D digital twin using a transformation model.
[008] In yet another implementation, a non-transitory computer readable medium embodying a program executable in a computing device for streaming an event in a virtual environment is disclosed. The program may comprise a program code for receiving event data comprising broadcast data and venue data. The broadcast data may include a live stream of an event and the venue data may include details about a venue where the event is taking place. The program may comprise a program code for generating a 3D digital twin of the venue. The program may comprise a program code for detecting one or more event objects in the venue. The program may comprise a program code for tracking movement of the one or more event objects. The program may comprise a program code for receiving a viewer position to render the 3D digital twin based on the viewer position. The program may comprise a program code for recreating the movement of the one or more event objects in the rendered 3D digital twin of the venue using a transformation model.
BRIEF DESCRIPTION OF THE DRAWINGS
[009] The foregoing detailed description of embodiments is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the disclosure, examples of the disclosure are shown in the present document; however, the disclosure is not limited to the specific methods and apparatus disclosed in the document and the drawings.
[0010] The detailed description is given with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to refer to like features and components.
[0011] Figure 1 illustrates a network implementation of a system for streaming an event in a virtual environment, in accordance with an embodiment of the present subject matter.
[0012] Figure 2 illustrates a flowchart of a method for streaming an event in a virtual environment, in accordance with an embodiment of the present subject matter.
[0013] Figure 3 illustrates an example of a stream in a virtual environment, in accordance with an embodiment of the present subject matter.
[0014] Figure 4 shows an example of a frame of a live stream, in accordance with an embodiment of the present subject matter.
[0015] Figure 5 illustrates exchange of data among one or more components of the system, in accordance with an embodiment of the present subject matter.
[0016] The figures depict an embodiment of the present disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION
[0017] Some embodiments of this disclosure, illustrating all its features, will now be discussed in detail. The words "comprising," "obtaining," "recreating," "rendering," "generating," "detecting," "tracking," "having," "containing," and "including," and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. Although any systems and methods similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present disclosure, the exemplary systems and methods are now described. The disclosed embodiments are merely exemplary of the disclosure, which may be embodied in various forms.
[0018] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure is not intended to be limited to the embodiments illustrated, but is to be accorded the widest scope consistent with the principles and features described herein.
[0019] Certain technical challenges exist in streaming an event in a virtual environment. One technical challenge faced in streaming an event in a virtual environment is that the virtual environment needs to accurately depict various aspects of the event. The solution presented in the embodiments herein is to use 3D modelling techniques to generate a 3D digital twin of a venue of the event. The 3D modelling techniques may include one or more machine learning algorithms that may be trained using images of the event received from one or more cameras. The detailed functioning is described below with the help of figures.
[0020] While aspects of the described system and method for streaming an event in a virtual environment may be implemented in any number of different computing systems, environments, and/or configurations, the embodiments are described in the context of the following exemplary system/s.
[0021] Referring to Figure 1, a network implementation 100 of a system 102 for streaming an event in a virtual environment is disclosed. One or more users may access the system 102 through one or more user devices 104-1, 104-2…104-N. The user devices 104-1, 104-2…104-N may be collectively referred to as the user devices 104.
[0022] Although the present disclosure is explained considering that the system 102 is implemented on a server, it may be understood that the system 102 may be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a virtual environment, a mainframe computer, a server, a network server, a cloud-based computing environment. The system 102 may be accessed by multiple users through one or more user devices 104-1, 104-2…104-N. In one implementation, the system 102 may comprise the cloud-based computing environment in which the user may operate individual computing systems configured to execute remotely located applications. Examples of the user devices 104 may include but are not limited to, a portable computer, a personal digital assistant, a handheld device, and a workstation. The user devices 104 are communicatively coupled to the system 102 through a network 106. In another implementation, the system 102 may be implemented on a user device 104 as a stand-alone system.
[0023] In one implementation, the network 106 may be a wireless network, a wired network, or a combination thereof. The network 106 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network 106 may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further, the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.
[0024] In one embodiment, the system 102 may include at least one processor 108, an input/output (I/O) interface 110, a memory 112, and a database 114. The at least one processor 108 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, Central Processing Units (CPUs), state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the at least one processor 108 is configured to fetch and execute computer-readable instructions stored in the memory 112.
[0025] The I/O interface 110 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 110 may allow the system 102 to interact with the user directly or through the client devices 104. Further, the I/O interface 110 may enable the system 102 to communicate with other computing devices, such as web servers and external data servers (not shown). The I/O interface 110 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface 110 may include one or more ports for connecting a number of devices to one another or to another server.
[0026] The memory 112 may include any computer-readable medium or computer program product known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, Solid State Disks (SSD), optical disks, and magnetic tapes. The memory 112 may include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types. The memory 112 may include programs or coded instructions that supplement applications and functions of the system 102. In one embodiment, the memory 112, amongst other things, serves as a repository for storing data processed, received, and generated by one or more of the programs or the coded instructions.
[0027] In an embodiment, for streaming an event in a virtual environment, a user may use the user device 104 to access the system 102 via the I/O interface 110. The user may register the user devices 104 using the I/O interface 110 in order to use the system 102. In one aspect, the user may access the I/O interface 110 of the system 102 to provide input to the system if required.
[0028] In order to stream an event in a virtual environment, initially, the system 102 may receive event data. The event data may comprise venue data and broadcast data. The venue data may include at least one of a name of the venue, measurements of a venue, one or more images of the venue, location of the venue, and weather data corresponding to the location of the venue. The one or more images may be at least one of two dimensional (2D) images and depth images captured using one or more image sensors. The broadcast data includes at least one of a live stream and time stamp data. The live stream may be received using video capture devices. The live stream may be captured using one or more cameras strategically placed in the venue. The live stream may be transmitted from the one or more cameras to a broadcasting station. In an embodiment, the live stream may be transmitted to a cloud server. Further, the system may fetch the live stream from the cloud server for analysis and virtual content generation. The transmission may be done using at least one of wired connections, fibre optics, and wireless technologies. In an embodiment, the live stream may be compressed for transmission. The time stamp data may be used to analyse the live stream by splitting the video into one or more image frames arranged in a sequence based on the time stamp data.
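By way of a non-limiting illustration only, the following Python sketch shows one way the time stamp data may be used to split a fetched stream into an ordered sequence of image frames; the OpenCV-based decoding and the function name are illustrative assumptions rather than a prescription of the system's implementation.

```python
import cv2

def split_stream_into_frames(stream_source: str):
    """Decode a stream into (timestamp_ms, frame) pairs ordered by time stamp."""
    capture = cv2.VideoCapture(stream_source)
    frames = []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        # CAP_PROP_POS_MSEC reports the time stamp of the frame just decoded.
        frames.append((capture.get(cv2.CAP_PROP_POS_MSEC), frame))
    capture.release()
    # Arrange the image frames in a sequence based on the time stamp data.
    frames.sort(key=lambda pair: pair[0])
    return frames
```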
[0029] In an embodiment, the venue data may be received from a database based on the broadcast data using a venue detection model. The database may comprise venue data corresponding to a plurality of venues. The system 102 may analyse the broadcast data using the venue detection model to identify the venue from the database in order to receive the corresponding venue data. The venue detection model may be trained using a training dataset comprising a plurality of image frames from a plurality of events annotated with corresponding venues. The plurality of events may have taken place at a plurality of venues. The venue detection model may analyse image frames of the live stream from the broadcast data and identify the venue of the event based on the analysis. In an embodiment, the venue detection model may identify the venue by comparing one or more features from the image frames of the live stream with the plurality of image frames in the training dataset. The venue detection model may determine similarity between the one or more features in the image frames of the live stream and the plurality of image frames from the training dataset. Further, the venue detection model may identify the image frame having the highest similarity and identify the venue corresponding to the live stream based on the annotated venue in that image frame.
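As a non-limiting sketch of the feature-comparison embodiment above, the following snippet matches ORB descriptors of a live-stream image frame against reference images annotated with venues; the choice of ORB with brute-force matching, and the function names, are hypothetical stand-ins for the trained venue detection model.

```python
import cv2

def identify_venue(frame_bgr, reference_images):
    """reference_images: iterable of (venue_name, image_bgr) pairs, each
    annotated with the venue it depicts."""
    orb = cv2.ORB_create(nfeatures=1000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, frame_desc = orb.detectAndCompute(gray, None)
    best_venue, best_score = None, 0
    for venue_name, image_bgr in reference_images:
        ref_gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        _, ref_desc = orb.detectAndCompute(ref_gray, None)
        if frame_desc is None or ref_desc is None:
            continue
        # More descriptor matches -> higher similarity to this reference venue.
        score = len(matcher.match(frame_desc, ref_desc))
        if score > best_score:
            best_venue, best_score = venue_name, score
    return best_venue
```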
[0030] The venue data from the event data may be used to generate a Three Dimensional (3D) digital twin of the venue. The 3D digital twin may be a replica of the venue in a virtual environment, such as a metaverse. The 3D digital twin may be generated using at least one of computer vision, photogrammetry, and computer graphics techniques. In an embodiment, the venue data and the live stream may be used to generate the 3D digital twin. In another embodiment, the 3D digital twin may be created manually from images of the venue captured using an image sensor. The image sensor may be attached to a drone to capture images of the venue. In yet another embodiment, one or more images of the venue may be extracted from the live stream to capture a plurality of features and objects that may be missing from the venue data. One or more machine learning algorithms including Neural Radiance Fields (NeRF), Multi-View Stereo (MVS), and the like may be used to convert the venue data into the 3D digital twin of the venue.
[0031] Further, the system may detect event objects. The event objects may be objects present in the venue during the event. The event objects may include at least one of one or more players, one or more audience members, one or more equipment items, and the like. For example, consider a cricket match being broadcast as the live stream. The event objects may include the cricket team players, the crowd, the stands, the cricket field, the boundary, the cricket ball, the wickets, the inner circle, and the umpires. In an embodiment, the one or more event objects may be detected using object detection techniques including Convolutional Neural Networks (CNN). The CNN may be trained using training data comprising one or more live streams corresponding to one or more events and annotated event objects in image frames of those live streams. In an embodiment, the system may be trained to detect events based on the event objects detected. For example, during a drinks break in a cricket match, the number of event objects increases because additional personnel enter the field.
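A minimal sketch of CNN-based event-object detection follows, assuming a generic pretrained Faster R-CNN from torchvision as a stand-in for the event-specific detector described above; in practice the network would be fine-tuned on the annotated training data.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

# A generic pretrained detector stands in for the trained CNN; a production
# system would fine-tune it on frames annotated with event-specific objects
# (players, wickets, the ball, and so on).
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_event_objects(frame_rgb, score_threshold=0.7):
    """Return bounding boxes and class labels for one image frame."""
    with torch.no_grad():
        prediction = model([to_tensor(frame_rgb)])[0]
    keep = prediction["scores"] >= score_threshold
    return prediction["boxes"][keep], prediction["labels"][keep]
```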
[0032] In another embodiment, the system may receive a list of event objects. Further, the system may use one or more internet sources to obtain a set of images for each event object in the list. The system may then use a plurality of image processing techniques to detect the listed event objects in the live stream based on the set of images obtained. The image processing techniques may include Histogram of Oriented Gradients, Haar Cascades, and feature extraction and feature matching using Convolutional Neural Networks.
[0033] Further to detecting the one or more event objects, the system may track movement of the one or more event objects present in the venue. The system may continuously analyse the event data, specifically the live stream of the event from the broadcast data. The system may process each image frame of the live stream based on time stamps associated with the image frames. The system may identify the first image frame of the live stream and determine position coordinates of the one or more event objects in that frame. The system may continuously determine position coordinates of the one or more event objects in the subsequent image frames of the live stream. Further, the system may continuously match features of the event objects in an image frame with the features of the event objects in subsequent image frames to detect the appearance of a new event object or the disappearance of an event object.
[0034] In an embodiment, the system may predict motion of the one or more event objects based on a change in the position coordinates of the event objects between two or more successive image frames. The system may use the predicted motion to estimate position coordinates of the one or more event objects in a new image frame of the live stream. The system may then detect the one or more event objects in the new image frame and compare the position coordinates of the detected one or more event objects with the predicted position coordinates to accurately determine the change in the position coordinates of the one or more event objects. The system may use one or more algorithms such as Kalman filters, and Particle filters to predict motion of the one or more event objects and continuously improve accuracy of the one or more algorithms by using a feedback loop. The feedback loop may be formed by providing difference between the predicted position coordinates of the one or more event objects and the position coordinates of the detected one or more event objects in the new image frame as an input to the one or more algorithms.
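The predict-detect-correct feedback loop described above may be illustrated with a minimal constant-velocity Kalman filter; the state layout, noise parameters, and class name below are illustrative assumptions, not the system's specified algorithm.

```python
import numpy as np

class ConstantVelocityKalman:
    """Track one event object's (x, y) position with a constant-velocity model."""

    def __init__(self, dt=1 / 30, process_var=1.0, measurement_var=10.0):
        self.x = np.zeros(4)                                  # state: [x, y, vx, vy]
        self.P = np.eye(4) * 500.0                            # state covariance
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt                      # constant-velocity motion
        self.H = np.zeros((2, 4))
        self.H[0, 0] = self.H[1, 1] = 1.0                     # we observe position only
        self.Q = np.eye(4) * process_var
        self.R = np.eye(2) * measurement_var

    def predict(self):
        """Predict position coordinates in the next image frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, measured_xy):
        """Feed back the detected coordinates to correct the prediction."""
        # The residual (difference between predicted and detected coordinates)
        # is the feedback input that continuously improves accuracy.
        residual = measured_xy - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ residual
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

In use, predict() would be called once per new image frame and update() once per detection, so the filter's error shrinks as the feedback loop runs.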
[0035] In an embodiment, the system may use data association techniques including Nearest Neighbor and Multiple Assignment Points to consistently track the one or more event objects across all image frames of the live stream.
[0036] Further, the system may track movement of the one or more event objects based on the tracking of the event objects. The movement may correspond to at least one of a change in position coordinates of the one or more event objects and a change in pose of the one or more event objects. The change in position coordinates of the one or more event objects may be determined by continuously tracking the one or more event objects as explained above. The position coordinates of the one or more event objects may correspond to at least one of a location in the venue and pixel coordinates in an image frame of the live stream.
[0037] In an embodiment, the system may use advanced image processing to detect and track change in pose of the one or more event objects. The pose of the one or more event objects may correspond to the orientation of the one or more event objects. The system may determine a set of event objects from the one or more event objects to track pose of the set of event objects based on one or more features of the one or more event objects. The features of the one or more event objects may include at least one of keypoints/local descriptors, edges and corners, textures, shapes, and the like. The features of the one or more event objects may be determined and extracted from the image frames of the live stream using image processing techniques including at least one of Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Oriented FAST and Rotated BRIEF (ORB), Gabor filters, Local Binary Patterns, Binary Robust Invariant Scalable Keypoints, Histograms of Oriented Gradients (HOG) and the like.
[0038] In an embodiment, the system may use human pose estimation algorithms like Metrabs that involve use of Metric-Scale Truncation-Robust Heatmaps to achieve accurate pose estimation without requiring prior knowledge of distance or relying on specific test-time information. In an embodiment, the system may train deep learning algorithms to identify poses specific to the event being broadcasted.
[0039] In an embodiment, to track the change in pose, the system may use the detected features in image frames of the live stream to estimate motion of the features between successive image frames. The estimated motion of the features indicates a change in the object's pose. Further, the system may estimate a new pose of the one or more event objects in a new image frame using algorithms including Perspective-n-Point, Random Sample Consensus, and the like. Further, the system may utilize methods including Kalman Filters and sequential estimation for continuously estimating and tracking the pose of the one or more event objects in a live stream.
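As a non-limiting sketch of this step, the following snippet applies OpenCV's Perspective-n-Point solver with Random Sample Consensus to matched 2D-3D keypoints; the function signature and the zero-distortion assumption are illustrative.

```python
import cv2
import numpy as np

def estimate_object_pose(object_points, image_points, camera_matrix):
    """object_points: Nx3 known 3D keypoints on the event object (model frame).
    image_points: Nx2 matching 2D keypoints detected in the current frame.
    Returns the object's rotation matrix and translation relative to the camera."""
    dist_coeffs = np.zeros(5)  # illustrative assumption: undistorted broadcast feed
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_points.astype(np.float32), image_points.astype(np.float32),
        camera_matrix, dist_coeffs, reprojectionError=4.0)
    if not ok:
        return None
    rotation, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return rotation, tvec
```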
[0040] The system may render the 3D digital twin of the venue based on a viewer position. In an embodiment, the system may receive the viewer position from a user. In an embodiment, the system may have one or more predetermined viewer positions for a user to choose from. In another embodiment, the system may determine an ideal viewer position based on at least one of a type of event being live streamed, a device used by the user to view the live stream, historic data of the viewer positions selected by the user, and the like. To render the 3D digital twin based on the viewer position, the system may use one or more 3D modelling techniques including computer vision, photogrammetry, Neural Radiance Fields, and the like. The system may gather information points of the venue from at least one of the venue data and the broadcast data comprising the live stream. The information points may include one or more features of the one or more event objects and corresponding 3D coordinates. The system may generate a point cloud of the venue based on the information points using 3D reconstruction techniques such as Structure from Motion. Further, the system may generate the 3D digital twin based on the viewer position, the information points, and the point cloud by creating a 3D mesh of the venue using surface reconstruction algorithms. The system may map textures on the 3D mesh based on the information points gathered by processing the live stream. In an embodiment, the system may use refinement and optimization techniques for smoothening surfaces and minimizing modelling errors.
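A minimal sketch of the point-cloud-to-mesh step follows, assuming the Open3D library and Poisson surface reconstruction as one possible surface reconstruction algorithm; the information points are assumed to be available as arrays of 3D coordinates and colours.

```python
import open3d as o3d

def mesh_from_information_points(points_xyz, colors_rgb):
    """points_xyz: Nx3 array of 3D coordinates gathered from the venue data
    and the live stream; colors_rgb: Nx3 colour values in [0, 1]."""
    cloud = o3d.geometry.PointCloud()
    cloud.points = o3d.utility.Vector3dVector(points_xyz)
    cloud.colors = o3d.utility.Vector3dVector(colors_rgb)
    cloud.estimate_normals()  # Poisson reconstruction requires oriented normals
    mesh, _densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        cloud, depth=9)
    return mesh
```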
[0041] In an embodiment, the system may use a transformation model to render the 3D digital twin of the venue based on the viewer position. The transformation model may be trained using a training database comprising a plurality of videos of an event, a plurality of 3D digital twins of venues in the plurality of videos, a plurality of viewer positions in the venues, and corresponding modified 3D digital twins of the venues based on the viewer positions. The transformation model may use a mathematical equation to modify the 3D digital twins of the venues while maintaining accurate scale of the event objects in the 3D digital twins and accurate position and orientation of the event objects.
[0042] In an embodiment, the viewer position may be dynamic. For example, let us consider that a cricket match is the event being streamed. The viewer position may be the cricket ball. In the above example, the viewer position may continuously change, and the system may continuously render the 3D digital twin of the venue for every new viewer position using the transformation model.
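As an illustration of such a dynamic viewer position, the following sketch builds a conventional look-at view matrix that may be re-evaluated each frame at the tracked ball's coordinates; the matrix convention and function name are generic graphics practice assumed for illustration, not the claimed transformation model.

```python
import numpy as np

def look_at(viewer_position, target, up=np.array([0.0, 1.0, 0.0])):
    """Build a view matrix for a camera at viewer_position looking at target.
    With a dynamic viewer position (e.g. the tracked cricket ball), this is
    recomputed for every new frame as the position changes."""
    forward = target - viewer_position
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = right, true_up, -forward
    view[:3, 3] = view[:3, :3] @ -viewer_position  # move world into camera space
    return view
```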
[0043] Further to rendering the 3D digital twin of the venue based on the viewer position, the system may recreate the movement of the one or more event objects tracked by the system. The system may use the transformation model to recreate the movement of the one or more event objects in the rendered 3D digital twin of the venue. The transformation model may be trained to determine a scale coefficient, a rotation coefficient by comparing the generated 3D digital twin of the venue and the 3D digital twin of the venue rendered based on the viewer position. Further, the transformation model may be trained to calibrate the change in position and the change in pose of the one or more event objects, in the rendered 3D digital twin of the venue, based on the scale coefficient and the rotation coefficient. The scale coefficient may correspond to a change in size of the one or more event objects and the rotation coefficient may correspond to a change in perspective of viewing the event objects resulting in a change in features such as shape and size of the one or more event objects. The system may then recreate the movement of the one or more event objects based on the change in position of the one or more event objects, the change in pose of the one or more event objects, the scale coefficient, and the rotation coefficient.
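The scale and rotation coefficients described above may be viewed as the parameters of a similarity transform. The following sketch estimates them from corresponding object positions in the generated and rendered twins using a Kabsch/Umeyama-style fit, which is one conventional technique assumed here for illustration rather than the trained transformation model itself.

```python
import numpy as np

def fit_scale_rotation(source_pts, target_pts):
    """Estimate the scale coefficient and rotation matrix mapping object
    positions in the generated twin onto the rendered twin (Nx3 arrays of
    corresponding points)."""
    src_mean, tgt_mean = source_pts.mean(axis=0), target_pts.mean(axis=0)
    src_c, tgt_c = source_pts - src_mean, target_pts - tgt_mean
    U, S, Vt = np.linalg.svd(tgt_c.T @ src_c)
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:
        D[2, 2] = -1.0  # guard against reflections
    rotation = U @ D @ Vt
    scale = (S * np.diag(D)).sum() / (src_c ** 2).sum()
    return scale, rotation, src_mean, tgt_mean

def plot_in_rendered_twin(point, scale, rotation, src_mean, tgt_mean):
    """Calibrate one tracked object position into the rendered 3D digital twin."""
    return scale * (rotation @ (point - src_mean)) + tgt_mean
```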
[0044] The system may recreate the movement of the one or more event objects by plotting the one or more event objects in the rendered 3D digital twin of the venue. The one or more event objects may be plotted based on at least one of the one or more features of the one or more event objects, the scale coefficients, the rotation coefficients, the change in position of the one or more event objects, and the change in pose of the one or more event objects. The system may use the transformation model to plot the one or more event objects.
[0045] In an embodiment, the system may filter the event objects based on a predefined set of event objects. The filtered event objects may be recreated with randomized motion and features. For example, to recreate the audience in a cricket match, the system may create random human-like figures in the area where the audience is detected and add randomized motion to the created figures, as sketched below.
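A minimal sketch of such randomized recreation follows; the region bounds, figure count, and jitter scale are hypothetical parameters.

```python
import numpy as np

def randomized_crowd(region_min, region_max, count=200, jitter=0.05):
    """Yield per-step positions of placeholder human-like figures placed
    inside the detected audience region, each with a small random sway.
    region_min and region_max are 3-vectors bounding the audience area."""
    rng = np.random.default_rng()
    positions = rng.uniform(region_min, region_max, size=(count, 3))
    while True:
        positions += rng.normal(scale=jitter, size=positions.shape)
        positions = np.clip(positions, region_min, region_max)
        yield positions.copy()
```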
[0046] Referring now to Figure 2, a method 200 for streaming an event in a virtual environment is shown, in accordance with an embodiment of the present subject matter. The method 200 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types.
[0047] The order in which the method 200 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 200 or alternate methods for streaming an event in a virtual environment. Furthermore, the method 200 for streaming an event in a virtual environment can be implemented in any suitable hardware, software, firmware, or combination thereof. However, for ease of explanation, in the embodiments described below, the method 200 may be considered to be implemented in the above-described system 102.
[0048] At 202, event data is received from at least one of a database, one or more internet sources, a broadcasting center, a video camera, and a cloud server. The event data may be received by the system using the processor 108. Further, the event data may temporarily be stored in the memory 112. The event data may comprise details about the live event being broadcast, including venue data and broadcast data. The venue data includes at least one of measurements of a venue, one or more images of the venue, and a location of the venue. The broadcast data includes at least one of a live stream and time stamp data. Let us consider an example of a cricket match being the live event. The venue may be a cricket stadium. The measurements of the cricket stadium may include dimensions of the stadium building, dimensions of the playing field, size of the cricket pitch, shape of the boundary, radius of the boundary, and the like. One or more images of the stadium may be captured using a drone and an imaging device. The images may be captured from multiple angles to cover the entire stadium. In an embodiment, only one image of the stadium may be captured from a top angle. The location of the stadium may be provided as at least one of the name of the stadium, the city, and the geographical coordinates of the stadium. The location of the stadium may be used to find images of the stadium from one or more internet sources. The live stream may be received from one or more cameras recording the cricket match. The time stamp data may be provided along with the live stream for synchronizing the image frames of the live stream in order to track the movement of the event objects.
[0049] At 204, a three dimensional (3D) digital twin is generated based on the venue data by the processor. The method may comprise employing at least one of computer vision, photogrammetry, and computer graphics techniques to generate the 3D digital twin. The 3D digital twin may be a digital twin of the venue. Considering the cricket match being the live event, the 3D digital twin may be a 3D model of the cricket stadium where the cricket match is being played. The 3D model may be a scaled replica of the cricket stadium. The 3D model may include all the objects seen in the cricket stadium, including the audience in the stands, the flood lights, the wickets, the players, and the like.
[0050] At 206, one or more event objects may be detected in the image frames of the live stream by the processor. The one or more event objects may be detected using object detection algorithms. In an embodiment, the one or more event objects may be determined based on a list of event objects.
[0051] At 208, movement of the one or more event objects may be tracked in the live stream, by the processor, using image processing techniques. The movement of the one or more event objects may correspond to a change in position of the one or more event objects and a change in pose of the one or more event objects. The movement of the one or more event objects may be tracked by determining position coordinates of the one or more event objects in each image frame of the live stream and comparing the position coordinates of the one or more event objects in successive image frames. The successive image frame for a first image frame may be determined based on the time stamp data and a frame rate of the live stream. The frame rate of the live stream corresponds to the number of image frames transmitted in a second.
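By way of illustration, the successive image frame and the resulting change in position coordinates may be computed as in the following sketch; the data layout and function names are illustrative assumptions.

```python
def successive_frame_index(first_timestamp_s, timestamp_s, frame_rate_fps):
    """Locate a frame in the sequence from its time stamp and the frame rate
    (the number of image frames transmitted in a second)."""
    return round((timestamp_s - first_timestamp_s) * frame_rate_fps)

def movement_between_frames(positions_by_frame, index):
    """Change in position coordinates of each event object between one image
    frame and its successor; positions_by_frame maps a frame index to a dict
    of {object_id: (x, y)} coordinates."""
    current = positions_by_frame[index]
    following = positions_by_frame[index + 1]
    return {obj: (following[obj][0] - xy[0], following[obj][1] - xy[1])
            for obj, xy in current.items() if obj in following}
```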
[0052] At 210, a viewer position may be received by the system using the I/O interface 110. The viewer position may correspond to viewer’s choice of a location in the venue to consume the live stream in the virtual environment. In an embodiment, an ideal viewer position may be determined for a viewer using one or more algorithms. The ideal viewer position may be determined based on the type of event being streamed. The type of event may be one of a sport event, a cinematic event, an artistic event, and the like. In another embodiment, the ideal viewer position may be determined based on historical data comprising viewer positions selected by a user in the past.
[0053] At 212, the 3D digital twin of the venue may be rendered, by the processor, based on the viewer position. The 3D digital twin of the venue may be modified by changing one or more features of the one or more event objects using a transformation model. The transformation model may be trained to render a 3D digital twin of the venue based on a plurality of 3D digital twins of the venue rendered corresponding to a plurality of locations in the venue.
[0054] At 214, the movement of the one or more event objects is recreated, by the processor, in the 3D digital twin of the venue rendered based on the viewer position. The movement of the one or more event objects may be calibrated based on a scale coefficient and a rotation coefficient. The scale coefficient may correspond to a change in size of the one or more event objects and the rotation coefficient corresponds to a change of perspective between the generated 3D digital twin of the venue and the rendered 3D digital twin of the venue. A plurality of algorithms and image processing techniques may be used to recreate the movement of the one or more event objects.
[0055] Referring now to Figure 3, a snapshot 300 of the stream of the event in the virtual environment from a viewer position is shown. A cricket match is being streamed in the virtual environment. The viewer position is selected as a location in the venue a few feet above the ground near a point on the boundary. A few event objects including a stadium 302, a cricket pitch 306, and a boundary 304 are visible.
[0056] Referring now to Figure 4, a snapshot 400 of the live stream is shown. Figure 3 is a view, from the viewer position selected by the user, of the moment captured in Figure 4. The cricket pitch 306 can be seen in Figure 4 as transmitted in the live stream.
[0057] Referring now to Figure 5, data flow diagram 500 explaining data interactions among various components used for implementation of the system to stream an event in a virtual environment is illustrated, in accordance with an embodiment of the subject matter. Initially, a live stream is received from one or more cameras 502 at a broadcasting center 504. The live stream from multiple cameras may be manipulated and merged to generate a single stream for a user. The single stream may be uploaded to a cloud server 506. In an embodiment, the live stream may be directly uploaded on the cloud server 506. The live stream may be fetched from the cloud server using a video capture device 508. Further, the live stream may then be processed by a 3D digital twin generator 510.
[0058] In an embodiment, the 3D digital twin generator may fetch one or more images by converting the live stream into image frames. The one or more images may be used to determine venue data, and the venue data may further be used to generate the 3D digital twin. The 3D digital twin generator may use a combination of machine learning algorithms for object tracking and 3D reconstruction algorithms.
[0059] In another embodiment, the system 102 may receive the venue data including images of the venue, dimensions of the venue, and the like. The 3D digital twin generator may generate the 3D digital twin using the venue data based on one or more machine learning techniques including 3D reconstruction, computer vision, photogrammetry, and the like.
[0060] The live stream and the 3D digital twin may be received by a video manipulator 512. The video manipulator may project the live stream into a virtual environment comprising the 3D digital twin of the venue using a transformation model. The video manipulator 512 may comprise a database of training data for the one or more machine learning algorithms used by the transformation model and the execution steps, and a processor for executing the one or more machine learning algorithms to project the live stream into the virtual environment. In an embodiment, the user may provide input comprising a viewer position and a playback speed using a user device 518. The user input may be transmitted to the video manipulator from the cloud server, or the user input may be received by the system 102 directly. The video manipulator may use the transformation model to render the 3D digital twin based on the user input.
[0061] The output of the video manipulator may comprise the movements of the one or more event objects, identified by a machine learning model, recreated as a 3D projection in the 3D digital twin by applying the transformation model to the live stream. The output of the video manipulator may be uploaded to the cloud server to be transmitted to the user devices for viewing the live stream in the virtual environment. In an embodiment, the output of the video manipulator may be directly transmitted or conveyed to the user devices.
[0062] The embodiments of the system and the method described above are explained considering an example of a sporting event. The systems and the methods may be used for any live streamed event including media events, political events, technological events, community events, and the like.
[0063] Exemplary embodiments discussed above may provide certain advantages. Though not required to practice aspects of the disclosure, these advantages may include the following.
[0064] Some embodiments of the system and the method may provide a better viewing experience of a live event to the audience.
[0065] Some embodiments of the system and the method may provide abilities to interact with a live stream of an event.
[0066] Some embodiments of the system and the method may provide better viewing experience and better understanding by allowing a user to select viewpoints while watching the live event recreated in the 3D digital twin and by allowing the broadcasters to add infographics to a live stream.
[0067] Some embodiments of the system and the method may provide detailed data of every moment in a live event. The data may be recorded as 3D reconstruction of each moment of the live event. Intricate data such as distance measurements between objects at a particular moment of the live event may be extracted using the 3D reconstruction and mathematical algorithms.
[0068] Although implementations for methods and system for streaming an event in a virtual environment have been described in language specific to structural features and/or methods, it is to be understood that the appended claims are not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations for streaming an event in a virtual environment.

Claims:

I/We Claim:
1. A method for streaming an event in a virtual environment, the method comprising:
receiving event data;
generating a three dimensional (3D) digital twin of a venue based on the event data;
detecting one or more event objects present in the venue in the event data;
tracking movement of the one or more event objects in the event data, wherein the movement of the one or more event objects corresponds to at least one of a change in position coordinates of the one or more event objects and a change in pose of the one or more event objects;
receiving a viewer position corresponding to a viewer's choice of a location in the venue;
rendering the 3D digital twin of the venue based on the viewer position; and
recreating the movement of the one or more event objects in the rendered 3D digital twin of the venue using a transformation model.
2. The method as claimed in claim 1, wherein the event data comprises venue data and broadcast data.
3. The method as claimed in claim 2, wherein the venue data comprises at least one of name of the venue, measurements of the venue, one or more images of the venue, location of the venue, and weather data corresponding to the location of the venue.
4. The method as claimed in claim 2, wherein the venue data is received, based on the broadcast data using a venue detection model, from a database comprising the venue data corresponding to a plurality of venues.
5. The method as claimed in claim 2, wherein the broadcast data includes at least one of a live stream and time stamp data, and wherein the live stream is captured using one or more image sensors placed in the venue.
6. The method as claimed in claim 1, wherein the transformation model plots the one or more event objects, and wherein the transformation model is trained using a training dataset comprising a plurality of live streams, a plurality of event objects in the plurality of live streams, a plurality of 3D digital objects of venues corresponding to the plurality of live streams, and the plurality of event objects plotted in the corresponding plurality of 3D digital objects of the venues.
7. The method as claimed in claim 1, wherein the 3D digital twin is generated using at least one of computer vision, photogrammetry, and computer graphics techniques based on the event data.
8. The method as claimed in claim 1, wherein the 3D digital twin is rendered on multimedia devices including at least one of a mobile, a tablet, a Virtual Reality (VR) Headset, a television (TV), and an Augmented Reality (AR) Headset.
9. The method as claimed in claim 1, wherein the movement of the one or more event objects is tracked using a machine learning algorithm.
10. The method as claimed in claim 9, wherein the machine learning algorithm is trained to detect at least one of change in pose of the one or more event objects and change in position coordinates of the one or more event objects, and wherein the position coordinates of the one or more event objects are determined using an image processing algorithm.
11. The method as claimed in claim 1, wherein the 3D digital twin of the venue is generated in the virtual environment.
12. A system for streaming an event in a virtual environment, the system comprising:
a memory; and
a processor coupled to the memory, wherein the processor is configured to execute program instructions stored in the memory for:
receiving event data;
generating a three dimensional (3D) digital twin of a venue based on the event data;
detecting one or more event objects in the broadcast data, wherein the one or more event objects include objects present in the venue;
tracking movement of the one or more event objects in the event data, wherein the movement of the one or more event objects corresponds to at least one of a change in position coordinates of the one or more event objects and a change in pose of the one or more event objects;
receiving a viewer position corresponding to a viewer's choice of a location in the venue;
rendering the 3D digital twin of the venue based on the viewer position; and
recreating the movement of the one or more event objects in the rendered 3D digital twin of the venue using a transformation model.
13. A non-transitory computer program product having embodied thereon a computer program for streaming an event in a virtual environment, the non-transitory computer program product storing instructions for:
receiving event data;
generating a three dimensional (3D) digital twin of a venue based on the event data;
detecting one or more event objects in the broadcast data, wherein the one or more event objects include objects present in the venue;
tracking movement of the one or more event objects in the event data, wherein the movement of the one or more event objects corresponds to at least one of a change in position coordinates of the one or more event objects and a change in pose of the one or more event objects;
receiving a viewer position corresponding to a viewer's choice of a location in the venue;
rendering the 3D digital twin of the venue based on the viewer position; and
recreating the movement of the one or more event objects in the rendered 3D digital twin of the venue using a transformation model.

Documents

Application Documents

# Name Date
1 202421018941-STATEMENT OF UNDERTAKING (FORM 3) [15-03-2024(online)].pdf 2024-03-15
2 202421018941-REQUEST FOR EARLY PUBLICATION(FORM-9) [15-03-2024(online)].pdf 2024-03-15
3 202421018941-POWER OF AUTHORITY [15-03-2024(online)].pdf 2024-03-15
4 202421018941-FORM-9 [15-03-2024(online)].pdf 2024-03-15
5 202421018941-FORM FOR STARTUP [15-03-2024(online)].pdf 2024-03-15
6 202421018941-FORM FOR SMALL ENTITY(FORM-28) [15-03-2024(online)].pdf 2024-03-15
7 202421018941-FORM 1 [15-03-2024(online)].pdf 2024-03-15
8 202421018941-FIGURE OF ABSTRACT [15-03-2024(online)].pdf 2024-03-15
9 202421018941-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [15-03-2024(online)].pdf 2024-03-15
10 202421018941-DRAWINGS [15-03-2024(online)].pdf 2024-03-15
11 202421018941-DECLARATION OF INVENTORSHIP (FORM 5) [15-03-2024(online)].pdf 2024-03-15
12 202421018941-COMPLETE SPECIFICATION [15-03-2024(online)].pdf 2024-03-15
13 Abstract.jpg 2024-04-06