
Methods And Systems For Generating Interactive Mixed Reality Content For Broadcasting Applications

Abstract: Embodiments provide methods and systems for generating interactive Mixed Reality (MR) content for broadcasting applications. The method includes accessing first media content including computer graphics (CG) content related to an event displayed to a first user and information related to the first user's interactions with one or more objects in the CG content, second media content including recorded content of the first user, and tracking information related to variations in the positions of at least one camera, the first user, and multiple local markers. The method includes generating third media content, including background content and at least one of foreground content and midground content, based on the first media content, the second media content, and the tracking information. Generating the third media content includes determining a relative position of the first user with respect to the first media content based on the tracking information, and switching between the foreground content and the midground content based on that relative position.


Patent Information

Application #:
Filing Date: 20 May 2022
Publication Number: 47/2023
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Parent Application:

Applicants

Star India Private Limited
Star House, Urmi Estate, 95 Ganpatrao Kadam Marg, Lower Parel (W) Mumbai 400 013, Maharashtra, India

Inventors

1. Saurabh Ranjan
Star House, Urmi Estate, 95 Ganpatrao Kadam Marg, Lower Parel (W), Mumbai 400 013, Maharashtra, India
2. Caroline Stedman
Star House, Urmi Estate, 95 Ganpatrao Kadam Marg, Lower Parel (W), Mumbai 400 013, Maharashtra, India
3. Prashant Khanna
Star House, Urmi Estate, 95 Ganpatrao Kadam Marg, Lower Parel (W), Mumbai 400 013, Maharashtra, India

Specification

TECHNICAL FIELD
[0001] The present disclosure generally relates to the delivery of digital content such as streaming content to content viewers, and more particularly, to methods and systems for generating interactive mixed reality content for broadcasting applications such as sports analysis-related programs or shows.
BACKGROUND
[0002] In broadcasting applications such as fact-based television shows (e.g., sports, weather, news, and science), Augmented Reality (AR) graphics are increasingly being used. Such graphics are used by presenters to explain scenarios and to demonstrate data points. For instance, presenters, while describing a sports event or specific details about players, the playing arena, and so on, can use AR content or Virtual Reality (VR) content in their description of the events. AR graphics are used in hard-set environments as well as in combination with virtual-studio green-screen environments. In existing setups, the usage of AR graphics has been subject to technological restrictions whereby the presenters are unable to see the graphics and have to rely on comfort monitors that show their proximity to the virtual objects. In other words, the presenters have to rely on their instinct and the comfort monitors to determine their relative position within the VR/AR content stream that is being streamed to the content viewers. Upon determining their position, the presenters may adjust it to better present themselves within the VR/AR content being streamed. As may be understood, this process is not accurate and often leads to a poor viewer experience for the content viewers.
[0003] Presenters have also been unable to interact directly with the graphics, so any dynamic changes to the graphics have to be implemented by an additional graphics operator or by a hardware device controlled by the presenter, such as a clicker or a gesture band worn on the hand or wrist. In addition, the graphics generally appear as an alpha layer rendered in front of the presenter or as a background layer in a green-screen set-up, where the presenter is keyed in front of the graphics. If the presenter moves around the studio space, there is a risk of unwanted occlusion of the virtual object that is meant to appear as a three-dimensional (3D) physical object. These limitations on the visibility of the graphics to the presenter, on interaction, and on free movement around a virtual object considerably restrict storytelling abilities, natural-looking interactions, and other visual aesthetics.
[0004] Thus, there exists a technological need for methods and techniques that allow the presenter to interact with graphical content generated by an external system and that stream such interactions so that spectators can experience them on hand-held devices and computers.
SUMMARY
[0005] Various embodiments of the present disclosure provide methods and systems for generating interactive Mixed Reality (MR) content for broadcasting applications.
[0006] In an embodiment, a computer-implemented method for generating interactive MR content for broadcasting applications is disclosed. The computer-implemented method is implemented by a system and includes accessing first media content from a database associated with the system. The first media content includes computer graphics (CG) content related to an event displayed to a first user and information related to interactions of the first user with a plurality of objects in the CG content. The computer-implemented method further includes accessing second media content from the database. The second media content includes recorded content including one or more actions and one or more positions of the first user. The recorded content is captured by at least one camera. Further, the computer-implemented method includes accessing tracking information from the database. The tracking information includes information related to variation in a position of the at least one camera and the first user and a plurality of local markers corresponding to a position of the plurality of objects. Furthermore, the computer-implemented method includes generating third media content to be streamed to a plurality of second users based, at least in part, on the first media content, the second media content, and the tracking information. The third media content includes background content and at least one of foreground content and midground content. Further, generating the third media content includes determining a relative position of the first user with respect to the first media content based, at least in part, on the tracking information, and switching between the foreground content and the midground content on top of the background content based, at least in part, on the relative position of the first user with respect to the first media content.
[0007] In another embodiment, a system is disclosed. The system includes a memory module configured to store instructions. The system also includes a processor in communication with the memory module. The processor is configured to execute the instructions stored in the memory module and thereby cause the system, at least in part, to access first media content from a database associated with the system. The first media content includes computer graphics (CG) content related to an event displayed to a first user and information related to interactions of the first user with a plurality of objects in the CG content. The system is further caused to access second media content from the database. The second media content includes recorded content including one or more actions and one or more positions of the first user. The recorded content is captured by at least one camera. Further, the system is caused to access tracking information from the database. The tracking information includes information related to variation in a position of the at least one camera and the first user and a plurality of local markers corresponding to a position of the plurality of objects. The system is caused to generate third media content to be streamed to a plurality of second users based, at least in part, on the first media content, the second media content, and the tracking information. The third media content includes background content and at least one of foreground content and midground content. Further, for generating the third media content, the system is caused, at least in part, to determine a relative position of the first user with respect to the first media content based, at least in part, on the tracking information. The system is further caused to switch between the foreground content and the midground content on top of the background content based, at least in part, on the relative position of the first user with respect to the first media content.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The advantages and features of the invention will become better understood with reference to the detailed description taken in conjunction with the accompanying drawings, wherein like elements are identified with like symbols, and in which:
[0009] FIG. 1 illustrates an exemplary representation of an environment related to at least some example embodiments of the present disclosure;
[0010] FIG. 2 illustrates a simplified block diagram of a system, in accordance with an embodiment of the present disclosure;
[0011] FIGS. 3A-3F illustrate a first set of video frames of streaming content displayed to a content viewer in a first scenario, in accordance with an embodiment of the present disclosure;
[0012] FIGS. 4A-4F illustrate a second set of video frames of the streaming content displayed to the content viewer in a second scenario, in accordance with an embodiment of the present disclosure;
[0013] FIGS. 5A-5D illustrate a third set of video frames of the streaming content displayed to the content viewer in a third scenario, in accordance with an embodiment of the present disclosure;
[0014] FIG. 6 illustrates a process flow diagram depicting a method for dynamically generating interactive mixed reality (MR) content, in accordance with an embodiment of the present disclosure; and
[0015] FIG. 7 illustrates a process flow diagram depicting a method for generating interactive MR content for broadcasting applications, in accordance with an embodiment of the present disclosure.
[0016] The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.
DETAILED DESCRIPTION
[0017] The best and other modes for carrying out the present invention are presented in terms of the embodiments, herein depicted in FIGS. 1 to 7. The embodiments are described herein for illustrative purposes and are subject to many variations. It is understood that various omissions and substitutions of equivalents are contemplated as circumstances may suggest or render expedient, but these are intended to cover the application or implementation without departing from the scope of the invention. Further, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting. Any heading utilized within this description is for convenience only and has no legal or limiting effect.
[0018] Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearances of the phrase “in an embodiment” in various places in the specification do not necessarily all refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
[0019] Moreover, although the following description contains many specifics for the purposes of illustration, anyone skilled in the art will appreciate that many variations and/or alterations to said details are within the scope of the present disclosure. Similarly, although many of the features of the present disclosure are described in terms of each other, or in conjunction with each other, one skilled in the art will appreciate that many of these features can be provided independently of other features. Accordingly, this description of the present disclosure is set forth without any loss of generality to, and without imposing limitations upon, the present disclosure.
[0020] Embodiments of the present disclosure may be embodied as an apparatus, system, method, or computer program product. Accordingly, embodiments of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” an “engine,” a “module,” or a “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable storage media having computer-readable program code embodied thereon.
[0021] The terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.
[0022] The terms “augmented reality” and “augmented reality graphics” may have been used interchangeably throughout the description and they generally refer to an enhanced, interactive version of a real-world environment achieved through digital visual elements, sounds, and other sensory stimuli presented in part via holographic technology (in some scenarios). In various non-limiting examples, Augmented Reality (AR) is designed to add digital elements over real-world views with limited interaction. It incorporates three features: a combination of digital and physical worlds, interactions made in real-time, and accurate three-dimensional (3D) identification of virtual and real objects.
[0023] The terms “virtual reality” and “virtual reality graphics” may have been used interchangeably throughout the description and they generally refer to a simulated 3D environment that enables users or viewers to explore and interact with a virtual surrounding in a way that approximates reality, as it is perceived through the users’ senses. In various non-limiting examples, Virtual Reality (VR) provides such an experience usually via a headset device and headphones designed for such activities.
[0024] The terms “mixed reality” and “mixed reality graphics” may have been used interchangeably throughout the description and they generally refer to a blend of physical and digital worlds, unlocking natural and intuitive 3D interactions among humans, computers, and the environment. In various non-limiting examples, Mixed Reality (MR) combines AR and VR elements so that digital objects can interact with the real world.
[0025] The term “broadcasting applications” generally refers to applications that are used for streaming or broadcasting media content such as live recordings or streams of media content such as videos, sounds, and the like recorded at a particular location to others (at the content viewer’s end) at a different location having access to the stream. For example, content viewers may access live content on their personal devices via the broadcasting applications.
[0026] The terms “streaming content”, “digital content”, “media content”, “broadcast content”, “subscribed content”, and “content” are used throughout the description interchangeably, and they generally refer to multimedia content that is delivered and consumed in a continuous manner from a source, with little or no intermediate storage in network elements. The streaming of the streaming content may be controlled by streaming content providers.
[0027] The term “streaming content provider” generally refers to an entity that holds digital rights associated with digital content (i.e., media content) present within digital video content libraries and offers the content on a subscription basis using a digital platform and Over-The-Top (OTT) media services, i.e., the content is streamed over the Internet to the electronic devices of the subscribers (the content viewers).
[0028] The terms “user”, “subscriber”, “content viewer”, “viewer”, “spectator”, and “content receiver” may have been used interchangeably throughout the description and they generally refer to a viewer of subscribed content, which is offered by the streaming content provider or the content provider.
OVERVIEW
[0029] In an embodiment, a system that may be a digital platform server associated with a content provider is configured to access first media content, second media content, and tracking information from a database associated with the system. Herein, the first media content may include Computer Graphics (CG) content related to an event displayed to a first user and information related to interactions of the first user with a plurality of objects in the CG content. The second media content may include recorded content including one or more actions and one or more positions of the first user. Herein, the recorded content is captured by at least one camera. The tracking information may include information related to variation in a position of the at least one camera and the first user and a plurality of local markers corresponding to a position of the plurality of objects.
[0030] In one implementation, the event is displayed to the first user as Virtual Reality (VR) content via a Head-Mounted Display (HMD) device worn by the first user. In an embodiment, the HMD device may be associated with a first computing device. Further, for accessing the first media content including the CG content, the first computing device may be configured to receive information related to the interactions of the first user with the VR content corresponding to the event. The information may include a selection, by the first user, of one or more options displayed in the VR content. The first computing device may be further configured to receive historical information related to a plurality of events from a media database associated with the first computing device. The first computing device may also be configured to generate three-dimensional (3D) content using MR graphics, in response to the interactions of the first user with the VR content based, at least in part, on analysis of the information related to the interactions and the historical information. In some implementations, the 3D content may include the plurality of objects related to the corresponding event, which can be interacted with by the first user. Herein, the 3D content is displayed to the first user via the HMD device.
[0031] In some embodiments, the first computing device may be configured to receive information related to interactions of the first user with the 3D content displayed to the first user via the HMD device. Herein, the information may include variation in one or more physical parameters associated with the plurality of objects. The first computing device may further be configured to receive historical data related to the plurality of objects related to the corresponding event from the media database. Further, the first computing device may be configured to generate new 3D content including an anticipated output associated with the one or more physical parameters associated with the plurality of objects, in response to the interactions of the first user with the 3D content. The first computing device may generate the new 3D content based, at least in part, on analysis of the information related to the interactions and the historical data related to the plurality of objects.
[0032] In some other embodiments, the first computing device may transmit information related to the new 3D content to a second computing device associated with the at least one camera. In an implementation, the second computing device may be configured to receive the information related to the new 3D content, the second media content, and the tracking information. The second computing device may further be configured to position a virtual camera in the MR graphics related to the new 3D content. Herein, the virtual camera may be adapted to capture the MR graphics with a second user’s perspective and movements as the virtual camera moves in the MR graphics. The second computing device may further be configured to generate the CG content based, at least in part, on the information related to the new 3D content, the second media content, the tracking information, and the MR graphics captured by the virtual camera. In a non-limiting example, the tracking information may be captured via a camera tracker associated with each of the at least one camera.
[0033] The system may be further caused to generate third media content to be streamed to a plurality of second users based, at least in part, on the first media content, the second media content, and the tracking information. Herein, the third media content may include background content and at least one of foreground content and midground content. In a non-limiting example, the third media content may include the interactive MR content. In some embodiments, for generating the third media content, the system may be further caused to determine a relative position of the first user with respect to the first media content based, at least in part, on the tracking information, and switch between the foreground content and the midground content on top of the background content based, at least in part, on the relative position of the first user with respect to the first media content.
[0034] In alternative implementations, for generating the third media content, the system may further be caused to determine a background of the first user in the second media content to be one of a physical set and a green screen set by analyzing a plurality of features associated with the second media content. The system may be further caused to generate the third media content such that the third media content may include the background content corresponding to the second media content and the foreground content corresponding to the first media content when the background of the first user in the second media content may include the physical set. Alternatively, the system may be caused to generate the third media content such that the third media content may include the background content corresponding to a virtual background and the foreground content corresponding to at least one of the first media content and the second media content when the background of the first user in the second media content may include the green screen set. Herein, the first user standing in the green screen set is keyed out from the green screen set and positioned on top of the virtual background. In some embodiments, the system may be caused to generate the third media content by stitching the first media content with the second media content in real-time by performing media acquisition, calibration, registration, and blending of the first media content and the second media content.
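As a purely illustrative, non-limiting sketch of the stitching steps named above (acquisition, calibration, registration, and blending), the following Python fragment composes a CG frame onto a camera frame. The helper names, the use of OpenCV, and the assumption that calibration has already produced a homography and an alpha matte are introduced here only for explanation and are not part of the disclosed system.

    # Illustrative sketch only: function names and the simple alpha blend are assumptions.
    import cv2
    import numpy as np

    def stitch_media(cg_frame: np.ndarray, camera_frame: np.ndarray,
                     homography: np.ndarray, cg_alpha: np.ndarray) -> np.ndarray:
        """Register the CG layer (first media content) into the camera frame's
        (second media content) coordinate space and blend the two layers."""
        h, w = camera_frame.shape[:2]
        registered = cv2.warpPerspective(cg_frame, homography, (w, h))   # registration
        alpha = cv2.warpPerspective(cg_alpha, homography, (w, h))        # matching matte
        alpha = alpha[..., None].astype(np.float32) / 255.0
        blended = alpha * registered + (1.0 - alpha) * camera_frame      # blending
        return blended.astype(camera_frame.dtype)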
[0035] In an embodiment, for the system to switch between the foreground content and the midground content, the system may be further caused to access relevant historical data including historically recorded information related to at least one of the first media content, the second media content, and the tracking information. The system may be further caused to determine, via a Machine Learning (ML) model, switching criteria for the foreground content and the midground content based, at least in part, on the relevant historical data and the relative position of the first user with respect to the first media content. In a non-limiting example, the switching criteria may include a first criterion for switching to the foreground content from the midground content when the first user is determined to be located in the background content and is determined to be interacting with the plurality of objects in the foreground content. In another example, the switching criteria may include a second criterion for switching to the midground content from the foreground content in response to the position and the interactions of the first user with the plurality of objects for displaying information that reflects an outcome of the interactions.
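A minimal, non-limiting sketch of the switching criteria described above follows; the distance threshold, the field names, and the rule-based stand-in used here in place of a trained ML model are assumptions introduced only for explanation.

    # Illustrative sketch: thresholds and field names are assumptions.
    from dataclasses import dataclass

    @dataclass
    class PresenterState:
        distance_to_object_m: float   # distance from the presenter to the nearest virtual object
        is_interacting: bool          # e.g., a pinch gesture reported by the HMD device
        current_layer: str            # "foreground" or "midground"

    def select_layer(state: PresenterState, near_threshold_m: float = 1.0) -> str:
        """First criterion: bring the CG object to the foreground when the presenter,
        located in the background content, reaches in and interacts with it.
        Second criterion: fall back to the midground to display the outcome of the
        interaction once the interaction ends."""
        if state.is_interacting and state.distance_to_object_m <= near_threshold_m:
            return "foreground"
        if state.current_layer == "foreground" and not state.is_interacting:
            return "midground"
        return state.current_layer

    print(select_layer(PresenterState(0.6, True, "midground")))   # foreground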
[0036] Various embodiments of the present disclosure offer multiple advantages and technical effects. For instance, the system and/or the setup proposed in the present disclosure improves the analysis of an ongoing event by helping the presenter demonstrate the ongoing event in the MR graphics. The system further enables the presenter to interact with various parameters or objects in the MR graphics, thereby helping to dive deeper into the various scenarios of the event by not only showcasing what happened in the event but also showcasing what could have happened by changing the input parameters. The benefits to presenters in event broadcasting, such as sports broadcasting, include enhanced dynamic storytelling abilities and analysis of various events, facts, and scenarios, as the presenters are able to see the graphics in front of them via the holographic display, as opposed to guessing where they are via visualization on a comfort monitor.
[0037] Various embodiments of the present disclosure offer the presenters direct interactivity with, and manipulation of, the data that the graphics represent, which includes Hawk-Eye™ data (e.g., in the case of cricket analysis, ball trajectories, beehives, wagon wheels, pitch maps, etc.), biomechanics on a virtual mannequin, fielding positions of players, and multiple virtual video screens with telestration functionality. In conventional arrangements, some of these functionalities were only possible via preprogrammed graphics templates controlled/fired by a third-party graphics operator. The benefits to the content viewers of the event broadcasts include improved analysis of the event via the enhanced storytelling detailed above, resulting in a more visually appealing, data-rich, and authentic experience.
[0038] In addition to enabling the content viewers to visualize events and the presenter's interactions with various parameters of the ongoing event in MR graphics, the process of collecting and streaming the interactive MR content to the content viewers requires fewer processing resources and less storage compared to conventional methods. This is because the system does not depend on comfort monitors and third-party graphics operators; instead, the computing devices associated with the HMD device and the camera tracker, communicably coupled to the system proposed in the present disclosure, are sufficient to capture the interactive MR graphics related to the demonstration of the event and to stream the same to the content viewers. This makes the system proposed in the present disclosure more efficient and more reliable in terms of processing resources and processing time.
[0039] Various embodiments of the present disclosure are described hereinafter with reference to FIGS. 1 to 7.
[0040] FIG. 1 illustrates an exemplary representation of an environment 100 related to at least some embodiments of the present disclosure. Although the environment 100 is presented in one arrangement, other embodiments may include the parts of the environment 100 (or other parts) arranged otherwise depending on, for example, accessing information from a plurality of sources and generating interactive Mixed Reality (MR) content from the accessed information that needs to be delivered to content viewers through broadcasting applications.
[0041] The environment 100 generally includes a presenter 102 and a Head-Mounted Display (HMD) device 104 to be worn by the presenter 102. The HMD device 104 is a display device for displaying Virtual Reality (VR) content (hereinafter also termed ‘graphical content’) to the presenter 102 wearing the HMD device 104. The VR content includes content represented in a graphical form (such as icons or three-dimensional (3D) objects displayed in a virtual space). The HMD device 104 includes a processing unit, lenses, cameras, and one or more sensors. In one example, the HMD device 104 can include components in addition to the lenses, the cameras, and the one or more sensors. In a non-limiting example, the HMD device 104 corresponds to HoloLens®, which uses multiple sensors, advanced optics, and holographic processing to meld the content seamlessly with its environment. The holograms generated by the HoloLens® can be used to display information, blend with the real world, or even simulate a virtual world. In an example, the HMD device 104 scans a physical space (i.e., the surroundings) in the viewing direction of the presenter 102 and generates a mesh of the physical space. For example, the HMD device 104 can scan the surroundings of the presenter 102 in a set where a show (e.g., a post-match analysis program of a sports match) is recorded. The HMD device 104 utilizes the mesh for calibration, detects the position of the presenter 102, and can place local markers in the physical space.
[0042] As used herein, the term “mesh” refers to an element that is utilized for mapping the surroundings of an individual in order to generate VR content to be displayed in an HMD device worn by the individual, with which the individual can interact using hand gestures, gaze, and voice commands. The VR content is therefore calibrated based on the mapping of the surroundings, which is carried out using the mesh, so that the VR content is generated appropriately and accurately. Further, the HMD device 104 may detect the position of the presenter 102 using sensors such as, but not limited to, a proximity sensor, a gyroscope, an accelerometer, a magnetometer, and the like located within the HMD device 104. The local markers (hereinafter also referred to as ‘physical markers’) are the positions where real objects are placed around the presenter 102. Therefore, in a non-limiting example, a local marker may be a physical entity such as a pattern printed on paper. Herein, the paper with the printed pattern may be physically placed in the physical space. When the presenter 102 is using the HMD device 104 for a VR experience, the HMD device 104 may be configured to identify the physical marker. The HMD device 104 may further be configured to place or overlay a pre-loaded image on top of the local marker in the graphical content visible to the presenter 102 via the HMD device 104. The pre-loaded image may correspond to an item that the presenter 102 may wish to place at the location of the physical marker. Therefore, the local markers may be any physical entities (relatively static in their plane) that help in mapping and/or replacing real-world objects with virtual objects.
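The following non-limiting Python sketch illustrates how detected local (physical) markers could be mapped to pre-loaded images or assets overlaid in the graphical content; the marker identifiers, asset paths, and data shapes are hypothetical and do not correspond to any HMD vendor's interface.

    # Illustrative sketch: marker ids, asset paths, and the pose format are assumptions.
    from typing import Dict, Tuple

    PRELOADED_ASSETS: Dict[str, str] = {
        "marker_pitch": "assets/cricket_pitch.glb",
        "marker_screen": "assets/virtual_screen.glb",
    }

    def place_assets(detected_markers: Dict[str, Tuple[float, float, float]]) -> list:
        """For each detected local marker, return the pre-loaded asset to be
        overlaid at the marker's position in the graphical content."""
        placements = []
        for marker_id, position in detected_markers.items():
            asset = PRELOADED_ASSETS.get(marker_id)
            if asset is not None:
                placements.append({"asset": asset, "position": position})
        return placements

    # Example: two markers recognized in the mesh of the physical space.
    print(place_assets({"marker_pitch": (0.0, 0.0, 2.5), "marker_screen": (1.2, 1.5, 3.0)}))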
[0043] Further, the HMD device 104 allows tracking of the presenter's hands, including finger joints, after the completion of the calibration process. This is enabled via various sensors and input devices associated with the HMD device 104 that capture images and videos of the presenter's hands. The HMD device 104 allows the presenter 102 to use hand-tracking gestures (such as palm detection and pinching) to interact with the graphical content displayed to the presenter 102. The gestures are used to execute different actions, such as opening a menu and interacting with virtual graphical objects displayed in the graphical content. In one example, the interaction of the presenter 102 with the graphical content can be through the use of a remote device connected to the HMD device 104, through the gaze of the presenter 102, through voice commands, through eye blinks, or through any other suitable means of interaction.
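A brief, non-limiting sketch of dispatching recognized gestures to actions (such as opening a menu or selecting a virtual object) is given below; the gesture labels and action functions are placeholders and do not correspond to any particular HMD runtime.

    # Illustrative sketch: gesture names and actions are placeholders.
    def open_menu():
        print("menu opened")

    def select_object():
        print("virtual object selected")

    GESTURE_ACTIONS = {
        "palm_up": open_menu,     # e.g., palm detection opens the menu
        "pinch": select_object,   # e.g., pinching selects a virtual object
    }

    def handle_gesture(gesture: str) -> None:
        action = GESTURE_ACTIONS.get(gesture)
        if action is not None:
            action()

    handle_gesture("palm_up")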
[0044] As shown in FIG. 1, the environment 100 further includes a first computing device 106 associated with a first database 108 (hereinafter also referred to as media database 108). The first computing device 106 may also be associated with the HMD device 104. The first computing device 106 can be any computer or a device similar to a computer that has a processor, memory, communication, and input/output modules. The first computing device 106 receives information related to the interactions of a first user such as the presenter 102 with the graphical content corresponding to the event. The information may include a selection of one or more options, by the first user (i.e., the presenter 102), displayed in the VR content. This information may be captured by the HMD device 104 with the help of the sensors and the input devices associated with the HMD device 104.
[0045] In some embodiments, the media database 108 includes historical information related to a plurality of events (e.g., sports) and their multimedia content. The media database 108 may be a single device or a network of multiple devices that store and share information related to a plurality of events that can be analyzed using the HMD device 104. Thus, the first computing device 106 may receive historical information from the media database 108. For example, the media database 108 may include information related to a player in a game of cricket and video content related to the player. In one case, the media database 108 includes the average strike rate of the player in the One-Day International (ODI) format of cricket and video content related to the player's performances in ODIs. The first computing device 106 associated with the HMD device 104 may access the media database 108 upon receiving a request from the HMD device 104. The information stored in the media database 108 is classified into multiple groups that are represented as options in a menu. The menu may be displayed to the presenter 102 upon activating the HMD device 104 and using a hand gesture defined for displaying the menu. In one example, the presenter 102 may select an icon in the menu, upon which a request is generated by the HMD device 104 to access the information related to the selected icon from the media database 108. Then, the first computing device 106 accesses the information from the media database 108 upon the request from the HMD device 104 and provides the information to the HMD device 104 for display to the presenter 102.
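The request flow from an icon selection in the menu to the media database 108 may be sketched, in a purely illustrative manner, as follows; the in-memory dictionary, the key names, and the sample values stand in for the media database and are assumptions only.

    # Illustrative sketch: the dictionary and its sample values stand in for media database 108.
    MEDIA_DATABASE = {
        "player_stats/odi": {"average_strike_rate": 92.4, "videos": ["odi_highlights.mp4"]},
    }

    def fetch_for_selection(selected_icon: str, media_db: dict) -> dict:
        """On icon selection, the HMD device raises a request; the first computing
        device resolves it against the media database and returns content for display."""
        return media_db.get(selected_icon, {"error": "no content for this selection"})

    print(fetch_for_selection("player_stats/odi", MEDIA_DATABASE))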
[0046] In some other embodiments, the presenter 102 may be displayed a submenu after the selection of the icon from the menu. For example, upon selection of a player’s stats icon from the menu, a submenu including a number of player names in that sport can be displayed to the presenter 102. The presenter 102 may select an option related to the player who needs to be analyzed. Upon selection of the player, the information related to the player will be displayed to the presenter 102 through the use of MR graphics in the physical space.
[0047] The environment 100 further depicts one or more cameras 110a, 110b, and 110c (collectively referred to as cameras 110) to record a show that the presenter 102 is hosting. Three cameras 110a-110c are shown for example purposes only, and there may be any number of cameras as part of the set-up. The cameras 110a-110c are associated with a second computing device 112. The second computing device 112 may be similar to the first computing device 106. The second computing device 112 receives an outcome of the interaction of the presenter 102 with the graphical content from the first computing device 106.
[0048] In addition, the environment 100 depicts a camera tracker 114 attached to the cameras 110. The environment 100 further depicts a system 116 associated with a second database 118 (hereinafter also referred to as database 118). Further, the environment 100 depicts a content viewer device 120 associated with a content viewer 122. The content viewer device 120 and the system 116 are coupled to, and in communication with (and/or with access to) a network 124.
[0049] The system 116 may be a single server or a network of servers configured to receive, process, create, and provide streaming content based at least on data accessed by the system 116. In the illustrated example, the data accessed by the system 116 includes the first media content, the second media content, and the tracking information. It is noted, however, that the data may also include other data related to the event without limiting the scope of the invention. In an embodiment, the system 116 may be a digital platform server associated with a content provider. The content offered by the content provider may be embodied as the streaming content or streaming video content such as live streaming content or on-demand video streaming content. It is noted that though the content offered by the content provider is explained with reference to video content, the term ‘content’ as used hereinafter may not be limited to only video content. Instead, the term ‘content’ may refer to any media content including, but not limited to, ‘video content’, ‘audio content’, ‘gaming content’, ‘textual content’, and any combination of such content offered in an interactive or non-interactive form. Accordingly, the term ‘content’ is also interchangeably referred to hereinafter as ‘media content’ for the purposes of description. Individuals such as the content viewer 122 wishing to view/access the streaming media content may subscribe to at least one type of subscription offered by the content provider.
[0050] It may be noted that the content viewer 122 depicted in the environment 100 is controlling the content viewer device 120 for viewing/accessing the streaming content from the content provider. In an example, the content viewer device 120 may include one or more electronic devices, such as a smartphone, a laptop, a desktop, a personal computer, a wearable device, or any spatial computing device to view the content provided by the content provider. The content viewer 122 may have downloaded a software application 126 (hereinafter referred to as an ‘application 126’ or an ‘app 126’) corresponding to at least one content provider on the content viewer device 120. It is noted that the application 126 may correspond to the broadcasting application described earlier.
[0051] In one illustrative example, the content viewer 122 may access a Web interface associated with the application 126 provided by the content provider on the content viewer device 120. It is understood that the content viewer device 120 may be in operative communication with the network 124, such as the Internet, enabled by a network provider, also known as an Internet Service Provider (ISP). The content viewer device 120 may connect to the network 124 using a wired network, a wireless network, or a combination of wired and wireless networks. Some non-limiting examples of wired networks may include the Ethernet, the Local Area Network (LAN), a fiber-optic network, and the like. Some non-limiting examples of wireless networks may include Wireless LAN (WLAN), cellular networks, Bluetooth or ZigBee networks, and the like.
[0052] The content viewer device 120 may fetch the Web interface associated with the application 126 over the network 124 and cause a display of the Web interface on a display screen (not shown) of the content viewer device 120. In an illustrative example, the Web interface may display a plurality of content titles corresponding to the media content offered by the content provider to its consumers, i.e., the content viewers. For example, the application 126 may facilitate the broadcasting of fact-based television shows, e.g., sports, weather, news, science, etc.
[0053] In an illustrative example, the content viewer 122 may select a content title from among the plurality of content titles displayed on the display screen of the content viewer device 120. The selection of the content title may trigger a request for a playback Uniform Resource Locator (URL). The request for the playback URL is sent from the content viewer device 120 via the network 124 to the system 116 associated with the content provider.
[0054] In at least some embodiments, the system 116 may include at least one of a Content Management System (CMS) and a User Management System (UMS) for facilitating the streaming of third media content from the database 118 associated with the system 116 of the content provider to the plurality of second users, such as the content viewer 122. The system 116 is configured to authenticate the content viewer 122 and determine whether the content viewer 122 is entitled to view the requested content. To this effect, the system 116 may be in operative communication with one or more remote servers, such as an authentication server and an entitlement server. The authentication server and the entitlement server are not shown in FIG. 1. The authentication server may facilitate the authentication of viewer account credentials using standard authentication mechanisms, which are not explained herein. The entitlement server may facilitate the determination of the viewer's subscription type (i.e., whether the second user has subscribed to regular or premium content) and status (i.e., whether the subscription is still active or has expired), which in turn may enable the determination of whether the content viewer 122 is entitled to view/access the requested content.
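A simplified, non-limiting sketch of the authentication and entitlement checks described above is shown below; the plain functions, the credential format, and the subscription fields stand in for the authentication server and the entitlement server and are assumptions for explanation only.

    # Illustrative sketch: plain functions stand in for the remote servers.
    from datetime import date

    def authenticate(credentials: dict, accounts: dict) -> bool:
        """Compare the supplied credential hash against the stored one."""
        stored = accounts.get(credentials.get("user"))
        return stored is not None and stored == credentials.get("password_hash")

    def is_entitled(subscription: dict, requested_tier: str, today: date) -> bool:
        """Entitlement requires an active subscription whose tier covers the content."""
        active = subscription["expires_on"] >= today
        tier_ok = subscription["tier"] == "premium" or requested_tier == "regular"
        return active and tier_ok

    subscription = {"tier": "regular", "expires_on": date(2030, 1, 1)}
    print(is_entitled(subscription, "premium", date.today()))   # False: regular plan, premium content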
[0055] Further, in some scenarios, an instance of a three-dimensional (3D) content creation engine 128 is running on the first computing device 106 for generating a User Interface (UI) and the graphical content displayed to the presenter 102. Similarly, the second computing device 112 also includes an instance of the 3D content creation engine 128 for generating and simulating real-time 3D content based at least on the data received by the second computing device 112. Therefore, the first computing device 106 may generate the 3D content using Mixed Reality (MR) graphics, in response to the interactions of the first user with the VR content. The first computing device 106 may generate the 3D content based, at least in part, on an analysis of the information related to the interactions and the historical information. Herein, the 3D content may include a plurality of objects related to the corresponding event, that can be interacted with by the first user (i.e., the presenter 102). Herein, the 3D content is displayed to the first user via the HMD device 104. The 3D content creation engine 128 is configured to generate and simulate real-time 3D content, and hence facilitate the first computing device 106 for generating the 3D content. For instance, the 3D content creation engine 128 may correspond to an Unreal Engine® that is used for 3D modeling. Alternative examples for the 3D content creation engine 128 may include Blender®, Unity®, CryEngine®, etc.
[0056] The 3D content creation engine 128 may include pre-defined modules and functions to process and visualize the information. However, additional modules and functions can be created in the 3D content creation engine 128 for usage related to the processing and visualization of desired information. Moreover, in some embodiments, the modules may be event specific, e.g., mixed reality cricket analysis toolkit for analysis of cricket, mixed reality football analysis toolkit for analysis of football, etc.
[0057] In some other embodiments, the modules may include one or more of a physical manager module, an input manager module, a Graphical User Interface (GUI) system, a scene manager module, a network component, an Artificial Intelligence (AI) system, a resource manager module, an entity manager module, a speech recognizer module, and the like. In a non-limiting example, such modules may be communicating with each other for generating and managing 3D content.
[0058] The modules may include multiple functions to process various data related to the players and sub-events that occurred in the event, such as in the sport of cricket. For instance, the multiple functions may perform one or more of modeling, mapping, skinning, exporting or importing of assets, editing, and the like, using the corresponding modeling, mapping, skinning, exporting and importing, and editing techniques, respectively. The modules may generate the graphical content that is displayed to the presenter 102 for interaction by the presenter 102. For example, the graphical content displayed to the presenter 102 may include content from a Hawk-Eye™ system (not shown in FIG. 1) connected to the media database 108. To display the content from the Hawk-Eye™ system, a cricket pitch is digitally rendered in the physical space in the viewing direction of the presenter 102. In addition, content-related parameters that may be taken into consideration for generating the 3D content may include, in the example of the sport of cricket, a scorecard (having information related to runs, wickets, strike rate, bowling speed, fielding placement, wagon wheel, etc.), content metadata, user demographic details, and the like.
[0059] In a non-limiting example, a digitally rendered pitch may be displayed to the content viewer 122 on their electronic device 120. The digitally rendered pitch may include a ball at its point of contact with the pitch and the ball's trajectory. The presenter 102 can interact with the ball and change its point of contact with the pitch for analyzing the ball's trajectory at a new point of contact. The new ball trajectory is generated by the modules based on past data related to the type of pitch that is digitally rendered. The modules use predefined physical functions to predict and display the new trajectory of the ball based on its new point of contact with the pitch, the type of pitch that is being used, and data related to the delivery (such as the speed and type of delivery of the ball). For instance, the predefined physical functions may correspond to a set of equations showing a relationship between a plurality of inputs and a plurality of corresponding outputs related to various sub-events corresponding to the event. In the example of the sport of cricket, if throwing a ball toward a batsman with a certain trajectory is one of the sub-events in the gameplay, then the different trajectories of the ball may be the inputs, and the output may be the type of shot the batsman is able to hit for a particular ball trajectory. Therefore, upon analyzing such outputs for various inputs related to a particular event, the 3D content creation engine is configured to generate a set of equations describing the relationship between the inputs and the outputs related to the event. Further, the set of equations may be implemented in the form of instructions. The instructions may be stored in a memory associated with the 3D content creation engine 128 and executed by a processor associated with the 3D content creation engine 128 based, at least, on the past data related to the type of pitch that is digitally rendered, the new point of contact with the pitch, the type of pitch that is being used, the data related to the delivery, and the like. In a non-limiting example, the type of delivery of the ball can be off-break, wrist spin, in-swing, out-swing, etc.
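As a non-limiting illustration of a "predefined physical function" of the kind described above, the following Python sketch derives a coarse post-bounce ball path from delivery speed, release angle, and pitch type; the restitution coefficients, pitch-type labels, and the simplified physics are assumptions introduced only for explanation.

    # Illustrative sketch: pitch types, restitution values, and the simplified physics are assumptions.
    import math

    PITCH_BOUNCE = {"hard": 0.80, "dry": 0.65, "green": 0.72}   # assumed restitution per pitch type

    def post_bounce_trajectory(speed_kmph: float, release_angle_deg: float,
                               pitch_type: str, samples: int = 5) -> list:
        """Return (x, y) points of the ball path after it strikes the pitch,
        treating the bounce as a simple restitution of the vertical velocity."""
        g = 9.81
        v = speed_kmph / 3.6
        angle = math.radians(release_angle_deg)
        vx = v * math.cos(angle)
        vy = v * math.sin(angle) * PITCH_BOUNCE[pitch_type]   # vertical speed after the bounce
        flight_time = 2 * vy / g
        return [(vx * t, vy * t - 0.5 * g * t * t)
                for t in (flight_time * i / (samples - 1) for i in range(samples))]

    print(post_bounce_trajectory(135.0, 8.0, "hard"))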
[0060] In some embodiments, the presenter 102 may interact with the ball through the use of hand gestures. The information related to the interaction is provided to the first computing device 106 associated with the HMD device 104. Therefore, the first computing device 106 may receive the information related to interactions of the first user with the 3D content displayed to the first user via the HMD device 104. The information may include variations in one or more physical parameters associated with the plurality of objects. For example, when the object is the ball, the physical parameter may be the ball trajectory. The first computing device 106 may further receive historical data related to the plurality of objects related to the corresponding event from the media database 108. Further, the first computing device 106 may generate new 3D content including an anticipated output associated with the one or more physical parameters associated with the plurality of objects, in response to the interactions of the first user with the 3D content based, at least in part, on the analysis of the information related to the interactions and the historical data related to the plurality of objects.
[0061] The first computing device 106 is further configured to transmit the information related to the outcome of the interaction (i.e., the new 3D content) to the second computing device 112 associated with the cameras 110.
[0062] In an embodiment, the cameras 110 are spectator cameras used to capture the presenter 102 and the physical space around the presenter 102, which may be referred to as the second media content. In one example, the physical space can be a set where the show of the presenter 102 is being recorded. If the presenter 102 is on the set in a studio, the cameras 110 capture the presenter 102 and the objects present on the set around the presenter 102. The cameras 110 provide the captured information to the second computing device 112. Further, the camera tracker 114 attached to the cameras 110 may capture tracking information related to the cameras 110. For instance, the camera tracker 114 can be a Mo-Sys® camera tracking system. The tracking information related to the cameras 110 may include positional, pan, tilt, and zoom data of the cameras 110 as well as all the intrinsic properties of the cameras 110 in real-time. Further, the camera tracker 114 provides the tracking information of the cameras 110 to the second computing device 112. The instance of the 3D content creation engine 128 running on the second computing device 112 may receive the information from the cameras 110 and the tracking information from the camera tracker 114. The 3D content creation engine 128 uses the tracking information to place a virtual camera in the MR graphics created in the 3D content creation engine 128. The virtual camera is placed at the exact position in the MR graphics with all the properties of the real camera. Hence, the virtual camera captures the MR graphics with the right perspective, such as a second user's perspective and movements, as the virtual camera moves in the MR graphics. Herein, the movement of the virtual camera may be controlled via a controller module (not shown). For instance, the controller may tune metadata for the virtual camera similar to how real-time camera metadata is tuned. For example, the virtual camera may be controlled to adjust the panning, gimbal, orientation, and the like. In one example, the mixed reality graphics may include a digitally rendered pitch, a digitally rendered cricket ground, a virtual mannequin of a player, a ball trajectory, and similar features. It is understood that these features may be event-specific in nature.
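A non-limiting sketch of mirroring the tracked studio camera onto the virtual camera in the MR graphics follows; the class names and the conversion of focal length and sensor width into a field of view are assumptions, while the tracked fields (position, pan, tilt, zoom, intrinsics) correspond to the tracking information listed above.

    # Illustrative sketch: class names and the FOV conversion are assumptions.
    import math
    from dataclasses import dataclass

    @dataclass
    class CameraTrack:
        position: tuple          # (x, y, z) in studio coordinates
        pan_deg: float
        tilt_deg: float
        zoom_focal_mm: float
        sensor_width_mm: float   # intrinsic property reported by the camera tracker

    @dataclass
    class VirtualCamera:
        position: tuple
        pan_deg: float
        tilt_deg: float
        horizontal_fov_deg: float

    def mirror_camera(track: CameraTrack) -> VirtualCamera:
        """Place the virtual camera at the tracked pose and convert focal length
        and sensor width into the field of view used by the 3D engine."""
        fov = 2 * math.degrees(math.atan(track.sensor_width_mm / (2 * track.zoom_focal_mm)))
        return VirtualCamera(track.position, track.pan_deg, track.tilt_deg, fov)

    print(mirror_camera(CameraTrack((0.0, 1.6, -4.0), 12.0, -3.0, 35.0, 36.0)))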
[0063] In some embodiments, the 3D content creation engine 128 may include a composure system module (not shown in FIG. 1). Herein, the composure system module is configured to process the information captured by the virtual camera for composing the actions of the presenter 102 in the MR graphical content. Therefore, along with the information from the first computing device 106, the second computing device 112 may receive the information captured by the cameras 110 and the tracking information from the camera tracker 114. The instance of the 3D content creation engine 128 of the second computing device 112 may generate an MR environment based, at least in part, on the information received from the first computing device 106, the cameras 110, and the tracking information from the camera tracker 114.
[0064] Further, during the interaction of the presenter 102 with the graphical content displayed to the presenter 102 in the HMD device 104, the corresponding graphical content and the interactions may have to be displayed to the content viewer 122, for the content viewer 122 to be able to see the presenter's interaction with the graphical content. Therefore, the MR environment generated at the second computing device 112 is provided to the system 116 as Computer Graphics (CG) content. The database 118 associated with the system 116 may receive information from various devices, such as the second computing device 112, the cameras 110, and the camera tracker 114, and store the same as the first media content, the second media content, and the tracking information, respectively.
[0065] Herein, the first media content may include the CG content related to an event (e.g., a cricket sport) displayed to the first user and the information related to interactions of the first user with the plurality of objects in the CG content. As used herein, the term “first user” may correspond to the presenter 102. Thus, hereinafter, the terms “first user 102” and the “presenter 102” may be used interchangeably for the purpose of description. The plurality of objects in the CG content may include icons corresponding to a menu, a submenu, a ball, and the like. The plurality of objects may be virtual objects in the CG content.
[0066] It may be noted that the first computing device 106 provides the outcome of the interaction of the presenter 102 with the CG content to the second computing device 112. The system 116 further receives this information from the second computing device 112 as a part of the first media content. In some embodiments, the system 116 is caused to generate the new 3D content including an anticipated output associated with the physical parameters associated with the plurality of objects, in response to the interactions of the first user 102 with the 3D content.
[0067] For instance, the presenter 102 may change a position of the ball in the MR graphical content. Upon changing the position of the ball, the resulting ball trajectory may change. The change in the ball trajectory is detected by the system 116 upon analyzing the first media content using image interpolation techniques. Further, the other physical parameters related to the ball are fed to the system 116 via the first media content. For instance, the physical parameters related to the ball may include the ball trajectory, physical orientation, axis, cycle, location, and the like.
[0068] In another embodiment, the system 116 may further map the physical parameters associated with the ball to gestures or inputs captured by the HMD device 104. Further, the system 116 may generate an anticipated output for any inputs related to the ball using a Machine Learning (ML) model based, at least in part, on the historical data related to the physical parameters associated with the ball. For instance, the historical data may include past outcomes corresponding to a plurality of variations in the physical parameters associated with the ball corresponding to a particular player of a particular event. Therefore, the ML model learns from the historical data and enables the system 116 to provide predictions for any variation in the physical parameters associated with the ball. In some scenarios, the ML model may be a supervised learning model or a deep learning model. The system 116 may further transmit the anticipated outcome to the HMD device 104, and the same may be displayed to the presenter 102 in the form of the new 3D content.
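The ML-based anticipation of an outcome from varied physical parameters may be sketched, for illustration only, with a small supervised model; the feature set, the sample historical records, and the use of scikit-learn are assumptions and do not represent actual match data.

    # Illustrative sketch: features, sample records, and the choice of model are assumptions.
    from sklearn.ensemble import RandomForestClassifier

    # Assumed historical records: [release speed (km/h), bounce point (m from stumps), spin (rpm)]
    X_history = [[135, 7.0, 1800], [142, 6.0, 200], [128, 8.5, 2400], [150, 5.5, 100]]
    y_history = ["edged", "driven", "beaten", "pulled"]   # assumed past outcomes

    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_history, y_history)

    # The presenter drags the bounce point closer to the stumps; anticipate the outcome.
    anticipated = model.predict([[135, 5.8, 1800]])
    print(anticipated[0])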
[0069] Further, the second media content may include recorded content including one or more actions and one or more positions of the first user 102. The recorded content may be captured by at least one camera (e.g., the cameras 110). For instance, the one or more actions of the first user 102 may correspond to the selection of one or more options displayed as menus and submenus in the graphical content displayed to the first user 102 via the HMD device 104. The one or more actions may further include movement of the presenter 102 in the physical space surrounding the presenter 102. Essentially, the recorded content may correspond to live broadcast content, which is generally captured by the cameras 110 and broadcast to the content viewers 122 via the network 124 on the content viewer device 120. The tracking information may include information related to variation in a position of the at least one camera and the first user 102 and a plurality of local markers corresponding to a position of the plurality of objects.
[0070] Further, the system 116 is caused to access the first media content, the second media content, and the tracking information from the database 118. The system 116 is further caused to generate third media content to be streamed to a plurality of second users based, at least in part, on the first media content, the second media content, and the tracking information. As used herein, the term “second user” refers to the content viewer 122 who has access to the streaming content provided by a content provider. Thus, hereinafter, the terms “second user 122” and the “content viewer 122” may be used interchangeably for the purpose of description.
[0071] In an illustrative example, the third media content may correspond to the streaming content. Further, the third media content may include background content and at least one of foreground content and midground content. To that end, the system 116 may be caused to determine a background of the first user 102 in the second media content to be one of a physical set or a green screen set by analyzing a plurality of features associated with the second media content. In some embodiments, this determination may be performed based on one or more image processing techniques such as, but not limited to, image analysis, image compression, image segmentation, edge detection, object detection, and the like.
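As a non-limiting illustration of the background determination, the following Python sketch (assuming an OpenCV/NumPy pipeline, which the disclosure does not require) classifies a frame of the second media content as a green screen set or a physical set by measuring the fraction of chroma-green pixels; the HSV bounds and the threshold are illustrative assumptions.

import cv2
import numpy as np

def classify_background(frame_bgr: np.ndarray, green_fraction_threshold: float = 0.4) -> str:
    # Convert to HSV and count pixels falling within a rough chroma-key green range.
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    lower, upper = np.array([40, 70, 70]), np.array([85, 255, 255])
    mask = cv2.inRange(hsv, lower, upper)
    green_fraction = float(np.count_nonzero(mask)) / mask.size
    return "green_screen_set" if green_fraction > green_fraction_threshold else "physical_set"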
[0072] In an embodiment, the third media content may include background content and at least one of foreground content and midground content. As used herein, the term ‘foreground content’ refers to an area in media content that is closest to the content viewer’s view, the term ‘midground content’ refers to a layer just below the foreground content, and the term ‘background content’ refers to the area behind the two layers (foreground and midground).
[0073] Further, in an embodiment, the system 116 may generate the third media content. Herein, the third media content may include the background content corresponding to the second media content and the foreground content corresponding to the first media content when the background of the first user 102 in the second media content includes the physical set.
[0074] Alternatively, the third media content may include the background content corresponding to a virtual background and the foreground content corresponding to at least one of the first media content, and the second media content when the background of the first user 102 in the second media content includes the green screen set. In such a scenario, the first user 102 standing in the green screen set is keyed out from the green screen set and positioned on top of the virtual background. The virtual background may be generated by the system 116. Further, the midground content is the portion of the third media content that reflects variation in the plurality of objects, in response to the interactions and the position of the first user 102 with respect to the MR graphical content.
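A minimal compositing sketch in Python, under illustrative assumptions, shows how the two cases above could be assembled frame by frame; the alpha channels, the presenter matte, and the layer names are hypothetical and are not part of the disclosure.

import numpy as np

def alpha_blend(base: np.ndarray, layer_rgb: np.ndarray, layer_alpha: np.ndarray) -> np.ndarray:
    # Blend a layer over a base frame using a per-pixel alpha in the range [0, 1].
    a = layer_alpha[..., None].astype(np.float32)
    return (layer_rgb.astype(np.float32) * a + base.astype(np.float32) * (1.0 - a)).astype(np.uint8)

def compose_third_media_frame(cg_rgb, cg_alpha, camera_frame, set_type, virtual_bg=None, presenter_matte=None):
    if set_type == "physical_set":
        # Physical set: the live camera frame is the background and the CG content is the foreground.
        return alpha_blend(camera_frame, cg_rgb, cg_alpha)
    # Green screen set: the presenter is keyed out and placed over the virtual background,
    # and the CG content is then layered on top as the foreground.
    keyed = alpha_blend(virtual_bg, camera_frame, presenter_matte)
    return alpha_blend(keyed, cg_rgb, cg_alpha)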
[0075] For instance, in one scenario, the presenter 102 may be required in the third media content, e.g., a presenter 102 demonstrating a cricket event in a video while wearing the HMD device 104, with the output shown as a holographic visual with which the presenter 102 is interacting. In such a scenario, the presenter 102 is standing in the physical set demonstrating the event. In another scenario, the display view of the HMD device 104 is shown, e.g., the VR content as seen by the presenter 102, without the presenter 102 being in the frame. In yet another scenario, the CG content with the presenter 102 standing on the ground, interacting with a player, may be required to be shown. In such a scenario, the background content may be the green screen view, and hence, the virtual background may be generated corresponding to the ground.
[0076] In some embodiments, for generating the third media content, the system 116 may be further caused to stitch the first media content with the second media content in real-time by performing a plurality of steps such as media acquisition, calibration, registration, and blending.
[0077] As may be understood, herein, the term ‘media acquisition’ refers to an action of retrieving media from multiple sources. The term ‘calibration’ refers to the process of minimizing optical defects such as optical distortions and perspective distortions from media content. The term ‘Registration’ refers to the process of aligning multiple images using appropriate transformation. Further, the term ‘Image blending’ refers to the process of removing visible seams across the boundary area between the input images.
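The four stitching stages named above can be pictured with the following non-limiting Python sketch built on common OpenCV primitives; the camera intrinsics, the ORB/homography registration, and the simple averaging blend are assumptions for illustration and are not the specific techniques of the disclosure.

import cv2
import numpy as np

def calibrate(frame, camera_matrix, dist_coeffs):
    # Calibration: remove lens and perspective distortion from an acquired frame.
    return cv2.undistort(frame, camera_matrix, dist_coeffs)

def register(reference, moving):
    # Registration: estimate a homography aligning the moving frame to the reference frame.
    orb = cv2.ORB_create(1000)
    kp_ref, des_ref = orb.detectAndCompute(reference, None)
    kp_mov, des_mov = orb.detectAndCompute(moving, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_ref, des_mov)
    src = np.float32([kp_mov[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    homography, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return cv2.warpPerspective(moving, homography, (reference.shape[1], reference.shape[0]))

def blend(reference, aligned):
    # Blending: suppress visible seams across the boundary between the aligned inputs.
    return cv2.addWeighted(reference, 0.5, aligned, 0.5, 0)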
[0078] Further, in an embodiment, for switching between the foreground content and the midground content on top of the background content, the system 116 may be further caused to access relevant historical data including historically recorded information related to at least one of the first media content, the second media content, and the tracking information. Further, the system 116 may be caused to determine, via an ML model, switching criteria for the foreground content and the midground content based, at least in part, on the relevant historical data and the relative position of the first user 102 with respect to the first media content. In a non-limiting example, the switching criteria may include a first criterion for switching to the foreground content from the midground content when the first user 102 is determined to be interacting with the plurality of objects in the foreground content while being located in the background content. In another example, the switching criteria may include a second criterion for switching to the midground content from the foreground content in response to the position and the interactions of the first user 102 with the plurality of objects for displaying information that reflects an outcome of the interactions.
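The two criteria can be written out directly as a non-limiting rule-based Python sketch; in the disclosure the criteria are determined via the ML model from the relevant historical data, so the explicit rules and flag names below are illustrative assumptions only.

def select_active_layer(current_layer: str, presenter_in_background: bool,
                        interacting_with_foreground_objects: bool,
                        interaction_has_outcome: bool) -> str:
    # First criterion: switch to the foreground content when the presenter, located in
    # the background content, interacts with objects in the foreground content.
    if current_layer == "midground" and presenter_in_background and interacting_with_foreground_objects:
        return "foreground"
    # Second criterion: switch to the midground content to display information that
    # reflects the outcome of the presenter's interactions.
    if current_layer == "foreground" and interaction_has_outcome:
        return "midground"
    return current_layer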
[0079] Furthermore, in some implementations, the system 116 may be caused to switch between the foreground content and the midground content based at least on a switching technique. In some embodiments, the switching technique may involve mixing, wiping, and keying operations performed on the first media content and the second media content. Further, the switching technique may be one of a K-frame switcher, a distributed production switcher, a variable production load switcher, and the like. Further, in an embodiment, the system 116 is caused to extract an Autonomous System Number (ASN) and an IP address from the playback URL request, and identify at least one Content Delivery Network (CDN) Point of Presence (PoP) which is in proximity to the location of the content viewer 122. Then, the system 116 may stream the third media content to the CDN PoP that is located nearest to the location of the content viewer 122. Later, the CDN PoP may further transmit the same to the content viewer device 120, thereby enabling the content viewer 122 to view the third media content without any buffering.
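A minimal Python sketch of the edge-delivery step, assuming a precomputed mapping from announced IP prefixes to (ASN, PoP) pairs; the prefixes, ASNs, and PoP names below are hypothetical placeholders and are not part of the disclosure.

from ipaddress import ip_address, ip_network

# Hypothetical mapping from announced prefixes to the viewer's ASN and the nearest CDN PoP.
PREFIX_TO_POP = {
    ip_network("203.0.113.0/24"): ("AS64500", "pop-mumbai"),
    ip_network("198.51.100.0/24"): ("AS64501", "pop-delhi"),
}

def select_cdn_pop(client_ip: str, default_pop: str = "pop-origin") -> str:
    # Identify the PoP associated with the prefix (and hence the ASN) that covers the client IP.
    addr = ip_address(client_ip)
    for prefix, (_asn, pop) in PREFIX_TO_POP.items():
        if addr in prefix:
            return pop
    return default_pop

The third media content would then be streamed to, e.g., select_cdn_pop("203.0.113.7"), from where it is forwarded to the content viewer device 120.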
[0080] In another embodiment, the system 116 may transmit the first media content and the second media content separately to the content viewer device 120 via the network. Then, at the content viewer device 120, the third media content may be generated by stitching the first media content with the second media content. At the content viewer device 120, the third media content may be downloaded and played.
[0081] In yet another embodiment, at a Production Room (PR) the system 116 may stitch the first media content with the second media content and generate the third media content prior to broadcasting. Then, the generated third media content may be streamed to the content viewer device 120 via the network 124.
[0082] The number and arrangement of systems, devices, and/or networks shown in FIG. 1 are provided as an example. There may be additional systems, devices, and/or networks; fewer systems, devices, and/or networks; different systems, devices, and/or networks; and/or differently arranged systems, devices, and/or networks than those shown in FIG. 1. Furthermore, two or more systems or devices shown in FIG. 1 may be implemented within a single system or device, or a single system or device shown in FIG. 1 may be implemented as multiple, distributed systems or devices. In addition, the system 116 should be understood to be embodied in at least one computing device in communication with the network 124, which may be specifically configured, via executable instructions, to perform steps as described herein, and/or embodied in at least one non-transitory computer-readable medium.
[0083] It is understood that although the various embodiments of the present disclosure have been explained with the help of an example of a sporting event, i.e., cricket, this should not be construed as limiting in any manner. To that end, the embodiments of the present disclosure can be applied to different sporting events such as football as well. Furthermore, the embodiments of the present disclosure can also be applied to entirely different applications such as educational applications, medical applications, and research applications, among other suitable applications, and such applications would also fall within the scope of the present disclosure. In a non-limiting example, a presenter 102 may be demonstrating a football game for spectators (e.g., the content viewers 122) watching a Television (TV) show specialized for demonstrating after-game analysis of various sports on a TV (e.g., the content viewer device 120). To make the viewing experience more realistic, the presenter 102 may use the HMD device 104 for viewing a pre-recording of the football game in the MR graphics and analyzing the game plans for educating the spectators about the game. The presenter 102 may be standing in a studio setup where the recording of the corresponding TV show is carried out via the cameras 110 provided with the camera tracker 114. Initially, upon turning on the HMD device 104, the VR graphics having a menu may appear in front of the eyes of the presenter 102. The presenter 102 may make certain gestures using his hands to interact with the 3D content in the MR graphics corresponding to an option selected from the menu displayed on the HMD device 104 via the first computing device 106. If the presenter 102 selects a football player A and content corresponding to one of his past games, the 3D content corresponding to the selected football game may appear to be overlaid on top of the physical space surrounding the presenter 102, i.e., the studio setup, as seen by the presenter 102 upon wearing the HMD device 104.
[0084] Suppose the angle at which the football player A kicked a ball to make a goal is changed by the presenter 102 by interacting with the 3D content. As a result, the first computing device 106 generates new 3D content in the MR graphics reflecting an outcome due to the variation of the corresponding angle. Suppose earlier the football player A failed to achieve a goal and by changing the angle, the new 3D content shows that the football player A made the goal easily. Therefore, with the help of the HMD device 104 and the computing device 106, the presenter 102 can visualize a prediction about the game by interacting with the 3D content and the VR graphics.
[0085] Now if this information can be shown to the spectators, then even the spectators may be able to receive a realistic interpretation of the game. Therefore, CG content may be generated from the spectator’s perspective via the second computing device 112 upon receiving content from the first computing device 106, the cameras 110, and tracking information from the camera tracker 114. Later, the CG content along with information related to interactions of the presenter 102 with the CG content may be provided to the system 116 as the first media content. The system 116 also receives a live recording of the show from the cameras 110 as the second media content along with the tracking information. The system 116 further generates third media content which is interactive MR content including the background content, the midground content, and the foreground content based on the information received from various devices. The third media content is basically a composite of the first media content and the second media content.
[0086] The system 116 is communicably coupled to the TV via which the spectators are watching the show; thus, the third media content is streamed to the TV as streaming content. Upon streaming such content, the spectators are also able to watch the interactive MR content demonstrating the football game. The third media content may include media of the studio setup where the presenter 102 is standing and demonstrating the game, overlaid with the CG content showing MR graphics of football player A on the football ground shooting the ball at a certain angle in an attempt to make a goal. The spectators are also able to watch the interactions of the presenter 102 with the CG content in terms of changing the angle at which the football player A has to shoot the ball to make the goal. Therefore, as the presenter 102 is interacting with the CG content, the system 116 facilitates the switching between the foreground content and the midground content in the third media content based on the switching criteria.
[0087] FIG. 2 illustrates a simplified block diagram of a system 200, in accordance with an embodiment of the present disclosure. The system 200 is identical to the system 116 of FIG. 1. The system 200 is caused to generate interactive Mixed Reality (MR) content for broadcasting applications. In some embodiments, the system 200 may be deployed within the digital platform server, or may be placed external to, and in operative communication with, the digital platform server. In other embodiments, the system 200 may be implemented as a part of a CDN or a digital platform server.
[0088] The system 200 is depicted to include multiple components such as a processing module 202, a memory module 204, an Input/Output (I/O) module 206, a communication module 208, and a storage module 210. The various components of the system 200, such as the processing module 202, the memory module 204, the I/O module 206, the communication module 208, and the storage module 210, are configured to communicate with each other via or through a centralized circuit system 212. The centralized circuit system 212 may be various devices configured to, among other things, provide or enable communication between the components of the system 200. In certain embodiments, the centralized circuit system 212 may be a central Printed Circuit Board (PCB) such as a motherboard, a main board, a system board, or a logic board. The centralized circuit system 212 may also, or alternatively, include other Printed Circuit Assemblies (PCAs) or communication channel media.
[0089] In some embodiments, the system 200 is associated with a database 214. The database 214 may be integrated within the storage module 210 as well. For example, the storage module 210 may include one or more hard disk drives as the database 214. The database 214 is similar to the database 118 of FIG. 1. The database 214 is configured to store the first media content 216, the second media content 218, and the tracking information 220. Herein, the first media content 216 may include Computer Graphics (CG) content related to an event (e.g., a cricket sport) displayed to a first user 102 and information related to interactions of the first user 102 with a plurality of objects in the CG content. The second media content 218 may include recorded content including one or more actions and one or more positions of the first user 102. The tracking information may include information related to variation in a position of at least one camera and the first user 102 and a plurality of local markers corresponding to a position of the plurality of objects.
[0090] The processing module 202 is configured to include a content compositing module 222 and a content customization module 224. It is noted that although the system 200 is depicted to include the processing module 202, the memory module 204, the Input/Output (I/O) module 206, the communication module 208, and the storage module 210, in some embodiments, the system 200 may include more or fewer components than those depicted herein. The various components of the system 200 may be implemented using hardware, software, firmware, or any combination thereof.
[0091] In one embodiment, the processing module 202 may be embodied as a multi-core processor, a single-core processor, or a combination of one or more multi-core processors and one or more single-core processors. For example, the processing module 202 may be embodied as one or more of various processing devices, such as a coprocessor, a microprocessor, a controller, a Digital Signal Processor (DSP), a processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Microcontroller Unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. In one embodiment, the memory module 204 is capable of storing machine-executable instructions, referred to herein as platform instructions 205. Further, the processing module 202 is capable of executing the platform instructions 205. In an embodiment, the processing module 202 may be configured to execute hard-coded functionality. In an embodiment, the processing module 202 is embodied as an executor of software instructions, wherein the instructions may specifically configure the processing module 202 to perform the algorithms and/or operations described herein when the instructions are executed. For example, in at least some embodiments, each component of the processing module 202 may be configured to execute instructions stored in the memory module 204 for realizing respective functionalities, as will be explained in further detail later.
[0092] The memory module 204 may be embodied as one or more non-volatile memory devices, one or more volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. For example, the memory module 204 may be embodied as semiconductor memories, such as flash memory, mask Read Only Memory (ROM), programmable ROM (PROM), Erasable PROM (EPROM), Random Access Memory (RAM), etc., and the like. In at least some embodiments, the memory module 204 may store a machine learning model (not shown in FIG. 2).
[0093] In at least some embodiments, the memory module 204 stores logic and/or instructions, which may be used by modules of the processing module 202, such as the content compositing module 222 and the content customization module 224. For example, the memory module 204 includes instructions for (a) receiving information from various devices such as the second computing device 112, the cameras 110, and the camera tracker 114, and storing the same as the first media content 216, the second media content 218, and the tracking information 220 in the database 214, (b) accessing the first media content 216, the second media content 218, and the tracking information 220 from the database 214, (c) generating third media content 226 to be streamed to a plurality of second users based, at least in part, on the first media content 216, the second media content 218, and the tracking information 220, (d) generating the third media content 226 by determining a background of the first user 102 in the second media content to be one of a physical set and a green screen set by analyzing a plurality of features associated with the second media content, (e) stitching the first media content 216 with the second media content 218 in real time by performing media acquisition, calibration, registration, and blending of the first media content 216 and the second media content 218, (f) generating the third media content 226 including the background content corresponding to the second media content 218 and the foreground content corresponding to the first media content 216 when the background of the first user 102 in the second media content includes the physical set, (g) generating the third media content 226 including the background content corresponding to a virtual background and the foreground content corresponding to at least one of the first media content 216 and the second media content 218 when the background of the first user 102 in the second media content includes the green screen set, wherein the first user 102 standing in the green screen set is keyed out from the green screen set and positioned on top of the virtual background, (h) determining a relative position of the first user 102 with respect to the first media content 216 based, at least on, the tracking information, and (i) switching between the foreground content and the midground content on top of the background content based at least on the relative position of the first user 102 with respect to the first media content 216, the switching including accessing the relevant historical data and determining, via the ML model, the switching criteria.
[0094] In an embodiment, the I/O module 206 may include mechanisms configured to receive inputs from and provide outputs to an operator of the system 200. The term ‘operator of the system 200’ as used herein may refer to one or more individuals, whether directly or indirectly associated with managing the digital OTT platform on behalf of the content provider. To enable the reception of inputs and provide outputs to the system 200, the I/O module 206 may include at least one input interface and/or at least one output interface. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, a microphone, and the like. Examples of the output interface may include, but are not limited to, a display such as a light-emitting diode display, a Thin-Film Transistor (TFT) display, a Liquid Crystal Display (LCD), an Active-Matrix Organic Light-Emitting Diode (AMOLED) display, a microphone, a speaker, a ringer, and the like. In an example embodiment, at least one module of the system 200 may include an I/O circuitry (not shown in FIG. 2) configured to control at least some functions of one or more elements of the I/O module 206, such as, for example, a speaker, a microphone, a display, and/or the like. The processing module 202 of the system 200 and/or the I/O circuitry may be configured to control one or more functions of the elements of the I/O module 206 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the memory module 204, and/or the like, accessible to the processing module 202 of the system 200.
[0095] In at least some embodiments, the operator of the system 200 may use the I/O module 206 to provide inputs to train a ML model stored in the memory module 204. The inputs provided to the ML model may include relevant historical data. The relevant historical data may include historically recorded information related to at least one of the first media content 216, the second media content 218, and the tracking information. Upon receiving all these inputs, the ML model learns from the inputs and generates outputs via the I/O module 206. The outputs may include determining the switching criteria for the foreground content and the midground content. In a non-limiting example, the switching criteria may include a first criterion for switching to the foreground content from the midground content when the first user 102 is determined to be located in the background content and is determined to be interacting with the plurality of objects in the foreground content. In another example, the switching criteria may include a second criterion for switching to the midground content from the foreground content in response to the position and the interaction of the first user 102 with the plurality of objects for displaying information that reflects an outcome of the interactions.
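As a non-limiting sketch of how such a model could be trained from the operator-provided historical inputs, the Python snippet below fits a scikit-learn classifier; the feature layout, the sample records, and the layer labels are hypothetical and serve only to illustrate the train-then-predict flow.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical features per historical record:
# [relative_x, relative_z, interacting_with_foreground (0/1), outcome_pending (0/1)]
historical_features = np.array([
    [0.1, 2.0, 1, 0],
    [0.8, 0.5, 0, 1],
    [0.4, 1.2, 1, 0],
    [0.9, 0.3, 0, 1],
])
# Layer historically used for each record: 1 = foreground, 0 = midground.
historical_layers = np.array([1, 0, 1, 0])

switch_model = RandomForestClassifier(n_estimators=25, random_state=0)
switch_model.fit(historical_features, historical_layers)

# At run time, the relative position and interaction state of the first user 102 drive
# the predicted layer to be shown on top of the background content.
predicted_layer = switch_model.predict([[0.2, 1.8, 1, 0]])[0]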
[0096] The communication module 208 is configured to facilitate communication between the system 200 and other components of the digital platform server (not shown in FIG. 2). In some embodiments, the communication module 208 may be configured to facilitate communication between the system 200 and one or more remote entities over the network 124 (shown in FIG. 1). For example, the communication module 208 is capable of facilitating communication with content viewer devices of content viewers, with Internet Service Providers (ISPs), with edge servers associated with content delivery networks (CDNs), with content ingestion servers, and the like.
[0097] The storage module 210 is any computer-operated hardware suitable for storing and/or retrieving data. In one embodiment, the storage module 210 is configured to store the historic data, sensor data, a plurality of images, a plurality of videos, graphical content, marker-related information, information related to various events and their multimedia content (e.g., sports), live streaming content, an outcome of an interaction of the presenter 102 with CG content, an instance of a three-dimensional (3D) content creation engine, and the like. The storage module 210 may include multiple storage units such as hard drives and/or solid-state drives in a Redundant Array of Inexpensive Disks (RAID) configuration. In some embodiments, the storage module 210 may include a Storage Area Network (SAN) and/or a Network Attached Storage (NAS) system. In one embodiment, the storage module 210 may correspond to a distributed storage system, wherein individual databases are configured to store custom information, such as user playback event logs.
[0098] In some embodiments, the processing module 202 and/or other components of the content compositing module 222 may access the storage module 210 using a storage interface (not shown in FIG. 2). The storage interface may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processing module 202 and/or the content compositing module 222 with access to the storage module 210.
[0099] In at least one example embodiment, the communication module 208 is configured to receive content such as video content from a remote source, such as the second computing device 112, the cameras 110, and the camera tracker 114, which may be configured to ingest the video content into the system 200. The term ‘content’ is explained hereinafter with reference to media content, but it is understood that the content may include any type of multimedia content, such as gaming content, audio content, audio-visual content, etc. Herein, the content may refer to the first media content 216, the second media content 218, and the tracking information 220 received via the communication module 208 and stored in the database 214.
[00100] The communication module 208 may be configured to forward such inputs (i.e., the first media content 216, the second media content 218, and the tracking information 220) to the processing module 202, or more specifically, to the content compositing module 222. The content compositing module 222 in conjunction with the instructions stored in the memory module 204 is configured to process such inputs (i.e., the first media content 216, the second media content 218, and the tracking information 220) to generate the third media content 226 which is then stored in the database 214 as shown in FIG. 2.
[00101] In an embodiment, the third media content 226 may correspond to the interactive MR content. Further, the third media content 226 may include background content and at least one of foreground content and midground content. Further, the content compositing module 222 is configured to determine a background of the first user 102 in the second media content 218 to be one of a physical set and a green screen set by analyzing a plurality of features associated with the second media content 218.
[00102] In various non-limiting examples, the content compositing module 222 is further configured to stitch the first media content 216 with the second media content 218 in real time by performing media acquisition, calibration, registration, and blending of the first media content 216 and the second media content 218. Further, the content compositing module 222 is configured to generate the third media content 226, such that the third media content 226 includes background content corresponding to the second media content 218 and foreground content corresponding to the first media content 216 when the background of the first user 102 in the second media content includes the physical set.
[00103] Alternatively, the content compositing module 222 is configured to generate the third media content 226, such that the third media content 226 includes background content corresponding to a virtual background and the foreground content corresponding to at least one of the first media content 216 and the second media content 218 when the background of the first user 102 in the second media content includes a green screen set. The first user 102 standing in the green screen set is keyed out from the green screen set and positioned on top of the virtual background.
[00104] The content customization module 224 of the processing module 202 is configured to access relevant historical data including historically recorded information related to at least one of the first media content 216, the second media content 218, and the tracking information 220. The content customization module 224 may further be configured to determine, via the ML model, switching criteria for the foreground content and the midground content. In an embodiment, the switching criteria may be determined based, at least in part, on the relevant historical data and the relative position of the first user 102 with respect to the first media content 216. Further, in a non-limiting example, the switching criteria may include a first criterion for switching to the foreground content from the midground content when the first user 102 is determined to be located in the background content and is determined to be interacting with the plurality of objects in the foreground content. In another example, the switching criteria may include a second criterion for switching to the midground content from the foreground content in response to the position and the interactions of the first user 102 with the plurality of objects for displaying information that reflects an outcome of the interactions.
[00105] FIG. 3A depicts a user interface 300 displayed to the presenter 102 through the head-mounted display (HMD) device 104. Upon switching ON the HMD device 104, the HMD device 104 scans and meshes the physical space in the viewing direction of the presenter 102. The user interface 300 includes a main menu 302 that is displayed when the presenter 102 performs a hand gesture. The information regarding the main menu 302 is sent from the HMD device 104 to a computing device (such as the second computing device 112) associated with cameras (such as cameras 110) in a studio via a computing device (such as the first computing device 106) associated with the HMD device 104. The second computing device 112 shares the information with a system (such as the system 116) for streaming content to a content viewer. Hence, the user interface 300 is displayed to the content viewer (such as the content viewer 122) through the content viewer device 120. The user interface 300 includes a pattern of icons related to different information pertaining to the sport (in this case ‘Cricket’). In the illustrated example in FIG. 3A, the icons displayed are for fielding 302a, player stats 302b, video wall 302c, telestration 302d, hawk-eye 302e, and weather analysis 302f. These icons are shown for example purposes only and can be customized in any other form.
[00106] FIG. 3B depicts the third media content including content (i.e., video frame) 310 displayed to a content viewer 122 along with the MR graphical content displayed to the presenter 102 upon selection of the hawk-eye icon 302e. The content 310 displayed to the content viewer 122 is part of a show recorded in a studio 312 along with the graphical content displayed to the presenter 102. A cricket pitch 314 digitally rendered by the HMD device 104 in the studio 312 is streamed to the content viewer 122. It may be observed that the parameters visible on the cricket pitch 314 digitally rendered by the HMD device 104 may include various length markings such as 2 meters (m), 4m, 6m, 8m, halfway, and the like. These are referred to as 2M, 4M, 6M, 8M, HALFWAY, etc., in FIG. 3B. Similarly, strike rate markings such as 140, 89, 180, and the like may also be visible. The HMD device 104 receives information related to the rendering of the pitch 314 from the first computing device 106 associated with the HMD device 104. The first computing device 106 is in communication with the media database 108 from which the information is received. Upon selection of the hawk-eye icon 302e, a request is sent to the first computing device 106 to provide content related to the selected hawk-eye icon 302e. After receiving the request from the HMD device 104, the 3D content creation engine 128 running on the first computing device 106 requests information related to the selected hawk-eye icon 302e from the media database 108. The media database 108 is an external storage device that stores data related to the hawk-eye™ system used in sports. The digitally rendered pitch 314 is categorized into different regions based on the length of the pitch. The different regions can be halfway, the slot, yorker, and full toss (or halfway, short-pitch, overpitch, and yorker). A virtual mannequin 316 of a player is also depicted on the pitch 314 nearer to the stumps. For example purposes, the average strike rate of the player in the different regions of the pitch 314 is shown beside the pitch 314 (e.g., a strike rate of 180 is shown when the ball is pitched on the halfway region, and a strike rate of 89 when the ball is pitched on the short-pitch region). Further, a ball 318 and its trajectory 320 before and after the contact with the pitch are shown to the presenter 102. The ball 318 and its trajectory 320 are digital representations of a delivery from a bowler faced by the player in a cricket match (such as an ODI, a test match, a twenty-20 international (T20I), or any domestic league). The digitally rendered pitch 314 and the ball 318 are part of mixed reality graphics displayed by the HMD device 104 that the presenter 102 can interact with.
[00107] FIG. 3C depicts the third media content including content (i.e., video frame) 330 displayed to the content viewer 122 along with the interaction of the presenter 102 with the MR graphical content. The presenter 102 interacts with the ball 318 using a hand gesture. Upon selection of the ball 318 by the presenter 102, the ball 318 is highlighted by placing a circle 332 below the ball 318. For the purposes of explanation, a magnified view 338 of the hand gesture 336 made by the presenter 102 to select the ball 318 is shown in FIG. 3C. In addition to placing the circle 332 below the ball 318, a line (see, 334) from the hand connecting the ball 318 is solidified upon the selection of the ball 318. The presenter 102 can move the point of contact of the ball 318 with the pitch 314 to any of the regions of the pitch 314.
[00108] FIG. 3D depicts the third media content including content (i.e., video frame) 340 displayed to the content viewer 122 along with the movement of the ball 318 on the pitch 314 by the presenter 102. Upon selection of the ball 318, the presenter 102 can move the ball 318 on the pitch 314. The information related to the selection of the ball 318 is sent to the first computing device 106 by the HMD device 104. The first computing device 106 shares the outcome of the selection of the ball 318 with the second computing device 112. The second computing device 112 generates the computer graphics (CG) content related to the selection of the ball 318 and sends the CG content to the system 116 for streaming the content related to the selection of the ball 318. The presenter 102 moves the point of contact of the ball 318 through the movement of the hand used for the hand gesture 336 while the hand gesture 336 is in use.
[00109] FIG. 3E depicts the third media content including content (i.e., video frame) 350 displayed to the content viewer 122 along with a new point of contact 352 of the ball 318 with the pitch 314. The HMD device 104 sends the information related to the new point of contact 352 to the first computing device 106. The first computing device 106 generates a new trajectory 358 of the ball 318 using the new point of contact 352 and history data received from the media database 108. The first computing device 106 accesses the media database 108 to obtain the history data and obtains the new point of contact of the ball from the HMD device 104 for the generation of the new trajectory of the ball using predefined functions available in the 3D content creation engine 128. In particular, the 3D content creation engine 128 generates the new trajectory 358 of the ball 318. To release the ball 318 from the selection, the presenter 102 uses another hand gesture 354, shown in a magnified view 356, at the new point of contact 352 of the ball 318 with the pitch 314. The new trajectory 358 of the ball 318 is displayed to the presenter 102 and the content viewer 122 upon unselecting the ball 318.
[00110] FIG. 3F depicts the third media content including content (i.e., video frame) 360 displayed to the content viewer 122 along with the new point of contact 352 of the ball 318 with the pitch 314, in which the ball is no longer in selection. The circle 332 is no longer shown once the ball is deselected.
[00111] FIGS. 4A-4F, collectively, depict content related to another scenario in which the presenter 102 interacts with the field settings of team A in a cricket match, as per various embodiments of the present disclosure.
[00112] FIG. 4A depicts a user interface 400 displayed to the presenter 102 through the head-mounted display (HMD) device 104 which is similar to the one displayed in FIG. 3A, in accordance with an embodiment of the present disclosure. The user interface 400 includes a main menu 402 and icons 402a, 402b, 402c, 402d, 402e, and 402f similar to the menu and icons shown in FIG. 3A.
[00113] FIG. 4B depicts third media content including the content (i.e., video frame) 410 displayed to the content viewer 122 along with graphical content displayed to the presenter 102 upon selection of the fielding icon 402a. The content 410 displayed to the content viewer 122 is part of a show recorded in the studio 412 along with the graphical content displayed to the presenter 102. A cricket stadium 414 digitally rendered by the HMD device 104 in the studio 412 is streamed to the content viewer 122. The HMD device 104 receives information related to the rendering of the cricket stadium 414 from the first computing device 106 associated with the HMD device 104. The first computing device 106 is in communication with the media database 108 from which the information is received. The media database 108 is an external storage device that stores data related to the measurement and design of stadiums available in cricket. Upon selection of the fielding icon 402a, a request is sent to the first computing device 106 to provide content related to the selected fielding icon 402a. After receiving the request from the HMD device 104, the instance of the 3D content creation engine 128 running on the first computing device 106 requests information related to the selected fielding icon 402a from the media database 108. The media database 108 also stores data related to the field settings used in cricket. The digitally rendered cricket stadium 414 is part of mixed reality graphics displayed by the HMD device 104 that the presenter 102 can interact with.
[00114] FIG. 4C depicts third media content including content (i.e., video frame) 420 displayed to the content viewer 122 along with a field view 422 of the cricket stadium 414 in which players of a match are displayed to the presenter 102. Upon selection of the fielding icon 402a, a request is sent from the HMD device 104 to the first computing device 106. The first computing device 106 requests information related to field settings from the media database 108. The field setting can be a defensive field setting used in Test Cricket. The field view 422 shows a digitally rendered cricket ground with digitally rendered players (for example, see, 424 and 426) on the ground. The players include eleven players 424 on the bowling side and two players 426 on the batting side. The eleven players 424 are placed on the ground using predefined field settings received from the first computing device 106 associated with the HMD device 104. The two players 426 of the batting side are placed on the opposite ends of the pitch closer to the wickets. The presenter 102 can interact with the players 424 on the bowling side on the ground.
[00115] FIG. 4D depicts third media content including content (i.e., video frame) 430 displayed to the content viewer 122 along with a selection of a player by the presenter 102 through the use of a hand gesture. The hand gesture used for the selection of the player is similar to the hand gesture used for the selection of the ball in FIG. 3C. A line from the hand of the presenter 102 towards the player 424 is shown to the presenter 102 as well as the content viewer 122. Upon selection of the player 424 by the presenter 102, the player 424 is highlighted by placing a circle 432 on the surface below the player 424. In addition to placing the circle 432 below the player 424, the line (see, 434) from the hand connecting the player 424 is solidified upon the selection of the player 424. The information related to the selection of the player 424 is sent to the first computing device 106 from the HMD device 104. The first computing device 106 shares information related to the outcome of the selection of the player 424 with the second computing device 112. The second computing device 112 generates graphical content related to the selection of the player 424 and relays the graphical content to the system 116. Thereby, the graphical content related to the selection of the player 424 is streamed to the content viewer 122 by the system 116. The presenter 102 can select a new position for the selected player 424 using the line.
[00116] FIG. 4E depicts third media content including content (i.e., video frame) 440 displayed to the content viewer 122 along with a selection of a new position 442 for the player 424 by the presenter 102. Further, the movement of the selected player 424 is shown to the presenter 102 as well as the content viewer 122. Upon selection of the new position 442, the first computing device 106 generates information related to the movement of the player 424. In particular, the 3D content creation engine 128 generates data related to the movement of the player 424 and sends it to the HMD device 104. The HMD device 104 displays the movement of the player 424 based on the information received from the first computing device 106.
[00117] FIG. 4F depicts content (i.e., video frame) 450 displayed to the content viewer 122 along with the player 424 moved to the new position 442. The presenter 102 can select another player for changing the position of the other player on the ground.
[00118] FIGS. 5A-5D, collectively, depict content related to another scenario in which the presenter 102 interacts with the weather settings on a match day.
[00119] FIG. 5A depicts a user interface 500 displayed to the presenter 102 through the head-mounted display (HMD) device 104 which is similar to the one displayed in FIG. 3A. The user interface 500 includes a main menu 502 and icons 502a, 502b, 502c, 502d, 502e, and 502f similar to the menu and icons shown in FIG. 3A.
[00120] FIG. 5B depicts a user interface 510 displayed to the presenter 102 through the HMD device 104 upon selection of the weather analysis icon 502f. Upon selection of the weather analysis icon 502f, a sub-menu (i.e., weather presets 512) is displayed that includes more icons such as overcast 512a, rainy day 512b, foggy day 512c, sunny day 512d, and thunderstorm 512e. The HMD device 104 receives information related to the weather presets 512 from the first computing device 106 associated with HMD device 104. The first computing device 106 is in communication with a media database 108 from which the information is received. The media database 108 is an external storage device that stores history data related to various weather conditions. Upon selection of the weather analysis icon 502f, a request is sent to the first computing device 106 to provide content related to the selected weather analysis icon 502f. After receiving the request from the HMD device 104, the instance of the 3D content creation engine 128 running on the first computing device 106 requests information related to the selected weather analysis icon 502f from the media database 108.
[00121] FIG. 5C depicts third media content including content (i.e., video frame) 520 displayed to the content viewer 122 along with a stadium and one or more sliders 522a, 522b, 522c, 522d, 522e, 522f, and 522g (herein referred to as a slider 522) displayed to the presenter 102. Upon selection of the weather analysis icon 502f, a request is sent from the HMD device 104 to the first computing device 106. The first computing device 106 requests information related to weather conditions from the media database 108. The weather conditions represented in the form of the one or more sliders 522a-522g are displayed to the presenter 102. The one or more sliders 522a-522g include values related to different intensities of the weather conditions. The presenter 102 can interact with the one or more sliders 522a-522g to simulate various weather conditions at the stadium. For example, the presenter 102 can move the wind speed slider 522d to display wind speed graphics to the presenter 102. Upon moving the wind speed slider 522d, the first computing device 106, through the 3D content creation engine 128, generates the graphics required to represent the wind speed value based on the history data from the media database 108. Further, the presenter 102 can analyze the delivery of a ball (such as the trajectory of the ball) from a bowler based on the different weather conditions. The presenter 102 can interact with the time of day slider 522a to change the time from sunrise to sunset. The presenter 102 may review different events of a cricket match by interacting with multiple sliders. The presenter 102 can interact with the one or more sliders 522a-522g through the use of hand gestures.
[00122] FIG. 5D depicts third media content including content (i.e., video frame) 530 displayed to the content viewer 122 along with a simulation of wind speed and direction. The content to be displayed to the content viewer 122 is part of a show recorded at the studio 532. Upon interaction of the presenter 102 with the wind speed slider 522d and the directions slider 522b, wind flow graphics are simulated at the stadium 534 displayed to the presenter 102. To generate and display the wind flow graphics to the presenter 102, the HMD device 104 sends a request for information when the presenter 102 interacts with the one or more sliders 522a-522g. The first computing device 106 requests information related to the weather conditions from the media database 108. The first computing device 106, through the use of the 3D content creation engine 128, creates graphics related to wind direction upon selection of the wind speed slider 522d and the directions slider 522b. The first computing device 106 shares information related to the outcome of the movement of the one or more sliders 522a-522g with the second computing device 112. The second computing device 112 generates graphical content related to the movement of the sliders and relays the graphical content to the system 116. Thereby, the graphical content related to the movement of the sliders and the simulation of the wind flow graphics are streamed to the content viewer 122 by the system 116.
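A non-limiting Python sketch of how slider positions could be translated into the parameters a rendering engine would need for such weather graphics; the slider names, value ranges, and parameter names are illustrative assumptions only.

def weather_parameters(sliders: dict) -> dict:
    # Map normalized slider positions (0.0 to 1.0) to hypothetical simulation parameters.
    return {
        "time_of_day_hours": 6.0 + 12.0 * sliders.get("time_of_day", 0.5),  # sunrise to sunset
        "wind_speed_kmph": 60.0 * sliders.get("wind_speed", 0.0),
        "wind_direction_deg": 360.0 * sliders.get("direction", 0.0),
        "cloud_cover": sliders.get("overcast", 0.0),
        "rain_intensity": sliders.get("rain", 0.0),
    }

# For example, weather_parameters({"wind_speed": 0.4, "direction": 0.25}) yields the
# values used to drive the wind flow graphics at the virtual stadium.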
[00123] In one scenario, the presenter 102 can interact with virtual mannequins of players (such as players 316, 424, and 426) in the mixed reality graphical content displayed to the presenter 102. For example, the presenter 102 can change the posture or stance of a batsman (see 316) and analyze a shot played by the batsman. The presenter 102 can change the shot played by the batsman based on the change in posture or stance or foot movement of the batsman. The first computing device 106 requests shot information from the media database 108 to generate graphics related to changes in shot. The media database 108 includes information related to various shots played by various batsmen in cricket. Further, the presenter 102 can change the angle at elbow joints and knee joints to analyze the biomechanics (such as bowling, batting, running, and fielding) of a player. For example, the presenter 102 can see the angle at the elbow joints and knee joints of a player and can change the posture of a player by changing the angle at elbow joints and knee joints.
[00124] In addition to the above scenarios, more scenarios are possible, including the usage of telestration, player stats, and a video wall. When the presenter 102 selects the telestration option, a region is displayed behind the digital mannequin (shown in FIG. 3B) of the player. The region may include one or more sub-regions, each indicating an average strike rate of the player when the trajectory of the ball is in that sub-region. The average strike rate displayed is generated based on the historical data of the player obtained from the media database 108.
[00125] When the presenter 102 selects the video wall option, a user interface with a list of players in the sport is displayed to the presenter 102. The presenter 102 may choose a player, and then a list of video clips of the selected player is displayed to the presenter 102. The list includes the video clips of the respective player that are available for display. Each video clip may include content related to the player's performance in a match. Upon selection of a video clip from the list of video clips, the video clip is played in a digital frame to the presenter 102. The presenter 102 can play, pause, or change the playback of the video clips through the use of hand gestures as well as by selecting digital options placed near the video playback. The video clips are stored in the media database 108, which is accessed by the first computing device 106 upon a request from the HMD device 104. In one case, a live broadcast of a match, received from a real-world stadium, can be displayed to the presenter 102.
[00126] Further, the properties/conditions of the digitally rendered pitch in FIG. 3B can be changed using the historic data from the database. For example, the pitch conditions can be dry, grass, wet, etc. The ball trajectories are generated using the point of contact of the ball, the pitch conditions, and the weather conditions at the stadium. For example, the bounce of the ball after contact with the pitch will be low when the pitch is dry; therefore, the trajectory of the ball is generated by taking the pitch conditions into consideration. The data related to pitch conditions and the data related to the bounce of the ball on those pitches can be available in the database.
[00127] The ball trajectories can be changed by changing any of the point of contact of the ball, the pitch conditions, and the weather conditions at the stadium. Furthermore, the point of release of a ball by a bowler can be changed by the presenter using hand gestures and movements of the hand of the presenter. The point of release of the ball can also be changed by selecting another bowler who has a different point of release. The ball trajectory will also change when the point of release of the ball is changed by the presenter.
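A deliberately simplified, non-limiting kinematic sketch in Python of trajectory regeneration: given the release point, the (possibly moved) point of contact with the pitch, and a bounce factor derived from the pitch condition, it samples a two-segment path. The bounce factors and the simplified physics are assumptions for illustration; the disclosure leaves the actual trajectory generation to the 3D content creation engine 128.

import numpy as np

PITCH_BOUNCE = {"dry": 0.55, "grass": 0.75, "wet": 0.45}  # illustrative restitution values

def sample_trajectory(release_xyz, contact_xy, pitch_condition="grass", flight_time=0.6, n=40):
    g = 9.81
    rx, ry, rz = release_xyz          # release point; rz is the release height in metres
    cx, cy = contact_xy               # point of contact on the pitch plane (z = 0)
    t = np.linspace(0.0, flight_time, n)
    # Pre-bounce segment: linear ground track with projectile drop reaching z = 0 at contact.
    vz0 = (0.5 * g * flight_time ** 2 - rz) / flight_time
    pre = np.stack([rx + (cx - rx) * t / flight_time,
                    ry + (cy - ry) * t / flight_time,
                    rz + vz0 * t - 0.5 * g * t ** 2], axis=1)
    # Post-bounce segment: vertical velocity scaled by the pitch-dependent bounce factor.
    vz_after = -(vz0 - g * flight_time) * PITCH_BOUNCE[pitch_condition]
    post = np.stack([cx + (cx - rx) / flight_time * t,
                     cy + (cy - ry) / flight_time * t,
                     vz_after * t - 0.5 * g * t ** 2], axis=1)
    return np.concatenate([pre, post])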
[00128] FIG. 6 illustrates a process flow diagram depicting a method 600 for dynamically generating interactive mixed reality content, in accordance with an embodiment of the present disclosure. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a system (such as the system 116) explained with reference to FIG. 1 and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 600 starts at operation 602.
[00129] At operation 602 of the method 600, the system 116 receives computer graphics (CG) content from a second computing device 112 associated with at least one camera 110. The CG content includes content related to mixed reality (MR) graphics and information related to interactions of a presenter 102 with a plurality of objects in the CG content. In a non-limiting example, the CG content and the information related to the interactions of the presenter 102 may be collectively referred to as first media content 216.
[00130] At operation 604 of the method 600, the system 116 receives video content from the at least one camera 110. The video content is recorded content of the presenter 102 in an environment such as, but not limited to, a studio setup. In a non-limiting example, the video content received from the at least one camera 110 may be interchangeably referred to as second media content 218.
[00131] At operation 606 of the method 600, the system 116 receives tracking information 220 of the at least one camera 110 from a camera tracker 114. The tracking information 220 includes information related to a position and tilt of the at least one camera 110 in the studio setup.
[00132] At operation 608 of the method 600, the system 116 generates content to be streamed based at least on the CG content, the video content, and the tracking information of the at least one camera. The streaming content includes foreground content, midground content, and background content. In a non-limiting example, the streaming content may be interchangeably referred to as third media content.
[00133] At operation 610 of the method 600, the system 116 switches between the foreground and the midground content based on the position of the presenter 102 with respect to the plurality of objects in the CG content.
[00134] FIG. 7 illustrates a process flow diagram depicting a method 700 for generating interactive mixed reality content for broadcasting applications, in accordance with an embodiment of the present disclosure. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a system (such as the system 116) explained with reference to FIG. 1 and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 700 starts at operation 702.
[00135] At operation 702 of the method 700, the system 116 accesses first media content 216 from a media database 108 associated with the system 116. The first media content 216 may include computer graphics (CG) content related to an event displayed to a first user 102 and information related to interactions of the first user 102 with a plurality of objects in the CG content.
[00136] At operation 704 of the method 700, the system 116 accesses second media content 218 from the media database 108. The second media content 218 may include recorded content including one or more actions and one or more positions of the first user 102. The recorded content is captured by at least one camera.
[00137] At operation 706 of the method 700, the system 116 accesses tracking information 220 from the media database 108. The tracking information 220 may include information related to variation in a position of the at least one camera and the first user 102 and a plurality of local markers corresponding to a position of the plurality of objects.
[00138] At operation 708 of the method 700, the system 116 generates third media content to be streamed to a plurality of second users based, at least in part, on the first media content 216, the second media content 218, and the tracking information 220. The third media content includes background content and at least one of foreground content and midground content. The generation of the third media content may include performing steps 708a and 708b.
[00139] At operation 708a of the method 700, the system 116 determines a relative position of the first user 102 with respect to the first media content 216 based, at least in part, on the tracking information 220.
[00140] At operation 708b of the method 700, the system 116 switches between the foreground content and the midground content on top of the background content based at least on the relative position of the first user 102 with respect to the first media content 216.
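Operations 708a and 708b can be illustrated with a simple depth comparison derived from the tracking information 220: the first user's distance from the camera is compared against the distance of a marked object, and the result drives the foreground/midground switch. The sketch below is one possible, hypothetical reading; the helper names, the use of camera-relative distance, and the margin value are assumptions, not disclosed details.

```python
import math

def relative_depth(presenter_pos, marker_pos, camera_pos):
    """Signed difference (in metres) between the presenter's and the marker's
    distance from the camera: positive when the presenter stands behind the
    marked object relative to the camera, negative when in front of it."""
    return math.dist(camera_pos, presenter_pos) - math.dist(camera_pos, marker_pos)

def select_layer(presenter_pos, marker_pos, camera_pos, margin=0.1):
    """Hypothetical switching rule for operation 708b: keep the CG content in the
    foreground while the presenter stands behind the marked object by more than
    `margin`; otherwise switch it to the midground so the presenter appears in
    front of the virtual object, on top of the background content."""
    if relative_depth(presenter_pos, marker_pos, camera_pos) > margin:
        return "foreground"
    return "midground"

# Example: presenter 4.2 m from the camera, marked object 3.5 m away ->
# the presenter is behind the object, so the CG content stays in the foreground.
layer = select_layer((0.0, 0.0, 4.2), (0.0, 0.0, 3.5), (0.0, 0.0, 0.0))
```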
[00141] The disclosed methods with reference to FIGS. 6 and 7, or one or more operations of the system 200, may be implemented using software including computer-executable instructions stored on one or more computer-readable media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (e.g., DRAM or SRAM), or nonvolatile memory or storage components (e.g., hard drives or solid-state nonvolatile memory components, such as Flash memory components)) and executed on a computer (e.g., any suitable computer, such as a laptop computer, netbook, Web book, tablet computing device, smartphone, or other mobile computing device). Such software may be executed, for example, on a single local computer or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a remote web-based server, a client-server network (such as a cloud computing network), or other such networks) using one or more network computers. Additionally, any of the intermediate or final data created and used during the implementation of the disclosed methods or systems may also be stored on one or more computer-readable media (e.g., non-transitory computer-readable media) and is considered to be within the scope of the disclosed technology. Furthermore, any of the software-based embodiments may be uploaded, downloaded, or remotely accessed through a suitable communication means. Such suitable communication means include, for example, the Internet, the World Wide Web, an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
[00142] Although the invention has been described with reference to specific exemplary embodiments, it is noted that various modifications and changes may be made to these embodiments without departing from the broad spirit and scope of the invention. For example, the various operations, blocks, etc., described herein may be enabled and operated using hardware circuitry (for example, complementary metal oxide semiconductor (CMOS) based logic circuitry), firmware, software, and/or any combination of hardware, firmware, and/or software (for example, embodied in a machine-readable medium). For example, the apparatuses and methods may be embodied using transistors, logic gates, and electrical circuits (for example, application specific integrated circuit (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
[00143] Particularly, the system 200 and its various components may be enabled using software and/or using transistors, logic gates, and electrical circuits (for example, integrated circuit circuitry such as ASIC circuitry). Various embodiments of the invention may include one or more computer programs stored or otherwise embodied on a computer-readable medium, wherein the computer programs are configured to cause a processor or the computer to perform one or more operations. A computer-readable medium storing, embodying, or encoded with a computer program, or similar language, may be embodied as a tangible data storage device storing one or more software programs that are configured to cause a processor or computer to perform one or more operations. Such operations may be, for example, any of the steps or operations described herein. In some embodiments, the computer programs may be stored and provided to a computer using any type of non-transitory computer-readable media. Non-transitory computer-readable media includes any type of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), optical magnetic storage media (e.g. magneto-optical disks), CD-ROM (compact disc read-only memory), CD-R (compact disc recordable), CD-R/W (compact disc rewritable), DVD (Digital Versatile Disc), BD (BLU-RAY® Disc), and semiconductor memories (such as mask ROM, PROM (programmable ROM), EPROM (erasable PROM), flash memory, RAM (random access memory), etc.). Additionally, a tangible data storage device may be embodied as one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. In some embodiments, the computer programs may be provided to a computer using any type of transitory computer-readable media. Examples of transitory computer-readable media include electric signals, optical signals, and electromagnetic waves. Transitory computer-readable media can provide the program to a computer via a wired communication line (e.g., electric wires, and optical fibers) or a wireless communication line.
[00144] Various embodiments of the invention, as discussed above, may be practiced with steps and/or operations in a different order, and/or with hardware elements in configurations which are different from those disclosed. Therefore, although the invention has been described based on these exemplary embodiments, it is noted that certain modifications, variations, and alternative constructions may be apparent and well within the spirit and scope of the invention.
[00145] Although various exemplary embodiments of the invention are described herein in a language specific to structural features and/or methodological acts, the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as exemplary forms of implementing the claims.
CLAIMS:WE CLAIM:
1. A computer-implemented method, comprising:
accessing, by a system, first media content from a database associated with the system, the first media content comprising Computer Graphics (CG) content related to an event displayed to a first user, and information related to interactions of the first user with a plurality of objects in the CG content;
accessing, by the system, second media content from the database, the second media content comprising recorded content comprising one or more actions and one or more positions of the first user, wherein the recorded content is captured by at least one camera;
accessing, by the system, tracking information from the database, the tracking information comprising information related to variation in a position of the at least one camera and the first user and a plurality of local markers corresponding to a position of the plurality of objects; and
generating, by the system, third media content to be streamed to a plurality of second users based, at least in part, on the first media content, the second media content, and the tracking information, the third media content comprising background content and at least one of foreground content and midground content, wherein generating the third media content comprises:
determining, by the system, a relative position of the first user with respect to the first media content based, at least in part, on the tracking information; and
switching, by the system, between the foreground content and the midground content on top of the background content based, at least in part, on the relative position of the first user with respect to the first media content.
2. The computer-implemented method as claimed in claim 1, wherein the event is displayed to the first user as Virtual Reality (VR) content via a Head-Mounted Display (HMD) device worn by the first user.
3. The computer-implemented method as claimed in claim 2, wherein accessing the first media content comprising the CG content comprises:
receiving, by a first computing device associated with the HMD device, information related to interactions of the first user with the VR content corresponding to the event, the information being related to a selection, by the first user, of one or more options displayed in the VR content;
receiving, by the first computing device, historical information related to a plurality of events from a media database associated with the first computing device; and
generating, by the first computing device, three-dimensional (3D) content using Mixed Reality (MR) graphics, in response to the interactions of the first user with the VR content based, at least in part, on analysis of the information related to the interactions and the historical information, the 3D content comprising the plurality of objects related to the corresponding event, that can be interacted with by the first user, wherein the 3D content is displayed to the first user via the HMD device.
4. The computer-implemented method as claimed in claim 3, further comprising:
receiving, by the first computing device, information related to interactions of the first user with the 3D content displayed to the first user via the HMD device, the information comprising variation in one or more physical parameters associated with the plurality of objects;
receiving, by the first computing device, historical data related to the plurality of objects related to the corresponding event from the media database; and
generating, by the first computing device, new 3D content comprising an anticipated output associated with the one or more physical parameters associated with the plurality of objects, in response to the interactions of the first user with the 3D content based, at least in part, on analysis of the information related to the interactions and the historical data related to the plurality of objects.
5. The computer-implemented method as claimed in claim 4, further comprising:
transmitting, by the first computing device, information related to the new 3D content to a second computing device associated with the at least one camera.
6. The computer-implemented method as claimed in claim 5, further comprising:
receiving, by the second computing device, the information related to the new 3D content, the second media content, and the tracking information;
positioning, by the second computing device, a virtual camera in the MR graphics related to the new 3D content, the virtual camera adapted to capture the MR graphics with a second user’s perspective and movements as the virtual camera moves in the MR graphics; and
generating, by the second computing device, the CG content based, at least in part, on the information related to the new 3D content, the second media content, the tracking information, and the MR graphics captured by the virtual camera.
7. The computer-implemented method as claimed in claim 1, wherein the tracking information is captured via a camera tracker associated with each of the at least one camera.
8. The computer-implemented method as claimed in claim 1, wherein generating the third media content comprises:
determining, by the system, a background of the first user in the second media content to be one of a physical set and a green screen set by analyzing a plurality of features associated with the second media content; and
generating, by the system, the third media content comprising one of:
the background content corresponding to the second media content and the foreground content corresponding to the first media content when the background of the first user in the second media content comprises the physical set, and
the background content corresponding to a virtual background and the foreground content corresponding to at least one of the first media content and the second media content when the background of the first user in the second media content comprises the green screen set, wherein the first user standing in the green screen set is keyed out from the green screen set and positioned on top of the virtual background.
9. The computer-implemented method as claimed in claim 1, wherein the switching between the foreground content and the midground content on top of the background content further comprises:
accessing, by the system, relevant historical data comprising historically recorded information related to at least one of the first media content, the second media content, and the tracking information; and
determining, by the system via a Machine Learning (ML) model, switching criteria for the foreground content and the midground content based, at least in part, on the relevant historical data and the relative position of the first user with respect to the first media content, the switching criteria comprising one of:
a first criterion for switching to the foreground content from the midground content when the first user is determined to be located in the background content and is determined to be interacting with the plurality of objects in the foreground content, and
a second criterion for switching to the midground content from the foreground content in response to the position and the interactions of the first user with the plurality of objects for displaying information that reflects an outcome of the interactions.
10. The computer-implemented method as claimed in claim 1, wherein the third media content comprises interactive MR content.
11. The computer-implemented method as claimed in claim 1, wherein the system is a digital platform server associated with a content provider.
12. A system, comprising:
a memory module configured to store instructions; and
a processor in communication with the memory module, the processor configured to execute the instructions stored in the memory module and thereby cause the system, at least in part, to:
access first media content from a database associated with the system, the first media content comprising Computer Graphics (CG) content related to an event displayed to a first user and information related to interactions of the first user with a plurality of objects in the CG content;
access second media content from the database, the second media content comprising recorded content comprising one or more actions and one or more positions of the first user, wherein the recorded content is captured by at least one camera;
access tracking information from the database, the tracking information comprising information related to variation in a position of the at least one camera and the first user and a plurality of local markers corresponding to a position of the plurality of objects; and
generate third media content to be streamed to a plurality of second users based, at least in part, on the first media content, the second media content, and the tracking information, the third media content comprising background content and at least one of foreground content and midground content, wherein for generating the third media content, the system is caused at least in part to:
determine a relative position of the first user with respect to the first media content based, at least in part, on the tracking information; and
switch between the foreground content and the midground content on top of the background content based, at least in part, on the relative position of the first user with respect to the first media content.
13. The system as claimed in claim 12, wherein the event is displayed to the first user as Virtual Reality (VR) content via a Head-Mounted Display (HMD) device worn by the first user.
14. The system as claimed in claim 13, wherein the HMD device is associated with a first computing device, wherein for accessing the first media content comprising the CG content, the first computing device is configured to:
receive information related to interactions of the first user with the VR content corresponding to the event, the information being related to a selection, by the first user, of one or more options displayed in the VR content;
receive historical information related to a plurality of events from a media database associated with the first computing device; and
generate three-dimensional (3D) content using Mixed Reality (MR) graphics, in response to the interactions of the first user with the VR content based, at least in part, on analysis of the information related to the interactions and the historical information, the 3D content comprising the plurality of objects related to the corresponding event, that can be interacted with by the first user, wherein the 3D content is displayed to the first user via the HMD device.
15. The system as claimed in claim 14, wherein the first computing device is further configured to:
receive information related to interactions of the first user with the 3D content displayed to the first user via the HMD device, the information comprising variation in one or more physical parameters associated with the plurality of objects;
receive historical data related to the plurality of objects related to the corresponding event from the media database; and
generate new 3D content comprising an anticipated output associated with the one or more physical parameters associated with the plurality of objects, in response to the interactions of the first user with the 3D content based, at least in part, on analysis of the information related to the interactions and the historical data related to the plurality of objects.
16. The system as claimed in claim 15, wherein the first computing device is further configured to:
transmit information related to the new 3D content to a second computing device associated with the at least one camera.
17. The system as claimed in claim 16, wherein the second computing device is configured to:
receive the information related to the new 3D content, the second media content, and the tracking information;
position a virtual camera in the MR graphics related to the new 3D content, the virtual camera adapted to capture the MR graphics with a second user’s perspective and movements as the virtual camera moves in the MR graphics; and
generate the CG content based, at least in part, on the information related to the new 3D content, the second media content, the tracking information, and the MR graphics captured by the virtual camera.
18. The system as claimed in claim 12, wherein the tracking information is captured via a camera tracker associated with each of the at least one camera.
19. The system as claimed in claim 12, wherein for generating the third media content, the system is further caused to:
determine a background of the first user in the second media content to be one of a physical set and a green screen set by analyzing a plurality of features associated with the second media content; and
generate the third media content comprising one of:
the background content corresponding to the second media content and the foreground content corresponding to the first media content when the background of the first user in the second media content comprises the physical set, and
the background content corresponding to a virtual background and the foreground content corresponding to at least one of the first media content and the second media content when the background of the first user in the second media content comprises the green screen set, wherein the first user standing in the green screen set is keyed out from the green screen set and positioned on top of the virtual background.
20. The system as claimed in claim 12, wherein to switch between the foreground content and the midground content on top of the background content, the system is further caused to:
access relevant historical data comprising historically recorded information related to at least one of the first media content, the second media content, and the tracking information; and
determine, via a Machine Learning (ML) model, switching criteria for the foreground content and the midground content based, at least in part, on the relevant historical data and the relative position of the first user with respect to the first media content, the switching criteria comprising one of:
a first criterion for switching to the foreground content from the midground content when the first user is determined to be located in the background content and is determined to be interacting with the plurality of objects in the foreground content, and
a second criterion for switching to the midground content from the foreground content in response to the position and the interactions of the first user with the plurality of objects for displaying information that reflects an outcome of the interactions.

Documents

Application Documents

# Name Date
1 202221029259-STATEMENT OF UNDERTAKING (FORM 3) [20-05-2022(online)].pdf 2022-05-20
2 202221029259-PROVISIONAL SPECIFICATION [20-05-2022(online)].pdf 2022-05-20
3 202221029259-POWER OF AUTHORITY [20-05-2022(online)].pdf 2022-05-20
4 202221029259-FORM 1 [20-05-2022(online)].pdf 2022-05-20
5 202221029259-DRAWINGS [20-05-2022(online)].pdf 2022-05-20
6 202221029259-DECLARATION OF INVENTORSHIP (FORM 5) [20-05-2022(online)].pdf 2022-05-20
7 202221029259-Proof of Right [08-09-2022(online)].pdf 2022-09-08
8 202221029259-ORIGINAL UR 6(1A) FORM 1-140922.pdf 2022-09-15
9 202221029259-Request Letter-Correspondence [06-04-2023(online)].pdf 2023-04-06
10 202221029259-Power of Attorney [06-04-2023(online)].pdf 2023-04-06
11 202221029259-Form 1 (Submitted on date of filing) [06-04-2023(online)].pdf 2023-04-06
12 202221029259-Covering Letter [06-04-2023(online)].pdf 2023-04-06
13 202221029259-CORRESPONDENCE(IPO)-(WIPO DAS)-18-04-2023.pdf 2023-04-18
14 202221029259-FORM 18 [19-05-2023(online)].pdf 2023-05-19
15 202221029259-DRAWING [19-05-2023(online)].pdf 2023-05-19
16 202221029259-CORRESPONDENCE-OTHERS [19-05-2023(online)].pdf 2023-05-19
17 202221029259-COMPLETE SPECIFICATION [19-05-2023(online)].pdf 2023-05-19
18 Abstract1.jpg 2023-10-20
19 202221029259-FER.pdf 2025-04-07
20 202221029259-RELEVANT DOCUMENTS [29-07-2025(online)].pdf 2025-07-29
21 202221029259-POA [29-07-2025(online)].pdf 2025-07-29
22 202221029259-MARKED COPIES OF AMENDEMENTS [29-07-2025(online)].pdf 2025-07-29
23 202221029259-FORM 13 [29-07-2025(online)].pdf 2025-07-29
24 202221029259-AMENDED DOCUMENTS [29-07-2025(online)].pdf 2025-07-29
25 202221029259-FER_SER_REPLY [06-10-2025(online)].pdf 2025-10-06

Search Strategy

1 SearchHistoryE_28-02-2024.pdf