Method And System For Optimizing Content Insertion During Streaming Of A Media Content

Abstract: A method and a system for optimizing a second media content insertion while streaming first streaming media content are disclosed. The method includes receiving a modified manifest including a second content position marker corresponding to the first streaming media content requested by a content viewer. The method further includes computing, via a distribution model, a break duration prediction based on the first streaming media content and the second content position marker. Furthermore, the method includes determining a streaming order for each of the plurality of second media content based on the break duration prediction. Furthermore, the method includes generating an updated manifest based on the modified manifest and the streaming order. The method also includes facilitating a transmission of the updated manifest to an electronic device associated with the content viewer.

Patent Information

Filing Date: 10 October 2023
Publication Number: 16/2025
Publication Type: INA
Invention Field: COMMUNICATION

Applicants

Star India Private Limited
Star House, Urmi Estate, 95, Ganpatrao Kadam Marg, Lower Parel (W), Mumbai 400013, Maharashtra, India

Inventors

1. Bingyang Huang
Unit N711, Floor 7, North Building, Raycom Infotech Park Tower C, No.2 Kexueyuan South Road, Haidian District, Beijing 100190, China
2. Haifang Qin
Unit N711, Floor 7, North Building, Raycom Infotech Park Tower C, No.2 Kexueyuan South Road, Haidian District, Beijing 100190, China
3. Tao Xiong
Unit N711, Floor 7, North Building, Raycom Infotech Park Tower C, No.2 Kexueyuan South Road, Haidian District, Beijing 100190, China

Specification

The present technology generally relates to interlacing one or more media content while content viewers are streaming content and, more particularly, to a method and system for optimizing a second media content insertion during a live media content stream.

BACKGROUND
Recently, on-demand media streaming as well as live-streaming of content has gained popularity. Content viewers (or subscribers) can access an increasing variety of media content using a variety of electronic devices (or user devices) for streaming content. The content viewer generally accesses the desired media content via an application or website associated with a content provider platform or an Over-The-Top (OTT) media service (i.e., over the Internet). The content provider platform refers to an entity that holds the digital license or rights to a plethora of media content and provides such content to content viewers on demand. The content provider platforms typically use a Content Delivery Network (CDN) to deliver the streaming content to the user device associated with the content viewer. It is noted that content provider platforms typically offer several subscription levels, such as a regular subscription, a premium subscription, a free subscription supported by the insertion of other content (such as promotions, advertisements, third-party content, etc.), and/or a combination thereof, to their subscribers, i.e., content viewers, for accessing content from their digital content library. In addition to the subscription fees, additional content received from additional third-party services may also be streamed to the content viewer to enhance their user experience.
In one such scenario, a content viewer may opt to see a live-streaming event such as sporting events, music concerts, award ceremonies, and the like. Upon receiving the content request from the content viewer, the content provider platform may initiate the streaming of the live event. The delivery of content can be supported with second media content such as Advertisements, hereinafter referred to as ‘Ads’, animations, etc. It is noted that Ads serve as a medium to market an enterprise offering, such as a product or a service, to the content viewers of the content being streamed by the content provider platform. The Ads may be inserted in-between content segments of the live-streaming event prior to the encoding of the content, or subsequent to the encoding of the content. In some scenarios, the Ads may be personalized for different content viewers. To that end, if the Ads are inserted prior to encoding live-streaming content, then it may be difficult to personalize the Ads as very little information is known about the end-user at the content encoding stage.
Accordingly, a Server-Side Ad Insertion (SSAI) server typically inserts Ads post-encoding of the media content such as the live stream, i.e., the Ads are included within encoded content segments of the media content in real-time. For example, Ads are streamed in between the shots of live-stream content. It is noted that Ads are not inserted randomly within live-stream content. They are instead placed at different positions within the live stream based on a variety of factors such as a period of inactivity within the live event, a certain activity within the live stream, and the like. For example, during a sporting event such as cricket, Ads may be placed during a time window (known as ‘Ad opportunity window’) where players are discussing strategy with each other on the field. However, the duration and number of Ads that can be delivered during this window are highly erratic and therefore can lead to Ad crashes if the game resumes before the streaming of Ads is finished. The term “Ad crash” refers to an act of forcibly crashing an ongoing advertisement by the content provider platform to stream the live moments of the live-streaming event to the content viewer. For example, when an Ad opportunity window is identified, two Ads of length 10 sec and 5 sec may be streamed to the content viewer. However, if the sporting event resumes after only 12 seconds, then the Ad crash takes place and the live stream event is streamed to the content viewer, thus crashing the Ads after 12 seconds. Now in this time duration, the first Ad may be delivered to the content viewer in its entirety while the second Ad gets crashed, which causes a deficient or poor watch experience for the content viewer. In some scenarios, the abrupt nature of such Ad crashes will cause confusion and frustration for the content viewer who might have been interested in the Ad content.
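The arithmetic of the 10 sec and 5 sec example can be traced with a short sketch. It is an illustration only and is not taken from the disclosure; the function name `simulate_break` and the keys of the returned dictionary are hypothetical, and the Ads are assumed to play back to back from the start of the Ad opportunity window.

```python
def simulate_break(ad_lengths, break_duration):
    """Play Ads back to back and report what happens when the live event resumes.

    ad_lengths: Ad lengths in seconds, in streaming order.
    break_duration: seconds until the live event resumes.
    """
    completed = []      # Ads delivered in their entirety
    crashed = None      # length of the Ad interrupted mid-playback, if any
    played = 0.0        # seconds of the crashed Ad that were shown
    elapsed = 0.0
    for length in ad_lengths:
        if elapsed + length <= break_duration:
            completed.append(length)
            elapsed += length
        else:
            remaining = break_duration - elapsed
            if remaining > 0:
                # The live event resumes partway through this Ad.
                crashed, played = length, remaining
            break
    return {"completed": completed, "crashed": crashed, "played_of_crashed": played}


result = simulate_break([10, 5], 12)
print(result)
```

With a 12 second break, the 10 sec Ad completes and the 5 sec Ad is cut off 2 seconds in, which matches the scenario described above.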
Further, since the Ad was not delivered to the content viewer in its entirety, the same cannot be charged by the content provider platform to the Ad provider thereby, causing wastage of Ad inventory and leading to financial damages which in turn may cause poor service or no service to content viewers with free or regular subscriptions.
To that end, it is understood that the insertion of second media content such as Ads between live-streaming content while it is being streamed by a content viewer is a very tedious and difficult task as the duration and/or length of second media content is highly inconsistent and this process is non-uniform and dependent on a level of activity or inactivity during the live event among other factors.
Accordingly, there is a need to overcome the aforementioned drawbacks caused by the real-time insertion of second media content in between the live-streaming of content to the content viewers.

SUMMARY
Various embodiments of the present disclosure provide a method and a system for optimizing a second media content insertion during a live media content stream.
In an embodiment of the present disclosure, the computer-implemented method performed by a system includes receiving a modified manifest corresponding to first streaming media content requested by a content viewer. The modified manifest includes a second content position marker for inserting a plurality of second media content within the first streaming media content. The computer-implemented method further includes computing, via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker. The computer-implemented method further includes determining a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction. Herein, the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content. The computer-implemented method further includes generating an updated manifest based, at least in part, on the modified manifest and the streaming order. The computer-implemented method further includes facilitating a transmission of the updated manifest to an electronic device associated with the content viewer.
In another embodiment of the present disclosure, a system for optimizing content insertion while streaming content is disclosed. The system includes a memory for storing instructions and a processor configured to execute the instructions and thereby cause the system, at least in part, to receive a modified manifest corresponding to first streaming media content requested by a content viewer. The modified manifest includes a second content position marker for inserting a plurality of second media content within the first streaming media content. The system is further caused to compute, via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker. The system is further caused to determine a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction. Herein, the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content. The system is further caused to generate an updated manifest based, at least in part, on the modified manifest and the streaming order. The system is further caused to facilitate a transmission of the updated manifest to an electronic device associated with the content viewer.
In another embodiment of the present disclosure, a non-transitory computer-readable storage medium includes computer-executable instructions that, when executed by at least a processor of a system, cause a system to perform a method. The method includes receiving a modified manifest corresponding to first streaming media content requested by a content viewer. The modified manifest includes a second content position marker for inserting a plurality of second media content within the first streaming media content. The method further includes computing, via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker. The method further includes determining a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction. Herein, the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content. The method further includes generating an updated manifest based, at least in part, on the modified manifest and the streaming order. The method further includes facilitating a transmission of the updated manifest to an electronic device associated with the content viewer.

BRIEF DESCRIPTION OF THE FIGURES
The advantages and features of the invention will become better understood with reference to the detailed description taken in conjunction with the accompanying drawings, wherein like elements are identified with like symbols, and in which:
FIG. 1 depicts a representation for illustrating the provisioning of media content offered by a content provider platform to a content viewer, in accordance with various embodiments of the present disclosure;
FIG. 2 depicts a block diagram of a system configured for facilitating the stream of first streaming media content with a second media content to content viewers, in accordance with an embodiment of the present disclosure;
FIG. 3 depicts a process flow for determining a plurality of activities based, at least in part, on analyzing the first streaming media content, in accordance with an embodiment of the present disclosure;
FIG. 4 depicts a graph illustrating the variance in the break duration prediction computed using the distribution model, in accordance with an embodiment of the present disclosure;
FIG. 5 depicts a representation of the maximum break duration ‘M’, in accordance with an embodiment of the present disclosure;
FIGS. 6A, 6B, and 6C, collectively, depict a comparison between crash losses calculated using conventional methods and proposed methods, in accordance with an embodiment of the present disclosure;
FIG. 7 depicts a flow diagram of a method for optimizing a second media content insertion during a live media content stream, in accordance with an embodiment of the present disclosure;
FIG. 8 depicts a simplified block diagram of a Content Delivery Network (CDN), in accordance with various embodiments of the present disclosure; and
FIG. 9 depicts a flow diagram of a method for optimizing a second media content insertion during a live media content stream, in accordance with an embodiment of the present disclosure.
The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.

DETAILED DESCRIPTION
The best and other modes for carrying out the present invention are presented in terms of the embodiments, herein depicted in FIGS. 1 to 9. The embodiments are described herein for illustrative purposes and are subject to many variations. It is understood that various omissions and substitutions of equivalents are contemplated as circumstances may suggest or render expedient but are intended to cover the application or implementation without departing from the scope of the invention. Further, it is to be understood that the phraseology and terminology employed herein are for the purpose of the description and should not be regarded as limiting. Any heading utilized within this description is for convenience only and has no legal or limiting effect.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearances of the phrase “in an embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
The terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.

OVERVIEW
Various embodiments of the present disclosure provide methods, systems, electronic devices, and computer program products for optimizing a second media content insertion during a live media content stream for a content viewer while streaming content from a content repository server. It is noted that the content repository server is a Content Delivery Network (CDN).
Conventional content insertion approaches have various drawbacks and limitations as described earlier. One such drawback includes an Ad crash that takes place while the live-stream event is being streamed to the content viewer. The abrupt nature of such Ad crashes will cause confusion and frustration for the content viewer who might have been interested in the Ad content. If the full Ad was not delivered to the content viewer, the content provider platform cannot charge the Ad provider. This creates a waste of Ad inventory and leads to financial damages to the content provider, which in turn may cause poor service or no service to content viewers with free or regular subscriptions.
To overcome such problems or limitations, the present disclosure describes a system that is configured to perform the various operations described herein. The system is configured to receive a modified manifest corresponding to first streaming media content requested by a content viewer. The modified manifest includes a second content position marker for inserting a plurality of second media content within the first streaming media content. In particular, the system receives an original manifest corresponding to the first streaming media content requested by the content viewer. Then, the system determines the plurality of second media content to be inserted within the first streaming media content at a position marked by the second content position marker. Then, the system determines a plurality of second media content playback Uniform Resource Locators (URLs) associated with each of the plurality of second media content. Further, the system is configured to generate the modified manifest based, at least in part, on the plurality of second media content playback URLs.
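The generation of the modified manifest can be pictured with a minimal sketch. The disclosure does not specify a manifest format, so the manifest is modeled here, purely as an assumption, as an ordered list of segment entries, and the second content position marker as an index into that list. The function name `build_modified_manifest` and all paths shown are hypothetical.

```python
def build_modified_manifest(original_entries, marker_index, ad_playback_urls):
    """Return a new manifest with the Ad playback URLs placed at the marked position.

    original_entries: ordered entries of the original manifest (assumed list model).
    marker_index: position indicated by the second content position marker.
    ad_playback_urls: playback URLs of the second media content to insert.
    """
    if not 0 <= marker_index <= len(original_entries):
        raise ValueError("marker_index must lie within the manifest")
    # The original manifest is left untouched; a new list is returned.
    return (
        original_entries[:marker_index]
        + list(ad_playback_urls)
        + original_entries[marker_index:]
    )


# Hypothetical entries, for illustration only.
original = ["live/seg1.ts", "live/seg2.ts", "live/seg3.ts"]
ads = ["ads/ad_a.ts", "ads/ad_b.ts"]
modified = build_modified_manifest(original, 2, ads)
print(modified)
```

A production manifest (for example, an HLS or DASH manifest) carries considerably more structure than a flat list; the sketch only shows where the second media content lands relative to the marker.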
In an implementation, determining the plurality of second media content includes accessing a profile associated with the content viewer from a database associated with the system. Then the system classifies the content viewer into at least one cohort based, at least in part, on the profile. Then, the system accesses the plurality of second media content relevant for the content viewer based, at least in part, on the at least one cohort.
In an alternative implementation, determining the plurality of second media content includes accessing a profile associated with the content viewer from a database associated with the system. Then, the system determines a preference of the content viewer based, at least in part, on the profile. Thereafter, the system accesses the plurality of second media content relevant for the content viewer based, at least in part, on the preference.
In another embodiment, the system is configured to compute, via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker. In an implementation, the distribution model is trained to predict the break duration prediction based, at least in part, on historical second media content marker data. In particular, the system extracts a set of image frames from the first streaming media content based, at least in part, on the second content position marker. Then, the system identifies a set of summary key frames of the first streaming media content, based, at least in part, on the set of image frames. Then, the system determines via a classification model, an activity being performed within the set of summary key frames, based, at least in part, on the set of summary key frames. Further, the system determines via the distribution model, the break duration prediction for inserting the plurality of second media content based, at least in part, on the determined activity. In a non-limiting implementation, the break duration prediction indicates a combination of delay time D and actual break duration before receiving a cue-out marker.
In another embodiment, the system is configured to determine a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction. Herein, the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content. In particular, determining the streaming order for each of the plurality of second media content includes sorting each of the plurality of second media content in a plurality of streaming orders based, at least in part, on the break duration prediction. Herein, an individual streaming order indicates an individual order for streaming each of the plurality of second media content. The system then computes a crash loss for each of the plurality of streaming orders. Herein, the crash loss indicates an average length of crashed second media content corresponding to the individual streaming order. Then, the streaming order is selected for each of the plurality of second media content based, at least in part, on the streaming order being associated with the least crash loss.
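The selection of a streaming order with the least crash loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the break duration prediction is assumed to be a discrete distribution mapping candidate durations (in seconds) to probabilities, the crash loss of an order is taken as the probability-weighted average of the full length of the Ad that gets interrupted, and Ads that never start are not counted as crashed. The function names `crashed_length`, `crash_loss`, and `select_streaming_order` are hypothetical.

```python
from itertools import permutations


def crashed_length(order, break_duration):
    """Length of the Ad interrupted when the break ends, or 0 if none is interrupted."""
    elapsed = 0
    for length in order:
        if elapsed >= break_duration:
            break  # remaining Ads never start, so nothing is crashed
        if elapsed + length > break_duration:
            return length  # this Ad is cut off by the resuming live content
        elapsed += length
    return 0


def crash_loss(order, duration_distribution):
    """Average crashed length of an order over the predicted break durations."""
    return sum(
        probability * crashed_length(order, duration)
        for duration, probability in duration_distribution.items()
    )


def select_streaming_order(ad_lengths, duration_distribution):
    """Evaluate every candidate order and keep the one with the least crash loss."""
    return min(
        permutations(ad_lengths),
        key=lambda order: crash_loss(order, duration_distribution),
    )


# Assumed prediction: the break lasts 12 s or 20 s with equal probability.
prediction = {12: 0.5, 20: 0.5}
best = select_streaming_order([10, 5], prediction)
print(best, crash_loss(best, prediction))
```

Here the order (10, 5) loses only the 5 sec Ad when the break lasts 12 seconds, whereas (5, 10) loses the 10 sec Ad, so (10, 5) is selected. Enumerating all permutations grows factorially with the number of Ads, so this exhaustive form is only practical for a small number of Ads per break.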
In another embodiment, the system is configured to generate an updated manifest based, at least in part, on the modified manifest and the streaming order. Then, the system is configured to facilitate a transmission of the updated manifest to an electronic device associated with the content viewer. In an implementation, the system can be any one of a content repository server, a Server Guided Ad Insertion (SGAI) server, or a Server-Side Ad Insertion (SSAI) server.
The various embodiments of the present disclosure provide multiple advantages and technical effects while addressing technical problems such as reducing the Ad crash, reducing the wastage of Ad inventory, and reducing the financial damages to the content provider.
Various embodiments of the present disclosure have been described with reference to FIG. 1 to FIG. 9.
FIG. 1 depicts a representation 100 for illustrating the provisioning of media content offered by a content provider platform 126 to a content viewer 102, in accordance with various embodiments of the present disclosure.
In an embodiment, the content provider platform 126 is an entity that holds digital rights associated with digital media content present within digital video content libraries. In some scenarios, the content provider platform 126 offers the media content (referred to hereinafter interchangeably as ‘content’) on a subscription basis by using a digital platform and/or over-the-top (OTT) media services, i.e., content is streamed over the Internet to an electronic device 104 of the content viewers by the content provider platform 126. The content provider platform 126 is hereinafter referred to as a ‘content provider’ for ease of explanation. The term ‘content viewer 102’ as used herein implies a user, who has subscribed, i.e., registered to a subscription plan (whether a free subscription plan or a paid subscription plan) from among a plurality of subscription plans for accessing content offered by the content provider platform 126. It is noted that the term ‘content viewer’ is also interchangeably referred to hereinafter as a ‘subscriber’ or a ‘user’. In at least some embodiments, the term ‘content viewer 102’ may also include one or more users in addition to the individual content viewer 102, such as for example family members of the content viewer 102.
The representation 100 depicts an example content, such as first streaming media content 110, offered by the content provider platform 126. The first streaming media content 110 may be embodied as streaming video content such as live-streaming content or on-demand video streaming content. As an example, the representation 100 shows that the first streaming media content 110 may be embodied as live-streamed content corresponding to a sports match being streamed from an event venue such as a stadium 112. In another illustrative example, the first streaming media content 110 may correspond to Video On Demand (VOD) content, i.e., content streamed from a content library such as a content library 114, on the content viewer’s demand.
It is noted that although the first streaming media content 110 offered by the content provider is described as video content, the term ‘media content’ or ‘content’ as used herein may include ‘video content’, ‘audio content’, ‘gaming content’, ‘textual content’, and any combination of such content offered in an interactive or non-interactive form. Accordingly, the term ‘content’ is also interchangeably referred to hereinafter as ‘media content’ for the purposes of description.
In the illustrated example, the content viewer 102 is depicted to be controlling the electronic device 104, which is capable of displaying content streamed from CDNs, such as the first streaming media content 110 streamed from a CDN 122 in the form of a stream of encoded content segments. The electronic device 104 is depicted to be a television (TV) for illustration purposes. It is noted that the content viewer 102 may use one or more electronic devices, such as the electronic device 104, a smartphone, a laptop, a desktop, or a personal computer, to view the first streaming media content 110 provided by the content provider via the CDN 122.
In an illustrative example, to subscribe to the streaming content services offered by the content provider platform 126, content viewers such as the content viewer 102 may register with the content provider platform 126 by creating an online account on the content provider’s portal. As part of the account creation process, the content viewer 102 may provide personal information, such as age, gender, language preference, content preference, and any other personal preferences to the content provider platform 126. Such information may be stored in a content viewer profile or subscriber profile along with other account information such as a type of subscription, a validity date of the subscription, etc., in a database (not shown in FIG. 1) associated with the content provider platform 126.
Once the content viewer 102 has created the account, the content viewer 102 may access a user interface (UI) of a mobile application or a Web application associated with the content provider platform 126 to view/access content. It is understood that the electronic device 104 may be in operative communication with a communication network, such as the Internet, enabled by a network provider, also known as the Internet Service Provider (ISP). The electronic device 104 may connect to the ISP network using a wired network, a wireless network, or a combination of wired and wireless networks. Some non-limiting examples of wired networks may include the Ethernet, the Local Area Network (LAN), a fiber-optic network, and the like. Some non-limiting examples of wireless networks may include Wireless LAN (WLAN), cellular networks, Bluetooth or ZigBee networks, and the like.
The electronic device 104 may fetch a UI associated with the content provider over the ISP network and cause the display of the UI on a display screen of the electronic device 104. In an illustrative example, the UI may include a plurality of content titles corresponding to a variety of content items offered by the content provider to its consumers. The content viewer 102 may select a content title from among the plurality of content titles shown on the UI, which is displayed on the display screen of the electronic device 104. For example, the content viewer 102 may select a content title related to the live cricket match streamed from the event venue, such as the stadium 112. The selection of the content title may trigger a request for a playback uniform resource locator (URL) to be sent from the electronic device 104 to the content provider platform 126. The transmission of the request for the playback URL from the electronic device 104 to the content provider platform 126 is shown using communication link 106.
In addition to requesting the playback URL of the chosen content title, the request for the playback URL also includes viewer metadata. For example, the viewer metadata includes information related to the type of electronic device (for example, mobile phone, TV, or tablet device) used by the content viewer 102 for requesting the content, the type of login method (for example, Email or Web login) used by the content viewer 102, the type of network access (for example, cellular or Wi-Fi) associated with the playback URL request, network provider ID, device identifier, IP address, geolocation information, browser information (e.g., cookie data), time of the day, and the like. The content provider platform 126 is configured to forward the request for the playback URL to a content handling server 124. The transmission of the request for the playback URL from the content provider platform 126 to the content handling server 124 is exemplarily depicted using communication link 108. It is noted that in some example scenarios, the content handling server 124 may be incorporated within the content provider platform 126.
The content handling server 124 determines the type of content requested by the content viewer 102 based on the request for the playback URL. If the type of content requested by the content viewer 102 is a movie trailer or a non-content request (such as a request for the synopsis of the content), then the content handling server 124 may determine that no Ad insertion is required for the requested content. In some scenarios, the type of content requested by the content viewer 102 may be non-monetary in nature, for example, a president’s or prime minister’s public address to the nation during a pandemic, a natural disaster event, or any such announcement conveying news of national importance. In such scenarios, where the type of content requested by the content viewer 102 is non-monetary in nature, the content handling server 124 may determine that no Ad insertion is required for the requested content. Alternatively, if the type of request corresponds to a content title, then Ad insertion may be possible during the streaming of the content. In one illustrative example, if the content viewer 102 has requested playback of a trailer of an action movie, the content handling server 124 determines that Ad content integration is not required for the request (i.e., trailer of the action movie).
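The decision described above can be summarized in a small sketch. The disclosure does not define how request types are represented, so the `RequestType` enumeration, its member names, and the function `ad_insertion_possible` are hypothetical and serve only to restate the rule in executable form.

```python
from enum import Enum


class RequestType(Enum):
    """Hypothetical categories of playback URL requests."""
    CONTENT_TITLE = "content_title"   # e.g., a live cricket match or a movie
    TRAILER = "trailer"               # e.g., a trailer of an action movie
    NON_CONTENT = "non_content"       # e.g., a request for the synopsis
    NON_MONETARY = "non_monetary"     # e.g., a public address to the nation


def ad_insertion_possible(request_type):
    """Ad insertion is considered only for requests that correspond to a content title."""
    return request_type is RequestType.CONTENT_TITLE


for request_type in RequestType:
    print(request_type.name, ad_insertion_possible(request_type))
```

Only a request for a content title is treated as a candidate for Ad insertion; trailers, non-content requests, and non-monetary content are not.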
In another illustrative example, if the content viewer 102 has requested streaming of the live cricket match from the stadium 112, then the content handling server 124 may determine that Ad content can be integrated on the fly. In some scenarios, if the Ad content integration is possible, then the content handling server 124 is configured to analyze the viewer metadata included in the request for playback URL along with any other content viewer 102 related information, such as whether the content viewer 102 watched a particular brand advertisement completely in the past or not, or whether the content viewer 102 clicked a product hyperlink in an advertisement in the past and visited the product webpage, and the like. Based on the analysis, the content handling server 124 may be configured to predict a type of personalized Ads that may be suitable for the interests or preferences of the content viewer 102. In a non-limiting implementation, the content handling server 124 utilizes a prediction model, to predict the Ad content to be accommodated within the first streaming media content 110 based, at least in part, on the preference of the content viewer 102. It should be noted that within the first streaming media content 110, one or more Ad content may be accommodated.
It is noted that the prediction model is trained on historical content viewer data before deployment. In an instance, the prediction model is trained on the historical content viewer data to learn patterns, relationships, and trends that it then applies to the input data during deployment. In various examples, the predicted one or more Ad content may be Ad content in which the content viewer 102 is interested while streaming the first streaming media content 110. In one specific implementation, an AI or ML model may be used as the prediction model. In a non-limiting implementation, the prediction model may be a Light Gradient Boosting Machine (LightGBM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Bootstrap Aggregating (Bagging), Gradient Boosting Machine (GBM), Voting Classifier, Stacked Generalization (Stacking), Multiple Additive Regression Trees (MART), Gradient Boosted Regression Trees (GBRT), and so on.
The representation 100 further depicts a content handling team 116, which may include one or more individuals tasked with analyzing the first streaming media content 110 and identifying slots for inserting the second media content such as one or more Ads in between the first streaming media content 110. It should be understood that the present disclosure has been described with regard to advertisements as the second media content. However, other forms of digital content may also constitute the second media content and the same is covered by the various embodiments of the present disclosure as well. For example, any form of ‘video content’, ‘audio content’, ‘gaming content’, ‘textual content’, ‘pictorial content’, ‘image content’, ‘webpage’, and any combination of such content may be used as the second media content as well.
It is noted that these slots may be referred to as windows of opportunity during a live-streaming event where the second media content such as one or more Ads can be inserted such that the content viewer 102 does not miss any essential portion of the live event. These slots are referred to hereinafter interchangeably as ‘second media content slots’ or ‘Ad slots’ throughout the present disclosure. In one illustrative example, while streaming a live sporting event such as cricket, the players of a cricket team may celebrate the fall of a wicket. Due to this, there may be a brief opportunity within the live-streaming content to include an Ad before the players resume the actual gameplay. Such an opportunity for an Ad slot can be identified by the content handling team 116. In another illustrative example, Video-On-Demand (VOD) content such as a movie may be associated with a storyline that may include changes in screenplay, songs, or twists in the plotline. An Ad may be slotted immediately after or before the occurrence of such changes to make the content interesting for the content viewer 102. The content handling team 116 is tasked with identifying such slots in which the Ad may be displayed during the streaming of VOD content. The content handling team 116 may insert a second media content position marker 118 to identify locations within the first streaming media content 110 at which the plurality of second media content may be inserted. The term ‘second media content position marker 118’ is hereinafter interchangeably referred to as ‘second content position marker 118’. In one embodiment, the second content position marker 118 may be embodied as SCTE markers, such as SCTE-35 markers. It is noted that SCTE-35 is a joint ANSI/Society of Cable and Telecommunications Engineers standard that describes the inline insertion of cue tones in content streams.
In addition to identifying the location of the Ad insertion, the second content position marker 118 may be associated with metadata such as the duration of the slot, a sequence number, and details of the slot (e.g., pre-roll, mid-roll, change in plot, fall of wicket, player injured, innings break, change in sides, etc.).
The first streaming media content 110 and the second content position marker 118 are provided to a video encoder 120. The video encoder 120 is configured to convert the first streaming media content 110 (i.e., video content) into a format capable of being streamed to content viewers such as the content viewer 102. More specifically, the first streaming media content 110 may be split into segments and each content segment may be encoded to generate a stream of encoded content segments, which may be combined to form the streaming content that is to be provided to the electronic device 104 of the content viewers. For example, when a live tennis match is delivered to a content viewer 102, encoded content segments are delivered sequentially to the electronic device 104 of the content viewer 102 providing a seamless experience of watching the live tennis match. In one illustrative example, the content segments may be encoded using predefined video encoding standards such as H.264 baseline profile or H.264 high profile.
In general, the content segments may be encoded using different combinations of resolutions and bitrates to ensure seamless delivery of the streaming content to content viewers. Some examples of the resolutions used during encoding include, but are not limited to, 2160p, 1440p, 1080p, 720p, and 480p. Some examples of the bitrates used during encoding include, but are not limited to, 128 KBPS, 2.5 MBPS, 3.5 MBPS, 5 MBPS, and the like.
In at least one embodiment, each encoded content segment is a fragment of fixed length and all encoded segments of the first streaming media content 110 have a uniform or identical length, e.g., 4 seconds of time duration. For example, a live tennis match (i.e., the first streaming media content 110) of 90 minutes duration may be segmented and encoded to generate 1350 encoded segments of 4 seconds each. Further, the video encoder 120 is also configured to generate a log, referred to herein as a ‘manifest’. The manifest includes information related to the encoded content segments, such as a number of segments related to the content, a size of each segment, the overall size of the content, an order of streaming of content, available resolutions for each content segment, and available bitrates for each content segment. The manifest also includes information related to the second content position marker 118, i.e., a number of second content position markers 118, a location (or timestamp) of each second content position marker 118, and the like. In particular, the manifest may be further classified as a master manifest including one or more child manifests. The master manifest includes different encoding ladders where each encoding ladder is placed within a separate child manifest within the master manifest.
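The segment arithmetic in the example above can be checked with a minimal sketch. The function name `count_segments` and the choice to round up for a trailing partial segment are assumptions made for illustration; the disclosure only states that segments have a fixed length.

```python
import math


def count_segments(content_seconds: int, segment_seconds: int = 4) -> int:
    """Number of fixed-length segments needed to cover the content.

    Rounding up for a trailing partial segment is an assumption of this
    sketch, not something the disclosure specifies.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    return math.ceil(content_seconds / segment_seconds)


# A 90-minute match split into 4-second segments.
print(count_segments(90 * 60))  # 1350
```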
The encoding ladder indicates a layer of content, i.e., different renditions of the same content in different resolution qualities. In one example, a playback URL for a 480p content stream with a specific bitrate is placed in one child manifest and a playback URL for a 720p content stream with a specific bitrate is placed in another child manifest. In another example, a playback URL for a 720p content stream with a bitrate of 1.5 MBPS may be placed in a child manifest while a playback URL for a 720p content stream with a bitrate of 3.5 MBPS is placed in another child manifest. The manifest and the encoded content segments are forwarded to one or more Content Delivery Networks (CDNs), such as the CDN 122. The transmission of the encoded content segments and the manifest from the video encoder 120 to the CDN 122 is shown using communication link 132. The CDN 122 is configured to add URL information to the encoded content segments to generate an ‘original manifest’. It is noted that in some scenarios, the URL information may also be added by the video encoder 120 and provided to the CDN 122 along with the encoded content segments. The original manifest (i.e., the master manifest) is provided by the CDN 122 to a content handling server 124. The transmission of the original manifest from the CDN 122 to the content handling server 124 is shown using communication link 134. The CDN 122 is configured to stream the encoded content segments to a plurality of content viewers on receiving a request for the first streaming media content 110 from the respective content viewer’s device.
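As a rough illustration of the master/child manifest relationship described above, the following sketch renders a master manifest in which each rendition of the encoding ladder points to its own child manifest. It assumes an HLS-style playlist syntax (`#EXTM3U`, `#EXT-X-STREAM-INF`), which the disclosure does not mandate; the `Rendition` type, the function name, and the child manifest file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rendition:
    width: int
    height: int               # e.g. 720 for a 720p stream
    bandwidth_bps: int        # bitrate in bits per second
    child_manifest_url: str   # hypothetical location of the child manifest


def build_master_manifest(ladder: List[Rendition]) -> str:
    """Render one entry per rendition, lowest bitrate first (HLS-style, assumed)."""
    lines = ["#EXTM3U"]
    for r in sorted(ladder, key=lambda r: r.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(r.child_manifest_url)
    return "\n".join(lines) + "\n"


# Two 720p renditions at 1.5 MBPS and 3.5 MBPS, as in the example above.
ladder = [
    Rendition(1280, 720, 3_500_000, "child_720p_3500k.m3u8"),
    Rendition(1280, 720, 1_500_000, "child_720p_1500k.m3u8"),
]
master = build_master_manifest(ladder)
print(master)
```

Sorting by bitrate is a presentation choice of the sketch; a player is free to pick any listed rendition.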
The content handling server 124 is in operative communication with a second content server 128. It is noted that the second content server 128 can be replaced with the second media content server (not shown) which is configured to store the plurality of second media content such as the plurality of Ads. The plurality of second media content is hereinafter interchangeably referred to as ‘second media content’. The second content server 128 is configured to store a plurality of Ad creatives received from a plurality of advertising entities, such as an example second content distributor 130 shown in the representation 100.
The content handling server 124 is configured to analyze the original manifest received from the CDN 122 and determine a number of the second content position marker 118 or Ad slots that can accommodate Ads within the streaming content. The content handling server 124 is configured to request Ad related information corresponding to the Ads stored in the second content server 128 based on the analysis of the original manifest and the viewer metadata. The content handling server 124 is configured to perform an analysis of the Ad related information received from the second content server 128 and determine the Ads that are to be inserted in the place of one or more second content position marker 118 identified in the manifest. The determination of the Ads relevant to the content viewer 102 is explained in further detail below.
In at least some embodiments, the content handling server 124 is configured to extract or access a profile associated with the content viewer 102 including information related to the content viewer 102 stored in the database associated with the content provider platform 126. In a non-limiting example, the profile includes at least information related to age, gender, location, network provider, access patterns, content genre preferences, language preferences, and the like of the content viewer 102. Further, a viewing history of the content viewer such as the content viewer 102 may also be extracted to determine the content genres, language preference, and other information related to the content viewer 102. In some cases, the content viewer 102 may have shown interest in some Ads or Ad related content in the past, and such other information may also be extracted by the content handling server 124. In some scenarios, the content handling server 124 is configured to classify content viewers into different cohorts. The term ‘cohort’ as used herein refers to a group of viewers who share at least two or more commonalities from among an age group, gender, location, network provider, access patterns, content genre preferences, language preferences, and the like. For example, each cohort may prefer or appreciate certain second media content that is determined based on the characteristics of the respective cohort. For example, a cohort corresponding to a particular age group (above 40 years) and having a preference for a particular content genre (e.g., Drama, Action/Adventure, etc.) may likely appreciate second media content related to health and fitness. Similarly, a cohort corresponding to a female gender having a preference for a particular content genre (e.g., fantasy, romance, etc.) may likely appreciate Ads related to beauty products. 
Similarly, a cohort corresponding to a discount seeker persona may likely appreciate second media content that offers lucrative deals or promotional offers.
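The cohort definition above (viewers who share at least two commonalities) may be sketched as follows. The attribute keys, the function names, and the profile dictionaries are hypothetical and serve only to illustrate the rule; the disclosure does not prescribe a data model.

```python
# Hypothetical attribute keys drawn from the profile fields listed above.
COHORT_ATTRIBUTES = (
    "age_group",
    "gender",
    "location",
    "network_provider",
    "genre_preference",
    "language_preference",
)


def common_attributes(a: dict, b: dict) -> list:
    """Attributes that are present in both profiles and have equal values."""
    return [
        key
        for key in COHORT_ATTRIBUTES
        if a.get(key) is not None and a.get(key) == b.get(key)
    ]


def same_cohort(a: dict, b: dict, min_common: int = 2) -> bool:
    """True when two viewers share at least `min_common` attributes."""
    return len(common_attributes(a, b)) >= min_common


viewer_a = {"age_group": "40+", "genre_preference": "Drama", "location": "Mumbai"}
viewer_b = {"age_group": "40+", "genre_preference": "Drama", "location": "Delhi"}
viewer_c = {"age_group": "18-25", "genre_preference": "Drama", "location": "Pune"}

print(same_cohort(viewer_a, viewer_b))  # True: age group and genre match
print(same_cohort(viewer_a, viewer_c))  # False: only genre matches
```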
Accordingly, the content handling server 124 is configured to determine the relevant second media content for the various slots as identified by the second content position marker 118 based on the aforementioned analysis and the second media content-related information provided by the second content server 128. In some embodiments, the content handling server 124 is also referred to as the server-side ad insertion (SSAI) server as it facilitates real-time post-encoding insertion of Ad content, i.e., an example of the second media content in the streaming content. In a non-limiting example, the second media content includes advertisements that may likely interest the viewers or entice the viewers by promoting some product and/or service, or making an announcement such as a new game release that the content viewer 102 may like. In other words, a preference of the content viewer 102 for the second media content is predicted based, at least in part, on the profile. Further, a second media content playback URL from the content provider platform 126 or content handling server 124 is requested based, at least in part, on the preference of the content viewer 102.
In an implementation, the system may be implemented with or within a Server Guided Ad Insertion (SGAI) server as well. In particular, the advertisements or the second media content (or the URLs associated with the second media content) can be directly shared with an electronic device 104 or the user device associated with the content viewer 102. Upon receiving said information, the original manifest may be modified to create the modified manifest locally. Then, the electronic device 104 may share the modified manifest with the system 150.
The content handling server 124 is configured to provide the information related to the selected second media content to the second content server 128. The communication exchange between the content handling server 124 and the second content server 128 is shown using communication link 136. The second content server 128 is configured to provide the content related to selected second media content, also referred to herein as ‘Ad content’ to the CDN 122, which is configured to embed the second media content in appropriate locations within the encoded content segments corresponding to the first streaming media content 110.
The content handling server 124 is also configured to modify the original manifest to reflect the second media content insertion and provide a modified manifest to the content provider platform 126. In particular, the content handling server 124 is configured to receive the original manifest corresponding to the first streaming media content requested by the content viewer 102 from the CDN 122. Upon receiving the original manifest, the content handling server 124 is further configured to determine the Ad content to be accommodated within the first streaming media content 110 based, at least in part, on the second content position marker 118. It should be noted that, as discussed before, in real-time, the content handling team 116 inserts the second content position marker 118. The content handling server 124 is further configured to insert the Ad content within the first streaming media content 110 based, at least in part, on the second content position marker 118. Finally, the content handling server 124 is configured to generate the modified manifest based, at least in part, on the first streaming media content 110 inserted with the Ad content. The transmission of the modified manifest to the content provider platform 126 from the content handling server 124 is shown using communication link 138.
The content provider platform 126, which is aware of the location of the cached first streaming media content 110 in the CDN 122, is configured to generate a playback URL, which includes the URL information of the CDN 122 and provides the playback URL to the electronic device 104 along with the modified manifest. The transmission of the playback URL and the modified manifest to the electronic device 104 is shown using communication link 140. The electronic device 104 is then configured to use the playback URL provided by the content provider platform 126 to access the CDN 122 and request the first streaming media content 110 from the CDN 122 as per the modified manifest. The CDN 122 is configured to stream encoded content segments including the second media content to the electronic device 104 as per the modified manifest. In a scenario, if there are fluctuations in the network/bandwidth of the electronic device 104, then the electronic device 104 may request content segments at different resolutions/bitrates as allowed within the modified manifest and the CDN 122 may provide encoded content segments as per the resolution/bitrate requested by the electronic device 104. The transmission of the content segments from the CDN 122 to the electronic device 104, i.e., the transmission of the streaming content from the CDN 122 to the electronic device 104 is shown using a communication link 142.
In various scenarios, the second media content may include one or more Ads of varying durations that have to be inserted in-between encoded content segments. Moreover, as explained above, the encoded content segments are of fixed length/duration (e.g., 4 seconds), as per the chosen streaming protocol. So, if an Ad length is greater than the fixed length/duration, the Ad may also be split into segments of equivalent length. To that end, it is first determined whether the time duration of a second media content segment is less than the fixed time duration. If so, a filler content segment is accessed from a database (not shown), and the length of the filler content segment is determined based, at least in part, on the time duration of the second media content segment. In other words, upon determining that a second media content segment has a time duration less than the fixed time duration, the second media content segment is supplemented with the filler content segment. For example, if an Ad length is 7 seconds and the fixed length of encoded content segments is 4 seconds, then the Ad may be split into two segments, with the first segment corresponding to the first four seconds of the Ad and the second segment including the remaining three seconds followed by a 1-second filler to configure another four-second segment. It is noted that the terms ‘filler’ or ‘filler content’ as used hereinafter correspond to any content that avoids an empty screen on the electronic device 104 of the content viewer 102 during the streaming activity. As an example, a final image frame of the Ad segment may be maintained in a static state for a brief duration to avoid the empty screen on the electronic device 104.
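The splitting and padding of an Ad described above can be illustrated with a minimal sketch. The function name `split_ad` and the use of whole seconds are assumptions made for clarity; the disclosure does not specify how durations are represented.

```python
from typing import List, Tuple


def split_ad(ad_seconds: int, segment_seconds: int = 4) -> List[Tuple[int, int]]:
    """Split an Ad into fixed-length segments.

    Returns one (ad_part, filler_part) pair per segment, where
    ad_part + filler_part always equals segment_seconds.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")
    segments = []
    remaining = ad_seconds
    while remaining > 0:
        ad_part = min(remaining, segment_seconds)
        # Pad a short final segment with filler so every segment has equal length.
        segments.append((ad_part, segment_seconds - ad_part))
        remaining -= ad_part
    return segments


# A 7-second Ad with 4-second segments: 4 s of Ad, then 3 s of Ad plus 1 s of filler.
print(split_ad(7))  # [(4, 0), (3, 1)]
```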
It is understood that the second media content that is delivered during the second media content slot may include one or more Ads among other suitable content. However, due to the uncertain nature of the live event, the second content position marker 118 that signals cueing back into the live event stream, i.e., cueing out from streaming the second media content, can be received at any instant. For example, a cricket player may abruptly return to continue the match after a team discussion. Due to this, the ongoing stream of the second media content has to be stopped or crashed forcefully. For example, an ongoing Ad might have to be crashed to resume the live event stream. This abrupt crashing of the second media content leads to poor user experience and in some cases might lead to financial damages for the content provider platform 126 as well.
To overcome the aforementioned drawbacks and provide additional advantages, a system 150 is provided. The system 150 is in operative communication with the content handling server 124 and the CDN 122. The system 150 is configured to optimize the insertion of the second media content during the live-streaming of media content while ensuring the maximum amount of second media content is streamed to the content viewer 102 before a second media content marker (i.e., a cue-out marker) is received. More specifically, the system 150 is configured to intelligently predict the distribution of one or more media content included in the second media content (such as Ads, promotional content, etc.) before it is streamed to the content viewer 102 such that the maximum amount of second media content can be streamed before the cue-out marker is received. In particular, Artificial Intelligence (AI) and/or Machine Learning (ML) models are utilized by the system 150 to predict the second media content distribution during the first streaming media content 110 streaming process based on historical patterns. In an example, the historical patterns are learned by the AI or ML models from the historically received second media content markers corresponding to a particular category of live-streaming events such as sporting events, music concerts, award ceremonies, and the like. In some scenarios, various models may predict the second media content distribution during the content streaming process based on historical patterns interpreted through the historically received second media content markers corresponding to a particular sub-category of live-streaming events such as cricket and football (i.e., a sub-category of sporting events), opera and music events (i.e., a sub-category of concerts), Oscars™ (i.e., a sub-category of award ceremonies) and the like. It is noted that the various aspects of the system 150 have been explained in further detail with reference to FIG. 2.
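To make the goal of streaming the maximum amount of second media content before the cue-out marker more concrete, the sketch below selects whole Ads whose combined duration best fills a predicted break. This is not the AI/ML-based distribution described in the disclosure: it substitutes a simple subset-sum (0/1 knapsack) selection and treats the predicted break duration as a given input. The function name, Ad identifiers, and durations are hypothetical.

```python
from typing import Dict, List


def select_ads(ad_durations: Dict[str, int], predicted_break: int) -> List[str]:
    """Pick Ads whose total duration is as large as possible without
    exceeding the predicted break (subset-sum over whole seconds)."""
    # reachable[t] = one list of Ad ids whose durations sum exactly to t
    reachable: Dict[int, List[str]] = {0: []}
    for ad_id, duration in ad_durations.items():
        # Iterate over a snapshot so that each Ad is used at most once.
        for total, chosen in list(reachable.items()):
            new_total = total + duration
            if new_total <= predicted_break and new_total not in reachable:
                reachable[new_total] = chosen + [ad_id]
    return reachable[max(reachable)]


# Hypothetical Ad inventory (seconds) and a hypothetical 45-second predicted break.
ads = {"ad_a": 30, "ad_b": 20, "ad_c": 15, "ad_d": 10}
chosen = select_ads(ads, predicted_break=45)
print(chosen, sum(ads[a] for a in chosen))
```

Because only whole Ads that fit within the predicted break are chosen, no selected Ad would need to be cut short if the prediction holds; how the prediction itself is produced is outside the scope of this sketch.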
FIG. 2 depicts a block diagram 200 of a system 150 configured for facilitating the stream of the first streaming media content 110 with the second media content to content viewers, in accordance with an embodiment of the present disclosure. It is noted that the system 150 depicted in FIG. 2 is identical to the system 150 in FIG. 1.
In at least one embodiment, the system 150 may be embodied as a server system or machine in operative communication with a content repository server such as the CDN 122 and the content handling server 124, as shown in FIG. 1. Alternatively, the system 150 may be included within the CDN 122 or may be communicably accessible via API plugins to the CDN 122. The system 150 is depicted to include a processing module 202, a memory module 204, an Input/Output (I/O) module 206, and a communication module 208. It is noted that although the system 150 is depicted to include the processing module 202, the memory module 204, the I/O module 206, and the communication module 208, in some embodiments, the system 150 may include more or fewer components than those depicted herein. The various components of the system 150 may be implemented using hardware, software, firmware, and/or any combination thereof. Further, it is also noted that one or more components of the system 150 may be implemented in a single server or a plurality of servers, which are remotely placed from each other.
In one embodiment, the processing module 202 may be embodied as a multi-core processor, a single-core processor, or a combination of one or more multi-core processors and one or more single-core processors. For example, the processing module 202 may be embodied as one or more of various processing devices, such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), a processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. In one embodiment, the memory module 204 is capable of storing machine-executable instructions, referred to herein as platform instructions 205. Further, the processing module 202 is capable of executing the platform instructions 205. In an embodiment, the processing module 202 may be configured to execute hard-coded functionality. In an embodiment, the processing module 202 is embodied as an executor of software instructions, wherein the instructions may specifically configure the processing module 202 to perform the algorithms and/or operations described herein when the instructions are executed. The processing module 202 is depicted to include a parsing sub-module 210, an activity determination sub-module 212, and a generation sub-module 216. The activity determination sub-module 212 is depicted to further include a frame detection sub-module 214a, and a frame classification sub-module 214b.
The memory module 204 stores instructions/code configured to be used by the processing module 202, or more specifically by the various modules of the processing module 202 such as the parsing sub-module 210, the frame detection sub-module 214a, the frame classification sub-module 214b, and the generation sub-module 216 to perform respective functionalities, as will be explained in detail with reference to FIG. 2. The memory module 204 may be embodied as one or more non-volatile memory devices, one or more volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. For example, the memory module 204 may be embodied as semiconductor memories, such as flash memory, mask ROM, PROM (programmable ROM), EPROM (erasable PROM), RAM (random access memory), and the like.
In an embodiment, the I/O module 206 may include mechanisms configured to receive inputs from and provide outputs to the operator(s) of the system 150. To that effect, the I/O module 206 may include at least one input interface and/or at least one output interface. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, a microphone, and the like. Examples of the output interface may include, but are not limited to, a display such as a light-emitting diode display, a thin-film transistor (TFT) display, a liquid crystal display, an active-matrix organic light-emitting diode (AMOLED) display, a microphone, a speaker, a ringer, a vibrator, and the like.
In another embodiment, the processing module 202 may include I/O circuitry configured to control at least some functions of one or more elements of the I/O module 206, such as, for example, a speaker, a microphone, a display, and/or the like. The processing module 202 and/or the I/O circuitry may be configured to control one or more functions of the one or more elements of the I/O module 206 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the memory module 204, and/or the like, accessible to the processing module 202.
In an embodiment, the communication module 208 may include a communication circuitry such as for example, a transceiver circuitry including an antenna and other communication media interfaces to facilitate communication between the system 150 and one or more remote entities such as, the content repository server, i.e., the CDN 122 and the content handling server 124 over a communication network (not shown in FIG. 2). The communication circuitry may, in at least some example embodiments enable reception of: (1) a manifest (or a modified manifest) related to first streaming media content such as live-streaming content requested by the content viewer 102 from the content repository server or the electronic device 104 of the content viewer 102, and (2) second media content and second media content related information from the second content server 128.
The system 150 is further depicted to be in operative communication with a database 218. The database 218 is any computer-operated hardware suitable for storing and/or retrieving data. In one embodiment, the database 218 is configured to store historical second media content marker data. The historical second media content marker data may be extracted from a plurality of historical manifest files corresponding to a plurality of historical first streaming media content. As may be understood, all Ad breaks in content are managed using cue-out and cue-in markings indicating the actual Ad start and end time for each Ad break, i.e., the duration of the advertisement. Further, the distribution model 220 is trained to look for such markings in all the content that has run Ads historically. In other words, the distribution model 220 is trained based, at least in part, on the historical second media content marker data extracted from the plurality of historical manifest files corresponding to the plurality of historical first streaming media content. This historical information helps the distribution model 220 determine the duration of Ads along with content category details including, but not limited to, the frames that ran before and after the Ads in the content and the content category, among other relevant factors. Once trained, the distribution model 220 is configured to compute/predict a break duration in the live-streaming content by comparing the action in the key summary frames with the learned information from the historically viewed content.
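The extraction of Ad break durations from historical cue-out and cue-in markings may be sketched as follows. The marker representation (a time-ordered list of timestamp and kind pairs) and the function name are assumptions; the median is used only as a naive stand-in for a learned estimate and is not the distribution model 220.

```python
from statistics import median
from typing import Iterable, List, Tuple


def break_durations(markers: Iterable[Tuple[int, str]]) -> List[int]:
    """Pair each cue-out (Ad start) with the next cue-in (Ad end).

    `markers` is assumed to be ordered by timestamp (seconds); unmatched
    markers are ignored.
    """
    durations = []
    start = None
    for timestamp, kind in markers:
        if kind == "cue-out":
            start = timestamp
        elif kind == "cue-in" and start is not None:
            durations.append(timestamp - start)
            start = None
    return durations


# Hypothetical markers taken from one historical manifest.
history = [
    (100, "cue-out"), (160, "cue-in"),
    (900, "cue-out"), (945, "cue-in"),
    (2000, "cue-out"), (2090, "cue-in"),
]
durations = break_durations(history)
print(durations)          # [60, 45, 90]
print(median(durations))  # 60
```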
In another embodiment, the database 218 is configured to store various AI or ML models required for implementing the functionality of the various modules of the processing module 202. The database 218 may include multiple storage units such as hard drives and/or solid-state drives in a redundant array of inexpensive disks (RAID) configuration. In some embodiments, the database 218 may include a storage area network (SAN) and/or a network-attached storage (NAS) system. In one embodiment, the database 218 may correspond to a distributed storage system, wherein individual databases are configured to store custom information such as content view logs, etc.
The database 218 may be accessed by the system 150 using a storage interface (not shown in FIG. 2). The storage interface is any component capable of providing the processing module 202 with access to the database 218. The storage interface may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processing module 202 with access to the database 218. Alternatively, in some embodiments, the database 218 is integrated within the system 150. For example, the system 150 may include one or more hard disk drives or solid state drives as the database 218. In an embodiment, the database 218 is depicted to further include a second media content distribution model 220. It is noted that the operation of the second media content distribution model 220 (hereinafter referred to as ‘distribution model 220’) is explained further with reference to FIG. 4 in the present disclosure.
The various components of the system 150, such as the processing module 202, the memory module 204, the I/O module 206, and the communication module 208 are configured to communicate with each other via or through a centralized circuit system 222. The centralized circuit system 222 may be various devices configured to, among other things, provide or enable communication between the components of the system 150. In certain embodiments, the centralized circuit system 222 may be a central printed circuit board (PCB) such as a motherboard, a main board, a system board, or a logic board. The centralized circuit system 222 may also, or alternatively, include other printed circuit assemblies (PCAs) or communication channel media.
As explained with reference to FIG. 1, the content provider platform 126 provides a playback URL and a modified manifest to the electronic device 104. The playback URL identifies a content repository server, such as the CDN 122 (shown in FIG. 1), which is caching the content (i.e., the first streaming media content 110) corresponding to the content title selected by the content viewer 102 for playback. The modified manifest includes information about the availability of encoded streams of different resolutions and a sequence of encoded content segments within each stream that are to be transmitted from the CDN 122 along with information related to the second media content for the content viewer 102. For example, a modified manifest corresponding to the first streaming media content 110 such as a live-streaming musical concert may include a schedule of streaming segments related to the live music concert event, which further includes a schedule of streaming segments related to the second media content such as Ads, which are to be inserted at locations identified by the second content position marker 118 inserted within the live media content stream (i.e., the first streaming media content 110).
The electronic device 104 sends a request for the first streaming media content 110 identified in the playback URL to the CDN 122. In one embodiment, the electronic device 104 is also configured to send the modified manifest to the CDN 122 for requesting a content stream for the first streaming media content 110 of a particular resolution. The system 150 on account of being in operative communication with the CDN 122 may receive the request for the first streaming media content 110 along with the modified manifest sent by the electronic device 104 to the CDN 122. More specifically, the communication module 208 in the system 150 may receive the request for the first streaming media content 110 via the modified manifest, and provide the modified manifest to the parsing sub-module 210 in the processing module 202. In some embodiments, the electronic device 104 may only send a request for content of a particular resolution to the CDN 122 and the system 150 on account of being in operative communication with the CDN 122 may receive the request for the first streaming media content 110 via the communication module 208. The communication module 208 may be configured to forward the request for the first streaming media content 110 to the parsing sub-module 210. The parsing sub-module 210 may then communicate, using the communication module 208, with the content handling server 124 or the content provider platform 126 to receive the modified manifest corresponding to the requested content (i.e., the first streaming media content 110).
In at least one embodiment, the parsing sub-module 210 is configured to parse the modified manifest to extract a schedule or a streaming order for transmitting encoded content segments corresponding to the first streaming media content 110 requested by the content viewer 102 and the second media content. In at least some embodiments, the schedule may include URLs for encoded content segments that need to be transmitted in sequence to the electronic device 104 of the content viewer 102. The schedule also includes URLs for a second media content that needs to be transmitted in-between transmission of encoded content segments to the electronic device 104.
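In a non-limiting illustration, the parsing described above may be sketched in Python as follows. The sketch assumes a plain-text, line-oriented manifest in which each non-comment line is a segment URL. The marker tags `#X-BREAK-START` and `#X-BREAK-END`, the `ScheduleEntry` type, and the `parse_schedule` function are hypothetical names introduced only for this illustration; they are not taken from the present disclosure or from any streaming standard.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical marker tags, assumed only for this sketch.
BREAK_START = "#X-BREAK-START"
BREAK_END = "#X-BREAK-END"


@dataclass
class ScheduleEntry:
    url: str
    kind: str  # "content" or "second_media"


def parse_schedule(manifest_text: str) -> List[ScheduleEntry]:
    """Return segment URLs in streaming order, labelled by content kind."""
    schedule: List[ScheduleEntry] = []
    in_break = False
    for raw in manifest_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(BREAK_START):
            in_break = True       # following URLs belong to the second media content
        elif line.startswith(BREAK_END):
            in_break = False      # back to the first streaming media content
        elif not line.startswith("#"):
            kind = "second_media" if in_break else "content"
            schedule.append(ScheduleEntry(line, kind))
    return schedule
```

The returned list preserves the order in which segments appear in the manifest, so it can serve as the schedule for both the encoded content segments and the second media content placed in between them.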
In an embodiment, the activity determination sub-module 212 is configured to determine a plurality of activities being performed within the first streaming media content 110. In particular, for determining the activities in the first streaming media content 110 the individual image frames of the first streaming media content 110 are detected and classified into the plurality of activities via the frame detection sub-module 214a and the frame classification sub-module 214b respectively. In a non-limiting example, an AI or ML model such as an action determination model may be utilized by the activity determination sub-module 212 to determine a plurality of activities being performed within the first streaming media content 110.
The distribution model 220 is trained on various image datasets and video libraries to detect or learn the context present within the summary key frames via the movement of objects and/or characters within the key summary frames. In particular, each content present within the library is run past various different ML models to identify frames or key frames within the said content. The goal of this process is to identify one or more actions or contexts present within the said frames by analyzing various objects, character motion, subtitles running in the content, scene boundaries, etc., present within the content. Further, the one or more actions are linked to the content category type: sports, action, kids, etc., among other categories. Then, the one or more actions (or the information related to these actions) are saved into a repository or database.
In an embodiment, the frame detection sub-module 214a is configured to identify a set of summary key frames from the first streaming media content 110. It is noted that the set of summary key frames represents video sequences from the first streaming media content 110, wherein the frame is a single still image within a video content. In an example, an AI or ML model is configured to identify a set of summary key frames. This aspect is explained in detail later in the present disclosure.
The frame classification sub-module 214b is configured to classify the set of summary key frames into an activity being performed within the first streaming media content 110. In an example, an AI or ML model is configured to classify the set of summary key frames into an activity being performed within the first streaming media content 110. This aspect is explained in detail later in the present disclosure. In various examples, if the first streaming media content 110 is a cricket game, then an activity may include players huddling together, the fall of a wicket, an announcement from the third umpire, idle practice by players, a player timeout, and the like.
The term ‘frame’ may be described as the smallest unit of video content. A ‘shot’ generally refers to a combination of multiple frames while a ‘scene’ refers to a combination of several continuous shots. To that end, scenes and shots are different from each other. In general, a shot is captured by a camera that operates for an uninterrupted period of time and thus is visually continuous; while a scene is a semantic unit at a higher level. The term ‘scene’ includes a sequence of shots that present a semantically coherent part of the story portrayed by the video content. The scene detection process performed by the scene or shot detection model aims to segment a video temporally into scenes such that the different scenes are separated by scene boundaries. Generally, advertisements or the second media content is streamed between scene boundaries as a mid-roll advertisement break to reduce interruptions for the content viewer 102.
In an embodiment, the generation sub-module 216 is configured to compute a break duration prediction for inserting the second media content based, at least in part, on the second content position marker 118. More specifically, an AI or ML model such as the second media content distribution model 220 is configured to compute the break duration prediction for inserting the second media content based, at least in part, on the activity determined using the second content position marker 118. This aspect is explained in detail later with reference to FIG. 5 of the present disclosure.
In another embodiment, the generation sub-module 216 is configured to access encoded content segments of the first streaming media content 110 corresponding to the modified manifest. Then, the generation sub-module 216 is configured to determine a maximum amount of the plurality of second media content to be streamed to the content viewer 102 between a cue-in marker and a cue-out marker of the second content position marker 118 by sorting each of the plurality of second media content based, at least in part, on the break duration prediction. In an instance, in order to prevent an Ad crash, the break duration prediction is used to sort the plurality of second media content between the cue-in marker and the cue-out marker of the second content position marker 118. This allows the plurality of second media content to be efficiently distributed between the cue-in marker and the cue-out marker of the second content position marker 118, such that the crash loss of the Ad content can be minimized. As may be understood, the second content position marker 118 acts as the cue-in marker while the cue-out marker is received from the content handling team 116.
The generation sub-module 216 is further configured to stream the second media content between the cue-in marker and the cue-out marker of the second content position marker 118. This aspect is explained in detail later with reference to FIG. 6C of the present disclosure.
In another embodiment, the generation sub-module 216 is configured to generate an updated manifest based, at least in part, on the modified manifest and the break duration prediction. In an example, the updated manifest includes a requested playback URL and the second content position marker 118. In another embodiment, the generation sub-module 216 is further configured to facilitate transmission of the updated manifest to the electronic device 104 associated with the content viewer 102.
FIG. 3 depicts a process flow 300 for determining a plurality of activities based, at least in part, on analyzing the first streaming media content 110, in accordance with an embodiment of the present disclosure. In an example, the first streaming media content 110 is streamed from the content repository server (such as the CDN 122) to the electronic device 104 of the content viewer 102. It is noted that the first streaming media content 302 of FIG. 3 is identical to the first streaming media content 110 depicted by FIG. 1. At first, the first streaming media content 302 is received by the frame detection sub-module 214a, as shown using a communication link 304.
The frame detection sub-module 214a is configured to perform a frame selection step and a frame classification step. The frame selection step involves comparing frames in a programmed and periodical manner with predefined frame events/actions, for example, bowling, hitting, boundary crossing, etc., to determine an activity in the first streaming media content 302. The frame classification step involves classifying the frames into predefined events for the determination or prediction of the Ad duration, which in turn estimates the optimized Ad that can be fitted into the first streaming media content 110.
More specifically, in an embodiment, the frame detection sub-module 214a is configured to estimate shots from the live-stream content such as the first streaming media content 302 using a shot detection model. Further, the frame detection sub-module 214a may train a classification model to determine an activity in the estimated shot. An example for such an activity is a bowling moment in the cricket match. The bowling moment is pre-defined based on the task (such as the cricket match) and it may be understood that the Ads will be inserted soon after the bowling moment. In an embodiment, it should be noted that the classification model is adapted from an action recognition model. Further, in a non-limiting example, the shot in which the bowling action happens is defined as a positive training sample while other shots may be defined as negative training samples. Moreover, in an embodiment, it may be noted that a temporal action recognition algorithm is used as a network to tune the task. In a non-limiting example, the temporal action recognition algorithm extracts about three to six keyframes as input from the first streaming media content 302. The temporal action recognition algorithm further extracts features based on performing convolutions between the corresponding keyframes and then conducts classification across the features.
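As a non-limiting sketch, the selection of about three to six keyframes from an estimated shot may be illustrated as below. The present disclosure does not state how the keyframes are chosen; evenly spaced sampling, the function name `sample_keyframe_indices`, and its parameters are assumptions made for illustration only.

```python
from typing import List


def sample_keyframe_indices(shot_start: int, shot_end: int, k: int = 4) -> List[int]:
    """Pick k evenly spaced frame indices from the half-open range [shot_start, shot_end)."""
    if not 3 <= k <= 6:
        raise ValueError("k is expected to be between 3 and 6")
    n_frames = shot_end - shot_start
    if n_frames < k:
        raise ValueError("shot is shorter than the requested number of keyframes")
    step = n_frames / k
    # Take the frame at the centre of each of the k equal sub-intervals.
    return [shot_start + int(step * i + step / 2) for i in range(k)]
```

The selected frames would then be passed to the classification model, which is outside the scope of this sketch.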
In an embodiment, the frame detection sub-module 214a is configured to extract a set of image frames from the first streaming media content 302 based, at least in part, on the second content position marker 118 included in the modified manifest. Then, the extracted set of image frames is analyzed to identify a set of summary key frames (see, 306(1), 306(2), 306(3), 306(4),…, 306(n), where ‘n’ is a natural number) of the first streaming media content 302. In an embodiment, the frame detection sub-module 214a is configured to identify the set of summary key frames of the first streaming media content 302 based, at least in part, on the set of image frames. The set of summary key frames represents video sequences from the first streaming media content 302. To identify the set of summary key frames, at first, the frame detection sub-module 214a determines whether each frame of the set of image frames is an important frame or an unimportant frame. Then, the important frames are classified as the set of summary key frames. Further, the set of summary key frames is transmitted to the frame classification sub-module 214b, as shown using a communication link 308.
The frame classification sub-module 214b is configured to analyze the set of summary key frames to determine an activity being performed within the set of summary key frames. In various examples, deep learning models such as AI or ML models may be used by the frame classification sub-module 214b to detect an activity based on analyzing the set of summary key frames. In the illustrated example, upon analyzing the set of summary key frames, the frame is classified into an activity. In an instance, frame 3 can be classified to be associated with an activity. Thereafter, the generation sub-module 216 is configured to compute the break duration prediction for inserting the second media content based, at least in part, on the activity being determined using the second content position marker 118.
FIG. 4 depicts a graph 400 illustrating a variance in the break duration prediction computed using the distribution model 220, in accordance with an embodiment of the present disclosure. It is understood that the graph 400 depicts the variance that is determined based on the break duration prediction computed for different orders of streaming the second media content.
In an embodiment, the distribution model 220 is an AI or ML model such as the second media content distribution model 220 (referred to hereafter interchangeably as ‘distribution model 220’) that is trained using historical second media content marker data accessed from the database 218. During the training process, the distribution model 220 is configured to use deep learning and pattern recognition techniques to determine a plurality of break patterns associated with a plurality of activities being performed during a plurality of historical live-streaming events. In an embodiment, frame recognition and frame classification techniques may be used by the distribution model 220 to determine the plurality of activities being performed during a particular historical live-streaming event. In a non-limiting example, historical second media content marker data associated with a historical sporting event may be used for training the distribution model 220. The historical second media content marker data includes information related to various cue-in and cue-out markers received while streaming the live-streaming event in the past. The distribution model 220 performs frame determination and classification techniques to determine different activities that led to the insertion of the second media content markers in the manifest. For example, if the historical live-streaming event is a sporting event such as a cricket match, then the distribution model 220 may determine using frame determination and classification that activities such as players huddling together, the fall of a wicket, an announcement from the third umpire, idle practice by players, a player timeout, and the like led to the insertion of the second media content markers. It is noted that the distribution model 220 analyzes a plurality of historical live-streaming events to determine the plurality of relevant activities.
Further, the distribution model 220 determines an average time duration, i.e., an average break duration, associated with each of the plurality of relevant activities. Further, it is noted that the distribution model 220 is trained using historical second content data associated with a plurality of different historical live-streaming events. Therefore, the distribution model 220 can be applied to gain insights for various different live-streaming events and allows for a wide range of applicability.
In another embodiment, the distribution model 220 is configured to analyze an ongoing stream of the first streaming media content 110, i.e., a live media content stream, to determine a suitable distribution of the second media content before it is streamed to the content viewer 102, to ensure that a maximum amount of second media content is streamed to the content viewer 102 before a second media content crash due to a cue-out marker is encountered. At first, the distribution model 220 analyzes the modified manifest to determine the presence of a second media content marker. Upon determining that the second media content marker data is present in the modified manifest, the distribution model 220 performs frame determination and frame classification using the techniques described earlier with reference to FIG. 3 to determine an activity being performed during the live-streaming event. Based on the determined activity, the distribution model 220 computes a prediction of break duration (also referred to as break duration probability, or break duration prediction) based, at least in part, on an average break duration previously determined for a historical activity corresponding to the determined activity. In a non-limiting example, an equation for computing the break duration prediction (P) is given below:
P(B) = [(D1, F1), (D2, F2)…( Dn, Fn)] … Eqn. (1)
Herein, D1 to Dn represent the average break durations corresponding to the determined activity, determined based on analyzing a plurality of historical live-streaming events, and F1 to Fn are the frequencies of breaks during the plurality of historical live-streaming events.
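A minimal sketch of how the (duration, frequency) pairs of Eqn. (1) might be assembled from historical data is given below. It assumes that historical break durations are already grouped by activity label and rounded to whole seconds, and it expresses each frequency as a relative frequency; the function name `break_distribution`, the `history` mapping, and the normalization are assumptions made for illustration and are not specified by the present disclosure.

```python
from collections import Counter
from typing import Dict, List, Tuple


def break_distribution(history: Dict[str, List[int]], activity: str) -> List[Tuple[int, float]]:
    """Return [(D1, F1), ..., (Dn, Fn)] for one activity.

    D is an observed break duration in seconds and F is its relative
    frequency among the historical breaks recorded for that activity.
    """
    durations = history.get(activity, [])
    if not durations:
        return []
    counts = Counter(durations)
    total = len(durations)
    return [(d, counts[d] / total) for d in sorted(counts)]
```

For instance, a hypothetical history of four breaks lasting 60, 60, 90, and 120 seconds for one activity would yield the pairs (60, 0.5), (90, 0.25), and (120, 0.25).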
Then, the distribution model 220 accesses and analyzes a second media content from the second content server 128. The analysis further includes determining the duration and the type of each of the second media content (SM). In various non-limiting examples, the type of the second media content may include educational, fashion, finance, retail, and the like. In a non-limiting example, an equation for representing the second media content is given below:
SM = [(DS1, T1), (DS2, T2)…( DSn, Tn)] … Eqn. (2)
Herein, DS1 to DSn represent the duration of each of the second media content and T1 to Tn represent the type of each of the second media content. Herein, n is a non-zero natural number.
Further, the distribution model 220 is configured to generate streaming orders (referred to hereinafter interchangeably as a ‘plurality of impressions’) for the second media content based, at least in part, on permutation and combination techniques. It is noted that in some scenarios, the various permutations and combinations are generated such that no two consecutive second media content share the same type. Furthermore, a Crash Loss (CL) is computed by the distribution model 220 for each of the plurality of streaming orders, i.e., each impression of the plurality of impressions. It is understood that the term ‘crash loss’ refers to an average length of crashed second media content per impression. In a non-limiting example, an equation for computing the Crash Loss (CL) is given below:
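CL = (Σ_(i=1)^N I(Pi = T …
Herein, the crashed length for a given streaming order and a given break duration T is computed as follows:
DS_Sum = 0
for each DS in order
    if DS_Sum + DS > T
        return T - DS_Sum
    else
        DS_Sum = DS_Sum + DS
    endif
endfor
return 0
- CL (crash loss | order = [DS1, DS2, DS3, DS4]) is a minimum, that is, 3.6, so the final result is [DS2, DS4, DS3, DS1]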
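The selection of a streaming order may be sketched as follows. The sketch assumes that the crash loss of an order is the crashed length weighted by the frequency of each predicted break duration, which is one reading of the ‘average length of crashed second media content per impression’; the exact weighting used by the distribution model 220 is not recoverable from the text above. The names `crash`, `candidate_orders`, `expected_crash_loss`, and `best_order` are hypothetical, and exhaustive permutation is practical only for small sets of second media content.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

Ad = Tuple[int, str]                 # (duration in seconds, type), as in Eqn. (2)
Dist = Sequence[Tuple[int, float]]   # (break duration, frequency), as in Eqn. (1)


def crash(order: Sequence[Ad], T: int) -> int:
    """Seconds of the interrupted Ad that were streamed before the break of length T ended."""
    ds_sum = 0
    for ds, _type in order:
        if ds_sum + ds > T:
            return T - ds_sum
        ds_sum += ds
    return 0


def candidate_orders(ads: Sequence[Ad]) -> List[List[Ad]]:
    """All orders in which no two consecutive Ads share a type (all orders if none qualify)."""
    all_orders = [list(p) for p in permutations(ads)]
    no_repeat = [o for o in all_orders
                 if all(a[1] != b[1] for a, b in zip(o, o[1:]))]
    return no_repeat or all_orders


def expected_crash_loss(order: Sequence[Ad], dist: Dist) -> float:
    """Frequency-weighted average crashed length over the predicted break durations."""
    total = sum(f for _d, f in dist)
    if total <= 0:
        raise ValueError("break duration prediction must have positive total frequency")
    return sum(f * crash(order, d) for d, f in dist) / total


def best_order(ads: Sequence[Ad], dist: Dist) -> List[Ad]:
    """Order with the smallest expected crash loss."""
    return min(candidate_orders(ads), key=lambda o: expected_crash_loss(o, dist))
```

Calling `best_order(ads, dist)` with the list of (duration, type) pairs and the (duration, frequency) pairs returns one order with minimal expected crash loss; ties are resolved by permutation order, which is an arbitrary choice of this sketch.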

It is understood that a variance (depicted using graph 400) exists in the break duration prediction when it is computed for different data points present within the plurality of historical live-streaming events, i.e., old data and recent data corresponding to the historical live-streaming events. The term ‘variance’ refers to changes in the output of the model (such as the distribution model 220) due to using different portions of the training data set (i.e., old and recent data points within the plurality of historical live-streaming events). Experimentally, it has been determined that the break duration predictions computed using the old and recent data points are not completely in line, as there is always some variance present. To overcome this discrepancy or to minimize the variance, the break duration prediction is calculated based on an ‘X’ duration instead of the entire ‘M’ duration. Experimentally, it has been determined that the break duration prediction calculated for the X duration, i.e., P(X), is more stable and has a smaller variance when compared to the break duration prediction calculated for the M duration, i.e., P(M). Additionally, the difference between the prediction P(M) and the prediction P(X) is the variance V.
Referring now to FIG. 5, a representation 500 of the maximum break duration ‘M’ is shown, in accordance with an embodiment of the present disclosure. In an embodiment, the distribution model 220 is optimized to reduce the variance by calculating the break duration prediction for the X duration. As depicted by FIG. 5, the maximum break duration (hereinafter referred to as ‘M’) includes a delay time (D), an actual break duration (B), and a residual time (R). The maximum break duration may be defined as the overall time between the cue-in and cue-out position markers (see, position marker 1 and position marker 2 respectively). The term ‘delay time’ or D refers to the time delay between receiving the position marker 1 (i.e., cue-in marker) and the beginning of the Ad (i.e., second media content) streaming (see, Ad break starts). More specifically, the delay duration indicates a duration of the time delay (D) between the beginning of the position marker 1 (i.e., cue-in marker) of the second content position marker 118 and the beginning of the Ad (i.e., second media content). It is understood that the Ad does not start at the exact timestamp on which the position marker 1 is received; rather, it takes a few microseconds or seconds before the Ad streaming is initiated. Further, the term ‘residual time’ or R refers to the time between any given time ‘t’ and the next epoch of the renewal process under consideration. In other words, the time between the finish of the Ad break and the resumption of the live-streaming event (i.e., the first streaming media content 110) back to the content viewer 102 is known as the ‘residual time’. As may be understood, the streaming of live media content does not start immediately after the timestamp at which the Ad break has finished; there is always some latency involved in the process of resuming the live media content stream (at position marker 2) after an Ad break.
In various non-limiting examples, the phenomenon of residual time takes place due to certain reasons such as, but not limited to, time taken for data transfer, delay due to processing by the electronic device 104 of the content viewer 102, and the like. The residual time R may also be called a buffer that has to be maintained before receiving the cue-out marker so that the live stream can be resumed exactly upon receiving the cue-out marker (i.e., position marker 2). For example, consider a cricket match as live-streaming content, and an Ad break has to be added between two bowling overs during the cricket match. Then, if the first bowling of the next over is supposed to start at ‘L min: M sec’ then the Ad will get crashed at ‘L min: M sec – R sec’ to make sure that the content viewer 102 does not miss even a single second of live-streaming media content. It is noted that the residual time is also highly inconsistent and it changes largely depending on the situation or the activity being performed within the live-streaming event.
To counteract the inconsistent R, the break duration prediction is calculated for the portion X. In an embodiment, the portion ‘X’ is the combination of the delay time D and the actual break duration B. It is understood that calculating the break duration prediction over the X duration improves the accuracy of the break duration prediction and decreases the variance. The optimized break duration prediction calculated for the X duration, i.e., P(X), is more stable and has a smaller variance when compared to the break duration prediction calculated for the M duration, i.e., P(M).
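The decomposition of the maximum break duration into M = D + B + R, and the portion X = D + B used for the prediction, can be illustrated with a small data structure. The class name `BreakTimeline` and its four timestamp fields are hypothetical; the sketch assumes that all four timestamps of a historical break are available in seconds.

```python
from dataclasses import dataclass


@dataclass
class BreakTimeline:
    cue_in: float     # position marker 1 received
    ad_start: float   # Ad break starts
    ad_end: float     # Ad break ends
    cue_out: float    # position marker 2 received

    @property
    def delay(self) -> float:       # D
        return self.ad_start - self.cue_in

    @property
    def actual(self) -> float:      # B
        return self.ad_end - self.ad_start

    @property
    def residual(self) -> float:    # R
        return self.cue_out - self.ad_end

    @property
    def x(self) -> float:           # X = D + B, excludes the inconsistent R
        return self.delay + self.actual

    @property
    def m(self) -> float:           # M = D + B + R
        return self.x + self.residual
```

Collecting the `x` values rather than the `m` values of historical breaks is what removes the residual time from the data on which the break duration prediction is computed.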
FIG. 6A, 6B, and 6C, collectively, depict a comparison between crash losses calculated using the conventional methods and the proposed method, in accordance with an embodiment of the present disclosure.
As shown in reference to FIG. 6A, within the conventional method 610, the streaming order for the second media content, i.e., the plurality of Ads, is random. In the illustrative example, upon receiving the cue-in position marker, it is determined that four Ads should be shown to the content viewer 102. In this scenario, the respective durations of the four Ads are [30 sec, 10 sec, 20 sec, 5 sec]. Now, for an Ad break duration of 58 sec, only the first two Ads of 30 sec and 10 sec durations respectively can be shown to the content viewer 102, while the third Ad of 20 sec duration has to be crashed in between at the 18th second timestamp. This leads to a total crash loss of 18 seconds. As may be understood, crashing the Ad while it is being streamed leads to a poor experience for the content viewer 102 and a loss of revenue, since the promoters will not pay for crashed Ads even if 18 sec out of the 20 sec were streamed to the content viewer 102.
As shown in reference to FIG. 6B, within another conventional method 620, the streaming order for the second media content, i.e., the plurality of Ads, is determined by sorting the Ads in descending order of time duration. In the illustrative example, upon receiving the cue-in position marker, it is determined that four Ads should be shown to the content viewer 102. In this scenario, the respective durations of the four Ads are [30 sec, 10 sec, 20 sec, 5 sec]. At first, the four Ads are sorted in descending order of time duration before they are streamed to the content viewer 102, i.e., [30 sec, 20 sec, 10 sec, 5 sec]. Now, for an Ad break duration of 58 sec, only the first two Ads of 30 sec and 20 sec durations respectively can be shown to the content viewer 102, while the third Ad of 10 sec duration has to be crashed in between at the 8th second timestamp. This leads to a total crash loss of 8 seconds. Although the overall crash loss is reduced, it is noted that the approach provided by the present disclosure reduces the crash loss even further.
As shown in reference to FIG. 6C, using the proposed method 630, the crash loss is reduced even further. In an embodiment, the streaming order for the second media content, i.e., the plurality of Ads, is determined by sorting the Ads based, at least in part, on the break duration prediction computed by the distribution model 220. In the illustrative example, upon receiving the cue-in position marker, it is determined that four Ads should be shown to the content viewer 102. In this scenario, the respective durations of the four Ads are [30 sec, 10 sec, 20 sec, 5 sec]. At first, the four Ads are sorted based on the break duration prediction before they are streamed to the content viewer 102, i.e., [30 sec, 20 sec, 5 sec, 10 sec]. Now, for an Ad break duration of 58 sec, the first three Ads of 30 sec, 20 sec, and 5 sec durations respectively can be shown to the content viewer 102, while the fourth Ad of 10 sec duration has to be crashed in between at the 3rd second timestamp. This leads to a total crash loss of 3 seconds. Therefore, the approach disclosed by the present disclosure maximizes the Ads that are shown to the content viewer 102 while also reducing the overall crash loss. It should be noted that the Ad inventory includes different Ad lengths for different user interests or cohorts. One or a group of Ads with a certain length is selected from the Ad inventory based on the predicted Ad break length.
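The arithmetic of FIGS. 6A to 6C can be checked with a short sketch that replays each streaming order against the 58 sec break used in the illustrative example. The function name `crashed_seconds` is hypothetical; the durations and orders are those stated above.

```python
from typing import Sequence


def crashed_seconds(order: Sequence[int], break_sec: int) -> int:
    """Seconds of the interrupted Ad already streamed when the break ends."""
    played = 0
    for duration in order:
        if played + duration > break_sec:
            return break_sec - played
        played += duration
    return 0


BREAK_SEC = 58
orders = {
    "FIG. 6A (random)": [30, 10, 20, 5],
    "FIG. 6B (descending)": [30, 20, 10, 5],
    "FIG. 6C (proposed)": [30, 20, 5, 10],
}
losses = {name: crashed_seconds(order, BREAK_SEC) for name, order in orders.items()}
# losses == {"FIG. 6A (random)": 18, "FIG. 6B (descending)": 8, "FIG. 6C (proposed)": 3}
```

The three results match the crash losses of 18, 8, and 3 seconds described for the respective figures.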
FIG. 7 depicts a flow diagram of a method 700 for optimizing a second media content insertion during a live media content stream, in accordance with an embodiment of the present disclosure. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry and/or by a system such as the system 150 explained with reference to FIGS. 2 to 5B and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 700 starts at operation 702.
At the operation 702, the method 700 includes receiving, by the system 150, a modified manifest corresponding to first streaming media content (such as the first streaming media content 110) requested by a content viewer (such as the content viewer 102) from a content repository server. The modified manifest includes at least second content position marker 118 and a plurality of second media content playback URLs. In various examples, the modified manifest further includes at least a plurality of encoded content playback Uniform Resource Locators (URLs) each corresponding to an encoding ladder of the first streaming media content 110, encoded stream availability data, sequence data of encoded content segments, and the like.
At operation 704, the method 700 includes accessing, by the system 150, a second media content from a database (such as the database 218 explained in FIG. 2) associated with the system 150 based, at least in part, on the modified manifest. In an alternative embodiment, the second media content is accessed from the second content server 128.
At operation 706, the method 700 includes computing, by the system 150 via a machine learning model, a break duration prediction for inserting the second media content based, at least in part, on the second content position marker 118.
At operation 708, the method 700 includes generating, by the system 150, an updated manifest based, at least in part, on the modified manifest and the break duration prediction.
At operation 710, the method 700 includes facilitating, by the system 150, a transmission of the updated manifest to the electronic device 104 associated with the content viewer 102.
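One possible reading of the generating operation is sketched below: the second media content URLs listed between the break markers of the modified manifest are replaced by the same URLs in the determined streaming order. The manifest is assumed to be a list of text lines; the marker tags `#X-BREAK-START` and `#X-BREAK-END` and the function name `build_updated_manifest` are hypothetical and are not taken from the present disclosure or from any streaming standard.

```python
from typing import List, Sequence

# Hypothetical marker tags, assumed only for this sketch.
BREAK_START = "#X-BREAK-START"
BREAK_END = "#X-BREAK-END"


def build_updated_manifest(modified_manifest: Sequence[str],
                           ordered_ad_urls: Sequence[str]) -> List[str]:
    """Copy the manifest, rewriting the break body with the chosen streaming order."""
    updated: List[str] = []
    in_break = False
    for line in modified_manifest:
        if line == BREAK_START:
            updated.append(line)
            updated.extend(ordered_ad_urls)   # insert Ads in the determined order
            in_break = True
        elif line == BREAK_END:
            updated.append(line)
            in_break = False
        elif not in_break:
            updated.append(line)              # first streaming media content is untouched
        # lines inside the break are dropped because they were re-emitted above
    return updated
```

The resulting list of lines would then be serialized and transmitted to the electronic device 104 in place of the modified manifest.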
FIG. 8 depicts a simplified block diagram of a Content Delivery Network (CDN) 800, in accordance with various embodiments of the present disclosure. The CDN 800 is an example of the CDN 122 of FIG. 1. The CDN 800 refers to a distributed group of servers that are connected via a network (such as Network 804, which is explained later). The CDN 800 provides quick delivery of media content to various content viewers. The CDN 800 includes a plurality of interconnected servers that may interchangeably be referred to as a plurality of content repository servers. The CDN 800 includes an origin CDN server 802, a public CDN server 806, a private CDN server 808, a Telecommunication CDN server (referred to hereinafter as ‘Telco CDN server’) 810, an Internet Service Provider CDN server (referred to hereinafter as ‘ISP CDN server’) 812, and a CDN point of presence server (referred to hereinafter as ‘CDN POP server’) 814 each coupled to, and in communication with (and/or with access to) the network 804. It is noted that CDN POP may also be interchangeably referred to as ‘sub-CDNs’, ‘subnet CDN’, ‘surrogate CDN’, and ‘CDN sub-box’. Further, two or more components of the CDN 800 may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the CDN 800 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.
The network 804 may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber-optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts illustrated in FIG. 8, or any combination thereof. Various servers within the CDN 800 may connect to the network 804 using various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, future communication protocols or any combination thereof. For example, the network 804 may include multiple different networks, such as a private network made accessible by the origin CDN server 802 and a public network (e.g., the Internet, etc.) through which the various servers may communicate.
The origin CDN server 802 stores the media content accessed/downloaded from the streaming content provider and/or content producers. The origin CDN server 802 serves the media content to one or more cache servers which are either located in the vicinity of the content viewer/subscriber 102 or connected to another cache server located in the content viewer’s vicinity. In various examples, cache servers include the public CDN server 806, the private CDN server 808, the Telco CDN server 810, the ISP CDN server 812, the CDN POP server 814, and the like.
The origin CDN server 802 includes a processing system 816, a memory 818, a database 820, and a communication interface 822. The processing system 816 is configured to extract programming instructions from the memory 818 to perform various functions of the CDN 800. In one example, the processing instructions include instructions for ingesting media content via the communication interface 822 from a remote database 824 which may further include one or more data repositories/databases (not shown) to an internal database such as the database 820. The remote database 824 is associated with a streaming content provider and/or content producer. In another example, the media content stored within the database 820 can be served to one or more cache servers via the communication interface 822 over the network 804.
In some examples, the public CDN server 806 is associated with a public CDN provider which hosts media content among other types of data for different content providers within the same server. The private CDN server 808 is associated with a private CDN provider (such as a streaming content provider) which hosts media content for serving the needs of the content viewer 102. The Telco CDN server 810 is associated with telecommunication service providers which provide content hosting services to various entities such as the streaming content platform. The ISP CDN server 812 is associated with internet service providers which provide content hosting services to various entities such as the streaming content platform. The CDN POP server 814 caches content and allows the electronic devices of the content viewers to stream the content. It is noted that the various cache servers download and cache media content from the origin CDN server 802 and further allow a valid user or content viewer 102 to stream the media content.
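The relationship between the origin CDN server 802 and the cache servers described above can be illustrated with a minimal pull-through cache sketch. The class names `OriginServer` and `CacheServer`, the `viewer_is_valid` flag, and the in-memory dictionaries are hypothetical and chosen only for illustration; the disclosure does not specify how caching or viewer validation is implemented.

```python
class OriginServer:
    """Hypothetical stand-in for the origin CDN server 802."""

    def __init__(self, content):
        # Maps a content key (e.g. a segment name) to its bytes.
        self._content = dict(content)

    def fetch(self, key):
        return self._content[key]


class CacheServer:
    """Hypothetical stand-in for a cache server such as the CDN POP server 814."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}
        self.origin_fetches = 0  # counts downloads from the origin

    def stream(self, key, viewer_is_valid):
        # Only a valid content viewer may stream (validation itself is assumed).
        if not viewer_is_valid:
            raise PermissionError("viewer is not authorized to stream")
        # On a cache miss, download from the origin and keep a local copy.
        if key not in self._cache:
            self._cache[key] = self._origin.fetch(key)
            self.origin_fetches += 1
        return self._cache[key]
```

In this sketch, repeated requests for the same key are served from the cache server's local copy, so the origin is contacted only once per key.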
FIG. 9 depicts a flow diagram of a method 900 for optimizing a second media content insertion during a live media content stream, in accordance with an embodiment of the present disclosure. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry and/or by a system such as the system 150 explained with reference to FIG. 2 to FIG. 8 and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 900 starts at operation 902.
At operation 902, the method 900 includes receiving, by the system 150, a modified manifest corresponding to first streaming media content (such as the first streaming media content 110) requested by a content viewer (such as the content viewer 102). The modified manifest includes at least a second content position marker 118 for inserting a plurality of second media content within the first streaming media content 110.
At operation 904, the method 900 includes computing, by the system 150 via a distribution model (for example, the distribution model 220), a break duration prediction based, at least in part, on the first streaming media content 110 and the second content position marker 118.
At operation 906, the method 900 includes determining, by the system 150, a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction. Herein, the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content 110.
At operation 908, the method 900 includes generating, by the system 150, an updated manifest based, at least in part, on the modified manifest and the streaming order.
At operation 910, the method 900 includes facilitating, by the system 150, a transmission of the updated manifest to the electronic device 104 associated with the content viewer 102.
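The flow of operations 902 to 910 can be sketched as follows. This is an illustrative outline only: the `Ad` and `Manifest` data classes, the function `run_method_900`, and the injected callables `predict_break`, `choose_order`, and `send` are hypothetical names, and the distribution model and ordering logic are passed in as parameters rather than implemented here.

```python
from dataclasses import dataclass, replace
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Ad:
    url: str          # playback URL of one second media content item
    duration: float   # length in seconds


@dataclass(frozen=True)
class Manifest:
    content_id: str          # identifies the first streaming media content
    marker_position: float   # second content position marker, in seconds
    ads: Tuple[Ad, ...]      # second media content, in streaming order


def run_method_900(
    modified: Manifest,                                    # operation 902 input
    predict_break: Callable[[str, float], float],
    choose_order: Callable[[Sequence[Ad], float], Sequence[Ad]],
    send: Callable[[Manifest], None],
) -> Manifest:
    # Operation 904: predict the break duration (model supplied by the caller).
    predicted = predict_break(modified.content_id, modified.marker_position)
    # Operation 906: determine the streaming order from the prediction.
    ordered = choose_order(modified.ads, predicted)
    # Operation 908: build the updated manifest from the modified manifest.
    updated = replace(modified, ads=tuple(ordered))
    # Operation 910: hand the updated manifest to the transport layer.
    send(updated)
    return updated
```

Passing the prediction and ordering steps as callables keeps the sketch independent of any particular distribution model or selection rule.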
It is noted that various embodiments of the present disclosure, the various functions of the system 150, or the methods disclosed in FIG. 7 and/or FIG. 9 can be implemented using any one or more components of the CDN 800, such as the origin CDN server 802 and/or one or more cache servers, individually and/or in combination with each other. Alternatively, the system 150 can be communicably coupled with the CDN 800 to perform the various embodiments or methods described by the present disclosure.
Various embodiments disclosed herein provide numerous advantages. More specifically, the embodiments disclosed herein suggest techniques for reducing Ad crashes, reducing the wastage of Ad inventory, and reducing the financial damages to the content provider. It is noted that providing a streaming order for each of the plurality of second media content enables the electronic device of the content viewer to easily parse the updated manifest and show the Ads to the content viewer 102. Further, since the selected streaming order is the one associated with the least crash loss, it is ensured that a maximum amount of the second media content is shown to the content viewer 102 before the cue-out marker is received.
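Selecting the streaming order with the least crash loss can be sketched as below. The sketch rests on several assumptions that are not stated in the disclosure: the break duration prediction is represented as a list of sampled durations, an Ad counts as crashed when the break ends partway through its playback, the crashed length is taken to be the full length of that Ad, and candidate orders are enumerated by brute force. The function names are hypothetical.

```python
from itertools import permutations


def crashed_length(order, break_duration):
    """Length of the Ad cut off mid-playback, or 0.0 if none is cut off."""
    elapsed = 0.0
    for ad in order:
        if elapsed >= break_duration:
            return 0.0          # break ended exactly at an Ad boundary
        if elapsed + ad > break_duration:
            return float(ad)    # this Ad was still playing at cue-out
        elapsed += ad
    return 0.0                  # every Ad finished before the break ended


def crash_loss(order, break_samples):
    """Average crashed length over the sampled break durations (assumed form)."""
    return sum(crashed_length(order, d) for d in break_samples) / len(break_samples)


def select_streaming_order(ad_durations, break_samples):
    """Return the ordering of Ad durations with the least crash loss."""
    return min(permutations(ad_durations),
               key=lambda order: crash_loss(order, break_samples))
```

For Ad lengths of 10, 20, and 30 seconds and sampled break durations of 30, 30, and 60 seconds, the order (10, 20, 30) has a crash loss of 0 because every sampled break ends on an Ad boundary, whereas (10, 30, 20) has a crash loss of 20. Enumerating all permutations grows factorially with the number of Ads, so this brute-force form is only practical for small Ad pods.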
The foregoing descriptions of specific embodiments of the present disclosure have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the present invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiment was chosen and described in order to best explain the principles of the present invention and its practical application, thereby enabling others skilled in the art to best utilize the present invention and various embodiments with various modifications as are suited to the particular use contemplated.
CLAIMS:
1. A computer-implemented method, comprising:
receiving, by a system, a modified manifest corresponding to first streaming media content requested by a content viewer, the modified manifest comprising a second content position marker for inserting a plurality of second media content within the first streaming media content;
computing, by the system via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker;
determining, by the system, a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction, wherein the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content;
generating, by the system, an updated manifest based, at least in part, on the modified manifest and the streaming order; and
facilitating, by the system, a transmission of the updated manifest to an electronic device associated with the content viewer.

2. The computer-implemented method as claimed in claim 1, wherein computing the break duration prediction comprises:
extracting, by the system, a set of image frames from the first streaming media content based, at least in part, on the second content position marker;
identifying, by the system, a set of summary key frames of the first streaming media content, based, at least in part, on the set of image frames;
determining, by the system via a classification model, an activity being performed within the set of summary key frames, based, at least in part, on the set of summary key frames; and
determining, by the system via the distribution model, the break duration prediction for inserting the plurality of second media content based, at least in part, on the determined activity.

3. The computer-implemented method as claimed in claim 1, wherein the break duration prediction indicates a combination of delay time and actual break duration before receiving a cue-out marker.

4. The computer-implemented method as claimed in claim 1, wherein determining the streaming order for each of the plurality of second media content comprises:
sorting, by the system, each of the plurality of second media content in a plurality of streaming orders based, at least in part, on the break duration prediction, wherein an individual streaming order indicates an individual order for streaming each of the plurality of second media content;
computing, by the system, a crash loss for each of the plurality of streaming orders, wherein the crash loss indicates an average length of crashed second media content corresponding to the individual streaming order; and
selecting, by the system, the streaming order for each of the plurality of second media content based, at least in part, on the streaming order being associated with the least crash loss.

5. The computer-implemented method as claimed in claim 1, wherein receiving the modified manifest comprises:
receiving, by the system, an original manifest corresponding to the first streaming media content requested by the content viewer;
determining, by the system, the plurality of second media content to be inserted within the first streaming media content at a position marked by the second content position marker;
determining, by the system, a plurality of second media content playback Uniform Resource Locators (URLs) associated with each of the plurality of second media content; and
generating, by the system, the modified manifest based, at least in part, on the plurality of second media content playback URLs.

6. The computer-implemented method as claimed in claim 5, wherein determining the plurality of second media content comprises:
accessing, by the system, a profile associated with the content viewer from a database associated with the system;
classifying, by the system, the content viewer into at least one cohort based, at least in part, on the profile; and
accessing, by the system, the plurality of second media content relevant for the content viewer based, at least in part, on the at least one cohort.

7. The computer-implemented method as claimed in claim 5, wherein determining the plurality of second media content comprises:
accessing, by the system, a profile associated with the content viewer from a database associated with the system;
determining, by the system, a preference of the content viewer based, at least in part, on the profile; and
accessing, by the system, the plurality of second media content relevant for the content viewer based, at least in part, on the preference.

8. The computer-implemented method as claimed in claim 1, wherein the distribution model is trained to predict the break duration prediction based, at least in part, on historical second media content marker data.

9. The computer-implemented method as claimed in claim 1, wherein the system is one of a content repository server, Server Guided Ad Insertion (SGAI) server, or a Server-Side Ad Insertion (SSAI) server.

10. A system comprising:
a memory for storing instructions; and
a processor configured to execute the instructions and thereby cause the system, at least in part, to:
receive a modified manifest corresponding to first streaming media content requested by a content viewer, the modified manifest comprising a second content position marker for inserting a plurality of second media content within the first streaming media content;
compute, via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker;
determine a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction, wherein the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content;
generate an updated manifest based, at least in part, on the modified manifest and the streaming order; and
facilitate a transmission of the updated manifest to an electronic device associated with the content viewer.

11. The system as claimed in claim 10, wherein to compute the break duration prediction, the system is caused to:
extract a set of image frames from the first streaming media content based, at least in part, on the second content position marker;
identify a set of summary key frames of the first streaming media content, based, at least in part, on the set of image frames;
determine, via a classification model, an activity being performed within the set of summary key frames, based, at least in part, on the set of summary key frames; and
determine, via the distribution model, the break duration prediction for inserting the plurality of second media content based, at least in part, on the determined activity.

12. The system as claimed in claim 10, wherein the break duration prediction indicates a combination of delay time and actual break duration before receiving a cue-out marker.

13. The system as claimed in claim 10, wherein to determine the streaming order for each of the plurality of second media content, the system is caused to:
sort each of the plurality of second media content in a plurality of streaming orders based, at least in part, on the break duration prediction, wherein an individual streaming order indicates an individual order for streaming each of the plurality of second media content;
compute a crash loss for each of the plurality of streaming orders, wherein the crash loss indicates an average length of crashed second media content corresponding to the individual streaming order; and
select the streaming order for each of the plurality of second media content based, at least in part, on the streaming order being associated with the least crash loss.

14. The system as claimed in claim 10, wherein to receive the modified manifest, the system is caused to:
receive an original manifest corresponding to the first streaming media content requested by the content viewer;
determine the plurality of second media content to be inserted within the first streaming media content at a position marked by the second content position marker;
determine a plurality of second media content playback Uniform Resource Locators (URLs) associated with each of the plurality of second media content; and
generate the modified manifest based, at least in part, on the plurality of second media content playback URLs.

15. The system as claimed in claim 14, wherein to determine the plurality of second media content, the system is caused to:
access a profile associated with the content viewer from a database associated with the system;
classify the content viewer into at least one cohort based, at least in part, on the profile; and
access the plurality of second media content relevant for the content viewer based, at least in part, on the at least one cohort.

16. The system as claimed in claim 14, wherein to determine the plurality of second media content, the system is caused to:
access a profile associated with the content viewer from a database associated with the system;
determine a preference of the content viewer based, at least in part, on the profile; and
access the plurality of second media content relevant for the content viewer based, at least in part, on the preference.

17. The system as claimed in claim 10, wherein the distribution model is trained to predict the break duration prediction based, at least in part, on historical second media content marker data.

18. The system as claimed in claim 10, wherein the system is one of a content repository server, Server Guided Ad Insertion (SGAI) server, or a Server-Side Ad Insertion (SSAI) server.

19. A non-transitory computer-readable storage medium comprising computer-executable instructions that, when executed by at least a processor of a system, cause the system to perform a method comprising:
receiving a modified manifest corresponding to first streaming media content requested by a content viewer, the modified manifest comprising a second content position marker for inserting a plurality of second media content within the first streaming media content;
computing, via a distribution model, a break duration prediction based, at least in part, on the first streaming media content and the second content position marker;
determining a streaming order for each of the plurality of second media content based, at least in part, on the break duration prediction, wherein the streaming order enables a maximum amount of the plurality of second media content to be streamed in between the first streaming media content;
generating an updated manifest based, at least in part, on the modified manifest and the streaming order; and
facilitating a transmission of the updated manifest to an electronic device associated with the content viewer.

20. The non-transitory computer-readable storage medium as claimed in claim 19, wherein computing the break duration prediction comprises:
extracting a set of image frames from the first streaming media content based, at least in part, on the second content position marker;
identifying a set of summary key frames of the first streaming media content, based, at least in part, on the set of image frames;
determining, via a classification model, an activity being performed within the set of summary key frames, based, at least in part, on the set of summary key frames; and
determining, via the distribution model, the break duration prediction for inserting the plurality of second media content based, at least in part, on the determined activity.

Documents

Application Documents

# Name Date
1 202321067981-STATEMENT OF UNDERTAKING (FORM 3) [10-10-2023(online)].pdf 2023-10-10
2 202321067981-PROVISIONAL SPECIFICATION [10-10-2023(online)].pdf 2023-10-10
3 202321067981-POWER OF AUTHORITY [10-10-2023(online)].pdf 2023-10-10
4 202321067981-FORM 1 [10-10-2023(online)].pdf 2023-10-10
5 202321067981-DRAWINGS [10-10-2023(online)].pdf 2023-10-10
6 202321067981-DECLARATION OF INVENTORSHIP (FORM 5) [10-10-2023(online)].pdf 2023-10-10
7 202321067981-Proof of Right [30-10-2023(online)].pdf 2023-10-30
8 202321067981-PA [08-10-2024(online)].pdf 2024-10-08
9 202321067981-ASSIGNMENT DOCUMENTS [08-10-2024(online)].pdf 2024-10-08
10 202321067981-8(i)-Substitution-Change Of Applicant - Form 6 [08-10-2024(online)].pdf 2024-10-08
11 202321067981-DRAWING [10-10-2024(online)].pdf 2024-10-10
12 202321067981-CORRESPONDENCE-OTHERS [10-10-2024(online)].pdf 2024-10-10
13 202321067981-COMPLETE SPECIFICATION [10-10-2024(online)].pdf 2024-10-10
14 Abstract.jpg 2025-01-04
15 202321067981-FORM 18 [23-01-2025(online)].pdf 2025-01-23
16 202321067981-RELEVANT DOCUMENTS [18-09-2025(online)].pdf 2025-09-18
17 202321067981-POA [18-09-2025(online)].pdf 2025-09-18
18 202321067981-MARKED COPIES OF AMENDEMENTS [18-09-2025(online)].pdf 2025-09-18
19 202321067981-FORM 13 [18-09-2025(online)].pdf 2025-09-18
20 202321067981-AMENDED DOCUMENTS [18-09-2025(online)].pdf 2025-09-18