Abstract: Methods and systems for inserting second media content in streaming content. A method performed by the system includes accessing viewer behavior data, a first media content, and a second media content. Herein, the first media content and the second media content include a plurality of first encoded segments and a plurality of second encoded segments, respectively. Then, identifying probable scene boundaries based on analyzing the plurality of first encoded segments. Then, determining scene transition markers corresponding to the first media content based on the probable scene boundaries and pre-defined rules. Then, determining a dynamic tolerance threshold based on analyzing the viewer behavior data and the second media content. Then, selecting second position markers from the scene transition markers based on the dynamic tolerance threshold and generating a new media content record based on inserting one or more second encoded segments from the plurality of second encoded segments in between the first encoded segments based on the second position markers. Then, generating a modified manifest based on the new media content record.
[0001] The present technology generally relates to the delivery of digital content such as streaming content to content viewers, and more particularly, to a method and system for inserting second media content (such as advertisements) in a first media content (such as streaming content).
BACKGROUND
[0002] On-demand video streaming as well as live streaming of content has gained popularity in recent times and subscribers are increasingly using a variety of electronic devices to access streaming content. The streaming content is accessed on electronic devices using Over-The-Top (OTT) media services (i.e., over the Internet).
[0003] Typically, streaming content providers may also provide a second media content along with the content stream of another media content for a variety of reasons. For instance, some information related to the content stream may be shown as the second media content. In some scenarios, the streaming content providers may generate revenue from subscriptions availed by subscribers for accessing the streaming content. In addition to the revenue from the subscriptions, the content providers generate revenue from the second media content as well. For example, the second media content may include advertisements inserted in the streaming content provided to the subscribers. As may be understood, the advertisements, also referred to as ‘Ads’, serve as a medium to market enterprise offerings, such as products or services, to viewers (such as subscribers) of the content being streamed by the OTT streaming content provider.
[0004] In general, Ads are inserted in the streaming content at random intervals without any regard to scenes included in the content. For example, an Ad may be inserted in between an intense dialogue exchange between two characters in the streaming content. Such insertion of Ads at random intervals in the content degrades the viewing experience of the customer. Although some streaming content providers identify scene boundaries and aim to insert Ads at scene boundaries, most of the identified scene boundaries may not correlate with an actual scene boundary. In addition, all the scene boundaries may not be relevant slots for inserting Ads. For example, a scene boundary between a scene that serves as a prelude to a plot twist and the next scene where the plot twist actually takes place may not be appropriate for Ad insertion. In some cases, a large number of Ads may be interspersed between the streaming content, which may be frustrating for a viewer and eventually, the viewer may stop watching that content, especially if the Ads are irrelevant and do not effectively engage the viewer. Moreover, the Ad insertion slots are currently identified manually, which is cumbersome and time-consuming especially if the content library is large and frequently updated.
[0005] Accordingly, there is a need to overcome the aforementioned drawbacks and provide an improved mechanism for inserting second media content such as Ads in streaming content. More specifically, there is a need for identifying appropriate slots in the streaming content for the insertion of Ads, which improves viewer experience significantly and may not be intrusive to the viewer. Further, there is a need for customization of the number of Ad slots for viewers to reduce the number of interruptions for certain viewers. Further, it would be advantageous to automate at least a part of the process for identifying Ad insertion slots in streaming content.
SUMMARY
[0006] Various embodiments of the present disclosure provide methods and systems for inserting a second media content in streaming content, i.e., the first media content.
[0007] In an embodiment of the present disclosure, a computer-implemented method is disclosed. The computer-implemented method performed by the system includes accessing viewer behavior data corresponding to a content viewer from a database associated with the system. Further, the method includes accessing a first media content and a second media content from a content repository server based, at least in part, on a manifest file. The first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments. Further, the method includes identifying a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments. Further, the method includes determining one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules. Further, the method includes determining a dynamic tolerance threshold for the content viewer based, at least in part, on analyzing the viewer behavior data and the second media content. Further, the method includes selecting one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold. Further, the method includes generating a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. The modified manifest is generated for the content repository server based, at least in part, on the new media content record.
[0008] In another embodiment of the present disclosure, another computer-implemented method is disclosed. The computer-implemented method performed by the system includes accessing viewer behavior data including data related to a plurality of content viewers from a database associated with the system. Further, the method includes accessing a first media content and a second media content from a content repository server based, at least in part, on a manifest file. The first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments. Further, the method includes classifying, via a second machine learning model, the plurality of content viewers into a plurality of viewer cohorts based, at least in part, on the viewer behavior data. Further, the method includes identifying a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments. Further, the method includes determining one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules. Further, the method includes determining a dynamic cohort tolerance threshold of each viewer cohort of the plurality of viewer cohorts based, at least in part, on analyzing the viewer behavior data and the second media content. Further, the method includes selecting one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic cohort tolerance threshold of each viewer cohort. Further, the method includes generating a new media content record corresponding to each viewer cohort of the plurality of viewer cohorts based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. The modified manifest is generated for the content repository server for each viewer cohort of the plurality of viewer cohorts based, at least in part, on the new media content record.
[0009] In yet another embodiment of the present disclosure, a system is disclosed. The system includes memory and a processor. The memory stores instructions which are executed by the processor and cause the system to access information related to a viewer behavior data corresponding to a content viewer from a database associated with the system. Further, the system is caused to access a first media content and a second media content from a content repository server based, at least in part, on a manifest file. The first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments. Further, the system is caused to identify a plurality of probable scene boundaries based, at least in part, on the analysis of the plurality of first encoded segments. Further, the system is caused to determine one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules. Further, the system is caused to determine a dynamic tolerance threshold for the content viewer based, at least in part, on analysis of the viewer behavior data and the second media content. Further, the system is caused to select one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold. Further, the system is caused to generate a new media content record for the content viewer based, at least in part, on insertion of the one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. Further, the system is caused to generate a modified manifest for the content repository server based, at least in part, on the new media content record.
[0010] In yet another embodiment of the present disclosure, a non-transitory computer-readable storage medium is disclosed. The non-transitory computer-readable storage medium includes computer-executable instructions that, when executed by at least a processor of a system, cause the system to perform a method. The method includes accessing information related to viewer behavior data corresponding to a content viewer from a database associated with the system. Further, the method includes accessing a first media content and a second media content from a content repository server based, at least in part, on a manifest file. The first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments. Further, the method includes identifying a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments. Further, the method includes determining one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules. Further, the method includes determining a dynamic tolerance threshold for the content viewer based, at least in part, on analyzing the viewer behavior data and the second media content. Further, the method includes selecting one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold. Further, the method includes generating a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. The modified manifest is generated for the content repository server based, at least in part, on the new media content record.
BRIEF DESCRIPTION OF THE FIGURES
[0011] The advantages and features of the invention will become better understood with reference to the detailed description taken in conjunction with the accompanying drawings, wherein like elements are identified with like symbols, and in which:
[0012] FIG. 1 shows a representation for illustrating the provisioning of video content offered by a streaming content provider to a viewer, related to at least some embodiments of the present disclosure;
[0013] FIG. 2 shows a representation for depicting a scene boundary in-between scenes of the video content of FIG. 1, in accordance with an example scenario;
[0014] FIG. 3 is a block diagram of a system configured to facilitate the insertion of second media content in streaming content, in accordance with an embodiment of the present disclosure;
[0015] FIG. 4A shows an example illustration of a plurality of scene transition markers in relation to a detection of a probable scene boundary in the video content, in accordance with an embodiment of the present disclosure;
[0016] FIG. 4B shows an example illustration of second content position markers indicating an actual scene boundary in the video content, in accordance with an embodiment of the present disclosure;
[0017] FIG. 5 shows a representation for illustrating an example insertion of second media content segments in-between segments of streaming content, in accordance with an embodiment of the present disclosure;
[0018] FIG. 6A shows a schematic representation of a UI corresponding to the video content displayed to a viewer for illustrating the provisioning of targeted Ads in the video content, in accordance with an embodiment of the invention;
[0019] FIG. 6B shows a schematic representation of a UI corresponding to the video content displayed to another viewer for illustrating the provisioning of second media content in the video content, in accordance with an embodiment of the present disclosure;
[0020] FIG. 7 shows a flow diagram of a method for facilitating the insertion of advertisements in streaming content, in accordance with an embodiment of the invention;
[0021] FIG. 8 shows a flow diagram of a method for facilitating the insertion of second media content in streaming content, i.e., first media content, in accordance with another embodiment of the present disclosure;
[0022] FIG. 9 shows a flow diagram of a method for facilitating the insertion of second media content in streaming content, i.e., the first media content, in accordance with yet another embodiment of the present disclosure; and
[0023] FIG. 10 is a simplified block diagram of a content repository server, i.e., a Content Delivery Network (CDN), in accordance with various embodiments of the invention.
[0024] The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.
DETAILED DESCRIPTION
[0025] The best and other modes for carrying out the present invention are presented in terms of the embodiments, herein depicted in FIGS. 1 to 10. The embodiments are described herein for illustrative purposes and are subject to many variations. It is understood that various omissions and substitutions of equivalents are contemplated as circumstances may suggest or render expedient but are intended to cover the application or implementation without departing from the spirit or scope of the invention. Further, it is to be understood that the phraseology and terminology employed herein are for the purpose of the description and should not be regarded as limiting. Any heading utilized within this description is for convenience only and has no legal or limiting effect.
[0026] The terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.
OVERVIEW
[0027] Various embodiments of the present disclosure provide methods and systems for facilitating the insertion of second media content in streaming content, i.e., a first media content.
[0028] Conventional approaches for the insertion of second media content such as advertisements in first media content such as streaming content have various drawbacks and limitations. A few such drawbacks include random insertion of the Ads in the streaming content without any regard to scenes included in the content, interspersing a large number of Ads within the streaming content, as a result of which the viewer may eventually stop watching that content, and manual insertion of the Ads, which is cumbersome and time-consuming especially if the content library is large and is frequently updated. To overcome such problems or limitations, the present disclosure describes a system that is configured to perform the below operations.
[0029] In an embodiment, the system is configured to access information related to viewer behavior data corresponding to a content viewer from a database associated with the system. In an example, the information related to the viewer behavior data may include at least one of: information indicating a content preference of the content viewer, a language preference of the content viewer, a cast preference of the content viewer, requested media content, gender, age group, nationality, location, e-mail identifier, cart information, URLs, payment history, call logs, chat logs, device identifier, IP address, user profiles, messaging platform information, social media interactions, browser information, time of the day, device operating system (OS), and network provider.
[0030] In an embodiment, the system is configured to access a first media content and a second media content from a content repository server based, at least in part, on a manifest file. In an example, the content repository server is a Content Delivery Network (CDN). Herein, the first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments. In an example, the manifest file is accessed from the content repository server in response to a playback URL request received from the content viewer. In an implementation, the manifest file may include one or more URLs associated with the first media content and one or more URLs associated with the second media content. It is noted that the second media content may include one or more Ad contents. In another embodiment, a sequence of encoded segments may be called a ‘media content record’. To that end, the sequence of the first encoded segments is called a first media content record. Similarly, the sequence of the second encoded segments is called a second media content record.
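Purely as a non-limiting illustration of the media content record concept (the data structures, field names, and URLs below are hypothetical assumptions for explanation and not part of any disclosed implementation), a media content record may be modeled as an ordered sequence of encoded-segment URLs that a manifest file references:

```python
# Illustrative sketch only: a simplified, hypothetical manifest structure showing
# first/second media content records as ordered sequences of encoded-segment URLs.
from dataclasses import dataclass, field

@dataclass
class MediaContentRecord:
    """Ordered sequence of encoded segment URLs for one piece of media content."""
    content_id: str
    segment_urls: list = field(default_factory=list)

# First media content record: sequence of first encoded segments (e.g., a movie).
first_record = MediaContentRecord(
    content_id="movie-110",
    segment_urls=[f"https://cdn.example.com/movie-110/seg_{i}.ts" for i in range(1, 7)],
)

# Second media content record: sequence of second encoded segments (e.g., Ads).
second_record = MediaContentRecord(
    content_id="ad-break-01",
    segment_urls=[f"https://cdn.example.com/ads/ad_{i}.ts" for i in range(1, 3)],
)

# A manifest file may reference both records so the player knows the playback order.
manifest = {
    "first_media_content": first_record,
    "second_media_content": second_record,
}
print(manifest["first_media_content"].segment_urls)
```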
[0031] In another embodiment, the system is configured to identify a plurality of probable scene boundaries based, at least in part, on analysis of the plurality of first encoded segments. In an embodiment, identifying the plurality of probable scene boundaries includes at first, determining at least one of a cast, a background, a dialogue, and a sphere of activity within the first encoded segments based, at least in part, on analyzing the first encoded segments. Then, determining via a first machine learning model, at least one of a change in the cast, the background, the dialogue, and the sphere of activity based, at least in part, on the determining step. Then, determining a point of transition from one scene to another scene within the first media content based on at least one of the changes in the cast, the background, the dialogue, and the sphere of activity. It is noted that the use of a machine learning model to identify the plurality of probable scene boundaries eliminates the need for manual identification of the scene boundaries.
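The following simplified sketch illustrates the general idea of flagging a probable scene boundary where attributes of consecutive first encoded segments change; it is not the first machine learning model itself, and the attribute extraction is stubbed out with hypothetical dictionary fields:

```python
# Minimal sketch (not the claimed first machine learning model): detects probable
# scene boundaries by comparing hypothetical per-segment attributes of consecutive
# first encoded segments. The attribute extraction is stubbed out for illustration.
def extract_attributes(segment):
    # Hypothetical stand-in for analysis of a first encoded segment; in practice a
    # model would infer cast, background, dialogue, and sphere of activity.
    return {
        "cast": segment.get("cast"),
        "background": segment.get("background"),
        "dialogue_topic": segment.get("dialogue_topic"),
        "activity": segment.get("activity"),
    }

def probable_scene_boundaries(first_encoded_segments):
    """Return indices where a transition from one scene to another likely occurs."""
    boundaries = []
    prev = None
    for index, segment in enumerate(first_encoded_segments):
        attrs = extract_attributes(segment)
        if prev is not None and any(attrs[k] != prev[k] for k in attrs):
            boundaries.append(index)  # change in cast/background/dialogue/activity
        prev = attrs
    return boundaries

segments = [
    {"cast": "A,B", "background": "park", "dialogue_topic": "plan", "activity": "talk"},
    {"cast": "A,B", "background": "park", "dialogue_topic": "plan", "activity": "talk"},
    {"cast": "C", "background": "office", "dialogue_topic": "deal", "activity": "meeting"},
]
print(probable_scene_boundaries(segments))  # -> [2]
```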
[0032] In another embodiment, the system is configured to determine one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules.
[0033] In another embodiment, the system is configured to determine a dynamic tolerance threshold for the content viewer based, at least in part, on analyzing the viewer behavior data and the second media content.
[0034] In another embodiment, the system is configured to select one or more second position markers from the one or more scene transition markers. In an example, selecting one or more second position markers from the one or more scene transition markers includes, at first, selecting an intermediate scene transition marker from the one or more scene transition markers corresponding to each of the plurality of probable scene boundaries that coincides with the actual scene boundary based, at least in part, on a plurality of pre-defined rules. In particular, for selecting the intermediate scene transition marker, the system at first determines, via the first machine learning model, a confidence score for the one or more scene transition markers based, at least in part, on determining if the one or more scene transition markers coincide with an actual scene boundary. Then, the system assigns a color code to the one or more scene transition markers based, at least in part, on the confidence score. Further, the system indexes the one or more scene transition markers based, at least in part, on the assigned color code. Furthermore, the system selects the intermediate scene transition marker from the one or more scene transition markers based, at least in part, on the one or more pre-defined rules and the assigned color code corresponding to the highest confidence score. This aspect allows the machine learning model to automatically identify scene transitions or scene boundaries in the video content.
[0035] Then, the intermediate scene transition marker corresponding to each of the plurality of probable scene boundaries is set as the second position marker. In a particular embodiment, for indexing the one or more scene transition markers, the system is configured to assign an indexing tag to each of the color-coded one or more scene transition markers. Then, the system is configured to facilitate the content viewer to navigate from one scene to another scene based, at least in part, on the assigned indexing tags.
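By way of a hedged, non-limiting sketch of the selection logic described above (the confidence scores, color bands, and index tags below are illustrative assumptions rather than the disclosed model's outputs), the intermediate scene transition marker may be chosen as the color-coded candidate with the highest confidence score:

```python
# Hedged sketch of the marker-selection idea: each candidate scene transition marker
# gets a confidence score (here supplied directly rather than by a model), a color
# code derived from that score, and an indexing tag; the marker with the highest
# confidence is chosen as the second (Ad) position marker for that boundary.
def color_code(confidence):
    if confidence >= 0.8:
        return "green"
    if confidence >= 0.5:
        return "amber"
    return "red"

def select_second_position_marker(candidate_markers):
    """candidate_markers: list of dicts with 'timestamp' and 'confidence' keys."""
    indexed = []
    for i, marker in enumerate(candidate_markers):
        indexed.append({
            "index_tag": f"marker-{i}",          # supports scene-to-scene navigation
            "timestamp": marker["timestamp"],
            "confidence": marker["confidence"],
            "color": color_code(marker["confidence"]),
        })
    # Intermediate marker: the candidate most likely to coincide with the actual boundary.
    return max(indexed, key=lambda m: m["confidence"])

candidates = [
    {"timestamp": 119.0, "confidence": 0.42},   # t - 1 s
    {"timestamp": 120.0, "confidence": 0.61},   # t
    {"timestamp": 121.0, "confidence": 0.93},   # t + 1 s
]
print(select_second_position_marker(candidates)["timestamp"])  # -> 121.0
```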
[0036] In another embodiment, the system is configured to generate a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. Then, a modified manifest is generated for the content repository server based, at least in part, on the new media content record. In an example, the content repository server is a content delivery network. It is noted that the modified manifest is stored in the content repository server for the playback of the video content to the content viewer. In a non-limiting implementation, if the content viewer requests the same video content, the system does not process the request to generate a modified manifest but rather causes the fetching of the modified manifest cached in the content repository server. It is noted that the modified manifest is stored in the content repository server for any playback requests from a content viewer from the same viewer cohort corresponding to which the modified manifest was generated.
[0037] In yet another embodiment, the system is configured to access viewer behavior data including data related to a plurality of content viewers from a database associated with the system. Then, accessing a first media content and a second media content from a content repository server based, at least in part, on a manifest file, where the first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments. Then, classifying, via a second machine learning model, the plurality of content viewers into a plurality of viewer cohorts based, at least in part, on the viewer behavior data. Then, identifying a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments. Then, determining one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules. Then, determining a dynamic cohort tolerance threshold of each viewer cohort of the plurality of viewer cohorts based, at least in part, on analyzing the viewer behavior data and the second media content. Then, selecting one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic cohort tolerance threshold of each viewer cohort. Then, generating a new media content record corresponding to each viewer cohort of the plurality of viewer cohorts based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. Then, generating a modified manifest for the content repository server for each viewer cohort of the plurality of viewer cohorts based, at least in part, on the corresponding new media content record. It is noted that the dynamic cohort tolerance threshold may vary dynamically based on the changing viewer behavior data of the viewer cohort over time. As such, a cohort with content viewers aged 16 to 18 years will have specific Ad interests and a corresponding Ad tolerance level. However, after a few years, the Ad interests and preferences of the cohort with the same content viewers may change, which may also change the Ad tolerance level accordingly.
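At a high level, the per-cohort flow recited above may be summarized by the following non-limiting sketch, in which every helper passed into the function is a hypothetical placeholder standing in for the corresponding step of the method rather than a prescribed implementation:

```python
# High-level sketch of the per-cohort flow described above; every helper here is a
# hypothetical placeholder standing in for the corresponding step of the method.
def build_cohort_manifests(viewer_behavior_data, first_segments, second_segments,
                           classify_cohorts, cohort_tolerance, select_markers,
                           stitch_record, generate_manifest):
    """Return one modified manifest per viewer cohort."""
    manifests = {}
    cohorts = classify_cohorts(viewer_behavior_data)              # second ML model (placeholder)
    for cohort_id, cohort_viewers in cohorts.items():
        threshold = cohort_tolerance(cohort_viewers, second_segments)   # dynamic cohort tolerance
        markers = select_markers(first_segments, threshold)             # second position markers
        record = stitch_record(first_segments, second_segments, markers)
        manifests[cohort_id] = generate_manifest(record)                 # one manifest per cohort
    return manifests
```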
[0038] FIG. 1 shows a representation 100 for illustrating the provisioning of video content offered by a streaming content provider to a content viewer, in accordance with an embodiment of the invention. The term ‘streaming content provider’ as used herein refers to an entity that holds digital rights associated with digital content, i.e., media content, present within digital video content libraries, and offers the content on a subscription basis by using a digital platform and over-the-top (OTT) media services, i.e., the content is streamed over the Internet to the electronic devices of the subscribers, i.e., content viewers. A streaming content provider is hereinafter referred to as a ‘content provider’ for ease of description. The content offered by the content provider may be embodied as streaming video content such as live streaming content or on-demand video streaming content. It is noted that though the content offered by the content provider is explained with reference to video content, the term ‘content’ as used hereinafter may not be limited to only video content. Indeed, the term ‘content’ may refer to any media content including but not limited to ‘video content’, ‘audio content’, ‘gaming content’, ‘textual content’, and any combination of such content offered in an interactive or non-interactive form. Accordingly, the term ‘content’ is also interchangeably referred to hereinafter as ‘media content’ for the purposes of description. Individuals wishing to view/access the media content may subscribe to at least one type of subscription offered by the content provider. Individuals accessing/viewing the content offered by the content provider are referred to herein as ‘user’, ‘subscriber’, ‘content viewer’, or simply as a ‘viewer’.
[0039] The representation 100 depicts a first media content 110 offered by the content provider for illustration purposes. The first media content 110 may correspond to a movie content stored in a content library 114 associated with the content provider. The first media content 110 can be streamed on-demand to one or more subscribers, i.e., the first media content 110 may be embodied as a Video-On-Demand (VOD) content. The insertion of second media content in streaming content, such as VOD content, is explained hereinafter with reference to video content. In a non-limiting example, the second media content may include one or more Advertisements or ‘Ads’. It is noted that the embodiments explained hereinafter are not limited to video content and that the second media content may be inserted in any type of media content capable of being streamed to the subscribers. The ‘first media content 110’ is also hereinafter interchangeably referred to as ‘video content 110’ or ‘content 110’ for the sake of description. It is noted that the second media content has been described with reference to Advertisements throughout the present disclosure for the sake of explanation only and the same should not be construed as a limitation and other suitable forms of second media content are also covered within the scope of the present disclosure.
[0040] Video content, such as the video content 110, may include a plurality of scenes and a plurality of scene transitions. Accurate identification of the scene transitions or the scene boundaries in the video content may assist in identifying potential slots for inserting the second media content. An example scene boundary in-between two scenes of a video content 110 is shown in FIG. 2.
[0041] FIG. 2 shows a representation 200 for depicting a scene boundary in-between scenes of the video content 110 of FIG. 1, in accordance with an example scenario. The video content 110 includes a plurality of image frames, such as image frames 202a, 202b, … 202n. It is noted that an image frame corresponds to a still digital image capture of an object and/or its surroundings, and is the basic and smallest unit of the video content 110, i.e., a collection of such image frames 202a, 202b, … 202n constitutes the video content 110. A collection of image frames captured by one or more cameras (not shown in FIG. 2) that operate for an uninterrupted period of time is referred to herein as a shot (or camera shot). A shot may span a few seconds or several minutes, and accordingly, the collection of image frames captured in a single take by a camera constitutes a shot. For example, the image frames 202a, 202b, 202c constitute a shot 204a, the image frames 202d, 202e, and 202f constitute a shot 204b, and image frames 202g, 202h, 202i constitute a shot 204c. It shall be noted that a shot may include multiple angles of capturing the same object but represents a single take from one single camera. Further, it shall be noted that only 3 image frames have been depicted in shots 204a, 204b, and 204c for example purposes, and in fact, more or fewer image frames may be combined to constitute a shot based on the frame analysis. Moreover, a number of image frames that constitute each shot (e.g., shots 204a, 204b) may be the same or different across the video content 110 and, as already explained, depends on an uninterrupted operational time of the camera capturing the still images. For example, a shot may have 5 image frames and another shot may have 20 image frames.
[0042] A collection of shots constitutes a scene that delivers visual details of a part of the video content 110. In general, a scene is a section of the video content 110 that includes a unique combination of background, cast, dialogue, and sphere of activity, and multiple shots (or camera shots) are captured of this section in the video content 110. In other words, a series of shots taken from different angles of an action on a single location and continuous time is referred to as a scene. For example, the shots 204a, 204b, and 204c constitute a scene 206a. In one illustrative example, a scene depicts two kids playing out in a park, and such a scene in the park may be captured from different angles, i.e., by employing multiple cameras and combining shots from multiple cameras to form the scene. It shall be noted that multiple cameras may be employed to capture different camera angles of the same object and all such shots may be manipulated to constitute a single scene.
[0043] A point of transition from one scene to another is referred to herein as a scene boundary. More specifically, a drastic change in a visual segment, for example, change in cast, background, dialogue, or sphere of activity indicates the scene boundary. For example, an instant in which the scene 206a transitions to another scene 206b is represented as a scene boundary 208 (shown in FIG. 2). In one illustrative example, a scene in the video content 110 may depict a villain plotting to murder a person and in the next consecutive scene, the villain may be shown pursuing the person. The change in a scene from the villain plotting to murder the person to executing his plot indicates a scene boundary i.e., a transition in the scene.
[0044] Conventionally, a team of personnel (shown as a content handling team 116) manually goes through each content in the content library 114 and identifies such scene boundaries. More specifically, the content handling team 116 analyzes content, such as the video content 110, to identify scene boundary slots where the second media content such as Ad content may be inserted to engage the content viewers. In many cases, the manual identification of scene boundaries is also skipped and the Ads are inserted in the streaming content at random intervals without any regard to scenes included in the content. For example, an Ad may be inserted in-between an intense dialogue exchange between two characters in the streaming content. Such insertion of Ads at random intervals in the content degrades the viewing experience of the customer/content viewer. Even in the case of manual identification of Ad insertion slots, most of the identified scene boundaries may not correlate with an actual scene boundary, i.e., the correct image frame at which the scene ends may not be accurately identified by the content handling team 116. In addition, all the scene boundaries may not be relevant slots for inserting Ads. For example, a scene boundary in-between a scene that serves as a prelude to a plot twist and the next scene where the plot twist actually takes place may not be appropriate for Ad insertion. In some cases, a large number of Ads may be interspersed between the streaming content, which may be frustrating for a content viewer and eventually, the content viewer may stop watching that content, especially if the Ads are irrelevant and do not effectively engage the content viewer. Moreover, the Ad insertion slots are currently identified manually, which is cumbersome and time-consuming especially if the content library is large and frequently updated.
[0045] Referring now to FIG. 1, the representation 100 depicts a system 150, which is configured to overcome the aforementioned drawbacks of the conventional mechanisms and provide additional advantages. The system 150 is configured to receive the video content 110 and automatically identify scene transitions or scene boundaries in the video content 110 which may serve as potential slots for inserting the second media content. To that effect, the system 150 is configured to assign a plurality of scene transition markers in relation to identifying a probable scene boundary in the video content 110. It is understood that streaming content is made up of a plurality of encoded content segments. Each encoded content segment is encoded at a particular bit rate and resolution for a specific time. For instance, an encoded content segment may have a duration of 4 seconds. Therefore, it is understood that the first media content and the second media content include a plurality of first encoded content segments and a plurality of second encoded content segments, respectively. In an embodiment, the system 150 is configured to identify a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments. This aspect has been described later in the present disclosure.
[0046] In one embodiment, the content handling team 116 reviews only the plurality of scene transition markers in relation to the probable scene boundary to identify an actual scene boundary. In one illustrative example, if three scene transition markers t1, t2, and t3 are assigned at t-1, t, and t+1 seconds, the content handling team 116 manually analyzes the scene transition markers (i.e., scene transition markers t1, t2, and t3) to identify an actual scene boundary. For example, the content handling team 116 may identify the scene transition marker at t+1 second as the actual scene boundary. In another embodiment, the system 150 includes a learning algorithm trained to analyze the plurality of scene transition markers assigned in relation to a scene boundary and identify one scene transition marker as the actual scene boundary which is explained in detail with reference to FIG. 3. Similarly, a scene transition marker is selected from among a plurality of scene transition markers in relation to each scene boundary in the video content 110. Accordingly, if the content 110 has a plurality of scene boundaries, then a scene transition marker may be selected in relation to each scene boundary. A plurality of such selected scene transition markers configures a plurality of second position markers. It is noted that the term ‘plurality of second position markers’ is referred to hereinafter as ‘a plurality of Ad position markers’, for ease of explanation. In other words, these actual scene boundaries indicate that the second media content such as one or more Ads may be inserted at each Ad position marker of the plurality of Ad position markers to monetize the content 110 delivered to the content viewers.
[0047] Further, the system 150 is configured to customize the Ad content delivered to the content viewer by managing Ad content and a number of slots in which the Ads are inserted in the content 110. More specifically, the system 150 identifies one or more Ad position markers from the plurality of Ad position markers based on the user online behavior determined using user behavior data for inserting Ads as will be explained in detail later.
[0048] The content 110 and the Ad position markers are provided to a video encoder 120. The video encoder 120 is configured to convert the content 110 (i.e., video content) into a format capable of being streamed to content viewers. More specifically, the content 110 is encoded using one or more video encoding algorithms to configure a plurality of encoded media segments which may be combined to form the streaming content that may be provided to electronic devices of content viewers. Each segment corresponds to a fragment of the encoded content with a fixed length, and all segments of the content 110 have a uniform or identical length, for example, 3 seconds. For example, a movie (i.e., the content) of 90-minute duration may be segmented into 1080 segments of 5 seconds each. When the movie is delivered to a content viewer, a segment is delivered to the electronic device and a subsequent segment is fetched. During streaming, all the segments are delivered sequentially to the electronic device of the content viewer providing a seamless experience of watching the movie.
[0049] In general, the content 110 may be encoded at different resolutions and/or different bit rates by the video encoder 120 to meet the bandwidth requirements of electronic devices of the subscribers. In one illustrative example, if a short film is encoded at 3 different resolutions such as 1080p, 720p, and 480p, the content may include 3240 segments (i.e., 1080 segments of 1080p, 1080 segments of 720p, and 1080 segments of 480p). In another illustrative example, the short film may be encoded at 720p with a bitrate of 3 Mbps and at 720p with a bitrate of 4 Mbps for delivering 30 frames per second (fps). Accordingly, 2160 segments may be generated corresponding to the content 110 (i.e., the short film). During network issues, such encoding of the content 110 at different resolutions/bitrates ensures seamless delivery of the content 110 by switching from a segment of higher resolution (e.g., 1080p) to a corresponding segment of lower resolution (e.g., 720p) or from a higher bitrate (e.g., 4 Mbps) to a lower bitrate (e.g., 3 Mbps). Some examples of encoding and delivery techniques used by the video encoder 120 include, but are not limited to, Adaptive Bitrate (ABR) streaming, Dynamic Adaptive Streaming over HTTP (DASH), HTTP Live Streaming (HLS), HTTP Dynamic Streaming (HDS), and the like.
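The segment counts used in the illustrative examples above follow from simple arithmetic, as the following minimal sketch shows (assuming a 90-minute title, 5-second segments, and three renditions):

```python
# Worked example of the segment counts mentioned above, under the assumption of a
# 90-minute title cut into 5-second segments and encoded into three renditions.
content_duration_s = 90 * 60                                       # 5400 seconds
segment_length_s = 5
segments_per_rendition = content_duration_s // segment_length_s    # 1080
renditions = ["1080p", "720p", "480p"]
total_segments = segments_per_rendition * len(renditions)          # 3240
print(segments_per_rendition, total_segments)
```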
[0050] Further, the video encoder 120 may also be configured to generate an original manifest that includes metadata related to the content 110 (i.e., video content encoded with the Ad position markers). More specifically, the original manifest is a log or record of information related to the transfer of the content 110 to the electronic device of the content viewer. For example, an original manifest includes information related to the content 110 such as, but not limited to, content title, content duration, keywords (e.g., actor, genre, language, sport, etc.), format, encoding techniques (i.e., available bitrates, available resolutions, a number of segments of the content, a size of the plurality of segments), Uniform Resource Locators (URLs) associated with each of the segments and advertisement content-related information (e.g., a number of Ad position markers).
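A hypothetical, simplified layout of such an original manifest is sketched below for illustration only; real-world manifests (e.g., DASH MPDs or HLS playlists) use their own formats and field names:

```python
# Illustrative (hypothetical) original-manifest layout covering the kinds of fields
# listed above; actual manifest formats differ and are not prescribed here.
import json

original_manifest = {
    "content_title": "Example Movie",
    "content_duration_s": 5400,
    "keywords": ["drama", "english"],
    "encodings": {
        "resolutions": ["1080p", "720p", "480p"],
        "segment_count_per_rendition": 1080,
        "segment_length_s": 5,
    },
    "segment_urls": ["https://cdn.example.com/movie-110/seg_1.ts", "..."],
    "ad_position_markers": [120.0, 1810.0, 3625.0],   # seconds into the content
}
print(json.dumps(original_manifest, indent=2))
```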
[0051] The original manifest and the plurality of segments related to the content 110 are cached in a content repository server. An example of a content repository server is Content Delivery Network 122 (hereinafter referred to as ‘CDN 122’). It shall be noted that only one CDN is shown here for exemplary purposes and the original manifest along with the plurality of segments related to the content 110 may be cached in more than one CDN for serving the content 110 to a plurality of content viewers.
[0052] The representation 100 further depicts an example content viewer 102 controlling an electronic device 104 for viewing/accessing the content 110 offered by the content provider. The electronic device 104 is depicted to be a smartphone for illustration purposes. It is noted that the content viewer 102 may use one or more electronic devices, such as a television (TV), a laptop, a smartphone, a desktop, or a personal computer to view the content 110 provided by the content provider. In an illustrative example, the content viewer 102 may create an account and, as part of the account creation process, provide personal information such as, age, gender, language preference, content preference, and any other preference of the content viewer 102. Such information may be stored in a content viewer profile along with other account information such as, a type of subscription, a validity date of the subscription, and the like.
[0053] In one illustrative example, the content viewer 102 may access a user interface (UI) of a mobile application or a Web application associated with a content provider by using the electronic device 104. It is understood that the electronic device 104 may be in operative communication with a communication network, such as the Internet, enabled by a network provider, also known as the Internet Service Provider (ISP). The electronic device 104 may connect to the ISP network using a wired network, a wireless network, or a combination of wired and wireless networks. Some non-limiting examples of the wired networks may include the Ethernet, the Local Area Network (LAN), a fiber-optic network, and the like. Some non-limiting examples of the wireless networks may include the Wireless LAN (WLAN), cellular networks, Bluetooth or ZigBee networks, and the like.
[0054] The electronic device 104 may fetch a Web interface associated with the content provider over the ISP network and cause display of the Web interface on a display screen of the electronic device 104. In an illustrative example, the Web interface may include a plurality of content titles corresponding to a variety of content offered by the content provider to its subscribers/content viewers. In an illustrative example, the content viewer 102 may select a content title from among the plurality of content titles displayed on the display screen of the electronic device 104. In one example scenario, the content viewer 102 may select a content title related to the content 110 (for example a movie) streamed from the content library 114. The selection of the content title may trigger a request for a playback uniform resource locator (URL). The request for the playback URL also includes content viewer metadata. For example, the content viewer metadata includes information related to a type of electronic device (for example, mobile phone, TV, or tablet device) used by the content viewer 102 for requesting the login, the type of login method (for example, Email or Web login), the type of network access (for example, cellular or Wi-Fi), network provider, device identifier, IP address, geo-location information, browser information (e.g., cookie data, MAID), time of the day, and the like. In addition, the content viewer metadata includes the content viewer profile of the content viewer 102.
[0055] The request for the playback URL is sent from the electronic device 104 to a content provider platform 124 associated with the content provider. The content provider platform 124 is configured to facilitate the streaming of digital content to a plurality of content viewers, such as the content viewer 102. On receiving the request for the playback URL, the content provider platform 124 initially performs a check to determine if Ad content can be inserted in the content 110 requested by the subscriber 102. More specifically, the content provider platform 124 may perform a check on metadata related to the content 110 (i.e., the requested media content) to identify if the content 110 has Ad position markers to insert/integrate Ad content.
[0056] The content provider platform 124 forwards the request for the playback URL to the system 150 on identifying at least one Ad position marker in which Ad content may be integrated/inserted in the content 110. The system 150 is configured to efficiently customize Ad content delivered within the content 110 for the content viewers such as, the content viewer 102.
[0057] In an embodiment, the system 150 is configured to access viewer behavior data corresponding to a content viewer from a database (i.e., storage module 316 of FIG. 3) associated with the system 150. More specifically, in at least one example embodiment, the Ad content to be inserted in the content 110 can be customized for each viewer of the content 110 based on online viewer behavior using the viewer behavior data. The term ‘online viewer behavior’ as used herein primarily refers to characteristics or attributes of an individual content viewer and may include information related to historical data such as past content views, social media interactions, advertisement content interaction, and the like. Additionally, the online viewer behavior data may also include personal information of the subscriber (e.g., name, age, gender, nationality, e-mail identifier, and the like), cart information, URLs, transaction information such as, payment history, call logs, chat logs, and the like. Such information may be received from web servers hosting and managing third-party websites, remote data gathering servers tracking content viewer activity on a plurality of enterprise websites, a plurality of interaction channels (for example, websites, native mobile applications, social media, etc.), and a plurality of devices. In addition, the online viewer behavior data may also include the content viewer metadata received along with the request for playback URL such as, device identifier, IP address, geo-location information, browser information, time of the day, chat logs, device identifiers, content viewer profile, messaging platforms, social media interactions, user device information such as, device type, device operating system (OS), device browser, browser cookies, and the like along with content viewer profile such as, age group, gender, language preference, content preference and any other preference of the content viewer provided as a part of registration.
[0058] In general, the system 150 is configured to determine a number of Ads and a type of Ad content for customizing the Ad content delivered in between the content 110. In an embodiment, the system 150 is configured to determine the number of Ads to be inserted between the content 110 based on a dynamic tolerance threshold of the content viewer 102. The term ‘dynamic tolerance threshold’ as used herein refers to the second content viewing capacity of a content viewer. More specifically, an upper limit corresponding to a maximum number of second media content (for example, Ads) the content viewer 102 can view between the content 110 before dropping off from watching the content 110 is referred to as the dynamic tolerance threshold of the content viewer 102. In at least one example embodiment, the dynamic tolerance threshold may be determined based on the online viewer behavior, i.e., viewer behavior data of the content viewer 102. For example, the online viewer behavior may indicate a number of Ads a content viewer usually tolerates while viewing a content 110. In one illustrative example, the dynamic tolerance threshold of the content viewer 102 may be ‘low’ based on the online viewer behavior, i.e., historical data based on content viewer activity may indicate that the content viewer prefers to drop off from viewing a content if there are more than 3 Ads in a content spanning 30 minutes. As such, the dynamic tolerance threshold may indicate an upper limit (i.e., a maximum) of 3 Ads/30-minute content and accordingly, a maximum of 6 Ads may be inserted in a content spanning one hour. In another embodiment, the dynamic tolerance threshold may depend on the content characteristics. The content characteristics may include, but are not limited to, the length of the content, the language used in the content, the cast in the content, a genre of the content, and the like, which may be utilized to determine the dynamic tolerance threshold of the content viewer 102. In one illustrative example, if the genre of the content corresponds to a horror movie, the dynamic tolerance threshold of the content viewer 102 may be low whereas the dynamic tolerance threshold of the content viewer 102 may be high when the genre of the content corresponds to a comedy movie. In another illustrative example, if the length of the content is too long and if there are too many Ads embedded in between the content, the dynamic tolerance threshold of the content viewer 102 may be lower. As such, the system 150 may factor in the dynamic tolerance threshold based on at least one of the online viewer behavior and the content characteristics to determine the number of Ads that may be inserted for the content viewer 102. It is noted that the tolerance threshold is dynamic in nature since the viewer behavior data, i.e., the online viewer behavior is dynamic, i.e., it changes constantly over time. For instance, a content viewer may have specific Ad interests, and a corresponding dynamic tolerance threshold, for a few days or months. After a few days or months, the Ad interests and preferences of the content viewer may change, which may also change the dynamic tolerance threshold accordingly.
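The scaling in the example above (3 Ads per 30 minutes yielding a maximum of 6 Ads for one hour of content) can be expressed as a minimal sketch, assuming the dynamic tolerance threshold is represented as a hypothetical "maximum Ads per 30 minutes" value:

```python
# Simple sketch of scaling a viewer's dynamic tolerance threshold (expressed here as a
# hypothetical "max Ads per 30 minutes") to an Ad cap for a given content duration,
# mirroring the 3-Ads-per-30-minutes -> 6-Ads-per-hour example above.
def max_ads_for_content(max_ads_per_30_min, content_duration_min):
    return int(max_ads_per_30_min * (content_duration_min / 30))

print(max_ads_for_content(3, 60))   # -> 6
print(max_ads_for_content(3, 90))   # -> 9
```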
[0059] In another embodiment, the system 150 is configured to classify each content viewer into a cohort based on the online content viewer behavior. In an embodiment, a number of Ads and the type of Ad content are customized for different cohorts of content viewers. The term ‘cohort’ as used herein refers to a group of content viewers accessing the same or similar streaming content on respective devices at the same time period and sharing the same or similar online content viewer behavior, for example, requested media content, gender, age group, dynamic tolerance threshold, network provider, etc. For example, each cohort may prefer or appreciate certain advertisement content and have a certain Ad tolerance level that is determined based on the online viewer behavior. In one illustrative example, content viewers of a movie from a location (e.g., region X of city A) in the age group of 10-18 are likely to watch a trailer for a similar movie and have a ‘Low’ dynamic tolerance threshold (for example, 5 Ads in a movie spanning 1 hour 30 minutes), and accordingly, this cohort of content viewers may be presented with 4 Ads with Advertisement content that may appeal to this cohort (e.g., an Ad related to the trailer of a similar movie or a smartwatch Ad for the fitness conscious teens). In another example illustration, the online viewer behavior may indicate that a middle-aged person (i.e., the content viewer 102) has been viewing websites of different enterprises (for example, health and wellness websites) and has a “Medium” dynamic tolerance threshold. As such, the middle-aged person may be classified into a cohort having the same Ad interests and dynamic tolerance threshold. Accordingly, the middle-aged person may be presented with 8 Ads relating to health/wellness interspersed within a movie spanning 1 hour 30 minutes. The type of Ad content presented for the content viewer 102 may be determined by an advertisement server 128 as will be explained in detail later. In another illustrative example, the dynamic tolerance threshold determined for a cohort may also be referred to as the ‘dynamic cohort tolerance threshold’ and varies dynamically based on the changing viewer behavior data of the viewer cohort over time.
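As a non-limiting sketch of the cohort idea (a simple rule-based grouping stands in here for the second machine learning model, and the attribute names are illustrative assumptions), content viewers sharing similar behavior attributes may be grouped as follows:

```python
# Sketch of grouping viewers into cohorts from behavior data; a simple rule-based
# grouping stands in for the second machine learning model mentioned above.
from collections import defaultdict

def assign_cohort(viewer):
    age_band = "10-18" if viewer["age"] <= 18 else ("19-40" if viewer["age"] <= 40 else "40+")
    return (age_band, viewer["region"], viewer["ad_tolerance"])

def classify_viewers(viewers):
    cohorts = defaultdict(list)
    for viewer in viewers:
        cohorts[assign_cohort(viewer)].append(viewer["viewer_id"])
    return dict(cohorts)

viewers = [
    {"viewer_id": "v1", "age": 16, "region": "X", "ad_tolerance": "low"},
    {"viewer_id": "v2", "age": 17, "region": "X", "ad_tolerance": "low"},
    {"viewer_id": "v3", "age": 45, "region": "A", "ad_tolerance": "medium"},
]
print(classify_viewers(viewers))
```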
[0060] The system 150 requests the original manifest related to the content 110 from the CDN 122. The CDN 122 then provides the original manifest to the system 150. Further, on determining a viewer cohort or cohort of the viewer 102, the system 150 sends a request to the advertisement server 128 for Ad contents to be inserted in the one or more Ad position markers determined for the cohort. It shall be noted that the request for Ad content to be inserted in the one or more Ad position markers may include information related to the cohort of the viewer 102.
[0061] The advertisement server 128 is configured to manage and run online advertising campaigns for advertising entities such as, an advertising entity 130. More specifically, the advertising entity 130 is an enterprise that employs advertising agencies for developing advertisement content related to the enterprise that is creative and appealing for different cohorts of content viewers. Thus, a plurality of advertisement content related to multiple advertising entities such as, the advertising entity 130 that cater to a wide variety of cohorts are stored in the advertisement server 128. Moreover, the Ad content is also encoded at different resolutions and/or bitrates and stored in the advertisement server 128. As already explained with reference to the encoding of the content 110, the Ad contents are also segmented into segments of a fixed length that are consistent with the length of the segments of the content 110 (known as second encoded content segments). The advertisement server 128 selects one or more Ad contents for the one or more Ad position markers based on the cohort of the content viewer 102 and sends them to the system 150.
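A hedged sketch of this Ad-selection step is shown below; the inventory structure, interest tags, and slot count are illustrative assumptions and not the actual interface of the advertisement server 128:

```python
# Hypothetical sketch of selecting Ad contents for a cohort from an inventory keyed by
# interest tags; the tags and inventory structure are illustrative assumptions.
def select_ads_for_cohort(ad_inventory, cohort_interests, slot_count):
    matching = [ad for ad in ad_inventory if ad["interest_tag"] in cohort_interests]
    return matching[:slot_count]

ad_inventory = [
    {"ad_id": "A1", "interest_tag": "movie-trailer"},
    {"ad_id": "A2", "interest_tag": "smartwatch"},
    {"ad_id": "A3", "interest_tag": "health-wellness"},
]
print(select_ads_for_cohort(ad_inventory, {"movie-trailer", "smartwatch"}, 2))
```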
[0062] In another embodiment, the system 150 is configured to generate a modified manifest or a new manifest related to the content 110 requested by the subscriber 102 based on the original manifest and the one or more Ad contents. The term ‘modified manifest’ or ‘new manifest’ as used herein refers to an original manifest that is customized to add information related to the fetched advertisement content based on behavior attributes of a cohort. The one or more Ad contents are inserted in the one or more Ad position markers (i.e., selected scene transition markers) specified by the content handling team 116 (or automatically selected by the system 150). More specifically, the system 150 is configured to integrate URL information related to the one or more advertisement content within the original manifest to generate the modified manifest or the new manifest for the content viewer 102. The system 150 is configured to insert segments of the one or more Ad content among the plurality of segments related to the content 110 based on corresponding Ad position markers. In general, the system 150 stitches the modified manifest by inserting segments related to each Ad content in slots identified by the corresponding Ad position markers (i.e., one or more Ad position markers determined based on the dynamic tolerance threshold of the content viewer 102) within the content 110. It is noted that whenever encoded content segments are linked in the manifest file, an encoded content record or a media content record is also provided in the manifest. The encoded content record provides a sequence for the encoded content segments. This sequence helps the electronic device to understand in what sequence the encoded content segments are to be played. For instance, a first media content record (or first encoded content record) may state that segments have to be played as follows: S1, S2, S3, S4, S5, and S6, while a second media content record (or second encoded content record) may state that second media content segments such as Ads have to be played as follows: A1 and A2. To that end, while generating the modified manifest, the system 150 may also generate a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. In an example, the new media content record may state that the content segments are to be played as follows: S1, S2, A1, A2, S3, S4, S5, and S6.
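The stitching described above, which produces the new media content record S1, S2, A1, A2, S3, S4, S5, S6 in the example, may be sketched as follows (markers are expressed here as segment indices purely for simplicity):

```python
# Sketch of stitching a new media content record: second (Ad) encoded segments are
# inserted after the first encoded segment whose boundary matches each Ad position
# marker. Markers are given here as segment indices for simplicity.
def stitch_new_record(first_segments, ad_breaks):
    """ad_breaks: mapping of segment index -> list of Ad segments to insert after it."""
    new_record = []
    for index, segment in enumerate(first_segments, start=1):
        new_record.append(segment)
        new_record.extend(ad_breaks.get(index, []))
    return new_record

first_segments = ["S1", "S2", "S3", "S4", "S5", "S6"]
ad_breaks = {2: ["A1", "A2"]}          # one Ad position marker after segment S2
print(stitch_new_record(first_segments, ad_breaks))
# -> ['S1', 'S2', 'A1', 'A2', 'S3', 'S4', 'S5', 'S6']
```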
[0063] Further, the modified manifest is provided to a content repository server such as the CDN 122. The CDN 122 caches a copy of the modified manifest and if other content viewers from the same cohort request the same content, for example, the content 110, the CDN 122 automatically refers to the cached modified manifest to deliver the content 110. Further, the modified manifest is also provided to the content provider platform 124.
[0064] The content provider platform 124 provides the modified manifest to the content viewer 102 in response to the request for the playback URL. The content viewer 102 may send a request to the CDN 122 to stream the content 110 using the modified manifest. The CDN 122 streams the segments related to the content 110 based on the modified manifest to the electronic device 104. In one embodiment, the system 150 checks the time duration for which the copy of the modified manifest has been cached in the CDN 122. Additionally, the CDN 122 stores a time threshold associated with the cached manifest. If the time duration is more than or equal to the time threshold, another new modified manifest is generated in response to the request for the playback URL from the content viewer 102 from the same cohort. It is noted that this aspect of the present disclosure keeps the modified manifest up to date with respect to the preferences of the content viewer 102 or the content viewer’s cohort. Since the viewer behavior data is dynamic in nature (i.e., it keeps changing with time based on the viewer’s preferences), updating the modified manifest whenever the time threshold is crossed keeps the modified manifest in line with the latest preferences of the content viewer 102 or the content viewer’s cohort.
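As an illustrative, non-limiting sketch of the cache-age check described above, the following Python snippet assumes a hypothetical cache entry structure and regeneration callback; the field names are assumptions and not part of any actual CDN API:

import time

# Hypothetical sketch of the cache-age check; the cache entry fields and the
# regeneration callback are assumptions, not an actual CDN interface.

def get_manifest_for_cohort(cache_entry, time_threshold_s, regenerate):
    """Return a cached modified manifest if it is still fresh, else rebuild it."""
    age = time.time() - cache_entry["cached_at"]
    if age >= time_threshold_s:
        # Viewer behavior data is dynamic, so a stale manifest is rebuilt
        # to reflect the cohort's latest preferences.
        cache_entry["manifest"] = regenerate(cache_entry["cohort_id"])
        cache_entry["cached_at"] = time.time()
    return cache_entry["manifest"]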
[0065] In a non-limiting scenario, the content viewer 102 may click on a widget or select an icon for a first media content on an application associated with the content provider platform on their electronic device 104. In response to this selection, a playback Uniform Resource Locator (URL) request is sent to the system 150 from the content viewer 102. In an example, the playback URL request may include or indicate a request for the first media content 110 by the content viewer 102. In response to the playback URL request, the system 150 accesses the manifest file from the content repository server such as the CDN 122. In various non-limiting examples, the manifest file may include one or more URLs associated with the first media content, one or more URLs associated with the second media content, a first media content record, and a second media content record. It is noted that the term encoded content record may also be interchangeably referred to as a media rendition record. Various components of the system 150 that process the content to identify Ad position markers for providing targeted Ads are explained next with reference to FIG. 3.
[0066] FIG. 3 is a block diagram of a system 150 configured to facilitate the insertion of targeted Ads in streaming content, in accordance with an embodiment of the invention. The term ‘targeted Ads’ as used herein primarily refers to Ads that are delivered to a content viewer 102 or a cohort of content viewers with certain traits or behavioral attributes indicating a strong preference or inclination toward a specific type of advertising. With respect to the present disclosure, the targeted second media content is selected based on the preferences of the content viewer or the content viewer’s cohort as determined from the viewer behavior data. Herein, the second media content is assumed to be one or more Ads and the targeted second media content is assumed to be targeted Ads for the sake of explanation, without limiting the scope of the present disclosure. Further, the system 150 determines the number of targeted Ads the content viewer may tolerate before dropping off from watching the video content 110. In general, the system 150 is configured to customize the Ad insertion, i.e., the system 150 customizes the type of Ads and the number of Ads for individual content viewers or viewer cohorts, which substantially improves the content viewers’ viewing experience as the number of interruptions is reduced and, moreover, the Ad content is engaging, making it feel less intrusive to a content viewer 102.
[0067] In at least one embodiment, the system 150 may be included within a content repository server such as a CDN (such as the CDN 122 shown in FIG. 1). In some embodiments, the system 150 may be implemented external to the CDN 122 and configured to be in operative communication with the CDN 122.
[0068] The system 150 is depicted to include a processor 302, a memory module 304, an input/output module 312, a communication module 314, and a storage module 316. It is noted that although the system 150 is depicted to include the processor 302, the memory module 304, the input/output module 312, the communication module 314, and the storage module 316, in some embodiments, the system 150 may include more or fewer components than those depicted herein. The various components of the system 150 may be implemented using hardware, software, firmware, or any combination thereof. Further, it is also noted that one or more components of the system 150 may be implemented in a single server or a plurality of servers, which are remotely placed from each other.
[0069] In one embodiment, the processor 302 may be embodied as a multi-core processor, a single-core processor, or a combination of one or more multi-core processors and one or more single-core processors. For example, the processor 302 may be embodied as one or more of various processing devices, such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), a processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits, such as for example, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. In one embodiment, the memory module 304 is capable of storing machine-executable instructions, referred to herein as platform instructions 305. Further, the processor 302 is capable of executing the platform instructions 305. In an embodiment, the processor 302 may be configured to execute hard-coded functionality. In an embodiment, the processor 302 is embodied as an executor of software instructions, wherein the instructions may specifically configure the processor 302 to perform the algorithms and/or operations described herein when the instructions are executed. The processor 302 is depicted to include a scene boundary detection module 306, a scene indexing module 308, and an Ad management module 310. The modules of the processor 302 may be implemented as software modules, hardware modules, firmware modules, or as a combination thereof.
[0070] The memory module 304 stores instructions configured to be used by the processor 302, or more specifically by the various modules of the processor 302 such as the scene boundary detection module 306, the scene indexing module 308, and the Ad management module 310 to perform respective functionalities. The memory module 304 may be embodied as one or more non-volatile memory devices, one or more volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. For example, the memory module 304 may be embodied as semiconductor memories, such as flash memory, mask ROM, PROM (programmable ROM), EPROM (erasable PROM), RAM (random access memory), and the like. In at least one example embodiment, the system 150 may store logic and/or instructions for: (1) analyzing the streaming content to assign a plurality of scene transition markers in relation to each probable scene boundary in the content, (2) selecting, for each probable scene boundary, a scene transition marker from among the plurality of scene transition markers that coincides with the actual scene boundary, (3) determining a dynamic tolerance threshold of a content viewer based on at least one of online viewer behavior and content characteristics, (4) selecting one or more Ad position markers from among a plurality of Ad position markers based on the Ad tolerance level of the content viewer and scene analysis, and (5) providing the one or more Ad position markers in relation to the content to the video encoder (see, video encoder 120 shown in FIG. 1).
[0071] In an embodiment, the input/output module 312 (hereinafter referred to as ‘I/O module 312’) may include mechanisms configured to receive inputs from and provide outputs to the operator(s) of the system 150. To that effect, the I/O module 312 may include at least one input interface and/or at least one output interface. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, a microphone, and the like. Examples of the output interface may include, but are not limited to, a display such as a light-emitting diode display, a thin-film transistor (TFT) display, a liquid crystal display, an active-matrix organic light-emitting diode (AMOLED) display, a speaker, a ringer, a vibrator, and the like.
[0072] In an example embodiment, the processor 302 may include I/O circuitry configured to control at least some functions of one or more elements of the I/O module 312, such as, for example, a speaker, a microphone, a display, and/or the like. The processor 302 and/or the I/O circuitry may be configured to control one or more functions of the one or more elements of the I/O module 312 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the memory module 304, and/or the like, accessible to the processor 302.
[0073] The communication module 314 may include communication circuitry such as for example, a transceiver circuitry including an antenna and other communication media interfaces to facilitate communication between the system 150 and one or more remote entities, such as the content library 114, the video encoder 120, the CDN 122, the content provider platform 124, and the advertisement server 128 over a communication network (not shown). The communication circuitry may, in at least some example embodiments enable reception of: (1) a request for playback of the content from the content provider platform 124, (2) content from the content library 114 in relation to the request for playback, (3) an original manifest related to the content requested by the content viewer from the CDN 122, and (4) one or more Ads from the advertisement server 128. The communication circuitry may further be configured to provide: (1) one or more Ad position markers to the video encoder, and (2) the modified manifest to the CDN 122 in response to the request for playback of the content.
[0074] The system 150 is depicted to be in operative communication with a storage module 316. The storage module 316 is any computer-operated hardware suitable for storing and/or retrieving data. In one embodiment, the storage module 316 is configured to store data related to the viewing history of a plurality of content viewers.
[0075] The storage module 316 may include multiple storage units such as hard drives and/or solid-state drives in a redundant array of inexpensive disks (RAID) configuration. In some embodiments, the storage module 316 may include a storage area network (SAN) and/or a network attached storage (NAS) system. In one embodiment, the storage module 316 may correspond to a distributed storage system, wherein individual databases are configured to store custom information such as, content view logs, etc. In another embodiment, the storage module 316 is configured to store the viewer behavior data corresponding to a plurality of content viewers. In some embodiments, the storage module 316 is integrated within the system 150. For example, the system 150 may include one or more hard disk drives as the storage module 316. In other embodiments, the storage module 316 is external to the system 150 and may be accessed by the system 150 using a storage interface (not shown in FIG. 3). The storage interface is any component capable of providing the processor 302 with access to the storage module 316. The storage interface may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processor 302 with access to the storage module 316. In an embodiment, the storage module 316 may be referred to interchangeably as the database 316 throughout the present disclosure.
[0076] The various components of the system 150, such as the processor 302, the memory module 304, the I/O module 312, the communication module 314, and the storage module 316 are configured to communicate with each other via or through a centralized circuit system 318. The centralized circuit system 318 may be various devices configured to, among other things, provide or enable communication between the components of the system 150. In certain embodiments, the centralized circuit system 318 may be a central printed circuit board (PCB) such as a motherboard, a mainboard, a system board, or a logic board. The centralized circuit system 318 may also, or alternatively, include other printed circuit assemblies (PCAs) or communication channel media.
[0077] In an embodiment, the system 150 is configured to access a first media content and a second media content from a content repository server. In a particular implementation, the first media content is the video content 110, the second media content is the Ad content and the content repository server is the CDN 122. In at least one example embodiment, the communication module 314 is configured to receive video content 110 from a content library such as, the content library 114 (shown in FIG. 1). The content provider may have implemented a number of content ingestion servers (not shown) to ingest content from various content sources. The content library 114 may be in operative communication with content ingestion servers associated with one or more OTT platforms to receive the latest content offerings from content production houses, media portals, and the like. Accordingly, the communication module 314 is configured to receive the video content from the content library 114. The communication module 314 forwards the video content 110 to the scene boundary detection module 306 of the processor 302.
[0078] In an embodiment, the scene boundary detection module 306 is configured to identify a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments. In an additional embodiment, the identification of the probable scene boundaries includes determining at least one of a cast, a background, a dialogue, and a sphere of activity within the first encoded segments based, at least in part, on analyzing the first encoded segments. The scene boundary detection module 306 then uses a first machine learning model to determine at least one of a change in the cast, the background, the dialogue, and the sphere of activity and further to determine a point of transition from one scene to another scene within the first media content based on at least one of the change in the cast, the background, the dialogue, and the sphere of activity. In a particular implementation, the scene boundary detection module 306 in conjunction with the instructions in the memory module 304 is configured to analyze the first media content to identify actual scene boundaries in the video content. In one illustrative example, the scene boundary detection module 306 includes at least one learning algorithm (i.e., a first machine learning model) trained to analyze the first media content, camera focal distance, motion, audio, and semantic continuity to group shots that exhibit some form of continuity into scenes. The in-between image frames that show the maximum change in parameters such as color properties, motion, audio, and the like are predicted to be scene boundaries. In an embodiment, the system 150 is configured to determine one or more scene transition markers corresponding to the first media content based on the plurality of probable scene boundaries and one or more pre-defined rules. It is understood that the one or more pre-defined rules are designed based on various parameters associated with a media content (such as the first media content). In a non-limiting example, the various parameters may be extracted from meta-data associated with the media content and may include at least a camera shot, a camera angle, a camera pan, subtitles (i.e., captions), and the like related to the media content being analyzed. It is pertinent to note that the one or more pre-defined rules may be designed or generated by an administrator (not shown) of the system 150 or the content handling team 116 associated with the system 150. As may be understood, the one or more pre-defined rules may assist the system 150 in determining the one or more scene transition markers based on the various parameters associated with the media content (as described earlier). For instance, a particular pre-defined rule may indicate that a scene transition marker can be added after N camera shots (where N is a non-zero natural number), or after a change in the camera angle.
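As an illustrative, non-limiting sketch of how segment-level feature changes and a pre-defined rule (for example, allowing a marker only after at least N camera shots) might be combined, consider the following Python snippet; the feature representation, the change score, and the thresholds are placeholders standing in for the first machine learning model and the pre-defined rules:

# Simplified sketch: the feature extractor, change score, and rule parameters
# are placeholders standing in for the first machine learning model and the
# pre-defined rules described above.

def probable_scene_boundaries(segment_features, change_threshold=0.7,
                              min_shots_between_markers=3):
    """Flag segment indices where consecutive segments differ strongly.

    segment_features: list of dicts with keys such as 'cast', 'background',
    'shot_count', and a numeric 'embedding_change' versus the prior segment.
    """
    boundaries = []
    shots_since_last_marker = 0
    for i in range(1, len(segment_features)):
        current = segment_features[i]
        shots_since_last_marker += current.get("shot_count", 1)
        changed = (
            current["embedding_change"] >= change_threshold
            or current["cast"] != segment_features[i - 1]["cast"]
            or current["background"] != segment_features[i - 1]["background"]
        )
        # Pre-defined rule: only allow a marker after at least N camera shots.
        if changed and shots_since_last_marker >= min_shots_between_markers:
            boundaries.append(i)
            shots_since_last_marker = 0
    return boundaries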
[0079] In a particular implementation, the scene boundary detection module 306 is configured to parse through image frames of the first media content to assign a plurality of scene transition markers in relation to each probable scene boundary identified in the video content 110. Each scene transition marker identifies a run-time indicative of an ending image frame of a scene, which may serve as an indication of the content transitioning from one scene to another and, hence, a probable candidate for insertion of a second media content. In one illustrative example, the scene boundary detection module 306 assigns three scene transition markers, for example, a scene transition marker at ‘t-1’ seconds, a scene transition marker at ‘t’ seconds, and another scene transition marker at ‘t+1’ seconds. The three scene transition markers are indicative of a probable scene boundary in the first media content. It shall be noted that the assignment of three scene transition markers is mentioned herein for example purposes and the scene boundary detection module 306 may assign a fewer or greater number of scene transition markers on identifying a probable scene boundary. An example assignment of scene transition markers is explained next with reference to FIG. 4A.
[0080] FIG. 4A shows an example representation 400 for illustrating assignment of a plurality of scene transition markers in relation to a probable scene boundary detected in the first media content 402, in accordance with an embodiment of the invention. As explained, the first media content 402 received from the content library 114 is analyzed for identification of probable scene boundaries, and on detecting a probable scene boundary, a plurality of scene transition markers is assigned. The plurality of scene transition markers indicates the presence of an actual scene boundary among the plurality of scene transition markers assigned in the first media content 402.
[0081] In this representation 400, the markers M1-1, M1-2 (404a, 404b) are assigned in relation to a probable scene boundary B1 (not shown in FIG. 4A), and the markers M2-1, M2-2 (406a, 406b) are assigned in relation to a probable scene boundary B2 (not shown in FIG. 4A) in the first media content 402. It shall be noted that each probable scene boundary is assigned two scene transition markers (i.e., markers 404a, 404b for the probable scene boundary B1) for example purposes, and more than two scene transition markers may be assigned in relation to each scene transition boundary based on one or more pre-defined rules. The processing further includes determining, via a first machine learning model, a confidence score for the one or more scene transition markers based on determining whether each scene transition marker coincides with an actual scene boundary. Further, selecting the intermediate scene transition marker includes assigning a color code to the one or more scene transition markers based on the confidence score and indexing the one or more scene transition markers based on the assigned color code. Further, the system 150 is configured to select the intermediate scene transition marker from the one or more scene transition markers based on the one or more pre-defined rules and the assigned color code corresponding to the highest confidence score. For example, a probable scene boundary may be identified at the end of a song sequence in the first media content 402 and, as such, two markers M1-1 and M1-2 (404a, 404b) are assigned at the end of the song. It shall be noted that although two scene transition markers have been assigned, only one of the scene transition markers coincides with an actual scene boundary in the first media content 402, which is explained with reference to FIG. 4B.
[0082] In an embodiment, the system 150 is configured to select one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold. The scene boundary detection module 306 is configured to select a second position marker from the plurality of scene transition markers (i.e., the one or more scene transition markers) for each probable scene boundary. The selected second position marker for each probable scene boundary coincides with an actual scene boundary in the first media content 402 (i.e., the movie). In an embodiment, the second position marker is selected based on the prediction of a machine learning model, i.e., the first machine learning model. Accordingly, the first machine learning model may assign a color code to each scene transition marker of the plurality of scene transition markers based on a confidence level while predicting the actual scene boundary. For example, the color green may be assigned to a scene transition marker predicted with a confidence score of 95%, the color blue may be assigned to a scene transition marker with a confidence score of 80%, and the color red may be assigned to a scene transition marker with a confidence score of less than 80%. For example, the first machine learning model may assign the color ‘green’ to marker M1-1 (404a) and the color ‘red’ to marker M1-2 (404b) of FIG. 4A. As such, the scene boundary detection module 306 may select a second position marker that has the highest confidence score (i.e., marker M1-1) among the plurality of scene transition markers for each probable scene boundary, which is shown in the representation 420 in FIG. 4B. The marker M1-1 coincides with the actual scene boundary in the first media content 402. Similarly, the plurality of scene transition markers in relation to each probable scene boundary is processed by the first machine learning model for identifying actual scene boundaries in the first media content 402. The selected second position markers in relation to each probable scene boundary are referred to as a plurality of Ad position markers. For example, markers M1-1 (404a) and M2-2 (406b) are selected as the Ad position markers in the first media content 402, as shown in FIG. 4B. It shall be noted that although two Ad position markers have been identified in the first media content 402, Ads may be inserted at only some Ad position markers based on the dynamic tolerance threshold of the content viewer 102, as will be explained in detail later.
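As an illustrative, non-limiting sketch of the confidence-based color coding and marker selection described above, consider the following Python snippet; the confidence scores would, in practice, be produced by the first machine learning model, and the 95% and 80% thresholds follow the example above:

# Sketch of color coding and selection; confidence scores would come from the
# first machine learning model, and the 95%/80% thresholds follow the example above.

def color_code(confidence):
    if confidence >= 0.95:
        return "green"
    if confidence >= 0.80:
        return "blue"
    return "red"

def select_second_position_marker(candidates):
    """candidates: list of (marker_time_s, confidence) for one probable boundary."""
    coded = [(t, conf, color_code(conf)) for t, conf in candidates]
    # The marker with the highest confidence is taken as the actual scene boundary.
    return max(coded, key=lambda item: item[1])

# Example mirroring FIG. 4A/4B: M1-1 (confidence 0.95) is chosen over M1-2 (0.60).
print(select_second_position_marker([(614.0, 0.95), (616.0, 0.60)]))
# (614.0, 0.95, 'green')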
[0083] In an embodiment, the system 150 is configured to index the one or more second position markers based on the assigned color code. Further, the system 150 is configured to assign an indexing tag to each of the one or more color-coded second position markers and facilitate navigation by the content viewer from one scene to another scene based on the assigned indexing tags. In a particular implementation, the scene indexing module 308 in conjunction with the instructions in the memory module 304 is configured to insert a tag at each second position marker of the plurality of second position markers. The tag includes a timestamp and information related to the scene, for example, an important event in the scene. In one illustrative example, a tag may be inserted at 10 minutes and 17 seconds indicating an event of ‘a murder’ and another tag may be inserted at 25 minutes and 10 seconds indicating an event of ‘a first clue’. The tags inserted at the plurality of second position markers provide an option for the content viewer 102 to skip/move between scenes of the first media content 402. For example, the first media content 402 may include a tag indicating a ‘song’ at a second position marker and a subsequent second position marker may be associated with a tag ‘friends meet over dinner’. In such cases, the content viewer 102 may choose to skip the song and move an indicator on a time slider widget to the subsequent second position marker to view the subsequent scene (i.e., friends meet over dinner). The communication module 314 is configured to send the plurality of second position markers along with the assigned tags to the video encoder 120 (shown in FIG. 1). As already explained, the video encoder 120 is configured to encode the first media content 402 with the plurality of second position markers and tags.
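As an illustrative, non-limiting sketch of how indexing tags might be attached to the selected second position markers and used for scene navigation, consider the following Python snippet; the tag structure and the navigation helper are assumptions made for explanation:

# Illustrative sketch: the tag structure and the navigation helper are
# assumptions showing how indexed markers could support scene skipping.

def tag_markers(markers, descriptions):
    """Attach an indexing tag (timestamp + scene description) to each marker."""
    return [{"timestamp_s": t, "event": descriptions.get(t, "")} for t in markers]

def next_scene(tags, current_time_s):
    """Return the first tagged marker after the current playback position."""
    upcoming = [tag for tag in tags if tag["timestamp_s"] > current_time_s]
    return min(upcoming, key=lambda tag: tag["timestamp_s"]) if upcoming else None

# Example mirroring the description: tags at 10 min 17 s and 25 min 10 s.
tags = tag_markers([617.0, 1510.0], {617.0: "a murder", 1510.0: "a first clue"})
print(next_scene(tags, 600.0))  # {'timestamp_s': 617.0, 'event': 'a murder'}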
[0084] When the content viewer 102 requests the first media content 402, the communication module 314 receives a request for a playback URL of the first media content 402. The request for the playback URL of the first media content 402 includes content viewer metadata associated with the content viewer 102. Accordingly, the communication module 314 may fetch an original manifest corresponding to the first media content 402 from the CDN 122 (shown in FIG. 1). The original manifest includes the segments of the content along with the plurality of second position markers.
[0085] In an embodiment, the system 150 is configured to generate a new media content record for the content viewer 102 based on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based on the one or more second position markers. In a particular implementation, the Ad management module 310 in conjunction with the instructions in the memory module 304 is configured to insert the targeted second media content, e.g., targeted Ads, in the video content, i.e., the first media content 402, for effectively engaging the content viewer 102. To that effect, the Ad management module 310 is configured to determine a dynamic tolerance threshold of the content viewer 102. Typically, the content viewer 102 is classified into a cohort and the targeted Ads are provided based on analysis of the online viewer behavior of the cohort using the viewer behavior data. In one illustrative example, historical data of the cohort indicates a low tolerance for Ads inserted between video content. In general, the content viewers of this cohort drop off from viewing video content that has more than 4 Ad contents/hour. As such, the Ad management module 310 is configured to select one or more second position markers based on the dynamic tolerance threshold of the content viewer/the cohort of viewers. For example, three second position markers may be identified among the plurality of second position markers to insert Ads in a first media content spanning an hour.
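As an illustrative, non-limiting sketch of deriving a per-viewer ad budget from the dynamic tolerance threshold, consider the following Python snippet; the ads-per-hour figure mirrors the cohort example above, and the even spreading of retained markers is an assumption rather than a required strategy:

# Sketch of deriving a per-viewer ad budget from the dynamic tolerance threshold;
# the ads-per-hour figure mirrors the cohort example above and is illustrative.

def select_markers_for_viewer(ad_position_markers, tolerance_ads_per_hour,
                              content_duration_s):
    """Keep at most the number of ad breaks the viewer is expected to tolerate."""
    budget = int(tolerance_ads_per_hour * content_duration_s / 3600)
    if budget <= 0:
        return []
    # Spread the retained markers roughly evenly across the content timeline.
    keep_every = max(1, -(-len(ad_position_markers) // budget))  # ceiling division
    return ad_position_markers[::keep_every][:budget]

markers = [610.0, 1250.0, 1900.0, 2500.0, 3100.0]
print(select_markers_for_viewer(markers, tolerance_ads_per_hour=3,
                                content_duration_s=3600))
# [610.0, 1900.0, 3100.0]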
[0086] In an embodiment, the Ad management module 310 is configured to analyze the plurality of second position markers for identifying the one or more second position markers. More specifically, the Ad management module 310 identifies whether each second position marker is a suitable position for insertion of second media content based on events determined from frame analysis, as will be explained in detail. To that effect, the Ad management module 310 is configured to perform a frame analysis to identify an event from the image frames prior to the second position marker and the image frames succeeding the second position marker. For example, one or more frames prior to the second position marker and one or more frames after the second position marker are retrieved and analyzed for identifying an event. In one illustrative example, the one or more frames prior to the second position marker may indicate a tragedy (i.e., an event), such as a horrific car accident. Further, analysis of the one or more image frames of the first media content retrieved after that second position marker may identify an emotional event, for example, a family grieving over the death of a person in the horrific car accident. As such, the system 150 interprets that the second position marker is not a suitable candidate for insertion of Ad content. In some example embodiments, the online viewer behavior may also be factored in along with the frame analysis to identify the suitable second position markers for the content viewer/cohort of the viewer based on behavioral attributes of the cohort. It shall be noted that such frame analysis is performed on the plurality of second position markers to determine the one or more second position markers in which Ads may be inserted for the content viewer 102.
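As an illustrative, non-limiting sketch of the frame-analysis suitability check described above, consider the following Python snippet; classify_event stands in for a frame-level event classifier and is a placeholder rather than an actual model interface:

# Sketch of the suitability check; `classify_event` stands in for a frame-level
# event classifier and is a placeholder, not a real model API.

SENSITIVE_EVENTS = {"tragedy", "grief", "cliffhanger"}

def is_suitable_ad_slot(frames_before, frames_after, classify_event):
    """Reject a second position marker if it would split an emotionally
    continuous moment, such as a tragedy followed by a grieving scene."""
    event_before = classify_event(frames_before)
    event_after = classify_event(frames_after)
    return not (event_before in SENSITIVE_EVENTS or event_after in SENSITIVE_EVENTS)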
[0087] In an embodiment, the system 150 is configured to generate a modified manifest or a new manifest for the content repository server based on the new media content record. In a particular implementation, after identifying the one or more second position markers for inserting the second media content such as Ads, the Ad management module 310 requests second media content from the advertisement server 128 (shown in FIG. 1). Accordingly, the communication module 314 is configured to receive one or more second media content from the advertisement server 128 to be inserted in the place of the one or more second position markers of the first media content. In an embodiment, the Ad management module 310 is configured to generate a modified manifest based on the original manifest and the one or more Ads (i.e., Ads inserted in the one or more second position markers). The communication module 314 sends the modified manifest to the CDN 122 which stores the modified manifest for the playback of the video content to the content viewer 102. It shall be noted that if any other content viewer from the same cohort requests the same first media content, the system 150 does not process the request to generate a modified manifest but rather causes fetching of the modified manifest cached in the CDN 122 for playback of the first media content. An example of targeted Ads inserted within the first media content is explained next with reference to FIG. 5.
[0088] FIG. 5 shows a representation 500 for illustrating an example insertion of Ad content segments in-between segments of streaming content, in accordance with an embodiment of the invention. The representation 500 depicts a sequence of segments as per a modified manifest. The sequence of segments may correspond to a content (e.g., a short film) to be streamed to a content viewer or a cohort of viewers. As explained with reference to FIGS. 3 to 4B, the content received from a content library, such as, the content library 114 (shown in FIG. 1) is encoded with one or more Ad position markers based on analysis of online viewer behavior. In one embodiment, the one or more Ad position markers are inserted by the system 150 (shown in FIG. 2).
[0089] The sequence of segments in the representation 500 is shown as S1, S2, S3, A1, S4, S5, S6, S7, A2, and S8. The segments S1 to S8 are related to the short film and the segments A1 and A2 are related to two different Ad contents inserted by the system 150 when two Ad position markers are selected for the content viewer 102 or cohort of viewers. The number of advertisements to be inserted in the content for a content viewer or a cohort of viewers is based on the Ad tolerance level (i.e., dynamic tolerance threshold) of the content viewer/the cohort of viewers. In this example, it is determined that the content viewer 102 has a low Ad tolerance level, and thereby the Ad management module 310 selects only two Ad position markers from among a plurality of Ad position markers to insert Ads for the content viewer. The Ad contents are received based on the number of Ad position markers selected by the system 150 for the content viewer 102 or the cohort of the viewer. It shall be noted that the segments S1, S2, S3, A1, S4, S5, S6, S7, A2, and S8 are all of uniform length and each spans a time duration of ‘t’ seconds, for example, 5 seconds.
[0090] As shown in FIG. 5, two Ad position markers were selected in the short film based on the dynamic tolerance threshold of the content viewer 102. Thereafter, the system 150 is configured to insert an Ad (i.e., the Ads A1, A2) at each of those Ad position markers. As shown in FIG. 5, a segment corresponding to a first Ad A1 is inserted by the system 150 at a first Ad position marker (see, 404a shown in FIG. 4B) after segment S3 and a segment corresponding to a second Ad A2 is inserted at a second Ad position marker (see, 406b shown in FIG. 4B) after segment S7. The Ad segments A1 and A2 represent a time frame in which any other content, for example, Ad content, may be served to the content viewer 102. As already explained, the system 150 utilizes the online viewer behavior of a content viewer/a cohort of viewers to serve advertisement content in the slots specified by the Ad segments A1 and A2. Accordingly, each Ad content spanning a duration of 5 seconds (i.e., two Ad segments A1 and A2 of 5 seconds duration each) is served to the content viewer during the streaming of the short film, i.e., after the segments S3 and S7, respectively. An example of a UI providing targeted Ads to content viewers with different online viewer behavior is explained next with reference to FIGS. 6A and 6B.
[0091] FIG. 6A shows a schematic representation 600 of a UI 602 corresponding to a video content 604 displayed to a content viewer 605 for illustrating the provisioning of targeted Ads in the video content 604 (i.e., the first media content), in accordance with an embodiment of the invention. As explained with reference to FIG. 3, a plurality of Ad position markers (not shown in FIG. 6A) is identified in the video content 604. The plurality of Ad position markers corresponds to actual scene boundaries in the video content 604 and represents optimal slots for inserting Ad content within the video content 604. These Ad position markers are determined from among a plurality of scene transition markers assigned in relation to each probable scene boundary in the video content 604. Further, one or more Ad position markers are selected among the plurality of Ad position markers based on at least one of frame analysis in relation to each Ad position marker and online viewer behavior of a corresponding content viewer.
[0092] In this example scenario, the video content 604 is a thriller movie starring actor Pierce Brosnan and the content viewer 605 is an ardent fan of the actor. As determined from the online viewer behavior, the content viewer 605 has a high level of Ad tolerance, and an even higher level when viewing video content starring the actor (i.e., Pierce Brosnan). As such, when the content viewer requests the video content 604, the content viewer 605 is presented the video content 604 interspersed with 8 Ads. The Ads are represented by Ad position markers 608, 610, 612, 614, 616, 618, 620, and 622 on a time slider widget 606. The time slider widget 606 is a simplified visualization of temporal data related to the video content 604. More specifically, the system 150 determines the maximum number of Ads a content viewer can endure watching before dropping off from watching the video content 604. Further, the Ad position markers 608, 610, 612, 614, 616, 618, 620, and 622 are selected from among the plurality of Ad position markers based on the maximum number of Ads and the frame analysis to insert the Ads. It shall be noted that the Ad content of the Ads inserted at the Ad position markers 608, 610, 612, 614, 616, 618, 620, and 622 is also customized based on the online viewer behavior. In one example scenario, the Ads inserted in the video content 604 are customized for a cohort to which the content viewer belongs, as explained with reference to FIG. 3. Further, the content viewer can skip between scenes of the video content 604 based on the Ad position markers as already explained. It shall be noted that the video content 604 is shown to be interspersed with 8 Ad position markers (i.e., Ad position markers 608, 610, 612, 614, 616, 618, 620, and 622) for example purposes and the video content 604 may indeed include fewer or more Ads based on the dynamic tolerance threshold determined for the content viewer.
[0093] FIG. 6B shows a schematic representation 630 of a UI 632 corresponding to the same video content 604 displayed to another content viewer 631 for illustrating the provisioning of targeted Ads in the video content 604, in accordance with an embodiment of the invention. In this example scenario, the content viewer 631 requests the video content 604 (i.e., an action thriller starring actor Pierce Brosnan), which is also viewed by the content viewer 605 who shows similar interests in action thrillers. However, it shall be noted that the content viewer 605 and the content viewer 631 are assigned to different cohorts based on their online viewer behavior and, as such, the targeted Ads (i.e., number of Ads and Ad contents) provided to the content viewer 605 are different from the targeted Ads provided to the content viewer 631.
[0094] In this example scenario, the content viewer 631 prefers thriller movies but the content viewer 631 is not an ardent fan of Pierce Brosnan. As already explained, the online viewer behavior of the content viewer 631 may be analyzed to determine an Ad tolerance level of the content viewer 631 for inserting targeted Ads within the video content 604. The analysis may indicate that the content viewer 631 has a low dynamic tolerance threshold, for example, the content viewer 631 usually drops off from watching content interspersed with more than 5 Ads in a content spanning 1 hour 30 minutes. Moreover, as the content viewer 631 is not an ardent fan of the actor (i.e., Pierce Brosnan), the dynamic tolerance threshold may be reduced further as determined by the Ad management module 310. Accordingly, the Ad management module 310 selects 3 Ad position markers 610, 616, and 622 among the plurality of Ad position markers 608, 610, 612, 614, 616, 618, 620, and 622 for inserting Ads within the video content 604 for the content viewer 631. These Ad position markers 610, 616, and 622 may be selected from among the Ad position markers 608, 610, 612, 614, 616, 618, 620, and 622 based, at least in part, on scene analysis and the dynamic tolerance threshold of the content viewer 631. The Ads are represented by the Ad position markers 610, 616, and 622 on a time slider widget 634 for the content viewer 631. It shall be noted that the Ad position markers 610, 616, and 622 are shown on the time slider widget 634 for illustration purposes only and indeed different Ad position markers may be selected based, at least in part, on scene analysis. A method for facilitating the insertion of advertisements in streaming content is explained next with reference to FIG. 7.
[0095] FIG. 7 shows a flow diagram of a method 700 for facilitating insertion of advertisements in streaming content, in accordance with an embodiment of the invention. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry and/or by a system such as the system 150 explained with reference to FIGS. 3 to 6A-6B and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 700 starts at operation 702.
[0096] At 702 of the method 700, a video content is received from a content library (e.g., the content library 114) by a system such as the system 150. The content library 114 may be in operative communication with content ingestion servers associated with one or more OTT platforms to receive the latest content offerings from content production houses, media portals, and the like. The video content received as a part of the latest content offering is processed further for targeted Ad insertion.
[0097] At 704 of the method 700, a plurality of scene transition markers is assigned in relation to the detection of a probable scene boundary in the video content. In at least one example embodiment, the system 150 assigns at least two scene transition markers for each probable scene boundary detected based, at least in part, on audio and visual feature analysis of the video content. A scene transition marker is selected among the plurality of scene transition markers for each probable scene boundary as an actual scene boundary. In one example embodiment, the selection of the scene transition marker is performed by a content handling team such as, the content handling team 116 (shown in FIG. 1) and the plurality of selected scene transition markers in the video content is provided to the system 150. In another example embodiment, the system 150 automatically identifies the actual scene boundary from the plurality of scene transition markers and the plurality of selected scene transition markers in the video content constitutes a plurality of Ad position markers.
[0098] At 706 of the method 700, the plurality of Ad position markers is provided to a video encoder for encoding the video content with the plurality of Ad position markers.
[0099] FIG. 8 shows a flow diagram of a method 800 for facilitating insertion of advertisements in streaming content, in accordance with an embodiment of the invention. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry and/or by a system such as the system 150 explained with reference to FIGS. 3 to 6A-6B and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 800 starts at operation 802.
[00100] At the operation 802, the method 800 includes accessing, by a system such as system 150, viewer behavior data corresponding to a content viewer such as content viewer 102 from a database (such as storage module 316) associated with the system 150.
[00101] At the operation 804, the method 800 includes accessing, by the system 150, a first media content and a second media content from a content repository server based, at least in part, on a manifest file. Herein, the first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments.
[00102] At the operation 806, the method 800 includes identifying, by the system 150, a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments.
[00103] At the operation 808, the method 800 includes determining, by the system 150, one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules.
[00104] At the operation 810, the method 800 includes determining, by the system 150, a dynamic tolerance threshold for the content viewer 102 based, at least in part, on analyzing the viewer behavior data and the second media content.
[00105] At the operation 812, the method 800 includes selecting, by the system 150, one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold.
[00106] At the operation 814, the method 800 includes generating, by the system 150, a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers.
[00107] At the operation 816, the method 800 includes generating, by the system 150, a modified manifest for the content repository server based, at least in part, on the new media content record.
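As an illustrative, non-limiting sketch, the operations 802 to 816 of the method 800 may be composed as follows; every function named below is a hypothetical placeholder for the corresponding operation described above and not an actual interface of the system 150:

# Hypothetical composition of operations 802-816; each helper is a placeholder
# for the corresponding operation and not an actual interface of the system 150.

def method_800(system, viewer_id):
    behavior = system.access_viewer_behavior(viewer_id)                    # 802
    first, second, manifest = system.access_media_from_manifest()          # 804
    boundaries = system.identify_probable_scene_boundaries(first)          # 806
    markers = system.determine_scene_transition_markers(boundaries)        # 808
    tolerance = system.determine_dynamic_tolerance(behavior, second)       # 810
    positions = system.select_second_position_markers(markers, tolerance)  # 812
    record = system.generate_new_media_content_record(first, second,
                                                      positions)           # 814
    return system.generate_modified_manifest(record, manifest)             # 816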
[00108] FIG. 9 shows a flow diagram of a method 900 for facilitating insertion of advertisements in streaming content, in accordance with an embodiment of the invention. The various steps and/or operations of the flow diagram, and combinations of steps/operations in the flow diagram, may be implemented by, for example, hardware, firmware, a processor, circuitry and/or by a system such as the system 150 explained with reference to FIGS. 3 to 6A-6B and/or by a different device associated with the execution of software that includes one or more computer program instructions. The method 900 starts at operation 902.
[00109] At the operation 902, the method 900 includes accessing, by a system such as the system 150, viewer behavior data including data related to a plurality of content viewers from a database associated with the system 150.
[00110] At the operation 904, the method 900 includes accessing, by the system 150, a first media content and a second media content from a content repository server based, at least in part, on a manifest file. Herein, the first media content includes a plurality of first encoded segments, and the second media content includes a plurality of second encoded segments.
[00111] At the operation 906, the method 900 includes classifying, by the system 150 via a second machine learning model, the plurality of content viewers into a plurality of viewer cohorts based, at least in part, on the viewer behavior data.
[00112] At the operation 908, the method 900 includes identifying, by the system 150, a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments.
[00113] At the operation 910, the method 900 includes determining, by the system 150, one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules.
[00114] At the operation 912, the method 900 includes determining, by the system 150, a dynamic cohort tolerance threshold of each viewer cohort of the plurality of the viewer cohorts based, at least in part, on analyzing the viewer behavior data and the second media content.
[00115] At the operation 914, the method 900 includes selecting, by the system 150, one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic cohort tolerance threshold of the each viewer cohort.
[00116] At the operation 916, the method 900 includes generating, by the system 150, a new media content record corresponding to the each viewer cohort of the plurality of the viewer cohorts based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers.
[00117] At the operation 918, the method 900 includes generating, by the system 150, a modified manifest for the content repository server for each viewer cohort of the plurality of the viewer cohorts based, at least in part, on the corresponding new media content record.
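As an illustrative, non-limiting sketch of the cohort classification at operation 906, the following Python snippet assumes that the viewer behavior data has been reduced to numeric features (for example, average watch completion and historical ad drop-off counts) and uses k-means clustering as a stand-in for the second machine learning model; it is not the claimed model itself:

# Simplified illustration of the cohort classification at operation 906: the
# numeric features and the use of k-means are assumptions for explanation only.
from sklearn.cluster import KMeans
import numpy as np

viewer_features = np.array([
    [0.90, 8.0],  # high watch completion, tolerates roughly 8 ads per title
    [0.85, 7.0],
    [0.40, 3.0],  # low completion, drops off after roughly 3 ads
    [0.35, 2.0],
])

cohort_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(viewer_features)
print(cohort_labels)  # two cohorts with different ad tolerance (labels 0/1 in either order)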
[00118] FIG. 10 is a simplified block diagram of a Content Delivery Network (CDN) 1000, in accordance with various embodiments of the invention. It is noted that, in a non-limiting example, the CDN of FIG. 1 can be implemented within the CDN 1000. The CDN 1000 refers to a distributed group of servers that are connected via a network (such as the network 1024, which is explained later). The CDN 1000 provides quick delivery of media content to various content viewers subscribed to the content provider platform 108. The CDN 1000 includes a plurality of interconnected servers that may interchangeably be referred to as a plurality of content repository servers or simply as a content repository server, as a whole. The CDN includes an origin CDN server 1022, a public CDN server 1002, a private CDN server 1004, a Telecommunication CDN server (referred to hereinafter as ‘Telco CDN server’) 1006, an Internet Service Provider CDN server (referred to hereinafter as ‘ISP CDN server’) 1008, and a CDN point of presence server (referred to hereinafter as ‘CDN POP server’) 1010, each coupled to, and in communication with (and/or with access to) the network 1024. It is noted that the CDN POP may also be interchangeably referred to as ‘sub-CDN’, ‘subnet CDN’, ‘surrogate CDN’, and ‘CDN sub-box’. Further, two or more components of the CDN 1000 may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the CDN 1000 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.
[00119] The network 1024 may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber-optic network, a coaxial cable network, an infrared (IR) network, a radio frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts illustrated in FIG. 10, or any combination thereof. Various servers within the CDN 1000 may connect to the network 1024 using various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, future communication protocols, or any combination thereof. For example, the network 1024 may include multiple different networks, such as a private network made accessible by the origin CDN server 1022 and a public network (e.g., the Internet, etc.) through which the various servers may communicate.
[00120] The origin CDN server 1022 stores the media content accessed/downloaded from the streaming content provider and/or content producers. The origin CDN server 1022 serves the media content to one or more cache servers which are either located in the vicinity of the content viewer/subscriber or connected to another cache server located in the content viewer’s vicinity. In various examples, cache servers include the public CDN server 1002, the private CDN server 1004, the Telco CDN server 1006, the ISP CDN server 1008, the CDN POP server 1010, and the like.
[00121] The origin CDN server 1022 includes a processing system 1012, a memory 1014, a database 1016, and a communication interface 1018. The processing system 1012 is configured to retrieve and execute programming instructions from the memory 1014 to perform various functions of the CDN 1000. In one example, the programming instructions include instructions for ingesting media content via the communication interface 1018 from a remote database 1020, which may further include one or more data repositories/databases (not shown), to an internal database such as the database 1016. The remote database 1020 is associated with a streaming content provider and/or content producer. In another example, the media content stored within the database 1016 can be served to one or more cache servers via the communication interface 1018 over the network 1024.
[00122] In some examples, the public CDN server 1002 is associated with a public CDN provider which hosts media content among other types of data for different content providers within the same server. The private CDN server 1004 is associated with a private CDN provider (such as a streaming content provider) which hosts media content for serving the needs of its subscribers. The Telco CDN server 1006 is associated with telecommunication service providers which provide content hosting services to various entities such as the streaming content platform. The ISP CDN server 1008 is associated with internet service providers which provide content hosting services to various entities such as the streaming content platform. The CDN POP server 1010 caches content and allows the electronic devices of the content viewers to stream the content. It is noted that the various cache servers download and cache media content from the origin CDN server 1022 and further allow a valid user or content viewer to stream the media content.
[00123] It is noted that, in various embodiments of the present disclosure, the various functions of the remote server can be implemented using any one or more components of the CDN 1000, such as the origin CDN server 1022 and/or one or more cache servers, individually and/or in combination with each other. Alternatively, the system 150 can be communicably coupled with the CDN 1000 to perform the various embodiments or methods described by the present disclosure.
[00124] Various embodiments disclosed herein provide numerous advantages. More specifically, the embodiments disclosed herein suggest techniques for improving the accuracy of identification of a scene boundary in the content. The insertion of ad content at the scene boundary rather than in the middle of the scene improves the user experience and feels less intrusive to the content viewer. Further, considering the dynamic tolerance threshold of the content viewer and the characteristics of the content for determination of the number of advertisements to be inserted in the content, ensures an optimal number of ads being inserted for different types of content viewers that can improve the viewing experience of the content viewer.
[00125] The foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the present invention to the precise forms disclosed, and many modifications and variations are possible in light of the above teaching. The exemplary embodiment was chosen and described in order to best explain the principles of the present invention and its practical application, to thereby enable others skilled in the art to best utilize the present invention and various embodiments with various modifications as are suited to the particular use contemplated.
CLAIMS:
WE CLAIM:
1. A computer-implemented method comprising:
accessing, by a system, viewer behavior data corresponding to a content viewer from a database associated with the system;
accessing, by the system, a first media content and a second media content from a content repository server based, at least in part, on a manifest file, the first media content comprising a plurality of first encoded segments, and the second media content comprising a plurality of second encoded segments;
identifying, by the system, a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments;
determining, by the system, one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules;
determining, by the system, a dynamic tolerance threshold for the content viewer based, at least in part, on analyzing the viewer behavior data and the second media content;
selecting, by the system, one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold;
generating, by the system, a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers; and
generating, by the system, a modified manifest for the content repository server based, at least in part, on the new media content record.
2. The computer-implemented method as claimed in claim 1, wherein the viewer behavior data further comprises at least one of information related to the content viewer indicating a content preference of the content viewer, a language preference of the content viewer, a cast preference of the content viewer, requested media content, gender, age group, nationality, location, e-mail identifier, cart information, URLs, payment history, call logs, chat logs, device identifier, IP address, user profiles, messaging platform information, social media interactions, browser information, time of the day, device operating system (OS), and network provider.
3. The computer-implemented method as claimed in claim 1, further comprises:
receiving, by the system, a playback Uniform Resource Locator (URL) request from the content viewer, the playback URL request indicating a request for the first media content by the content viewer; and
in response to the playback URL request, accessing, by the system, the manifest file from the content repository server, the manifest file comprising one or more URLs associated with the first media content and one or more URLs associated with the second media content.
4. The computer-implemented method as claimed in claim 3, wherein the manifest file further comprises at least one of information related to the first encoded content segments and information related to the second encoded content segments, the information related to the first encoded content segments comprising a number of segments related to the first encoded content, size of each segment, size of each content of the first encoded content, a first media content record indicating an order of streaming of first encoded content, available resolutions for each first encoded content segment and available bitrates for each first encoded content segment, the information related to the second encoded content segments comprising a number of segments related to the second encoded content, size of each segment, size of each content of the second encoded content, a second media content record indicating an order of streaming of second encoded content, available resolutions for each second encoded content segment and available bitrates for each second encoded content segment.
5. The computer-implemented method as claimed in claim 1, wherein identifying the plurality of probable scene boundaries further comprises:
determining, by the system, at least one of a cast, a background, a dialogue, and a sphere of activity within the first encoded segments based, at least in part, on analyzing the first encoded segments;
determining, by the system via a first machine learning model, at least one of a change in the cast, the background, the dialogue, and the sphere of activity based, at least in part, on the determining step; and
determining, by the system, a point of transition from one scene to another scene within the first media content based on at least one of the change in the cast, the background, the dialogue, and the sphere of activity.
6. The computer-implemented method as claimed in claim 1, wherein selecting one or more second position markers from the one or more scene transition markers further comprises:
selecting, by the system, an intermediate scene transition marker from the one or more scene transition markers corresponding to each of the plurality of probable scene boundaries that coincide with the actual scene boundary based, at least in part, on the one or more pre-defined rules; and
setting, by the system, the intermediate scene transition marker corresponding to each of the plurality of probable scene boundaries as the second position marker.
7. The computer-implemented method as claimed in claim 6, wherein selecting the intermediate scene transition marker further comprises:
determining, by the system via a first machine learning model, a confidence score for the one or more scene transition markers based, at least in part, on determining whether the one or more scene transition markers coincide with an actual scene boundary;
assigning, by the system, a color code to the one or more scene transition markers based, at least in part, on the confidence score;
indexing, by the system, the one or more scene transition markers based, at least in part, on the assigned color code; and
selecting, by the system, the intermediate scene transition marker from the one or more scene transition markers based, at least in part, on the assigned color code corresponding to the highest confidence score.
8. The computer-implemented method as claimed in claim 7, wherein indexing the one or more scene transition markers further comprises:
assigning, by the system, an indexing tag to each of the color-coded one or more scene transition markers; and
facilitating, by the system, the content viewer to navigate from one scene to another scene based, at least in part, on the assigned indexing tags.
9. The computer-implemented method as claimed in claim 1, wherein the content repository server is a content delivery network.
10. A computer-implemented method comprising:
accessing, by a system, viewer behavior data comprising data related to a plurality of content viewers from a database associated with the system;
accessing, by the system, a first media content and a second media content from a content repository server based, at least in part, on a manifest file, the first media content comprising a plurality of first encoded segments, and the second media content comprising a plurality of second encoded segments;
classifying, by the system via a second machine learning model, the plurality of content viewers into a plurality of viewer cohorts based, at least in part, on the viewer behavior data;
identifying, by the system, a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments;
determining, by the system, one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules;
determining, by the system, a dynamic cohort tolerance threshold of each viewer cohort of the plurality of viewer cohorts based, at least in part, on analyzing the viewer behavior data and the second media content;
selecting, by the system, one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic cohort tolerance threshold of each viewer cohort;
generating, by the system, a new media content record corresponding to each viewer cohort of the plurality of viewer cohorts based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers; and
generating, by the system, a modified manifest for the content repository server for each viewer cohort of the plurality of viewer cohorts based, at least in part, on the corresponding new media content record.
11. A system, the system comprising:
a memory for storing instructions; and
a processor configured to execute the instructions and thereby cause the system, at least in part, to:
access information related to viewer behavior data corresponding to a content viewer from a database associated with the system;
access a first media content and a second media content from a content repository server based, at least in part, on a manifest file, the first media content comprising a plurality of first encoded segments, and the second media content comprising a plurality of second encoded segments;
identify a plurality of probable scene boundaries based, at least in part, on analysis of the plurality of first encoded segments;
determine one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules;
determine a dynamic tolerance threshold for the content viewer based, at least in part, on analysis of the viewer behavior data and the second media content;
select one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold;
generate a new media content record for the content viewer based, at least in part, on insertion of the one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers; and
generate a modified manifest for the content repository server based, at least in part, on the new media content record.
12. The system as claimed in claim 11, wherein the viewer behavior data further comprises at least one of information related to the content viewer indicating a content preference of the content viewer, a language preference of the content viewer, a cast preference of the content viewer, requested media content, gender, age group, nationality, location, e-mail identifier, cart information, URLs, payment history, call logs, chat logs, device identifier, IP address, user profiles, messaging platform information, social media interactions, browser information, time of the day, device operating system (OS), and network provider.
13. The system as claimed in claim 11, wherein the system is further caused, at least in part, to:
receive a playback URL request from the content viewer, the playback URL request indicating a request for the first media content by the content viewer; and
in response to the playback URL request, access the manifest file from the content repository server, the manifest file comprising one or more URLs associated with the first media content and one or more URLs associated with the second media content.
14. The system as claimed in claim 13, wherein the manifest file further comprises at least one of information related to the first encoded content segments and information related to the second encoded content segments, the information related to the first encoded content segments comprising a number of segments related to the first encoded content, size of each segment, size of each content of the first encoded content, a first media content record indicating an order of streaming of first encoded content, available resolutions for each first encoded content segment and available bitrates for each first encoded content segment, the information related to the second encoded content segments comprising a number of segments related to the second encoded content, size of each segment, size of each content of the second encoded content, a second media content record indicating an order of streaming of second encoded content, available resolutions for each second encoded content segment and available bitrates for each second encoded content segment.
15. The system as claimed in claim 11, wherein to identify the plurality of probable scene boundaries, the system is further caused, at least in part, to:
determine at least one of a cast, a background, a dialogue, and a sphere of activity within the first encoded segments based, at least in part, on analyzing the first encoded segments;
determine via a first machine learning model, at least one of a change in the cast, the background, the dialogue, and the sphere of activity based, at least in part, on the determining step; and
determine a point of transition from one scene to another scene within the first media content based on at least one of the change in the cast, the background, the dialogue, and the sphere of activity.
16. The system as claimed in claim 11, wherein to select one or more second position markers from the one or more scene transition markers, the system is further caused, at least in part, to:
select an intermediate scene transition marker from the one or more scene transition markers corresponding to each of the plurality of probable scene boundaries that coincide with the actual scene boundary based, at least in part, on the one or more pre-defined rules; and
set the intermediate scene transition marker corresponding to each of the plurality of probable scene boundaries as the second position marker.
17. The system as claimed in claim 16, wherein to select the intermediate scene transition marker, the system is further caused, at least in part, to:
determine, via a first machine learning model, a confidence score for the one or more scene transition markers based, at least in part, on a determination of whether the one or more scene transition markers coincide with an actual scene boundary;
assign a color code to the one or more scene transition markers based, at least in part, on the confidence score;
index the one or more scene transition markers based, at least in part, on the assigned color code; and
select the intermediate scene transition marker from the one or more scene transition markers based, at least in part, on the assigned color code corresponding to the highest confidence score.
18. The system as claimed in claim 17, wherein to index the one or more scene transition markers, the system is further caused, at least in part, to:
assign an indexing tag to each of the color-coded one or more scene transition markers; and
facilitate the content viewer to navigate from one scene to another scene based, at least in part, on the assigned indexing tags.
19. The system as claimed in claim 11, wherein the content repository server is a content delivery network.
20. A non-transitory computer-readable storage medium comprising computer-executable instructions that, when executed by at least one processor of a system, cause the system to perform a method comprising:
accessing information related to viewer behavior data corresponding to a content viewer from a database associated with the system;
accessing a first media content and a second media content from a content repository server based, at least in part, on a manifest file, the first media content comprising a plurality of first encoded segments, and the second media content comprising a plurality of second encoded segments;
identifying a plurality of probable scene boundaries based, at least in part, on analyzing the plurality of first encoded segments;
determining one or more scene transition markers corresponding to the first media content based, at least in part, on the plurality of probable scene boundaries and one or more pre-defined rules;
determining a dynamic tolerance threshold for the content viewer based, at least in part, on analyzing the viewer behavior data and the second media content;
selecting one or more second position markers from the one or more scene transition markers based, at least in part, on the dynamic tolerance threshold;
generating a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers; and
generating a modified manifest for the content repository server based, at least in part, on the new media content record.