
Method And System For Adaptive Encoding Of Streaming Content Specific To Content Viewers

Abstract: A system and method for adaptive encoding of streaming content specific to content viewers is disclosed. The system is configured to access historical playback data and viewer behavior data from a database associated with the system. The system is configured to determine, via a machine learning model, an average media playback score for a plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The system is further configured to receive a media content request for accessing first media content and to generate a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score.


Patent Information

Application #
Filing Date
02 April 2024
Publication Number
40/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

Star India Private Limited
Star House, Urmi Estate, 95, Ganpatrao Kadam Marg, Lower Parel (W), Mumbai 400013, Maharashtra, India

Inventors

1. Ramesh V. Panchagnula
Star House, Urmi Estate, 95 Ganpatrao Kadam Marg, Lower Parel West, Mumbai, Maharashtra 400013, India
2. Qian Chang
Unit N711, Floor 7, North Building, Raycom Infotech Park Tower C, No.2 Kexuyuan South Road, Haidian District, Beijing, 100190, P.R.C

Specification

Description:[0001] The present technology generally relates to mechanisms for streaming content and, more particularly, to a method and a system for adaptive encoding of streaming content specific to content viewers.
BACKGROUND

[0002] On-demand video streaming as well as live streaming of content has gained popularity in recent times and, users are increasingly using a variety of user devices to access streaming content. Generally, the streaming content is accessed on user devices using Over-The-Top (OTT) media services (i.e., over the Internet or air).

[0003] Typically, each content to be streamed by the content viewers is encoded as per multiple encoding profiles and stored in a repository associated with a Content Delivery Network (CDN) server. For example, movie content may be encoded using codecs associated with multiple encoding standards, such as H.264, H.265, VP8, VP9, AOMedia Video 1 (AV1), Versatile Video Coding (VVC), and the like. Further, for each encoding standard, the content may be encoded at multiple resolutions such as 360 progressive scan (360p), 480p, 720p, 1080p, and 4K/Ultra-High Definition (UHD) resolution to configure multiple sets of encoding profiles (also known as ‘encoding ladders’). The term ‘encoding profile’ as used herein corresponds to a set of streams corresponding to the content, where each stream is characterized by an encoding bitrate (r), a spatial resolution (s), a frame rate (fps), and a media playback score value (v). For example, video content may be encoded using the H.264 standard at a bitrate of 400 Kbps, a spatial resolution of 360p, a frame rate of 30 fps, and a media playback score of 75 to configure one encoded stream. Similarly, media content may be encoded using the H.264 standard at a bitrate of 500 Kbps, a spatial resolution of 480p, a frame rate of 30 fps, and a media playback score of 80 to configure another encoded stream. Multiple such streams of the media content encoded as per the H.264 standard constitute one ‘encoding profile’. Similarly, video content may be encoded using the H.265 standard at one or more resolutions, frame rates, and media playback scores to generate multiple encoded streams of the video content, thereby constituting another encoding profile. Depending upon factors such as the underlying network conditions and/or attributes associated with a content viewer’s device used for content playback, an appropriate encoding profile is chosen for streaming content from the CDN to the content viewer’s device.
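The profile/ladder structure described in this paragraph can be sketched as a small data model using the two H.264 example streams above. All names here are illustrative and not drawn from the specification:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncodedStream:
    """One stream within an encoding profile: bitrate (r), resolution (s),
    frame rate (fps), and a media playback score."""
    codec: str
    bitrate_kbps: int
    resolution: str
    fps: int
    playback_score: int

# All streams sharing one codec form one 'encoding profile';
# the profiles together form the 'encoding ladder'.
h264_profile = [
    EncodedStream("H.264", 400, "360p", 30, 75),
    EncodedStream("H.264", 500, "480p", 30, 80),
]
encoding_ladder = {"H.264": h264_profile}
```

In practice each codec's profile would carry many more rungs (720p, 1080p, UHD), but the shape of the structure is the same.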

[0004] The streaming content, once encoded as per different encoding profiles, may be segmented into chunks for transmission from the CDN to the content viewer’s device. The sequence of segments corresponding to a content stream encoded as per an encoding profile is configured to generate a rendition of the content on the display screen of the content viewer’s device. In cases where there is a change in network conditions or a change in attributes associated with the device playing the content, the streaming of content has to be switched from one rendition to another. More specifically, the streaming of content has to be adapted from a segment in a sequence of segments corresponding to one encoding profile to the next appropriate segment in a sequence of segments corresponding to another encoding profile. Moreover, switching from one rendition to another must be performed such that a content viewer does not experience any drastic change in content viewing experience before and after switching content stream renditions.
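The switch from one rendition's segment sequence to "the next appropriate segment" of another can be illustrated with a minimal sketch, assuming all renditions are segmented on aligned, equal-length boundaries (a common setup, though not stated here):

```python
import math

def next_segment_index(switch_time_s: float, segment_duration_s: float) -> int:
    """Return the index of the next segment to request from the new
    rendition after a switch occurring at playback time `switch_time_s`."""
    return math.ceil(switch_time_s / segment_duration_s)

# Switching 13.2 s into playback with 4-second segments: segments 0-3 are
# already covered by the old rendition, so segment 4 of the new one is next.
```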

[0005] In many cases, a content viewer’s device or the general underlying conditions may be known in advance, and it may be a waste of resources to maintain several encoding profiles of streaming content for such a content viewer. For example, if a content viewer is associated with a high-end smartphone and lives in a part of a metropolitan city with good network coverage, then maintaining encoding profiles associated with low-resolution and low-bitrate combinations may be a waste of resources. Similarly, if a content viewer is associated with a low-end smartphone incapable of displaying high-resolution content and whose network connection is patchy, then maintaining encoding profiles associated with high-resolution and high-bitrate combinations may not be the most efficient use of resources. In addition to the inefficiency of maintaining a large number of encoding profiles, the frequent switching of content streams from one rendition to another incurs a processing load on the content viewer’s device as well. For example, the content viewer’s device may include a buffer that is configured to fetch segments of encoded content from the CDN in advance and store the fetched segments corresponding to a rendition. When the encoding profile is switched due to underlying network conditions or otherwise, the timestamp of switching the rendition needs to be considered while requesting the next appropriate segment from another rendition. Moreover, the fetched segments in the buffer also need to be appropriately managed (i.e., discarded in full or in part) based on the next segment being fetched from the CDN. The conventional approach for streaming content to such content viewers also fails to consider the dynamic nature of the various playback factors associated with different users.
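The waste described above — maintaining ladder rungs a known device or network can never use — can be sketched as a simple filter. The field names and thresholds are assumptions for illustration:

```python
def prune_ladder(streams, max_height, max_bitrate_kbps):
    """Drop ladder rungs a known device/network can never use, so that
    unusable renditions need not be encoded and stored."""
    def height(resolution):  # "480p" -> 480
        return int(resolution.rstrip("p"))
    return [s for s in streams
            if height(s["resolution"]) <= max_height
            and s["bitrate_kbps"] <= max_bitrate_kbps]

ladder = [
    {"resolution": "360p", "bitrate_kbps": 400},
    {"resolution": "720p", "bitrate_kbps": 1500},
    {"resolution": "1080p", "bitrate_kbps": 3000},
]
# A low-end device on a patchy connection keeps only the lowest rung.
low_end = prune_ladder(ladder, max_height=480, max_bitrate_kbps=800)
```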

[0006] As may be understood, the media playback scores which are conventionally used to generate the multiple encoded streams of video content require complex media playback data points and complex calculations at the viewer’s end, which delays the streaming (watch experience) of video content for users. Further, within the existing architecture, chunk-level scoring is not possible, as different content viewers use different types of user devices. Generally, the machine learning model used for scoring is kept static, and it is difficult to change the settings and parameters used for calculating the media playback score from the viewer’s end. Further, as both the audio content and the video content are muxed in a video segment (chunk), the average bitrate of a segment is calculated incorrectly and, in turn, the media playback score may not be accurate.

SUMMARY

[0007] Various embodiments of the present disclosure provide methods and systems for providing encoded streaming content to content viewers.

[0008] In an embodiment, a computer-implemented method for adaptive encoding of streaming content specific to content viewers is disclosed. The computer-implemented method performed by a system includes accessing historical playback data and viewer behavior data from a database associated with the system. Herein, the historical playback data includes information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period using one or more dynamic playback metrics. The viewer behavior data indicates information related to a behavior of the content viewer during playback of each media content. The method further includes determining via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The method further includes receiving, by the system, a media content request for first media content from the content viewer. The method further includes accessing, by the system, the first media content from a content repository server associated with a Digital Platform Server (DPS), based, at least in part, on the media content request. The method further includes generating a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score. The method further includes generating and transmitting a manifest file to the content viewer based, at least in part, on the set of adapted encoding profiles.
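The sequence of steps in the claimed method can be illustrated with a deliberately simplified sketch. The scoring rule, the behavior weighting, and the four-rung ladder below are placeholders standing in for the machine learning model and adapted encoding profiles of the disclosure:

```python
def adaptive_encoding_flow(historical_playback, viewer_behavior, content_id):
    """Toy end-to-end sketch of the claimed method (placeholder logic)."""
    # Steps 1-2: average media playback score from historical playback data,
    # weighted here by a single behavior signal (an assumed stand-in).
    scores = [entry["playback_score"] for entry in historical_playback]
    avg_score = (sum(scores) / len(scores)) * viewer_behavior.get("completion_rate", 1.0)
    # Steps 3-4: generate a set of adapted encoding profiles for the content.
    ladder = [("360p", 400), ("480p", 500), ("720p", 1500), ("1080p", 3000)]
    if avg_score < 50:
        adapted = ladder[:2]
    elif avg_score < 80:
        adapted = ladder[1:3]
    else:
        adapted = ladder[2:]
    # Step 5: the manifest lists one variant per adapted profile.
    manifest = [f"{content_id}_{res}_{kbps}kbps.m3u8" for res, kbps in adapted]
    return avg_score, manifest
```

A viewer whose history shows consistently high playback scores would thus receive a manifest restricted to the higher rungs, in line with the summary above.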

[0009] In another embodiment, a system for adaptive encoding of streaming content specific to content viewers is disclosed. The system includes a memory and a processor. The memory stores instructions that, when executed by the processor, cause the system to access historical playback data and viewer behavior data from a database associated with the system. Herein, the historical playback data includes information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period. The viewer behavior data indicates information related to a behavior of the content viewer during playback of each media content. The system is further caused to determine, via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The system is further caused to receive a media content request for first media content from the content viewer. The system is further caused to access the first media content from a content repository server associated with a Digital Platform Server (DPS), based, at least in part, on the media content request. The system is further caused to generate a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score. The system is further caused to generate and transmit a manifest file to the content viewer based, at least in part, on the set of adapted encoding profiles.

[0010] In another embodiment, a non-transitory computer-readable storage medium is disclosed. The storage medium includes computer-executable instructions that, when executed by at least one processor of a system, cause the system to perform a method. The method includes accessing historical playback data and viewer behavior data from a database associated with the system. Herein, the historical playback data includes information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period using one or more dynamic playback metrics. The viewer behavior data indicates information related to a behavior of the content viewer during playback of each media content. The method further includes determining, via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The method further includes receiving, by the system, a media content request for first media content from the content viewer. The method further includes accessing the first media content from a content repository server associated with a Digital Platform Server (DPS), based, at least in part, on the media content request. The method further includes generating a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score. The method further includes generating and transmitting a manifest file to the content viewer based, at least in part, on the set of adapted encoding profiles.

[0011] The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.

BRIEF DESCRIPTION OF THE FIGURES

[0012] The advantages and features of the invention will become better understood with reference to the detailed description taken in conjunction with the accompanying drawings, wherein like elements are identified with like symbols, and in which:

[0013] FIG. 1A is an example representation of adaptive encoding of streaming content and delivering content from a streaming content provider to content viewers, in accordance with an example scenario;

[0014] FIG. 1B is an example representation of adaptive bitrate streaming of content from a content provider to the content viewers, in accordance with an example scenario of a prior art;

[0015] FIG. 2 is a block diagram of a system for adaptive encoding of streaming content for content viewers, in accordance with an embodiment of the invention;

[0016] FIG. 3 illustrates a simplified block diagram representation for generating a set of adapted encoding profiles using the adaptive encoding model, in accordance with an embodiment of the present disclosure;

[0017] FIG. 4A is a schematic representation illustrating an example encoding profile used by the system for encoding of streaming of media content for content viewers, in accordance with an embodiment of the invention;

[0018] FIG. 4B is a schematic representation illustrating an example adapted encoding profile used by the system for adaptive encoding of first media content to be streamed to content viewers, in accordance with an embodiment of the invention;

[0019] FIG. 5 shows an example flow diagram of a method for adaptive encoding of streaming content for content viewers, in accordance with an embodiment of the invention;

[0020] FIG. 6 shows an example flow diagram of a method for adaptive encoding of streaming content for content viewers, in accordance with an embodiment of the invention;

[0021] FIG. 7 shows an example flow diagram of a method for determining an average media playback score for the plurality of media contents, in accordance with an embodiment of the invention; and

[0022] FIG. 8 is a simplified block diagram of a multi-Content Delivery Network (CDN), in accordance with various embodiments of the invention.

[0023] The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.

DETAILED DESCRIPTION

[0024] The best and other modes for carrying out the present invention are presented in terms of the embodiments, herein depicted in FIG. 1A, FIG. 1B, FIG. 2, FIG. 3, FIG. 4A, FIG. 4B, and FIG. 5 to FIG. 8. The embodiments are described herein for illustrative purposes and are subject to many variations. It is understood that various omissions and substitutions of equivalents are contemplated as circumstances may suggest or render expedient, but these are intended to cover the application or implementation without departing from the spirit or scope of the invention. Further, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting. Any heading utilized within this description is for convenience only and has no legal or limiting effect.

[0025] Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearance of the phrase “in an embodiment” in various places in the specification does not necessarily all refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.

[0026] Moreover, although the following description contains many specifics for illustration, those skilled in the art will appreciate that many variations and/or alterations to said details are within the scope of the present disclosure. Similarly, although many of the features of the present disclosure are described in terms of each other, or in conjunction with each other, one skilled in the art will appreciate that many of these features can be provided independently of other features. Accordingly, this description of the present disclosure is set forth without any loss of generality to, and without imposing limitations upon, the present disclosure.

[0027] Embodiments of the present disclosure may be embodied as an apparatus, system, method, or computer program product. Accordingly, embodiments of the present disclosure may take the form of an entire hardware embodiment, an entire software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “engine,” “module,” or “system.” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable storage media having computer-readable program code embodied thereon.

[0028] The terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.

OVERVIEW

[0029] Various embodiments of the present disclosure provide methods and systems for providing encoded streaming content to content viewers.

[0030] In an embodiment, a system that may be a digital platform server associated with a content provider is configured to access historical playback data and viewer behavior data from a database associated with the system. The historical playback data indicates information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period. The viewer behavior data indicates information related to a behavior of the content viewer during the playback of each media content. Herein, the media content may include at least a plurality of image frames and the viewer behavior data may include data related to a plurality of content viewers. In various examples, the data related to the plurality of content viewers may further include at least one of content viewer name, age, gender, e-mail identifier, device identifier, Internet Protocol (IP) address, geolocation information, browser information, time of the day, user profiles, social media interactions, user device information, user interaction, language preference, content preference, cast preference, device log data, and the like.

[0031] In another embodiment, the system is configured to determine, via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The system is configured to generate the plurality of adapted encoding profiles for the first media content based, at least in part, on the historical playback data and the viewer behavior data. The system computes a final aggregated bitrate for the plurality of media contents based, at least in part, on aggregating the corresponding aggregated bitrate for each media content of the plurality of media contents. The system determines, via the machine learning model, a behavior profile for the content viewer based, at least in part, on the viewer behavior data, the behavior profile indicating the behavior of the content viewer when a particular media content is streamed using a particular encoding profile. The system determines the average media playback score for the plurality of media contents based, at least in part, on the final aggregated bitrate and the behavior profile.
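The aggregation described above can be sketched numerically. The normalisation ceiling and the single behavior weight below are assumptions for illustration; in the disclosure these roles are played by the machine learning model:

```python
def average_media_playback_score(contents, behavior_weight=1.0):
    """Aggregate per-content bitrates into a final aggregated bitrate,
    then map it to a 0-100 score scaled by a behavior-profile weight."""
    per_content = [sum(c["segment_bitrates_kbps"]) / len(c["segment_bitrates_kbps"])
                   for c in contents]
    final_aggregated_bitrate = sum(per_content) / len(per_content)
    REFERENCE_KBPS = 6000.0  # assumed ceiling for normalisation
    raw_score = min(100.0, 100.0 * final_aggregated_bitrate / REFERENCE_KBPS)
    return raw_score * behavior_weight
```

For two viewed contents averaging 3000 and 6000 Kbps, the final aggregated bitrate is 4500 Kbps, giving a score of 75 against the assumed 6000 Kbps ceiling.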

[0032] Various embodiments of the present disclosure provide multiple advantages and technical effects while addressing technical problems. In particular, the need for updating the video scoring module at the client side, that is, at the content viewer end, is avoided. To that end, the present invention provides an approach for adapting and optimizing the encoding process based on the average media playback score generated at the cloud-based system. Moreover, optimized and adaptive encoding of streaming content ensures that the switching of the content streams from one rendition to another is performed in a manner such that a content viewer does not observe any drastic change in content viewing experience before and after switching content stream renditions. Further, resources are used effectively by the system for encoding based on the average media playback score, thereby providing an enjoyable and seamless content viewing experience to the content viewer.

[0033] FIG. 1A is an example representation 100 for adaptive encoding of streaming content and delivering content from a streaming content provider to a content viewer 102, in accordance with an example scenario.

[0034] The term ‘streaming content provider’ as used herein refers to an entity that holds digital rights associated with digital content, i.e., media content, present within digital video content libraries, and offers the content on a subscription basis using a digital platform and Over-The-Top (OTT) media services, i.e., content is streamed over the Internet to the user devices of the subscribers, i.e., content viewers. A streaming content provider is hereinafter referred to as a ‘content provider’ for ease of description. The content offered by the content provider may be embodied as streaming video content such as live streaming content or on-demand video streaming content. It is noted that though the content offered by the content provider is explained with reference to video content, the term ‘content’ as used hereinafter may not be limited to only video content. Indeed, the term ‘content’ may refer to any media content including but not limited to ‘video content’, ‘audio content’, ‘gaming content’, ‘textual content’, ‘pictorial content’, ‘image content’, ‘webpage’ and any combination of such content offered in an interactive or non-interactive form. Accordingly, the term ‘content’ is also interchangeably referred to hereinafter as ‘media content’ for the purposes of description. Individuals wishing to view/access the media content may subscribe to at least one type of subscription offered by the content provider. Though a content provider is not shown in FIG. 1A, a digital platform server 104, and a content library 106 associated with a content provider are shown in the representation 100.

[0035] The media content offered by the content provider may be embodied as streaming video content such as live streaming content or on-demand video streaming content. Individuals wishing to view/access the content may subscribe to at least one type of subscription offered by the content provider. Accordingly, the terms ‘subscriber’, ‘user’, ‘content viewer’, and ‘viewer’ as interchangeably used herein may refer to a viewer of subscribed content, which is offered by the content provider.

[0036] The representation 100 depicts the content viewer 102 controlling a user device 108 for accessing media content offered by the content provider. It is noted that the content viewer 102 may use one or more user devices, such as a smartphone, a laptop, a desktop, a personal computer, a wearable device, or any spatial computing device to view the content provided by the content provider. The content viewer 102 may have downloaded a software application 110 (hereinafter referred to as an ‘application 110’ or an ‘app 110’) corresponding to at least one content provider on the user device 108. The ‘subscriber’ 102 is also interchangeably referred to hereinafter as a ‘content viewer’ 102.

[0037] In an embodiment, the environment 100 may have a plurality of subscribers (shown as subscriber 102 in FIG. 1A) associated with a plurality of user devices (shown as the user device 108 in FIG. 1A). In at least some embodiments, the term ‘subscriber’ may also include one or more users in addition to the individual subscriber, for example, family members of the subscriber. To that effect, the term ‘subscriber’ as used herein may include one or more users. The user device 108 is utilized by the subscriber 102 for viewing (which also includes accessing and requesting) media content offered by the content provider. It is noted that the user device 108 is depicted as a smartphone for illustration purposes only and other suitable user devices may be used by the subscriber 102 as well. In some non-limiting examples, the user device 108 may include a smartphone, a tablet computer, a handheld computer, a wearable device, a portable media player, a gaming device, a Personal Digital Assistant (PDA), and the like.

[0038] In an illustrative example, to subscribe to the streaming content services offered by the content provider, subscribers such as the subscriber 102 may register with the content provider by creating an online account on the content provider’s portal. As a part of the account creation process, the subscriber 102 may provide personal information, such as age, gender, language preference, content preference, and any other personal preferences to the content provider. Such information may be stored in a subscriber profile along with other account information such as type of subscription, validity date of the subscription, etc., in a database (not shown in FIG. 1A) associated with the content provider. Further, user device-related data is also captured in the subscriber profile during the registration process and stored in the database. The user device-related data may include at least the screen resolution of the user device 108, processing constraints, software constraints, and the like. Further, subscriber meta-data related to the subscriber 102 may be captured regularly by the content provider upon each new login. In a non-limiting example, the subscriber meta-data may include at least subscriber location, available bandwidth data, user device-related data, and the like. The user device-related data includes at least the screen resolution of the user device 108, processing constraints, software constraints, and the like. In a non-limiting example, the subscriber profile is generated based on the subscriber meta-data by an AI or ML model for each login session or over the duration of the subscription.

[0039] Once the subscriber 102 has created the account, the subscriber 102 may access a User Interface (UI) of the mobile application or the Web application associated with the content provider to view/access content. It is understood that the user device 108 may be in operative communication with a communication network, such as the Internet, enabled by a network provider, also known as the network 112 (such as an Internet service provider (ISP) network). The user device 108 may connect to the network 112 using a wired network, a wireless network, or a combination of wired and wireless networks. Some non-limiting examples of wired networks may include the Ethernet, the Local Area Network (LAN), a fiber-optic network, and the like. Some non-limiting examples of wireless networks may include Wireless LAN (WLAN), cellular networks, Bluetooth or ZigBee networks, and the like.

[0040] The user device (such as the user device 108) may fetch a Web interface associated with the content provider over the network 112 and cause a display of the Web interface on a display screen of the user device 108. In an illustrative example, the Web interface may include a plurality of content titles corresponding to the content offered by the content provider to its subscriber (such as subscriber 102).

[0041] In one illustrative example, the content viewer 102 may access a Web interface associated with the application 110 provided by the content provider on the user device 108. It is understood that the user device 108 may be in operative communication with a cloud network 112, such as the Internet, enabled by a network provider, also known as an Internet Service Provider (ISP). The user device 108 may connect to the cloud network 112 using a wired network, a wireless network, or a combination of wired and wireless networks. Some non-limiting examples of wired networks may include the Ethernet, the Local Area Network (LAN), a fiber-optic network, and the like. Some non-limiting examples of wireless networks may include Wireless LAN (WLAN), cellular networks, Bluetooth or ZigBee networks, and the like. In one embodiment of the environment 100, the digital platform server 104 and the content library 106 associated with the content provider are provided in the cloud network 112.

[0042] The user device 108 may fetch the Web interface associated with the application 110 over the cloud network 112 and cause a display of the Web interface on a display screen (not shown) of the user device 108. In an illustrative example, the Web interface may display a plurality of content titles corresponding to the media content offered by the content provider to its consumers, i.e., the content viewers.

[0043] In an illustrative example, the content viewer 102 may select a content title from among the plurality of content titles displayed on the display screen of the user device 108. The selection of the content title may trigger a request for a playback Uniform Resource Locator (URL). The request for the playback URL is sent from the user device 108 to the digital platform server 104 associated with the content provider. In at least some embodiments, the digital platform server 104 may include at least one of a Content Management System (CMS) and a User Management System (UMS) for facilitating the streaming of digital content from the content library 106 of the content provider to a plurality of users, such as the content viewer 102. The digital platform server 104 is configured to authenticate the content viewer 102 and determine if the content viewer 102 is entitled to view the requested content. To this effect, the digital platform server 104 may be in operative communication with one or more remote servers, such as an authentication server and an entitlement server. The authentication server and the entitlement server are not shown in FIG. 1A. The authentication server may facilitate the authentication of viewer account credentials using standard authentication mechanisms, which are not explained herein. The entitlement server may facilitate the determination of the viewer’s subscription type (i.e., whether the user has subscribed to regular or premium content) and status (i.e., whether the subscription is still active or expired), which in turn may enable the determination of whether the content viewer 102 is entitled to view/access the requested content or not.

[0044] Further, the digital platform server 104 extracts an Autonomous System Number (ASN) and an IP address from the playback URL request and identifies at least one Content Delivery Network (CDN) Point of Presence (PoP) in the proximity of the location of the content viewer 102. As an illustrative example, three CDN PoPs, shown as a CDN PoP 116A, a CDN PoP 116B, and a CDN PoP 116C in FIG. 1A, are depicted as the CDN PoPs 116 identified in the proximity of the location of the content viewer 102.

[0045] The CDN PoPs 116A, 116B, and 116C are hereinafter collectively referred to as CDN Points of Presence (PoPs) 116. The digital platform server 104 performs a check to determine if the content associated with the requested content title is cached in at least one CDN PoP from among the CDN PoPs 116. It is noted that the requested content may have been cached from the content library 106 of the content provider to one or more CDN PoPs from among the CDN PoPs 116. If the content is not cached in the CDN PoPs 116, the digital platform server 104 checks whether any other CDN or CDN PoP in the vicinity of the CDN PoPs 116 is caching the content associated with the requested content title. If the content is cached in one or more CDN PoPs from among the CDN PoPs 116 and/or in any other CDN/CDN PoP, the digital platform server 104 identifies an optimal CDN PoP taking into account the location of the user, a content ID, and performance metrics associated with the plurality of CDNs/CDN PoPs.
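By way of a non-limiting sketch, the identification of an optimal CDN PoP described above may be modeled as a cost ranking over PoPs that cache the requested content ID; the class fields, the distance threshold, and the cost function below are illustrative assumptions and not part of the disclosure:

```python
from dataclasses import dataclass

@dataclass
class CdnPop:
    name: str
    has_content: bool       # whether the requested content ID is cached here
    distance_km: float      # proximity to the viewer's resolved location
    avg_latency_ms: float   # historical performance metric for this PoP

def pick_optimal_pop(pops, max_distance_km=500.0):
    """Return the cached PoP with the lowest combined cost, weighing
    proximity and measured latency; None if nothing is cached nearby."""
    candidates = [p for p in pops if p.has_content and p.distance_km <= max_distance_km]
    if not candidates:
        return None
    # Hypothetical cost: normalize both terms and sum them.
    return min(candidates, key=lambda p: p.distance_km / max_distance_km
                                         + p.avg_latency_ms / 100.0)

pops = [
    CdnPop("116A", True, 120.0, 35.0),
    CdnPop("116B", True, 300.0, 20.0),
    CdnPop("116C", False, 50.0, 10.0),   # nearest, but content not cached
]
best = pick_optimal_pop(pops)
print(best.name)
```

If no candidate caches the content, the function returns None, which corresponds to the fallback in paragraph [0046] of caching the content to the nearest PoP.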

[0046] In the event that the content associated with the requested content title is not cached with any of the CDNs/ CDN PoPs, the digital platform server 104 may be configured to cache the content from the content library 106 to a CDN PoP nearest to a location of the content viewer 102. The digital platform server 104 is configured to encode the content as per multiple encoding profiles and cause storage of the plurality of encoded versions of the content in the CDN PoP nearest to the location of the content viewer 102. It is noted that the content, and especially high-resolution video content, needs to be compressed to optimize bandwidth usage, and, therefore, the content is encoded (or compressed) before storage in the CDN PoP. As the digital platform server 104 is not aware of the underlying network conditions or the playback quality currently being experienced by the content viewer 102, the digital platform server 104 may encode the content using codecs associated with multiple encoding standards, such as H.264, H.265, VP8, VP9, AOMedia Video 1 (AV1), Versatile Video Coding (VVC), and the like. Further, for each encoding standard, the content may be encoded at multiple resolutions, such as 360p, 480p, 720p, 1080p, and 4K/Ultra-High Definition (UHD) resolution, to configure multiple sets of encoding profiles (also known as ‘encoding ladders’). The term ‘encoding profile’ as used herein corresponds to a set of streams corresponding to a content, where each stream is characterized by an encoding bitrate (r), a spatial resolution (s), a frame rate (fps), and a media playback score value (v). For example, video content may be encoded using the H.264 standard at a bit rate of 400 Kbps, a spatial resolution of 360p, a frame rate of 30 fps, and a media playback score of 75 to configure one encoded stream.
Similarly, video content may be encoded using the H.264 standard at a bit rate of 500 Kbps, a spatial resolution of 480p, a frame rate of 30 fps, and a media playback score of 80 to configure another encoding stream. Multiple such streams of content encoded as per H.264 standard configure one ‘encoding profile’. Similarly, video content may be encoded using the H.265 standard at one or more resolutions, frame rates, and media playback scores to generate multiple encoded streams of video content, thereby configuring another encoding profile.
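In a non-limiting sketch, an encoding profile of the kind described above can be represented as a set of streams, each carrying the (r, s, fps, v) tuple; the class name is assumed, and the 720p rung below is an illustrative addition beyond the two streams given in the example:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stream:
    bitrate_kbps: int   # encoding bitrate (r)
    resolution: str     # spatial resolution (s)
    fps: int            # frame rate (fps)
    score: int          # media playback score value (v)

# One 'encoding profile' (ladder) per encoding standard: a set of streams
# of the same codec at different rungs.
h264_profile = [
    Stream(400, "360p", 30, 75),
    Stream(500, "480p", 30, 80),
    Stream(1400, "720p", 30, 85),   # illustrative extra rung, not from the text
]
encoding_profiles = {"H.264": h264_profile}
```

An H.265 profile would be configured the same way as a second entry of `encoding_profiles`, giving the multiple sets of encoding profiles (encoding ladders) referred to above.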

[0047] Each content to be streamed to the content viewer 102 is encoded as per multiple encoding profiles and stored in a repository associated with a content repository server (e.g., Content Delivery Network (CDN) server) 114. As mentioned above, the digital platform server 104 may encode the content as per multiple encoding profiles and cache the content in the CDN PoP (e.g., the CDN PoP 116A). The digital platform server 104 provides a playback URL identifying the network/IP address of the CDN PoP 116A in which the content is cached to the user device 108 of the content viewer 102. The content provider provides the playback URL and a modified manifest to the user device 108. The playback URL identifies a content repository server, such as the CDN 114 (shown in FIG. 1A), which is caching the content corresponding to a content title selected by the content viewer 102 for playback. The user device 108 is configured to generate a Hypertext Transfer Protocol (HTTP) request using the playback URL and provide the HTTP request over the communication network 112 to the CDN PoP 116A to fetch the requested content and display the content to the content viewer 102 on a display screen of the user device 108.

[0048] The user device 108 may employ adaptive bitrate (ABR) streaming to fetch the content from the CDN PoP 116A. More specifically, the user device 108 may first identify an encoding profile from among the multiple encoding profiles based on the current network condition. For example, if the network condition is poor, the user device 108 may fetch an encoded stream at a lower resolution, such as 360p or 480p, whereas the user device 108 may fetch an encoded stream at a higher resolution, such as 1080p, if the network quality is excellent and if the screen size and resolution of the user device 108 support such a resolution. Further, the encoded stream may be fetched at a bitrate that minimizes or avoids buffering or stalling of the playback, or severe degradation in the quality of playback. For example, a bit rate for fetching a slice of encoded content may be reduced from 1400 Kbps to 500 Kbps if the network quality is poor. Moreover, the bitrate may be dynamically adapted as per fluctuations in the network throughput.
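A minimal sketch of the adaptive bitrate selection described above, assuming a simple safety-margin rule (the 0.8 factor and the ladder values are illustrative, not taken from the disclosure):

```python
def select_stream(ladder, measured_kbps, safety=0.8):
    """Pick the highest-bitrate rung whose bitrate fits within a safety
    margin of the measured throughput; fall back to the lowest rung."""
    affordable = [s for s in ladder if s[0] <= measured_kbps * safety]
    if affordable:
        return max(affordable, key=lambda s: s[0])
    return min(ladder, key=lambda s: s[0])

# (bitrate_kbps, resolution) pairs for an illustrative ladder
ladder = [(400, "360p"), (500, "480p"), (1400, "720p"), (3000, "1080p")]
print(select_stream(ladder, 2000))   # fair network: fetches the 720p rung
print(select_stream(ladder, 300))    # very poor network: lowest rung
```

Re-running the selection on each segment request as throughput fluctuates yields the dynamic bitrate adaptation described above, including the example step-down from 1400 Kbps to 500 Kbps.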

[0049] The aforementioned process of delivering content, though useful, is suboptimal and is associated with several drawbacks. For example, a CDN PoP caches a large number of encoded streams corresponding to each content irrespective of the fact that the user device 108 may request content as per only a few encoding profiles. For example, if a user stays in a location that generally has excellent network connectivity, then caching encoded streams as per all available encoding profiles is sub-optimal. Similarly, if a user is known to always use a Wi-Fi connection instead of a cellular network connection for watching content, then caching content as per all available encoding profiles may be sub-optimal. Moreover, in some cases where the device switches between playing different renditions, i.e., streams of different encoding profiles, the timestamps at which a switch occurs from one segment in a rendition to a subsequent segment in a different rendition have to match to provide a seamless experience. In some cases, the user device 108 may avoid playing streams of a rendition that were previously loaded in a buffer of the user device 108, in order to match the playback position (i.e., starting point) of the other rendition to which the user device 108 switched for playing content due to network issues. Further, such switching increases the processing load on content providers and also increases the consumption of bandwidth for streaming the content to the user device 108 of the content viewer 102.

[0050] FIG. 1B is an example representation of adaptive bitrate streaming 150 of content from a content provider to the content viewers, in accordance with an example scenario of the prior art. The CDN PoP (e.g., 116A) caches a large number of encoded streams corresponding to each content irrespective of the fact that the user device 108 may request content as per only a few encoding profiles. A set of video renditions 152 and a set of audio renditions 154 that are cached by the CDN PoP (e.g., 116A) are shown in FIG. 1B. The set of video renditions 152 includes video renditions 160, 162, 164, 166, and 168, and the set of audio renditions 154 includes audio renditions 170 and 172. In adaptive bitrate streaming 150 of the content, segments in each selected video rendition 160, 162, 164, and 166 form a video path 174, and segments in each selected audio rendition 170 and 172 form an audio path 176. It should be noted that the video rendition 168 is not used in adaptive bitrate streaming 150, as the user device is not suited to such high bitrates. In the present disclosure, encoding of subsequent streaming of the media content that was already viewed by the content viewer is optimized based on the calculated media score and the viewer behavior data, and only renditions specific to the content viewer are cached in the CDN PoP (e.g., 116A).

[0051] The representation 100 also includes a system 118 associated with the digital platform server 104. The system 118, via the application 110 installed in the user device 108, monitors the behavior of the content viewer 102 at the time of streaming the content and generates viewer behavior data based on the activities performed by the content viewer 102. The viewer behavior data is stored in one or more databases associated with the system 118. The viewer behavior data may include, but is not limited to, pausing playback, skipping advertisements, seeking to desired content, and the like. In an embodiment, the system 118 is configured to access information related to viewer behavior data corresponding to the content viewer 102 from the database associated with the system 118. In an example, the information related to the viewer behavior data may include at least one of information indicating a content preference of the content viewer 102, a language preference of the content viewer 102, a cast preference of the content viewer 102, requested media content, gender, age group, location, e-mail identifier, device identifier, IP address, user profiles, user interactions, device log data, messaging platform information, social media interactions, browser information, time of the day, device Operating System (OS), and network provider.

[0052] In one embodiment of the invention, the system 118 may be implemented as a part of the digital platform server 104 in the cloud network 112, instead of on the client side, that is, at the user device 108 of the content viewer 102. The system 118 is used to calculate an average media playback score of the media content accessed by the content viewer 102 using a machine learning model 120. It should be noted that the average media playback score calculated by the machine learning model 120 is specific to the content viewer. In conventional systems, the average media playback score of the media content is calculated on the client side, which creates a delay in the score calculation process. In the present invention, as the system 118 can be implemented in the cloud network 112, the delay in calculating the average playback score can be prevented. Further, the score parameters, score criteria, etc., that are used in the scoring process can be changed more easily on the cloud side than with the client-side scoring of conventional systems. The machine learning model 120 determines an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The historical playback data indicates information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by the content viewer 102 over a predefined time period. The viewer behavior data indicates information related to a behavior of the content viewer 102 during the playback of each media content.
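The average media playback score computed on the cloud side may be sketched as follows; the duration-weighted mean and the record fields are assumptions for illustration only, as the disclosure leaves the exact aggregation to the machine learning model 120:

```python
def average_playback_score(history):
    """history: per-playback records, each holding the score (v) of the
    stream actually delivered and the seconds watched. A duration-weighted
    mean keeps long sessions from counting the same as brief samples."""
    total = sum(rec["watched_s"] for rec in history)
    if total == 0:
        return 0.0
    return sum(rec["score"] * rec["watched_s"] for rec in history) / total

# Illustrative history over the predefined time period
history = [
    {"score": 75, "watched_s": 600},    # ten minutes at the 360p rung
    {"score": 85, "watched_s": 1800},   # thirty minutes at the 720p rung
]
print(round(average_playback_score(history), 1))
```

Because the computation runs server-side, the weighting scheme can be swapped out centrally, which is the flexibility over client-side scoring noted above.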

[0053] In various non-limiting examples, the machine learning model 120 may include but is not limited to at least one of a Linear regression model, a Support Vector Machine (SVM) model, a decision tree-based model, a random forest model, a reinforcement learning model, a supervised learning model, an unsupervised learning model, a logistic regression model, a K-Nearest Neighbor (KNN) model, a Naive Bayes model, a neural network based model, a gradient boosting model, a Principal Component Analysis (PCA) model, a decision tree regression model, an ordinary least squares algorithm based model, a polynomial regression model, an Adaptive Boosting (AdaBoost) model, an Apriori algorithm based model, a classification model, a clustering data model, a deep belief network-based model, and so on.

[0054] In one embodiment, the system 118 receives a media content request for first media content (i.e., subsequent media content to be watched by the content viewer 102) from the content viewer 102. The system 118 accesses the first media content from the Content Delivery Network (CDN) associated with the Digital Platform Server (DPS) 104 based, at least in part, on the media content request. The Digital Platform Server (DPS) 104 generates a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score. The Digital Platform Server (DPS) 104 generates and transmits a manifest file to the content viewer 102 based, at least in part, on the set of adapted encoding profiles. In one embodiment, the system 118 can access the first media content from the content library 106 and ingest the first media content to the content repository server, i.e., the CDN 114. In some embodiments, the first media content accessed from the CDN 114 is re-transcoded for the adapted encoding profiles. In case the adapted encoding profiles are mere re-arrangements of existing encoding profiles, re-transcoding may not be required.

[0055] The encoding of “subsequent streaming of the media content” (also referred to as first media content) is optimized based on the calculated media score and the viewer behavior data. The quality score provides the assessment result representing the encoding quality of the media content previously accessed by the content viewer. The encoding of the first media content is performed based on the average media playback score. This provides optimally and adaptively encoded media content to the content viewer 102. More specifically, the system 118 determines a set of optimal bitrates for the first media content based, at least in part, on the average media playback score. The set of adapted encoding profiles for the first media content is generated based, at least in part, on the set of optimal bitrates.
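A non-limiting sketch of deriving the set of optimal bitrates from the average media playback score follows; keeping only rungs whose score falls within a band around the viewer's average, and the band width of ±10, are assumed tuning choices not taken from the disclosure:

```python
def adapted_bitrates(full_ladder, avg_score, band=10):
    """full_ladder: (bitrate_kbps, score) rungs available for the content.
    Keep only rungs whose score lies within `band` of the viewer's average
    playback score, so the CDN caches just the renditions this viewer is
    likely to request; never return an empty ladder."""
    kept = [bitrate for bitrate, score in full_ladder
            if abs(score - avg_score) <= band]
    return kept or [min(full_ladder)[0]]

# Illustrative full ladder: four rungs with their playback scores
ladder = [(400, 70), (500, 75), (1400, 82), (3000, 90)]
print(adapted_bitrates(ladder, 82.5))
```

The resulting bitrate list would then drive the set of adapted encoding profiles and, in turn, the manifest file transmitted to the content viewer 102.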

[0056] Various embodiments of the present invention provide a system and a method for providing adaptively encoded streaming content to content viewers. The encoding of the streaming content is optimized to make efficient use of resources on the CDN side and incur less processing load on the content viewer’s device side. Moreover, optimized encoding of streaming content ensures that the switching of the content streams from one rendition to another is performed in a manner such that the content viewer 102 does not observe any drastic change in content viewing experience before and after switching content stream renditions. Further, the streaming content is personalized based on camera metadata in real-time to provide an enjoyable and seamless content viewing experience to the users. An example system 200 for optimally and adaptively encoding content for content viewers is explained next with reference to FIG. 2.

[0057] FIG. 2 illustrates a block diagram of a system 200 configured to adaptively encode media content (i.e., raw media content 234 files, also referred to as first media content) to generate encoded media content, in accordance with an embodiment of the present disclosure. It is noted that the system 200 is identical to the system 118 of FIG. 1A. In some embodiments, the system 200 is embodied as a cloud-based and/or SaaS-based (software as a service) architecture.

[0058] The system 200 is depicted to include a processor 202 (also referred to as processing module 202), a memory module 204, an input/output (I/O) module 206, and a communication module 208. It is noted that although the system 200 is depicted to include the processing module 202, the memory module 204, the input/output (I/O) module 206, and the communication module 208, in some embodiments, the system 200 may include more or fewer components than those depicted herein. The various components of the system 200 may be implemented using hardware, software, firmware, or any combination thereof. Further, it is also noted that one or more components of the system 200 may be implemented in a single server or a plurality of servers, which are remotely placed from each other.

[0059] In one embodiment, the processing module 202 may be embodied as a multi-core processor, a single-core processor, or a combination of one or more multi-core processors and one or more single-core processors. For example, the processing module 202 may be embodied as one or more of various processing devices, such as a coprocessor, a microprocessor, a controller, a Digital Signal Processor (DSP), a processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an Application Specific Integrated Circuit (ASIC), a field programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. In one embodiment, the memory module 204 is capable of storing machine-executable instructions, referred to herein as platform instructions 210. Further, the processing module 202 is capable of executing the platform instructions 210. In an embodiment, the processing module 202 may be configured to execute hard-coded functionality. In an embodiment, the processing module 202 is embodied as an executor of software instructions, wherein the instructions may specifically configure the processing module 202 to perform the algorithms and/or operations described herein when the instructions are executed. For example, in at least some embodiments, each component of the processing module 202 may be configured to execute instructions stored in the memory module 204 for realizing respective functionalities, as will be explained in further detail later.

[0060] In an embodiment, the I/O module 206 may include mechanisms configured to receive inputs from and provide outputs to an operator of the system 200. The term ‘operator of the system 200’ as used herein may refer to one or more individuals, whether directly or indirectly associated with managing the digital OTT platform on behalf of the content provider. To enable the reception of inputs and provide outputs to the system 200, the I/O module 206 may include at least one input interface and/or at least one output interface. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, a microphone, and the like. Examples of the output interface may include but are not limited to, a display such as a light-emitting diode display, a thin-film transistor (TFT) display, a liquid crystal display, an Active-Matrix Organic Light-Emitting Diode (AMOLED) display, a speaker, a ringer, and the like. In an example embodiment, at least one module of the system 200 may include an I/O circuitry (not shown in FIG. 2) configured to control at least some functions of one or more elements of the I/O module 206, such as, for example, a speaker, a microphone, a display, and/or the like. The processing module 202 of the system 200 and/or the I/O circuitry may be configured to control one or more functions of the elements of the I/O module 206 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the memory module 204, and/or the like, accessible to the processing module 202 of the system 200.

[0061] In some embodiments, the processing module 202 and/or other components of a content encoder module 214 may access the memory module 204 using a storage interface (not shown in FIG. 2). The storage interface may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processing module 202 and/or the content encoder module 214 with access to the memory module 204.

[0062] The memory module 204 is any computer-operated hardware suitable for storing and/or retrieving data. In one embodiment, the memory module 204 is configured to store a set of scoring rules, a set of viewer cohorts, encoding profiles for different scores, playback event logs for a plurality of users, reconstruction rules/policies, and the like. The memory module 204 may include multiple storage units such as hard drives and/or solid-state drives in a Redundant Array of Inexpensive Disks (RAID) configuration. In some embodiments, the memory module 204 may include a Storage Area Network (SAN) and/or a Network Attached Storage (NAS) system. In one embodiment, the memory module 204 may correspond to a distributed storage system, wherein individual databases are configured to store custom information, such as user playback event logs.

[0063] The various components of the system 200, such as the processing module 202, the memory module 204, the I/O module 206, and the communication module 208, are configured to communicate with each other via or through a centralized circuit system 212. The centralized circuit system 212 may be various devices configured to, among other things, provide or enable communication between the components of the system 200. In certain embodiments, the centralized circuit system 212 may be a central Printed Circuit Board (PCB) such as a motherboard, a main board, a system board, or a logic board. The centralized circuit system 212 may also, or alternatively, include other Printed Circuit Assemblies (PCAs) or communication channel media.

[0064] In some embodiments, the system 200 is associated with a database 218. The database 218 may be integrated within the memory module 204 as well. For example, the memory module 204 may include one or more hard disk drives as the database 218. It is understood that the media content received from the content ingestion server may correspond to live streaming content or video-on-demand content. In a non-limiting example, the media content may be stored in the database 218 associated with the system 200. In one embodiment, the media content may correspond to a video received from an image capture device, such as a camera. The received media content may include a plurality of image frames and a plurality of audio frames, for example, a sequence of image frames and a sequence of audio frames. In addition to the main playback content, in some example embodiments, the plurality of image frames that are received related to the media content may include image frames capturing content captured from multiple camera angles of the same scene or different picture exposure options. Similarly, in other example embodiments, the plurality of audio frames that are received related to the media content may include audio frames capturing content captured from multiple audio sources within the same environment where the plurality of image frames is captured.

[0065] As such, a scene of the media content may have multiple captures or in simple terms ‘multiple image frames’ that can be provisioned for the content viewer 102 consecutively to depict the said scene. In one illustrative example, an action thriller movie may include a scene in which an actor pursued by two different groups of people is seen hanging precariously from a cliff and may be captured from different camera angles, for example, a first camera angle in which a first group of people is seen trying to help the actor and a second camera angle in which the second group of people is trying to push the actor down the cliff. Information related to multiple camera angles for a scene may be referred to herein as ‘camera metadata’ and may be received as part of the media content. For example, sets of image frames that correspond to one scene are arranged in parallel, and information such as the number of such sets, the number of image frames in each set, a brief description related to each set (e.g., cast, plot, angle information, exposure information, etc.), and the like, configures the camera metadata, which may be received as part of the media content in some cases.

In some example embodiments, the communication module 208 receives the historical playback data and the online viewer behavior data related to a viewer. The term ‘online viewer behavior’ (also referred to as ‘viewer behavior data’) as used herein primarily refers to characteristics or attributes of an individual viewer and may include information related to historical data related to past video content views, social media interactions, viewer audio preference, viewer audio experience, viewer preferences, and the like. Additionally, the online viewer behavior may also include personal information of the content viewer 102 (e.g., name, age, gender, user interaction, device log data, and the like), cart information, URLs, transaction information, and the like. Such information may be received from web servers hosting and managing third-party websites, remote data gathering servers tracking viewer activity on a plurality of enterprise websites, a plurality of interaction channels (for example, websites, native mobile applications, social media, etc.), and a plurality of devices. In addition, the online viewer behavior data may also include the viewer metadata received along with the request for playback URL such as, device identifier (ID), Internet Protocol (IP) address, geolocation information, browser information, time of the day, device identifiers, user profiles, social media interactions, user device information such as, device type, device Operating System (OS), device browser, browser cookies, and the like, along with subscriber information such as, age group, gender, language preference, content preference, cast preference, and any other preference of the content viewer 102 provided as a part of registration.

[0066] The historical playback data indicates information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer 102 over a predefined time period. The term ‘historical playback data’ as used herein primarily refers to characteristics or attributes related to the set of encoding profiles used for encoding the media content specific to the content viewer 102. Over the predefined time period (e.g., one month), the content viewer 102 would have accessed the plurality of media contents, and each media content is encoded with the set of encoding profiles before delivering it to the content viewer. The historical playback data corresponds to all the sets of encoding profiles of the plurality of media contents accessed over the predefined time period using one or more dynamic playback metrics. The term “predefined time period” refers to the most recent playback data viewed by the content viewer. The term ‘media content’ may refer to any media content including but not limited to ‘video content’, ‘audio content’, ‘gaming content’, ‘textual content’, ‘pictorial content’, ‘image content’, ‘webpage’, and any combination of such content offered in an interactive or non-interactive form. The dynamic playback metrics indicate information related to at least one system parameter and at least one network parameter used in the set of encoding profiles. The dynamic playback metrics are specific to the user, and the encoding of media content is based on such dynamic playback metrics. In one example, the system parameter may include the capacity of the user device 108 to receive, process, and deliver the media content requested by the content viewer 102 of the user device 108. The system parameter may include but is not limited to the processing speed of the user device 108, bitrate, quality of video content, etc.
The network parameter may include the capacity of the network 112 to receive media content from the content library 106 and deliver the media content to the user device 108. In one example, the network parameter may include but is not limited to bandwidth, latency, packet loss, firewalls, transfer protocols, etc. The dynamic playback metrics will be used by the machine learning model 120 to determine which encoding profile best suits the content viewer 102 while delivering the first media content.
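The dynamic playback metrics described above may be represented, in a non-limiting sketch, as a record of system and network parameters; all field names and values below are illustrative assumptions:

```python
# A minimal record of the dynamic playback metrics: system parameters of
# the user device and network parameters of the delivery path.
metrics = {
    "system": {
        "cpu_mhz": 2400,           # processing speed of the user device
        "max_bitrate_kbps": 3000,  # highest bitrate the device can sustain
        "screen_resolution": "1080p",
    },
    "network": {
        "bandwidth_kbps": 1800,
        "latency_ms": 45,
        "packet_loss_pct": 0.3,
    },
}

def deliverable_kbps(m):
    """The bitrate that can actually be delivered is bounded by both the
    device capacity and the network capacity."""
    return min(m["system"]["max_bitrate_kbps"], m["network"]["bandwidth_kbps"])

print(deliverable_kbps(metrics))
```

Records of this shape, gathered over the predefined time period, are the inputs from which the machine learning model 120 can judge which encoding profile best suits the content viewer 102.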

[0067] The communication module 208 may be configured to forward such inputs (i.e., the historical data and the behavior data) to the processing module 202, or more specifically, to the content encoder module 214. The content encoder module 214 in conjunction with the instructions stored in the memory module 204 is configured to process such inputs (i.e., the media content and the online viewer behavior data) to generate a media rendition record. The media rendition record includes information related to streaming the video content such as, but not limited to, a number of image frames in the video content, a number of image portions in each image frame, the stream of encoded image portions, the sequence number for each image portion in an image frame, and the like.

[0068] In an embodiment, the communication module 208 may include a communication circuitry such as for example, a transceiver circuitry including an antenna and other communication media interfaces to facilitate communication between the system 200 and one or more remote entities such as, the content repository server, i.e., the CDN 114 over a communication network (not shown in FIG. 2).

[0069] The content encoder module 214 has a data-preprocessing sub-module 222, a model training sub-module 224, and a score generation sub-module 226, each configured to encode each media content based on the set of encoding profiles. The term ‘encoding profile’ as used herein corresponds to a set of streams corresponding to the content, where each stream is characterized by an encoding bitrate (r), a spatial resolution (s), a frame rate (fps), and a media playback score value (v). Each encoding profile may contain at least one encoding standard at a resolution, an encoding bitrate, a spatial resolution, a frame rate, a media playback score value, etc.

[0070] In at least one embodiment, the content encoder module 214 includes suitable logic and/or is configured to adaptively encode the raw media content 234 based on one or more media content-related characteristics. The adaptive encoding process is collectively carried out by the various sub-modules of the content encoder module 214. It is noted that while adaptively encoding the raw media content 234, either one, a combination, or all of the sub-modules of the content encoder module 214 can be utilized by the system 200, as per the requirements of the content provider. In a non-limiting example, the decision to use one or more of the sub-modules of the content encoder module 214 may be based on the subscriber profile of the content viewer 102. In another non-limiting example, the decision to use one or more of the sub-modules of the content encoder module 214 may be based on the analysis of a plurality of subscriber profiles associated with a plurality of subscribers. This analysis may be performed on a per-region basis; for example, a first set of sub-modules may be used for the India region, while a second set of sub-modules may be used for a different/non-Indian region. Other analysis approaches may also be used while determining which sub-modules to use during the adaptive encoding process, and the same would be covered under the scope of the present disclosure. Such pluralities of subscribers based on a specific region are also referred to as region-specific viewer cohorts.
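The region-specific selection of sub-modules may be sketched as a simple dispatch table; the region keys and sub-module sets below are illustrative assumptions, not part of the disclosure:

```python
# Hypothetical mapping from a viewer cohort's region to the encoder
# sub-modules that run for that cohort.
COHORT_SUBMODULES = {
    "india": ["data_preprocessing", "score_generation"],
    "default": ["data_preprocessing", "model_training", "score_generation"],
}

def submodules_for(region):
    """Return the sub-module set for a region-specific viewer cohort,
    falling back to the default set for unlisted regions."""
    return COHORT_SUBMODULES.get(region.lower(), COHORT_SUBMODULES["default"])

print(submodules_for("India"))
print(submodules_for("EU"))
```

Other keys, such as the subscriber profile of an individual content viewer, could select sub-modules in the same fashion.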

[0071] The data-preprocessing sub-module 222 is configured to access the historical playback data and the viewer behavior data from a database (not shown in FIG. 2) associated with the system 200. In one embodiment, the database is configured to store the historical playback data and the viewer behavior data. The historical playback data indicates information related to the set of encoding profiles corresponding to each media content of a plurality of media contents viewed by the content viewer 102 over the predefined time period using one or more dynamic playback metrics. The viewer behavior data indicates information related to the behavior of the content viewer 102 during playback of each media content.

[0072] In at least some embodiments, the operator of the system 200 may use the I/O module 206 to provide inputs to train the machine learning model 220 stored in the database 216. The model training sub-module 224 is used to train the machine learning model 220 (e.g., an adaptive encoding model) using the input received from the I/O module 206. It is noted that the machine learning model 220 is identical to the machine learning model 120 of FIG. 1A. For example, the operator of the system 200 may provide inputs related to a plurality of image portions and corresponding encoding profiles preferred by various viewers for viewing content, to the machine learning model 220. It is understood that the term ‘encoding profiles’ refers to various sets of encodings indicating a bit rate and a resolution at which a media content that is being streamed by the content viewer 102 can be encoded. The inputs provided to the machine learning model 220 may also include information related to a playback quality experienced by a viewer after a selection of a particular encoding profile. The machine learning model 220 is trained using such inputs to accurately determine and assign a tag for optimal encoding of corresponding image portions given a viewer's preferences (i.e., based on online viewer behavior) and a set of user/network attributes. The operator of the system 200 may also use the I/O module 206 and the model training sub-module 224 to tune the weights of parameters of the machine learning model 220. More specifically, the viewer behavior data, including data related to the plurality of content viewers, may be used to train the machine learning model 220.

[0073] In various examples, the viewer behavior data may include information related to online behavior patterns, network attributes, age, gender, location, network provider, access patterns, content genre preferences, language preferences, and the like related to the plurality of content viewers. Particularly, the machine learning model 220 can be configured to classify the plurality of content viewers that share similarities with each other into different categories or viewer cohorts based on the viewer behavior data. In other words, the machine learning model 220 is configured to generate a plurality of viewer cohorts based, at least in part, on the viewer behavior data. The term ‘viewer cohort’ refers to a group of content viewers that share similar behavior and network attributes. To that end, the content viewers present within the same cohort share various other similarities such as demographics, location, bandwidth, similar user devices, and the like. In various non-limiting examples, the machine learning model 220 can be a classification- or sorting-type machine learning model.
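As one hedged illustration of the cohort grouping described above, the sketch below groups viewers by exact matches on a few behavior and network attributes. The field names (`location`, `network_provider`, `language`) are hypothetical, and a production system would use a trained classification or clustering model rather than exact matching:

```python
from collections import defaultdict

def assign_viewer_cohorts(viewers):
    """Group viewers that share network and demographic attributes.

    `viewers` is a list of dicts with hypothetical keys; here a cohort
    is simply the set of viewers whose attributes match exactly.
    """
    cohorts = defaultdict(list)
    for viewer in viewers:
        # Cohort key built from shared behavior/network attributes.
        key = (viewer["location"], viewer["network_provider"], viewer["language"])
        cohorts[key].append(viewer["viewer_id"])
    return dict(cohorts)

viewers = [
    {"viewer_id": "u1", "location": "IN", "network_provider": "A", "language": "hi"},
    {"viewer_id": "u2", "location": "IN", "network_provider": "A", "language": "hi"},
    {"viewer_id": "u3", "location": "US", "network_provider": "B", "language": "en"},
]
cohorts = assign_viewer_cohorts(viewers)
```

A learned model would replace the exact-match key with a similarity measure, but the input/output shape (viewer attributes in, cohort membership out) would be the same.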

[0074] The score generation sub-module 226 is configured to determine via the machine learning model 220, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The determination of the average media playback score for the plurality of media contents is explained in detail in FIG. 7.

[0075] Further, the system 200 is communicably coupled with a database 216. The database 216 may be incorporated in the system 200, may be an individual entity connected to the system 200, or may be a database stored in cloud storage. The database 216 is configured to store the raw media content 234 and the machine learning model 220. In various non-limiting examples, the database 216 may further include various instructions or firmware data essential for the operation of the system 200 (not shown).

[0076] In at least one embodiment, the system 200 may be embodied as a server system or machine in operative communication with a content repository server such as the CDN 114, as shown in FIG. 1A. Alternatively, the system 200 may be included within the CDN 114 or may be communicably accessible via API plugins to the CDN 114. The system 200 may be implemented as a part of a CDN such as an origin CDN server, a public CDN server, a private CDN server, a Telco CDN server, an Internet Service Provider (ISP) CDN server, or a CDN POP server. In some embodiments, the system 200 may be deployed within the digital platform server 104 (as shown in FIG. 1A) or may be placed external to, and in operative communication with, the digital platform server 104. In some embodiments, the system 200 and the digital platform server 104 can be configured in the cloud network 112. The system 200 is configured to generate the average media playback score and generate adapted encoding profiles for subsequent encoding of the media data, based on the generated average media playback score. The digital platform server 104 adaptively encodes the streaming content based on the adapted encoding profiles before providing the streaming content to each content viewer 102.

[0077] The memory module 204 may be embodied as one or more non-volatile memory devices, one or more volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. For example, the memory module 204 may be embodied as semiconductor memories, such as flash memory, mask ROM, PROM (programmable ROM), EPROM (erasable PROM), RAM (random access memory), etc., and the like. In at least some embodiments, the memory module 204 may store a machine learning model (not shown in FIG. 2). The machine learning model 220 is configured to facilitate the assignment of tags for different image portions in an image frame for encoding the image frame as will be explained in further detail later.

[0078] Further, it is noted that although the present explanation has been provided with regard to raw media content 234 for a content provider, the various embodiments of the present disclosure are applicable to live streaming media content as well and, therefore, the same should not be construed as a limitation to the scope of the present disclosure.

[0079] FIG. 3 illustrates a simplified block diagram representation 300 for generating a set of adapted encoding profiles using a machine learning model such as machine learning model 302, in accordance with an embodiment of the present disclosure.

[0080] The representation 300 depicts components such as the machine learning model 302, a database 304, a ladder prediction module 306, an adjustment module 308, a profile generation module 310, and a content encoder module 312. It is noted that the machine learning model 302 and the content encoder module 312 are identical to the machine learning model 220 and the content encoder module 214 of FIG. 2. Examples of the machine learning model 302 include a media playback score model and an adaptive encoding model. The machine learning model 302 is not limited to the media playback score model and the adaptive encoding model; other models can also be included to achieve the adaptive encoding profiles for the media content to be delivered to the content viewer 102. The machine learning model 302 can be a single model that performs all the functions for generating adaptive encoding profiles or can be a set of individual models, each performing one or more functions to achieve adaptive encoding profiles. The media playback score model can be configured to determine the average media playback score of the media content, while the adaptive encoding model can be configured to generate the adaptive encoding profiles based on the average media playback score of the media content. The content encoder module 312 encodes the first media content based on the adaptive encoding profiles, before delivering the first media content to the content viewer 102.

[0081] The machine learning model 302 includes a behavior profile determination module 314, a bitrate aggregation module 316, and a media playback score aggregation module 318.

[0082] The machine learning model 302 receives the historical playback data 320 and the viewer behavior data 322 from the database (e.g. database 216) associated with the system 200 of FIG. 2. The machine learning model 302 is used to determine the average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data. The profile generation module 310 generates the set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score.

[0083] The bitrate aggregation module 316 computes a final aggregated bitrate for the plurality of media content based, at least in part, on aggregating a corresponding aggregated bitrate for each media content of the plurality of media content. Initially, the bitrate aggregation module 316 determines a content identifier (ID) of the plurality of media content. The bitrate aggregation module 316 also determines a plurality of playback tags, wherein each playback tag of the plurality of the playback tags corresponds to each encoding profile of the set of encoding profiles. Each encoding profile indicates an individual bitrate used by the content viewer 102 during playback of a subset of media segments from the set of media segments.

[0084] The bitrate aggregation module 316 determines a corresponding bitrate for each media segment of the set of media segments based, at least in part, on the content ID, the playback tag, and the historical playback data. The bitrate aggregation module 316 is configured to compute an aggregated bitrate for each encoding profile of the set of encoding profiles based, at least in part, on aggregating the corresponding bitrate for each media segment of the set of media segments. A final aggregated bitrate for the plurality of media content is computed based, at least in part, on aggregating the corresponding aggregated bitrate for each media content of the plurality of media content. The standard resolutions and the bit rates are usually stored as a set of available bitrates 330 in the database 304 and the assigned scores may include a resolution value or bitrate based on the set of available bitrates 330.
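The two-level aggregation described above (per-segment bitrates rolled up per encoding profile, then rolled up across the plurality of media contents) can be sketched as follows. The dictionary shape and the use of the arithmetic mean as the aggregation operator are assumptions made for illustration; the disclosure does not fix a particular operator:

```python
def aggregate_profile_bitrate(segment_bitrates):
    """Aggregate the per-segment bitrates (Kbps) used during playback
    for one encoding profile. The arithmetic mean is one possible
    aggregation operator; the disclosure leaves the operator open."""
    return sum(segment_bitrates) / len(segment_bitrates)

def final_aggregated_bitrate(historical_playback_data):
    """Roll per-profile aggregates up to a single value across the
    plurality of media contents. The input maps content ID -> playback
    tag -> list of per-segment bitrates (a hypothetical shape used
    only for illustration)."""
    per_content = []
    for content_id, profiles in historical_playback_data.items():
        profile_bitrates = [aggregate_profile_bitrate(b) for b in profiles.values()]
        per_content.append(sum(profile_bitrates) / len(profile_bitrates))
    return sum(per_content) / len(per_content)

historical_playback_data = {
    "movie-1": {"tag-360p": [400, 400, 420], "tag-480p": [500, 520]},
    "movie-2": {"tag-360p": [380, 400]},
}
final_rate = final_aggregated_bitrate(historical_playback_data)
```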

[0085] The behavior profile determination module 314 determines via a machine learning model 302, a behavior profile for the content viewer 102 based, at least in part, on the viewer behavior data. The behavior profile indicates a behavior of the content viewer 102 when a particular media content is streamed using a particular encoding profile. The media playback score aggregation module 318 then determines an average media playback score for the plurality of media contents based, at least in part, on the final aggregated bitrate and the behavior profile. The media playback score aggregation module 318 is configured to utilize the machine learning model 302 to assign a score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data.
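A minimal sketch of how the final aggregated bitrate and a behavior profile might be combined into an average media playback score. The behavior-profile fields (`completion_rate`, `abandonment_rate`), the normalizing maximum bitrate, and the equal weighting are illustrative assumptions, not the claimed machine learning model:

```python
def average_media_playback_score(final_aggregated_bitrate_kbps,
                                 behavior_profile,
                                 max_bitrate_kbps=5000):
    """Combine an aggregated bitrate with behavior signals into a
    0-100 score. `behavior_profile` carries hypothetical engagement
    signals in [0, 1]; the weights are illustrative only."""
    # Normalize the bitrate against an assumed ladder maximum.
    bitrate_component = min(final_aggregated_bitrate_kbps / max_bitrate_kbps, 1.0)
    engagement = behavior_profile["completion_rate"]
    abandonment_penalty = behavior_profile["abandonment_rate"]
    score = 100 * (0.5 * bitrate_component + 0.5 * engagement) * (1 - abandonment_penalty)
    return round(score, 1)

score = average_media_playback_score(
    2500, {"completion_rate": 0.8, "abandonment_rate": 0.1})
```

In the claimed system this mapping is learned by the machine learning model 302 rather than hand-weighted.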

[0086] In an embodiment, the ladder prediction module 306 is configured to update the media content encoding parameter and the bitrate ladder for the media content based on the darkness level-specific parameters. Further, the ladder prediction module 306 is configured to compute the final values of the media content encoding parameter and the bitrate ladder. Then, the ladder prediction module 306 determines a cumulative encoding ladder based at least in part on the final media content encoding parameter value and the final bitrate ladder value. This cumulative encoding ladder is transmitted to the adjustment module 308.

[0087] In a non-limiting example, the adjustment module 308 may include one or more testing teams who are responsible for manually watching the encoded media content, i.e., content encoded based on the cumulative encoding ladder. In some embodiments, semi-automated testing techniques can be used for testing the encoded content. The one or more testing teams may then identify one or more issues with the encoded media content based at least on a quality assurance policy of the content provider. These issues may then be used as a basis for providing feedback (see 324) to the machine learning model 302 in order to improve the existing machine learning model 302.

[0088] The database 304 is configured to store a set of viewer cohorts 326, a set of scoring rules 328, and a set of available bitrates 330. The one or more components of the representation 300 can access the database 304, to perform processes related to generating media playback scores, encoding profiles, etc. In at least one example embodiment, the media playback score aggregation module 318 is configured to assign a score for the plurality of media contents based, at least in part, on the cohort of the viewer (i.e., the viewer cohort) and the set of scoring rules 328.

[0089] In one embodiment, the set of scoring rules 328 includes a comparison score determined for the first media content based, at least in part, on comparing the average media playback score with a predefined reference media quality score. The predefined reference media quality score denotes the minimum reference value of the media quality score of the first media content that is essential for delivering a satisfactory watching experience to the viewer. The predefined reference media quality score can be stored along with the set of scoring rules 328. The content encoder module 312 encodes at least one of the plurality of video segments and the plurality of audio segments to generate one or more first media content renditions based, at least in part, on the comparison score.

[0090] The profile generation module 310 is configured to generate the set of adapted encoding profiles for the first media content based, at least in part, on the final values of the media content encoding parameter and the bitrate ladder received from the ladder prediction module 306. More specifically, the profile generation module 310 unmuxes the first media content into at least one of a plurality of video segments and a plurality of audio segments based, at least in part, on predefined criteria. The predefined criteria may include, but are not limited to, separating the segments based on bitrate, time, resolution, and the like.

[0091] In one embodiment, the profile generation module 310 accesses a set of available bitrates 330 from the database 304. The profile generation module 310 selects the set of optimal bitrates from the set of available bitrates 330 based, at least in part, on the average media playback score being lower than the predefined reference media quality score. In such a scenario, the selected set of optimal bitrates indicates lower quality bitrates. In another embodiment, the profile generation module 310 selects the set of optimal bitrates from the set of available bitrates 330 based, at least in part, on the average media playback score being higher than the predefined reference media quality score. In such a scenario, the selected set of optimal bitrates indicates higher quality bitrates.
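Under the selection rule stated above, an average media playback score below the predefined reference picks lower-quality rungs from the set of available bitrates 330, while a score above it picks higher-quality rungs. A hedged sketch, with the number of selected rungs (`k`) as an assumed parameter:

```python
def select_optimal_bitrates(available_bitrates_kbps, avg_score, reference_score, k=3):
    """Pick `k` bitrates from the available ladder. Following the
    selection rule stated above: a score below the reference selects
    lower-quality bitrates, otherwise higher-quality bitrates. The
    cutoff of `k` rungs is an illustrative assumption."""
    ladder = sorted(available_bitrates_kbps)
    if avg_score < reference_score:
        return ladder[:k]   # lower-quality rungs
    return ladder[-k:]      # higher-quality rungs

available = [400, 500, 700, 1200, 2500, 4500]
low_quality = select_optimal_bitrates(available, avg_score=55, reference_score=70)
high_quality = select_optimal_bitrates(available, avg_score=85, reference_score=70)
```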

[0092] The profile generation module 310 is configured to set at least one video identity (ID) in each video segment of the plurality of video segments to identify the plurality of video segments from the plurality of audio segments. The plurality of video segments and the plurality of audio segments corresponding to each adapted encoding profile of the set of adapted encoding profiles are separately stored in the database. This allows the profile generation module 310 to select the set of optimal bitrates for at least one of the plurality of video segments and the plurality of audio segments based, at least in part, on the average media playback score. The set of adapted encoding profiles for the first media content is generated by profile generation module 310 based, at least in part, on the set of optimal bitrates. Thus, using the approach of the present invention, the number of the set of adapted encoding profiles for the first media content can be reduced and only the optimal set of adapted encoding profiles are used for encoding, thereby reducing the computational resources required during the encoding process.

[0093] It should be noted that in HTTP Live Streaming (HLS), the bandwidth value should be the maximum of the sum of the optimal bitrates of the plurality of video bitrates and the plurality of audio bitrates. The same video stream (e.g., 1080p) can be linked to each of the plurality of audio bitrates to generate multiple renditions with the same 1080p video. The bandwidth value of each rendition might differ due to the variation in audio bitrates. Using the video ID in each of the plurality of video bitrates, redundancy of renditions can be avoided.
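The BANDWIDTH bookkeeping described above can be sketched as follows: each rendition pairs one encoded video stream with one audio stream, its bandwidth is the sum of the two bitrates, and the shared video ID records that the renditions reuse the same encoded video. All field names and values are illustrative assumptions:

```python
def build_renditions(video_streams, audio_streams):
    """Pair each video stream with each audio stream. The bandwidth
    of a rendition is the sum of the two bitrates (bps); the shared
    video ID shows that renditions reuse the same encoded video."""
    renditions = []
    for video in video_streams:
        for audio in audio_streams:
            renditions.append({
                "video_id": video["id"],   # same encoded video reused
                "audio_id": audio["id"],
                "bandwidth": video["bitrate"] + audio["bitrate"],
                "resolution": video["resolution"],
            })
    return renditions

video_streams = [{"id": "v1080", "bitrate": 5_000_000, "resolution": "1920x1080"}]
audio_streams = [{"id": "a64", "bitrate": 64_000}, {"id": "a128", "bitrate": 128_000}]
renditions = build_renditions(video_streams, audio_streams)
```

Because both renditions carry `video_id == "v1080"`, a packager can detect that they reference the same video segments and avoid storing them twice.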

[0094] The content encoder module 312 encodes the first media content based on the adapted encoding profiles. In some embodiments, the generation of the set of adapted encoding profiles includes at least encoding at least one of the plurality of video segments and the plurality of audio segments to generate one or more first media content renditions based, at least in part, on the set of adapted encoding profiles, and facilitating a storage of the one or more first media content renditions in the CDN. Then, a manifest file 332 is generated by the content provider and transmitted to the user device 108 of the subscriber 102. The manifest file 332 includes content playback URLs corresponding to various available resolutions, a CDN PoP identifier, and the like. Further, the user device 108 parses the manifest file 332 to access the requested content from the CDN PoP.

[0095] In some embodiments, upon receiving a playback URL request for the media content from the content viewer 102, the machine learning model 302 accesses the viewer profile of the content viewer 102 from the database 304. Then, the machine learning model 302 determines the viewer cohort for the content viewer 102 from the set of viewer cohorts 326 based, at least in part, on a content viewer profile. Further, the machine learning model 302 facilitates the transmission of the corresponding media rendition record for the determined viewer cohort to the content viewer 102. The machine learning model 302 may access profiles of viewers from the set of viewer cohorts 326 of the database 304 to identify a cohort for a viewer such as the content viewer 102. In one embodiment, the machine learning model 302 classifies each viewer into a viewer cohort based, at least in part, on the viewer behavior data.

[0096] FIG. 4A is a schematic representation 400 illustrating an example encoding profile used by the system 200 for encoding streaming media content for content viewers, in accordance with an embodiment of the invention. The representation 400 shows three encoded streams 402, 404, and 406 of the media contents. The media content represents one of the plurality of media contents accessed or viewed by the content viewer 102 over a predefined period of time. The encoding profile includes one or more encoded streams 402, 404, and 406 of the media contents. Each encoded stream includes one or more media segments (both video and audio segments), for example, Segment 1, Segment 2, Segment 3…Segment N, collectively referred to as the media segments. For example, media content may be encoded using the H.264 standard at a bitrate of 400 Kbps, a spatial resolution of 360p, a frame rate of 30 fps, and a media playback score of 75 to configure one encoded stream 402. Similarly, media content may be encoded using the H.264 standard at a bitrate of 500 Kbps, a spatial resolution of 480p, a frame rate of 30 fps, and a media playback score of 70 to configure another encoding stream 404. The media content may be encoded using the H.264 standard at a bitrate of 700 Kbps, a spatial resolution of 480p, a frame rate of 30 fps, and a media playback score of 80 to configure another encoding stream 406. Multiple such streams 402, 404, and 406 of the media contents encoded as per the H.264 standard configure one ‘encoding profile’. Similarly, video content may be encoded using the H.265 standard at one or more resolutions, frame rates, and media playback scores to generate multiple encoded streams of video content, thereby configuring another encoding profile.
Depending upon factors like underlying network conditions and/or attributes associated with a content viewer’s device used for content playback, an appropriate encoding profile is chosen for streaming content from the CDN to the content viewer’s device.
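The stream parameters of FIG. 4A can be modeled with a small data structure; the values below mirror the example streams 402, 404, and 406 (H.264 at 400, 500, and 700 Kbps), while the class and field names are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncodedStream:
    codec: str           # encoding standard, e.g. "H.264"
    bitrate_kbps: int    # encoding bitrate (r)
    resolution: str      # spatial resolution (s)
    fps: int             # frame rate
    playback_score: int  # media playback score value (v)

# One encoding profile = the set of streams for a given standard.
profile_h264 = [
    EncodedStream("H.264", 400, "360p", 30, 75),  # stream 402
    EncodedStream("H.264", 500, "480p", 30, 70),  # stream 404
    EncodedStream("H.264", 700, "480p", 30, 80),  # stream 406
]
```

A second list built the same way for H.265 streams would constitute the other encoding profile described above.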

[0097] FIG. 4B is a schematic representation 420 illustrating an example adapted encoding profile used by the system 200 for adaptive encoding of the first media content to be streamed to the content viewer 102, in accordance with an embodiment of the invention. The first media content represents one of the plurality of first media contents that has been requested recently for accessing or viewing by the content viewer 102. The system 200 adaptively and optimally encodes the first media content over the encoding performed on the media content as discussed in FIG. 4A.

[0098] The first media content is unmuxed (i.e., split) into at least one of the plurality of video segments and the plurality of audio segments based, at least in part, on the predefined criteria. The system 200 is configured to set at least one video identity (ID) in each video segment of the plurality of video segments to identify the plurality of video segments from the plurality of audio segments. The plurality of video segments and the plurality of audio segments corresponding to each adapted encoding profile of the set of adapted encoding profiles are separately stored in the database. This allows the system 200 to select the set of optimal bitrates for at least one of the plurality of video segments and the plurality of audio segments based, at least in part, on the average media playback score. The set of adapted encoding profiles for the first media content is generated by system 200 based, at least in part, on the set of optimal bitrates. Thus, using the approach of the present invention, the number of the set of adapted encoding profiles for the first media content can be reduced and only the optimal set of adapted encoding profiles are used for encoding.

[0099] As shown in FIG. 4B, the first media content is unmuxed into the plurality of video segments and the plurality of audio segments. The encoding video stream 426 includes a plurality of video segments 426 (Segment 1, Segment 2, Segment 3…Segment N, collectively referred to as the video segments), and each of the video segments 426 has a respective segment ID, for example, V1S1, V1S2, V1S3… V1SN. The encoding video stream 428 includes a plurality of video segments 428 (Segment 1, Segment 2, Segment 3…Segment N, collectively referred to as the video segments), and each of the video segments 428 has a respective segment ID, for example, V2S1, V2S2, V2S3… V2SN. The encoding video stream 430 includes a plurality of video segments 430 (Segment 1, Segment 2, Segment 3…Segment N, collectively referred to as the video segments), and each of the video segments 430 has a respective segment ID, for example, V3S1, V3S2, V3S3… V3SN.

[0100] Encoding audio stream 432 includes a plurality of audio segments A1S1 to A1SN, encoding audio stream 434 includes a plurality of audio segments A2S1 to A2SN, and encoding audio stream 436 includes a plurality of audio segments A3S1 to A3SN. Herein, N is a non-zero natural number. The encoding audio streams 432, 434, and 436 can each be combined with at least one of the plurality of video segments 426, 428, and 430 to generate an optimized encoding profile using the set of optimized bitrates of the plurality of video segments 426, 428, and 430 and the plurality of audio segments 432, 434, and 436. For example, the first media content may be encoded using the H.264 standard at an optimal bitrate of 400 Kbps, a spatial resolution of 360p, a frame rate of 30 fps, and a media playback score of 75 to configure one encoded stream. The system 200 determines the set of optimal bitrates for at least one of the plurality of video segments and the plurality of audio segments based, at least in part, on the average media playback score. Thus, the total optimal bitrate of 450 Kbps is the sum of the bitrates of the plurality of video segments (e.g., 400 Kbps) and the plurality of audio segments (e.g., 50 Kbps). The set of adapted encoding profiles is generated for the first media content based, at least in part, on the set of optimal bitrates.

[0101] Similarly, the first media content may be encoded using the H.264 standard at a bitrate of 500 Kbps, a spatial resolution of 480p, a frame rate of 30 fps, and a media playback score of 70 to configure another encoding stream. The optimal bitrate of 500 Kbps is the sum of the bitrates of the plurality of video segments (e.g., 420 Kbps) and the plurality of audio segments (e.g., 80 Kbps). The set of adapted encoding profiles is generated for the first media content based, at least in part, on the set of optimal bitrates.

[0102] The media content may be encoded using the H.264 standard at a bitrate of 700 Kbps, a spatial resolution of 480p, a frame rate of 30 fps, and a media playback score of 80 to configure another encoding stream. The optimal bitrate of 480 Kbps is the sum of the bitrates of at least the plurality of video segments (e.g., 479 Kbps) and the plurality of audio segments (e.g., 1 Kbps). It is pertinent to note that in some implementations, only the plurality of video segments may be optimized since the audio segments are generally smaller in size. Thus, in such implementations, the optimization process for the audio segments can be ignored since the impact of such optimizations may turn out to be of lower significance when compared with the processing resources required to perform the same. The set of adapted encoding profiles is generated for the first media content based, at least in part, on the set of optimal bitrates.

[0103] In another embodiment, the system 200 is configured to generate a modified manifest or a new manifest related to the content requested by the subscriber 102 based on the original manifest and the one or more Ad contents. The term ‘modified manifest’ or ‘new manifest’ as used herein refers to an original manifest that is customized to add information related to the fetched advertisement content based on behavior attributes of a cohort. More specifically, the system 200 is configured to integrate URL information related to the one or more advertisement content within the original manifest to generate the modified manifest or the new manifest for the content viewer 102. The system 200 is configured to insert segments of the one or more Ad content among the plurality of segments related to the content based on corresponding Ad position markers. In general, the system 200 generates the modified manifest by inserting segments related to each Ad content in slots identified by the corresponding Ad position markers (i.e., one or more Ad position markers determined based on the dynamic tolerance threshold of the content viewer 102) within the content. It is noted that whenever encoded content segments are linked in the manifest file, an encoded content record or a media content record is also provided in the manifest. The encoded content record provides a sequence for the encoded content segments. This sequence helps the user device to understand in what sequence the encoded content segments are to be played.

[0104] For instance, a first media content record (or first encoded content record) may state that segments have to be played as follows: S1, S2, S3, S4, S5, S6 and while a second media content record (or second encoded content record) may state that second media content segments such as Ads have to be played as follows A1, and A2. To that end, while generating the modified manifest, the system 200 may also generate a new media content record for the content viewer based, at least in part, on inserting one or more second encoded segments from the plurality of second encoded segments in between the plurality of first encoded segments based, at least in part, on the one or more second position markers. In an example, the new media content record may state that the content segments have been played as follows: S1, S2, A1, A2, S3, S4, S5, and S6.
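The record rewriting in the example above can be sketched as follows, with the Ad position marker represented as an index into the first media content record. That representation is an assumption made for illustration, since the disclosure only requires that markers identify the insertion slots:

```python
def insert_ad_segments(content_segments, ad_segments, position_marker):
    """Build a new media content record by inserting the ad segments
    at the slot identified by `position_marker` (here an index into
    the content record, used for illustration)."""
    return (content_segments[:position_marker]
            + ad_segments
            + content_segments[position_marker:])

new_record = insert_ad_segments(
    ["S1", "S2", "S3", "S4", "S5", "S6"], ["A1", "A2"], position_marker=2)
```

With a marker after S2, this reproduces the sequence S1, S2, A1, A2, S3, S4, S5, S6 from the example above.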

[0105] In one embodiment of the invention, the first media content is unmuxed into the plurality of video segments only, and the plurality of video segments is used to generate the optimized encoding profile using the set of optimized bitrates of the plurality of video segments only. In this scenario, the plurality of audio segments is not used for optimization, as the bitrates of the audio segments are small and may not much contribute to the set of optimized bitrates.

[0106] FIG. 5 shows a flow diagram of a method 500 for adaptive encoding of streaming content for content viewers, in accordance with an embodiment of the invention. The method 500 depicted in the flow diagram may be executed by, for example, the system 200. Operations of the method 500, and combinations of operations in the method 500, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The operations of the method 500 described herein may be performed by an application interface that is hosted and managed with the help of the system 200. The method starts at Step 501.

[0107] At Step 502, the system 200 accesses the historical playback data and viewer behavior data from the database associated with the system 200. Herein, the historical playback data indicates information related to the set of encoding profiles corresponding to each media content of the plurality of media contents viewed by the content viewer 102 over the predefined time period using one or more dynamic playback metrics. Further, herein the viewer behavior data indicates information related to a behavior of the content viewer 102 during the playback of each media content.

[0108] At Step 504, the system 200 computes via the machine learning model 220, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data.

[0109] At Step 506, the system 200 receives the media content request for the first media content from the content viewer 102. The system 200 accesses the first media content from the Content Delivery Network (CDN) associated with a Digital Platform Server (DPS) 104, based, at least in part, on the media content request. The digital platform server 104 may include at least one of a Content Management System (CMS) and a User Management System (UMS) for facilitating the streaming of digital content from the content library 106 of the content provider to a plurality of users, such as the content viewer 102.

[0110] At Step 508, the system 200 checks whether the average media playback score is lower than the reference average media playback score stored in the database 304. If the average media playback score is lower than the reference average media playback score, Step 510 is performed, or else Step 512 is performed.

[0111] At Step 510, the system 200 performs adaptive encoding of the first media content; that is, the first media content is encoded at higher bitrates for better quality because the average media playback score is too low compared to the reference average media playback score stored in the database.

[0112] At Step 512, the system 200 performs adaptive encoding of the first media content; that is, the first media content is encoded at a lower bitrate to achieve a higher watch time at high resolutions because the average media playback score is not too low compared to the reference average media playback score stored in the database.

[0113] At Step 514, the system 200 checks whether the average media playback score is higher than the reference average media playback score stored in the database. If the average media playback score is higher than the reference average media playback score, Step 516 is performed, or else Step 518 is performed.

[0114] At Step 516, the system 200 performs adaptive encoding of the first media content; that is, the first media content is encoded at lower bitrates for lower cost because the average media playback score is too high compared to the reference average media playback score stored in the database.

[0115] At Step 518, the system 200 uses standard encoding for the first media content if the average media playback score is substantially the same as the reference average media playback score stored in the database. This indicates that the machine learning model 220 was trained sufficiently to generate the optimal average media playback score for adaptive encoding of the first media content. The method 500 then continues with adaptive encoding of the subsequent media content (i.e., the next media content to be watched by the content viewer 102).
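The branching at Steps 508 through 518 can be condensed into a single decision function. The sketch below is a simplification: the tolerance band standing in for "substantially the same", and the returned strategy labels, are assumptions, since the specification does not give exact thresholds:

```python
def choose_encoding_strategy(avg_score: float, ref_score: float,
                             tolerance: float = 0.05) -> str:
    """Map the average media playback score onto an encoding strategy.

    Mirrors Steps 508-518: a score well below the reference triggers
    higher-bitrate encoding (Step 510), a score well above it triggers
    lower-bitrate encoding (Step 516), and a score close to the reference
    keeps standard encoding (Step 518). The +/- tolerance band is an
    illustrative assumption.
    """
    if avg_score < ref_score * (1.0 - tolerance):
        return "higher_bitrates"    # Step 510: better quality
    if avg_score > ref_score * (1.0 + tolerance):
        return "lower_bitrates"     # Step 516: lower cost
    return "standard_encoding"      # Step 518: model output near reference
```

For example, with a reference score of 1.0, a score of 0.5 selects higher bitrates and a score of 1.5 selects lower bitrates.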

[0116] The system 200 is configured to determine the set of optimal bitrates by accessing the set of available bitrates from the database. The system 200 selects the set of optimal bitrates from the set of available bitrates based, at least in part, on the average media playback score being lower than a predefined reference media quality score. In this scenario, the selected set of optimal bitrates indicates lower quality bitrates. The system 200 selects the set of optimal bitrates from the set of available bitrates based, at least in part, on the average media playback score being at least equal to the predefined reference media quality score. In this scenario, the selected set of optimal bitrates indicates higher quality bitrates.
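The selection described in paragraph [0116] can be sketched as follows. Splitting the sorted bitrate ladder at its midpoint into "lower quality" and "higher quality" halves is an illustrative assumption; the specification only distinguishes the two categories:

```python
def select_optimal_bitrates(available_kbps: list[int],
                            avg_score: float,
                            ref_quality_score: float) -> list[int]:
    """Select the set of optimal bitrates from the set of available bitrates.

    Per paragraph [0116]: a score below the predefined reference media
    quality score yields lower-quality bitrates; a score at least equal to
    it yields higher-quality bitrates. The midpoint split is an assumption.
    """
    ladder = sorted(available_kbps)
    mid = len(ladder) // 2
    if avg_score < ref_quality_score:
        return ladder[:mid]   # lower-quality bitrates
    return ladder[mid:]       # higher-quality bitrates
```

With an available ladder of [300, 800, 1500, 3000, 6000] kbps and a reference score of 0.6, a score of 0.4 selects [300, 800] while a score of 0.8 selects [1500, 3000, 6000].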

[0117] FIG. 6 shows a flow diagram depicting a method 600 for adaptive encoding of streaming content for content viewers, in accordance with another embodiment of the invention. The method 600 depicted in the flow diagram may be executed by, for example, the system 200. Operations of the method 600, and combinations of operations in the method 600, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The operations of the method 600 described herein may be performed by an application interface that is hosted and managed with the help of the system 200. The method 600 starts at Step 602.

[0118] At Step 602, the system 200 accesses the historical playback data and the viewer behavior data from the database associated with the system 200. The historical playback data indicates information related to the set of encoding profiles corresponding to each media content of the plurality of media contents viewed by the content viewer 102 over the predefined time period using one or more dynamic playback metrics. The viewer behavior data indicates information related to a behavior of the content viewer 102 during the playback of each media content.

[0119] At Step 604, the system 200, via the machine learning model 220, determines the average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data.

[0120] At Step 606, the system 200 receives the media content request for the first media content from the content viewer 102.

[0121] At Step 608, the system 200 accesses the first media content from the CDN associated with the DPS, based, at least in part, on the media content request.

[0122] At Step 610, the system 200 generates the set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score.

[0123] At Step 612, the system 200 generates and transmits the manifest file to the content viewer 102 based, at least in part, on the set of adapted encoding profiles.
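The manifest file of Step 612 can be illustrated with an HLS-style master playlist, since HLS and DASH manifests are the common vehicles for advertising encoding profiles to a player. The profile dictionary keys and the rendition URL pattern below are assumptions, not details fixed by the specification:

```python
def build_manifest(content_id: str, profiles: list[dict]) -> str:
    """Render a minimal HLS-style master playlist from adapted encoding
    profiles (Step 612). Each profile dict is assumed to carry
    'bitrate_kbps' and 'resolution' keys; the rendition URL scheme
    ("<content_id>/<bitrate>k/index.m3u8") is hypothetical."""
    lines = ["#EXTM3U"]
    for p in profiles:
        # BANDWIDTH is expressed in bits per second in HLS playlists
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={p['bitrate_kbps'] * 1000},"
            f"RESOLUTION={p['resolution']}"
        )
        lines.append(f"{content_id}/{p['bitrate_kbps']}k/index.m3u8")
    return "\n".join(lines)
```

The electronic device of the content viewer 102 would then fetch segments from the playback URLs listed in this manifest.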

[0124] FIG. 7 shows a flow diagram of a method 700 for determining the average media playback score for the plurality of media contents, in accordance with another embodiment of the invention. The method 700 depicted in the flow diagram may be executed by, for example, the system 200. Operations of the method 700, and combinations of operations in the method 700, may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The operations of the method 700 described herein may be performed by an application interface that is hosted and managed with the help of the system 200. The method 700 starts at Step 702.

[0125] At Step 702, the system 200 performs a set of operations for each media content that includes a set of media segments. The set of operations 702 includes Sub-Step 702A to Sub-Step 702C.

[0126] At Sub-Step 702A, the system 200 determines the content identifier (ID) and the plurality of playback tags. Each playback tag of the plurality of playback tags corresponds to each encoding profile of the set of encoding profiles. Each encoding profile indicates the individual bitrate used by the content viewer 102 during playback of a subset of media segments from the set of media segments.

[0127] The set of operations 702 further includes Sub-Step 702B. At Sub-Step 702B, the system 200 determines a corresponding bitrate for each media segment of the set of media segments based, at least in part, on the content ID, the playback tag, and the historical playback data.

[0128] The set of operations 702 further includes Sub-Step 702C. At Sub-Step 702C, the system 200 computes the aggregated bitrate for each encoding profile of the set of encoding profiles based, at least in part, on aggregating the corresponding bitrate for each media segment of the set of media segments.

[0129] At Step 704, the system 200 computes the final aggregated bitrate for the plurality of media contents based, at least in part, on aggregating the corresponding aggregated bitrate for each media content of the plurality of media contents.

[0130] At Step 706, the system 200 determines, via the machine learning model 220, a behavior profile for the content viewer 102 based, at least in part, on the viewer behavior data, the behavior profile indicating a behavior of the content viewer 102 when a particular media content is streamed using a particular encoding profile.

[0131] At Step 708, the system 200 determines the average media playback score for the plurality of media contents based, at least in part, on the final aggregated bitrate and the behavior profile.
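The aggregation chain of Steps 702C, 704, and 708 can be sketched as below. The specification does not fix the aggregation operator or the scoring formula, so the use of the mean, the bitrate normalisation constant, and the equal weighting of the behavior signal are all assumptions for illustration:

```python
def aggregate_profile_bitrates(segments_kbps: dict[str, list[int]]) -> dict[str, float]:
    """Sub-Step 702C: aggregate per-segment bitrates into one value per
    encoding profile. The mean is an assumed aggregation operator."""
    return {profile: sum(v) / len(v) for profile, v in segments_kbps.items()}

def final_aggregated_bitrate(per_content: list[float]) -> float:
    """Step 704: combine per-content aggregated bitrates (assumed mean)."""
    return sum(per_content) / len(per_content)

def average_media_playback_score(final_bitrate_kbps: float,
                                 completion_ratio: float,
                                 max_bitrate_kbps: float = 8000.0) -> float:
    """Step 708 (illustrative): combine the final aggregated bitrate with a
    behavior-profile signal (here, a watch-completion ratio in [0, 1]).
    The normalisation and 50/50 weighting are assumptions; in the
    specification this role is played by the machine learning model 220."""
    bitrate_term = min(final_bitrate_kbps / max_bitrate_kbps, 1.0)
    return 0.5 * bitrate_term + 0.5 * completion_ratio
```

For example, per-profile segment bitrates of {"hd": [4000, 5000], "sd": [800, 1200]} aggregate to 4500 and 1000 kbps respectively, giving a final aggregated bitrate of 2750 kbps across the two contents.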

[0132] FIG. 8 is a simplified block diagram of a multi-CDN 800, in accordance with various embodiments of the invention. The CDN POPs 116 disclosed in FIG. 1A can be embodied with the multi-CDN 800 of FIG. 8. As may be understood, the multi-CDN 800 refers to a distributed group of servers that are connected via a network (such as a Network 804, which is explained later). The multi-CDN 800 provides quick delivery of media content to various content viewers (such as the content viewer 102) subscribed to the digital platform server 104. The multi-CDN 800 includes a plurality of interconnected servers that may interchangeably be referred to as a plurality of content repository servers. The multi-CDN 800 includes an origin CDN server 802, a public CDN server 806, a private CDN server 808, a Telecommunication CDN server (referred to hereinafter as ‘Telco CDN server’) 810, an Internet Service Provider CDN server (referred to hereinafter as ‘ISP CDN server’) 812, and a CDN point of presence server (referred to hereinafter as ‘CDN POP server’) 814 each coupled to, and in communication with (and/or with access to) the network 804. The CDN POP server 814 is an example of the CDN POP 116 of FIG. 1A. It is noted that CDN POP 116 may also be interchangeably referred to as ‘sub-CDNs’, ‘subnet CDN’, ‘surrogate CDN’, and ‘CDN sub box’. Further, two or more components of the multi-CDN 800 may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the multi-CDN 800 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.

[0133] The network 804 may include, without limitation, a light fidelity (Li-Fi) network, a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a satellite network, the Internet, a fiber-optic network, a coaxial cable network, an infrared (IR) network, a Radio Frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts illustrated in FIG. 8, or any combination thereof. Various servers within the multi-CDN 800 may connect to the network 804 using various wired and wireless communication protocols, such as Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, future communication protocols or any combination thereof. For example, the network 804 may include multiple different networks, such as a private network made accessible by the origin CDN server 802 and a public network (e.g., the Internet, etc.) through which the various servers may communicate.

[0134] The origin CDN server 802 stores the media content accessed/downloaded from the streaming content provider and/or content producers. The origin CDN server 802 serves the media content to one or more cache servers which are either located in the vicinity of the content viewer 102 or the subscriber or connected to another cache server located in the content viewer’s vicinity. In various examples, cache servers include the public CDN server 806, the private CDN server 808, the Telco CDN server 810, the ISP CDN server 812, the CDN POP server 814, and the like.

[0135] The origin CDN server 802 includes a processing system 816, a memory 818, a database 820, and a communication interface 822. The processing system 816 is configured to extract programming instructions from the memory 818 to perform various functions of the multi-CDN 800. In one example, the processing instructions include instructions for ingesting media content via the communication interface 822 from a remote database 824 which may further include one or more data repositories/databases (not shown) to an internal database such as database 820. The remote database 824 is associated with a streaming content provider and/or a content producer. In another example, the media content stored within the database 820 can be served to one or more cache servers via the communication interface 822 over the network 804.

[0136] In some examples, the public CDN server 806 is associated with a public CDN provider that hosts media content among other types of data for different content providers within the same server. The private CDN server 808 is associated with a private CDN provider (such as a streaming content provider) which hosts media content to serve the needs of its subscribers. The Telco CDN server 810 is associated with telecommunication service providers that provide content hosting services to various entities such as the streaming content platform. The ISP CDN server 812 is associated with internet service providers that provide content hosting services to various entities such as the streaming content platform. The CDN POP server 814 caches content and allows the electronic devices of the content viewers to stream the content. It is noted that the various cache servers download and cache media content from the origin CDN server 802 and further allow a valid user or the content viewer 102 to stream the media content.
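The cache-then-origin behavior of the multi-CDN 800 described above can be sketched as a lookup with origin fallback. The dict-backed stores below are illustrative stand-ins for real servers, not part of the specification:

```python
class CacheServer:
    """Toy model of a CDN cache server (e.g., the CDN POP server 814) that
    fills from an origin store on a cache miss, as described for the
    multi-CDN 800. Real cache servers would also handle eviction, TTLs,
    and network transport, which are omitted here."""

    def __init__(self, origin_store: dict[str, bytes]):
        self.origin = origin_store          # stands in for origin CDN server 802
        self.cache: dict[str, bytes] = {}   # local cached media content

    def fetch(self, content_id: str) -> bytes:
        if content_id not in self.cache:
            # cache miss: download from the origin and cache locally
            self.cache[content_id] = self.origin[content_id]
        return self.cache[content_id]       # serve to the content viewer
```

A second fetch of the same content ID is served from the local cache without touching the origin, which is the latency benefit the multi-CDN arrangement provides.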

[0137] It is noted that, in various embodiments of the present disclosure, the various functions of the system 200, or the methods disclosed in FIG. 5, FIG. 6, and/or FIG. 7, can be implemented using any one or more components of the multi-CDN 800, such as the origin CDN server 802 and/or one or more cache servers, individually and/or in combination with each other. Alternatively, the system 200 can be communicably coupled with the multi-CDN 800 to perform the various embodiments or methods described by the present disclosure.

[0138] Various embodiments disclosed herein provide numerous advantages. More specifically, the embodiments disclosed herein suggest techniques for encoding content in an optimized way for the content viewers. Such optimized encoding of content gracefully adapts based on network conditions or device limitations and employs a scalable format across legacy and state-of-art devices. Further, the encoding of each image frame at different bitrates and/or resolutions ensures optimal limits on the quality of streaming video content which adapts gracefully to fluctuating mobile networks and consumes lesser bandwidth. Moreover, video personalization by providing different camera angles or different camera exposures for viewers in real-time provides an enjoyable and seamless content viewing experience to the users. Furthermore, it is understood that though various embodiments of the present invention have been explained with reference to image frames, these embodiments can easily be applied to audio frames as well and the same will still be within the scope of the present invention.

[0139] The methods described herein may be performed using the systems described herein. In addition, it is contemplated that the methods described herein may be performed using systems different than the systems described herein. Moreover, the systems described herein may perform the methods described herein and may perform or execute instructions stored in a non-transitory Computer-Readable Storage Medium (CRSM). The CRSM may include any electronic, magnetic, optical, or other physical storage device that stores executable instructions. The instructions may include instructions to cause a processor to perform or control the performance of operations of the proposed methods. It is also contemplated that the systems described herein may perform functions or execute instructions other than those described in relation to the methods and CRSMs described herein.

[0140] Furthermore, the CRSMs described herein may store instructions corresponding to the methods described herein, and may store instructions which may be performed or executed by the systems described herein. Furthermore, it is contemplated that the CRSMs described herein may store instructions different than those corresponding to the methods described herein, and may store instructions which may be performed by systems other than the systems described herein.

[0141] The methods, systems, and CRSMs described herein may include the features or perform the functions described herein in association with any one or more of the other methods, systems, and CRSMs described herein.

[0142] In an embodiment, the method or methods described above may be executed or carried out by a computing system including a tangible computer-readable storage medium, also described herein as a storage machine, that holds machine-readable instructions executable by a logic machine (i.e., a processor or programmable control device) to provide, implement, perform, and/or enact the above-described methods, processes and/or tasks. When such methods and processes are implemented, the state of the storage machine may be changed to hold different data. For example, the storage machine may include memory devices such as various hard disk drives, CD, or DVD devices. The logic machine may execute machine-readable instructions via one or more physical information and/or logic processing devices. For example, the logic machine may be configured to execute instructions to perform tasks for a computer program. The logic machine may include one or more processors to execute the machine-readable instructions. The computing system may include a display subsystem to display a Graphical User Interface (GUI) or any visual element of the methods or processes described above. For example, the display subsystem, storage machine, and logic machine may be integrated such that the above method may be executed while visual elements of the disclosed system and/or method are displayed on a display screen for user consumption. The computing system may include an input subsystem that receives user input. The input subsystem may be configured to connect to and receive input from devices such as a mouse, keyboard or gaming controller. For example, a user input may indicate a request that certain task is to be executed by the computing system, such as requesting the computing system to display any of the above-described information, or requesting that the user input updates or modifies existing stored information for processing. 
A communication subsystem may allow the methods described above to be executed or provided over a computer network. For example, the communication subsystem may be configured to enable the computing system to communicate with a plurality of personal computing devices. The communication subsystem may include wired and/or wireless communication devices to facilitate networked communication. The described methods or processes may be executed, provided, or implemented for a user or one or more computing devices via a computer-program product such as via an Application Programming Interface (API).

[0143] The foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the present invention to the precise forms disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiment was chosen and described in order to best explain the principles of the present invention and its practical application, thereby enabling others skilled in the art to best utilize the present invention and various embodiments with various modifications as are suited to the particular use contemplated.

Claims:
1. A computer-implemented method comprising:
accessing, by a system, historical playback data and viewer behavior data from a database associated with the system, the historical playback data indicating information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period using one or more dynamic playback metrics, the viewer behavior data indicating information related to a behavior of the content viewer during playback of the each media content;
determining, by the system via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data;
receiving, by the system, a media content request for first media content from the content viewer;
accessing, by the system, the first media content from a content repository server associated with a Digital Platform Server (DPS), based, at least in part, on the media content request;
generating, by the system, a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score; and
generating and transmitting, by the system, a manifest file to the content viewer based, at least in part, on the set of adapted encoding profiles.

2. The computer-implemented method as claimed in claim 1, wherein determining the average media playback score for the plurality of media contents, further comprises:
performing, by the system, for each media content comprising a set of media segments, a set of operations comprising:
determining a content identifier (ID) and a plurality of playback tags, each playback tag of the plurality of the playback tags corresponding to each encoding profile of the set of encoding profiles, wherein the each encoding profile indicates an individual bitrate used by the content viewer during playback of a subset of media segments from the set of media segments;
determining a corresponding bitrate for each media segment of the set of media segments based, at least in part, on the content ID, the playback tag, and the historical playback data; and
computing an aggregated bitrate for the each encoding profile of the set of encoding profiles based, at least in part, on aggregating the corresponding bitrate for each media segment of the set of media segments;
computing, by the system, a final aggregated bitrate for the plurality of media content based, at least in part, on aggregating the corresponding aggregated bitrate for each media content of the plurality of media content;
determining, by the system via a machine learning model, a behavior profile for the content viewer based, at least in part, on the viewer behavior data, the behavior profile indicating a behavior of the content viewer when a particular media content is streamed using a particular encoding profile; and
determining, by the system, an average media playback score for the plurality of media contents based, at least in part, on the final aggregated bitrate and the behavior profile.

3. The computer-implemented method as claimed in claim 1, wherein generating the set of adapted encoding profiles, further comprises:
unmuxing, by the system, the first media content into at least one of a plurality of video segments and a plurality of audio segments based, at least in part, on predefined criteria;
determining, by the system, a set of optimal bitrates for the plurality of video segments based, at least in part, on the average media playback score; and
generating, by the system, the set of adapted encoding profiles for the first media content based, at least in part, on the set of optimal bitrates.

4. The computer-implemented method as claimed in claim 3, further comprising:
encoding, by the system, at least one of the plurality of video segments to generate one or more first media content renditions based, at least in part, on the set of adapted encoding profiles; and
facilitating, by the system, a storage of the one or more first media content renditions in the content repository server.

5. The computer-implemented method as claimed in claim 1, further comprising:
selecting, by the system, the one or more first media content renditions from a plurality of first media content renditions stored in the content repository server based, at least in part, on the set of adapted encoding profiles; and
facilitating, by the system, an inclusion of one or more playback Uniform Resource Locators (URLs) associated with the one or more first media content renditions in the manifest file.

6. The computer-implemented method as claimed in claim 3, wherein determining the set of optimal bitrates, further comprises:
accessing, by the system, a set of available bitrates from the database; and
performing, by the system, one of:
selecting the set of optimal bitrates from the set of available bitrates based, at least in part, on the average media playback score being lower than a predefined reference media quality score, wherein the selected set of optimal bitrates indicates lower quality bitrates; and
selecting the set of optimal bitrates from the set of available bitrates based, at least in part, on the average media playback score being at least equal to the predefined reference media quality score, wherein the selected set of optimal bitrates indicates higher quality bitrates.

7. The computer-implemented method as claimed in claim 4, wherein the encoding further comprises:
determining, by the system, a comparison score for the first media content based, at least in part, on comparing the average media playback score with a predefined reference media quality score; and
encoding, by the system, at least one of the plurality of video segments to generate one or more first media content renditions based, at least in part, on the comparison score.

8. The computer-implemented method as claimed in claim 7, wherein generating the set of adapted encoding profiles further comprises:
setting, by the system, at least one video ID in each video segment of the plurality of video segments to identify the plurality of video segments from the plurality of audio segments;
storing, by the system, the plurality of video segments corresponding to each adapted encoding profile of the set of adapted encoding profiles in the database; and
generating, by the system, the set of adapted encoding profiles by encoding at least one of the plurality of video segments with the at least one video ID based, at least in part, on the comparison score.

9. The computer-implemented method as claimed in claim 1, wherein the system is incorporated as a part of one of the content repository server and the DPS.

10. The computer-implemented method as claimed in claim 1, wherein the one or more dynamic playback metrics indicate information related to at least one system parameter and at least one network parameter used in the set of encoding profiles.

11. A system for streaming content on an electronic device of a content viewer, the system comprising:
a memory for storing instructions; and
a processor configured to execute the instructions and thereby cause the system, at least in part, to:
access historical playback data and viewer behavior data from a database associated with the system, the historical playback data indicating information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period using one or more dynamic playback metrics, the viewer behavior data indicating information related to a behavior of the content viewer during playback of the each media content;
determine via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data;
receive a media content request for first media content from the content viewer;
access the first media content from a content repository server associated with a Digital Platform Server (DPS), based, at least in part, on the media content request;
generate a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score; and
generate and transmit a manifest file to the content viewer based, at least in part, on the set of adapted encoding profiles.

12. The system as claimed in claim 11, wherein to generate the average media playback score for the plurality of media contents, the system is further caused, at least in part, to:
perform for each media content comprising a set of media segments, a set of operations comprising:
determine a content identifier (ID) and a plurality of playback tags, each playback tag of the plurality of the playback tags corresponding to each encoding profile of the set of encoding profiles, wherein the each encoding profile indicates an individual bitrate used by the content viewer during playback of a subset of media segments from the set of media segments;
determine a corresponding bitrate for each media segment of the set of media segments based, at least in part, on the content ID, the playback tag, and the historical playback data; and
compute an aggregated bitrate for the each encoding profile of the set of encoding profiles based, at least in part, on aggregating the corresponding bitrate for each media segment of the set of media segments;
compute a final aggregated bitrate for the plurality of media content based, at least in part, on aggregating the corresponding aggregated bitrate for each media content of the plurality of media content;
determine via a machine learning model, a behavior profile for the content viewer based, at least in part, on the viewer behavior data, the behavior profile indicating a behavior of the content viewer when a particular media content is streamed using a particular encoding profile; and
determine an average media playback score for the plurality of media contents based, at least in part, on the final aggregated bitrate and the behavior profile.

13. The system as claimed in claim 11, wherein to generate the set of adapted encoding profiles, the system is further caused, at least in part, to:
unmux the first media content into at least one of a plurality of video segments and a plurality of audio segments based, at least in part, on predefined criteria;
determine a set of optimal bitrates for at least one of the plurality of video segments based, at least in part, on the average media playback score; and
generate the set of adapted encoding profiles for the first media content based, at least in part, on the set of optimal bitrates.

14. The system as claimed in claim 13, wherein the system is further caused, at least in part, to:
encode at least one of the plurality of video segments to generate one or more first media content renditions based, at least in part, on the set of adapted encoding profiles; and
facilitate a storage of the one or more first media content renditions in the content repository server.

15. The system as claimed in claim 11, wherein the system is further caused, at least in part, to:
select the one or more first media content renditions from a plurality of first media content renditions stored in the content repository server based, at least in part, on the set of adapted encoding profiles; and
facilitate an inclusion of one or more playback Uniform Resource Locators (URLs) associated with the one or more first media content renditions in the manifest file.

16. The system as claimed in claim 13, wherein to determine the set of optimal bitrates, the system is further caused, at least in part, to:
access a set of available bitrates from the database; and
perform one of:
select the set of optimal bitrates from the set of available bitrates based, at least in part, on the average media playback score being lower than a predefined reference media quality score, wherein the selected set of optimal bitrates indicates lower quality bitrates; and
select the set of optimal bitrates from the set of available bitrates based, at least in part, on the average media playback score being at least equal to the predefined reference media quality score, wherein the selected set of optimal bitrates indicates higher quality bitrates.

17. The system as claimed in claim 14, wherein to encode the at least one of the plurality of video segments, the system is further caused, at least in part, to:
determine a comparison score for the first media content based, at least in part, on comparing the average media playback score with a predefined reference media quality score; and
encode at least one of the plurality of video segments to generate one or more first media content renditions based, at least in part, on the comparison score.

18. The system as claimed in claim 15, wherein to generate the set of adapted encoding profiles, the system is further caused, at least in part, to:
set at least one video ID in each video segment of the plurality of video segments to identify the plurality of video segments from the plurality of audio segments;
store the plurality of video segments corresponding to each adapted encoding profile of the set of adapted encoding profiles, in the database; and
generate the set of adapted encoding profiles by encoding at least one of the plurality of video segments with the at least one video ID based, at least in part, on the comparison score.
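The segment-tagging step of claim 18 (setting a video ID so that video segments can be distinguished from audio segments) can be sketched as follows. This is a hypothetical illustration only: the claim does not specify an ID scheme or segment representation, so the dictionary layout and the `"vid-<n>"` format here are assumptions.

```python
def tag_video_segments(video_segments, audio_segments):
    """Illustrative sketch of claim 18's tagging step.

    Attaches a video ID to each video segment so video segments can be told
    apart from audio segments downstream. The "vid-<n>" ID format and the
    dict-based segment representation are assumptions, not from the claim.
    """
    tagged_video = [dict(seg, video_id=f"vid-{i}")
                    for i, seg in enumerate(video_segments)]
    untagged_audio = [dict(seg) for seg in audio_segments]  # no video_id set
    return tagged_video, untagged_audio
```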

19. The system as claimed in claim 11, wherein the system is incorporated as a part of one of the content repository server and the DPS.

20. The system as claimed in claim 11, wherein the one or more dynamic playback metrics indicate information related to at least one system parameter and at least one network parameter used in the set of encoding profiles.

21. A non-transitory computer-readable storage medium comprising computer-executable instructions that, when executed by at least a processor of a system, cause the system to perform a method comprising:
accessing historical playback data and viewer behavior data from a database associated with the system, the historical playback data indicating information related to a set of encoding profiles corresponding to each media content of a plurality of media contents viewed by a content viewer over a predefined time period using one or more dynamic playback metrics, the viewer behavior data indicating information related to a behavior of the content viewer during playback of the each media content;
determining, via a machine learning model, an average media playback score for the plurality of media contents based, at least in part, on the historical playback data and the viewer behavior data;
receiving a media content request for first media content from the content viewer;
accessing the first media content from a content repository server associated with a Digital Platform Server (DPS), based, at least in part, on the media content request;
generating a set of adapted encoding profiles for the first media content based, at least in part, on the average media playback score; and
generating and transmitting a manifest file to the content viewer based, at least in part, on the set of adapted encoding profiles.
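The overall pipeline recited in claim 21 (score the viewer's playback history, adapt the encoding profiles, and emit a manifest of playback URLs) can be sketched in a few lines of Python. This is a simplified, hypothetical illustration: the claim delegates scoring to a machine learning model, whereas the arithmetic mean, the 0.6 reference score, the 1600 kbps cutoff, and all names below are assumptions made only for the example.

```python
from dataclasses import dataclass


@dataclass
class ManifestBuilder:
    """Illustrative sketch of the claim-21 flow: score, adapt, emit manifest.

    The scoring rule and thresholds are assumptions; the claim itself uses a
    machine learning model over historical playback and viewer behavior data.
    """
    reference_score: float = 0.6  # assumed reference media quality score

    def average_playback_score(self, history):
        # Stand-in for the ML model: average of per-content playback scores.
        return sum(history) / len(history)

    def adapted_profiles(self, profiles, score):
        # Keep only encoding profiles whose bitrate suits the viewer's score;
        # the 1600 kbps cutoff is an arbitrary illustrative choice.
        limit = 1600 if score < self.reference_score else float("inf")
        return [p for p in profiles if p["bitrate"] <= limit]

    def manifest(self, profiles, score):
        # Manifest as a list of playback URLs for the selected renditions.
        return [p["url"] for p in self.adapted_profiles(profiles, score)]
```

Under these assumptions, a viewer with a low average score would receive a manifest listing only the lower-bitrate rendition URLs, while a high-scoring viewer would receive the full ladder.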

Documents

Application Documents

# Name Date
1 202421027353-STATEMENT OF UNDERTAKING (FORM 3) [02-04-2024(online)].pdf 2024-04-02
2 202421027353-POWER OF AUTHORITY [02-04-2024(online)].pdf 2024-04-02
3 202421027353-FORM 18 [02-04-2024(online)].pdf 2024-04-02
4 202421027353-FORM 1 [02-04-2024(online)].pdf 2024-04-02
5 202421027353-FIGURE OF ABSTRACT [02-04-2024(online)].pdf 2024-04-02
6 202421027353-DRAWINGS [02-04-2024(online)].pdf 2024-04-02
7 202421027353-DECLARATION OF INVENTORSHIP (FORM 5) [02-04-2024(online)].pdf 2024-04-02
8 202421027353-COMPLETE SPECIFICATION [02-04-2024(online)].pdf 2024-04-02
9 202421027353-Proof of Right [30-04-2024(online)].pdf 2024-04-30
10 Abstract1.jpg 2024-05-15
11 202421027353-PA [08-10-2024(online)].pdf 2024-10-08
12 202421027353-ASSIGNMENT DOCUMENTS [08-10-2024(online)].pdf 2024-10-08
13 202421027353-8(i)-Substitution-Change Of Applicant - Form 6 [08-10-2024(online)].pdf 2024-10-08
14 202421027353-RELEVANT DOCUMENTS [22-09-2025(online)].pdf 2025-09-22
15 202421027353-POA [22-09-2025(online)].pdf 2025-09-22
16 202421027353-MARKED COPIES OF AMENDEMENTS [22-09-2025(online)].pdf 2025-09-22
17 202421027353-FORM 13 [22-09-2025(online)].pdf 2025-09-22
18 202421027353-AMENDED DOCUMENTS [22-09-2025(online)].pdf 2025-09-22