A System And Method For Dynamic Generation And Navigation Of Non Linear Videos

Abstract: The invention provides a method and a system for dynamic generation and navigation of non-linear videos in a platform. The method includes interactively receiving a voice input in any vernacular language from a consumer through a voice capture engine. The voice input, which conveys the consumer’s choices and preferences, is processed through an AI based natural language speech processing module and a multilingual speech processing module to form a meaningful query. The interactions with the consumer are stored in a consumer database and are used to generate a personalized non-linear video from pre-existing audio-video content through a personalization engine. Media data containing relevant information is dynamically generated through a next-part generation module. The dynamically generated media data is merged with the pre-existing video through an interactive video player engine to provide a non-linear video to the consumer. (FIG. 1)

Patent Information

Application #

202421012661

Filing Date

22 February 2024

Publication Number

18/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Applicants

KPOINT TECHNOLOGIES PRIVATE LIMITED

201, 2nd Floor, Iriz, Baner - Pashan Link Road, Pashan, Pune- 411021, Maharashtra, India

Inventors

1. Pushyamitra Navare

Kpoint Technologies Private Limited 201, 2nd Floor, Iriz, Baner - Pashan Link Road, Pashan, Pune- 411021, Maharashtra, India

2. Shekhar Bhabad

Kpoint Technologies Private Limited 201, 2nd Floor, Iriz, Baner - Pashan Link Road, Pashan, Pune- 411021, Maharashtra, India

3. Nidhika Gohil

Kpoint Technologies Private Limited 201, 2nd Floor, Iriz, Baner - Pashan Link Road, Pashan, Pune- 411021, Maharashtra, India

4. Aditya Kumar Panda

Kpoint Technologies Private Limited 201, 2nd Floor, Iriz, Baner - Pashan Link Road, Pashan, Pune- 411021, Maharashtra, India

Specification

A SYSTEM AND METHOD FOR DYNAMIC GENERATION AND NAVIGATION OF NON-LINEAR VIDEOS
FIELD OF INVENTION
The invention generally relates to the field of digital assistant systems. In particular, the invention relates to a system and method for dynamic generation and navigation of non-linear videos through a conversational voice interface.
BACKGROUND
As technology advances, human-machine interaction has become an unavoidable part of daily life. Voice-based navigation is an increasingly popular technology that allows users to control devices and applications without physical interaction or visual attention. Voice-based user interfaces use speech technology to give users access to information, allow them to perform desired transactions, and support easy communication.
Existing user-interactive technologies are well suited to tech-savvy people who are conversant with device screens and advanced video features. The user-interactive systems existing in the art respond to the user’s queries using a pre-existing video. However, such systems, which answer a query through fixed navigation interfaces, may not be suitable for the general public, especially geriatric people and people from Tier 2 and Tier 3 cities who are not well versed in the English language or advanced technologies. The existing systems for non-linear video navigation do not support vernacular language processing in voice recognition. They also need advanced hardware and/or media data to be present locally on the same system for the navigation to work. This is not possible with consumer grade hardware, and these systems cannot work in a distributed network environment such as mobile networks and consumers’ personal mobile devices.
Therefore, there is a need in the art for a system and a method for dynamic generation and navigation of non-linear videos through a conversational voice interface which not only interacts with a user on the user’s personal mobile device by providing dynamically generated personalized videos according to the user’s queries, but also supports vernacular language processing in the voice recognition and guides the user through the navigation options with dynamically generated commentaries.
BRIEF DESCRIPTION OF DRAWINGS
So that the recited features of the invention can be understood in detail, some of the embodiments are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
Fig. 1 shows a system for dynamic generation and navigation of non-linear videos through a conversational voice interface, according to an embodiment of the invention.
Fig. 2 shows the conversational voice interface, according to an embodiment of the invention.
SUMMARY OF THE INVENTION
One aspect of the invention provides a method for dynamic generation and navigation of non-linear videos in a platform. The method includes interactively receiving a voice input from a consumer; processing the voice input received from the consumer to form a meaningful query; dynamically generating media data containing relevant information based on a plurality of parameters, wherein the media data is an AI generated video segment based on pre-existing audio-video content in the platform; and merging the dynamically generated media data with the pre-existing video to provide a non-linear video that enables easy navigation through the information.
Another aspect of the invention provides a system for dynamic generation and navigation of non-linear videos in a platform. The system includes a consumer database, a personalization engine, a next-part generation module, a video player engine, a business process module, a conversational voice interface and an analytics engine.
DETAILED DESCRIPTION OF THE INVENTION
The definitions, terms and terminology adopted in the disclosure have their usual meaning and interpretations, unless otherwise specified. The terms “consumer”, “user”, “client” and “viewer” are used interchangeably throughout the description of the invention.
Various embodiments of the invention provide a method and a system for dynamic generation and navigation of non-linear videos in a platform. The method includes receiving a voice input from a consumer through a voice capture engine. The voice capture engine interactively captures inputs from the consumer seeking information. The interactions with the consumer are in any language preferred by the consumer, including vernacular languages. The voice input received from the consumer regarding choices and preferences is processed through an AI based natural language speech processing module and a multilingual speech processing module to form a meaningful query. The interactions with the consumer, including the consumer choices and preferences, are stored in a consumer database. The consumer database further stores information including basic demographic information, past subscriptions, future interests and other business data. The stored information is used to generate a personalized non-linear video from pre-existing audio-video content through a personalization engine. Media data containing relevant information based on the consumer choices and preferences is dynamically generated through a next-part generation module. In one embodiment of the invention, the media data comprises AI generated video segments that are pluggable into the pre-existing audio-video content in the platform. The dynamically generated media data is merged dynamically with the pre-existing video through an interactive video player engine to provide a non-linear video to the consumer. The method allows interactive communication with the consumer in any vernacular language and guides the consumer to take further actions in the non-linear video navigation.
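The flow of the method described above can be pictured with a minimal sketch. It is illustrative only and is not the disclosed implementation: the names `Segment`, `NonLinearVideo`, `form_query`, `generate_next_part`, `merge` and `handle_voice_input` are hypothetical, the speech processing is reduced to text normalisation, and no AI model is invoked.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Segment:
    """A playable piece of video; `source` marks pre-existing vs generated."""
    title: str
    source: str  # "pre-existing" or "generated"


@dataclass
class NonLinearVideo:
    segments: List[Segment] = field(default_factory=list)


def form_query(transcript: str) -> str:
    """Stand-in for the speech processing modules: normalise the transcript."""
    return " ".join(transcript.lower().split())


def generate_next_part(query: str, preferences: Dict[str, str]) -> Segment:
    """Stand-in for the next-part generation module (no real model here)."""
    language = preferences.get("language", "en")
    return Segment(title=f"{query} [{language}]", source="generated")


def merge(video: NonLinearVideo, segment: Segment) -> NonLinearVideo:
    """Stand-in for the video player engine: append the segment as the next part."""
    return NonLinearVideo(segments=video.segments + [segment])


def handle_voice_input(transcript: str,
                       preferences: Dict[str, str],
                       video: NonLinearVideo) -> NonLinearVideo:
    """Receive input, form a query, generate a segment, merge it into the video."""
    query = form_query(transcript)
    segment = generate_next_part(query, preferences)
    return merge(video, segment)
```

In this sketch each step is a plain function, so a stand-in can be swapped for a real component without changing the overall flow; that separation is a design assumption of the example, not a statement about the platform.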
Various embodiments of the invention provide a system for dynamic generation and navigation of non-linear videos. Fig. 1 shows a system for dynamic generation and navigation of non-linear videos through a conversational voice interface, according to an embodiment of the invention. The system includes a consumer database 101, a personalization engine 103, a next-part generation module 105, a video player engine 107, a business process module 109, a conversational voice interface 111 and an analytics engine 113. The consumer database 101 stores information including basic demographic information, past subscriptions, future interests and other business data. In one embodiment of the invention, the stored information is used to generate a personalized non-linear video navigation experience for the consumer. In another embodiment of the invention, the information is used to generate a dynamic, non-linear video experience for the consumer. The personalization engine 103 provides a personalized non-linear video navigation experience to the consumer, taking into account the consumer demographic information, consumer preferences, consumer interests, consumer business goals, and consumer buying history. Thus, the non-linear video navigation experience differs for each individual based on their preferences and in-video selections. The next-part generation module 105 dynamically generates personalized video segments in accordance with the choices and queries made by the consumer. The next-part generation module 105 includes, but is not limited to, generative adversarial networks, video segmentation modules and other AI based models that allow dynamic creation of video segments. The next-part generation module 105 allows dynamic creation of video segments personalized to consumer choices and seamlessly blends pre-existing video content with the video segments.
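As a rough illustration of how stored consumer information could drive personalization, the sketch below orders candidate topics for one consumer. The `ConsumerRecord` fields, the `rank_topics` function and the scoring weights are assumptions made for the example; the specification does not define a schema or a scoring rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConsumerRecord:
    """Hypothetical subset of what the consumer database 101 might hold."""
    consumer_id: str
    preferred_language: str
    interests: List[str] = field(default_factory=list)
    buying_history: List[str] = field(default_factory=list)


def rank_topics(record: ConsumerRecord, topics: List[str]) -> List[str]:
    """Order topics so that those matching the consumer's data come first."""
    def score(topic: str) -> int:
        # Assumed weights: a stated interest counts double a past purchase.
        return 2 * (topic in record.interests) + (topic in record.buying_history)

    # sorted() is stable, so topics with equal scores keep their input order.
    return sorted(topics, key=score, reverse=True)
```

For instance, a record with interest "term life" and a past "health" purchase would see those two topics placed ahead of unrelated ones, which is one simple way two consumers could receive different orderings of the same content.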
The video player engine 107 takes the dynamically generated video segments from the next-part generation module 105 and dynamically generates a personalized non-linear video for the consumer. The dynamically generated personalized non-linear videos are merged into the playing video stream as the next part to ensure a continuous non-linear video engagement experience. The video player engine 107 generates the experience dynamically by fetching the necessary media data from the video platform servers. The video player engine 107 also converses with the consumer through text-to-speech in their preferred language and guides the consumer to take further actions in the non-linear video navigation.
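Merging a generated segment into the playing stream "as the next part" can be pictured as inserting it immediately after the segment currently playing. The following is a minimal sketch under that assumption; the `PlaybackQueue` class and its methods are hypothetical, and segments are represented by plain strings rather than media data.

```python
from typing import List, Optional


class PlaybackQueue:
    """Toy model of a playing stream with a pointer to the current segment."""

    def __init__(self, segments: List[str]) -> None:
        self.segments = list(segments)
        self.position = 0

    def current(self) -> Optional[str]:
        """Return the segment being played, or None once playback has ended."""
        if self.position < len(self.segments):
            return self.segments[self.position]
        return None

    def insert_next(self, segment: str) -> None:
        """Place a newly generated segment right after the current one."""
        self.segments.insert(self.position + 1, segment)

    def advance(self) -> Optional[str]:
        """Move to the following segment and return it."""
        self.position += 1
        return self.current()
```

Because the insertion happens behind the playback pointer's next step, the segment already playing is never interrupted, which matches the continuous experience the paragraph describes.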
The conversational voice interface 111 allows consumers to interact with the system and make choices through voice conversation. In one embodiment of the invention, the conversational voice is given in any vernacular language. In an example of the invention, the conversational voice is given in the consumer’s preferred spoken language. The business process module 109 provides business specific directions to the system based on business specific intelligence. The analytics engine 113 captures all the interactions in the non-linear video navigation between the consumer and the system. The information captured in the analytics engine 113 is used to further understand the needs and preferences of the consumer and to make predictions for building future non-linear video navigation experiences.
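One simple way to picture the analytics engine capturing interactions and using them for prediction is a per-consumer tally of selections. The sketch below is an assumption-laden illustration: `AnalyticsLog`, `record` and `top_choices` are hypothetical names, and "prediction" is reduced to reporting the most frequent past choice.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List


class AnalyticsLog:
    """Counts each consumer's in-video selections."""

    def __init__(self) -> None:
        self._choices: DefaultDict[str, Counter] = defaultdict(Counter)

    def record(self, consumer_id: str, choice: str) -> None:
        """Capture one interaction."""
        self._choices[consumer_id][choice] += 1

    def top_choices(self, consumer_id: str, n: int = 1) -> List[str]:
        """Return the n most frequent choices, as a naive preference estimate."""
        return [choice for choice, _ in self._choices[consumer_id].most_common(n)]
```

A real analytics engine would presumably use richer signals than raw counts; the tally only shows where captured interactions could feed back into future navigation experiences.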
The video player engine 107 captures the consumer selections and passes them to the business process module 109 to identify and generate the next step of the video. The video player engine 107 provides playback of relevant non-linear video experiences. It is driven by the inputs from the conversational voice interface 111, which are in turn generated by the personalization engine 103 from the stored information in the consumer database 101. As an intermediate step, the system gives the consumer a visual and voice prompt for the next step by displaying options, so that the consumer can navigate through the non-linear video easily and interact with the system comfortably.
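The prompt-and-select step can be sketched as a small navigation graph in which each video node lists the options offered to the consumer and the node each option leads to. The `NAVIGATION` table, its node names and the two helper functions are hypothetical and exist only to illustrate the idea of non-linear navigation.

```python
from typing import Dict, List

# Hypothetical graph: node -> {spoken option -> next node}.
NAVIGATION: Dict[str, Dict[str, str]] = {
    "overview": {"premium": "premium_details", "claims": "claims_process"},
    "premium_details": {"back": "overview"},
    "claims_process": {"back": "overview"},
}


def prompt_options(node: str) -> List[str]:
    """Options to show and speak to the consumer at the given node."""
    return sorted(NAVIGATION[node])


def next_node(node: str, selection: str) -> str:
    """Follow the consumer's selection; stay on the same node if it is not recognised."""
    return NAVIGATION[node].get(selection, node)
```

Staying on the current node for an unrecognised selection is a choice made for this sketch; a deployed system might instead re-prompt or ask a clarifying question.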
Fig. 2 shows the conversational voice interface, according to an embodiment of the invention. The conversational voice interface 111 includes a voice capture engine 111a, an AI based natural language speech processing module 111b and a multilingual speech processing module 111c. The voice capture engine 111a interactively captures the conversation of the consumer. Upon capture, the conversational voice is processed by the AI based natural language speech processing module 111b and the multilingual speech processing module 111c. The AI based natural language speech processing module 111b interprets the conversation given by the consumer through speech recognition, speech to text conversion and translation. The natural language processing models include, but are not limited to, transformer models, speech to text models, text to speech models and multi-language speech to text models. The multilingual speech processing module 111c identifies the language spoken by the consumer, converts the speech into text and translates the text into the required language. In one embodiment of the invention, the conversational voice is in any vernacular language. In another embodiment of the invention, the conversational voice is in the consumer’s preferred spoken language.
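The sequence described for Fig. 2 (capture, language identification, speech to text, translation) can be sketched as a function that chains three injected components. Everything named here is an assumption for illustration: `Query`, `build_query`, the callable signatures and the default working language "en" are not taken from the specification, and no particular speech library is implied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Query:
    language: str  # language detected in the consumer's speech
    text: str      # transcript expressed in the working language


def build_query(audio: bytes,
                detect_language: Callable[[bytes], str],
                transcribe: Callable[[bytes, str], str],
                translate: Callable[[str, str, str], str],
                working_language: str = "en") -> Query:
    """Turn captured audio into a text query in the working language."""
    language = detect_language(audio)
    transcript = transcribe(audio, language)
    # Translate only when the consumer spoke a different language.
    if language != working_language:
        transcript = translate(transcript, language, working_language)
    return Query(language=language, text=transcript.strip())
```

Passing the components in as callables keeps the sketch independent of any specific speech to text or translation model, so stubs can stand in for them during testing.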
The method and system for dynamic generation and navigation of non-linear videos through a conversational voice interface provide dynamically generated personalized videos according to the consumer’s queries. The system also supports vernacular language processing in the voice recognition through the conversational voice interface 111 and guides the consumer through the non-linear video navigation options with dynamically generated commentaries, thereby making it more consumer-friendly, interactive and engaging. The system is configured to work with consumer grade hardware and in a distributed network environment such as mobile networks and consumers’ personal mobile devices.
Example 1: In one example of the invention, the system is used in the insurance or banking sector. A customer who is trying to understand the details of an insurance policy or investment options can simply converse with the system about their queries. The system creates a customized video on the spot, addressing the queries of the customer. The system initially gives a brief overview and then allows the customer to choose the aspect he/she wants to know more about. The information video is not presented in a linear way. The customer can move directly to the part that matters most to him/her. The system dynamically adapts to the queries of the customer and guides the customer through the dynamically generated video with dynamically generated commentaries, tailored to the needs of the customer.
Example 2: In another example of the invention, the system is used in an e-commerce platform. Instead of scrolling through endless product pages, the customer can ask the system about the required products. The system generates a non-linear video showcasing various products and dynamically highlighting their key features. The system provides personalized, interactive product exploration and post-purchase assistance including, but not limited to, tracking the order and warranty details of the purchased product.
The foregoing description of the invention has been set forth merely to illustrate the invention and is not intended to be limiting. Since modifications of the disclosed embodiments incorporating the spirit and substance of the invention may occur to persons skilled in the art, the invention should be construed to include everything within the scope of the appended claims and equivalents thereof.
CLAIMS: We Claim:
1. A method for dynamic generation and navigation of a non-linear video in a platform, the method comprising:
receiving interactively, through a voice capture engine (111a), a voice input from a consumer seeking an information;
processing the voice input received from the consumer, through an AI based natural language speech processing module (111b) and a multilingual speech processing module (111c), to form a meaningful query;
dynamically generating, through a next-part generation module (105), a media data containing relevant information based on a plurality of parameters wherein the media data is an AI generated video segment based on a pre-existing audio-video content in the platform; and
merging the dynamically generated media data to the pre-existing video, through an interactive video player engine (107), to provide a non-linear video that enables easy navigation through the information.
2. The method as claimed in claim 1, wherein the method further comprises personalising the consumer experience of navigating and generating a non-linear video through a personalization engine (103) and an analytics engine (113).
3. The method as claimed in claim 1, wherein the voice input is received in any vernacular language.
4. The method as claimed in claim 1, wherein the voice input is processed through natural language processing models including transformer models, speech to text models, text to speech models and multi-language speech to text models.
5. The method as claimed in claim 1, wherein the method allows interactive communication with the consumer in any vernacular language and guides the consumer to take further actions in the non-linear video navigation.
6. The method as claimed in claim 1, wherein the plurality of parameters for generating a non-linear video are collected and stored in a consumer database (101) wherein the plurality of parameters includes consumer interaction data, consumer demographic information, consumer preferences, consumer interests, consumer business goals, consumer business data and consumer buying history.
7. The method as claimed in claim 1, wherein the analytics engine (113) is configured for capturing consumer interaction data to understand the preferences of the consumer and make predictions for building future non-linear video navigation experiences.
8. The method as claimed in claim 1, wherein the next-part generation module (105) includes generative adversarial networks and video segmentation modules.

Documents

Application Documents

#	Name	Date
1	202421012661-PROVISIONAL SPECIFICATION [22-02-2024(online)].pdf	2024-02-22
2	202421012661-FORM FOR SMALL ENTITY(FORM-28) [22-02-2024(online)].pdf	2024-02-22
3	202421012661-FORM FOR SMALL ENTITY [22-02-2024(online)].pdf	2024-02-22
4	202421012661-FORM 1 [22-02-2024(online)].pdf	2024-02-22
5	202421012661-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [22-02-2024(online)].pdf	2024-02-22
6	202421012661-EVIDENCE FOR REGISTRATION UNDER SSI [22-02-2024(online)].pdf	2024-02-22
7	202421012661-DRAWINGS [22-02-2024(online)].pdf	2024-02-22
8	202421012661-DECLARATION OF INVENTORSHIP (FORM 5) [22-02-2024(online)].pdf	2024-02-22
9	202421012661-Proof of Right [05-03-2024(online)].pdf	2024-03-05
10	202421012661-FORM-26 [05-03-2024(online)].pdf	2024-03-05
11	202421012661-ENDORSEMENT BY INVENTORS [05-03-2024(online)].pdf	2024-03-05
12	202421012661-RELEVANT DOCUMENTS [21-02-2025(online)].pdf	2025-02-21
13	202421012661-POA [21-02-2025(online)].pdf	2025-02-21
14	202421012661-FORM-26 [21-02-2025(online)].pdf	2025-02-21
15	202421012661-FORM 3 [21-02-2025(online)].pdf	2025-02-21
16	202421012661-FORM 13 [21-02-2025(online)].pdf	2025-02-21
17	202421012661-DRAWING [21-02-2025(online)].pdf	2025-02-21
18	202421012661-COMPLETE SPECIFICATION [21-02-2025(online)].pdf	2025-02-21
19	Abstract.jpg	2025-04-11
20	202421012661-MSME CERTIFICATE [25-04-2025(online)].pdf	2025-04-25
21	202421012661-FORM28 [25-04-2025(online)].pdf	2025-04-25
22	202421012661-FORM-9 [25-04-2025(online)].pdf	2025-04-25
23	202421012661-FORM 18A [25-04-2025(online)].pdf	2025-04-25
24	202421012661-FER.pdf	2025-07-15

Search Strategy

1	202421012661_SearchStrategyNew_E_SearchStrategyE_19-05-2025.pdf