Abstract: The embodiments herein provide systems and methods for generating questions from a multimedia, the method comprising extracting textual information from a multimedia. Further, the method includes extracting at least one sentence from the extracted textual information associated with the multimedia. Further, the method includes parsing the extracted at least one sentence present in the extracted textual information. Further, the method includes comparing the parsed at least one sentence with at least one predefined question template. Further, the method includes generating the at least one question, when the parsed at least one sentence matches with at least one predefined question template. Further, the method includes filtering and ranking the generated at least one question to extract relevant questions from the generated at least one question. Further, the method includes displaying the at least one question to a user while watching the multimedia. The at least one question includes descriptive questions. FIG. 2
Claims: What is claimed is:
1. A method for generating questions from a multimedia, the method comprising:
extracting, by an electronic device (100), a textual information from a multimedia;
extracting, by the electronic device (100), at least one sentence from the extracted textual information associated with the multimedia;
parsing, by the electronic device (100), the extracted at least one sentence present in the extracted textual information;
comparing, by the electronic device (100), the parsed at least one sentence with at least one predefined question template; and
generating, by the electronic device (100), the at least one question, when the parsed at least one sentence matches with at least one predefined question template.
2. The method of claim 1, wherein the method further includes filtering and ranking the generated at least one question to extract relevant questions from the generated at least one question.
3. The method of claim 1, wherein the method further includes displaying, by the electronic device (100), the at least one question to a user while watching the multimedia.
4. The method of claim 1, wherein the at least one question includes at least one descriptive question.
5. An electronic device (100) for generating questions from a multimedia, the electronic device comprising:
a text extractor unit (102) configured to
extract a textual information from a multimedia;
a sentence extractor unit (104) configured to
extract at least one sentence from the extracted textual information associated with the multimedia;
a question generation engine (106) configured to
parse the extracted at least one sentence present in the extracted textual information;
compare the parsed at least one sentence with at least one predefined question template; and
generate the at least one question, when the parsed at least one sentence matches with at least one predefined question template.
6. The electronic device (100) of claim 5, wherein the question generation engine (106) is further configured to filter and rank the generated at least one question to extract relevant questions from the generated at least one question.
7. The electronic device (100) of claim 5, further comprising a display unit (108) configured to display the at least one question to a user while watching the multimedia.
8. The electronic device (100) of claim 5, wherein the at least one question includes at least one descriptive question.
Description: CROSS REFERENCE TO RELATED APPLICATION
This divisional application is based on and derives the benefit of Indian Provisional Application number 201641041399 filed on 3rd December 2016, the contents of which are hereby incorporated herein by reference.
TECHNICAL FIELD
[001] The embodiments herein relate to electronic devices and, more particularly, to generating questions from a multimedia.
BACKGROUND
[002] Generally, multimedia (for example, videos) is used to share information and messages on the Internet. It is now easy to make videos and capture one's thought process. There is no limitation on the length of videos, resulting in long videos. There are various kinds of videos, such as entertainment, informative, and interactive videos. Due to the long duration of the videos, a viewer might find them uninteresting or lose attention midway, which may even lead to the viewer losing out on important information.
[003] Furthermore, information gathering on online video channels is often non-interactive. Because of the non-interactive nature of the video channels and the limited viewer attention span, the viewer is likely to disengage from videos of longer duration. The viewer's disengagement may result in a significant drop in interest in understanding the rest of the video. Thus, multimedia content providers may find it difficult to validate the viewer’s understanding and assess the viewer’s learning intent. The multimedia content providers are realizing that interactive multimedia content (for example, video) is more engaging than passive or uni-directional multimedia content. Therefore, the multimedia content providers are finding ways to make the multimedia content interactive, so that they can evaluate, analyze and assess the viewer's understanding and learning intent. Similarly, the viewer can also evaluate, reflect on and assess his or her understanding and learning intent of the multimedia content.
BRIEF DESCRIPTION OF THE FIGURES
[004] The embodiments disclosed herein will be better understood from the following detailed description with reference to the drawings, in which:
[005] FIG. 1 is a block diagram illustrating various units of an electronic device for generating questions from a multimedia, according to embodiments as disclosed herein;
[006] FIG. 2 is a flow diagram illustrating a method for generating questions from a multimedia, according to embodiments as disclosed herein; and
[007] FIG. 3 is an example block diagram illustrating a question generation engine for generating questions from a multimedia, according to embodiments as disclosed herein.
DETAILED DESCRIPTION OF EMBODIMENTS
[008] The embodiments herein and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the embodiments herein. The examples used herein are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the examples should not be construed as limiting the scope of the embodiments herein.
[009] The embodiments herein provide systems and methods for generating questions from a multimedia, the method comprising of extracting textual information from the multimedia. Further, the method includes extracting at least one sentence from the extracted textual information associated with the multimedia. Further, the method includes parsing the at least one sentence extracted from the textual information associated with the multimedia. Further, the method includes comparing the parsed at least one sentence with at least one predefined question template. Further, the method includes generating the at least one question, when the parsed at least one sentence matches with the at least one predefined question template. Further, the method includes filtering and ranking the generated at least one question to extract relevant questions from the generated at least one question. Further, the method includes displaying the at least one question to a user while watching the multimedia. In an embodiment, the at least one question includes at least one descriptive question. Referring now to the drawings, and more particularly to FIGS. 1 through 3, where similar reference characters denote corresponding features consistently throughout the figures, there are shown embodiments.
[0010] FIG. 1 is a block diagram illustrating various units of an electronic device 100 for generating questions from a multimedia, according to embodiments as disclosed herein.
[0011] In an embodiment, the electronic device 100 can be at least one of, but not restricted to, a mobile phone, a smartphone, tablet, a phablet, a personal digital assistant (PDA), a laptop, a computer, a wearable computing device, a smart TV, wearable device (for example, smart watch, smart band), or any other electronic device which has the capability of accessing and playing multimedia content or accessing an application (such as a browser) which can access and display multimedia content. The electronic device 100 includes a text extractor unit 102, a sentence extractor unit 104, a question generation engine 106, a display unit 108, a communication interface unit 110 and a memory 112.
[0012] The text extractor unit 102 can be configured to extract textual information from a multimedia. The multimedia can be for example, in the form of video, audio and video, textual content present in image and video format, animations with audio, text or the like. The sentence extractor unit 104 can be configured to extract at least one sentence from the extracted textual information associated with the multimedia. The question generation engine 106 can be configured to parse the at least one sentence present in the extracted textual information associated with the multimedia. Further, the question generation engine 106 can be configured to compare the parsed at least one sentence with at least one predefined question template. Further, the question generation engine 106 can be configured to generate at least one question, when the parsed at least one sentence matches with at least one predefined question template. Further, the question generation engine 106 can be configured to filter and rank the generated at least one question to extract relevant questions from the generated at least one question. The display unit 108 can be configured to display the at least one question to a user, while the user is watching the multimedia. The at least one question includes at least one descriptive question.
[0013] The communication interface unit 110 can be configured to establish communication between the electronic device 100 and at least one external entity such as a network.
[0014] The memory 112 can be configured to store multimedia and textual information generated from the respective multimedia. The memory 112 may include one or more computer-readable storage media. The memory 112 may include non-volatile storage elements. Examples of such non-volatile storage elements may include magnetic hard discs, optical discs, floppy discs, flash memories, or forms of electrically programmable memories (EPROM) or electrically erasable and programmable (EEPROM) memories. In addition, the memory 112 may, in some examples, be considered a non-transitory storage medium. The term “non-transitory” may indicate that the storage medium is not embodied in a carrier wave or a propagated signal. However, the term “non-transitory” should not be interpreted to mean that the memory 112 is non-movable. In some examples, the memory 112 can be configured to store larger amounts of information than a volatile memory. In certain examples, a non-transitory storage medium may store data that can, over time, change (e.g., in Random Access Memory (RAM) or cache).
[0015] FIG. 1 shows exemplary units of the electronic device 100, but it is to be understood that other embodiments are not limited thereon. In other embodiments, the electronic device 100 may include a fewer or greater number of units. Further, the labels or names of the units are used only for illustrative purposes and do not limit the scope of the embodiments herein. One or more units can be combined together to perform the same or a substantially similar function in the electronic device 100.
[0016] The embodiments herein generate a context dependent set of questions using machine learning (ML) and natural language processing (NLP) techniques from the multimedia (for example, spoken audio or text recognized from the spoken audio in the media) under analysis. Automatic Speech Recognition (ASR) operates on the audio to generate the textual information. Further, the sentences of the textual information can be parsed to identify suitable sentences based on context and to automatically generate questions that enhance/test the understanding of the textual information.
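As a minimal illustration of moving from an ASR transcript to candidate sentences, the sketch below assumes the transcript already contains punctuation (raw ASR output may need punctuation restoration first):

```python
import re

def extract_sentences(transcript: str) -> list[str]:
    """Split an ASR transcript into candidate sentences.

    A minimal sketch: it splits on sentence-ending punctuation followed
    by whitespace, which presumes the transcript is already punctuated.
    """
    parts = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [p.strip() for p in parts if p.strip()]

transcript = ("What is the purpose of any communication system? "
              "The purpose of any communication system is to transmit some signal.")
print(extract_sentences(transcript))
```

Each extracted sentence then becomes an input to the parsing and template-matching steps described below.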
[0017] In an embodiment, if an image is the multimedia under analysis, then the context dependent set of questions can be auto-generated using computer vision (CV) machine learning artificial intelligence (AI) techniques. This technique uses image/object detection and recognition techniques (for example, optical character recognition (OCR)) from CV to detect objects in the image. The detected objects can then form a description which can be rendered as textual information to generate a set of questions.
[0018] In an embodiment, generating a context dependent set of questions from PDF documents using machine learning artificial intelligence (AI) includes using at least one of image/object detection and recognition techniques from computer vision (CV), or text extraction techniques, to extract the textual information. Further, from the extracted textual information, the context dependent set of questions can be auto-generated using machine learning AI techniques.
[0019] The embodiments herein provide an electronic device 100 configured to use at least one of image processing, speech recognition, natural language processing, machine learning and neural networks to determine what is inside the multimedia (for example, a video) by generating textual information for the video using a combination of video frame analysis and textual information summarization.
[0020] FIG. 2 is a flow diagram illustrating a method for generating questions from the multimedia, according to embodiments as disclosed herein.
[0021] At step 202, the method includes extracting textual information from the multimedia. The method allows the text extractor unit 102 to extract the textual information from the multimedia.
[0022] At step 204, the method includes extracting at least one sentence from the extracted textual information associated with the multimedia. The method allows the sentence extractor unit 104 to extract at least one sentence from the extracted textual information associated with the multimedia.
[0023] At step 206, the method includes parsing the at least one sentence present in the extracted textual information associated with the multimedia. The method allows the question generation engine 106 to parse the at least one sentence present in the extracted textual information associated with the multimedia.
[0024] At step 208, the method includes comparing the parsed at least one sentence with at least one predefined question template. The method allows the question generation engine 106 to compare the parsed at least one sentence with at least one predefined question template.
[0025] At step 210, the method includes generating the at least one question, when the parsed at least one sentence matches with at least one predefined question template. The method allows the question generation engine 106 to generate the at least one question, when the parsed at least one sentence matches with the at least one predefined question template.
[0026] At step 212, the method includes filtering and ranking the generated at least one question to extract relevant questions from the generated at least one question. The method allows the question generation engine 106 to filter and rank the generated at least one question to extract relevant questions from the generated at least one question.
[0027] At step 214, the method includes displaying the at least one question to a user while watching the multimedia. The method allows the display unit 108 to display the at least one question to a user while watching the multimedia.
[0028] The various actions, acts, blocks, steps, or the like in the method and the flow diagram 200 may be performed in the order presented, in a different order or simultaneously. Further, in some embodiments, some of the actions, acts, blocks, steps, or the like may be omitted, added, modified, skipped, or the like without departing from the scope of the invention.
[0029] FIG. 3 is an example block diagram illustrating the question generation engine for generating questions from the multimedia, according to embodiments as disclosed herein. Questions can be generated automatically using the textual information extracted using a speech decoder and the video frames output from a video frame extractor. In addition, the questions are also automatically generated using a curated text created based on a viewer’s profile and recently captured responses. The question generation engine 106 can be configured to extract contextual keywords which help in determining the relevant context in the video content. The question generation engine 106 can be further configured to transform visual descriptions of the video frames into relevant questions. The question generation engine 106 can be further configured to generate new questions for the visual patterns inside the video frames. The question generation engine 106 further includes a question bank which can be indexed using tagged questions with the domain relevance and with the time sections inside the video. The question generation engine 106 can be further configured to render a quiz intelligently using interactive inputs from the viewer.
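The question bank indexed by time sections might be sketched as follows; the (start, end) section bounds and the tag field are illustrative assumptions, not the engine's actual schema:

```python
class QuestionBank:
    """Index questions by (start, end) time sections of a video.

    A sketch of the indexing idea only: each entry pairs a question with
    the section of the video it belongs to, plus hypothetical tags.
    """
    def __init__(self):
        self.sections = []  # sorted list of (start_sec, end_sec, question, tags)

    def add(self, start, end, question, tags=()):
        self.sections.append((start, end, question, tuple(tags)))
        self.sections.sort()

    def questions_until(self, t):
        """Questions whose section has fully played by time t."""
        return [q for s, e, q, _ in self.sections if e <= t]

bank = QuestionBank()
bank.add(0, 90, "What is the purpose of any communication system?", ("communication",))
bank.add(90, 180, "What is a transducer?", ("transducer",))
print(bank.questions_until(120))
```

A quiz renderer could call `questions_until` at the current playback position to pop up only the questions whose material has already been watched.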
[0030] The question generation engine 106 uses the predefined question template matching technique based on semantic parsing of sentences, for automatic generation of questions. The templates are independent of the domain of the textual information and can generate questions of variable difficulty that require good understanding of the input text to answer. The generated questions can be completely related to the context of the given textual information and do not use any database of facts or facts retrieved from the internet to generate questions.
[0031] The process of Automatic Question Generation can be divided into one or more of the below stages.
1. Preprocessing stage
2. Generation stage
3. Post processing stage, the post processing stage further includes at least one of
I. Filtering
II. Correction
III. Ranking
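The three stages above can be sketched end to end as one stdlib-only pipeline; the `templates` argument is a hypothetical list of (matcher, builder) pairs standing in for the engine's semantic-parse templates, and sentence splitting stands in for full parsing:

```python
def generate_questions(text, templates, keywords):
    """Three-stage pipeline sketch: preprocess -> generate -> post-process."""
    # Preprocessing: split the extracted text into sentences.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    # Generation: apply every template whose matcher accepts the sentence.
    questions = [build(s) for s in sentences for match, build in templates if match(s)]
    # Post-processing: keep keyword-bearing questions, then rank.
    questions = [q for q in questions if any(k in q.lower() for k in keywords)]
    return sorted(questions, key=len)  # stand-in for the ML ranking stage

# A toy 'X is Y' -> 'What is X?' template.
templates = [(lambda s: " is " in s,
              lambda s: "What is " + s.split(" is ")[0].lower() + "?")]
text = "A transducer is a device that converts energy. The channel is the medium"
print(generate_questions(text, templates, ["transducer", "channel"]))
```

The real stages, detailed below, replace each stand-in with parsing, template matching, and the filtering/correction/ranking strategies.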
[0032] Preprocessing stage: The textual information extracted from the multimedia is given as input to the question generation (QG) engine 106. In the preprocessing stage, the QG engine 106 can be configured to parse the sentences present in the extracted textual information. Further, the QG engine 106 can be configured to match each sentence parse with the predefined question templates; if a match occurs, the parse and sentence data are sent to the generation stage for further processing. Additional templates can be added depending on the type of questions that are to be generated.
[0033] For example, the extracted textual information includes the sentence: ‘Sachin played his last international Cricket match in Mumbai’. The sentence can be parsed using the parser. The output of the parser is shown below:
Sentence parse: [Sachin] [played] [his last international Cricket match] [in] [Mumbai]
[0034] Further, the parsed sentence can be compared with a list of predefined templates. The predefined templates matched with the parsed sentence are shown below:
1. Who [predicate]?
2. Where did [subject] [verb]?
[0035] Generation Stage: The matched sentences are converted to descriptive/subjective questions using the predefined question templates. The predefined question templates provide information on what type of question can be generated, for example a ‘Who/What/When’ question or the like, based on named entities or other semantic labels. The generated questions are sent to the post processing stage. Further, the question generation engine 106 can be configured to generate questions based on the matched predefined templates. For example, the generated questions are shown below:
1. Who played his last international Cricket match in Mumbai?
2. Where did Sachin play his last international Cricket match?
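A hedged sketch of the match-and-generate step for this example: a single regex stands in for the semantic parse, the verb lemma 'play' is hard-coded (a real system would lemmatize), and the two built strings mirror the 'Who ...?' and 'Where did ...?' templates:

```python
import re

# One regex stands in for the semantic parser: it chunks the sentence
# into subject, verb, object, and location, like the bracketed parse above.
PATTERN = re.compile(r"^(?P<subj>\w+) (?P<verb>played) (?P<obj>.+) in (?P<loc>\w+)$")

def generate_from_templates(sentence):
    m = PATTERN.match(sentence.rstrip("."))
    if not m:
        return []  # no template matched, so no question is generated
    subj, verb, obj, loc = m.group("subj", "verb", "obj", "loc")
    return [
        f"Who {verb} {obj} in {loc}?",    # 'Who [predicate]?' template
        f"Where did {subj} play {obj}?",  # 'Where did [subject] [verb]?' template
    ]

print(generate_from_templates(
    "Sachin played his last international Cricket match in Mumbai"))
```

Each additional template in the real engine would contribute another parse pattern and question builder of this shape.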
[0036] Post processing Stage: The generated questions from the generation stage are filtered, corrected and ranked using at least one of the below strategies.
[0037] Filtering Stage: This stage filters out any questions that can be considered less grammatical/useful for learning or any other criteria as appropriate. This involves at least one of keyword based filtering, filtering based on pronouns in main clause, and filtering based on number of words in the questions.
[0038] Keyword filtering: If no keyword or key phrase of the textual information is present in the generated question, then the question can be discarded.
[0039] Pronoun filtering: The questions obtained after keyword filtering are syntactically parsed. If the resultant parse has a pronoun in the main clause, then the questions of that kind can be discarded.
[0040] Number of words filtering: Any question not having a keyword/key phrase or a noun phrase, or having very few or too many words, can be discarded.
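The three filters can be sketched as simple predicates; the word-count thresholds and pronoun list are assumptions, and the pronoun check here scans the whole question rather than only the main clause of a syntactic parse:

```python
import re

def keyword_filter(questions, keywords):
    """Discard questions containing none of the extracted keywords/key phrases."""
    return [q for q in questions if any(k.lower() in q.lower() for k in keywords)]

def pronoun_filter(questions, pronouns=("it", "he", "she", "they", "this")):
    """Crude stand-in for the parse-based check: any whole-word pronoun
    occurrence discards the question."""
    return [q for q in questions
            if not any(re.search(rf"\b{p}\b", q, re.IGNORECASE) for p in pronouns)]

def length_filter(questions, min_words=4, max_words=25):
    """Discard questions with too few or too many words (thresholds assumed)."""
    return [q for q in questions if min_words <= len(q.split()) <= max_words]

questions = [
    "What is a transducer?",
    "What is it and how can it generate the electrical signal?",
    "What signal?",
]
kept = length_filter(pronoun_filter(keyword_filter(questions, ["transducer", "signal"])))
print(kept)  # only the first question survives all three filters
```

Chaining the filters in this order mirrors the keyword, pronoun, and word-count passes applied to the worked example later in this description.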
[0041] Correction Stage: In an embodiment, the questions obtained from the filtering stage can be sent for co-reference resolution. Later, the questions can be sent for grammar correction. Co-reference occurs when two or more expressions in a text refer to the same person or thing, i.e., they have the same referent. Co-reference resolution is the task of finding all expressions that refer to the same entity in a text.
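Co-reference resolution normally requires a trained model; the following deliberately naive sketch only substitutes a supplied antecedent (`last_entity`, a hypothetical input) for third-person pronouns, purely to illustrate the correction step:

```python
def resolve_pronouns(question, last_entity):
    """Replace third-person pronouns with a known antecedent.

    A toy stand-in for co-reference resolution: `last_entity` is assumed
    to come from upstream context tracking, which a real resolver would
    compute itself.
    """
    replacements = {"he": last_entity, "she": last_entity, "it": last_entity}
    out = []
    for w in question.split():
        bare = w.strip("?,.").lower()  # ignore trailing punctuation
        if bare in replacements:
            out.append(w.replace(w.strip("?,."), replacements[bare]))
        else:
            out.append(w)
    return " ".join(out)

print(resolve_pronouns("Where did he play his last match?", "Sachin"))
```

A question corrected this way would then pass the pronoun filter that previously discarded it.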
[0042] Ranking Stage: The questions obtained from the correction stage are ranked using machine learning techniques that give scores to the questions based on at least one of grammaticality, usefulness for learning, or the like. Further, the final list of questions is sorted based on the ranking scores.
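The ranking heuristic below is purely illustrative: the sample scores later in this description (e.g. 358.62) come from ML models, while this sketch scores keyword coverage plus a brevity bonus, so its ordering need not match the sample ranking:

```python
def score_question(question, keywords):
    """Toy score: 10 points per matched keyword plus a brevity bonus.

    An assumption-laden stand-in for the ML scorer described above.
    """
    words = question.lower().split()
    keyword_hits = sum(1 for k in keywords if k.lower() in question.lower())
    brevity = max(0, 25 - len(words))
    return keyword_hits * 10 + brevity

def rank_questions(questions, keywords):
    """Sort questions by descending toy score."""
    return sorted(questions, key=lambda q: score_question(q, keywords), reverse=True)

qs = [
    "What generates some electrical signal, which is possibly captured "
    "from some real life image or audio?",
    "What is the purpose of any communication system?",
    "What is a transducer?",
]
for q in rank_questions(qs, ["communication system", "electrical signal", "transducer"]):
    print(q)
```

Replacing `score_question` with a trained model yields the score-sorted list the display unit finally shows.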
[0043] The embodiments herein can pop up questions in between playing multimedia (for example, video or audio lectures) to evaluate, analyze and assess the viewer's understanding and learning intent. Further, the questions can also help increase user engagement. The generated questions pop up at frequent intervals while playing the multimedia, increasing user engagement while watching/listening to the multimedia. This helps a learner gauge his/her understanding of the lecture by attempting to answer the questions.
[0044] The embodiments herein can automatically generate questions/quizzes from the extracted textual information document for testing the viewer's/user's understanding of the document.
[0045] The embodiments herein can generate a question bank for a particular domain using documents from the domain. These questions can later be used to test the understanding of the viewer. The embodiments herein provide adaptive learning tests for students by providing questions with varying levels of difficulty.
[0046] The automated quiz or question generation described herein adds a unique value to existing video systems, textbooks, audio podcasts and all relevant sources of information gathering. The automated quiz or question generation automatically generates questions based on the textual information extracted from the multimedia content under consideration. This method effectively eliminates the manual approach of reading and analyzing multimedia content to generate questions, which is currently the forte of domain experts only. Specifically, in the video domain, this method of generating questions helps in creating a highly engaged viewing experience and also provides the viewer the ability to assess his learning efficiently, while watching the selected video.
[0047] The text extractor unit 102 can be configured to extract the textual information from the multimedia content. For example, the textual information extracted from the multimedia is shown below:
[0048] Sample text: “What is the purpose of any communication system? The purpose of any communication system is to transmit some signal, which is generated by a source to a destination through a media or channel. So, there is a source, which is generating some electrical signal, which is possibly captured from some real life image or audio. And, then the signal generated by a transducer and then that needs to be transmitted to a destination through a media which is technically called the channel”.
[0049] Further, the extracted textual information is passed to a semantic parser and then the parsed output is compared with the predefined templates. Further, the questions can be generated when the parsed sentence matches with the predefined template.
[0050] For the sample text given in paragraph [0048], the possible generated questions are shown below:
1. How can the purpose of any communication system transmit some signal, which is generated by a source to a destination through a media or channel?
2. What is the purpose of any communication system?
3. What generates some signal?
4. What generates some electrical signal, which is possibly captured from some real life image or audio?
5. What is a transducer?
6. What is it and how can it generate the electrical signal?
[0051] Once all possible questions are generated, they are sent to the filtering stage. The filtering can be performed to extract only the relevant questions and filter out the unwanted questions. The filtering stage includes keyword filtering, pronoun filtering or the like.
[0052] The keyword based filtering includes extracting keywords and key phrases from the extracted textual information. For example, for the sample text provided above, the keywords and key phrases can be communication system, transmit, channel, electrical signal, transducer or the like. Based on these keywords and key phrases, the question generation engine 106 can be configured to filter the generated questions. The questions remaining after keyword and key phrase filtering are shown below:
1. How can the purpose of any communication system transmit some signal, which is generated by a source to a destination through a media or channel?
2. What is the purpose of any communication system?
3. What generates some electrical signal, which is possibly captured from some real life image or audio?
4. What is a transducer?
5. What is it and how can it generate the electrical signal?
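The keyword/key-phrase extraction step itself is not specified above; one simple frequency-based sketch (an assumption, not the engine's actual extractor, and omitting multi-word key phrases) is:

```python
from collections import Counter
import re

# A small stopword list; a real system would use a fuller list or TF-IDF.
STOPWORDS = {"the", "is", "a", "an", "of", "any", "to", "some", "which",
             "by", "and", "then", "that", "there", "through", "or", "so",
             "what", "in", "it", "be", "from"}

def extract_keywords(text, top_n=5):
    """Return the top_n most frequent content words of the text.

    Frequency-based sketch only: single words, no key phrases.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [w for w, _ in counts.most_common(top_n)]

sample = ("The purpose of any communication system is to transmit some "
          "signal through a channel. The signal is an electrical signal.")
print(extract_keywords(sample))
```

The returned words would then feed the keyword filter applied to the candidate questions above.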
[0053] Further, the questions can be filtered using pronouns. For example, if a pronoun occurs in the main clause of a sentence, then that question can be discarded. Once the generated questions are filtered using keywords and key phrases, they can be further filtered based on pronouns. For example, the questions remaining after pronoun filtering are shown below:
1. How can the purpose of any communication system transmit some signal, which is generated by a source to a destination through a media or channel?
2. What is the purpose of any communication system?
3. What generates some electrical signal, which is possibly captured from some real life image or audio?
4. What is a transducer?
[0054] Once the questions are filtered using the above techniques, they can be filtered based on the number of words. Any question not having a keyword/key phrase or a noun phrase, or having very few or too many words, can be discarded. The output of filtering based on the number of words is given below:
1. What is the purpose of any communication system?
2. What generates some electrical signal, which is possibly captured from some real life image or audio?
3. What is a transducer?
[0055] Once the questions are filtered using the above techniques, the filtered questions can be ranked by assigning scores based on grammaticality, learning value and the possibility of answering the question from the extracted textual information. For example, the questions filtered using keywords and key phrases, pronouns and number of words are further ranked to surface the relevant questions. The ranked questions, with their scores, are shown below:
1. What is the purpose of any communication system? (Score 358.62)
2. What generates some electrical signal, which is possibly captured from some real life image or audio? (Score 310.36)
3. What is a transducer? (Score 308.6)
[0056] The embodiments disclosed herein can be implemented through at least one software program running on at least one hardware device and performing functions to control the at least one hardware device. The electronic device 100 shown in FIG. 1 includes blocks, which can be at least one of a hardware sub-module, or a combination of hardware sub-modules and software module.
[0057] The foregoing description of the specific embodiments will so fully reveal the general nature of the embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of embodiments and examples, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the claims as described herein.
| # | Name | Date |
|---|---|---|
| 1 | 201842001026-STATEMENT OF UNDERTAKING (FORM 3) [09-01-2018(online)].pdf | 2018-01-09 |
| 2 | 201842001026-COMPLETE SPECIFICATION [09-01-2018(online)].pdf | 2018-01-09 |
| 3 | 201842001026-FORM FOR STARTUP [09-01-2018(online)].pdf | 2018-01-09 |
| 4 | 201842001026-FORM FOR SMALL ENTITY(FORM-28) [09-01-2018(online)].pdf | 2018-01-09 |
| 5 | 201842001026-EVIDENCE FOR REGISTRATION UNDER SSI [09-01-2018(online)].pdf | 2018-01-09 |
| 6 | 201842001026-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [09-01-2018(online)].pdf | 2018-01-09 |
| 7 | 201842001026-FORM 1 [09-01-2018(online)].pdf | 2018-01-09 |
| 8 | 201842001026-DRAWINGS [09-01-2018(online)].pdf | 2018-01-09 |
| 9 | 201842001026-DECLARATION OF INVENTORSHIP (FORM 5) [09-01-2018(online)].pdf | 2018-01-09 |
| 10 | 201842001026-POWER OF AUTHORITY [09-01-2018(online)].pdf | 2018-01-09 |
| 11 | abstract 201842001026.jpg | 2018-01-11 |