Method And System For Automatic Generation Of Video Files From Audio Files

Abstract: The invention discloses a method and system for automatically generating a video file from an audio file. The given audio file is split into multiple audio files at each pause in the continuous speech. Each chopped audio file is converted into text separately. Each converted text is used to search the available image database for a relevant image. The image obtained is combined with the corresponding chopped audio file to produce a video file. All separately prepared video files are then appended into one video file.

Patent Information

Application #

Filing Date

16 December 2021

Publication Number

25/2023

Publication Type

INA

Invention Field

ELECTRONICS

Status

Email

abhishekgupta10@yahoo.co.in

Parent Application

Applicants

ABHISHEK GUPTA

School of Computer Science & Engineering, Shri Mata Vaishno Devi University, Jammu and Kashmir, India.

Inventors

1. ABHISHEK GUPTA

School of Computer Science & Engineering, Shri Mata Vaishno Devi University, Katra, Jammu & Kashmir-182320

Claims

1. A method for generating a video file from an audio file, the method comprising: breaking the audio file into multiple audio files at each occurrence of a pause in the continuous speech; converting the audio into text separately for each audio file; searching a database for a relevant image separately for each converted text; combining the obtained image with each corresponding chopped audio file to make a video file; and appending all separately prepared video files into one video file.

2. The method as claimed in claim 1, wherein a pause indicates a change of action in the speech.

3. The method as claimed in claim 1, wherein the database has at least one image or a plurality of images.

4. The method as claimed in claim 1, wherein searching for an image in the database is based on image captioning.

5. A system for generating a video file from an audio file, the system being configured to: break the audio file into multiple audio files at each occurrence of a pause in the continuous speech; convert the audio into text separately for each audio file; search a database for a relevant image separately for each converted text; combine the obtained image with each corresponding chopped audio file to make a video file; and append all separately prepared video files into one video file.

6. The system as claimed in claim 5, wherein a pause indicates a change of action in the speech.

7. The system as claimed in claim 5, wherein the database has at least one image or a plurality of images.

8. The system as claimed in claim 5, wherein searching for an image in the database is based on image captioning.

Specification

The present disclosure relates to the generation of video files from audio files for user entertainment and for creating more interactive and interesting material for the end user. More specifically, it relates to creating video files from audio files by choosing appropriate images from a large, previously prepared database. The invention uses image processing methods such as content-based image retrieval and image captioning to select appropriate images for the video files.
BACKGROUND
[0002] The present invention discloses a method for the automatic generation of video files from audio files. Audio files alone often fail to hold the user's full attention. They are not interactive, so the user may start another task while listening and, after a while, stop following the original audio and engage fully in the new task. Preparing video files, however, is costly and requires more effort than preparing audio files. Sometimes the requirements for preparing a video file cannot be met even when those for an audio file can. It is therefore desirable to prepare video files automatically from given audio files.
[0003] This invention relates to the generation of video files from audio files so that the additional cost of preparing video files can be eliminated. The audio files can be created by the original sources, and the disclosed system can then be used to convert them into video files with the same content.
OBJECTIVE
[0004] To create video files from the audio files automatically.

SUMMARY OF THE INVENTION
[0005] According to some embodiments, the present invention automatically generates a video file from an audio file so that the additional cost and effort of preparing video files can be eliminated. The audio files can be created by the original sources, and the disclosed system can then be used to convert them into video files with the same content.
BRIEF DESCRIPTION OF DRAWINGS
[0006] Figure 1 illustrates a method for automatic generation of a video file from an audio file.
[0007] Figure 2 illustrates a system for automatically generating a video file from an audio file.
DETAILED DESCRIPTION
[0008] The present invention provides a method for generating a video file from an audio file. The invention does not require any human intervention to create such content. The process is completely automatic, but it relies on a large image database from which relevant images are selected according to the content. Generating video files in this way saves the cost of content creation.
[0009] The audio files may originally be created by the user in cases where video is not essential. Later, the same audio file can be converted into a video file using the invention without any additional effort. The video file can be used for entertainment and to keep the end user engaged while listening to the content.

[0010] The present invention provides a method of generating a video file from an audio file on any subject. An audio file may not engage the user, whereas a video file engages the user more fully, which is why video content is generally preferred. Fig. 1 shows the method of converting the audio file into a video file.
[0011] Step 101 shows the input of the audio file that is to be converted into a video file. At step 103, the audio file is broken into multiple audio files at each occurrence of a pause in the continuous speech. A pause indicates a change in the reference, task, or action, at which point the displayed image also needs to change. In this way, each audio chunk contains only one statement with the same reference, task, or action. When the task, action, or reference changes, a new chunk begins.
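The splitting at step 103 can be illustrated with a minimal sketch. The disclosure does not say how a pause is detected, so the amplitude threshold, the minimum pause length, and the function name `split_on_pauses` are all assumptions made for illustration; the audio is assumed to be available as a list of samples normalized to the range -1.0 to 1.0.

```python
from typing import List, Tuple


def split_on_pauses(
    samples: List[float],
    sample_rate: int,
    silence_threshold: float = 0.02,   # assumed value, not from the disclosure
    min_pause_seconds: float = 0.5,    # assumed value, not from the disclosure
) -> List[Tuple[int, int]]:
    """Return (start, end) sample index pairs for the speech between pauses.

    A pause is treated as a run of at least `min_pause_seconds` in which
    the absolute amplitude stays below `silence_threshold`.
    """
    min_pause = max(1, int(min_pause_seconds * sample_rate))
    segments: List[Tuple[int, int]] = []
    start = None       # index where the current speech segment began
    last_loud = None   # index of the most recent non-silent sample

    for i, value in enumerate(samples):
        if abs(value) >= silence_threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_pause:
            # The silence has lasted long enough to count as a pause.
            segments.append((start, last_loud + 1))
            start = None

    # Close the final segment if the audio ends during speech or a short gap.
    if start is not None:
        segments.append((start, last_loud + 1))
    return segments
```

Each returned pair marks one audio chunk; silences shorter than the minimum pause length stay inside a chunk, so brief hesitations do not trigger a change of image.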
[0012] At step 105, the audio in each file is converted into text so that a relevant image can be searched for in the large image database. Content-based image retrieval and image captioning can be used to find the image most relevant to the text, so that the action in that particular file is depicted by the image. The image helps visualize the action on the screen so that the user is more engaged while listening. The conversion of audio into text is performed separately for each audio chunk so that each file is processed on its own.
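The per-chunk conversion at step 105 might be organized as in the sketch below. The disclosure does not name a speech recognizer, so the recognizer is passed in as a hypothetical `recognize` callable; the small stop-word list and the helper names `keywords` and `transcribe_chunks` are likewise assumptions for illustration.

```python
from typing import Callable, Dict, List

# Small illustrative stop-word list; a real system would likely use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "in", "on", "of", "and", "to"}


def keywords(text: str) -> List[str]:
    """Lower-case the text, strip surrounding punctuation, drop stop words."""
    words = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word and word not in STOP_WORDS:
            words.append(word)
    return words


def transcribe_chunks(
    chunk_paths: List[str],
    recognize: Callable[[str], str],  # hypothetical: audio chunk path -> text
) -> Dict[str, List[str]]:
    """Transcribe each chunk on its own and return its search keywords."""
    return {path: keywords(recognize(path)) for path in chunk_paths}
```

Keeping the recognizer as a parameter means any speech-to-text engine can be plugged in, and each chunk yields its own keyword list for the image search that follows.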
[0013] At step 107, the relevant image is searched for in the large database. It is desirable that the user has a large image database from which the most relevant image can be retrieved to best represent the audio. However, a database with at least one image may also be used for the invention. It is also recommended that the image database contain images on the specific subject of the audio file. A separate image is searched for in the database for the text of each audio chunk.
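A caption-based search of the kind described at step 107 can be sketched as follows. This assumes every image in the database already has a stored caption and ranks images by simple word overlap with the chunk's text; the disclosure does not specify a scoring rule, so the overlap measure and the name `find_best_image` are illustrative only.

```python
from typing import Dict, List, Optional


def find_best_image(query_words: List[str], captions: Dict[str, str]) -> Optional[str]:
    """Return the image whose caption shares the most words with the query.

    `captions` maps an image path to its caption text. Ties go to the image
    that appears first; an empty database yields None.
    """
    query = set(word.lower() for word in query_words)
    best_path = None
    best_score = -1
    for path, caption in captions.items():
        caption_words = set(caption.lower().split())
        score = len(query & caption_words)
        if score > best_score:
            best_path, best_score = path, score
    return best_path
```

Because the first image is accepted even with a score of zero, a database holding a single image still returns that image, which matches the case of a database with at least one image.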

[0014] At step 109, each audio chunk is converted into a video file by combining it with its retrieved image. This produces one video file per audio chunk, so multiple video chunks are obtained at this step.
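One way to carry out step 109 is with the ffmpeg command-line tool, which the disclosure does not mention and which is used here only as an assumed example. The sketch builds the command for a single chunk without running it; the function name and file names are hypothetical.

```python
from typing import List


def still_video_command(image_path: str, audio_path: str, output_path: str) -> List[str]:
    """Build an ffmpeg command that shows one image for the length of the audio."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image_path,   # repeat the single image as video frames
        "-i", audio_path,                 # the audio chunk
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac",
        "-pix_fmt", "yuv420p",
        "-shortest",                      # stop encoding when the audio ends
        output_path,
    ]
```

If ffmpeg is installed, the returned list can be passed to `subprocess.run(cmd, check=True)` to produce the video chunk.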
[0015] At step 111, all the video chunk files obtained are appended to make a single video file. This step gives the final video file converted from the audio file.
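The appending at step 111 could likewise be done with ffmpeg's concat demuxer, again an assumed tool rather than one named in the disclosure. The sketch prepares the list file text and the command; the helper names are hypothetical.

```python
from typing import List


def concat_list_text(video_paths: List[str]) -> str:
    """Return the text of an ffmpeg concat list, one `file` line per chunk."""
    lines = []
    for path in video_paths:
        escaped = path.replace("'", "'\\''")  # escape single quotes in paths
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def concat_command(list_file: str, output_path: str) -> List[str]:
    """Build the ffmpeg command that joins the listed chunks in order."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", list_file,
        "-c", "copy",          # no re-encoding; chunks must share codec settings
        output_path,
    ]
```

Copying streams without re-encoding is fast, but it relies on every chunk having been encoded with the same settings, which holds when all chunks come from the same step 109 procedure.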
[0016] Figure 2 shows the system that uses the method of generating a video file from an audio file. At step 202, the audio file is given as input to the system through the input device.
[0017] At step 204, a processing unit is shown, which carries out the method shown in Figure 1.
[0018] Step 206 shows a storage unit that contains the large image database. This database can be updated whenever required. The intermediate audio and video files can also be stored in the storage unit. The storage unit and the processing unit interact with each other in order to carry out the method disclosed in the invention.
[0019] At step 208, the final output video file is shown on the monitor using the display device.
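The interaction between the processing unit and the storage unit can be sketched as a small class that holds the image database and runs the steps of Figure 1 in order. The class name, field names, and the idea of passing each step in as a callable are assumptions made for illustration, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class VideoGenerationSystem:
    """Processing unit plus a storage unit holding the captioned image database."""

    split: Callable[[str], List[str]]              # audio file -> audio chunks
    transcribe: Callable[[str], str]               # audio chunk -> text
    search: Callable[[str, Dict[str, str]], str]   # text, database -> image
    combine: Callable[[str, str], str]             # image, audio chunk -> video chunk
    append: Callable[[List[str]], str]             # video chunks -> final video
    image_database: Dict[str, str] = field(default_factory=dict)

    def generate(self, audio_file: str) -> str:
        """Run the steps in order and return the final video file."""
        video_chunks = []
        for chunk in self.split(audio_file):
            text = self.transcribe(chunk)
            image = self.search(text, self.image_database)
            video_chunks.append(self.combine(image, chunk))
        return self.append(video_chunks)
```

Since the database is an ordinary field, it can be updated between runs without changing the processing steps.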
[0020] The present invention discloses a general-purpose method for generating video files from audio files. However, this description is exemplary, intended to explain the invention, and must not be read as restricting it to a specific domain. The same disclosure can be used in other applications.

Documents

Application Documents

#	Name	Date
1	202111058704-FORM 1 [16-12-2021(online)].pdf	2021-12-16
2	202111058704-DRAWINGS [16-12-2021(online)].pdf	2021-12-16
3	202111058704-COMPLETE SPECIFICATION [16-12-2021(online)].pdf	2021-12-16