Abstract: With the availability of cloud infrastructure and distributed programming frameworks, the usage of multimedia has increased drastically. In the process, many computer vision applications use image data more frequently. In this context, Content Based Image Retrieval (CBIR) has become a need of every real-world application. The existing heuristics-based techniques have certain limitations in obtaining images that reflect user intent. To overcome this problem, there is a need for an AI enabled CBIR system that could improve image retrieval performance. The proposed invention is meant for efficient retrieval of images with query by example. It is based on deep learning models that extract features from the query image and also from the images in the database, leading to more efficient matching of images. The deep learning-based framework in the current invention exploits three pre-trained models, namely LeNet, UNet and InceptionV3. The features extracted by these three models are concatenated and used by the matching module for retrieving more relevant images. The CBIR system is also integrated with the Elasticsearch engine for indexing and faster retrieval of images. 3 claims & 5 Figures
Description: Field of Invention
The current invention is meant for efficient retrieval of images when a query is given by example. In other words, the current invention is an AI enabled CBIR system which exploits deep learning models, namely UNet, LeNet and InceptionV3, for extracting features from the given query image and also from the images in the database, so as to help the matching module retrieve the most relevant images that satisfy user intent. This invention has provision for integration with the Elasticsearch engine, which helps in indexing and also faster retrieval of images. Thus, the current invention is very useful when integrated with computer vision applications used by different organizations.
The objectives of this invention
The main objective of the current invention is to have an efficient CBIR system which is AI enabled and capable of retrieving the most relevant images that satisfy user intent, besides supporting integration with the Elasticsearch engine for indexing and faster retrieval of images. The invention is based on deep learning models that are pre-trained and can be retrained with transfer learning to improve efficiency in image retrieval from time to time.
Background of the invention
With the availability of cloud infrastructure and distributed programming frameworks, the usage of multimedia has increased drastically. In the process, many computer vision applications use image data more frequently. In this context, Content Based Image Retrieval (CBIR) has become a need of every real-world application. The existing heuristics-based techniques have certain limitations in obtaining images that reflect user intent. There is a need for an AI enabled approach which improves the state of the art. Here are some existing research contributions. (Sura Mahmood Abdullah et al., Journal of Intelligent Systems, Vol. 32, pp. 1-17, 2023) suggested RCNN_CKKS, which integrates CKKS for picture privacy with CNN for feature extraction to achieve excellent retrieval accuracy. (Suneel Kumar et al., Intelligent Circuits and Systems, pp. 1-11, 2022) integrated DarkNet-19 and DarkNet-53 characteristics to improve CBIR systems and greatly increase image retrieval precision. (Tzelepi Maria et al., Neurocomputing, Vol. 275, pp. 2467-2478, 2017) provided three methods based on available information for CNN-based Content-Based Image Retrieval, and presented a model retraining technique for increased efficiency. (R. Rani Saritha et al., Cluster Computing, Vol. 22, pp. 4187-4200, 2018) used a deep belief network (DBN) for feature extraction and classification in content-based image retrieval (CBIR), which uses picture content to search and retrieve digital images; based on experimental data, the performance of the DBN technique appears promising. (Rajiv Kapoor et al., Multimedia Tools and Applications, Vol. 80, pp. 29561-29583, 2021) noted that a key issue for CBIR is the increasing amount of multimedia material; there is promise for closing the semantic gap using deep learning, and exciting findings from studies in a range of contexts encourage more investigation.
(Shiv Ram Dubey, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 5, pp. 2687-2704, 2021) examined deep learning approaches for content-based image retrieval over the last 10 years, classifying methods and providing performance analysis to support researchers. (M.H. Hadid et al., Iraqi Journal for Computer Science and Mathematics, Vol. 4, No. 3, pp. 66-78, 2023) observed that, with the quick advancement of internet technology, CBIR has made picture retrieval from huge datasets easier, and that feature extraction has been transformed by deep learning, increasing CBIR efficiency. (Shahbaz Sikandar et al., MDPI, Vol. 13, No. 7, pp. 1-17, 2023) presented a CBIR system which, with transfer learning, combines machine learning with deep learning and reports 100% accuracy; it works better than conventional techniques, particularly for a range of orientations. (R. Punithavathi et al., Multimedia Tools and Applications, Vol. 80, pp. 26889-2691, 2021) presented SIRS-IR, a secure image retrieval model that uses MSC and ResNet v2 with Inception for cloud computing; its improved retrieval performance is shown by several trials. (M. Sivakumar et al., Computer Systems Science & Engineering, Vol. 43, No. 2, pp. 1-18, 2022) suggested using GLCM and DLECNN to improve the accuracy of the CBIR system; it consists of similarity calculation, feature extraction, and noise reduction. (Md. Mohsin Kabir et al., Journal of Advances in Information Technology, Vol. 13, No. 3, pp. 1-9, 2022) presented a unique CBIR method for rapidly retrieving comparable pictures from huge datasets by using AutoEmbedder and Deep CNN; results from experiments indicate that it works better than current approaches.
Detailed Description of Prior Art
There are many existing patents related to AI enabled content-based image retrieval. A description is given of an automated video labeling system that learns from videos, their online context, and comments published on social media. Internet crawling analyzes large multimedia collections, and real-time changes are made to a knowledge base without requiring human oversight. Each video is therefore connected to other relevant resources and indexed with a comprehensive set of labels. In order to detect new actions, situations, and individuals in real-time, practical video recognition systems need a training dataset that can be updated in real-time and a label scheme that is user-friendly (US20190258671A1). An internet-sourced keyword list and a machine learning pre-processing step are combined with a weakly supervised deep learning technique to generate this real-time evolving dataset with user-relevant labels. In an unsupervised way, a neural network is trained using the generated tags in conjunction with videos and video summaries, enabling the tagging system to progress from an image to a collection of tags for the picture and ultimately to the visual representation of a tag (US20190258671A1). To find the similarity between two visual pictures, process-response statistical modeling of the images can be utilized. A variety of interactions (such as searching, indexing, grouping, summarizing, annotating, and keyframing) with a collection of visual pictures may be achieved by evaluating the content of those images, in particular by determining image similarity (US7702185B2). A system and methodology make it possible to compare word pictures and concepts semantically. To embed word pictures and ideas in a semantic subspace where comparisons between the two may be done without requiring the word image's text content to be transcribed, word images and their concept labels are used for training, and the neural network's parameters are learned from these pairs.
By rating non-relevant ideas for a picture higher than relevant ones, the neural network is trained to minimize the ranking loss throughout the training set (US10635949B2).
Techniques, tools, and electronics for creating a feature vector are revealed, along with tools, equipment, and electronic devices for searching. The steps involved in creating a feature vector are as follows: gathering data; extracting a semantic feature from the data; and using a preset function to create a feature vector from the data information, passing the semantic feature information as an input (US10860641B2). By identifying the semantics of image data and comparing it with descriptions in plain language, the technical approach detects picture information. Unlike the traditional image search strategies of current search engines, this technical solution retrieves and identifies pictures based on the content of the image information instead of requiring the retrieval of a written description of the image information. As a consequence, as compared to the current text-based picture search, results with greater accuracy could be returned (US10860641B2).
An annotation procedure is integrated with relevance feedback and object retrieval operations in a multimedia object retrieval and annotation system. Semantically relevant keywords are added to multimedia items, including digital photographs, through the annotation process. While the user does standard searches, the annotation process runs in the background and is concealed from view (US7349895B2). The annotation procedure is "semi-automatic" in that it automatically searches for multimedia items using both content-based image retrieval and keyword-based information retrieval approaches, and then requests human input about the objects that are retrieved. The system automatically annotates the objects with semantically relevant keywords and/or modifies connections between the keywords and objects based on the user's classification of the items as either relevant or irrelevant to the query keywords. The annotation coverage and accuracy of subsequent searches keep improving as long as the retrieval-feedback-annotation cycle continues (US7349895B2).
Summary of Invention
The current invention is an efficient AI enabled CBIR system which supports faster image retrieval with high accuracy levels in meeting the user's intent. The system is based on deep learning for extracting features from the given query image and also from the database images. The extracted features are given to a matching module which is responsible for retrieving the most relevant images. In the process, the proposed system exploits pre-trained deep learning models, namely LeNet, UNet and InceptionV3, for extracting features from the given query image and also from the images in the database. The proposed system exploits deep learning models as they are efficient in the retrieval of features. The features extracted by each deep learning model are combined to form the final feature set. The concatenated features are used by the matching module towards efficient retrieval of images. Besides, the proposed system has provision to integrate with the Elasticsearch engine for faster image retrieval and also indexing.
A Detailed Description of the Invention
The current invention is meant for efficient retrieval of images with query by example. It is based on deep learning models that extract features from the query image and also from the images in the database, leading to more efficient matching of images. The deep learning-based framework in the current invention exploits three pre-trained models, namely LeNet, UNet and InceptionV3. The features extracted by these three models are concatenated and used by the matching module for retrieving more relevant images. The CBIR system is also integrated with the Elasticsearch engine for indexing and faster retrieval of images.
The current invention makes use of a collection of training images that are available for content-based image retrieval. When the user gives a query image, the system is supposed to extract all the images that are most relevant to this query image. In the process, there is a need for an efficient approach to feature extraction and also to matching the features of database images with the query image. There is also a need for a distance metric that is to be used for finding the images most similar to the query image given to the AI enabled CBIR system. Each image in the database has different features that are to be retrieved and saved in a database in order to help the matching module perform the image retrieval process. Towards this end, the current invention makes use of three pre-trained deep learning models, namely LeNet, UNet and InceptionV3. These models are used in the proposed system to extract features from the given query image and also from the images of the database. The features extracted by each deep learning model are concatenated to form a final feature map which is used by the matching module towards retrieving highly relevant images. The matching module makes use of a distance metric which is meant for finding similarity between the query image and the images in the database while making well informed decisions in the image retrieval process. The current invention is integrated with the Elasticsearch engine which helps in indexing and also faster retrieval of images. The integration of the proposed CBIR system with the Elasticsearch engine makes the system more robust and useful to computer vision applications where frequent image retrieval is important.
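By way of an illustrative, non-limiting example, the concatenation step described above can be sketched in Python with NumPy. Random projections stand in for the pre-trained LeNet, UNet and InceptionV3 extractors, and the per-model feature sizes (84, 128 and 256) are assumptions made only for this sketch, not values fixed by the invention.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(image, proj):
    """Flatten the image and project it into a fixed-size feature vector
    (a stand-in for a forward pass through a pre-trained CNN)."""
    return image.reshape(-1) @ proj

# A 32x32 grayscale query image (placeholder data).
image = rng.random((32, 32))

# One stand-in projection per model, with different output sizes.
proj_lenet = rng.random((32 * 32, 84))       # LeNet-style 84-d features
proj_unet = rng.random((32 * 32, 128))       # UNet-style 128-d features
proj_inception = rng.random((32 * 32, 256))  # InceptionV3-style 256-d features

# Concatenate the three feature vectors into the final feature map
# that the matching module consumes.
final_features = np.concatenate([
    extract_features(image, proj_lenet),
    extract_features(image, proj_unet),
    extract_features(image, proj_inception),
])
print(final_features.shape)  # (468,)
```

The same concatenation is applied to every database image in the offline phase, so that query and database vectors are directly comparable.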
The training dataset used in the current invention has a number of labeled images that are used to train the different deep learning models linked to the proposed architecture. The training data helps the models to gain the knowledge pertaining to extraction of features. The invention has both offline phase and online phase activities. The offline phase is the phase in which user interaction is not required. In this phase, the system is continuously trained with the training dataset based on data availability. As the system gains knowledge from time to time with the help of training, the underlying models become more intelligent. In other words, all the deep learning models used in the proposed system are pre-trained with the ImageNet dataset. Moreover, the transfer learning technique is used by the models in order to get retrained from time to time based on the availability of new training samples. In the process, the deep learning models become more knowledgeable and able to perform better in content-based image retrieval.
In the offline phase, each model extracts features from the training dataset. The features extracted from the training dataset by each model are combined in order to have the final set of features for the training dataset, and they are saved in the image features database. This image features database plays an important role in the current invention because the matching module exploits this database while processing a given query image in the online phase. In the online phase, the user gives a query image as input, with the intention to collect the most relevant images that match the input image. In the online phase as well, the learned models are used for extracting features from the given image, and all the features are concatenated in order to have the final feature map, which is used by the matching module. The matching module is responsible for processing the input query and retrieving the most relevant images. In fact, the matching module takes the features extracted from the query image and matches them with the features already stored in the database of the training dataset towards making a decision. The matching module exploits a distance-based metric in order to achieve the similarity measurement and make decisions pertaining to the inclusion of images in the results.
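As an illustrative, non-limiting sketch of the matching module, the following Python/NumPy snippet ranks database images by Euclidean distance between concatenated feature vectors. The database size (100 images), the feature dimension (468) and the choice of Euclidean distance are assumptions for this sketch; any distance metric could serve.

```python
import numpy as np

rng = np.random.default_rng(1)

# Offline phase: concatenated feature vectors of the database images,
# as stored in the image features database (placeholder random features).
db_features = rng.random((100, 468))

# Online phase: concatenated features extracted from the query image.
query_features = rng.random(468)

def retrieve_top_k(query, database, k=5):
    """Return the indices of the k database images closest to the query,
    ranked by Euclidean distance between feature vectors."""
    distances = np.linalg.norm(database - query, axis=1)
    return np.argsort(distances)[:k]

top = retrieve_top_k(query_features, db_features)
```

The returned indices identify the images the matching module would include in the result set, nearest first.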
There are three deep learning models used in the current invention in order to improve the robustness in extracting features. All the models are based on CNN, which is found to be efficient in extracting features from images. In fact, the CNN model is widely used in computer vision applications that deal with processing image content. The three deep learning models used in the current invention are based on CNN and therefore they are efficient in extracting features from a given image. The LeNet model is a classic CNN variant widely used in computer vision applications. The UNet model is also a CNN variant, built around an encoder and decoder structure, which is efficient in dealing with images. The InceptionV3 model is likewise known to be efficient in dealing with image content. Instead of using an individual deep learning model, the current invention combines the features extracted from the three deep learning models in order to have a better feature map for comparison.
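To make the CNN feature-extraction idea concrete, the following minimal, non-limiting sketch shows how a single convolutional filter produces a feature map. The hand-written vertical-edge kernel is an assumption for illustration; in the actual models the filters are learned during training.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (no padding): slide the kernel over the image
    and record one response per position, forming a feature map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A toy 8x8 image with a vertical edge: left half dark, right half bright.
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# A 2x2 vertical-edge detector standing in for a learned filter.
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

feature_map = conv2d(image, kernel)
# The response peaks exactly where the edge lies (between columns 3 and 4).
```

Stacks of such filters, with nonlinearities and pooling in between, are what allow LeNet, UNet and InceptionV3 to turn raw pixels into discriminative feature vectors.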
The current invention has provision for improving its knowledge from time to time in the offline phase due to its ability to exploit transfer learning with respect to all the underlying deep learning models in order to extract features from additionally available images. The transfer learning technique enables deep learning models to retain the existing knowledge and learn new knowledge from newly available training samples. In other words, the deep learning models used in this invention will not get trained from scratch every time new samples are available. Instead, with transfer learning, they get trained incrementally and gain knowledge from time to time, leading to efficient content-based image retrieval.
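The retain-and-extend behaviour of transfer learning can be sketched, in an illustrative and non-limiting way, as a frozen feature extractor plus a small trainable head: the frozen part preserves existing knowledge while only the head is fitted to newly available samples. The dimensions, learning rate and squared-error objective below are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)

# Frozen "pre-trained" body: a stand-in projection whose weights are never
# updated, which is how the existing knowledge is retained.
W_frozen = rng.random((16, 8))

def features(x):
    return x @ W_frozen  # frozen layer: no update is ever applied here

def mse(w, x, y):
    pred = features(x) @ w
    return float(np.mean((pred - y) ** 2))

def retrain_head(w, x_new, y_new, lr=0.002, steps=200):
    """Fit only the lightweight head on newly available samples
    with plain gradient descent on squared error."""
    f = features(x_new)
    for _ in range(steps):
        grad = f.T @ (f @ w - y_new) / len(y_new)
        w = w - lr * grad
    return w

x_new = rng.random((20, 16))  # newly available training samples (placeholder)
y_new = rng.random(20)
w_head = np.zeros(8)

loss_before = mse(w_head, x_new, y_new)
w_head = retrain_head(w_head, x_new, y_new)
loss_after = mse(w_head, x_new, y_new)
```

Because only the head's few parameters change, incremental retraining is cheap compared with training the whole network from scratch, which is the practical appeal of transfer learning noted above.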
In summary, the proposed deep learning-based framework is specially designed for efficient content-based image retrieval where the results of image retrieval reflect the user intent. This deep learning framework in the current invention can be used by real-world computer vision applications in order to retrieve images efficiently. This invention can also be used in the healthcare domain because healthcare professionals need different kinds of case studies in the form of medical images. This invention is beneficial to many stakeholders like healthcare professionals, healthcare units, government healthcare departments, researchers and academia.
Brief Description of Drawing
Here are lists of Figures reflecting exemplary embodiment of the current invention.
Figure 1: This diagram illustrates the framework which is part of the current invention, designed for content-based image retrieval based on deep learning-based feature extraction from the input query image and the images of the database, to help a matching module retrieve the most relevant images.
Figure 2: This diagram illustrates the process in which the proposed CBIR system is integrated with the Elasticsearch engine for indexing and faster retrieval of images.
Figure 3: This diagram illustrates the architectural overview of UNet model which has encoder and decoder-based U-shape structure which is efficient in extracting features from a given image.
Figure 4: The diagram illustrates another pre-trained deep learning model known as LeNet which is designed for efficient processing of images. In the current invention it is used to extract features from given input image.
Figure 5: This diagram illustrates an architectural overview of InceptionV3 model which is used in the current invention for retrieval of images.
Detailed Description of the Drawings
The current invention is meant for efficient retrieval of images with query by example. It is based on deep learning models that could extract features from query image and also images in the database leading to more efficient matching of images.
Figure 1 illustrates the proposed framework which is used to realize an efficient CBIR system for faster retrieval of images, reflecting the most relevant results that satisfy user intent. The system has both an online phase and an offline phase to facilitate the image retrieval process. In the offline phase, deep learning models, namely LeNet, UNet and InceptionV3, are used to extract features from the training images. The extracted features are then combined and stored in the image features database, which is used by a matching module later for efficient retrieval of images. In the online phase, when an input query image is given, the three deep learning models extract features from the given image; the features are then combined and used by the matching module to find the most relevant and similar images using a distance metric. Finally, the matching module is responsible for returning matching images that are similar to the given query image.
Figure 2 illustrates the integration of the proposed CBIR system with the Elasticsearch engine for indexing and faster retrieval of images. The query image and index are used by the proposed system, where image preprocessing and feature extraction are done with the help of deep learning models, and the results are passed to the Elasticsearch engine, which performs indexing and allows faster retrieval of images in the future.
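As an illustrative, non-limiting sketch of the Elasticsearch integration, the snippet below builds the two request bodies such an integration would typically use: an index mapping with a dense_vector field for the concatenated features, and a kNN search body for a query image. The index name "cbir-images", the field names, the 468-dimension vector and the use of Elasticsearch 8.x kNN search are all assumptions for this sketch; only the bodies are constructed here, since sending them requires a running cluster and a client library.

```python
FEATURE_DIM = 468  # assumed size of the concatenated LeNet+UNet+InceptionV3 vector

# Index mapping: one dense_vector field for the features, plus the image path.
# "index": True and a similarity are required for kNN search in Elasticsearch 8.x.
index_mapping = {
    "mappings": {
        "properties": {
            "image_path": {"type": "keyword"},
            "features": {
                "type": "dense_vector",
                "dims": FEATURE_DIM,
                "index": True,
                "similarity": "l2_norm",
            },
        }
    }
}

def knn_query(query_features, k=10):
    """Build a kNN search body for the query image's concatenated features."""
    return {
        "knn": {
            "field": "features",
            "query_vector": list(query_features),
            "k": k,
            "num_candidates": 5 * k,  # candidate pool searched per shard
        }
    }

body = knn_query([0.0] * FEATURE_DIM)
```

With a live cluster, these bodies would be sent to a hypothetical "cbir-images" index via the official Python client (indices.create for the mapping, search for the query); the engine then handles indexing and approximate nearest-neighbour retrieval, which is what makes frequent image retrieval fast.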
Figure 3 illustrates the architectural overview of the UNet model, which has both encoder and decoder structures to facilitate extraction of features from the given input image. This pre-trained deep learning model is found efficient in obtaining features from a given image.
Figure 4 illustrates the architectural overview of the LeNet model, which is a CNN variant used in the current invention for extracting features from the given input image. This model is also found efficient in the retrieval of features from images.
Figure 5 illustrates the architectural overview of the InceptionV3 model, which is another pre-trained deep learning model used in the current invention for retrieving features from given images. The features extracted by these three models are combined and used by the matching module for efficient retrieval of images.
Claims: The following claims define the scope of the invention:
1. An AI enabled CBIR system for efficient image retrieval, integrated with the Elasticsearch engine for indexing and faster image retrieval, wherein:
a) A set of deep learning models is used to extract features from the query image and also from the images in the database, leading to more efficient matching of images.
b) The deep learning-based framework in the current invention exploits three pre-trained models, namely LeNet, UNet and InceptionV3.
c) The CBIR system is also integrated with the Elasticsearch engine for indexing and faster retrieval of images.
2. As per claim 1, the features extracted by the models are concatenated and used by the matching module for retrieving more relevant images.
3. As per claim 1, the training dataset used in the current invention has a number of labeled images that are used to train the different deep learning models linked to the proposed architecture. The training data helps the models gain the knowledge pertaining to extraction of features. The invention has both offline phase and online phase activities.
| # | Name | Date |
|---|---|---|
| 1 | 202441053227-REQUEST FOR EARLY PUBLICATION(FORM-9) [12-07-2024(online)].pdf | 2024-07-12 |
| 2 | 202441053227-FORM-9 [12-07-2024(online)].pdf | 2024-07-12 |
| 3 | 202441053227-FORM FOR STARTUP [12-07-2024(online)].pdf | 2024-07-12 |
| 4 | 202441053227-FORM FOR SMALL ENTITY(FORM-28) [12-07-2024(online)].pdf | 2024-07-12 |
| 5 | 202441053227-FORM 1 [12-07-2024(online)].pdf | 2024-07-12 |
| 6 | 202441053227-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [12-07-2024(online)].pdf | 2024-07-12 |
| 7 | 202441053227-EVIDENCE FOR REGISTRATION UNDER SSI [12-07-2024(online)].pdf | 2024-07-12 |
| 8 | 202441053227-EDUCATIONAL INSTITUTION(S) [12-07-2024(online)].pdf | 2024-07-12 |
| 9 | 202441053227-DRAWINGS [12-07-2024(online)].pdf | 2024-07-12 |
| 10 | 202441053227-COMPLETE SPECIFICATION [12-07-2024(online)].pdf | 2024-07-12 |