
System/Method For Document Classification Using Content Based Image Retrieval Technique

Abstract: Classifying comparable objects is a potentially useful contribution that can help users identify and extract knowledge from a huge corpus of available information. The enormous amount of accessible data adds complexity and processing overhead to the classification process, despite the many techniques and enhancements that have been implemented. The documents to be classified may contain videos, photos, text, audio, and other forms of media, and each document type presents its own significant classification challenges. To achieve better classification results, CBIR merges the shape and color characteristics of the images in the text corpus into a single feature vector through feature-level fusion, sometimes called early fusion. The textual content is handled with a keyword-based clustering technique carried out in two operations: feature extraction first, and clustering second. The goal of the suggested approach is to improve document classification performance by combining visual and textual data. Each classifier in the suggested document classification system reaches its own conclusion, and the results are then combined using the decision-level fusion method; a classifier adaptation routine is coupled to the decision-level fusion to further influence performance.


Patent Information

Application Number: 202441032318
Filing Date: 24 April 2024
Publication Number: 18/2024
Publication Type: INA
Invention Field: COMPUTER SCIENCE
Status:
Parent Application:

Applicants

MLR Institute of Technology
MLR Institute of Technology, Hyderabad

Inventors

1. Mrs. P. Sirisha
Department of Information Technology, MLR Institute of Technology, Hyderabad
2. Mr. B. VeeraSekharReddy
Department of Information Technology, MLR Institute of Technology, Hyderabad
3. Dr. J. Mahalakshmi
Department of Information Technology, MLR Institute of Technology, Hyderabad
4. Mr. V. Gopikrishna
Department of Information Technology, MLR Institute of Technology, Hyderabad

Specification

Description:

Field of the Invention
Records are highly vital and valuable materials in sectors such as medicine, business, and politics. Data documents may contain text, graphs, figures, photos, audio, and video. The ability to recognize the relationship between a document's subject and the text and images it contains would be particularly useful for publishers, news sites, blogs, financial institutions, insurance companies, and anyone else dealing with massive amounts of text and image content, and it would be a significant advancement in data mining. Everything that can be digitized, whether text or image, is regarded as data. Thorough analysis of the text and images is required to reveal a document's hidden patterns. Machine learning can approach this in either a supervised or an unsupervised manner, depending on the context. Supervised learning is built on labelled datasets: a predetermined class is manually assigned to each item in the data set, and the classifier is trained on this training set to assign new instances to the correct class. Depending on the design pattern and the stages involved, the classifier can deliver efficient accuracy and confidence in the anticipated label. Supervised document classification therefore uses a dataset that has been manually tagged for training purposes.
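As a minimal sketch of the supervised setting described above, the following Python example trains a linear support vector machine on a small, manually labelled text corpus; the sample documents, the labels, and the choice of TF-IDF features are illustrative assumptions, not part of the invention.

    # Minimal supervised document classification sketch (illustrative data only).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Hypothetical manually labelled training set.
    train_texts = ["quarterly revenue and profit figures",
                   "clinical trial results for the new drug",
                   "election results and parliamentary debate"]
    train_labels = ["business", "medicine", "politics"]

    # TF-IDF feature extraction followed by a linear SVM classifier.
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(train_texts, train_labels)

    # The trained classifier assigns a new, unseen document to one of the classes.
    print(model.predict(["the central bank raised interest rates"]))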
Objective of the Invention
The objectives of the invention include constructing an appropriate algorithm to determine a document's category from the hidden patterns in its images and text, and defining an appropriate similarity metric for dealing with massive collections of photos.
Background of the Invention
Big data and data science have recently emerged as rapidly expanding fields because of their capacity to process massive amounts of data, often tens of terabytes in size. Structured, unstructured, and semi-structured data are the three main types of big data. Images are crucial in big data because of their compact storage requirements, ease of depiction, and general usefulness. The quick growth of data science and communication technology has made photographs very easy to transmit and view. Content-Based Image Retrieval (CBIR) offers a rapid and straightforward approach to managing this visual data. For CBIR to work, it is crucial to accurately extract features from images, such as their color, shape, and texture. The capacity to recognize and appropriately apply the underlying patterns dispersed throughout visual input is essential to a CBIR system, and the effectiveness of automatically retrieving relevant images increases users' interest in visual data as the system becomes more widespread. To construct an efficient CBIR system, the developer must consider many crucial aspects of image processing. The first and foremost is the researcher's understanding of the user's information-searching requirements. The proper image processing methodology must be used to extract the necessary features from the original image. The image database and its retrieval queries should mimic human vision as closely as possible. Lastly, the system should offer a user-friendly and engaging interface for interacting with the CBIR system. The goal of content-based image retrieval (CBIR) is to reveal previously unseen patterns in images by analyzing their color, shape, and texture with a variety of image processing algorithms. The primary goal of the system is to automatically retrieve images related to the query image. Because of its complexity, visual information is notoriously difficult to process. To overcome the complexity of managing low-level visual information, automatic retrieval of query-based images requires high-level user interpretation coupled with database handling abilities (US20120008013A1).
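The paragraph above names color, shape, and texture as the features a CBIR system must extract accurately. The sketch below, assuming OpenCV and NumPy and a placeholder file name, illustrates one common way such a combined feature vector can be computed; it is not the exact extractor of this application.

    # Minimal CBIR feature extraction sketch: color, shape, and texture descriptors.
    import cv2
    import numpy as np

    img = cv2.imread("query.jpg")                 # hypothetical input image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Color: a quantized HSV color histogram, normalized to unit sum.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    color_hist = cv2.calcHist([hsv], [0, 1, 2], None, [8, 4, 2],
                              [0, 180, 0, 256, 0, 256]).flatten()
    color_hist /= color_hist.sum() + 1e-9

    # Shape: Hu moments of the grayscale image (tolerant to rotation and scale).
    shape_vec = cv2.HuMoments(cv2.moments(gray)).flatten()

    # Texture: simple gradient-magnitude statistics as a coarse texture cue.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag = cv2.magnitude(gx, gy)
    texture_vec = np.array([mag.mean(), mag.std()])

    # Early (feature-level) fusion into a single feature vector.
    feature_vector = np.concatenate([color_hist, shape_vec, texture_vec])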
Conventional picture retrieval methods rely on textual interactions with metadata and keywords. The dimensions, timing, and other details of the required picture must be included in the query text. These factors make text-based image retrieval labor-intensive if optimal accuracy is to be reached, and resolving the discrepancy in determining which image to retrieve requires a lot of human interpretation and processing time. A pioneering system for content-based image retrieval (CBIR) emerged in the 1990s.

Visual observations have occupied most of the space since the beginning of science. In the past, textual explanations and hand-drawn diagrams were the only ways to report experimental results. The development of photography was another significant innovation, as it allowed research to be visualized. Recent advances in image processing, video analysis, and internet-related computing have propelled the field into its second stage of development. Handheld computers and their processing capabilities have proven sufficiently powerful to handle computational tasks involving visual data, and their low cost has made them suitable for widespread application. Consequently, image processing has evolved into an all-encompassing technology with crucial investigative applications in domains such as computer science, astronomy, health, commerce, remote sensing, and physical research. Image sensors, image storage, and processing are the three main components of an image processing mechanism. Probabilistic mathematical interpretations provide the groundwork for the fundamental image processing operations.
At the lowest level, color characteristics are paramount, since they are fundamental, obvious, and important. The majority of CBIR methods are based on the assumption that meaningful color descriptors are being considered. A recent approach to CBIR by Gautam et al. is based on ant colony optimization and support vector machines; to improve performance and estimation, they utilized color descriptors such as RGB and HSV. One important limitation of color-descriptor-based picture retrieval is that a single feature cannot convey both spatial structure and color information. To enhance the retrieval process in a very professional approach, the descriptor combines the density of the dominant color descriptor with the accuracy of the color structure descriptor. For content-based image retrieval, a method has been suggested that rapidly extracts image texture and color information. It begins by correctly determining the HSV color representation. The co-occurrence matrix was calculated as part of the feature extraction operation, using the color histogram and texture features of the image to build the feature vector. In addition, the global color histogram served as the basis for measuring and evaluating the local color histogram and texture features in order to perform CBIR. When it comes to image retrieval, many CBIR applications scour the same picture for discriminative features, and cleaning up these characteristics is one step in developing a good picture retrieval system (US4941125A).
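As a concrete illustration of combining an HSV color histogram with co-occurrence-based texture features, as discussed above, the sketch below uses scikit-image's gray-level co-occurrence matrix; the quantization levels and the chosen texture properties are illustrative assumptions rather than the exact configuration described here.

    # Sketch: HSV color histogram plus gray-level co-occurrence texture features.
    import cv2
    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    img = cv2.imread("sample.jpg")                # hypothetical input image
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    # HSV histogram over the hue and saturation channels.
    hs_hist = cv2.calcHist([hsv], [0, 1], None, [16, 8],
                           [0, 180, 0, 256]).flatten()
    hs_hist /= hs_hist.sum() + 1e-9

    # Co-occurrence matrix on a quantized grayscale image (32 levels assumed).
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray32 = (gray // 8).astype(np.uint8)         # 256 levels -> 32 levels
    glcm = graycomatrix(gray32, distances=[1], angles=[0, np.pi / 2],
                        levels=32, symmetric=True, normed=True)
    texture = np.array([graycoprops(glcm, p).mean()
                        for p in ("contrast", "homogeneity", "energy")])

    # Combined feature vector used for retrieval.
    feature_vector = np.concatenate([hs_hist, texture])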
Modern detectors and descriptors can work together to identify locations of interest according to their respective fields of expertise. In their approach to merging spatial color information with shape-based extracted features and object recognition, Ahmed et al. provide a powerful method for utilizing image features in information fusion. An intensity-ranged structure is created for the grey-level image by linking the detected edges and corners, and features are retrieved for the RGB channels using L2 spatial color arrangements. Utilizing symmetric sampling on the identified interest locations, peri-foveal receptive field estimation with 128-bit cascade matching uncovers potential information for complex, overlay, foreground, and background items. First, a large variance coefficient is used to reduce the enormous feature vectors; then, a Bag-of-Words strategy is used for indexing and retrieval.
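The Bag-of-Words indexing strategy mentioned above can be sketched as follows, assuming ORB keypoint descriptors and a k-means visual vocabulary; the vocabulary size and the use of ORB rather than the descriptors of the referenced work are assumptions for illustration only.

    # Sketch: Bag-of-Words image indexing with ORB descriptors and k-means.
    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    def orb_descriptors(path):
        """Return local ORB descriptors for one image (float32 for clustering)."""
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = cv2.ORB_create(nfeatures=500).detectAndCompute(gray, None)
        return desc.astype(np.float32) if desc is not None else np.empty((0, 32), np.float32)

    image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]   # hypothetical corpus
    all_desc = np.vstack([orb_descriptors(p) for p in image_paths])

    # Build a visual vocabulary (64 visual words assumed).
    vocab = KMeans(n_clusters=64, n_init=10, random_state=0).fit(all_desc)

    def bow_histogram(path):
        """Histogram of visual-word assignments, normalized to unit sum."""
        words = vocab.predict(orb_descriptors(path))
        hist = np.bincount(words, minlength=64).astype(np.float32)
        return hist / (hist.sum() + 1e-9)

    index = {p: bow_histogram(p) for p in image_paths}   # retrieval index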
Summary of the Invention
In order to illustrate concepts, discoveries, innovations, and inspirations, visuals and words are two distinct but equally vital mediums utilized across all disciplines. Constructing an appropriate algorithm to determine a document's category from the hidden patterns in its images and text is therefore always intriguing. A number of algorithms are being worked on and put into action. Here, the approach suggests using a single platform for text and image processing. Clustering and classification are two of the most popular approaches used in machine learning, and separate methods exist for each; the suggested system takes both strategies into account. The classification method is applied to images, while the clustering method is adapted for text. In CBIR system development, the category of the selected image always determines the feature selection. Color, texture, and shape are the three most basic aspects of any image that humans use to interpret it; when building the feature vector, the suggested system takes shape and color into account. An appropriate similarity metric should be used when dealing with massive collections of photos, and a lot of thought went into selecting the similarity metric for the suggested approach.
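A minimal sketch of the text side of the platform described above, where a clustering method is adapted to the textual content: TF-IDF features and k-means with an assumed number of clusters stand in here for the keyword-based clustering named in the abstract.

    # Sketch: keyword-based clustering of textual content (feature extraction
    # followed by clustering, as described in the abstract).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    documents = ["stock markets and interest rates",        # hypothetical corpus
                 "vaccination schedules and patient care",
                 "budget deficits and monetary policy"]

    # Step 1: feature extraction (keyword weights via TF-IDF).
    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(documents)

    # Step 2: clustering the extracted keyword features (2 clusters assumed).
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(list(zip(documents, labels)))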
Brief Description of Drawings
Figure 1: Architecture of the proposed Model
Detailed Description of the Invention
Images are now easily accessible and transmittable thanks to the fast evolution of data science and communication technologies. With content-based image retrieval, working with visual data is quick and easy. It is crucial to accurately extract an image's color, shape, and texture features for CBIR to work properly. To operate a CBIR system, one must be able to spot and make good use of hidden patterns in visually dispersed data. More and more visual data users are becoming interested in the system because of the effectiveness of autonomous picture retrieval. The developer must take into account numerous significant aspects of image processing in order to create an efficient CBIR. A researcher's first and greatest responsibility is to comprehend the user's information-searching requirements. Applying the appropriate image processing technology is crucial for extracting the required information from the raw image. When it comes to storing and retrieving images from databases, the system should mimic human vision as closely as possible. The CBIR system should also be accessible through a user-friendly and engaging interface. Using a variety of image processing methods, CBIR enables researchers to uncover color, shape, and texture-based hidden patterns inside images. Automatically finding images related to the query image is a primary goal of the system.
Because of its complexity, visual information is notoriously difficult to process. To overcome the complexity of managing low-level visual information, automatic retrieval of query-based images requires high-level user interpretation coupled with database handling abilities. Conventional picture retrieval methods rely on textual interactions with metadata and keywords: the dimensions, timing, and other details of the required picture must be included in the query text. These factors make text-based image retrieval labor-intensive if optimal accuracy is to be reached, and resolving the discrepancy in determining which image to retrieve demands a lot of human interpretation and processing time. A pioneering system for content-based image retrieval (CBIR) emerged in the 1990s. CBIR enhances the indexing and description process by leveraging visual information such as color, texture distribution, object shapes, and the spatial orientation of images, as opposed to the text-based method. The parts of the CBIR system are shown in Figure 1. The major component is an image feature extractor that generates information for the photos saved in the database. Secondly, the query engine compares two images and determines how similar they are. Finally, a user-friendly interface is provided for assessing the CBIR. By comparing color and edge histograms, the CBIR algorithm determines how similar two images are. A histogram shows what percentage of an image's pixels have certain values, and the color histogram retrieval system ignores image dimensions and orientation. Users are irritated by the time and effort needed to find a desired image in a large and varied data corpus, and the growing interest in this field of active research and development is largely attributable to the need to solve the challenges that arise in image retrieval.
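The query engine described above compares color and edge histograms to rank stored images against the query. The sketch below, assuming OpenCV's histogram comparison and a Canny-based edge-orientation histogram, shows one plausible form of that comparison rather than the specific engine of Figure 1.

    # Sketch: ranking database images against a query using color and edge histograms.
    import cv2
    import numpy as np

    def color_edge_features(path):
        img = cv2.imread(path)
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        color = cv2.calcHist([hsv], [0, 1], None, [16, 8],
                             [0, 180, 0, 256]).flatten()
        # Edge histogram: distribution of gradient orientations on Canny edges.
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 100, 200)
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
        angles = np.arctan2(gy, gx)[edges > 0]
        edge, _ = np.histogram(angles, bins=8, range=(-np.pi, np.pi))
        feats = np.concatenate([color, edge]).astype(np.float32)
        return feats / (feats.sum() + 1e-9)

    query = color_edge_features("query.jpg")               # hypothetical query image
    database = {"a.jpg": color_edge_features("a.jpg")}     # hypothetical image store
    ranked = sorted(database,
                    key=lambda p: cv2.compareHist(query, database[p],
                                                  cv2.HISTCMP_BHATTACHARYYA))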
Image features are what make each picture unique and allow us to use what we know about image similarity to find matching photos. Common visual properties of objects, such as color, shape, texture, and spatial relationships, are represented by feature primitives, and several CBIR frameworks make use of them. Feature primitives are the foundational building blocks of feature tools; simply put, feature tools allow for the automated generation of features, which are groups of related data sets. In CBIR, the color feature is among the most popular image features. Features such as color, texture, and shape are general features, whereas application-oriented domain-specific features include things like fingerprints and facial features. The human eye has a color perception bias, making color one of the most fundamental tools for visual processing and organization. For these reasons, when representing picture content, these features should be thought of as the most basic attributes. Color features are extremely helpful in picture retrieval and can sometimes provide the most crucial information for carrying out the classification process.
The term "color" refers to the subjective human perception of light's visibility as it varies with wavelength, intensity, and a network of interconnected channels. An average for a set of four pixels, as described by the organization of wavelengths, is reduced in relation to half the bandwidth needed to broadcast the full resolution of viewable light to the human visual apparatus. Color and saturation are two of its characteristics. Relative to the impact of worrying relationship awareness, human color sensitivity is based. Prescribed color descriptors can only be created by precisely defining the color space as a single entity. Color separation and a metric for comparing hues should also be well-defined. While certain colors in a supreme color space are independent of all other factors, many widely used color palettes aren't actually supreme, despite widespread belief to the contrary based on how accurately they portray their essential components. Color sets like L*a*b* represent a theoretically true color that, with the right viewing equipment, can be accurately reproduced in an ideal setting.
Color quantization is applied when it is necessary to decrease the amount of color information in an image. Converting 24-bit color images to 8-bit color images is one commonplace scenario. It is also useful for reducing the number of colors used to represent a picture when building color histograms. Two considerations arise regarding color: the first is which colors from the original color set are retained in the newly acquired image, and the second is the method by which the excess colors are mapped onto the retained ones.
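A minimal sketch of the color quantization step described above, assuming a simple uniform reduction of each 8-bit channel; k-means palette quantization is another common choice not shown here.

    # Sketch: uniform color quantization of a 24-bit image before histogramming.
    import cv2
    import numpy as np

    img = cv2.imread("photo.jpg")                 # hypothetical 24-bit BGR image

    # Keep 4 levels per channel (64 colors total), mapping the excess colors
    # onto the nearest retained level.
    levels = 4
    step = 256 // levels
    quantized = (img // step) * step + step // 2

    # The reduced palette yields a compact 64-bin color histogram.
    idx = (quantized[..., 0] // step) * levels * levels \
        + (quantized[..., 1] // step) * levels \
        + (quantized[..., 2] // step)
    hist = np.bincount(idx.ravel(), minlength=64).astype(np.float32)
    hist /= hist.sum()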
An ideal quantization of the color range would place perceptually distinct colors in different sub-cubes rather than assigning similar colors to the same sub-cube. Using fewer colors decreases the likelihood that genuinely different colors receive different bins, while using more colors increases the likelihood that distinct colors are assigned to distinct bins, so there is room for improvement in how well the description captures the visual content. As is generally known, histograms with a larger number of bins provide a greater capacity to discriminate the visual information, because fewer instances of distinct hues are allocated to the same bin. The drawback of this approach is that it raises the storage requirement of the metadata and the time needed to compute distances between color histograms, as a consequence of distinct colors more often falling into different bins. Because histograms carry so much information, the user needs a thorough understanding of how to choose the number of bins. After a thorough investigation across the CBIR literature, the number of bins was fixed at 64.
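The claims below refer to a quadratic distance similarity measure over the feature vector. The sketch that follows shows the standard quadratic-form histogram distance under an assumed bin-similarity matrix, as one plausible reading of that measure rather than its definitive form.

    # Sketch: quadratic-form distance between two 64-bin color histograms.
    import numpy as np

    def quadratic_form_distance(h1, h2, A):
        """d(h1, h2) = sqrt((h1 - h2)^T A (h1 - h2)) with bin-similarity matrix A."""
        d = h1 - h2
        # Clamp tiny negative values caused by numerical error before the sqrt.
        return float(np.sqrt(max(d @ A @ d, 0.0)))

    # Assumed bin-similarity matrix: 1 on the diagonal, decaying with bin distance.
    bins = 64
    i, j = np.meshgrid(np.arange(bins), np.arange(bins), indexing="ij")
    A = 1.0 - np.abs(i - j) / (bins - 1)

    h1 = np.random.dirichlet(np.ones(bins))       # placeholder histograms
    h2 = np.random.dirichlet(np.ones(bins))
    print(quadratic_form_distance(h1, h2, A))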
The suggested CBIR method is based on research into image classification and retrieval systems that can determine a document's category by examining its visual content using statistical supervised learning. Whether using a set of labelled examples or a collection of photos, both types of image analysis look for semantic similarities between the images and either the dependent class label or other images. The ability to understand the semantic significance of visual input is an important and crucial component of any CBIR system. The end goal is to create a method that can annotate visual content in a human-like manner.
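The abstract and claims 3 and 4 describe each classifier reaching its own decision before the outputs are combined through decision-level fusion, with a support vector machine among the classifiers. The sketch below shows one generic way such late fusion can be wired together; the individual feature pipelines, the placeholder data, and the weighted-vote rule are illustrative assumptions, not the claimed routine itself.

    # Sketch: decision-level (late) fusion of an image classifier and a text classifier.
    import numpy as np
    from sklearn.svm import SVC

    # Hypothetical pre-computed feature matrices for the same set of documents.
    image_features = np.random.rand(20, 64)     # e.g. fused color/shape vectors
    text_features = np.random.rand(20, 100)     # e.g. keyword/cluster features
    labels = np.random.randint(0, 2, size=20)   # placeholder class labels

    # Each modality gets its own classifier and reaches its own decision.
    image_clf = SVC(kernel="rbf", probability=True).fit(image_features, labels)
    text_clf = SVC(kernel="linear", probability=True).fit(text_features, labels)

    def fuse_decisions(img_vec, txt_vec, w_img=0.5, w_txt=0.5):
        """Weighted combination of per-classifier class probabilities (assumed rule)."""
        p_img = image_clf.predict_proba(img_vec.reshape(1, -1))[0]
        p_txt = text_clf.predict_proba(txt_vec.reshape(1, -1))[0]
        return int(np.argmax(w_img * p_img + w_txt * p_txt))

    print(fuse_decisions(image_features[0], text_features[0]))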

Claims:
The scope of the invention is defined by the following claims:
1. A System/Method for Document Classification using Content Based Image Retrieval Technique comprising the steps of:
a) providing a single platform for text and image processing; clustering and classification are two of the most popular approaches used in machine learning, separate methods exist for each, and both strategies are taken into account by the suggested system;
b) applying the classification method to images, while the clustering method is adapted for text.
2. According to claim 1, a quadratic distance similarity measure is used on the feature vector.
3. According to claim 1, the CBIR methodology is used for adapting a decision fusion routine.
4. According to claim 1, a support vector machine algorithm is used to classify the documents.

Documents

Application Documents

# Name Date
1 202441032318-REQUEST FOR EARLY PUBLICATION(FORM-9) [24-04-2024(online)].pdf 2024-04-24
2 202441032318-FORM-9 [24-04-2024(online)].pdf 2024-04-24
3 202441032318-FORM FOR SMALL ENTITY(FORM-28) [24-04-2024(online)].pdf 2024-04-24
4 202441032318-FORM 1 [24-04-2024(online)].pdf 2024-04-24
5 202441032318-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [24-04-2024(online)].pdf 2024-04-24
6 202441032318-EVIDENCE FOR REGISTRATION UNDER SSI [24-04-2024(online)].pdf 2024-04-24
7 202441032318-EDUCATIONAL INSTITUTION(S) [24-04-2024(online)].pdf 2024-04-24
8 202441032318-DRAWINGS [24-04-2024(online)].pdf 2024-04-24
9 202441032318-COMPLETE SPECIFICATION [24-04-2024(online)].pdf 2024-04-24