
A Device To Classify Labelling Complexity Of An Image And A Training Method Thereof.

Abstract: A device (10) to classify labelling complexity of an image and a method (200) for training an AI Model (1011) thereof. The device (10) is characterized by the functionality of a processing unit (101) inside the device (10). The processing unit (101) trains the AI Model (1011) to categorize the complexity of an image through a supervised learning mechanism in accordance with method steps (200). The processing unit (101) comprising the trained AI Model (1011) is configured to extract a plurality of feature vectors from the image and feed them as input to the trained AI Model (1011) to classify labelling complexity of the image. The AI Model (1011) is trained to correlate feature vectors and number of edges to a corresponding image category. Figure 1.


Patent Information

Application #
Filing Date
30 September 2022
Publication Number
14/2024
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application

Applicants

Bosch Global Software Technologies Private Limited
123, Industrial Layout, Hosur Road, Koramangala, Bangalore – 560095, Karnataka, India
Robert Bosch GmbH
Feuerbach, Stuttgart, Germany

Inventors

1. Ashutosh Kumar Agrawal
C-470, Mahaveer Tuscan, Hoodi Circle, Bangalore, Karnataka, India – 560048
2. Amit Kale
B-407 Raheja Residency Koramangala 3rd Block Bangalore 560034, Karnataka, India
3. Srihari Humbarwadi
#227, Guruprasad Colony, Tilakwadi, Belgaum, Karnataka, India - 590006

Specification

Description: Complete Specification:
The following specification describes and ascertains the nature of this invention and the manner in which it is to be performed

Field of the invention
[0001] The present disclosure relates to the field of data management and annotation. In particular, it discloses a device to classify labelling complexity of an image and a method for training an AI model thereof.

Background of the invention
[0002] Image classification, object detection, semantic segmentation and instance segmentation are all tasks of interest in computer vision. The recent success of deep learning approaches for computer vision tasks has led to the need for large amounts of annotated data. Labelling complexity and costs increase in the order of these tasks, the highest cost being incurred for instance-level segmentation, where every instance of an object such as a vehicle or human has to be annotated at pixel level. Cost-aware data pre-selection, which aims at segregating data (example data sources: camera, radar and lidar) into simple, medium and complex categories in proportion to the labelling complexity of the data, therefore becomes an important parameter, since it has a direct impact on quality control and labelling cost.

[0003] Chinese Patent Application CN109934834A titled "Image contour extraction method and system" establishes methods to extract contours from binary images in the form of 2D coordinates. The research paper titled "Measuring the Complexity of Polygonal Objects" introduces a metric to quantify the complexity of 2D polygon shapes using the convex hull and contours of the polygon. The research paper titled "No-Reference Image Quality Assessment in the Spatial Domain" proposes a standalone statistical method called the blind/reference-less image spatial quality evaluator (BRISQUE) to assess the quality of an image by regressing the image quality score directly from the input image. The proposed idea in the paper assesses image quality based on scene statistics and distortions that might be present in it.

[0004] The effort or the complexity required to annotate or label objects in an image is directly proportional to the amount of time needed to fully annotate the image. In the case of generating labels for image segmentation task, this would be the total time taken to generate masks for all the objects present in the image. There is a need for an automated system or device capable of classifying the input images into multiple categories that vary in terms of the effort required to generate labels for the task of image segmentation. There is a need for a device that assesses the shape complexity of the objects present in the image and uses this as an indicator of the labelling effort required.

Brief description of the accompanying drawings
[0005] An embodiment of the invention is described with reference to the following accompanying drawings:
[0006] Figure 1 depicts a device (10) to classify labelling complexity of an image;
[0007] Figure 2 illustrates method steps of training an AI Model (1011) to classify labelling complexity of an image; and
[0008] Figure 3 illustrates the training mechanism for training the AI Model (1011) to classify labelling complexity of an image.

Detailed description of the drawings
[0009] Figure 1 depicts a device (10) to classify labelling complexity of an image. The device (10) comprises at least a processing unit (101) executing a trained AI Model (1011). The processing unit (101) is adapted to receive image data from at least one image source (102), such as any imaging unit, LIDAR or an edge device capable of storing image data. The edge device could be anything enabled to either capture images or store captured images. In an alternate embodiment, the source (102) can be extended to other data sources like radars and lidars, with certain changes in the feature extraction strategies based on the data type. The communication link can be enabled via known state-of-the-art technologies, for example 5G communications. Optionally, the device (10) can be in communication with at least one edge device having an output interface (103).

[0010] The processing unit (101) can be defined as either a logic circuitry or a software program that responds to and processes logical instructions to get a meaningful result. A hardware-based processing unit (101) may be implemented in the device (10) as one or more microchips or integrated circuits interconnected using a parent board, hardwired logic, software stored by a memory device and executed by a microprocessor, firmware, an application specific integrated circuit (ASIC), and/or a field programmable gate array (FPGA), and/or any component that operates on signals based on operational instructions.

[0011] The processing unit (101) executes at least one artificial intelligence (AI) Model (1011). An AI Model (1011) with reference to this disclosure can be explained as a reference or an inference set of data, which uses different forms of correlation matrices. Using these models and the data from these models, correlations can be established between different types of data to arrive at some logical understanding of the data. A person skilled in the art would be aware of the different types of AI Models (1011) such as linear regression, naïve Bayes classifier, support vector machine, neural networks and the like. A person skilled in the art will also appreciate that the AI module may be implemented as a set of software instructions, a combination of software and hardware, or any combination of the same. In an exemplary embodiment of the present invention, the AI Model (1011) is a convolutional neural network. The neural network could be software residing in the device (10) or a cloud, or embodied within an electronic chip. There are special neural network chips, known as specialized silicon chips, which incorporate AI technology and are used for machine learning.

[0012] The device (10) is characterized by the functionality of the processing unit (101). The processing unit (101) trains the AI Model (1011) to categorize the complexity of an image through a supervised learning mechanism in accordance with method steps (200). During the training of the AI Model (1011), the processing unit (101) is configured to: draw out a 2-D mask of the image; generate a label map of the image from the 2-D mask of the image; count a number of edges in the label map of the image; categorize the image in a category based on the number of edges counted; and feed the plurality of feature vectors, the number of edges and the image category as an input dataset to the un-trained AI Model (1011).

[0013] Once this trained AI Model (1011) is deployed in the processing unit (101), it has learnt to correlate feature vectors and the number of edges to a corresponding image category. The processing unit (101) comprising the trained AI Model (1011) is configured to: extract a plurality of feature vectors from the image; feed the plurality of feature vectors as input to the trained AI Model (1011); and execute the AI Model (1011) to classify labelling complexity of the image.
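The inference path described in this paragraph can be sketched as a minimal stub. `extract_features` and `model` are hypothetical callables standing in for the processing unit's feature extractor and the trained AI Model (1011); the disclosure does not prescribe their concrete form.

```python
def classify_image(image, extract_features, model):
    # Inference path: extract feature vectors from the image, feed them
    # to the trained model, and return the predicted complexity class.
    # Both callables are illustrative placeholders.
    features = extract_features(image)
    probs = model(features)  # model returns one probability per category
    categories = ("easy", "medium", "hard")
    return categories[probs.index(max(probs))]
```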

[0014] It should be understood at the outset that, although exemplary embodiments are illustrated in the figures and described below, the present disclosure should in no way be limited to the exemplary implementations and techniques illustrated in the drawings and described below.

[0015] Figure 2 illustrates method steps of training an AI Model (1011) to classify labelling complexity of an image. The AI Model (1011) and its corresponding device (10), along with the relevant components of the device (10), have been explained in accordance with figure 1.

[0016] Method step 201 comprises drawing out a 2-D mask of the image by means of the processing unit (101). Each object in the image, for example a car or a human being, is represented by a polygon. A 2-D mask of an image uses image segmentation to represent the image in the form of outlines for all the polygons in the image. While drawing out a 2-D mask, the processing unit (101) counts the minimum number of 2D coordinate points required to represent each polygon segment.

[0017] Method step 202 comprises generating a label map of the image from the 2-D mask of the image by means of the processing unit (101). Generating the label map of the image is done by contour extraction and approximation to extract polygon segments. Method step 203 comprises counting a number of edges in the label map of the image by means of the processing unit (101). For each polygon segment, the minimum number of 2D coordinate points required to represent the segment without degrading its quality is stored. To count the edges, we sum the total number of points required to represent all the polygon segments in a mask and use this sum as the label to train our AI Model (1011).
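The minimum-point counting in steps 202 and 203 can be sketched with Ramer-Douglas-Peucker simplification, one plausible polygon-approximation choice; the disclosure does not name a specific algorithm, and the tolerance `tol` (the allowed deviation before quality is considered degraded) is an assumed parameter.

```python
import math

def _point_line_dist(p, a, b):
    # Perpendicular distance from point p to the line through a and b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((bx - ax) * (ay - py) - (ax - px) * (by - ay))
    den = math.hypot(bx - ax, by - ay)
    return num / den if den else math.hypot(px - ax, py - ay)

def simplify(points, tol):
    # Ramer-Douglas-Peucker: minimum points needed to represent a
    # polyline while deviating no more than `tol` from the original.
    if len(points) < 3:
        return list(points)
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_line_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    if dmax > tol:
        left = simplify(points[:idx + 1], tol)
        right = simplify(points[idx:], tol)
        return left[:-1] + right
    return [points[0], points[-1]]

def count_edges(segments, tol=1.0):
    # Sum of minimum coordinate points over all polygon segments in a
    # mask; this total serves as the training label for the AI model.
    return sum(len(simplify(seg, tol)) for seg in segments)
```

Near-collinear points collapse to their endpoints, so an image full of simple rectangular objects yields a much lower edge count than one with irregular, curved object boundaries.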

[0018] Method step 204 comprises categorizing the image in a category based on the number of edges counted. Since this is a supervised training mechanism, based on ground truth from manual labelling, we categorize the labelling complexity of the image into three categories: easy, medium and hard.
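A minimal sketch of the categorization in step 204, assuming illustrative edge-count thresholds; the disclosure does not specify the cut-off values, so `easy_max` and `medium_max` are hypothetical parameters.

```python
def categorize(num_edges, easy_max=50, medium_max=200):
    # Map the total edge count of an image to a complexity category.
    # The two thresholds are assumed values for illustration only.
    if num_edges <= easy_max:
        return "easy"
    if num_edges <= medium_max:
        return "medium"
    return "hard"
```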

[0019] Method step 205 comprises extracting a plurality of feature vectors from the image by means of the processing unit (101). A feature vector is defined as the vector containing multiple elements or features that may represent a pixel or a whole object in an image. Examples of features are color components, length, area, circularity, gradient magnitude, gradient direction, or simply the gray-level intensity value.
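A toy feature extractor along these lines, computing grey-level and gradient statistics for a small grayscale image; the concrete features used by the device are not specified in the disclosure, so this picks three of the examples listed above (intensity, spread, gradient magnitude).

```python
import math

def feature_vector(gray):
    # gray: 2-D list of grey-level intensities (0-255). Returns a small
    # illustrative feature vector: mean intensity, intensity spread
    # (standard deviation), and mean gradient magnitude.
    h, w = len(gray), len(gray[0])
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    grads = [
        math.hypot(gray[y][x + 1] - gray[y][x], gray[y + 1][x] - gray[y][x])
        for y in range(h - 1) for x in range(w - 1)
    ]
    return [mean, std, sum(grads) / len(grads)]
```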

[0020] Method step 206 comprises feeding the plurality of feature vectors, the number of edges and at least the image category as input to the AI Model (1011).

[0021] Figure 3 illustrates the training mechanism for training the AI Model (1011) to classify labelling complexity of an image. Method step 207 comprises training the AI Model (1011) to classify labelling complexity of the image based on the correlation between feature vectors, the number of edges and the corresponding image category. This correlation is based upon regression. However, directly regressing the feature vectors to the exact number of edges required to represent all the polygon segments (total_size) does not yield good results. Hence, we instead break down our score into multiple categories and convert this into a three-way classification problem with 'easy', 'medium' and 'hard' as our categories. These categories represent the relative complexity of a given image.
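One plausible way to turn the raw total_size score into the three-way classification is a tertile split over the training set's scores; the actual split rule is not given in the disclosure, so the tertile choice here is an assumption.

```python
def bucket_labels(total_sizes):
    # Convert per-image edge-count scores into 'easy'/'medium'/'hard'
    # labels by splitting the dataset at its tertiles, so categories
    # express complexity relative to the rest of the dataset.
    ranked = sorted(total_sizes)
    n = len(ranked)
    t1, t2 = ranked[n // 3], ranked[(2 * n) // 3]

    def label(score):
        return "easy" if score <= t1 else ("medium" if score <= t2 else "hard")

    return [label(s) for s in total_sizes]
```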

[0022] For each input image, the processing unit (101) and the AI Model (1011) are trained to extract features and output probabilities for each class (with reference to this disclosure the class for the AI Model (1011) is the image category i.e. easy, medium, hard) in a single forward pass. The class probabilities are then used to determine complexity and hence assess the labelling effort required for the given image.
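The single-forward-pass class probabilities can be sketched as a softmax over the network's three output logits, with the arg-max giving the predicted labelling complexity; the logit values themselves would come from the trained network.

```python
import math

CATEGORIES = ("easy", "medium", "hard")

def classify_from_logits(logits):
    # Numerically stable softmax over three output logits; the class
    # with the highest probability is the predicted complexity.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return CATEGORIES[probs.index(max(probs))], probs
```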

[0023] A person skilled in the art will appreciate that while these method steps describe only a series of steps to accomplish the objectives, these methodologies may be implemented through adaptations and modifications to the disclosed processing unit (101) and the device (10).

[0024] This proposed method (200) aims to streamline the process of generating ground truth labels for training AI Model (1011)s that solve various computer vision tasks. The ability to predict the effort required to label data enables us to optimize for cost and speed of the labelling process, hence greatly increasing its efficiency. This idea to develop a device (10) to classify labelling complexity of an image and a training method thereof provides an end-to-end system for directly predicting labelling efforts for a given set of images. This further permits setting cost to label dynamically based on predicted complexity and effort.

[0025] It must be understood that the embodiments explained in the above detailed description are only illustrative and do not limit the scope of this invention. Any modification or customization of the device (10) to classify labelling complexity of an image and the training method thereof is envisaged and forms a part of this invention. The scope of this invention is limited only by the claims.

Claims:
We Claim:
1. A method (200) of training an AI Model (1011) to classify labelling complexity of an image, said AI Model (1011) in communication with a processing unit (101), said method steps comprising:
drawing out a 2-D mask of the image by means of the processing unit (101);
generating a label map of the image from the 2-D mask of the image by means of the processing unit (101);
counting a number of edges in the label map of the image by means of the processing unit (101);
categorizing the image in a category based on the number of edges counted;
extracting a plurality of feature vectors from the image by means of the processing unit (101);
feeding the plurality of feature vectors, the number of edges and the image category as input to the AI Model (1011);
training the AI Model (1011) to classify labelling complexity of the image based on correlation between feature vectors, number of edges and the corresponding image category.

2. The method (200) of training an AI Model (1011) as claimed in claim 1, wherein drawing out a 2-D mask further comprises recording the minimum number of 2D coordinate points required to represent each polygon segment.
3. The method (200) of training an AI Model (1011) as claimed in claim 1, wherein generating label map of the image is done by contour extraction and approximation to extract polygon segments.

4. The method (200) of training an AI Model (1011) as claimed in claim 1, wherein category of the image is classified into easy, medium and hard.

5. A device (10) to classify labelling complexity of an image, said device (10) comprising at least a processing unit (101) executing a trained AI Model (1011), said processing unit (101) adapted to receive image data from at least one source (102), the processing unit (101) configured to:
extract a plurality of feature vectors from the image;
feed the plurality of feature vectors as input to the AI Model (1011);
execute the AI Model (1011) to classify labelling complexity of the image.

6. The device (10) to classify labelling complexity of an image as claimed in claim 5, wherein the processing unit (101) during the training of the AI Model (1011) is configured to:
draw out a 2-D mask of the image by means of the processing unit (101);
generate a label map of the image from the 2-D mask of the image by means of the processing unit (101);
count a number of edges in the label map of the image by means of the processing unit (101);
categorize the image in a category based on the number of edges counted; and
feed the plurality of feature vectors, number of edges and image category as an input dataset to the un-trained AI Model (1011).

7. The device (10) to classify labelling complexity of an image as claimed in claim 5, wherein the trained AI Model (1011) is trained to correlate between feature vectors and number of edges to a corresponding image category.

8. The device (10) to classify labelling complexity of an image as claimed in claim 5, wherein category of the image is classified into easy, medium and hard.

9. The device (10) to classify labelling complexity of an image as claimed in claim 6, wherein the processing unit (101) is configured to draw out a 2-D mask by recording the minimum number of 2D coordinate points required to represent each polygon segment.

10. The device (10) to classify labelling complexity of an image as claimed in claim 6, wherein the processing unit (101) is configured to generate label map of the image by contour extraction and approximation to extract polygon segments.

Documents

Application Documents

# Name Date
1 202241056089-POWER OF AUTHORITY [30-09-2022(online)].pdf 2022-09-30
2 202241056089-FORM 1 [30-09-2022(online)].pdf 2022-09-30
3 202241056089-DRAWINGS [30-09-2022(online)].pdf 2022-09-30
4 202241056089-DECLARATION OF INVENTORSHIP (FORM 5) [30-09-2022(online)].pdf 2022-09-30
5 202241056089-COMPLETE SPECIFICATION [30-09-2022(online)].pdf 2022-09-30
6 202241056089-FORM 18 [24-01-2025(online)].pdf 2025-01-24