Abstract: The present disclosure relates to a system (100) for edge-based machine learning and object classification. The system includes a video frame buffer (102) that captures video frames from a video sensor, and a processor (106) coupled to the video frame buffer. The processor is configured to extract the frames from the video frame buffer, read each entire frame, and estimate the edge contours of the different objects in the frame. The processor (106) trains a neural network model on the edge contours of the different objects, generates the features or weight vectors of the different objects, and performs decision-making between the tested object features and the trained object features to classify the correct object, wherein each detected object is classified and labelled with a confidence value, and any or a combination of a video display and a self-learning unit is selected based on the confidence value.
Description: TECHNICAL FIELD
[0001] The present disclosure relates, in general, to object classification, and more specifically, relates to a method and system for edge-based machine learning and object classification.
BACKGROUND
[0002] One example known in the art, US20060018521A1, titled “Object classification using image segmentation”, describes a method that represents a class of objects by first acquiring a set of positive training images of the class of objects. A matrix is constructed from the set of positive training images. Each row in the matrix corresponds to a vector of intensities of pixels of one positive training image. Correlated intensities are grouped into a set of segments of a feature mask image. Each segment includes a set of pixels with correlated intensities. A set of features is assigned to each pixel in each subset of representative pixels of each segment of the feature mask image to represent the class of objects.
[0003] Another example, US20100054535A1, titled “Video object classification”, describes a technique for classifying one or more objects in at least one video, wherein the at least one video comprises a plurality of frames. One or more objects in the plurality of frames are tracked. A level of deformation is computed for each of the one or more tracked objects in accordance with at least one change in a plurality of histograms of oriented gradients for the corresponding tracked object. Each of the one or more tracked objects is classified in accordance with the computed level of deformation.
[0004] Yet another example, WO2016095117A1, titled “Object detection with neural network”, describes an apparatus comprising at least one processing core and at least one memory including computer program code, the at least one memory and the computer program code being configured to, with the at least one processing core, cause the apparatus at least to run a convolutional neural network comprising an input layer arranged to provide signals to a first convolution layer and a last convolution layer, run a first intermediate classifier operating on a set of feature maps of the first convolution layer, and decide to abort or to continue processing of a signal set based on a decision of the first intermediate classifier.
[0005] Therefore, it is desired to overcome the drawbacks, shortcomings, and limitations associated with existing solutions, and develop a system that relates to edge-based machine learning and object classification.
OBJECTS OF THE PRESENT DISCLOSURE
[0006] An object of the present disclosure is to provide a method and system for edge-based machine learning and object classification.
[0007] Another object of the present disclosure is to provide a system that estimates the confidence value or the accuracy value of the model using a convolutional neural network for object classification.
[0008] Another object of the present disclosure is to provide a system that provides newly updated information i.e., object contours that are fed back to the neural network model for further training to give accurate results.
[0009] Yet another object of the present disclosure is to provide a closed-loop system that is reliable, stable, and gives accurate results because the number of iterations for training the neural network model is chosen optimally.
SUMMARY
[0010] The present disclosure relates, in general, to object classification, and more specifically, relates to a method and system for edge-based machine learning and object classification. The main objective of the present disclosure is to overcome the drawbacks, limitations, and shortcomings of existing systems and solutions by providing a method and system for edge-based machine learning and object classification. The present disclosure pertains to object classification using edge contours. In this method, edge detection is performed to find the edge contours of each object in the entire frame. The edge contours of multiple objects in a frame are fed to a model that can be trained by a neural network. During training, the model generates the weight vectors of each object in the entire frame. During testing, a confidence value or accuracy value of the trained neural network model for object classification can be estimated, and decision-making is done with respect to the confidence value for better classification.
[0011] The present disclosure mainly relates to a method and system for edge-based machine learning and object classification using a neural network model. The present invention pertains to the classification of multiple objects using edge-based contours. Edge detection is performed on the entire input frame and the extracted contours are stored in a database. The dataset is fed to the neural network model for training. The trained model generates the weight vectors or features for the different objects. In this method, edge detection is performed on the entire input video frames captured from the video sensor. The machine learning technique is a neural network model that is trained on the edge contours of the entire frame of the input video.
[0012] The confidence value or the accuracy value of the model is estimated using a convolutional neural network for object classification. The trained neural network model generates the weight vectors or features for different objects. These features of different objects are stored in a database in the form of a lookup table. This lookup table is mapped with different types of objects. During the testing phase, the objects with different features are fed to a machine-learning model. The model performs edge detection for the entire frame and extracts the edge contours from it. These edge contours are compared with the already stored reference feature vectors from the database to classify the objects.
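By way of a non-limiting illustration, the lookup-table comparison described above could be sketched as follows in Python. The feature dimensionality, the cosine-similarity metric, and the names `reference_features` and `classify_features` are assumptions for illustration only and are not mandated by the disclosure.

```python
import numpy as np

# Hypothetical lookup table mapping object labels to trained feature (weight) vectors.
reference_features = {
    "car":    np.random.rand(128),   # placeholders for trained weight vectors
    "person": np.random.rand(128),
    "truck":  np.random.rand(128),
}

def classify_features(test_vector: np.ndarray):
    """Compare a tested feature vector against the stored reference vectors
    and return the best-matching label with its similarity score."""
    best_label, best_score = None, -1.0
    for label, ref in reference_features.items():
        score = float(np.dot(test_vector, ref) /
                      (np.linalg.norm(test_vector) * np.linalg.norm(ref) + 1e-12))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score
```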
[0013] The reference feature vectors, i.e., edge contours, are used as an inference for comparison purposes in a decision-making block. These reference feature vectors get updated by the self-learning block with respect to the confidence value score.
[0014] Various objects, features, aspects, and advantages of the inventive subject matter will become more apparent from the following detailed description of preferred embodiments, along with the accompanying drawing figures in which like numerals represent like components.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The following drawings form part of the present specification and are included to further illustrate aspects of the present disclosure. The disclosure may be better understood by reference to the drawings in combination with the detailed description of the specific embodiments presented herein.
[0016] FIG. 1 is a detailed block diagram of object classification using edge-based machine learning, in accordance with an embodiment of the present disclosure.
[0017] FIG. 2 is an exemplary flow diagram of a method for object classification using edge-based machine learning, in accordance with an embodiment of the present disclosure.
[0018] FIG. 3 is an exemplary flow diagram of a method for edge-based machine learning and object classification, in accordance with an embodiment of the present disclosure.
DETAILED DESCRIPTION
[0019] The following is a detailed description of embodiments of the disclosure depicted in the accompanying drawings. The embodiments are in such detail as to clearly communicate the disclosure. If the specification states a component or feature “may”, “can”, “could”, or “might” be included or have a characteristic, that particular component or feature is not required to be included or have the characteristic.
[0020] As used in the description herein and throughout the claims that follow, the meaning of “a,” “an,” and “the” includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
[0021] The present disclosure relates, in general, to object classification, and more specifically, relates to a method and system for edge-based machine learning and object classification.
[0022] The advantages achieved by the system of the present disclosure will be clear from the embodiments provided herein. The system estimates the confidence value or the accuracy value of the model using a convolutional neural network for object classification. The system provides newly updated information, i.e., object contours, that is fed back to the neural network model for further training to give accurate results. The closed-loop system is reliable, stable, and gives accurate results because the number of iterations for training the neural network model is chosen optimally. The description of terms and features related to the present disclosure shall be clear from the embodiments that are illustrated and described; however, the invention is not limited to these embodiments only. Numerous modifications, changes, variations, substitutions, and equivalents of the embodiments are possible within the scope of the present disclosure. Additionally, the invention can include other embodiments that are within the scope of the claims but are not described in detail with respect to the following description.
[0023] FIG. 1 is a detailed block diagram of object classification using edge-based machine learning, in accordance with an embodiment of the present disclosure.
[0024] Referring to FIG. 1, system 100 for edge-based machine learning and object classification is disclosed. The block diagram is divided into two phases i.e., the training phase and the testing phase. The system 100 can include video frame capturing block 102, video sensor 104, a processor 106, edge detection block 108, a trained neural network model 110, database 112, decision-making block 114, self-learning block 116, updated object features 118, neural network model 120 and video display 122.
[0025] In the training phase, the video frame capturing block 102 captures the video frames from the video sensor 104. The captured video frames are stored in an internal video buffer of the processor 106. The method can be implemented on a processor that supports an AI engine to run neural network models for fast processing.
[0026] The edge detection block 108 reads the entire frame from the video frame buffer and estimates the edge contours of the different objects in a frame. The neural network model 120 may be trained on these object contours. The trained model generates the features or weight vectors of the different objects. These object features are stored in the database 112, which can be used as an inference. The original database captured from the video sensor 104 is divided in an 80:20 fashion, where 80% of the captured database is used for training the model using neural networks and the remaining 20% is used for testing purposes.
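A minimal sketch of the 80:20 split described above is given below; the representation of the contour dataset as a list of (contour, label) pairs and the fixed random seed are illustrative assumptions.

```python
import random

def split_dataset(contour_samples, train_ratio=0.8, seed=42):
    """Shuffle the extracted edge-contour samples and split them in an
    80:20 fashion into training and testing subsets."""
    samples = list(contour_samples)
    random.Random(seed).shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

# Example usage with a hypothetical list of (contour_image, label) pairs:
# train_set, test_set = split_dataset(dataset)
```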
[0027] Testing phase: During testing, the trained neural network model is validated with the remaining 20% of the database. Edge detection estimates the object contours, which are given to the trained neural network model. The trained model generates the object features or weight vectors. These tested object features are compared with the trained object features stored in the database, which are used as an inference. Decision-making is then performed between the tested object features and the trained object features. The right decision is required to classify the correct object. The objects in a video frame are detected first, and each detected object is classified and labelled with a confidence value. The confidence value is denoted by the symbol ‘α’ (alpha) and varies from 0% to 100%. Thus, object classification can be achieved from the decision-making block.
[0028] However, in some cases the decision-making block 114 may not classify the correct object. A confidence value of α > 80% indicates that the correct object has been classified, while a confidence value of α < 80% indicates a falsely classified object. If the trained model wrongly classifies an object, then that particular object's features or contours are given to the self-learning block 116 in a feedback path. The self-learning block 116 records those object features and maps them to the correct object. Thus, the falsely classified object features are updated and mapped to the correct object in the database. These updated object features 118 are given back to the neural network model 120 for further training. Hence, the method forms a closed loop to correct or rectify the falsely classified objects. The closed-loop operation is stable, accurate, and error-free (the Mean Square Error (MSE) is minimum or converges to zero).
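A minimal sketch of the self-learning update described above, assuming the database 112 is represented as an in-memory dictionary and that the correct label is supplied externally (for example, from ground-truth annotation); these representational choices are illustrative assumptions, not part of the disclosure.

```python
def self_learning_update(database: dict, object_features, correct_label: str) -> dict:
    """Record the features of a falsely classified object and map them to the
    correct object in the database, so they can be fed back for retraining."""
    database.setdefault(correct_label, []).append(object_features)
    # The updated object features (118) are then given back to the
    # neural network model (120) for further training.
    return database
```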
[0029] The present disclosure pertains to electro-optic (EO) sensor-based object classification using edge-based machine learning, and provides EO sensor-based object detection and classification systems and methods for real-time object classification. In an exemplary embodiment, the EO-based object classifiers work on neural network models, especially convolutional neural networks. The machine learning mainly depends on a neural network to build, train, and test the model.
[0030] In this method, the edge detection, the training and testing of the model, the decision-making, and the self-learning for object classification are executed by the processor. The video frame capturing block captures frames from the input video sensor. The captured frames are stored in an internal video frame buffer of the processor. The processor extracts the frames from the video frame buffer and sends them to the edge detection block. The processor executes the edge detection and finds the edge contours of each object in the entire frame. Edge detection scans the pixels horizontally and vertically. The edge contours of each object are calculated as follows:
[0031] The edge in the horizontal direction can be estimated from the difference between two successive pixels, x(i) and x(i+1):
[0032] If (x(i+1) − x(i) > Threshold) then x(i+1) = 250; else x(i+1) = 10;
[0033] The edge in the vertical direction can be estimated from the difference between two successive pixels, y(i) and y(i+1):
[0034] If (y(i+1) − y(i) > Threshold) then y(i+1) = 250; else y(i+1) = 10;
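A minimal Python sketch of the horizontal and vertical edge estimation described in paragraphs [0031] to [0034] is shown below. The threshold value and the grayscale input format are assumptions; the output values 250 and 10 follow the disclosure, and the signed pixel difference is used exactly as stated (an absolute difference is a common variant).

```python
import numpy as np

def edge_contours(frame: np.ndarray, threshold: int = 30) -> np.ndarray:
    """Estimate edges by thresholding differences of successive pixels,
    horizontally and vertically; edge pixels are set to 250, others to 10."""
    frame = frame.astype(np.int32)
    edges = np.full(frame.shape, 10, dtype=np.int32)
    # Horizontal direction: x(i+1) - x(i) > Threshold along each row.
    horizontal = np.diff(frame, axis=1) > threshold
    edges[:, 1:][horizontal] = 250
    # Vertical direction: y(i+1) - y(i) > Threshold along each column.
    vertical = np.diff(frame, axis=0) > threshold
    edges[1:, :][vertical] = 250
    return edges.astype(np.uint8)
```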
[0035] The edge contours of each object may be used to train a model, namely a Single Shot Detection (SSD) model. The weight vectors can be generated with respect to the object contours once the training is complete. The accuracy of the network model should fit with respect to the number of epochs or iterations; the model should neither overfit nor underfit, which may lead to wrong results. Testing of the model starts once training is finished. To perform better object classification, the model may be tested initially on object detection and classification by estimating the confidence value or accuracy value. During testing, the objects in a video frame are detected by the model and a confidence value is labelled across each object. The user can take a decision on a particular object in a frame for classification with respect to the confidence value. The confidence value is denoted by the symbol ‘α’ (alpha).
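The paragraph above notes that the number of epochs should be chosen so that the model neither overfits nor underfits. A hedged, PyTorch-style sketch of such epoch selection using early stopping is given below; the classifier model, the data loaders, and the patience value are placeholders that are not specified by the disclosure (the disclosure names an SSD model, for which a generic classification head is substituted here for brevity).

```python
import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader,
                              epochs=50, patience=5, lr=1e-3):
    """Train on batches of edge contours and stop when the validation loss
    stops improving, so the model neither overfits nor underfits."""
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_loss, best_state, stale = float("inf"), None, 0

    for epoch in range(epochs):
        model.train()
        for contours, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(contours), labels)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(c), y).item()
                           for c, y in val_loader) / max(len(val_loader), 1)

        if val_loss < best_loss:      # still improving: remember the best weights
            best_loss, stale = val_loss, 0
            best_state = copy.deepcopy(model.state_dict())
        else:                         # no improvement: count towards early stop
            stale += 1
            if stale >= patience:
                break

    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```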
[0036] The α value is compared in a decision-making block, as shown in FIG. 1. The decision-making block decides whether the classification output should pass to the video display block or to the self-learning block with respect to the confidence value. The self-learning block learns by itself and updates the object features in the database. These updated object features are again fed back to the neural network model for further training. The more the model is trained, the better it can predict, which gives better performance. Hence, better object classification is achieved by the method and system using edge-based machine learning.
[0037] Thus, the present invention overcomes the drawbacks, shortcomings, and limitations associated with existing solutions, and provides a system that estimates the confidence value or the accuracy value of the model using a convolutional neural network for object classification. The system provides newly updated information, i.e., object contours, that is fed back to the neural network model for further training to give accurate results. The closed-loop system is reliable, stable, and gives accurate results because the number of iterations for training the neural network model is chosen optimally.
[0038] FIG. 2 is an exemplary flow diagram of a method for object classification using edge-based machine learning, in accordance with an embodiment of the present disclosure. The flow diagram 200 also depicts the decision-making for a better object classification approach, in accordance with an embodiment of the present disclosure.
[0039] Referring to FIG. 2, at block 202, a frame is captured from the video sensor. At block 204, each frame is stored in the video buffer for processing. At block 206, edge detection is performed on each frame. At block 210, a trained neural network model is used for testing, and at block 212, the model is trained using the edge contours of each frame. At block 214, tested object features are generated, and at block 216, trained object features are stored in the database for inference. At block 218, decision-making is performed for object classification, and at block 220, the video display or the self-learning block is selected based on the confidence value.
[0040] Initially, the neural network model is trained with the edge contours of each object in the entire frame. This training is done using the processor 106; once training is performed, the model generates weight vectors or weight coefficients. During testing, the objects in an entire video frame are processed using the edge detection block. The contours of the objects generated from edge detection are tested with the trained neural network model. The trained model detects and classifies the objects in the entire frame, and the confidence value of each object is estimated. The confidence value is denoted by the symbol ‘α’. The decision-making for better object classification is described in Table 1 below.
| S.No | Confidence Value (α) of Trained Model during Testing Phase | Decision Making for Object Classification | Selection of Next Stage | Mean Square Error (MSE) |
|---|---|---|---|---|
| 1 | > 80% | Correct | Video Display | Minimum |
| 2 | 70% to 80% | May or may not be correct | Self-Learning Block | Moderate |
| 3 | < 70% | Wrong | Self-Learning Block | Maximum |

Table 1: Decision making for better object classification
[0041] Table 1 relates the confidence value ‘α’, the decision made to classify objects, the selection of the output stage, and the mean square error, for better object classification. Three cases are listed in the table.
[0042] Case 1: if α > 80%, then the decision-making for object classification is correct and the output is given to the video display, since the mean square error is minimum or converges to zero.
[0043] Case 2: if 70% < α < 80% then the decision-making for object classification may or may not be correct and the output is given to the self-learning block since the mean square error is moderate.
[0044] Case 3: if α < 70% then the decision-making for object classification is wrong and the output is given to the self-learning block since the mean square error is maximum.
[0045] In case 3, the self-learning block updates the features of the object and maps them to the correct object in the database. The updated database is again fed back to the neural network model for further training. This process continues until the confidence value ‘α’ exceeds 80% and the mean square error becomes very small or converges to zero.
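A minimal sketch of the three-case decision logic of Table 1 and the closed retraining loop described in paragraph [0045] is shown below; the stage labels, the `classify` and `retrain` callbacks, and the bound on the number of rounds are illustrative assumptions.

```python
def select_next_stage(alpha: float) -> str:
    """Select the next stage according to Table 1 (alpha is in percent)."""
    if alpha > 80.0:
        return "video_display"   # Case 1: correct classification, MSE minimum
    if alpha >= 70.0:
        return "self_learning"   # Case 2: may or may not be correct, MSE moderate
    return "self_learning"       # Case 3: wrong classification, MSE maximum

def closed_loop(classify, retrain, max_rounds=10) -> float:
    """Keep feeding updated object features back for retraining until the
    confidence value exceeds 80%."""
    alpha = 0.0
    for _ in range(max_rounds):
        alpha, features = classify()                 # test the trained model
        if select_next_stage(alpha) == "video_display":
            break                                    # result goes to the display
        retrain(features)                            # self-learning feedback path
    return alpha
```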
[0046] The machine learns by itself with the help of the neural network and the self-learning block. Initially, the processor captures the video frames at the sensor frame rate. Each captured video frame is stored in an internal video buffer for further processing. The internal video buffer of the processor is fed to edge detection. Firstly, edge detection is performed on the entire video frame. Edge detection finds the edge contours of each object in a frame. The neural network model may then be trained on the edge contours of each object. The neural network model generates the weight vectors or weight coefficients of each object contour after training. These weight vectors, termed object features, may be stored in the database.
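Tying the steps of the preceding paragraph together, a hedged end-to-end capture sketch is given below. It assumes OpenCV is available for frame capture, reuses the `edge_contours` helper sketched earlier, and uses an arbitrary buffer size; none of these choices is mandated by the disclosure.

```python
import cv2  # OpenCV, assumed available for video capture

def capture_and_extract(source=0, max_frames=100):
    """Capture frames at the sensor frame rate, store them in a buffer, and
    run the edge detection sketched above on each buffered frame."""
    cap = cv2.VideoCapture(source)
    frame_buffer, contours = [], []
    while len(frame_buffer) < max_frames:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        frame_buffer.append(gray)                  # internal video frame buffer
        contours.append(edge_contours(gray))       # edge contours of the frame
    cap.release()
    return contours
```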
[0047] Testing of the trained network model starts once training is finished. To perform object classification, the model can be tested initially on object detection and classification by estimating the confidence value. The confidence value refers to the accuracy of the model with which the object is being classified. The trained object features are stored as an inference and are compared with the tested object features during the testing phase. The decision-making block decides the selection of the output path based on the confidence value. If the confidence value is good and optimum, i.e., α > 80%, then the object classification output is sent to the video display. If the confidence value is average and not optimum, i.e., α < 80%, then the corresponding object features are given to the self-learning block.
[0048] The self-learning block maps the falsely classified object contours to the correct object and updates the information in the database. The newly updated information (i.e., object contours) is fed back to the neural network model for further training to give accurate results. Thus, the method and system form a closed-loop system. The closed-loop system is reliable, stable, and gives accurate results because the number of iterations for training the neural network model is chosen optimally.
[0049] This method is a processor-based solution where the blocks shown in FIG.1 i.e. video frame capturing block/video frame buffer block, edge detection block, neural network model block, decision-making block and self-learning block are performed by using the processor. The processor consumes memory to perform the method for edge-based machine learning and object classification using a neural network model.
[0050] FIG. 3 is an exemplary flow diagram of a method for edge-based machine learning and object classification, in accordance with an embodiment of the present disclosure.
[0051] Referring to FIG. 3, in method 300, at block 302, the video frame buffer can capture the video frames from the video sensor. At block 304, the processor can extract the frames from the video frame buffer. At block 306, the edge detection unit can read the entire frame from the video frame buffer and estimate the edge contours of the different objects in the frame. At block 308, the first neural network model can be trained on the edge contours of the different objects. At block 310, the second neural network can generate the features or weight vectors of the different objects.
[0052] At block 312, the decision-making unit can perform a decision between the tested object features and the trained object features to classify the correct object, wherein each detected object is classified and labelled with a confidence value, and at block 314, any or a combination of the video display or the self-learning unit can be selected based on the confidence value.
[0053] It will be apparent to those skilled in the art that the system 100 of the disclosure may be provided using some or all of the mentioned features and components without departing from the scope of the present disclosure. While various embodiments of the present disclosure have been illustrated and described herein, it will be clear that the disclosure is not limited to these embodiments only. Numerous modifications, changes, variations, substitutions, and equivalents will be apparent to those skilled in the art, without departing from the spirit and scope of the disclosure, as described in the claims.
ADVANTAGES OF THE PRESENT INVENTION
[0054] The present invention provides a system that estimates the confidence value or the accuracy value of the model using a convolutional neural network for object classification.
[0055] The present invention provides newly updated information i.e., object contours that are fed back to the neural network model for further training to give accurate results.
[0056] The present invention provides a closed-loop system that is reliable, stable, and gives accurate results because the number of iterations for training the neural network model is chosen optimally.
Claims:
1. A system (100) for edge-based machine learning and object classification, the system comprising:
a video frame buffer (102) that captures the video frames from a video sensor;
a processor (106) coupled to the video frame buffer, the processor configured to:
extract the frames from the video frame buffer (102);
read, by an edge detection unit (108), the entire frame from the video frame buffer and estimate the edge contours of different objects in the frame;
train, by a first neural network model (120), the edge contours of different objects;
generate, by a second neural network (110), the features or weight vectors of different objects;
perform, by a decision-making unit (114), the decision between the tested object features and trained object features to classify the correct object, wherein each detected object is classified and labelled with a confidence value; and
select any or a combination of video display or self-learning units based on the confidence value.
2. The system as claimed in claim 1, wherein the edge of each input frame is estimated and used as a machine learning dataset, wherein the edge of each frame is estimated as
Y(i, j) = X(i+1, j) – X(i, j) ;
If Y(i, j) > Contrast Difference then Y(i, j) = 250; else Y(i, j) = 10.
3. The system as claimed in claim 1, wherein the edge detection unit creates the contour of each object, wherein the image frame with each contour is used as a reference to prepare the dataset for supervised learning.
4. The system as claimed in claim 1, wherein the first neural network model is used for learning the contours in the frame, wherein each contour comprises of angle, radius, curves and different shapes.
5. The system as claimed in claim 1, wherein the edges of the objects are extracted from an image which is obtained from the video.
6. The system as claimed in claim 1, wherein the extracted edges and trained object features are used for inference and comparison purposes and decision-making is performed with respect to the confidence value.
7. The system as claimed in claim 1, wherein the self-learning unit records the particular object features in a feedback path and maps them to the correct object when the second neural network wrongly classifies an object.
8. The system as claimed in claim 1, wherein the edge-based learning and inference save the processing time and logic resources by updating the object features using the self-learning unit.
9. The system as claimed in claim 1, wherein the object matched with >80% area matching is used as an object of interest for classification.
10. A method for edge-based machine learning and object classification, the method comprising:
capturing, by a video frame buffer, the video frames from a video sensor;
extracting, at a processor, the frames from the video frame buffer;
reading, by an edge detection unit (108), the entire frame from the video frame buffer and estimating the edge contours of different objects in the frame;
training, by a first neural network model (120), the edge contours of different objects;
generating, by a second neural network (110), the features or weight vectors of different objects;
performing, by a decision-making unit (114), the decision between the tested object features and trained object features to classify the correct object, wherein each detected object is classified and labelled with a confidence value; and
selecting any or a combination of video display or self-learning units based on the confidence value.