Abstract: The present invention discloses an apparatus and method for cascading of artificial intelligence detection models for passive ranging, to improve object detection and classification accuracy while using less computational power. At least one sensor (102) is configured to receive a multimedia input from one or more sources. A frame grabbing module (104) is configured to capture one or more frames from the multimedia input. At least one detection module (106, 108) is configured to detect an object from the captured frames based on a pre-defined confidence level value. A decision module (110) is configured to detect pre-defined object parameters of the detected object. A passive range estimation module (112) is configured to estimate the passive range of the object based on the detected parameters.
DESC:TECHNICAL FIELD
[0001] The present invention relates generally to an apparatus and method for cascading of artificial intelligence detection models for passive ranging.
BACKGROUND
[0002] Typically, estimating the range of an object is a challenging task. For example, in a pre-defined area, knowing the distance of any fixed and/or movable object from a pre-determined area is very important. For AI-based processing, the conventional devices used for detection and classification require enormous computational power and memory, and their output delay is high.
[0003] WO2016095117A1 titled “Object Detection With Neural Network” describes an apparatus comprising at least one processing core and at least one memory including a computer program code, the at least one memory and the computer program code being configured to, with the at least one processing core, cause the apparatus at least to run a convolutional neural network comprising an input layer arranged to provide signals to a first convolutional layer and a last convolutional layer, run a first intermediate classifier, the first intermediate classifier operating on a set of feature maps of the first convolutional layer, and decide to abort or to continue processing of a signal set based on a decision of the first intermediate classifier.
[0004] US005867256A titled “Passive Range Estimation Using Image Size Measurements” describes a method for a range estimation system which comprises a database containing data for identification of certain targets and data for estimating the initial range to each of the targets as a function of the observed dimensions of the targets. The method takes image sequences into account and estimates range based on the change in detected object dimensions.
[0005] Hence, there is a need for an apparatus and method which overcome the aforementioned problems and provide accurate estimation of the range of objects in an open area.
SUMMARY
[0006] This summary is provided to introduce concepts related to an apparatus and method for cascading of artificial intelligence detection models for passive ranging. This summary is neither intended to identify essential features of the present invention nor is it intended for use in determining or limiting the scope of the present invention.
[0007] Various embodiments herein provide one or more apparatuses and methods thereof. In one of the embodiments, a method for cascading of artificial intelligence (AI) detection models for passive ranging includes a step of receiving, by at least one sensor, a multimedia input from one or more sources. The method includes a step of capturing, by a frame grabbing module, one or more frames from the multimedia input. The method includes a step of detecting, by at least one detection module, an object from the captured frames based on a pre-defined confidence level value. The method includes a step of detecting, by a decision module, pre-defined object parameters of the detected object. The method includes a step of estimating, by a passive range estimation module, passive range of the object based on the detected parameters.
[0008] In another embodiment, an apparatus for cascading of artificial intelligence (AI) detection models for passive ranging includes at least one sensor, a frame grabbing module, at least one detection module, a decision module, and a passive range estimation module. The at least one sensor is configured to receive a multimedia input from one or more sources. The frame grabbing module is configured to capture one or more frames from the multimedia input. The at least one detection module is configured to detect an object from the captured frames based on a pre-defined confidence level value. The decision module is configured to detect pre-defined object parameters of the detected object. The passive range estimation module is configured to estimate passive range of the object based on the detected parameters.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS
[0009] The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to reference like features and modules.
[0010] Figure 1 illustrates a block diagram depicting an apparatus for cascading of artificial intelligence (AI) detection models for passive ranging, according to an implementation of the present invention.
[0011] Figure 2 illustrates a flow diagram depicting estimation of passive range, according to an exemplary implementation of the present invention.
[0012] Figure 3 illustrates a flowchart depicting a method for cascading of artificial intelligence (AI) detection models for passive ranging, according to an implementation of the present invention.
[0013] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems/platforms embodying the principles of the present invention. Similarly, it will be appreciated that any flowcharts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
DETAILED DESCRIPTION
[0014] In the following description, for the purpose of explanation, specific details are set forth in order to provide an understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these details. One skilled in the art will recognize that embodiments of the present invention, some of which are described below, may be incorporated into a number of systems.
[0015] The various embodiments of the present invention provide an apparatus and method for cascading of artificial intelligence detection models for passive ranging. Furthermore, connections between components and/or modules within the figures are not intended to be limited to direct connections. Rather, these components and modules may be modified, re-formatted or otherwise changed by intermediary components and modules.
[0016] References in the present invention to “one embodiment” or “an embodiment” mean that a particular feature, structure, characteristic, or function described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment.
[0017] In one of the embodiments, a method for cascading of artificial intelligence (AI) detection models for passive ranging includes a step of receiving, by at least one sensor, a multimedia input from one or more sources. The method includes a step of capturing, by a frame grabbing module, one or more frames from the multimedia input. The method includes a step of detecting, by at least one detection module, an object from the captured frames based on a pre-defined confidence level value. The method includes a step of detecting, by a decision module, pre-defined object parameters of the detected object. The method includes a step of estimating, by a passive range estimation module, passive range of the object based on the detected parameters.
[0018] In another implementation, the method includes a step of capturing, by the frame grabbing module, the frames at a pre-defined sensor rate.
[0019] In another implementation, the method includes a step of detecting, by a first detection module, an object from the captured frames based on the pre-defined confidence level value and generating an output.
[0020] In another implementation, the method includes a step of detecting, by a second detection module, the object if the generated output has a low confidence level value.
[0021] In another implementation, the pre-defined object parameters include an object size and an object class.
[0022] In another implementation, the method includes a step of annotating, by the passive range estimation module, the estimated passive range of the object and pre-determined class information.
[0023] In another implementation, the method includes a step of displaying, by a display unit, the object based on the estimated passive range.
[0024] In another embodiment, an apparatus for cascading of artificial intelligence (AI) detection models for passive ranging includes at least one sensor, a frame grabbing module, at least one detection module, a decision module, and a passive range estimation module. The at least one sensor is configured to receive a multimedia input from one or more sources. The frame grabbing module is configured to capture one or more frames from the multimedia input. The at least one detection module is configured to detect an object from the captured frames based on a pre-defined confidence level value. The decision module is configured to detect pre-defined object parameters of the detected object. The passive range estimation module is configured to estimate passive range of the object based on the detected parameters.
[0025] In another implementation, the frame grabbing module is configured to capture the frames at a pre-defined sensor rate.
[0026] In another implementation, the detection module includes a first detection module and a second detection module.
[0027] In another implementation, the first detection module is configured to detect an object from the captured frames based on the pre-defined confidence level value and generate an output.
[0028] In another implementation, the second detection module is configured to detect the object if the first detection module generates the output having a low confidence level value.
[0029] In another implementation, the first detection module includes a single shot detector (SSD) AI model, and the second detection module includes a Region Based Convolutional Neural Networks (RCNN) AI model.
[0030] In another implementation, the passive range estimation module is configured to annotate the estimated passive range of the object and pre-determined class information.
[0031] In another implementation, a display unit is configured to display the object based on the estimated passive range.
[0032] In an exemplary embodiment, the method and apparatus for cascading AI-based detection models for passive ranging for a standing human, a crawling human, two-wheelers, and four-wheelers comprises (a) a video capture buffer and splitter, (b) a processing and machine learning component, (c) a single shot detection (SSD) AI module, (d) an RCNN AI model, and (e) an interface block between the SSD and RCNN models for handshaking, wherein: the video capture is a device connected to a video camera through standard interfaces; and the processing block comprises a central processing unit (CPU) and a graphics processing unit (GPU) machine for learning and inference, with an external memory for temporary storage of unprocessed and processed data. In an embodiment, the apparatus also provides an interfacing technique for handshaking between the SSD and RCNN models.
[0033] In an exemplary embodiment, the SSD model, which is faster and consumes less memory, is used for training and classification where the object size is more than 256 x 128 pixels.
[0034] In an exemplary embodiment, the RCNN model, which is slower and consumes more processing resources, is used for training and classification where the object size is smaller than 256 x 128 pixels.
[0035] In an exemplary embodiment, the confidence level value, along with the object classification, is used for deciding on the correct object classification.
[0036] In an exemplary embodiment, if the confidence level value generated by the SSD model is more than 70%, the object classification from the SSD model is used directly for further range estimation.
[0037] In an exemplary embodiment, if the confidence level value generated by the SSD model is lower than 70%, the output of the SSD model is not used, and a flag is passed from the SSD model to the RCNN model to perform the object classification task.
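As a non-limiting illustration, the confidence-gated handshake between the two models can be sketched in a few lines of Python; `run_ssd` and `run_rcnn` are hypothetical stand-ins for the SSD and RCNN inference calls, each assumed to return a class label, a bounding box, and a confidence score, and the 0.70 threshold mirrors the 70% confidence level value of this embodiment.

```python
# Minimal sketch of the SSD -> RCNN cascade (helper functions are hypothetical).
CONFIDENCE_THRESHOLD = 0.70  # 70% confidence level value from the embodiment


def classify_object(frame, run_ssd, run_rcnn):
    """Run the fast SSD model first; hand off to RCNN on low confidence."""
    detection = run_ssd(frame)  # e.g. {"class": ..., "bbox": ..., "confidence": ...}
    if detection["confidence"] >= CONFIDENCE_THRESHOLD:
        # High-confidence SSD output is used directly for range estimation.
        return detection
    # Low confidence: the SSD output is discarded and a flag is raised so that
    # the RCNN model performs the object classification task on the same frame.
    return run_rcnn(frame)
```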
[0038] In an embodiment, the present invention pertains to object detection and classification using cascading of AI-based techniques, to improve object detection and classification accuracy using less computational power than an individual AI model. The detected object class and size are used to estimate the range. The object detection and classification models used are the SSD and Faster R-CNN (FRCNN) models. In the present invention, the input video from the video camera is processed and passed to the SSD model for object classification. Where the output of the primary SSD model has a confidence level below a certain threshold, the classes and detections of the high-accuracy secondary RCNN model are considered for the combined post-detection steps. For post detection, these class and size parameters are used to find the passive range estimate of objects by interpolating the range based on the camera field of view (FOV) and the actual average size of the objects.
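One plausible reading of the FOV-based estimation is sketched below under pinhole-camera and small-object assumptions; the function name and all numeric values are illustrative and not taken from the specification. The object's pixel extent is converted to the angle it subtends within the camera FOV, and the known average real-world size of its class then yields the range.

```python
import math


def estimate_range_from_fov(object_px, image_px, fov_deg, real_size_m):
    """Map the object's pixel extent to the angle it subtends within the
    camera FOV, then solve for range using the class's average real size."""
    fov_rad = math.radians(fov_deg)
    subtended = (object_px / image_px) * fov_rad  # angle subtended, radians
    return real_size_m / (2.0 * math.tan(subtended / 2.0))


# Illustrative check: a ~1.7 m standing human spanning 128 px of a 1080 px
# frame through a 40-degree vertical FOV comes out at roughly 20 m.
print(round(estimate_range_from_fov(128, 1080, 40.0, 1.7), 1))
```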
[0039] Figure 1 illustrates a block diagram depicting an apparatus (100) for cascading of artificial intelligence (AI) detection models for passive ranging, according to an implementation of the present invention.
[0040] An apparatus for cascading of artificial intelligence (AI) detection models for passive ranging (hereinafter referred to as “apparatus”) (100) includes a sensor (102), a frame grabbing module (104), a first detection module (106), a second detection module (108), a decision module (110), a passive range estimation module (112), and a display unit (114).
[0041] In an embodiment, the apparatus (100) includes a camera (not shown in the figures). The camera is configured to capture a plurality of images and videos from a pre-defined area. In one embodiment, the pre-defined area includes a ground, an indoor area, an outdoor area, and any similar open and/or closed type of area. In another embodiment, the apparatus (100) includes a plurality of cameras which are installed in the pre-defined area. Each camera is configured to capture the images and videos within its pre-determined range.
[0042] The sensor (102) is deployed with the camera. In an embodiment, the sensor (102) includes a video sensor. In another embodiment, the sensor (102) is configured to sense the scenes from the captured data and consider the captured data as an input. In another embodiment, the sensor (102) is configured to receive the captured multimedia data as an input from one or more sources. In an embodiment, the sources can be one or more cameras. The multimedia data includes, but is not limited to, videos and images.
[0043] The frame grabbing module (104) is configured to cooperate with the sensor (102) to receive the multimedia input. The frame grabbing module (104) is configured to capture one or more frames from the multimedia input. In an embodiment, the frame grabbing module (104) is configured to capture the frames at a pre-defined sensor rate.
[0044] The at least one detection module (106, 108) is configured to cooperate with the frame grabbing module (104) to receive the captured frames. The at least one detection module (106, 108) is configured to detect an object from the captured frames based on a pre-defined confidence level value. In an embodiment, the detection module includes a first detection module (106) and a second detection module (108). The first detection module (106) is configured to detect the object from the captured frames based on the pre-defined confidence level value and generate an output. The second detection module (108) is configured to detect the object if the first detection module generates the output having a low confidence level value. In one embodiment, the first detection module (106) includes a single shot detector (SSD) AI model, and the second detection module (108) includes a Region Based Convolutional Neural Networks (RCNN) AI model. In one embodiment, each AI model (106, 108) is configured to compute a respective confidence level value by using artificial intelligence and neural networks. In yet another embodiment, both the detection modules (106, 108) are run in parallel.
[0045] In an embodiment, the present invention uses two AI object detection modules for detection and classification. The chosen primary model is SSD, which requires low computational power but in turn gives lower accuracy. The secondary model is RCNN, which requires high computation but provides high accuracy. In an embodiment, the advantages of the secondary model are obtained by using it only when the primary model's detection confidence falls below the threshold.
[0046] The decision module (110) is configured to cooperate with the detection module to receive the detected object. The decision module (110) is further configured to detect pre-defined object parameters of the detected object. In an embodiment, the pre-defined object parameters include an object size and an object class.
[0047] The passive range estimation module (112) is configured to cooperate with the decision module (110) to receive the detected parameters of the object. The passive range estimation module (112) is further configured to estimate the passive range of the object based on the detected parameters. In an embodiment, the passive range estimation module (112) is configured to annotate the estimated passive range of the object and pre-determined class information. In an embodiment, the passive range estimation module (112) is configured to interpolate the range based on the pre-defined object size and a range table.
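A minimal sketch of the range-table interpolation follows, assuming a hypothetical calibration table for one object class that maps the detected bounding-box height in pixels to range in metres; the numbers are illustrative only.

```python
import numpy as np

# Hypothetical calibration for one object class: bounding-box height in
# pixels versus range in metres (np.interp requires increasing x-values).
pixel_heights = np.array([32, 64, 128, 256, 512])        # px
ranges_m = np.array([400.0, 200.0, 100.0, 50.0, 25.0])   # m


def interpolate_range(object_height_px):
    """Linearly interpolate the range from the detected pixel height."""
    return float(np.interp(object_height_px, pixel_heights, ranges_m))


print(interpolate_range(96))  # -> 150.0 m for this illustrative table
```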
[0048] The display unit (114) is configured to cooperate with the passive range estimation module (112) to receive the estimated passive range. The display unit (114) is further configured to display the object based on the estimated passive range.
[0049] In an exemplary embodiment, the sensor (102) is configured to receive the input, and the frame grabbing module (104) is configured to capture video frames for passing to the AI modules (106, 108). The SSD AI model (106) and the RCNN AI model (108) are convolutional neural network (CNN) based object detection and classification modules. The decision module (110) is configured for decision making and for controlling the AI modules (106, 108). The passive range estimation module (112) takes the detected object classes and object parameters from the decision module (110) and estimates the range of objects from the sensor (102). These estimates and the class information can be annotated and given to the display unit (114) for display.
[0050] In an exemplary embodiment, the present invention pertains to object classification using cascading of AI-based techniques to improve decision accuracy as well as to reduce the computational and power requirements. For AI-based processing, the conventional devices used for detection and classification require enormous computational power and memory, and their output delay is high. The present invention can reduce the processing requirement while improving accuracy and reducing the memory requirement. Here, an input video from the video camera is processed and passed to the SSD model (106) for object classification. The SSD model (106) is used for large objects (256 x 128 pixels) and high-contrast images (contrast of more than 20%), and can be created by training a model on a huge recorded data set. The SSD model (106) can give a decision output very fast, within 40 milliseconds, and object classification can be completed with less processing power and low external memory. Hence, for high-quality images the SSD model (106) can give an output in minimal processing time with a confidence level value of 70%. In the present invention, if the SSD model (106) generates an output with a low confidence level value (less than 70% confidence), the output is passed to the RCNN model (108), which can be used for further, more accurate classification. The RCNN model (108) is created by training on a huge data set of images of small objects as well as low-contrast images. The SSD model (106) would require huge processing power and internal and external memory to make decisions for low-contrast images. Hence, this invention can save more processing power with improved accuracy for low-contrast images. The generated classification output is passed to a block that estimates the object size in terms of pixels. Taking the object size in pixels as input, the range of the classified object can be estimated using interpolation and a look-up table.
[0051] Figure 2 illustrates a flow diagram (200) depicting estimation of passive range, according to an exemplary implementation of the present invention.
[0052] In Figure 2, the flow diagram (200) starts at a step (202), where the frame grabbing module (104) is configured to capture the frames at a pre-defined sensor rate from the sensor (102). The captured video frame is stored in an internal buffer for further processing, as shown at a step (204). In an embodiment, the internal buffer can be accessed by the first detection module (106) and the second detection module (108). The first detection module (106) runs at a step (206). In an embodiment, the first detection module (106) includes the SSD AI model having an SSD object detection model, where the SSD object detection is performed first on the captured frame. At a step (208), the apparatus (100) checks the threshold of the SSD-based AI model for object detection. The SSD detection parameters are passed to the passive range estimation module (112) if the confidence value (which is one parameter of the model output) is more than 70%, as shown at a step (210). If the confidence value falls below 70%, then the RCNN model processing is enabled for that frame buffer, as shown at a step (212). In that case, where the SSD model's confidence level is below 70%, the RCNN output parameters are passed to the passive range estimation module (112), as shown at a step (214). Finally, a passive range estimation output is displayed on the display unit (114) or can be used for further processing, as shown at steps (216, 218).
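Putting the steps of the flow diagram (200) together, a per-frame loop might look as follows; OpenCV is assumed for frame grabbing and display, while `run_ssd`, `run_rcnn`, and `estimate_range` are hypothetical stand-ins for modules (106), (108), and (112).

```python
import cv2  # assumed available for capture and display


def run_pipeline(source, run_ssd, run_rcnn, estimate_range, threshold=0.70):
    """Per-frame loop mirroring steps (202)-(218) of Figure 2."""
    cap = cv2.VideoCapture(source)               # step (202): grab frames
    while cap.isOpened():
        ok, frame = cap.read()                   # step (204): buffer the frame
        if not ok:
            break
        detection = run_ssd(frame)               # step (206): SSD runs first
        if detection["confidence"] < threshold:  # step (208): threshold check
            detection = run_rcnn(frame)          # step (212): RCNN fallback
        rng = estimate_range(detection)          # steps (210)/(214): ranging
        label = f"{detection['class']} @ {rng:.0f} m"
        cv2.putText(frame, label, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        cv2.imshow("passive ranging", frame)     # steps (216)/(218): display
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```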
[0053] Figure 3 illustrates a flowchart (300) depicting a method for cascading of artificial intelligence (AI) detection models for passive ranging, according to an implementation of the present invention.
[0054] The flowchart (300) starts at a step (302), receiving, by at least one sensor, a multimedia input from one or more sources. In an embodiment, the at least one sensor (102) is configured to receive a multimedia input from one or more sources. At a step (304), capturing, by a frame grabbing module, one or more frames from the multimedia input. In an embodiment, the frame grabbing module (104) is configured to capture one or more frames from the multimedia input. At a step (306), detecting, by at least one detection module, an object from the captured frames based on a pre-defined confidence level value. In an embodiment, the at least one detection module (106, 108) is configured to detect an object from the captured frames based on a pre-defined confidence level value. At a step (308), detecting, by a decision module, pre-defined object parameters of the detected object. In an embodiment, the decision module (110) is configured to detect pre-defined object parameters of the detected object. At a step (310), estimating, by a passive range estimation module, passive range of the object based on the detected parameters. In an embodiment, the passive range estimation module (112) is configured to estimate the passive range of the object based on the detected parameters.
[0055] It should be noted that the description merely illustrates the principles of the present invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described herein, embody the principles of the present invention. Furthermore, all examples recited herein are principally intended expressly to be only for explanatory purposes to help the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass equivalents thereof.
CLAIMS:
1. A method for cascading of artificial intelligence (AI) detection models for passive ranging, the method comprising:
receiving, by at least one sensor (102), a multimedia input from one or more sources;
capturing, by a frame grabbing module (104), one or more frames from the multimedia input;
detecting, by at least one detection module (106, 108), an object from the captured frames based on a pre-defined confidence level value;
detecting, by a decision module (110), pre-defined object parameters of the detected object; and
estimating, by a passive range estimation module (112), passive range of the object based on the detected parameters.
2. The method as claimed in claim 1, comprising: capturing, by the frame grabbing module (104), the frames at a pre-defined sensor rate.
3. The method as claimed in claim 1, comprising: detecting, by a first detection module (106), an object from the captured frames based on the pre-defined confidence level value and generating an output.
4. The method as claimed in claim 3, comprising: detecting, by a second detection module (108), the object if the generated output has a low confidence level value.
5. The method as claimed in claim 1, wherein the pre-defined object parameters include an object size and an object class.
6. The method as claimed in claim 1, comprising: annotating, by the passive range estimation module (112), the estimated passive range of the object and pre-determined class information.
7. The method as claimed in claim 6, comprising: displaying, by a display unit (114), the object based on the estimated passive range.
8. An apparatus (100) for cascading of artificial intelligence (AI) detection models for passive ranging, the apparatus (100) comprising:
at least one sensor (102) configured to receive a multimedia input from one or more sources;
a frame grabbing module (104) configured to cooperate with the sensor (102), the frame grabbing module (104) configured to capture one or more frames from the multimedia input;
at least one detection module (106, 108) configured to cooperate with the frame grabbing module (104), the detection module (106, 108) configured to detect an object from the captured frames based on a pre-defined confidence level value;
a decision module (110) configured to cooperate with the detection module (106, 108), the decision module (110) configured to detect pre-defined object parameters of the detected object; and
a passive range estimation module (112) configured to cooperate with the decision module (110), the passive range estimation module (112) configured to estimate passive range of the object based on the detected parameters.
9. The apparatus (100) as claimed in claim 8, wherein the frame grabbing module (104) is configured to capture the frames at a pre-defined sensor rate.
10. The apparatus (100) as claimed in claim 8, wherein the detection module includes a first detection module (106) and a second detection module (108).
11. The apparatus (100) as claimed in claims 8 and 10, wherein the first detection module (106) is configured to detect an object from the captured frames based on the pre-defined confidence level value and generate an output.
12. The apparatus (100) as claimed in claim 10, wherein the second detection module (108) is configured to detect the object if the first detection module (106) generates the output having a low confidence level value.
13. The apparatus (100) as claimed in claim 10, wherein the first detection module (106) includes a single shot detector (SSD) AI model, and the second detection module (108) includes a Region Based Convolutional Neural Networks (RCNN) AI model.
14. The apparatus (100) as claimed in claim 8, wherein the passive range estimation module (112) is configured to annotate the estimated passive range of the object and a pre-determined class information.
15. The apparatus (100) as claimed in claims 8 and 14, comprising: a display unit (114) configured to display the object based on the estimated passive range.