
Cross Learning Video Analytics System For Effective And Accurate Real Time Video Analysis And A Method Thereof

Abstract: CROSS-LEARNING VIDEO ANALYTICS SYSTEM FOR EFFECTIVE AND ACCURATE REAL-TIME VIDEO ANALYSIS AND A METHOD THEREOF. The present invention discloses a system for effective and accurate real-time video analysis comprising multiple processors, each embodying video analytics engines of varying video frame or scene analyzing characteristics, connected in sequence such that the output of one of said video analytics engines is the input of the next video analytics engine in the sequence, and a memory element interfacing two consecutive processors to bridge differences in the video frame or scene analyzing speed of the video analytics engines embodied in said consecutive processors and to enable each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.


Patent Information

Application #
Filing Date
26 February 2018
Publication Number
35/2019
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
anjanonline@vsnl.net
Parent Application

Applicants

VIDEONETICS TECHNOLOGY PRIVATE LIMITED
PLOT-5, BLOCK-BP, SALT LAKE, KOLKATA WEST BENGAL INDIA 700091

Inventors

1. DAS, Sudeb
AS-1/114/1, Kalyanpur Housing, Asansol West Bengal India 713305
2. BOSE, Tuhin
FD 455/5, Sector III, Salt Lake City, Kolkata, West Bengal India 700106
3. DAS, Monotosh
28, Haladhar Burdhan Lane, Bowbazar, Kolkata, West Bengal India 700012
4. BHATTACHARYYA, Kaustubh
Post & Vill: Fingapara, Dist: North 24 PGS, West Bengal India 743129

Specification

FIELD OF THE INVENTION:
The present invention relates to video content analysis. More specifically, the present invention is directed to developing a system for effective and accurate real-time video analysis combining heterogeneous processing-asynchronous video analytics engines.

BACKGROUND OF THE INVENTION:
Nowadays, CCTV cameras are installed in large numbers everywhere for video surveillance. Generally, the videos captured through these cameras serve two purposes: archiving/recording (for post facto analysis such as forensic investigation) and use by one or more video content analysis systems for automatic response generation. The analytics applications in typical video content analysis systems are complex in nature, yet these systems should be capable of producing a real-time response (RTR).
Intelligent video analytics is a well-practiced subject, and many technologies are in place that use image processing and computer vision algorithms combined with advanced machine-learning (ML)/deep-learning (DL) mechanisms to identify objects in the scene by various methods, including background-foreground separation, finding the trajectories of the objects, and classifying the objects based on size, shape and possibly other features like colors, textures, etc. Essentially, these systems undergo a learning/training mechanism with pre-collected and pre-labeled data. They are then able to analyze a given video for identification and classification of the objects in the scene based on the knowledge gathered during learning/training. While DL frameworks are more predictive and accurate in distinguishing objects, they require significant computational bandwidth.
Reduction of false alarms is another major challenge in such systems, as false alarms render the system unreliable. Most state-of-the-art video content analysis systems are prone to generating false alarms, particularly in noisy scenes. Therefore, there is a need to develop an effective method/technique/system which can reduce the number of false alarms without increasing the computation requirements.
A commonly used strategy to achieve better accuracy in detecting and identifying objects of interest in a scene is to combine different analytics engines or utilize advanced analytics algorithms for accurate response generation in complex environmental and demographic conditions. However, using multiple engines requires heavy computation and memory, and it also delays response generation. For example, it is often impractical to use DL-based schemes for real-time video analytics/surveillance purposes without high-computing support (like a GPGPU). Furthermore, in many cases, the processing capacities of different video analytics schemes are different, i.e., they produce outputs at different speeds. A sequential combination of these heterogeneous analytics engines is incapable of generating an RTR in a CPU computing environment. Therefore, there is a need to design a framework which can effectively combine different video analytics engines by distributing the processing loads dynamically for accurate response generation in real time without requiring high-computing support like a GPGPU.
The advantage of DL-based video analysis systems is well known, but how to use this advantage to reduce the computation burden in a 24x7 running video content analysis system has not been well explored. A combination of offline and online learning mechanisms can be a useful way of achieving superior results with reduced computational requirements.
References:
1. Using dynamic mode decomposition for real-time background/foreground separation in video - US 9,674,406 B2 - Jun. 6, 2017
2. Real-time video frame pre-processing hardware - US 9,607,585 B2 - Mar. 28, 2017
3. Pure convolutional neural network localization - US 9,619,735 B1 - Apr. 11, 2017
4. Methods, devices and systems for detecting objects in a video - US 9,646,212 B2 - May 9, 2017
5. A. Angelova et al., "Real-Time Pedestrian Detection with Deep Network Cascades," BMVC, 2015.
6. X. Wang et al., "Robust and Real-Time Deep Tracking via Multi-Scale Domain Adaptation," CVPR, 2017.
7. A. Paszke et al., "ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation," CVPR, 2017.
8. C. Ding et al., "A Novel Two-stage Learning Pipeline for Deep Neural Networks," Neural Processing Letters, 2017.

OBJECT OF THE INVENTION:
It is thus the basic object of the present invention to develop a system for real-time-response (RTR) generation and validation in video analytics.
Another object of the present invention is to develop a system for effective and accurate real-time video analysis combining heterogeneous processing-asynchronous video analytics engines.
Another object of the present invention is to develop a real-time video analysis system and method thereof combining heterogeneous processing-asynchronous video analytics engines for hierarchical processing of video content working as a real-time event-validator in low-computing environment.
Yet another object of the present invention is to develop a cross-learning video analysis system and method thereof for real-time-response (RTR) generation and validation in video analytics.

SUMMARY OF THE INVENTION:
Thus, according to the basic aspect of the present invention, there is provided a system for effective and accurate real-time video analysis comprising

multiple processors each embodying video analytics engines of varying video frame or scene analyzing characteristics connected in sequence such that the output of one of said video analytics engines is the input of the next video analytics engine in the sequence; and

a memory element interfacing two consecutive processors to bridge differences in video frame or scene analyzing speed of the video analytics engines embodied in said consecutive processors and enable each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.

In a preferred embodiment of the present system, the memory element interfacing two consecutive video analytics engines holds the frames before they are fed to the following video analytics engine, which enables each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.

In a preferred embodiment of the present system, the sequence of the multiple processors includes an inner feedback and looping mechanism to bypass any of the processors and its embodied video analytics engine of the sequence from the analysis task.

According to a preferred embodiment, present system comprises
a first processor embodying object detecting analytics engine to receive input video frames and detect one or more possible objects of interest in the input video frames;
a second processor embodying object tracking analytics engine connected to said first processor through a first intermediate memory element for receiving output of the object detecting analytics engine via said first intermediate memory element and tracking the previously detected objects over the input video frames;
a third processor embodying object validating analytics engine connected to said second processor through a second intermediate memory element for receiving the input video frames of the tracked objects forwarded by the object tracking analytics engine along with the object detection and tracking information from said second intermediate memory element and thereby validating the detection and tracking results obtained so far;
a fourth processor embodying online learning analytics engine connected to said third processor through a third intermediate memory element for receiving the input video frames of the validated tracked objects forwarded by the object validating analytics engine from said third intermediate memory element and thereby revalidating the tracking results based on previously detected information for accurate detection of the objects of interest in the input video frames.
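By way of illustration only, the four-engine chain of this embodiment can be sketched as worker threads joined by bounded buffers; the stage functions, names and the threading/queue realization below are assumptions for the sketch, not details prescribed by the specification:

```python
import queue
import threading

def make_stage(work, inbox, outbox):
    """Run one analytics engine: read from its input buffer, apply the
    engine's analysis, and write to the next buffer; `None` marks
    end of stream and is forwarded downstream."""
    def loop():
        while True:
            item = inbox.get()
            if item is None:
                outbox.put(None)
                break
            outbox.put(work(item))
    return threading.Thread(target=loop)

# Illustrative stand-ins for the four engines of the embodiment.
detect   = lambda frame: ("obj", frame)    # object detecting engine
track    = lambda det: det + ("tracked",)  # object tracking engine
validate = lambda trk: trk + ("valid",)    # object validating engine
relearn  = lambda val: val + ("final",)    # online learning engine

def run(frames):
    # Five bounded buffers: input plus analogues of the intermediate
    # memory elements between consecutive processors, and final output.
    bufs = [queue.Queue(maxsize=4) for _ in range(5)]
    stages = [make_stage(w, bufs[i], bufs[i + 1])
              for i, w in enumerate([detect, track, validate, relearn])]
    for s in stages:
        s.start()
    for f in frames:
        bufs[0].put(f)
    bufs[0].put(None)
    results = []
    while True:
        item = bufs[-1].get()
        if item is None:
            break
        results.append(item)
    for s in stages:
        s.join()
    return results
```

Each buffer plays the role of an intermediate memory element: a stage blocks on a full or empty buffer rather than forcing its neighbours to run at its speed.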

In a preferred embodiment of the present system, the first processor embodying object detecting analytics engine receives the input video frames from a video source via an intermediate memory buffer;
said intermediate memory buffer holds the input video frames before they are fed to the input of the object detecting analytics engine, whereby, if the execution speed of the object detecting analytics engine is slow compared to the input feeding speed, the intermediate memory buffer bridges the speed gap for a time duration depending on the memory buffer size.

In a preferred embodiment of the present system, the first intermediate memory element temporarily stores the object detecting analytics engine output from which the object tracking analytics engine tracks the previously detected objects;
said first intermediate memory element bridges the execution-speed gap between the object detecting analytics engine and the object tracking analytics engine.

In a preferred embodiment of the present system, the object tracking analytics engine is configured to discard some of the tracked objects, corresponding to noisy objects, based on a pre-configured rule-engine, and stores the rest of the input frames along with the detection and tracking information into the second intermediate memory element.

In a preferred embodiment of the present system, the object validating analytics engine validates the detection and tracking results obtained and stored in the second intermediate memory element and passes only the validated frames to the third intermediate memory element for access by the online learning analytics engine.

In a preferred embodiment of the present system, the object validating analytics engine includes a feedback mechanism with the rule engine, whereby feedback information from the object validating analytics engine to the rule engine stops the inputting of frames to the object validating analytics engine for a particular tracked object for which a certain fixed number of frames has already been validated, to avoid processing all the frames that the source generates, and frees up the second memory element by deleting information relating to an already validated tracked object.

In a preferred embodiment of the present system, the online learning analytics engine, after learning from the first few validated positive samples, can further learn from information coming directly from the rule engine, thus bypassing the validating engine, enabling the first validated tracking result for a particular tracked object to be available as output with a negligible delay and the same particular tracked object to then be tracked in subsequent frames in real time.

In a preferred embodiment of the present system, the third intermediate memory element bridges the gap between the execution speeds of the validating engine and the online learning engine.

In a preferred embodiment of the present system, the online learning engine includes a feedback mechanism with the rule engine to facilitate the rule engine in passing more valid object information than invalid object information.

In a preferred embodiment of the present system, the rule engine includes
a counter for each unique tracked object to describe the number of times the frame containing the unique tracked object is sent to the object validating engine for validation; and
a switcher to switchably choose between (i) a video frame forwarding path 1, taken for up to a pre-defined threshold number of frames for a particular tracked object, which are sent from the object tracking engine to the object validating engine to get confirmation that the detected object really falls under the intended class, whereby, if the object validating engine validates that a particular tracked object is of the type under consideration, it notifies the rule engine to send the specific frames which the object validating engine has already validated, together with the coordinate information of the detected objects, to the online learning engine via said path 1 for initialization of the online learning engine with positive samples, and (ii) a video frame forwarding path 2 for the rest of the frames for a tracked object, to pass the rest of the frames to the online learning engine directly through said path 2.

BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS:
Figure 1 depicts sequential combination of video analysis engines as traditionally practiced.
Figure 2 depicts the block diagram of heterogeneous processing-asynchronous video analytics system in accordance with the present invention.
Figure 3 shows dynamic selection of path for Object tracking: Path 1 is taken for first few frames for a particular object, while Path 2 is taken for rest of the frames.

DESCRIPTION OF THE INVENTION WITH REFERENCE TO THE ACCOMPANYING DRAWINGS:
The present invention discloses a method for real-time-response (RTR) generation and validation in video analytics and a system thereof. In one of the present embodiments, heterogeneous processing-asynchronous video analytics engines are combined in such a way as to produce highly accurate responses in real time.
In another embodiment, simpler video analytics engines are combined with computationally expensive complex video analytics engines (like based on deep learning) for false-alarm reduction in various video surveillance applications. Yet in another embodiment, a hierarchical processing video analysis framework is disclosed working as a real-time event-validator in low-computing environment. In another embodiment, a cross-learning video analytics system is provided for RTR generation and validation.
The term ‘heterogeneous processing-asynchronous video analytics engines’ in general refers to video analytics engines developed by integrating multiple video analytics engines which are diverse in character/content and have different execution speeds.
The term "processing-asynchronous" refers to the dissimilarity in the execution speeds of the constituent engines. The execution speeds of these component engines differ depending on the complexity of the analysis mechanisms used. As mentioned previously, the "Object Detector" engine can use simple techniques like frame differencing/background subtraction to detect probable objects in the FOV. These schemes are not computationally heavy. On the other hand, the "Object Validator" engine should be highly accurate to reduce the false alarms produced by the other, simpler engines, but this is a computationally expensive task. Because of the described hierarchical processing using the memory buffers between different analysis processes running at different speeds, and due to the intelligent routing of the frames across the various analytics engines by means of a Switcher, the user gets the output response in near real time, without any higher-computing infrastructure support (like a GPGPU).
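The bridging of two processing-asynchronous engines can be sketched as a producer and a consumer joined by a bounded buffer; the engine names and the tiny delay simulating the slower engine are illustrative assumptions only:

```python
import queue
import threading
import time

def run_pipeline(frames, buffer_size=4):
    """Two processing-asynchronous engines joined by a bounded buffer.

    The fast engine fills the buffer and blocks only when it is full,
    so no frames are dropped while the slow engine catches up."""
    buf = queue.Queue(maxsize=buffer_size)    # the interfacing memory element
    results = []

    def fast_engine():                        # e.g. frame-differencing detector
        for f in frames:
            buf.put(("detected", f))          # blocks if the buffer is full
        buf.put(None)                         # end-of-stream marker

    def slow_engine():                        # e.g. a heavier validator
        while True:
            item = buf.get()
            if item is None:
                break
            time.sleep(0.001)                 # simulate slower analysis
            results.append(item)

    producer = threading.Thread(target=fast_engine)
    consumer = threading.Thread(target=slow_engine)
    producer.start(); consumer.start()
    producer.join(); consumer.join()
    return results
```

The buffer size bounds how long the speed gap can be absorbed, matching the specification's remark that the bridging lasts "for a time duration, depending on the memory buffer size".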
It is, therefore, one aspect of the disclosed embodiments to provide a method and system for RTR generation and validation in video analytics.
It is, therefore, one aspect of the disclosed embodiments to provide a method and system for combining heterogeneous processing-asynchronous video analytics engines for distributing the processing load.
It is, therefore, one aspect of the disclosed embodiments to provide a method and system for false-alarm reduction in various video analytics.
It is, therefore, one aspect of the disclosed embodiments to provide a method and system for hierarchical processing of video content working as a real-time event-validator in low-computing environment.
It is, therefore, one aspect of the disclosed embodiments to provide a cross-learning video analytics system/method for RTR generation and validation.
To describe the invention, a video analytics/surveillance application called "human intruder detection" in accordance with the present invention is considered as an example. But the invention is not limited to this application only and can be applied, with or without modification, to other video analytics/surveillance applications.
As depicted in Figure 1 and Figure 1a, a video analysis engine has multiple components to perform its task; the output of one component serves as the input of the next component in the sequence. Depending on the complexity of the real-world scene captured in the video frames, various algorithms are designed and deployed in a computer to generate output at different stages (components). Evidently, the computational bandwidth required by such analytics engines varies from one engine to another, as different algorithms are used to realize different components using computer programming.
The block diagram of Figure 1 describes the scenario where different video analysis engines are chained in a sequential fashion, so that the output of one engine is the input of the next engine in the sequence. The input of the first engine is the video stream that has to be analyzed, while the output of the last engine is the result of the overall system. One or more of these engines may be used as a validator. A validator is one of the analytics engines in the chain that generates a binary signal (YES or NO) to indicate whether a valid intrusion event has occurred or not. An advantage of this combination is that, depending on the output/response of a particular engine in the chain, the next engine in the sequence, which receives the output of said particular engine, can decide whether to process its input further or not, thus achieving computational efficiency. But the overall output/response time of the combination is, in the worst case, the cumulative time requirement of all the engines taken together. It might happen that one of the engines (e.g., a DL-based process) requires a longer execution time than the others. This often leads to frame dropping if the video is generated in real time from sources like a camera. As a result, either not all the frames can be processed in real time, or higher computing resources (like GPGPUs) are required to process all the frames in the video stream.
The block diagram of Figure 2 describes the present system for effective and accurate real-time video analysis, considering "human intruder detection" as an application for video analytics/surveillance. In this application, the intention is to specifically detect human presence in a restricted area. The present system basically comprises multiple processors each embodying video analytics engines of varying video frame or scene analyzing characteristics. The processors are connected in sequence such that the output of one of said video analytics engines is the input of the next video analytics engine in the sequence. Importantly, in the above sequence of the multiple processors, at least one memory element interfaces two consecutive processors to bridge differences in the video frame or scene analyzing speed of the video analytics engines embodied in said consecutive processors and to enable each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.

The sequence of the multiple processors may also include an inner feedback and looping mechanism to bypass any of the processors and its embodied video analytics engine of the sequence from the analysis task.

The interfacing memory element between two consecutive video analytics engines holds the frames before they are fed to the following video analytics engine, which enables each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.

The system of the present invention is now described in detail, considering "human intruder detection" as an application for video analytics/surveillance. As shown in the figure, rather than feeding the input video frames directly to a first processor embodying the first object detecting analytics engine ("Object Detector"), an intermediate memory buffer (Memory 1) is used to hold the frames before they are fed to the input of the object detecting analytics engine. If the execution speed of the object detecting analytics engine is slow compared to the input feeding speed, then this intermediate memory can bridge the processing speed gap to some extent for a limited time, depending on the memory buffer size.
The object detecting analytics engine can detect one or more possible human-like objects in the scene. Some of them might be humans and others might not (i.e., cases of false detection).
The first processor embodying the object detecting analytics engine is connected with a second processor embodying the object tracking analytics engine (Tracker) through a first intermediate memory element (Memory 2). The output of the object detecting analytics engine can be kept in Memory 2, from which the Tracker takes input to track the previously detected objects. The intermediate "Memory 2" helps bridge the execution-speed gap between the first analysis engine, the "Detector", and the second analysis engine, the "Tracker".
Based on a pre-configured rule-engine ("Rule Engine 1"), some of the tracked objects can be discarded, e.g., using rules like track duration and track displacement for a particular tracked object (represented by a unique track_id). This helps discard the detection and tracking of false human-like noisy objects.
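Such a rule can be sketched as a simple predicate on a track; the track format (a list of (x, y) positions per frame) and the thresholds below are illustrative assumptions, not values from the specification:

```python
def keep_track(track, min_duration=10, min_displacement=5.0):
    """Rule-engine sketch: discard tracks that are too short (transient
    noise) or nearly static (e.g., a fluttering artifact) as noisy objects."""
    if len(track) < min_duration:          # track duration rule
        return False
    (x0, y0), (x1, y1) = track[0], track[-1]
    # track displacement rule: distance between first and last position
    displacement = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    return displacement >= min_displacement
```

Only frames belonging to tracks that pass such rules would be written into Memory 3 for the Validator.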
Only the input frames which are not discarded by the previous analysis steps, along with the detection and tracking information, are stored in the second intermediate memory ("Memory 3"). The second intermediate memory interfaces the second processor embodying the Tracker with the third processor embodying the object validating analytics engine. To validate the detection and tracking results obtained so far, an advanced complex analysis process (like a DL-based object detector), called the "Validator", in the object validating analytics engine can be used, taking as input only the information available in Memory 3.
In the given example application, the "Validator" is the most time-consuming analysis module. A feedback mechanism is established between the "Validator" and "Rule Engine 1". The feedback information from the "Validator" to "Rule Engine 1" can be used to stop inputting frames to the "Validator" for a particular tracked object for which a certain fixed number of frames has already been validated. This relieves the "Validator" from processing all the frames that the source generates. This feedback information can also be used to free up "Memory 3" by deleting information relating to an already validated tracked object. E.g., say it has been pre-configured that, for a tracked object, information corresponding to only 3 frames should be passed to the "Validator"; there could be more frames in "Memory 3" corresponding to that particular object. Using a majority-voting type mechanism, the "Validator" decides whether the tracked object is a human or not. If it is validated as human, the "Validator" passes the information contained in only these 3 frames to the third intermediate memory element (Memory 4).
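The quota-and-majority-vote behaviour can be sketched as follows; the boolean per-frame results and the quota of 3 mirror the example above, while the function name and return shape are assumptions:

```python
def validate_track(frame_results, quota=3):
    """Majority-vote validator sketch.

    `frame_results` are the heavy detector's per-frame decisions
    (True = the tracked object looks human) for one track. Only the
    first `quota` frames are examined, modelling the feedback that
    stops further frames from being sent for this track. Returns the
    decision and how many frames were actually consumed."""
    votes = frame_results[:quota]            # quota enforced via feedback
    decision = sum(votes) > len(votes) / 2   # majority voting
    return decision, len(votes)
```

The second return value is what the feedback to "Rule Engine 1" would carry, so Memory 3 entries for the track can be released once the quota is met.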
The third intermediate memory element (Memory 4) interfaces the third processor embodying the Validator with the fourth processor embodying online learning analytics engine “Online Learner”.
An "Online Learner" analysis engine is attached to "Memory 4". Generally, the "Online Learner" works much faster than the "Validator". "Memory 4" bridges the gap between the execution speeds of the "Validator" and the "Online Learner", because, for a particular tracked object, only those frames that have been validated by the "Validator" are passed to the "Online Learner" through "Memory 4". The first few samples related to a particular object input to the "Online Learner" are positive samples, which enhances the performance of the overall system. As shown in Figure 2, after learning from the first few (3 in this example) validated samples, the "Online Learner" can further learn from the information coming directly from "Rule Engine 1", thus bypassing the "Validator" engine. The tracking results revalidated by the "Online Learner" are the final output.
As a result of this configuration, for a particular tracked human, the first validated tracking result will be available as output with a negligible delay. Thereafter, the tracking will be done in real time, as the frames will not pass through the time-consuming processing in the Validator. Moreover, the feedback information from the "Online Learner" can also be used to reward (or penalize) "Rule Engine 1" for passing more valid object information than invalid object information, and vice versa. As mentioned earlier, since the first few samples for a particular tracked object are passed to the "Online Learner" only after validation by the "Validator", the first few training instances to the "Online Learner" are good for enhancing the overall result. Moreover, because of the described hierarchical processing using the memory buffers between different analysis processes running at different speeds, the user will get the output response in near real time, without any higher-computing infrastructure support (like a GPGPU).
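The cross-learning behaviour, bootstrapping only from validated positive samples and then learning from unvalidated frames arriving directly from the rule engine, can be sketched with a deliberately trivial model (a running-mean feature template); the class name, feature format and model choice are illustrative assumptions:

```python
class OnlineLearner:
    """Sketch of the cross-learning online learner: until `n_bootstrap`
    Validator-approved samples have been seen, unvalidated samples are
    ignored; afterwards every sample updates the model."""

    def __init__(self, n_bootstrap=3):
        self.n_bootstrap = n_bootstrap
        self.template = None          # running mean of object feature vectors
        self.n_seen = 0

    def update(self, feature, validated=False):
        # During bootstrap, accept only Validator-approved positive samples.
        if self.n_seen < self.n_bootstrap and not validated:
            return False
        self.n_seen += 1
        if self.template is None:
            self.template = list(feature)
        else:
            for i, x in enumerate(feature):       # incremental mean update
                self.template[i] += (x - self.template[i]) / self.n_seen
        return True
```

Because the first training instances are guaranteed positives, the learner starts from a clean model before accepting the cheaper, unvalidated stream.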
Figure 3 describes the dynamic path selection procedure for object tracking using a switching mechanism. Input frames are first analyzed by the "Object Detector" to find the presence of one or more objects in the FOV. "Object detection" is a low-computing module and may sometimes fail to distinguish various image artifacts and/or other objects (that the application is not meant to identify) from the objects of interest (false detection).
The image frames where one or more objects are detected by the Object Detector are passed to the next module, the "Object Tracker". Using some pre-defined criteria, the "Object Tracker" module produces tracking information for each detected object (assigning a unique track identification number to each object, along with its tracking information like the spatial position of the object in the FOV, track duration, etc.). The "Rule Engine" module decides which frames should be passed to the "Object Validator" based on some pre-defined criteria like track duration (e.g., only an object which is tracked in n (say n = 10) frames), etc. The "Rule Engine" also maintains a counter for each unique tracked object describing how many times a frame containing the tracked object has been sent to the "Object Validator" for validation. The "Rule Engine" decides how many times it enforces the "Switcher" to choose "Path 1" based on a pre-defined threshold describing the number of frames for a particular tracked object to be sent to the "Validator" from the "Tracker" module to get confirmation that the detected object really falls under the intended class (elimination of false detection).
The feedback information from the "Validator" to the "Rule Engine" module is used to stop the passing of frames for a particular tracked object from the "Tracker" to the "Validator" module. The "Validator" also sends the coordinates of the rectangular boxes around the detected objects to the "Rule Engine". This coordinate information is later used by the "Online Learner" engine. If the "Validator" validates that a particular tracked object is of the type under consideration (for human intruder detection, that the tracked object is a human), it notifies the "Rule Engine" to send the specific frames which the "Validator" has already validated, together with the coordinate information of the detected objects, to the "Online Learner" for initialization of the "Online Learner" module with positive samples. For the rest of the frames for a tracked object, the "Rule Engine" switches off "Path 1" and switches on "Path 2", thus avoiding high computation overhead for these frames. In other words, the "Rule Engine" passes the rest of the frames to the "Online Learner" directly through "Path 2". The "Online Learner" module also provides feedback information to the "Object Tracker" and to the "Rule Engine" module to refine their searching criteria. For example, the "Online Learner" module might send the dimensional information of a tracked object to the "Object Tracker" and the "Rule Engine" to discard tracked objects in subsequent frames which do not conform to the dimensional constraint.
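The per-track counter, the threshold, and the Validator feedback that together drive the Switcher can be sketched as follows; the class and method names are illustrative assumptions:

```python
class RuleEngine:
    """Sketch of the Rule Engine's per-track counter and path switcher:
    the first `threshold` frames of each track take Path 1 (Tracker ->
    Validator); later frames, or any frame after the Validator signals
    it is done with the track, take Path 2 (Tracker -> Online Learner)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.sent = {}        # frames already sent to the Validator, per track_id
        self.stopped = set()  # tracks the Validator has finished with

    def route(self, track_id):
        if (track_id not in self.stopped
                and self.sent.get(track_id, 0) < self.threshold):
            self.sent[track_id] = self.sent.get(track_id, 0) + 1
            return "path1"    # heavy validation for the first few frames
        return "path2"        # cheap direct path for the rest

    def validator_feedback(self, track_id):
        # Validator signals it has validated enough frames for this track.
        self.stopped.add(track_id)
```

The feedback call models the stop signal described above: once received, every remaining frame of that track avoids the computationally expensive Path 1.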
CLAIMS:
WE CLAIM:

1. A system for effective and accurate real-time video analysis comprising

multiple processors each embodying video analytics engines of varying video frame or scene analyzing characteristics connected in sequence such that the output of one of said video analytics engines is the input of the next video analytics engine in the sequence; and

a memory element interfacing two consecutive processors to bridge differences in video frame or scene analyzing speed of the video analytics engines embodied in said consecutive processors and enable each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.

2. The system as claimed in claim 1, wherein the memory element interfacing two consecutive video analytics engines holds the frames before they are fed to the following video analytics engine, which enables each of the consecutive video analytics engines to operate at its respective video frame or scene analyzing speed.

3. The system as claimed in claim 1 or 2, wherein the sequence of the multiple processors includes an inner feedback and looping mechanism to bypass any of the processors and its embodied video analytics engine of the sequence from the analysis task.

4. The system as claimed in any one of claims 1 to 3, comprising
a first processor embodying object detecting analytics engine to receive input video frames and detect one or more possible objects of interest in the input video frames;
a second processor embodying object tracking analytics engine connected to said first processor through a first intermediate memory element for receiving output of the object detecting analytics engine via said first intermediate memory element and tracking the previously detected objects over the input video frames;
a third processor embodying object validating analytics engine connected to said second processor through a second intermediate memory element for receiving the input video frames of the tracked objects forwarded by the object tracking analytics engine along with the object detection and tracking information from said second intermediate memory element and thereby validating the detection and tracking results obtained so far;
a fourth processor embodying online learning analytics engine connected to said third processor through a third intermediate memory element for receiving the input video frames of the validated tracked objects forwarded by the object validating analytics engine from said third intermediate memory element and thereby revalidating the tracking results based on previously detected information for accurate detection of the objects of interest in the input video frames.
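The four-stage flow of claim 4 (detect, track, validate, learn) can be sketched as a chain of stages joined by intermediate stores. This single-threaded sketch stubs out the actual analytics; the stage bodies, frame labels, and validation rule are illustrative assumptions only, since the claim specifies the data flow rather than the algorithms:

```python
def detect(frames):
    """First processor: flag frames that contain a possible object."""
    return [(f, "obj" in f) for f in frames]

def track(detections):
    """Second processor: keep only frames whose object was detected."""
    return [f for f, found in detections if found]

def validate(tracked):
    """Third processor: stand-in validation rule (even frame id)."""
    return [f for f in tracked if int(f.split("-")[1]) % 2 == 0]

def learn(validated):
    """Fourth processor: 'online learning' stub collecting samples."""
    return {"samples": validated}

frames = [f"obj-{i}" if i % 3 else f"bg-{i}" for i in range(8)]
# the intermediate lists play the role of the three memory elements
model = learn(validate(track(detect(frames))))
print(model)
```

In the claimed system each stage would run on its own processor with a bounded buffer between stages, as in the earlier producer-consumer pattern, rather than as direct function calls.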

5. The system as claimed in any one of claims 1 to 4, wherein the first processor embodying the object detecting analytics engine receives the input video frames from a video source via an intermediate memory buffer;
said intermediate memory buffer holds the input video frames before they are fed to the input of the object detecting analytics engine, whereby, when the execution speed of the object detecting analytics engine is slower than the input feeding speed, the intermediate memory buffer bridges the speed gap for a time duration depending on the memory buffer size.

6. The system as claimed in any one of claims 1 to 5, wherein the first intermediate memory element temporarily stores the object detecting analytics engine output, from which the object tracking analytics engine tracks the previously detected objects;
said first intermediate memory element bridges the execution-speed gap between the object detecting analytics engine and the object tracking analytics engine.

7. The system as claimed in any one of claims 1 to 6, wherein the object tracking analytics engine is configured to discard some of the tracked objects, corresponding to noisy objects, based on a pre-configured rule engine, and stores the rest of the input frames, along with the detection and tracking information, into the second intermediate memory element.

8. The system as claimed in any one of claims 1 to 7, wherein the object validating analytics engine validates the detection and tracking results obtained and stored in the second intermediate memory element and passes only the validated frames to the third intermediate memory element for access by the online learning analytics engine.

9. The system as claimed in any one of claims 1 to 8, wherein the object validating analytics engine includes a feedback mechanism with the rule engine, whereby feedback information from the object validating analytics engine to the rule engine stops the inputting of frames to the object validating analytics engine for a particular tracked object for which a certain fixed number of frames has already been validated, so as to avoid processing all the frames that the source generates, and frees up the second memory element by deleting information relating to an already validated tracked object.
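The feedback of claim 9 amounts to a per-object counter: once a fixed number of frames for a tracked object has been validated, the rule engine stops forwarding that object's frames and releases its buffered information. A hypothetical sketch (the class, the limit of 3, and the object id are assumptions for illustration):

```python
VALIDATION_LIMIT = 3  # assumed "certain fixed number of frames"

class RuleEngine:
    def __init__(self):
        self.buffer = {}     # second memory element: object id -> frames
        self.validated = {}  # feedback counter per tracked object

    def forward(self, obj_id, frame):
        """Return the frame for validation, or None once the limit is hit."""
        if self.validated.get(obj_id, 0) >= VALIDATION_LIMIT:
            self.buffer.pop(obj_id, None)  # free the memory element
            return None
        self.buffer.setdefault(obj_id, []).append(frame)
        return frame

    def feedback(self, obj_id):
        """The object validating engine reports one more validated frame."""
        self.validated[obj_id] = self.validated.get(obj_id, 0) + 1

engine = RuleEngine()
sent = 0
for frame in range(10):
    if engine.forward("car-1", frame) is not None:
        sent += 1
        engine.feedback("car-1")  # here every forwarded frame validates
print(sent, "car-1" in engine.buffer)
```

Only the first `VALIDATION_LIMIT` frames reach the validating engine; later frames are suppressed and the object's buffered entry is deleted, matching the memory-freeing behaviour the claim describes.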

10. The system as claimed in any one of claims 1 to 9, wherein the online learning analytics engine, after learning from the first few validated positive samples, can further learn from information coming directly from the rule engine, thus bypassing the validating engine, enabling a particular tracked object to become available as output from the first validated tracking with negligible delay, and then tracking the same tracked object in subsequent frames in real time.

11. The system as claimed in any one of claims 1 to 10, wherein the third intermediate memory element bridges the gap between the execution speeds of the validating engine and the online learning engine.

12. The system as claimed in any one of claims 1 to 11, wherein the online learning engine includes a feedback mechanism with the rule engine to facilitate the rule engine in passing more valid object information than invalid object information.

13. The system as claimed in any one of claims 1 to 12, wherein the rule engine includes:
a counter for each unique tracked object to record the number of times a frame containing the unique tracked object is sent to the object validating engine for validation; and
a switcher to switchably choose between (i) a video frame forwarding path 1, selected based on a pre-defined threshold describing the number of frames for a particular tracked object to be sent from the object tracking engine to the object validating engine to confirm that the detected object really falls under the intended class, whereby, if the object validating engine validates that a particular tracked object is of the type under consideration, it notifies the rule engine to send the specific frames which the object validating engine has already validated, along with the coordinate information of the detected objects, to the online learning engine via said path 1 for initialization of the online learning engine with the positive samples, and (ii) a video frame forwarding path 2 for the rest of the frames for a tracked object, to pass the rest of the frames to the online learning engine directly through said path 2.
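The counter-and-switcher of claim 13 routes the first few frames of each object through the validating engine (path 1) and all later frames directly to the online learner (path 2). A hypothetical sketch, with the threshold of 2, the function name, and the object id all assumed for illustration:

```python
THRESHOLD = 2  # assumed pre-defined threshold of frames sent for validation

def route_frames(frames, obj_id, counters, path1, path2):
    """Switcher: per-object counter decides path 1 vs path 2."""
    for frame in frames:
        n = counters.get(obj_id, 0)
        if n < THRESHOLD:
            path1.append(frame)       # via the object validating engine
            counters[obj_id] = n + 1  # count frames sent for validation
        else:
            path2.append(frame)       # straight to the online learner

counters, path1, path2 = {}, [], []
route_frames(["f0", "f1", "f2", "f3"], "person-7", counters, path1, path2)
print(path1, path2)
```

In the claimed system the switch to path 2 would additionally wait for the validating engine's confirmation via the rule engine; this sketch shows only the threshold-driven routing itself.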

Dated this the 25th day of February, 2019 Anjan Sen
Of Anjan Sen and Associates
(Applicant's Agent)

Documents

Application Documents

# Name Date
1 201831007200-STATEMENT OF UNDERTAKING (FORM 3) [26-02-2018(online)].pdf 2018-02-26
2 201831007200-PROVISIONAL SPECIFICATION [26-02-2018(online)]_132.pdf 2018-02-26
3 201831007200-PROVISIONAL SPECIFICATION [26-02-2018(online)].pdf 2018-02-26
4 201831007200-FORM 1 [26-02-2018(online)].pdf 2018-02-26
5 201831007200-DRAWINGS [26-02-2018(online)]_52.pdf 2018-02-26
6 201831007200-DRAWINGS [26-02-2018(online)].pdf 2018-02-26
7 201831007200-FORM-26 [23-05-2018(online)].pdf 2018-05-23
8 201831007200-Proof of Right (MANDATORY) [18-08-2018(online)].pdf 2018-08-18
9 201831007200-ENDORSEMENT BY INVENTORS [25-02-2019(online)].pdf 2019-02-25
10 201831007200-DRAWING [25-02-2019(online)].pdf 2019-02-25
11 201831007200-COMPLETE SPECIFICATION [25-02-2019(online)].pdf 2019-02-25
12 201831007200-FORM 18 [31-12-2021(online)].pdf 2021-12-31
13 201831007200-FER.pdf 2022-05-12
14 201831007200-OTHERS [11-11-2022(online)].pdf 2022-11-11
15 201831007200-FER_SER_REPLY [11-11-2022(online)].pdf 2022-11-11
16 201831007200-DRAWING [11-11-2022(online)].pdf 2022-11-11
17 201831007200-COMPLETE SPECIFICATION [11-11-2022(online)].pdf 2022-11-11
18 201831007200-CLAIMS [11-11-2022(online)].pdf 2022-11-11
19 201831007200-US(14)-HearingNotice-(HearingDate-19-12-2025).pdf 2025-11-13

Search Strategy

1 201831007200E_11-05-2022.pdf