An Intelligent And Unified Method Of Multiple Component Colour Object Analysis In A Scene Favouring Scene Analytic Applications.
Abstract:
There is disclosed a process and an intelligent unified framework for colour object
analysis in a scene in order to develop efficient video analytics applications and other
intelligent machine vision technologies. The advancement is directed to provide an
intelligent and adaptive framework for an improved colour object detection method
which can eliminate the defects encountered in the presently available techniques of
colour object detection irrespective of video noise such as shadow, glare, colour
changes due to varying illumination, the effect of lighting conditions on colour
appearance, electronically induced noise (e.g. shot noise, but not limited to it)
and other types of noise to which the human vision system is sensitive.
Advantageously, the object analysis technique is also capable of detecting and
characterizing static objects alongside moving colour objects in the same scene by
an advanced unified framework based on a multi-layer estimation technique.
E-375 BAISHNABGHATA - PATULI TOWNSHIP,
KOLKATA - 700091,
WEST BENGAL, INDIA
2. BHATTACHARYYA, DIPAK
16/839 KISHORI BAGAN MEARBER PIRTALA,
P.O.: CHINSURAH, DIST.:HOOGHLY,
PIN: 712101 WEST BENGAL, INDIA.
3. BOSE, TUHIN
BE-1/14/1, PEYARA BAGAN
DESHBANDHU NAGAR, CITY:KOLKATA,
PIN: 700 059, WEST BENGAL, INDIA
4. DALAL, TUTAI KUMAR
KHIRPAI - HATTALA (WD - 4),
DIST: PASCHIM MEDINIPUR,
PIN: 721232, WEST BENGAL, INDIA.
5. DAS, SAWAN
16 GREEN VIEW, GARIA,
CITY: KOLKATA, PIN: 700084,
WEST BENGAL, INDIA.
6. DHAR, SOUMYADEEP
PURBAYAN APPARTMENT,
TARASANKAAR ROAD BY LANE,
DESBANDHU PARA, P.O.: SILIGURI,
DIST: DARJEELING, PIN: 734404,
WEST BENGAL, INDIA.
7. MAITY, SOUMYADIP
VILL & PO: DUMARDARI,
PIN: 721425, PURBA MEDINIPUR,
WEST BENGAL, INDIA
Specification
Field of the Invention
The present invention relates to an advancement in a process and an intelligent
unified framework for colour object analysis in a scene in order to develop efficient
video analytics applications and other intelligent machine vision technologies. The
advancement is directed to provide an intelligent and adaptive framework for an
improved colour object detection method which can eliminate the defects
encountered in the presently available techniques of colour object detection
irrespective of video noise such as shadow, glare, colour changes due to varying
illumination, the effect of lighting conditions on colour appearance, electronically
induced noise (e.g. shot noise, but not limited to it) and other types of noise to
which the human vision system is sensitive. Advantageously, the object analysis
technique is also capable of detecting and characterizing static objects alongside
moving colour objects in the same scene by an advanced unified framework based on
a multi-layer estimation technique.
Background of the Invention
Video Management Systems are used for video data acquisition and search processes
using single or multiple servers. They are often loosely coupled with one or more
separate systems for performing operations on the acquired video data such as
analyzing the video content, etc. Servers can record different types of data in storage
media, and the storage media can be directly attached to the servers or accessed
over an IP network. This demands a significant amount of network bandwidth to receive
data from the sensors (e.g., cameras) and to concurrently transfer or upload the data
to the storage media. Due to the high bandwidth demand of such tasks,
especially for video data, separate high speed networks are often dedicated to
transferring data to storage media. A dedicated high speed network is costly and often
requires costly storage devices as well. Often this is overkill for low or moderately
priced installations.
It is also known that to back up against server failures, one or more dedicated
fail-over (sometimes called mirror) servers are often deployed in the prior art.
Dedicated fail-over servers remain unused during normal operations, resulting in
wastage of such costly resources. Also, a central server process, installed either in
the fail-over server or in a central server, is required to initiate the back-up service
in case a server stops operating. This strategy does not avoid a single point of failure.
Moreover, when the servers and clients reside at different ends of an internet and
the connectivity suffers from low or widely varying bandwidth, transmission of
multi-channel data from one point to another becomes a challenge. Data aggregation
techniques are often applied in such cases, but these are computationally intensive or
suffer from inter-channel interference, particularly for video, audio or other types of
multimedia data.
As regards analytic servers presently in use, it is well known that there are many
video analytics systems in the prior art. Video content analysis is often done on a
per-frame basis which is mostly pre-defined, which makes such systems not only
lacking in the desired efficiency of analytics but also unnecessarily cost-intensive,
with unwanted loss of valuable computing resources.
Added to the above, with presently available techniques of video analysis, an
unacceptable number of false alarms is reported when the content analysis
systems are deployed in a noisy environment for generating alerts in real time. This
is because the traditional methods are not automatically adaptive to demography
specific environmental conditions, varying illumination levels, varying behavioural
and movement patterns of the moving objects in a scene, changes of appearance of
colour in varying lighting conditions, changes of appearance of colours in global or
regional illumination intensity and type of illumination, and similar other factors.
It has therefore been a challenge to identify the appearance of a non-moving foreign
object (static object) in a scene in the presence of other moving objects, where the
moving objects occasionally occlude the static object. Detection accuracy suffers to
varying degrees under different demographic conditions.
Extraction of particular types of objects (e.g. the face of a person, but not limited to
it) in images based on fiducial points is a known technique. However, the computational
requirement is often too high for the traditional classifiers used for this purpose in the
prior art, e.g., the Haar classifier.
Also, in a distributed system where multiple sites with independent administrative
controls are present, unification of those systems through a central monitoring
station may be required at any later point of time. This necessitates hardware and OS
independence in addition to the backward compatibility of the underlying
computational infrastructure components, and the software architecture should
accommodate such amalgamation as well.
It is thus clearly apparent from the above state of the art that there is a need
for advancement in the art of acquisition cum recording of sensory inputs/data such
as video, and/or analytics of such sensory inputs/data such as a video feed, adapted to
facilitate fail-safe integration and/or optimized utilization of various sensory inputs
for various utility applications including event/alert generation, recording and related
aspects.
Automatic separation of foreground moving objects from the static background in an
image sequence (video) is the primary task for subsequent analysis of video. These
separated moving objects are the key to any development of video analytics
applications. Efficient execution of this task using colour video data that represents a
dynamic scene is challenging and is of immense interest to the experts in the domain
of intelligent machine vision technology and related applications.
Foreground object extraction in a video is a primary requirement and several basic
technologies are adopted by the experts in image processing and computer vision.
Foreground object extraction can be treated as a background subtraction problem.
That is, in a video, foreground objects can be detected simply by subtracting the
current image from a background image of the scene. This background image needs
to be determined beforehand. Several approaches have been proposed in the
literature to estimate the background from a video sequence. However, if the
background is consistently affected by shadow, glare, time-varying noise and the
effect of lighting variation on colour, background estimation becomes a very
challenging task, especially in outdoor environments where changing seasonal
conditions are always a concern. The goal of foreground object extraction is to divide
an image into its constituent regions, which are sets of connected pixels or objects,
so that each region is itself homogeneous with respect to the physical object it
represents whereas different regions are heterogeneous with respect to each other.
The accuracy of foreground object extraction may determine the eventual success or
failure of many subsequent techniques for video analytics, object recognition and
object-based event detection. The known techniques suffer from imperfect
generation of blobs that are inconsistent with the actual size, shape and features of
the original object as distinguishable by the human eye.
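The background subtraction idea described above can be illustrated with a minimal sketch. It assumes small frames held as nested lists of (R, G, B) tuples and a single fixed difference threshold; the function name `foreground_mask` and the threshold value of 30 are hypothetical choices made for illustration and are not taken from the specification.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Frame = List[List[Pixel]]      # rows of pixels


def foreground_mask(frame: Frame, background: Frame,
                    threshold: int = 30) -> List[List[bool]]:
    """Mark a pixel as foreground when any colour channel differs from the
    background by more than `threshold` (a hypothetical fixed value)."""
    mask = []
    for frame_row, background_row in zip(frame, background):
        mask_row = []
        for (r, g, b), (br, bg, bb) in zip(frame_row, background_row):
            largest_diff = max(abs(r - br), abs(g - bg), abs(b - bb))
            mask_row.append(largest_diff > threshold)
        mask.append(mask_row)
    return mask


# A uniform grey background and a frame with one object pixel and one
# slightly noisy pixel.
background = [[(100, 100, 100)] * 3 for _ in range(2)]
frame = [row[:] for row in background]
frame[0][1] = (200, 40, 40)    # object pixel, large colour difference
frame[1][2] = (110, 105, 95)   # small noise, below the threshold
mask = foreground_mask(frame, background)
```

A fixed threshold of this kind is exactly what degrades under shadow, glare and illumination change, which is the limitation the surrounding text describes.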
In addition to estimating a proper background scene, another key challenge is
to handle shadows and glare during the foreground extraction process so that the
objects can be detected accurately. Because of natural phenomena such as shadow
and glare, the appearance of the objects in the scene becomes distorted. As a result,
the extracted foreground objects associated with shadow and glare do not give
proper information about object features such as position, size, shape and contour,
and any subsequent techniques dependent on these object features are bound to
fail.
In a real scenario, nature of the shadow and glare can be static, moving or both.
Static or very slowly moving shadow and glare can be modeled by some background
modeling techniques. But moving shadows and glares that are associated with
moving objects are hard to model and to exclude from detection. Hence effective
identification of shadow and glare regions, and elimination of those regions from the
actual foreground objects, remains challenging and important for any video
analytics application.
Traditionally, shadow and glare are detected using fixed thresholding methods where
a set of fixed and trained thresholds are used to detect the shadow and glare regions.
Mostly these fixed thresholds are derived by observing the variation of pixel intensity
over video frames due to presence of shadow and glare in a specific type of scene, so
their applicability is limited to that type of scene only. Some techniques improve the
fixed thresholding approach by modelling the shadow and glare thresholds to make
them adaptive, but they are still either very specific to the type of scene or require
a lot of computation. Another type of shadow detection approach identifies shadow
regions object by object using scene knowledge. These approaches use scene
knowledge (e.g. differences in shape, size, colour etc. between objects with an
associated shadow and objects without one) about the appearance of shadows in the
scene and apply that knowledge to identify and distinguish the shadow regions from
the associated objects. However, the accuracy of these techniques is low when
applied in real-life scenarios where one scene varies widely from another, and also
over time.
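The traditional fixed thresholding approach described above can be sketched as follows. This is one common form of such a test, assumed here for illustration only: a pixel is treated as shadow or glare when its colour proportions match the background but its brightness ratio falls inside a fixed band. All threshold values and names (`SHADOW_LOW`, `CHROMA_TOL`, `classify`, and so on) are hypothetical and would in practice be tuned to one type of scene, which is the limitation noted in the text.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)

# Hypothetical fixed thresholds on the brightness ratio current/background.
SHADOW_LOW, SHADOW_HIGH = 0.4, 0.9
GLARE_LOW, GLARE_HIGH = 1.1, 1.6
# Hypothetical tolerance on the change in colour proportions.
CHROMA_TOL = 0.05


def brightness(pixel: Pixel) -> float:
    return sum(pixel) / 3.0


def chromaticity(pixel: Pixel) -> Tuple[float, float]:
    """Return the (r, g) proportions of the pixel, independent of brightness."""
    total = sum(pixel)
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)
    return (pixel[0] / total, pixel[1] / total)


def classify(current: Pixel, background: Pixel) -> str:
    """Label a pixel as 'background', 'shadow', 'glare' or 'object'."""
    background_level = brightness(background)
    if background_level == 0:
        return "background" if brightness(current) == 0 else "object"
    ratio = brightness(current) / background_level
    cur_r, cur_g = chromaticity(current)
    bg_r, bg_g = chromaticity(background)
    same_colour = (abs(cur_r - bg_r) <= CHROMA_TOL
                   and abs(cur_g - bg_g) <= CHROMA_TOL)
    if not same_colour:
        return "object"
    if SHADOW_LOW <= ratio <= SHADOW_HIGH:
        return "shadow"
    if GLARE_LOW <= ratio <= GLARE_HIGH:
        return "glare"
    if SHADOW_HIGH < ratio < GLARE_LOW:
        return "background"
    return "object"


labels = [
    classify((60, 60, 60), (100, 100, 100)),     # darker, same colour
    classify((130, 130, 130), (100, 100, 100)),  # brighter, same colour
    classify((200, 40, 40), (100, 100, 100)),    # different colour
    classify((100, 100, 100), (100, 100, 100)),  # unchanged
]
```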
Objects of the Invention
It is thus the basic object of the present invention to provide an intelligent and
adaptive framework for an improved colour object detection method which can
eliminate the defects encountered in the prior art irrespective of video noise such as
shadow, glare, colour changes due to varying illumination, the effect of lighting
conditions on colour appearance, electronically induced noise (e.g. shot noise, but
not limited to it) and other types of noise to which the human vision system is sensitive.
Another object of the present invention is directed to advancements in sequence of
processes adapted to provide more accurate information of colour objects in an
image taken from any video sequence including by low cost cameras whereby any
sequential video images can be processed with this method to locate all possible
detectable colour objects and their related information which can be further
processed to analyze the scene dynamics with respect to the object itself and in
association with other foreground objects wherein the extracted information can be
used to measure any statistical information regarding the object or association of the
colour object with any other animate or inanimate colour objects in the scene.
Yet another object of the present invention is directed to a method for improved
colour background information by eliminating the defects encountered in the prior
state-of-art in presence of video noises like spatial movement of non-meaningful
objects, change of appearance of colour due to presence of shadow, change of
appearance of the colour in the object when it moves to a low intensity (darker)
region from a higher intensity (brighter) region and vice versa.
Another object of the present invention is directed to a technique which would also
be adaptive when the colour appearance of the foreground objects and the
background of the scene changes from frame to frame due to a change in global
intensity or other phenomena such as flickering, sensitivity of the sensor in the camera, etc.
Yet another object of the present invention is directed to an object analysis
technique adapted for detecting and characterizing static objects alongside
moving colour objects in the same scene by way of an advanced unified framework
based on a multi-layer estimation technique.
A further object of the present invention is directed to a multi-layer static foreground
pixel estimation technique overcoming the inability of any traditional background
estimation technique to distinguish the background pixels from the foreground pixels
that remain static for a long duration, which would further enable better control
over the process of distinguishing the static foreground pixels from the background.
Another object of the present invention is directed to advancements in method
discussed above by interconnecting a number of intelligent components consisting of
hardware and software, and involving implementation techniques adapted to make
the system efficient, scalable, cost effective, fail-safe, adaptive to various
demographic conditions, adaptive to various computing and communication
infrastructural facilities.
Summary of the Invention
Thus according to the basic aspect of the present invention there is provided an
intelligent and unified method of multiple component colour object analysis in a
scene favouring scene analytic applications comprising:
multiple component colour coherent background estimation involving colour
correlation of neighbouring pixels and inter-frame multiple component colour
correlation using said multiple components as a composite data and using the
relative values of these components to maintain accurate colour information and
appearance of the true colour in the estimated background frame.
An intelligent and unified method as above wherein said multiple components
comprise multi-spectral signals including human visible spectra Red (R), Green (G),
Blue (B) signals and similar.
An intelligent and unified method of colour object analysis as above comprising (A)
unified colour coherent background estimation involving statistical pixel
processing; (B) removal of shadow and glare from the scene along with removal of
the different types of electronically induced sensor noise and of sensor vibrations; (C)
characterization of pixels in the foreground regions and extraction of moving and/or
static objects.
An intelligent and unified method of colour object analysis as above comprising
tracking a variety of objects individually and generating related information for
rule-engine based intelligent analytical applications.
An intelligent and unified method of colour object analysis as above wherein said
unified colour coherent background estimation involving statistical pixel processing
comprises using the R, G, B components as a composite single structure in a unified
manner to thereby preserve the mutual relationship of these colour components in
each individual pixel in order to maintain true colour appearance in the estimated
colour background frame;
continuously readjusting the estimated or predicted values for each colour pixel in a
frame with all sequential forthcoming frames of the colour video;
correlating the spatial distribution of the colour values in a local region to model the
pixel background colour value.
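The effect of treating R, G, B as one composite value while continuously readjusting the estimate can be illustrated with a minimal sketch. It assumes an exponential running update with a single shared rate for the whole triple; that update rule, the function name `update_background` and the rate of 0.05 are assumptions made for illustration, since the specification does not state the formula at this point.

```python
from typing import Tuple

Colour = Tuple[float, float, float]   # (R, G, B) handled as one unit


def update_background(estimate: Colour, observed: Colour,
                      rate: float = 0.05) -> Colour:
    """Move the whole (R, G, B) estimate towards the observed colour using
    one shared rate, so the three channels always move together."""
    r, g, b = (e + rate * (o - e) for e, o in zip(estimate, observed))
    return (r, g, b)


# Feed the same background colour for many frames, starting from black.
estimate: Colour = (0.0, 0.0, 0.0)
true_background: Colour = (120.0, 60.0, 30.0)   # channel ratio 4 : 2 : 1
for _ in range(200):
    estimate = update_background(estimate, true_background)

red_to_green = estimate[0] / estimate[1]
green_to_blue = estimate[1] / estimate[2]
```

Because every channel is blended with the same rate, the intermediate estimates lie on the straight line between the old and the observed colour, so the channel proportions in this example stay at 4 : 2 : 1 throughout. Updating each channel with its own independent rule would not guarantee this.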
An intelligent and unified method of colour object analysis as above wherein for each
pixel (x,y) in the input colour frame there is carried out (i) local window estimation
(ii) colour analysis of each pixel and (iii) background frame construction based
thereon.
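The three per-pixel steps named above can be laid out as a minimal sketch. It assumes a fixed square window clamped to the frame borders and uses the per-channel window mean as the colour analysis; both are stand-ins chosen for illustration, and the helper names (`window_bounds`, `window_mean`, `build_background`) are hypothetical.

```python
from typing import List, Tuple

Colour = Tuple[float, float, float]
Frame = List[List[Colour]]     # frame[y][x] is the colour at pixel (x, y)


def window_bounds(x: int, y: int, width: int, height: int,
                  half: int = 1) -> Tuple[int, int, int, int]:
    """Step (i): a square window around (x, y), clamped to the frame."""
    return (max(0, x - half), min(width - 1, x + half),
            max(0, y - half), min(height - 1, y + half))


def window_mean(frame: Frame, x0: int, x1: int, y0: int, y1: int) -> Colour:
    """Step (ii): average each channel over the window (assumed analysis)."""
    count = (x1 - x0 + 1) * (y1 - y0 + 1)
    sums = [0.0, 0.0, 0.0]
    for yy in range(y0, y1 + 1):
        for xx in range(x0, x1 + 1):
            for channel in range(3):
                sums[channel] += frame[yy][xx][channel]
    return (sums[0] / count, sums[1] / count, sums[2] / count)


def build_background(frame: Frame, half: int = 1) -> Frame:
    """Step (iii): write each per-pixel result into a new background frame."""
    height, width = len(frame), len(frame[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            bounds = window_bounds(x, y, width, height, half)
            row.append(window_mean(frame, *bounds))
        result.append(row)
    return result


frame = [[(10.0, 20.0, 30.0)] * 3 for _ in range(3)]
frame[1][1] = (100.0, 20.0, 30.0)   # one outlier in the centre
background = build_background(frame)
```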
An intelligent and unified method of colour object analysis as above wherein, if the
pixel location in a current frame belongs to an object pixel in the previous frame,
estimation of the colour background at that pixel location is skipped since the colour
pixel is not representative of the background; otherwise, an adaptive size
(k * h, k * w) local window centred on this pixel is computed for the background
estimation using the colour pixel values within this window, where
k represents the normalized average intensity of all the pixels in a window of
size (h, w). for all 0