
A Method For Selection Of Background During Video Conferencing

Abstract: One object of the invention is to provide a system for extracting caller(s) in video conferencing, removing the original background and replacing it with a caller-selected background for hiding the location and maintaining privacy. The system is based on the skin, eye, colour and motion information of the image and hence extracts the caller(s). It automatically identifies the caller in the conference on the basis of face and eye localization. Another object of the invention is to select a static / dynamic background template for advertisements as background. The service provider can add static / dynamic advertisements as the background, which can generate substantial revenue. The user selects the background and the system identifies the foreground parts so as to remove the original background and replace it with the user-selected one. The background removal is carried out in three stages. In the first stage of pre-processing, the details are reduced. Skin detection and face / eye localization are carried out in the second stage. Decision making and colour clustering are carried out in the third stage. Stream joining for replacing the background is done as a post-processing step. Thus the present invention provides a method for selection of background during video conferencing, comprising the steps of: capturing raw stream at the caller's end; carrying out pre-processing operations on the captured stream for automatically extracting / isolating the foreground and removing the background; joining streams with user-selected static / dynamic background scenes; and encoding the processed stream for transmission.


Patent Information

Application #
Filing Date
08 October 2007
Publication Number
16/2009
Publication Type
INA
Invention Field
ELECTRONICS
Status
Parent Application

Applicants

SAMSUNG ELECTRONICS COMPANY LIMITED.
416, MAETAN-DONG, YEONGTONG-GU, SUWON-SI, GYEONGGI-DO

Inventors

1. DHALL, ABHINAV
SAMSUNG ELECTRONICS COMPANY LIMITED. PRESSMAN HOUSE, 2ND FLOOR, 10 A, LEE ROAD, KOLKATA-700 020
2. SHARMA, NAVEEN
SAMSUNG ELECTRONICS COMPANY LIMITED. PRESSMAN HOUSE, 2ND FLOOR, 10 A, LEE ROAD, KOLKATA-700 020

Specification

FIELD OF THE INVENTION
The present invention relates to a method and system for selection of
background during video conferencing. In particular the invention aims at
developing a system through which the original background can be replaced with
a new caller selected background during run time of a video conferencing call.
It also relates to a technique for automatic extraction of the caller and removal of the
background during a video conference in DTV. The system is based on the skin,
eye, colour and motion information of the image and hence extracts the
caller(s) such that a user-selected background can be used. This technique can
be used for video conferencing in DTV. The invention also suggests a
new method of advertisement, in which the service provider can replace the
video call background with an advertisement.
BACKGROUND OF THE INVENTION
Document US 2007/0183662 discloses a method for automatic segmentation of a
region-of-interest (ROI) video object from a video sequence. Region-of-interest
object segmentation allows the extraction of the selected region-of-interest, or
"foreground", objects from the background (non-ROI) areas of the video sequence

that may be of interest to a viewer. The region-of-interest object can be a
human face or a head and shoulder area of a human body. The method includes
a combination of region-of-interest feature detection, region segmentation, and
background subtraction.
Document US 20050271273 describes a method for providing more efficient and
improved foreground extraction. The extraction is achieved by using iterated
graph cuts and without the need for extensive user interaction. In one
embodiment the method includes segmentation of an image into a foreground
portion and a background portion. The properties corresponding to the
foreground and background portions of the image are determined, and a Gaussian
mixture model of distributions may be used for modelling the foreground and
background properties.
Document US 20050271273 uses graph-cut algorithms to segment out the
foreground part of the image. As a starting point it requires some input from the
user, wherein the user specifies areas of the image which are supposedly
foreground / background. Similar regions are then discovered iteratively using
the graph-cut / grab-cut technique.

Document US 20020051491 describes an image processing device, having an
input for receiving a stereo pair of images. A foreground extractor provided at
the input compares location of like pixel information in each image for
determining which pixel information is foreground pixel information and which
pixel information is background pixel information. A DCT block classifier is
coupled to the foreground extractor for determining which DCT blocks of at least
one of the images contain a threshold amount of foreground information. An
encoder coupled to the DCT block classifier encodes the DCT blocks having the
threshold amount of foreground information at a second lower quantization level.
The i2i project at the Microsoft Research lab, Cambridge, also has an application
similar to that of this document. It is based on a stereo vision system.
Document US 20060210159 describes a method for extracting a foreground
object from an image, by selecting a first pixel of the image and a set of second
pixels of the image associated with the first pixel. A set of contrasts for the first
pixel is determined by comparing the first pixel with each of the second pixels in
image value. An image structure of the first pixel is determined according to the
set of contrasts.

Document US 20050271273 uses an iterative graph-cut algorithm for image
segmentation and hence foreground extraction. However, it requires initial user
input to construct its Gaussian model, so it cannot perform automatic
segmentation, whereas the present system identifies the foreground and the
background automatically. Also, if the same algorithm is applied to every frame
of a video, its very heavy system requirements make real-time results
infeasible. Further, if there is a scene change the system will not segment the
foreground correctly unless it is given the foreground information from the user
again. Hence it cannot be used in real-time applications such as video
conferencing.
Document US 20020051491 relates to identifying the foreground and assigning it
more bits during compression. It does not take into consideration the colour / skin
information and hence is not robust for caller extraction. Secondly, the whole
approach described is aimed at providing better compression through
identification of the object of interest.
The i2i project at Microsoft Research has a constraint: it requires a stereo camera
system, i.e. two cameras, so that a 3D or stereo view can be created. But this
means that special hardware is required.

Document US 2007/0183662 does not mention using a user-selected
customized background or a method of revenue generation through
advertisements as backgrounds.
None of the systems listed above describes or mentions replacing the
default background with a user-selected background in video conferencing.
SUMMARY OF THE INVENTION
One object of the invention is to provide a system for extracting the caller(s) in video
conferencing, removing the original background and replacing it with a caller-selected
background, for hiding the location and maintaining privacy.
The system is based on skin, eye, and colour and motion information of the
image and hence extracts the caller(s). It automatically identifies the caller in
the conference on the basis of the face and eye localization.
Another object of the invention is to select a static / dynamic background
template for advertisements as background. The service provider can add static
/ dynamic advertisements as the background which can generate substantial
revenue.

The user selects the background and the system identifies the foreground parts
so as to remove the original background and replace it with the user-selected one.
The background removal is carried out in three stages. In the first stage of pre-
processing, the details are reduced. Skin detection and face / eye localization
can be carried out in the second stage. Decision making and colour clustering
can be carried out in the third stage. Stream joining for replacing the
background is done as a post-processing step.
Thus the present invention provides a method for selection of background during
video conferencing, comprising the steps of: capturing raw stream at the caller's
end; carrying out pre-processing operations on the captured stream for automatically
extracting / isolating the foreground and removing the background; joining
streams with user-selected static / dynamic background scenes; and encoding
the processed stream for transmission.
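The four claimed steps can be sketched as a per-frame pipeline. The sketch below is illustrative only: the `extract_foreground_mask` placeholder simply thresholds brightness and stands in for the patent's skin / eye / motion based extraction, and all names and parameters here are assumptions, not the claimed implementation.

```python
import numpy as np

def extract_foreground_mask(frame):
    # Hypothetical stand-in for the skin / eye / motion based extraction:
    # treat the brighter pixels of the frame as "foreground".
    return frame.mean(axis=2) > 127

def join_with_background(frame, mask, background):
    # Keep foreground pixels; substitute the user-selected scene elsewhere.
    out = background.copy()
    out[mask] = frame[mask]
    return out

def process_frame(raw, background):
    # Steps 2-3 of the claimed method; encoding (step 4) would follow.
    mask = extract_foreground_mask(raw)
    return join_with_background(raw, mask, background)

raw = np.zeros((4, 4, 3), dtype=np.uint8)
raw[:2] = 255                                  # top half plays the "caller"
scene = np.full((4, 4, 3), 90, dtype=np.uint8)
result = process_frame(raw, scene)
```

In the result, the bright "caller" rows survive while the dark rows are replaced by the selected scene.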
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS
The present invention will now be described with reference to the figures of the
accompanying drawings, where

Figure 1 shows a block diagram of the user settings for the proposed invention.
Figure 2 shows a block diagram of the foreground extraction part of the
proposed invention and describes the flow of the same.
DETAILED DESCRIPTION
As shown in Figure 1 the callers' end in a video conferencing is represented by
reference 1. Means 2 is provided for capturing a raw video stream from the
conference scene. Means 3 of the system extracts the foreground and removes
the background. The background removal with the aid of means 3 is carried out
in three stages: pre-processing the foreground for reducing details, detecting
skin / eye localization, and making a decision for skin and colour clustering.
Means 4 is provided for skin and colour clustering. The system is provided with
means 5 for adding new background to the system. An encoder 6 is used for
encoding the processed stream and transmission of the stream.
In the first stage of pre-processing operation 11, filters such as colour
quantization and low-pass filtering are applied, which smoothen the image and
hence facilitate economical computation. Because the algorithm has been
designed with embedded platforms in mind, which have limited memory and
computing resources, complex transformations have been avoided.
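A minimal sketch of this pre-processing stage, assuming plain NumPy and illustrative parameter choices (8 quantization levels per channel, a 3 x 3 mean filter as the low-pass stage); the patent does not specify exact filter sizes or levels.

```python
import numpy as np

def quantize_colours(img, levels=8):
    # Map each channel onto `levels` evenly spaced values (assumed parameter).
    step = 256 // levels
    return (img // step) * step

def box_blur(img, k=3):
    # Simple k x k mean filter as the low-pass stage (borders via edge padding).
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)),
                    mode='edge').astype(np.float32)
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).astype(np.uint8)

img = np.random.randint(0, 256, (8, 8, 3), dtype=np.uint8)
smooth = box_blur(quantize_colours(img))
```

Both operations are cheap per-pixel passes, consistent with the stated embedded-platform constraint.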

The pre-processed stream is then converted into the hue, saturation, value
(HSV) domain in the second stage and skin detection is applied to it. Block 12
in Figure 2 represents skin pixel analysis and clustering-parameter decision
making. Once the skin-like coloured areas are known, the system tries to locate
the eyes and the face. The eye and face locations are vital features, as they
help in setting parameters for the next stage, i.e. colour clustering. If
multiple faces are found in the image, connected component analysis is used for
labelling.
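The second stage can be illustrated as follows. The HSV skin thresholds are purely illustrative assumptions (the patent gives no exact ranges), and the 4-connected breadth-first labelling stands in for whatever connected component analysis the implementation uses.

```python
import numpy as np
from collections import deque

def skin_mask_hsv(hsv):
    # hsv: float array with H in [0, 360), S and V in [0, 1].
    # Illustrative skin range (assumption, not the patent's values).
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return (h < 50) & (s > 0.2) & (s < 0.7) & (v > 0.35)

def label_components(mask):
    # 4-connected component labelling by breadth-first search,
    # used when multiple skin-like regions (faces) are present.
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for y in range(mask.shape[0]):
        for x in range(mask.shape[1]):
            if mask[y, x] and labels[y, x] == 0:
                count += 1
                queue = deque([(y, x)])
                labels[y, x] = count
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = count
                            queue.append((ny, nx))
    return labels, count

hsv = np.array([[[20.0, 0.5, 0.8], [200.0, 0.5, 0.8]]])  # one skin-like pixel
skin = skin_mask_hsv(hsv)
two_blobs = np.zeros((5, 5), dtype=bool)
two_blobs[0, 0] = True
two_blobs[3:5, 3:5] = True
labels, n = label_components(two_blobs)
```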
The third stage starts with making a decision with respect to the statistics gained
in stage two. On the basis of the ratio of skin-like to non-skin-like pixels and the
locations of the eyes and face, an estimate of the user's body can be made.
This helps in reducing the area required for colour clustering and also the
parameters in terms of the number of clusters required. An optimized version of
K-means clustering is used for segmentation.
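A basic (non-optimized) K-means over pixel colour vectors conveys the idea; the patent's optimized variant and its parameter choices are not specified, so `k`, the iteration count, and the random initialization here are assumptions.

```python
import numpy as np

def kmeans_pixels(pixels, k=3, iters=10, seed=0):
    # Basic Lloyd's algorithm over N x 3 colour vectors.
    # k, iters, and the initialization are assumed parameters.
    rng = np.random.default_rng(seed)
    centres = pixels[rng.choice(len(pixels), size=k,
                                replace=False)].astype(float)
    for _ in range(iters):
        # Assign each pixel to its nearest centre.
        dists = np.linalg.norm(pixels[:, None, :] - centres[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Move each centre to the mean of its assigned pixels.
        for c in range(k):
            if (assign == c).any():
                centres[c] = pixels[assign == c].mean(axis=0)
    return assign, centres

# Two well-separated colour groups should land in two clusters.
pixels = np.vstack([np.full((10, 3), 10.0), np.full((10, 3), 240.0)])
assign, centres = kmeans_pixels(pixels, k=2)
```

Restricting the clustered area and the number of clusters, as the text describes, directly shrinks the `pixels` array and `k`, which is where the computational saving comes from.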
On the basis of the parameters derived in stage three, the body is estimated out of
the clusters found in data clustering analysis 13. The stream is joined in join and
rebuild 14 with the face pixels found in skin detection. Post-processing filters 15,
such as morphological operators for removing noise, are then applied.
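A morphological opening (erosion followed by dilation) with a 4-connected "plus" structuring element is one such noise-removal operator; the patent does not name a specific operator or structuring element, so this choice is an assumption.

```python
import numpy as np

def dilate(mask):
    # Dilation with a 4-connected "plus" structuring element.
    out = mask.copy()
    out[1:, :] |= mask[:-1, :]
    out[:-1, :] |= mask[1:, :]
    out[:, 1:] |= mask[:, :-1]
    out[:, :-1] |= mask[:, 1:]
    return out

def erode(mask):
    # Erosion is dilation of the complement, complemented (duality).
    return ~dilate(~mask)

def open_mask(mask):
    # Opening (erosion then dilation) removes isolated noise pixels
    # while largely preserving larger foreground regions.
    return dilate(erode(mask))

noise = np.zeros((5, 5), dtype=bool)
noise[2, 2] = True                    # a lone noise pixel
body = np.zeros((7, 7), dtype=bool)
body[2:5, 2:5] = True                 # a solid 3 x 3 region
```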

For subsequent frames, motion estimation is used until a scene change occurs.
At this point the foreground has been isolated and the streams are joined with
the user-selected static / dynamic scenes.
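The reuse-until-scene-change idea can be sketched as follows; the mean-absolute-difference test and its threshold are illustrative stand-ins for the motion estimation the text mentions.

```python
import numpy as np

def scene_changed(prev, curr, thresh=30.0):
    # Mean absolute frame difference as a crude scene-change test
    # (the threshold value is an assumption).
    return float(np.abs(curr.astype(int) - prev.astype(int)).mean()) > thresh

def mask_for_frame(prev, curr, prev_mask, recompute):
    # Reuse the previous foreground mask while the scene is stable;
    # run the full extraction (the `recompute` callable) only on a cut.
    if prev is None or scene_changed(prev, curr):
        return recompute(curr)
    return prev_mask

frames = [np.zeros((4, 4, 3), np.uint8),
          np.zeros((4, 4, 3), np.uint8),
          np.full((4, 4, 3), 255, np.uint8)]
full_runs = []

def recompute(frame):
    full_runs.append(1)               # count full extraction passes
    return frame.mean(axis=2) > 127

mask, prev = None, None
for f in frames:
    mask = mask_for_frame(prev, f, mask, recompute)
    prev = f
```

Only the first frame and the cut trigger the expensive extraction; the stable middle frame reuses the cached mask.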
Some of the advantages of the present invention are: a novel method of
user-selected background replacement, automatic caller extraction, a novel
approach to video conferencing based on skin and eye information, removal of
unwanted details in the scene, and a new method of advertising in which the
service provider can display advertisements in lieu of the original background.
Knowing the region of interest also means better compression can be applied to
the frames.
The present invention also provides a DTV / HDTV / IPTV comprising the system
of the invention, where the method for selection of the background in video
conferencing can be applied.

WE CLAIM
1. A method for selection of background during a video conferencing,
comprising the steps of:
- capturing raw stream at the caller's end;
- carrying out pre-processing operations on captured stream for
automatically extracting / isolating the foreground and removing
the background;
- joining streams with user selected static / dynamic background
scenes; and
- encoding the processed stream for transmission.
2. The method as claimed in claim 1, wherein said step of extracting the
foreground and removing the background comprise the steps of:
- pre-processing using filters like colour quantization and low pass
filters for smoothening the image;
- converting the pre-processed stream into the hue, saturation, value
(HSV) domain; and
- estimating the user body out of the clusters found in data
clustering on the basis of skin pixel / non-skin pixel ratio, and
locations of eye and face for extraction of foreground and a change
in scene.

3. The method as claimed in claim 1, wherein said user selected background
scene is a display of static / dynamic advertisements from the service
provider.
4. A system for selection of background during video conferencing,
comprising:
- means for capturing raw video stream from the conference scene;
- means for extracting the foreground and for removing the
background by pre-processing the foreground, detecting skin / eye
localization and skin colour clustering;
- means for adding new background to the system; and
- an encoder for encoding the processed stream before transmission.

5. A DTV, or a HDTV, or an IPTV comprising the system as claimed in claim
4.
6. A method for selection of background during a video conferencing,
substantially as herein described and illustrated in the accompanying
drawings.


Documents

Application Documents

# Name Date
1 1381-KOL-2007-FORM 18.pdf 2011-10-07
2 01381-kol-2007-abstract.pdf 2011-10-07
3 01381-kol-2007-form 3.pdf 2011-10-07
4 01381-kol-2007-form 2.pdf 2011-10-07
5 01381-kol-2007-description complete.pdf 2011-10-07
6 01381-kol-2007-drawings.pdf 2011-10-07
7 01381-kol-2007-form 1.pdf 2011-10-07
8 01381-kol-2007-correspondence others.pdf 2011-10-07
9 01381-kol-2007-claims.pdf 2011-10-07
10 01381-kol-2007-gpa.pdf 2011-10-07
11 1381-KOL-2007_EXAMREPORT.pdf 2016-06-30