Abstract: The present invention relates to a pedestrian detection and tracking system. The system of the present invention provides for object detection by extracting the probable pedestrian blocks from given input images. The system according to the invention detects edges in the image which are grouped by merging to form possible pedestrian like image blocks. Profiling is done on the obtained pedestrian like image blocks to further segment the blocks. The image blocks thus obtained are subjected to aspect ratio and row-logic tests. Tracking is done for block verification in subsequent frames and the image blocks are displayed. The said pedestrian detection and tracking system assists vehicle users in varying light intensities and scenarios.
FORM 2
THE PATENTS ACT 1970
(39 of 1970)
AND
The Patents Rules, 2003
COMPLETE SPECIFICATION
(See section 10 and rule 13)
1. TITLE OF THE INVENTION:
"PEDESTRIAN DETECTION AND TRACKING SYSTEM"
2. APPLICANT:
(a) NAME: KPIT CUMMINS INFOSYSTEMS LIMITED
(b) NATIONALITY: Indian Company incorporated under the Companies Act, 1956
(c) ADDRESS: 35 & 36 Rajiv Gandhi Infotech Park, Phase 1, MIDC, Hinjewadi, Pune - 411057, Maharashtra, India.
3. PREAMBLE TO THE DESCRIPTION:
The following specification describes the invention and the manner in which it is to be performed.
Field of Invention:
The field of invention generally relates to object detection and more specifically relates to pedestrian detection with near infrared imaging using edge detection, object detection and profiling.
Background of the invention:
There has been an increase in the number of traffic accidents over the years. Multi-tasking while driving has proven dangerous, even fatal, on several occasions, adding to the causes of accidents. Pedestrians are among the major victims of these accidents, more so at night, because the driver is unable to see the pedestrian until it is too late. It is therefore necessary to provide warning and environment awareness systems that give drivers additional assistance without distracting them.
There has been a lot of interest in the development and use of night vision systems to aid the driver in detecting pedestrians well in advance. Both FIR (far infrared) and NIR (near infrared) cameras have been used in these systems. In FIR systems, images are registered based on an object's thermal signature, and the image thus produced resembles a photo negative. While these systems work well for distinguishing animals and people, they do little to identify a dead animal in the middle of the road, or perhaps a large rock or a fallen tree. The NIR camera system, which responds to IR wavelengths closer to the visible spectrum in addition to visible wavelengths, gives more natural images to the human driver.
Currently, in order to enhance a driver's visibility, commercial night vision systems for automotives are available as accessories, typically, in high-end cars. But their high cost impedes their popularity and utility among vehicle users.
WO 2006059201 proposes one such system wherein the extent of roads is adjudged by the system by calculating velocity information from captured image data. It attempts to identify objects and indicates danger in case of collision with moving objects.
Most similar systems rely on training databases for pedestrian detection. Such data becomes redundant and obsolete over time unless it is periodically and frequently updated. These issues make such systems unreliable and inconvenient.
In view of the drawbacks in the prior art, there is a need to equip drivers with a pedestrian detection and warning system which is reliable, efficient and affordable.
Summary of the invention:
The present invention provides non-obtrusive vision based means for object detection to assist drivers while driving. The system of the present invention provides for object detection by extracting the probable pedestrian blocks from the given input images. The feature extraction means performs calculations on the input images and the corresponding data is utilised by the tracking means and display and warning means to provide real time detection and tracking of pedestrians.
Brief description of the Drawings:
Fig. 1: Illustrates a block diagram of the said system.
Fig. 2: Illustrates a block diagram of operations of the system.
Fig. 3: Illustrates the process of major edge detection.
Fig. 4: Illustrates expansion of the feature extraction and profiling block.
Fig. 5: Illustrates a pear shape like curve.
Detailed description of the Invention:
The present invention will be described through the preferred embodiments herein below with reference to the drawings.
In an embodiment as seen in Fig. 1, the Object Detection and Tracking system is an apparatus comprising an Image Input Device (1), a Controller Unit (2), and a Display and Warning Device (6). The Controller Unit (2) further comprises a Feature Extractor and Profiling Block (3), a Tracking Block (4) and a Display Block (5).
The Image Input Device (1) may be a camera device coupled with a near-infrared illumination source, or any similar camera input device. The Image Input Device (1) provides continuous image frames, made of image blocks, as input to the Controller Unit (2) for processing. The 'images', as mentioned hereinafter, are a sequence of frames captured by a forward-looking camera placed inside the host vehicle. The Controller Unit (2) controls the working of the apparatus by maintaining the coordination of operations in the system. The images received from the Image Input Device (1) are passed on to the Feature Extractor and Profiling Block (3) by the Controller Unit (2) for processing. The Feature Extractor and Profiling Block (3) implements edge detection filters to extract edges from the given input image. The data obtained from the resultant image is analysed to determine the image blocks that contain probable pedestrians. Profiling is performed on the obtained image blocks to identify whether they contain pedestrian objects. This data is provided to the Controller Unit (2), where it is processed and passed to the Tracking Block (4). The Tracking Block (4) verifies the findings of the Feature Extractor and Profiling Block (3) by tracking identified pedestrians in successive input images. Findings of the Tracking Block (4) are provided to the Controller Unit (2), which displays them on the display screen of the Display and Warning Device (6). The Display Block (5) of the Controller Unit (2) determines the contents to be displayed on the Display and Warning Device (6), based on certain predefined conditions. The Display and Warning Device (6) may include a visual warning, a combined visual and audio warning, or any other form of warning device.
In its preferred embodiment, the system performs pedestrian detection as illustrated in Fig. 2. The Controller Unit (2) provides every 'f'th input image frame to the Feature Extractor and Profiling Block (3) and subjects the next (f-1) frames to the Tracking Block (4) for tracking using template matching, wherein 'f' is a predefined number that depends on the speed of the car and the frame rate of the camera being used. According to an embodiment of the invention, the Controller Unit (2) provides every '5th' input image frame to the Feature Extractor and Profiling Block (3) and the remaining '4' frames are subjected to the Tracking Block (4) for tracking using template matching. The Feature Extractor and Profiling Block (3) extracts only the vertical edges by applying vertical and diagonal Sobel kernels on the image and retaining only the edge pixels which have a higher value along the vertical direction than the diagonal direction after the Sobel kernels are applied. The Feature Extractor and Profiling Block (3) also retains only those edges which are greater than or equal to 'n' pixels in length, wherein 'n' depends on the image size. For the embodiment of the invention, the Feature Extractor and Profiling Block (3) retains only those edges which are greater than or equal to '3' pixels in length. Additionally, in order to eliminate edges arising from small variations in gray level on generally smooth surfaces, edges are selected only if the difference of pixel intensities at the edges is greater than a pre-determined threshold 'r', wherein 'r' depends on the bit depth of the image under consideration. For the embodiment of the invention, the pre-determined threshold 'r' is set to '20'.
The extraction of edges is based on selection of prominent vertical edges in the input image. Fig. 3 illustrates the process for major edge detection. The edge detection is performed using an 'm' x 'm' Sobel kernel, wherein 'm' is a predefined value. For the embodiment of the invention, as illustrated in Fig. 3, the edge detection is performed using a 3x3 Sobel kernel each for the vertical and two diagonal edges (corresponding to angles of 45 degrees and 135 degrees). For each pixel in the input image, the values obtained after applying each of the three edge kernels are compared. All input pixels for which the diagonal edge value is found to be greater than the vertical edge value are discarded. On this edge image a morphological opening operation is performed using a vertical kernel of height 'n' pixels and width 1 pixel. The opening operation removes from the image the very small edges which usually arise due to noise.
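A minimal sketch of this edge-extraction stage, using OpenCV and NumPy, under stated assumptions: the diagonal 3x3 kernel values are illustrative, 'n' = 3 and 'r' = 20 are taken from the embodiment, and the edge strength itself (rather than a raw intensity difference at the edge) is compared against 'r'.

```python
import cv2
import numpy as np

def extract_vertical_edges(gray, n=3, r=20):
    """Keep only prominent vertical edges at least n pixels long."""
    # One vertical Sobel kernel and two diagonal (45/135 degree) kernels.
    k_vert = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
    k_d45  = np.array([[ 0, 1, 2], [-1, 0, 1], [-2, -1, 0]], dtype=np.float32)
    k_d135 = np.array([[-2, -1, 0], [-1, 0, 1], [ 0, 1, 2]], dtype=np.float32)

    img = gray.astype(np.float32)
    e_vert = np.abs(cv2.filter2D(img, -1, k_vert))
    e_diag = np.maximum(np.abs(cv2.filter2D(img, -1, k_d45)),
                        np.abs(cv2.filter2D(img, -1, k_d135)))

    # Discard pixels whose diagonal response exceeds the vertical response,
    # and pixels whose edge strength falls below the threshold 'r'.
    edges = np.where((e_vert >= e_diag) & (e_vert > r), 255, 0).astype(np.uint8)

    # Morphological opening with an n x 1 vertical kernel removes edges
    # shorter than n pixels, which usually arise due to noise.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, n))
    return cv2.morphologyEx(edges, cv2.MORPH_OPEN, kernel)
```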
After discarding the diagonal edges and performing the opening operation, the output image consists only of vertical edges that are greater than a particular size. The output of the vertical edge detection function is further subjected to blob detection and merging.
Blob detection identifies the probable pedestrian image blocks, or pedestrian like objects, in an image. Vertical edges close to each other usually belong to a single object in the image. The blob detection scheme uses connected component labelling for grouping vertical edges which are close to each other, and comprises generating a data structure to save information about each blob. Blobs that are shorter than the minimum acceptable height of a pedestrian or taller than the maximum acceptable height of a pedestrian are deleted.
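A minimal sketch of blob detection over the vertical-edge image using OpenCV's connected-component labelling; the height bounds are illustrative placeholders for the minimum and maximum acceptable pedestrian heights, which the specification leaves to the implementation.

```python
import cv2

def detect_blobs(edge_img, min_h=20, max_h=200):
    """Return bounding boxes of edge groups within pedestrian height limits."""
    num, labels, stats, centroids = cv2.connectedComponentsWithStats(edge_img)
    blobs = []
    for i in range(1, num):  # label 0 is the background
        x, y, w, h, area = stats[i]
        # Delete blobs shorter than the minimum or taller than the maximum
        # acceptable pedestrian height.
        if min_h <= h <= max_h:
            blobs.append((x, y, w, h))
    return blobs
```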
Merging of blobs solves the problem of missed detection when the edges belonging to the same object of interest in the image are dispersed. In cases where a pedestrian is located very close to the camera, or the pedestrian is running, the edges of the same pedestrian are separated by a distance. In such cases the blobs are separated, and there is a possibility of not detecting the pedestrian, because neither blob may satisfy the height and width criteria independently.
Merging of blobs is done as follows:
• The vertical and horizontal distances between centroids of adjacent blobs are calculated.
• The adjacent blobs whose horizontal distance is less than a preset 'Threshold 1' and vertical distance is less than a preset 'Threshold 2' are merged.
• 'Threshold 1' and 'Threshold 2' are preset depending on the image size.
The above-mentioned process is repeated twice in order to identify all the probable pedestrian like candidates.
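A minimal sketch of this merging step, run twice as stated above; the centroid computation and the example values of 'Threshold 1' and 'Threshold 2' are assumptions (the specification presets both thresholds from the image size).

```python
def merge_pass(blobs, thresh1, thresh2):
    """One pass of pairwise merging over (x, y, w, h) boxes."""
    out = []
    for bx, by, bw, bh in blobs:
        for k, (x, y, w, h) in enumerate(out):
            # Horizontal and vertical distances between blob centroids.
            if (abs((bx + bw / 2) - (x + w / 2)) < thresh1 and
                    abs((by + bh / 2) - (y + h / 2)) < thresh2):
                nx, ny = min(x, bx), min(y, by)
                out[k] = (nx, ny, max(x + w, bx + bw) - nx,
                          max(y + h, by + bh) - ny)
                break
        else:
            out.append((bx, by, bw, bh))
    return out

def merge_blobs(blobs, thresh1=30, thresh2=40):
    # The specification repeats the merging process twice.
    return merge_pass(merge_pass(blobs, thresh1, thresh2), thresh1, thresh2)
```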
Another object of the said invention is profiling. After the detected blobs are merged, they are further subjected to profiling. As illustrated in Fig. 4, row and column profiling are used for calculating the approximate width and height of an object in the image block.
The image blocks obtained after blob merging are the regions of the input image which contain pedestrians or probable pedestrian candidates, and hence are probable pedestrian image blocks. These image blocks are further processed to segment the pedestrian like objects.
For row profiling, each bin contains the sum of the pixel intensities belonging to the corresponding column; for column profiling, each bin contains the sum of the pixel intensities belonging to the corresponding row. The row profile therefore has a number of bins equal to the number of columns in the image, while the column profile has a number of bins equal to the number of rows in the image.
The row and column profiles are used for calculating the width and height, respectively, of an object in the image block in terms of the number of pixels. This is done by counting the number of bins (columns for the row profile, rows for the column profile) whose values lie above the profiling threshold. This profiling threshold is calculated as follows:
Profiling Threshold = min value + C * (max value - min value)

where 'C' is a preset percentage of the difference between the 'max value' and the 'min value'; the 'min value' and 'max value' are obtained separately for both the row profiles and the column profiles.
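A minimal sketch of the profiling computation, following the text's convention that the row profile sums each column of the block and the column profile sums each row; the fraction 'C' is an assumed preset value.

```python
import numpy as np

def object_width_height(block, C=0.3):
    """Estimate object width and height (in pixels) inside an image block."""
    row_profile = block.sum(axis=0)  # one bin per column -> gives width
    col_profile = block.sum(axis=1)  # one bin per row    -> gives height

    def span_above_threshold(profile):
        # Profiling Threshold = min value + C * (max value - min value)
        t = profile.min() + C * (profile.max() - profile.min())
        return int(np.count_nonzero(profile > t))

    return span_above_threshold(row_profile), span_above_threshold(col_profile)
```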
The image blocks are then checked for height and aspect ratio, in order to determine whether the object is a pedestrian or not. The aspect ratio of the segmented object is calculated by taking the ratio of its height to its width. The height is compared against the 'estimated height' which is calculated using the row number and vanishing point, as described hereinafter. If the objects in the image blocks fall within the permissible aspect ratio and height limits, the coordinates of the segmented image block are sent for further processing to the Display Block (5) of the Controller Unit (2).
As the distance between the pedestrian and camera increases, the pedestrian appears smaller. Depending on the vanishing point, the Feature Extractor and Profiling Block (3) estimates the height of the pedestrian based on the row number where his feet lie. The estimated height is calculated using a height-distance model as given below:
Estimated Height = K * (Row Number - Vanishing Point)

where 'K' is a heuristically determined constant, and 'Vanishing Point' is calculated based on parameters such as camera height, camera angle, etc. Once calculated, the vanishing point remains constant for a fixed camera position.
The height-distance model in the image context is derived heuristically from observed heights of pedestrians located at different rows in the input images. To account for errors due to inaccuracies in the above height-distance model, demographic variation in the heights of pedestrians, change in the vanishing point of the road, and change in road elevation, a particular tolerance level is provided. For the embodiment of the invention, a tolerance of 0.6 to 1.66 of the estimated height is provided.
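A minimal sketch of the height and aspect-ratio tests described above; 'K', the vanishing-point row, and the aspect-ratio limits are assumed illustrative values, while the 0.6 to 1.66 tolerance band is taken from the embodiment.

```python
def is_pedestrian_candidate(width, height, foot_row,
                            K=0.55, vanishing_point=120,
                            ar_min=1.5, ar_max=4.0):
    """Check a segmented object's height and aspect ratio."""
    # Height-distance model: Estimated Height = K * (Row Number - Vanishing Point)
    estimated = K * (foot_row - vanishing_point)
    # Tolerance band of 0.6 to 1.66 of the estimated height.
    if not (0.6 * estimated <= height <= 1.66 * estimated):
        return False
    aspect_ratio = height / float(width)  # ratio of height to width
    return ar_min <= aspect_ratio <= ar_max
```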
It has been observed that the illumination of an object in an image varies based on the distance of the object from the source of illumination (viz. the headlights). Hence, pedestrians who are far away from the camera are less illuminated than pedestrians near the camera. Since the profiling is based on the pixel intensity values of the image, there is a need to enhance the pixel intensity values of pedestrians and suppress other brighter objects in the background, like lane markers, lamps, etc., to improve detection accuracy. This is done by applying a pear shape like curve for enhancing pixel values that belong to probable pedestrians. Fig. 5 illustrates the pear shape like curve.
As a part of the Feature Extractor and Profiling Block (3), it is also essential to employ an image enhancement scheme for improving the accuracy of the detection and tracking mechanisms. The image blocks obtained from blob merging are considered as input for the enhancement technique. These image blocks contain both pedestrian and non-pedestrian information. In order to boost the edges of a probable pedestrian in the block, an average edge value is calculated by taking an average of the edge pixels from the original image block after dilation. The pear shape like curve is then applied on the block to suppress the brighter and darker pixel values and boost the pixel values 'of interest', i.e. the probable pedestrian's pixel values. Due to the application of the pear shape like curve, the problem of saturation observed in night-time images and in regions having higher pixel values is reduced. Also, the image contrast and overall image quality are improved, resulting in a further enhanced image block. This curve is applied only when 70% of the image block contains pixels with intensity values less than a predefined constant (the constant value is decided from the image brightness).
Nonlinear equation for pear shape like curve:
where: x = input pixel, y = output pixel, and a = max intensity level, i.e. 255.
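The equation itself is not reproduced in the text above, so the sketch below is only an assumed stand-in with the qualitative behavior described: a gain that peaks for pixel values near the block's average edge value and falls off for much darker or brighter pixels, together with the 70% applicability condition. The Gaussian form and the parameters 'lo', 'hi', 'sigma' and 'const' are all hypothetical, not the patented curve.

```python
import numpy as np

def pear_like_enhance(block, avg_edge, lo=0.8, hi=1.6, sigma=40.0, a=255):
    """Hypothetical stand-in for the pear shape like curve: boost pixel
    values near 'avg_edge', mildly suppress darker and brighter values."""
    x = block.astype(np.float32)
    # Gain peaks at 'hi' near the average edge value and decays toward 'lo'.
    gain = lo + (hi - lo) * np.exp(-((x - avg_edge) ** 2) / (2.0 * sigma ** 2))
    return np.clip(x * gain, 0, a).astype(np.uint8)

def maybe_enhance(block, avg_edge, const=100):
    # Apply the curve only when 70% of the block's pixels have intensity
    # values below the predefined constant (decided from image brightness).
    if np.mean(block < const) >= 0.7:
        return pear_like_enhance(block, avg_edge)
    return block
```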
As illustrated in Fig. 2, the tracking list is updated after the image frames are processed by the Feature Extractor and Profiling Block (3). The processed images are then subjected to the Tracking Block (4) by the Controller Unit (2) for pedestrian tracking.
The Tracking Block (4) performs tracking of probable pedestrians as follows:
• Selection of an extended region around the input image block: For each image block in the previous list, an extended region (formed by extending the height by one fourth of the block's height in either direction and the width by one fourth of the block's width in either direction) is selected from the present frame and sent for template matching.
• Template Matching: The image block selected from the previous frame is used as a template, and a search is performed on the extended image region to obtain the best correlation match between the template and the extended image. The coordinates are updated with the new coordinates obtained from the search.
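A minimal sketch of the two tracking steps above, assuming 8-bit grayscale frames and OpenCV's normalized cross-correlation (cv2.TM_CCOEFF_NORMED) as the correlation measure; the specification names template matching but does not fix the matching method.

```python
import cv2

def track_block(prev_frame, cur_frame, box):
    """Re-locate a previously detected block (x, y, w, h) in the current frame."""
    x, y, w, h = box
    template = prev_frame[y:y + h, x:x + w]

    # Extend the search region by one fourth of the block's height and width
    # in either direction, clamped to the frame borders.
    H, W = cur_frame.shape[:2]
    x0, y0 = max(0, x - w // 4), max(0, y - h // 4)
    x1, y1 = min(W, x + w + w // 4), min(H, y + h + h // 4)
    region = cur_frame[y0:y1, x0:x1]

    # Best correlation match between the template and the extended region.
    result = cv2.matchTemplate(region, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(result)
    return (x0 + max_loc[0], y0 + max_loc[1], w, h)  # updated coordinates
```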
False objects detected by the Feature Extractor and Profiling Block (3) may not appear in continuous frames, unlike pedestrian objects, which can be assumed to be present continuously for some frames before gradually moving out of the camera's field of view. A detected object is forwarded to the Controller Unit (2) to be displayed only if the said object appears, per the Feature Extractor and Profiling Block (3), in at least 'i' segmentation frames out of the last 'p' segmentation frames, wherein 'i' and 'p' depend on the camera frame rate. For the embodiment of the invention, a detected object is forwarded to the Controller Unit (2) to be displayed only if the said object appears in at least '3' segmentation frames out of the last '5' segmentation frames (including the present frame).
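A minimal sketch of this i-of-p persistence rule, using the embodiment's values i = 3 and p = 5; keeping one boolean history per tracked object is an assumed bookkeeping choice.

```python
from collections import deque

class PersistenceFilter:
    """Forward an object for display only if it was detected in at least
    'i' of the last 'p' segmentation frames (including the present one)."""
    def __init__(self, i=3, p=5):
        self.i = i
        self.history = deque(maxlen=p)  # rolling window of the last p frames

    def update(self, detected_now):
        self.history.append(bool(detected_now))
        return sum(self.history) >= self.i
```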
Thus, with the embodiments of the present invention, the dependence on static training databases, which may become redundant over time, is avoided. Moreover, template matching with sequentially prior instances of an object in the image frames provides a dynamic reference for tracking of pedestrians.
While the Tracking Block (4) continuously passes tracking data to the Display Block (5) of the Controller Unit (2) for display on the Display and Warning Device (6), the tracking lists are also updated accordingly. The display of detected and tracked pedestrians with warnings takes place as follows:
Modify the count of an image block: The list of pedestrians detected in the present image frame is obtained from the Controller Unit (2) as provided by the Feature Extractor and Profiling Block (3). Initially, the count of each detected image block after profiling is set to '1'. The count of an image block is updated by matching the image block centroid position obtained in the present image frame against the previous list of image block centroids. If an image block from the previous list does not match any of the image blocks from the present list, its count is decremented by '1', and the details of the image block are appended to the present list if the updated count is not zero. Modification of the count is done only if segmentation of the image frame has previously been performed. Once the pedestrian count is updated across 'p' frames, the previous list is reinitialized with the current list obtained from the current frame under consideration.
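The count bookkeeping above leaves some details open; the sketch below is one plausible reading, in which matched blocks carry their count forward incremented by one, unmatched previous blocks are decremented and kept only while their count is nonzero, and centroids match under an assumed Manhattan-distance threshold.

```python
def update_counts(prev_list, cur_centroids, match_dist=20):
    """prev_list: list of [centroid, count]; cur_centroids: centroids of
    blocks detected after profiling in the present segmentation frame."""
    new_list = []
    matched_prev = set()
    for (cx, cy) in cur_centroids:
        count = 1  # a newly detected image block starts with a count of '1'
        for k, ((px, py), pcount) in enumerate(prev_list):
            if k not in matched_prev and abs(cx - px) + abs(cy - py) < match_dist:
                count = pcount + 1  # matched: carry the count forward
                matched_prev.add(k)
                break
        new_list.append([(cx, cy), count])
    # Previous blocks with no match in the present list: decrement the count
    # and append the block's details only while the count is not zero.
    for k, (c, pcount) in enumerate(prev_list):
        if k not in matched_prev and pcount - 1 > 0:
            new_list.append([c, pcount - 1])
    return new_list
```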
Display:
The information obtained from the Feature Extraction and Profiling Block (3) and the Tracking Block (4) is further processed by the Display Block (5) of the Controller Unit (2). The Display Block (5) provides the said processed information, such as the coordinates of a bounding box where a pedestrian is detected in an image, to the Display and Warning Device (6), based on certain predefined conditions, such as updating the count for each image block and displaying an image block in the frame only if its count is greater than or equal to a particular preset threshold. The identified probable pedestrian image blocks in the frame are highlighted, and the bounding box where the pedestrian is detected is displayed by the Display and Warning Device (6) on the display screen only if the image count of the particular image block is greater than or equal to the preset value of 'i'. This is done in order to avoid warnings for falsely detected pedestrians. Additionally, the Display and Warning Device (6) may alert the driver to the detected pedestrian by an alarm, a blinking light or any similar audio and/or video means.
While the embodiments of the present invention have been described with certain examples, those with ordinary skill in the art will appreciate that various modifications and alternatives to those details could be developed in the light of the overall teachings of the disclosure, without departing from the scope of the invention.
We Claim:
1. A pedestrian detection and tracking system comprising:
an Image Input Device (1) to provide continuous image frames as input;
a Controller Unit (2) to coordinate communication between various units of the system and raise a warning when the communication fails, wherein the said Controller Unit (2) further comprises a Feature Extractor and Profiling Block (3) to extract probable pedestrian image blocks from the input images and profile information to accurately confirm said pedestrian image blocks, a Tracking Block (4) which sequentially examines image frames for pedestrian regions to find a best correlation match in the continuously updated tracking list, and a Display Block (5) which controls the display of information; and
a Display and Warning Device (6) which provides the warning.
2. A pedestrian detection and tracking system as claimed in claim 1, wherein the said Feature Extractor and Profiling Block (3) performs a plurality of operations such as edge extraction, blob detection, merging of blobs and profiling of blobs.
3. A pedestrian detection and tracking system as claimed in claim 1, wherein the Tracking Block (4) selects the probable pedestrian image blocks from the data as provided by the Feature Extractor and Profiling Block (3).
4. A pedestrian detection and tracking system as claimed in claim 3, wherein, the Tracking Block (4) selects probable pedestrian image block as a template for template matching in the subsequent image frames.
5. A pedestrian detection and tracking system as claimed in claim 1, wherein the said probable pedestrian image blocks are enhanced by application of a pear shape like curve.
6. A pedestrian detection and tracking system as claimed in claim 1, wherein the selected probable pedestrian image blocks are further segmented by profiling to determine the height and width of the pedestrian like objects.
7. A pedestrian detection and tracking system as claimed in claim 1, wherein false pedestrian detection is prevented by displaying a warning on the Display and Warning Device (6) only when the pedestrian object is detected in 'i' or more segmentation image frames out of every 'p' segmentation image frames.
8. A pedestrian detection and tracking system as claimed in claim 1, wherein, the edge information of the image is obtained by a comparison of values of the vertical and diagonal edges.