
The Journal of Supercomputing (2020) 76:2289–2301
https://doi.org/10.1007/s11227-018-2488-1

Comparative study of illumination-invariant foreground detection

P. R. Karthikeyan¹ · P. Sakthivel¹ · T. S. Karthik²

¹ Department of Electronics and Communication Engineering, Anna University, Chennai, India
² Department of Electronics and Communication Engineering, B. V. Raju Institute of Technology, Narsapur, Telangana, India

Corresponding author: P. R. Karthikeyan, karthikeyanest@gmail.com

Published online: 16 July 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018

Abstract
Foreground detection plays a vital role in finding the moving objects of a scene. Over the last two decades, many methods have been introduced to tackle the problem of illumination variation in foreground detection. In this article, we propose a method to segment moving objects under abrupt illumination change and analyze the merits and demerits of the proposed method against seven other algorithms commonly used for illumination-invariant foreground detection. The proposed method calculates the entropy of the video scene to determine the level of illumination change that has occurred and selects the update model based on the difference in entropy values. Benchmark datasets possessing different challenging illumination conditions are used to analyze the efficiency of the foreground detection algorithms. Experimental studies demonstrate the performance of the proposed algorithm against several algorithms under various illumination conditions, as well as its low time complexity.

Keywords Foreground detection · Illumination invariant · Moving object detection · Background subtraction

1 Introduction

Foreground detection is a basic step and a commonly used approach for segmenting foreground objects in video surveillance applications. Temporal differencing, optical flow and background subtraction are the three methods used to detect foreground objects in a video scene. Of these three techniques, background subtraction is the most widely used because it detects the foreground accurately and is computationally inexpensive. In general, all background subtraction techniques [1–4] model the stationary portion of the video scene as background and compare the current scene with the modeled background to detect foreground objects. Background subtraction algorithms [5] range from simple subtraction of consecutive frames to sophisticated probabilistic models. A simple background subtraction method may not detect foreground objects accurately if the video scene is dynamic. A dynamic video scene may contain complex background objects [6] such as waving trees, varying lighting, chairs, escalators, etc. Foreground detection algorithms play a critical role in applications such as public safety [7] and traffic monitoring systems [8].
Despite the many algorithms proposed [9–15], illumination-invariant foreground detection [16, 17] is far from being completely solved. A good foreground detection algorithm should adapt to gradual as well as sudden illumination changes. Gradual illumination changes may happen because of the time of day (i.e., the movement of the sun), whereas sudden illumination changes may occur when clouds pass over an outdoor scene or lights are switched on or off in an indoor scene. Gradual illumination changes can be handled by existing algorithms to some extent, but sudden changes cannot. Recently, a block-based background modeling algorithm and singular value decomposition (SVD)-based models [18, 19] were presented to detect objects under varying illumination conditions. The block-based model uses the entropy and sum of absolute differences of blocks but suffers from over-segmentation, whereas the SVD-based model uses local structural information but suffers from slow processing of video frames.
Although many works have evaluated the performance of foreground detection algorithms [1–4], to the best of our knowledge this article is the first to address the adaptability and performance of different algorithms under various illumination changes. It provides a comparative study of commonly used foreground detection algorithms and the proposed technique in detecting moving objects under varying illumination, and may help engineers select appropriate algorithms for detecting foreground objects in different kinds of video scenes.
The rest of the article is organized as follows: Sect. 2 describes seven foreground detection algorithms used to deal with illumination variations. Section 3 presents our foreground detection technique. Section 4 reports the experimental results of the various algorithms on three video sequences, and finally, conclusions are drawn in Sect. 5.

2 Foreground detection algorithms

Background subtraction algorithms model the background and find the disparity between the current frame and the background to identify foreground objects. The resulting foreground object is called the motion mask. Most background subtraction models [2] use the following formula to calculate the motion mask:

$$M_t(k,l) = \begin{cases} 1 & \text{if } d\big(I_t(k,l),\, B_t(k,l)\big) > \tau \\ 0 & \text{otherwise} \end{cases} \qquad (1)$$

where M_t(k, l) is the motion mask at time t, d is the disparity between the current frame I_t and the background model B_t, and τ is a threshold that varies among algorithms and usually takes a value in the range [2–6]. If the disparity d is larger than the threshold τ, the pixel location (k, l) is assigned 1; otherwise, it is assigned 0. The various foreground detection algorithms are presented as follows.
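As a concrete illustration, the following is a minimal NumPy sketch of Eq. (1), assuming the absolute difference as the disparity d; the threshold value of 30 is illustrative and not taken from any of the compared algorithms.

```python
import numpy as np

def motion_mask(frame, background, tau=30):
    """Eq. (1): label a pixel as foreground (1) when the disparity between
    the current frame and the background model exceeds the threshold tau.
    The absolute difference is assumed as the disparity d."""
    d = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return (d > tau).astype(np.uint8)
```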

2.1 Frame difference

Frame differencing is the fastest and easiest of all foreground detection methods. The previous frame of a video sequence is taken as the background, and the absolute disparity between the current frame and the previous frame gives the motion mask. Its performance varies with the speed of the moving foreground objects and the chosen threshold value. It can be expressed mathematically as

$$B_t(k,l) = I_{t-1}(k,l) \qquad (2)$$

where B_t(k, l) is the background model at pixel (k, l) and I_{t-1}(k, l) is the previous pixel value at (k, l).
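A sketch of this loop under assumed inputs (the motion_mask() helper above and a placeholder video path):

```python
import cv2

cap = cv2.VideoCapture("video.avi")           # placeholder input path
ok, prev = cap.read()                         # first frame
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    mask = motion_mask(gray, prev)            # Eq. (2): B_t = I_{t-1}
    prev = gray                               # the background is always the previous frame
cap.release()
```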

2.2 Approximated median

The background model is estimated by applying a running median [10] to the incoming frames as follows. Initially, the first image of the video sequence is taken as the background model; each background pixel is then incremented by 1 if the corresponding current pixel intensity is greater than the background pixel, or decremented by 1 if the current pixel intensity is less than the background pixel. This method is computationally inexpensive since it requires only one background image. Its major disadvantages are that it updates the background slowly when sudden changes occur and that stationary foreground objects become background after some time. Mathematically, the approximated median can be expressed as

$$B_t(k,l) = \begin{cases} B_{t-1}(k,l) + 1 & \text{if } B_{t-1}(k,l) < I_t(k,l) \\ B_{t-1}(k,l) - 1 & \text{if } B_{t-1}(k,l) > I_t(k,l) \end{cases} \qquad (3)$$

where B_t(k, l) is the background model at pixel location (k, l), B_{t-1}(k, l) is the previous background model at (k, l) and I_t(k, l) is the current pixel value at (k, l).
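A minimal NumPy sketch of the update rule in Eq. (3) (not the code of [10]):

```python
def approx_median_update(background, frame):
    """Approximated-median update (Eq. 3): nudge every background pixel
    by +/-1 toward the current frame; uint8 arrays in, uint8 array out."""
    b = background.astype(np.int16)
    b += (frame > background)                 # increment where the frame is brighter
    b -= (frame < background)                 # decrement where the frame is darker
    return np.clip(b, 0, 255).astype(np.uint8)
```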

2.3 Single Gaussian

The single Gaussian [9] is the simplest Gaussian model for finding the motion mask. It calculates the mean of the video scene, subtracts the mean frame from every incoming frame and checks whether the disparity is larger than the threshold. If the disparity is greater than ρ times the standard deviation, the pixel is labeled as a moving object; otherwise, it is labeled as background. This model performs well under gradual illumination change but fails when sudden illumination changes occur. After finding the motion mask, it updates the mean value as given in Eq. (5). The single Gaussian can be denoted mathematically as

$$M_t(k,l) = \begin{cases} 1 & \text{if } |I_t(k,l) - \mu_t(k,l)| > \rho\sigma \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

$$\mu_t(k,l) = (1-\alpha)\,\mu_{t-1}(k,l) + \alpha\, I_t(k,l) \qquad (5)$$

where M_t(k, l) denotes the foreground, I_t(k, l) is the current pixel value at (k, l), μ_t(k, l) and μ_{t-1}(k, l) are the current and previous mean values at pixel location (k, l), σ is the standard deviation, ρ is a free parameter and α is a learning rate.
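One step of the single-Gaussian model might be sketched as follows; sigma, rho and alpha are illustrative values, not the parameters tuned in this study.

```python
def single_gaussian_step(frame, mu, sigma=15.0, rho=2.5, alpha=0.01):
    """One single-Gaussian step: classify against the running mean (Eq. 4),
    then update the mean recursively (Eq. 5). mu is a float32 array,
    typically initialized from the first frame."""
    f = frame.astype(np.float32)
    mask = (np.abs(f - mu) > rho * sigma).astype(np.uint8)   # Eq. (4)
    mu = (1.0 - alpha) * mu + alpha * f                      # Eq. (5)
    return mask, mu
```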

2.4 Gaussian mixture model (GMM)

A single Gaussian is sufficient if the scene is static, but in reality the scene may not be, so the authors of [11] proposed multimodal distributions to handle changes in the background. Later, in [12], every pixel was modeled as a mixture of K Gaussians as given in Eq. (6), with K typically in the range [3, 5]. Every pixel is compared with its corresponding K Gaussians to detect the foreground. The probability of I_t under the mixture of K Gaussians is

$$P(I_t) = \sum_{i=1}^{K} \omega_{i,t}\, \eta\big(I_t - \mu_{i,t},\, \Sigma_{i,t}\big) \qquad (6)$$

where η(I_t − μ_{i,t}, Σ_{i,t}) is the ith Gaussian with mean μ_{i,t} and covariance Σ_{i,t}, and ω_{i,t} is its weight; the covariance matrix is assumed to be Σ_{i,t} = σ_i² I. Initially, the first image of the video acts as the background and σ is assumed to be 6. The parameters of the GMM are updated as follows:

$$\omega_{i,t} = (1-\alpha)\,\omega_{i,t-1} + \alpha\, N_{i,t} \qquad (7)$$

$$\mu_{i,t} = (1-\rho)\,\mu_{i,t-1} + \rho\, I_t \qquad (8)$$

$$\sigma_{i,t}^2 = (1-\rho)\,\sigma_{i,t-1}^2 + \rho\,(I_t - \mu_{i,t})^{T}(I_t - \mu_{i,t}) \qquad (9)$$

where α and ρ are learning rates and N_{i,t} is an indicator variable equal to 1 if the ith component is matched and 0 otherwise. Of the K distributions, the first H are taken as the background, where H is estimated as

$$H = \arg\min_{h} \Big( \sum_{i=1}^{h} \omega_i > \tau \Big) \qquad (10)$$

where τ is a threshold. If a pixel intensity lies more than 2.5 standard deviations away from every one of the H background distributions, it is labeled as a foreground pixel.
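A full Stauffer–Grimson implementation is lengthy, but OpenCV ships MOG2, Zivkovic's refinement of the same mixture model, which is a practical way to try the GMM approach; the parameter values below are illustrative, not those used in this study.

```python
import cv2

# MOG2: a mixture-of-Gaussians background subtractor descended from [12].
mog = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                         detectShadows=False)
fg_mask = mog.apply(frame)   # 'frame' is any grayscale or BGR image
```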

2.5 Sigma-delta

The sigma-delta method models the background using the approximated median [10]. In addition, the authors of [15] measure the temporal activity of each pixel by estimating its temporal standard deviation. If the difference image obtained from the approximated median method is greater than the temporal standard deviation, the pixel is labeled as foreground; otherwise, it is labeled as background. Since it involves only elementary increment and decrement operations, this method is well suited for real-time use, but if a foreground object becomes static, it misclassifies the object as background.
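A sketch in the spirit of [15], with an assumed amplification factor N; the activity estimate is updated by +/-1 toward N times the difference image and then used as a per-pixel threshold.

```python
def sigma_delta_step(frame, background, activity, N=4):
    """One sigma-delta step: approximated-median background update plus a
    +/-1 temporal-activity update; all arguments are uint8 arrays."""
    f = frame.astype(np.int16)
    b = background.astype(np.int16)
    b += (f > b)                               # approximated-median update
    b -= (f < b)
    diff = np.abs(f - b)                       # difference image
    v = activity.astype(np.int16)
    v += (N * diff > v)                        # track N times the difference
    v -= (N * diff < v)
    v = np.maximum(v, 1)                       # keep the threshold positive
    mask = (diff > v).astype(np.uint8)         # foreground where activity is exceeded
    return mask, b.astype(np.uint8), np.clip(v, 0, 255).astype(np.uint8)
```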

2.6 ISBS

The ISBS method [13] updates the background according to changes in the luminance level. The background model is estimated as in the approximated median, and entropy is used to detect the luminance change that triggers a background update. The entropy value varies as the video scene becomes darker or brighter: it rises as the scene brightens and falls as the scene darkens. Entropy is estimated from a probability density function (pdf) computed for every incoming video frame as follows:

$$E_t = -\sum_{l=l_{\min}}^{l_{\max}} \mathrm{pdf}(l)\, \log \mathrm{pdf}(l) \qquad (11)$$

$$\mathrm{pdf}(l) = n_l / (M \cdot N) \qquad (12)$$

where E_t is the entropy at time t, l ranges over the intensity values in the video frame, l_min and l_max are the minimum and maximum intensities of the video sequence, n_l is the frequency of intensity value l and M · N is the size of an image.
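Eqs. (11)–(12) amount to the Shannon entropy of the gray-level histogram; a minimal sketch:

```python
def frame_entropy(gray):
    """Shannon entropy of a grayscale (uint8) frame per Eqs. (11)-(12);
    empty histogram bins are skipped, using the convention 0*log(0) = 0."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    pdf = hist / gray.size                     # Eq. (12): n_l / (M*N)
    pdf = pdf[pdf > 0]
    return float(-np.sum(pdf * np.log(pdf)))   # Eq. (11)
```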

2.7 ViBe

In ViBe [14], background pixels are modeled with a set of sample values rather than with a parametric background model. New values coming from the video scene are compared against the background samples. The background at a pixel is modeled by a set of N values:

$$B(k,l) = \{v_1, v_2, \ldots, v_N\} \qquad (13)$$

where B(k, l) denotes the background samples at pixel location (k, l) and {v_1, v_2, …, v_N} are the samples stored at that location. To classify a pixel value v(x) as foreground or background, a sphere S_R(v(x)) of radius R centered at v(x) is defined. The pixel v(x) is identified as background if the number of samples inside the sphere exceeds a threshold:

$$M_t(k,l) = \begin{cases} 0 & \text{if } \#\big\{ S_R(v(k,l)) \cap \{v_1, v_2, \ldots, v_N\} \big\} > \tau \\ 1 & \text{otherwise} \end{cases} \qquad (14)$$
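The classification step of Eq. (14) can be sketched as below; R = 20 and a sample count of 2 are typical values from [14], used here as assumed defaults, and the randomized sample update and spatial propagation that make ViBe complete are omitted.

```python
def vibe_classify(value, samples, R=20, tau=2):
    """ViBe classification (Eq. 14): background (0) when at least tau of
    the N stored samples lie within radius R of the new pixel value."""
    close = np.abs(samples.astype(np.int16) - int(value)) < R
    return 0 if int(np.count_nonzero(close)) >= tau else 1
```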


3 Proposed method

The proposed algorithm calculates the entropy of the video scene to determine the level of illumination change that has occurred in the scene. The first frame of the video is taken as the initial background model; the entropy of each incoming frame is then calculated and compared with the previous entropy value to determine the level of change in the video scene. If the change exceeds the threshold value, the present background model is replaced by the initial background model; otherwise, the present model is updated recursively as in the single Gaussian model. The strategy is to reset the background model to the initial model when a sudden illumination change takes place and to update the background model recursively when a gradual illumination change occurs. The threshold value was found empirically and is set to 0.06. The simplicity of the algorithm not only lets it run in real time but also gives competitive results.
A pseudo-code of the proposed background model is given below, followed by an illustrative implementation sketch.

Proposed Algorithm
Input: n video frames.
Output: foreground objects.
Read video.
Divide the video sequence into frames f1, f2, f3, ..., fn.
Initial background model = f1.
Estimate the entropy of the initial background model.
for i = 2 to n do
    Calculate the entropy of the current frame.
    Compare the entropy of the current frame Ecurrent with the entropy of the previous frame Eprevious.
    if (Ecurrent - Eprevious) > thresh then
        {assign the initial background model to the current background model}
    else
        {update the background model with a recursive filter}
    end if
    Find the foreground by comparing the current frame with the background model.
    Update the entropy value.
end for
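A Python sketch of the pseudo-code is given below. The entropy threshold of 0.06 comes from the paper; alpha and tau are assumed values, frame_entropy() is the helper from Sect. 2.6, and the absolute entropy difference is used here so that both sudden brightening and sudden darkening trigger a model reset.

```python
def proposed_method(frames, thresh=0.06, alpha=0.05, tau=30):
    """Entropy-gated background model: reset to the initial background on a
    sudden illumination change, otherwise update recursively as in the
    single-Gaussian model. frames is a list of grayscale uint8 arrays."""
    initial = frames[0].astype(np.float32)     # initial background model = f1
    background = initial.copy()
    e_prev = frame_entropy(frames[0])
    masks = []
    for f in frames[1:]:
        e_cur = frame_entropy(f)
        if abs(e_cur - e_prev) > thresh:       # sudden change: reset the model
            background = initial.copy()
        else:                                  # gradual change: recursive update
            background = (1.0 - alpha) * background + alpha * f.astype(np.float32)
        masks.append((np.abs(f - background) > tau).astype(np.uint8))
        e_prev = e_cur
    return masks
```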


4 Experimental results

To evaluate the illumination-invariant foreground detection methods, we used three videos from two datasets: the light switch and time of day sequences from the Wallflower dataset [20] and the lobby sequence from the I2R dataset [21]. The time of day video is used to study the algorithms' adaptability to gradual illumination change, whereas the light switch and lobby sequences are used to study their adaptability to sudden illumination change. The following metrics were used to determine the efficiency of the various algorithms:

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (15)$$

$$\text{Precision} = \frac{TP}{TP + FP} \qquad (16)$$

$$\text{F-measure} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (17)$$

where true positives (TP) are foreground pixels rightly identified as foreground, false negatives (FN) are foreground pixels wrongly identified as background, and false positives (FP) are background pixels misclassified as foreground. Recall is the fraction of the actual foreground region that is correctly detected; precision is the fraction of the region labeled as foreground that is actually foreground. A good algorithm should produce both high precision and high recall. F-measure gives a weighted average of recall and precision. It is used in addition to precision and recall because a single correct detection can yield perfect precision, while labeling every pixel as foreground yields perfect recall.
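The three metrics are straightforward to compute from a binary motion mask and its ground truth; a minimal sketch:

```python
def evaluate(mask, truth):
    """Recall, precision and F-measure (Eqs. 15-17) for one binary motion
    mask against its ground truth; both are 0/1 NumPy arrays."""
    tp = np.count_nonzero((mask == 1) & (truth == 1))
    fn = np.count_nonzero((mask == 0) & (truth == 1))
    fp = np.count_nonzero((mask == 1) & (truth == 0))
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return recall, precision, f_measure
```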
Figure 1 shows the moving objects detected by the seven commonly used algorithms and the proposed algorithm on the three video sequences. The motion masks generated by the eight algorithms are shown against the original video and its ground truth. The figure depicts the qualitative results of the various algorithms. The frame differencing algorithm is good at detecting changes that occur at a moderate rate but fails to detect changes that are too slow or too fast. The approximated median algorithm is quite fast in execution but produces many false negatives, and it incorporates the foreground into the background even if an object stops for a short time. Even though the single Gaussian is less complex, it is unable to detect changes consistently and underperforms compared to the other methods.
The GMM method outperforms all other methods in most situations, but it is more complex than the other methods and computationally expensive on much embedded hardware. The sigma-delta method retains the advantages of the approximated median method and, in addition, incorporates a better thresholding technique to detect foreground objects. The ISBS method is computationally inexpensive but fails to detect gradual changes, and considerable noise appears in the resulting frames. The ViBe method, which generally performs well in dynamic scenes, fails to adapt to gradual and sudden illumination changes.


[Figure 1: for each sequence, the rows show the original image, the ground truth, and the motion masks produced by frame difference, approximated median, single Gaussian, GMM, sigma-delta, ISBS, ViBe and the proposed method.]

Fig. 1 Results of foreground detection in the light switch, time of day and lobby video sequences

Table 1 Quantitative analysis of illumination-invariant algorithms

| Sequence | Metric | Frame difference | Approximated median | Single Gaussian | GMM | Sigma-delta | ISBS | ViBe | Proposed method |
|---|---|---|---|---|---|---|---|---|---|
| Light switch | Recall | 0.3150 | 0.1782 | 0.3740 | 0.7685 | 0.7521 | 0.6875 | 0.5563 | 0.6459 |
| Light switch | Precision | 0.7967 | 0.0420 | 0.0799 | 0.1343 | 0.1426 | 0.7420 | 0.1154 | 0.8315 |
| Light switch | F-measure | 0.4515 | 0.0680 | 0.1317 | 0.2286 | 0.2397 | 0.7137 | 0.1911 | 0.7270 |
| Time of day | Recall | 0.2636 | 0.2253 | 0.3873 | 0.6280 | 0.4896 | 0.2483 | 0.2830 | 0.4325 |
| Time of day | Precision | 0.6732 | 1.0000 | 0.9946 | 0.9525 | 0.9684 | 0.6051 | 0.9951 | 0.7746 |
| Time of day | F-measure | 0.3788 | 0.3678 | 0.5576 | 0.7569 | 0.6503 | 0.3521 | 0.4407 | 0.5551 |
| Lobby | Recall | 0.2816 | 0.0874 | 0.3010 | 0.7573 | 0.6621 | 0.4932 | 0.2932 | 0.5515 |
| Lobby | Precision | 0.5598 | 0.0449 | 0.1769 | 0.8647 | 0.9342 | 0.0776 | 0.0216 | 0.7738 |
| Lobby | F-measure | 0.3747 | 0.0593 | 0.2229 | 0.8075 | 0.7750 | 0.1341 | 0.0402 | 0.6440 |

Fig. 2 F-measure results on the three datasets

The proposed method, which is computationally less expensive than GMM and a few of the other methods, performs reasonably well under gradual illumination change, but under sudden illumination change it outperforms all the other algorithms. The observations in Fig. 1 show that GMM performs best under gradual illumination change, while the proposed algorithm performs best under sudden illumination change.
To assess the algorithms quantitatively, three metrics were used: recall, precision and F-measure. Table 1 lists the quantitative measurements of the various methods on each tested video sequence. Note that all the metrics lie in the range 0–1, with lower values representing worse performance and higher values better performance. The table shows that GMM performs better under gradual illumination change but fails under sudden illumination change. Sigma-delta and the approximated median perform fairly well under gradual illumination change but fail under sudden illumination change, whereas ISBS and the proposed method perform better under sudden illumination change; the proposed method also performs reasonably well under gradual illumination change, while ISBS does not. The F-measure results are plotted in Fig. 2 to show the performance of the various algorithms on the three sequences; the figure makes clear that the proposed algorithm performed equally well on all three, whereas the other methods did not.
To demonstrate the time complexity of the algorithms in this study, we measured the number of frames each algorithm processes per second (FPS). The algorithms were implemented on a Core i5 system with 3 GB of RAM. Table 2 clearly shows that the proposed method can run in real time. Even though GMM performed better under slow illumination changes, it is unable to process image sequences in real time. The detection performance of frame difference and ViBe is not up to the level of the other algorithms in this comparison. Overall, the proposed method and GMM outperform the other algorithms in most conditions, and the proposed algorithm has a competitive edge over GMM where real-time implementation is concerned.

Table 2 Comparison of FPS of various approaches

| Sequence | Resolution | Frame difference | Approximated median | Single Gaussian | GMM | Sigma-delta | ISBS | ViBe | Proposed method |
|---|---|---|---|---|---|---|---|---|---|
| Light switch | 120 × 160 | 106 | 90 | 62 | 9 | 72 | 52 | 41 | 84 |
| Time of day | 120 × 160 | 116 | 96 | 68 | 12 | 78 | 57 | 45 | 93 |
| Lobby | 128 × 160 | 102 | 83 | 58 | 10 | 67 | 46 | 36 | 80 |

5 Conclusion

In this article, we presented a comparative study of seven commonly used illumination-invariant foreground detection methods and a proposed method. The algorithms were compared on their ability to detect foreground objects under sudden and gradual illumination changes, using recall, precision and F-measure for quantitative comparison. The study clearly shows that the overall performance of the proposed algorithm is better than that of the other compared algorithms. The experimental outcomes indicate that GMM is able to adapt only to gradual illumination variations, whereas the proposed method performed better than the other algorithms under sudden illumination changes and satisfactorily under gradual illumination changes. In terms of computation time, the proposed algorithm outperformed GMM and is more suitable for real-time applications.

References
1. Piccardi M (2004) Background subtraction techniques: a review. IEEE Int Conf Syst Man Cybern
4:3099–3104
2. Benezeth Y, Jodoin PM, Emile B, Laurent H, Rosenberger C (2012) Comparative study of background subtraction algorithms. J Electron Imaging 19(3):12
3. Ahmed SH, El-Sayed KM, Elhabian SY (2008) Moving object detection in spatial domain using background removal techniques: state-of-art. Recent Pat Comput Sci 1(1):32–54
4. Bouwmans T (2014) Traditional and recent approaches in background modeling for foreground detection: an overview. Comput Sci Rev 11(12):31–66
5. Cheung SC, Kamath C (2004) Robust techniques for background subtraction in urban traffic video. In: Proceedings of the SPIE 5308, Visual Communications and Image Processing
6. Li L, Huang W, Gu IYH, Tian Q (2003) Foreground object detection from videos containing complex background. In: ACM International Conference on Multimedia, pp 2–10
7. Wahyono, Filonenko A, Jo KH (2016) Unattended object identification for intelligent surveillance systems using sequence of dual background difference. IEEE Trans Ind Inf 12(6):2247–2255
8. Wang K, Liu Y, Gou C, Wang FY (2016) A multi-view learning approach to foreground detection for
traffic surveillance applications. IEEE Trans Veh Technol 65(6):4144–4158
9. Wren CR, Azarbayejani A, Darrell T, Pentland AP (1997) Pfinder: real-time tracking of the human
body. IEEE Trans Pattern Anal Mach Intell 19(7):780–785
10. McFarlane NJB, Schofield CP (1995) Segmentation and tracking of piglets in images. Mach Vis Appl
8(3):187–193
11. Friedman N, Russell S (1997) Image segmentation in video sequences: a probabilistic approach. In: Thirteenth Conference on Uncertainty in Artificial Intelligence, pp 175–181
12. Stauffer C, Grimson WEL (1999) Adaptive background mixture models for real-time tracking. IEEE
Comput Soc Conf Comput Vis Pattern Recogn 2:252
13. Cheng FC, Huang SC, Ruan SJ (2011) Illumination-sensitive background modeling approach for
accurate moving object detection. IEEE Trans Broadcast 57(4):794–801
14. Barnich O, Van Droogenbroeck M (2011) ViBe: a universal background subtraction algorithm for
video sequences. IEEE Trans Image Process 20(6):1709–1724
15. Manzanera A, Richefeu JC (2004) A robust and computationally efficient motion detection algorithm based on Σ–Δ background estimation. In: Indian Conference on Computer Vision, Graphics and Image Processing, pp 46–51
16. Lou J, Yang H, Hu W, Tan T (2002) An illumination invariant change detection algorithm. In: Asian Conference on Computer Vision, pp 13–18
17. Holtzhausen PJ, Crnojevic V, Herbst BM (2015) An illumination invariant framework for real-time
foreground detection. J Real Time Image Process 10(2):423–433


18. Elharrouss O, Abbad A, Moujahid D, Tairi H (2018) Moving object detection zone using a block-based
background model. IET Comput Vis 12(1):86–94
19. Kim W, Kim Y (2016) Background subtraction using illumination-invariant structural complexity.
IEEE Signal Process Lett 23(5):634–638
20. Toyama K, Krumm J, Brumitt B, Meyers B (1999) Wallflower: principles and practice of background
maintenance. IEEE Int Conf Comput Vis 1:255–261
21. Li L, Huang W, Gu IYH, Tian Q (2004) Statistical modeling of complex backgrounds for foreground object detection. IEEE Trans Image Process 13(11):1459–1472

