Trainable COSFIRE Filters For Keypoint Detection and Pattern Recognition
George Azzopardi and Nicolai Petkov
University of Groningen
Abstract—Background: Keypoint detection is important for many computer vision applications. Existing methods suffer from
insufficient selectivity regarding the shape properties of features and are vulnerable to contrast variations and to the presence of noise
or texture. Methods: We propose a trainable filter which we call Combination Of Shifted FIlter REsponses (COSFIRE) and use for
keypoint detection and pattern recognition. It is automatically configured to be selective for a local contour pattern specified by an
example. The configuration comprises selecting given channels of a bank of Gabor filters and determining certain blur and shift
parameters. A COSFIRE filter response is computed as the weighted geometric mean of the blurred and shifted responses of the
selected Gabor filters. It shares similar properties with some shape-selective neurons in visual cortex, which provided inspiration for
this work. Results: We demonstrate the effectiveness of the proposed filters in three applications: the detection of retinal vascular
bifurcations (DRIVE dataset: 98.50 percent recall, 96.09 percent precision), the recognition of handwritten digits (MNIST dataset:
99.48 percent correct classification), and the detection and recognition of traffic signs in complex scenes (100 percent recall and
precision). Conclusions: The proposed COSFIRE filters are conceptually simple and easy to implement. They are versatile keypoint
detectors and are highly effective in practical computer vision applications.
Index Terms—Feature detection, feature representation, medical information systems, object recognition, optical character
recognition, shape
1 INTRODUCTION
Fig. 1. Examples of corners and junction patterns marked in (a) photographic images and (b) their enlargements.

Fig. 3. (a) Synthetic input image (of size 256 × 256 pixels). The circle indicates a prototype feature of interest that is manually selected by a user. (b) Enlargement of the selected feature. The ellipses represent the support of line detectors that are identified as relevant for the concerned feature.
aspect of shape by a human observer, but show differences in contrast and/or contain texture, Figs. 2c, 2d.

In this paper, we are interested in the detection of contour-based patterns. We introduce trainable keypoint detection operators that are configured to be selective for given local patterns defined by the geometrical arrangement of contour segments. The proposed operators are inspired by the properties of a specific type of shape-selective neuron in area V4 of visual cortex which exhibit selectivity for parts of (curved) contours or for combinations of line segments [15], [16].

We call the proposed keypoint detector Combination Of Shifted FIlter REsponses (COSFIRE) filter as the response of such a filter in a given point is computed as a function of the shifted responses of simpler (in this case orientation-selective) filters. Using shifted responses of simpler filters, Gabor filters in this study, corresponds to combining their respective supports at different locations to obtain a more sophisticated filter with a bigger support. The specific function that we use here to combine filter responses is the weighted geometric mean, essentially multiplication, which has specific advantages regarding shape recognition and robustness to contrast variations. This model design decision is mainly motivated by the better results obtained using multiplication versus addition. It gets further support by psychophysical evidence [17] that curved contour parts are likely detected by a neural mechanism that multiplies the responses of afferent subunits (sensitive for different parts of the curve pattern). Due to the multiplicative character of the output function, a COSFIRE filter produces a response only when all constituent parts of a pattern of interest are present.

A COSFIRE filter is conceptually simple and straightforward to implement: It requires the application of selected Gabor filters, Gaussian blurring of their responses, shifting of the blurred responses by specific, different vectors, and multiplying the shifted responses. The questions of which Gabor filters to use, how much to blur their responses, and how to shift the blurred responses are answered in a COSFIRE filter configuration process in which a local pattern of interest that defines a keypoint is automatically analyzed. The configured COSFIRE filter can then successfully detect the same and similar patterns. We also show how the proposed COSFIRE filters can achieve invariance to rotation, scale, reflection, and contrast inversion.

The rest of the paper is organized as follows: In Section 2, we present the COSFIRE filter and demonstrate how it can be trained and used to detect local contour patterns. In Section 3, we demonstrate the effectiveness of the proposed trainable COSFIRE filters in three practical applications: the detection of vascular bifurcations in retinal fundus images, the recognition of handwritten digits, and the detection and recognition of traffic signs in complex scenes. Section 4 contains a discussion of some aspects of the proposed approach and highlights the differences that distinguish it from other approaches. Finally, we draw conclusions in Section 5.

2 METHOD

2.1 Overview

The following example illustrates the main idea of our method. Fig. 3a shows an input image containing three vertices. We consider the encircled vertex, which is shown enlarged in Fig. 3b, as a (prototype) pattern of interest and use it to automatically configure a COSFIRE filter that will respond to the same and similar patterns.

The two ellipses shown in Fig. 3b represent the dominant orientations in the neighborhood of the specified point of interest. We detect such lines by symmetric Gabor filters. The central circle represents the overlapping supports of a group of such filters. The response of the proposed COSFIRE detector is computed by combining the responses of these Gabor filters in the centers of the corresponding ellipses by multiplication. The preferred orientations of these filters and the locations at which we take their responses are determined by analyzing the local prototype pattern used for the configuration of the COSFIRE filter concerned. Consequently, the filter is selective for the presented local spatial arrangement of lines of specific orientations and widths. Taking the responses of Gabor

Fig. 2. (a) Prototype pattern. (b) Test pattern which has 50 percent similarity (computed by template matching) to the prototype. (c), (d) Test patterns that have only 30 percent similarity to the prototype due to (c) contrast differences and (d) presence of texture. From a shape detection point of view, the patterns in (c) and (d) are more similar to the prototype in (a) than the pattern in (b). This example shows the shortcomings of other models that are based on distance or dissimilarity of descriptors. The local image pattern is used as a descriptor in this example. Methods that compute local descriptors only shift the problem to a feature space.
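The configuration step sketched in this overview, selecting dominant Gabor channels and the positions around the point of interest at which their responses are taken, can be illustrated as follows. This is our own minimal sketch in Python (the authors used Matlab): the function name, the dictionary data structure, and the local-maximum criterion along each circle are illustrative assumptions, not the published implementation.

```python
import numpy as np

def configure_cosfire(gabor_responses, center, rhos, n_phi=16, t2=0.75):
    """Sketch of COSFIRE configuration: around `center`, sample each Gabor
    response map along circles of the given radii and keep, per sampled
    polar angle, the strongest channel if it is a local maximum along the
    circle and a significant fraction (t2) of the global maximum response.

    gabor_responses: dict mapping (lam, theta) -> 2D numpy response map.
    Returns a list of tuples (lam, theta, rho, phi) describing the filter.
    """
    cx, cy = center
    global_max = max(float(r.max()) for r in gabor_responses.values())
    tuples = []
    for rho in rhos:
        strengths = np.zeros(n_phi)
        channels = [None] * n_phi
        for k in range(n_phi):
            phi = 2.0 * np.pi * k / n_phi
            x = int(round(cx + rho * np.cos(phi)))  # sample point on the circle
            y = int(round(cy + rho * np.sin(phi)))
            for (lam, theta), resp in gabor_responses.items():
                if 0 <= y < resp.shape[0] and 0 <= x < resp.shape[1] \
                        and resp[y, x] > strengths[k]:
                    strengths[k] = resp[y, x]
                    channels[k] = (lam, theta)
        for k in range(n_phi):
            neighbors = max(strengths[k - 1], strengths[(k + 1) % n_phi])
            if channels[k] and strengths[k] >= neighbors \
                    and strengths[k] >= t2 * global_max:
                lam, theta = channels[k]
                tuples.append((lam, theta, rho, 2.0 * np.pi * k / n_phi))
    return tuples
```

Each returned tuple records one contour part by the preferred wavelength and orientation of the Gabor filter that detects it, together with the polar coordinates of its position relative to the point of interest.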
492 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 35, NO. 2, FEBRUARY 2013
Fig. 5. (a) Input image (of size 256 × 256 pixels). The enframed inlay images show (top) the enlarged prototype feature of interest, which is the vertex encircled in the input image, and (bottom) the structure of the COSFIRE filter that is configured for this feature. This filter is trained to detect the local spatial arrangement of four contour parts. The ellipses illustrate the wavelengths and orientations of the Gabor filters, and the bright blobs are intensity maps for the Gaussian functions that are used to blur the responses of the corresponding Gabor filters. The blurred responses are then shifted by the corresponding vectors. (b) Each contour part of the input pattern is detected by a Gabor filter with a given preferred wavelength λ_i and orientation θ_i. Two of these parts (i ∈ {1, 2}) are detected by the same Gabor filter and the other two parts (i ∈ {3, 4}) are detected by another Gabor filter; therefore, only two distinct Gabor filters are selected from the filter bank. (c) We then blur the thresholded (here at t1 = 0.2) response |g_{λi,θi}(x, y)|_{t1} of each concerned Gabor filter and subsequently shift the resulting blurred response images by the corresponding polar-coordinate vectors (ρ_i, φ_i + π). (d) Finally, we obtain the output of the COSFIRE filter by computing the weighted geometric mean (here σ̂ = 25.48) of all the blurred and shifted thresholded Gabor filter responses. The marker indicates the location of the specified point of interest. The two local maxima in the output of the COSFIRE filter correspond to the two similar vertices in the input image.
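The pipeline of Fig. 5 (threshold, blur, shift by (ρ_i, φ_i + π), weighted geometric mean) can be sketched numerically as below. This is an illustrative simplification, not the published implementation: Gaussian blurring is omitted, shifts are rounded to whole pixels, and we assume Gaussian weights that decrease with ρ, governed by the parameter σ̂.

```python
import numpy as np

def cosfire_response(gabor_responses, tuples, sigma_hat=25.48):
    """Sketch: shift each (thresholded) Gabor response by the polar vector
    (rho, phi + pi) and combine the shifted responses by a weighted
    geometric mean. Blurring is omitted and shifts are integer-valued to
    keep the example dependency-free."""
    parts, weight_sum = [], 0.0
    for (lam, theta, rho, phi) in tuples:
        resp = gabor_responses[(lam, theta)]
        dx = int(round(rho * np.cos(phi + np.pi)))  # shift toward the center
        dy = int(round(rho * np.sin(phi + np.pi)))
        shifted = np.roll(np.roll(resp, dy, axis=0), dx, axis=1)
        omega = np.exp(-rho**2 / (2.0 * sigma_hat**2))  # assumed Gaussian weight
        parts.append(np.clip(shifted, 0.0, None) ** omega)
        weight_sum += omega
    # multiplicative combination: zero wherever any constituent part is absent
    return np.prod(parts, axis=0) ** (1.0 / weight_sum)
```

With two parts placed at (ρ = 3, φ = 0) and (ρ = 3, φ = π/2), a response appears only at locations where both shifted maps are nonzero, reproducing the all-parts-present behavior discussed in the Introduction.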
For each tuple (λ_i, θ_i, ρ_i, φ_i) in the original filter S_f that describes a certain local contour part, we provide a counterpart tuple (λ_i, θ_i + ψ, ρ_i, φ_i + ψ) in the new set ℜ_ψ(S_f). The orientation of the concerned contour part and its polar angle position with respect to the center of the filter are offset by an angle ψ relative to the values of the corresponding parameters of the original part.

Fig. 6c shows the responses r_{ℜ_ψ(S_f)} of the COSFIRE filter that corresponds to ℜ_ψ(S_f) to the set of elementary features shown in Fig. 6a. This filter responds selectively to a version of the original prototype feature f rotated counterclockwise at an angle of (ψ =) π/2. It is, however, configured by manipulating the set of parameter value combinations, rather than by computing them from the responses to a rotated version of the original prototype pattern f.

A rotation-invariant response is achieved by taking the maximum value of the responses of filters that are obtained with different values of the parameter ψ:

  r̂_{S_f}(x, y) = max_{ψ ∈ Ψ} { r_{ℜ_ψ(S_f)}(x, y) },   (5)

where Ψ is a set of n_ψ equidistant orientations defined as Ψ = {2πi/n_ψ | 0 ≤ i < n_ψ}. Fig. 6d shows the maximum superposition r̂_{S_f}(x, y) for n_ψ = 16. The filter according to (5) produces the same response to local patterns that are rotated versions of each other, obtained by rotation at discrete angles ψ ∈ Ψ.

As to the response of the filter to patterns that are rotated at angles of intermediate values between those in Ψ, it depends on the orientation selectivity of the filter S_f, which is influenced by the orientation bandwidth of the involved Gabor filters and by the value of the parameter α in (1). Fig. 7 illustrates the orientation selectivity of the COSFIRE filter which is configured with the enframed local prototype pattern in Fig. 6a using α = 0.1. A maximum response is obtained for the local prototype pattern that was used to
Fig. 6. (a) A set of elementary features. The enframed feature is used as a prototype for configuring a COSFIRE filter. (b) Responses of the configured filter rendered by shading of the features. (c) Responses of a rotated version (ψ = π/2) of the filter obtained by manipulation of the filter parameters. (d) Rotation-invariant responses for 16 discrete orientations.
configure this filter. The response declines with the deviation of the orientation of the local input pattern from the optimal one and practically disappears when this deviation is greater than π/8. When the deviation of the orientation is π/16, the response of the filter is approximately half of the maximum response. This means that the half-response bandwidth of this COSFIRE filter is π/8. Thus, n_ψ = 16 distinct preferred orientations (in intervals of π/8) ensure sufficient response for any orientation of the feature used to configure the filter.

As demonstrated by Fig. 6d, when the concerned filter is applied in rotation-invariant mode (n_ψ = 16), it responds selectively to the prototype pattern, a right angle, independently of the orientation of the angle.

2.6.2 Scale Invariance

Scale invariance is achieved in a similar way. Using the set S_f that defines the concerned filter, we form a new set T_υ(S_f) that defines a new filter which is selective for a version of the prototype feature f that is scaled in size by a factor υ:

  T_υ(S_f) = {(υλ_i, θ_i, υρ_i, φ_i) | ∀ (λ_i, θ_i, ρ_i, φ_i) ∈ S_f}.   (6)

For each tuple (λ_i, θ_i, ρ_i, φ_i) in the original filter S_f that describes a certain local contour part, we provide a counterpart tuple (υλ_i, θ_i, υρ_i, φ_i) in the new set T_υ(S_f). The width of the concerned contour part and its distance to the center of the filter are scaled by the factor υ relative to the values of the corresponding parameters of the original part. A scale-invariant response is achieved by taking the maximum value of the responses of filters that are obtained with different values of the parameter υ:

  r̃_{S_f}(x, y) = max_{υ ∈ Υ} { r_{T_υ(S_f)}(x, y) },   (7)

where Υ is a set of values equidistant on a logarithmic scale defined as Υ = {2^{i/2} | i ∈ ℤ}.

2.6.3 Reflection Invariance

As to reflection invariance, we first form a new set S̆_f from the set S_f as follows:

  S̆_f = {(λ_i, π − θ_i, ρ_i, π − φ_i) | ∀ (λ_i, θ_i, ρ_i, φ_i) ∈ S_f}.   (8)

The new filter which is defined by the set S̆_f is selective for a reflected version of the prototype feature f about the y-axis. A reflection-invariant response is achieved by taking the maximum value of the responses of the filters S_f and S̆_f:

  r̆_{S_f}(x, y) = max { r_{S_f}(x, y), r_{S̆_f}(x, y) }.   (9)

Fig. 7. Orientation selectivity of a COSFIRE filter that is configured with a right-angle vertex.
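The parameter-set manipulations behind (5)-(9) are simple tuple transformations, which the sketch below makes explicit. Here `respond` stands for any function mapping a parameter set to a response map; the function names are ours, and the reflection convention (θ → π − θ, φ → π − φ for reflection about the y-axis) is our assumption based on the description above.

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def rotate_set(S_f, psi):
    """R_psi(S_f): offset each orientation and each polar angle by psi."""
    return [(lam, (th + psi) % TWO_PI, rho, (phi + psi) % TWO_PI)
            for (lam, th, rho, phi) in S_f]

def scale_set(S_f, upsilon):
    """T_upsilon(S_f), Eq. (6): wavelength and radius scaled by upsilon."""
    return [(upsilon * lam, th, upsilon * rho, phi)
            for (lam, th, rho, phi) in S_f]

def reflect_set(S_f):
    """Reflected set of Eq. (8); assuming reflection about the y-axis
    maps theta -> pi - theta and phi -> pi - phi."""
    return [(lam, (np.pi - th) % TWO_PI, rho, (np.pi - phi) % TWO_PI)
            for (lam, th, rho, phi) in S_f]

def invariant_response(respond, S_f, n_psi=16, upsilons=(2**-0.5, 1.0, 2**0.5)):
    """Eqs. (5), (7), (9): pointwise maximum response over rotated,
    scaled, and reflected versions of the parameter set."""
    variants = []
    for base in (S_f, reflect_set(S_f)):
        for i in range(n_psi):
            for u in upsilons:
                variants.append(scale_set(rotate_set(base, TWO_PI * i / n_psi), u))
    return np.max([respond(v) for v in variants], axis=0)
```

Because the invariances act on the parameter sets rather than on the image, one configuration supports any combination of rotation, scale, and reflection invariance at application time.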
3 APPLICATIONS

In the following, we demonstrate the effectiveness of the proposed COSFIRE filters by applying them in three practical applications: the detection of vascular bifurcations in retinal fundus images, the recognition of handwritten digits, and the detection and recognition of traffic signs in complex scenes.

Fig. 8. (a) Synthetic input image (of size 256 × 256 pixels). (b) The structure of a COSFIRE filter that is configured using the encircled pattern in (a) with three values of ρ (ρ ∈ {0, 12, 30}) and σ0 = 2.5. (c) Rotation-invariant response r̂_{S_f} of the COSFIRE filter (here σ̂ = 25.48).

Fig. 9. Example of a retinal fundus image from the DRIVE dataset. (a) Original image (of size 564 × 584 pixels) with filename 21_training.tif. (b) Binary segmentation of vessels and background (also from DRIVE). The typical widths of blood vessels vary between 1 and 7 pixels. This range of width values determines our choice of the values of the wavelength λ used in the bank of Gabor filters. The circles surround Y- and T-formed vessel bifurcations and crossings. (c), (d) Superposition of the responses of a bank of symmetric Gabor filters with a threshold (c) t1 = 0 and (d) t1 = 0.2.
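The superposition shown in Figs. 9c and 9d can be sketched as a pointwise maximum over the filter bank, with responses below the fraction t1 of the strongest response suppressed. This reading of the t1 thresholding, and the function name, are our illustrative assumptions.

```python
import numpy as np

def thresholded_superposition(bank_responses, t1):
    """Pointwise maximum over a bank of Gabor response maps; responses
    below the fraction t1 of the global maximum are set to zero."""
    sup = np.max(np.stack(bank_responses), axis=0)
    return np.where(sup >= t1 * sup.max(), sup, 0.0)
```

With t1 = 0 every response survives (Fig. 9c); with t1 = 0.2 weak responses, e.g. from noise or faint texture, are removed (Fig. 9d).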
junction regions and suppress the undesirable responses of Gabor filters, Fig. 9d.

Next, we select a vascular bifurcation that we use to configure a COSFIRE filter. In practice, the selection is done by specifying a region of appropriate size centered at the concerned feature. Fig. 10a illustrates the selection of one such region that is shown enlarged in Fig. 10b. In the following, we denote this prototype feature by f1. Fig. 10c shows the structure of a COSFIRE filter S_f1 that is configured for the specified feature. For the configuration of this filter, we use three values of the radius ρ (ρ ∈ {0, 4, 10}).

Fig. 11 shows the results that are obtained by the application of filter S_f1 (σ̂ = 8.49) in different modes to the binary retinal fundus image shown in Fig. 10a. For this filter, we use a threshold value of t3 = 0.21 as it produces the largest number of correctly detected bifurcations and no falsely detected features. The encircled regions4 are centered on the local maxima of the filter response and if two such regions overlap by 75 percent, only the one with the stronger response is shown.

When no invariance is used (Fig. 11a), the filter S_f1 detects four vascular bifurcations, one of which is the prototype pattern that was used to configure this filter. When the filter is applied in a rotation-invariant mode (ψ ∈ {iπ/8 | i = 0 ... 7}) it detects 24 features. With the addition of scale invariance (υ ∈ {2^{−1/2}, 1, 2^{1/2}}) the filter detects 34 features, and with the inclusion of reflection invariance the COSFIRE filter S_f1 detects 67 bifurcations. These results illustrate how invariance to such geometric transformations can be used to boost the performance of a COSFIRE filter. It also shows the strong generalization capability of this approach because 62.62 percent (67 out of 107) of the features of interest are detected by one filter.

As to the remaining features that are not detected by the filter corresponding to feature f1, we proceed as follows: We take one of these features that we denote by f2 (Fig. 12) and train a second COSFIRE filter S_f2 using it. With this second filter we detect 50 features of interest of which 35 overlap with features detected by the filter S_f1 and 15 are newly detected features (t3(S_f2) = 0.25). Applying the two filters together results in the detection of 82 distinct features. We continue adding filters that are configured using features that have not been detected by the previously trained filters. By configuring another two COSFIRE filters, S_f3 and S_f4 (Fig. 12), and using them together with the other two filters we achieve 100 percent recall and 100 percent precision for the concerned image. This means that all 107 features shown in Fig. 9b are correctly detected and that there are no false responses of the filters.

We use an individual threshold value t3(S_fi) for each COSFIRE filter S_fi by setting it to the smallest number for which the precision is still 100 percent for the training image.

We apply the same four COSFIRE filters on a dataset (DRIVE) of 40 binary retinal images5 and evaluate the obtained results with the ground truth data6 that was defined by the authors of this paper. The recall R and the precision P that we achieve depend on the values of the threshold parameters t3(S_fi): P increases and R decreases with increasing values of t3(S_fi). For each COSFIRE filter we add to (or subtract from) the corresponding learned threshold value t3(S_fi) the same offset value. With the referred four COSFIRE filters, the harmonic mean (2PR/(P + R)) of precision and recall reaches a maximum at a recall R of 95.58 percent and a precision P of 95.25 percent when each t3(S_fi) is offset by the same amount of +0.05 from the corresponding learned threshold value. We extend our experiments by configuring up to eight COSFIRE filters of which the new four filters are configured for four prototype features taken from the same retinal image with filename 04_manual1.gif. We achieve the best results for six filters (Fig. 12) and show them together with the results for four filters in Fig. 13. With six filters the maximum harmonic mean is reached at a recall R of 98.50 percent and a precision P of 96.09 percent when the corresponding learned t3(S_fi) values are offset by the same amount of +0.07. We made this application available on the Internet.7

The optimal results for four and six COSFIRE filters are reached for a very small value of the offset: +0.05 and +0.07, respectively. This shows that the learned threshold values that are determined individually for each filter give results near to the optimal that may be expected.

In principle, all vascular bifurcations can be detected if a sufficient number of filters are configured and used. Furthermore, the precision can be improved by performing additional morphological analysis of the features that are detected by the filters. Even without these possible improvements, our results are better than those achieved in [33] where a recall of 95.82 percent was reported on a small dataset of five retinal images only.

3.2 Recognition of Handwritten Digits

Handwritten digit recognition is a challenging task in the community of pattern recognition which has various commercial applications, such as bank check processing and postal mail sorting. It has been used as a benchmark for comparing shape recognition methods. Feature extraction plays a significant role in the effectiveness of such systems. A detailed review of the state-of-the-art methods is given in [34]. In the following, we show how the proposed trainable COSFIRE filters can be configured to detect specific parts of

Fig. 10. Configuration of a COSFIRE filter. (a) The circle indicates a bifurcation feature f1 selected for the configuration of the filter. (b) Enlargement of the selected feature. (c) Structure of the COSFIRE filter S_f1 configured for the specified bifurcation. The ellipses illustrate the involved Gabor filters and the positions in which their responses are taken.

4. The radius of the circle is the sum of the maximum value of the radial parameter ρ and the blur radius used at this value of ρ.
5. Named in DRIVE 01_manual1.gif, ..., 40_manual1.gif.
6. The ground truth data (coordinates of bifurcations and cross overs) can be downloaded from http://www.cs.rug.nl/~imaging/databases/retina_database.
7. http://matlabserver.cs.rug.nl/RetinalVascularBifurcations.
Fig. 11. Results of using the filter Sf1 in different modes: (a) noninvariant, (b) rotation-invariant, (c) rotation- and scale-invariant, and (d) rotation-,
scale-, and reflection-invariant. The number of correctly detected features (TP—true positives) increases as the filter achieves invariance to such
geometric transformations.
handwritten digits. Consequently, the collective responses of multiple such filters can be used as a shape descriptor of a given handwritten digit. We use the well-known modified NIST (MNIST) dataset [35] to evaluate the performance of this approach. This dataset comprises 60,000 training and 10,000 test digits8 where each digit is given as a grayscale image of size 28 × 28 pixels, Fig. 14.

In the configuration step, we choose a random subset of digit images from each digit class. For each such digit image we choose a random location in the image and use the local stroke pattern around that location to configure a COSFIRE filter. We use a given randomly selected location for the configuration of a COSFIRE filter only if that filter consists of at least four tuples; otherwise, we choose a different location. We impose this restriction in order to avoid the selection of small digit fragments as prototype patterns, which may consequently result in filters with low discriminative power. We provide further comments on the discriminative abilities of these COSFIRE filters in Section 4. For this application, we use three values of ρ (ρ ∈ {0, 3, 8}), t2 = 0.75, σ0 = 0.83, α = 0.1, and a bank of antisymmetric Gabor filters with 16 equidistant orientations (θ ∈ {iπ/8 | i = 0 ... 15}) and one wavelength (λ = 2√2). Fig. 15 illustrates the configuration of four such COSFIRE filters using local prototype patterns (parts of digits) that are randomly selected from four handwritten digits.

We perform a number of experiments with different values of the threshold parameter t1 (t1 ∈ {0, 0.05, 0.1, 0.15}). The values of the other parameters mentioned above are kept fixed for all experiments. For each value of t1, we run an experiment by configuring up to 500 COSFIRE filters per digit class. We repeat such an experiment five times and report the average recognition rate. Repetition of experiments is necessary in order to compensate for the random selection of training digit images and the random selection of locations within these images that are used to configure the concerned filters.

After the configuration of a certain number of COSFIRE filters, every digit to which the set of these filters is applied can be described by a vector where each element corresponds to the maximum response of a COSFIRE filter across all locations in the input image. For instance, with 500 filters per digit class and 10 digit classes, a digit image to which this set of 5,000 COSFIRE filters is applied is described by a vector of 5,000 elements. For this application, the responses of the concerned Gabor filters provide equal contribution (1/σ̂ = 0) to the output of the corresponding COSFIRE filter.

The feature vectors obtained for the digit images of the training set are then used to train an all-pairs multiclass (with majority vote) support vector machine (SVM) classifier with a linear kernel. In Fig. 16a, we plot the recognition rates that we achieve for different values of the threshold t1 and for different numbers of COSFIRE filters used. We achieve a maximum recognition rate of 99.40 percent with 4,500 COSFIRE filters, where the filters are used in a noninvariant mode, i.e., without compensation for possible pattern

Fig. 12. (Top row) A set of six bifurcations and (bottom row) the structures of the corresponding six COSFIRE filters. The first four bifurcations are taken from the binary retinal image shown in Fig. 10a with filename 21_manual1.gif and the last two bifurcations are extracted from the retinal image with filename 04_manual1.gif. The following are the learned threshold values: t3(S_f1) = 0.21, t3(S_f2) = 0.25, t3(S_f3) = 0.36, t3(S_f4) = 0.29, t3(S_f5) = 0.17, and t3(S_f6) = 0.25.

Fig. 13. Precision-recall plots obtained with four and six COSFIRE filters. For each plot the threshold parameter t3 of each filter is varied by adding the same offset (ranging between −0.1 and 0.1) to the corresponding learned threshold value. The precision P increases and the recall R decreases with an increasing offset value. The harmonic mean (often used as a single measure of performance) of R and P reaches a maximum at R = 98.50 percent and P = 96.09 percent with six filters and at R = 95.58 percent and P = 95.25 percent for four filters. These points are marked by circle and square markers, respectively.

8. The MNIST dataset is available online: http://yann.lecun.com/exdb/mnist.
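The shape descriptor described in Section 3.2 is simply the vector of per-filter maximum responses; the resulting vectors can then be fed to an all-pairs linear SVM (for example, scikit-learn's SVC with a linear kernel, whose multiclass mode is one-vs-one). The helper below is an illustrative sketch, with each filter represented as a function from an image to a response map.

```python
import numpy as np

def shape_descriptor(image, cosfire_filters):
    """Describe a digit image by the maximum response of each COSFIRE
    filter over all image locations. With 500 filters per class and
    10 classes, the descriptor has 5,000 elements."""
    return np.array([f(image).max() for f in cosfire_filters])
```

Taking the maximum over all locations makes the descriptor independent of where in the 28 × 28 image a stroke fragment occurs, which is why the filters can be applied in a noninvariant mode.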
Fig. 17. Three reference traffic signs: (a) an intersection, (b) compulsory give-way for bikes, and (c) a pedestrian crossing. (d)-(f) The structures of the corresponding COSFIRE filters determined by the following parameter values: ρ ∈ {0, 2, 4, 7, 10, 13, 16, 20, 25}, σ0 = 0.67, α = 0.04, λ = 4, and θ ∈ {iπ/8 | i = 1 ... 15}.

Fig. 18. (a) Input image with filename crossing_004.png. (b) Superposition of thresholded responses (t1 = 0.1) of a bank of antisymmetric Gabor filters (λ = 4 and θ ∈ {iπ/8 | i = 0 ... 15}) with isotropic surround suppression (inhibition factor is 2). (c) Superposition of the thresholded responses of the three COSFIRE filters. (d) Correct detection and recognition of two traffic signs. The cross markers indicate the locations of the two local maxima responses, each surrounded with a circle that represents the support of the corresponding COSFIRE filter (the continuous circle represents the intersection sign and the dashed circle represents the pedestrian crossing sign).

48 images. For each color image, we first convert it to grayscale and subsequently apply the filters. The antisymmetric Gabor filters that we use to provide inputs to the COSFIRE filters are applied with isotropic surround suppression [42] (using an inhibition factor of 2) in order to reduce responses to the presence of texture in these complex scenes. Rather than using the parameter t3 to threshold the filter responses at a given fraction of the maximum filter response, we choose to threshold the responses at a given absolute value. Moreover, we also threshold responses that are smaller than a fraction of the maximum value of all the responses produced by the three filters. We call this threshold validity ratio. For an absolute threshold of 0.04 and a validity ratio of 0.5, we obtain perfect detection and recognition performance for all the 48 traffic scenes. This means that we detect all the traffic signs in the given images with no false positives and correctly recognize every detected sign. Fig. 18 illustrates the detection and recognition of two different traffic signs, shown encircled, in one of the input images. For this application, we apply the COSFIRE filters in a noninvariant mode (ψ = 0, υ = 1) and compute their output by a weighted geometric mean of the concerned Gabor filter responses (σ̂ = 21.23).

4 DISCUSSION

When presenting the method in Section 2, we indicated that a prototype feature used for the configuration of a COSFIRE filter is selected by a user. The detection of vascular bifurcations and the detection and recognition of traffic signs presented in Sections 3.1 and 3.3, respectively, are examples of such applications. The method is, however, not restricted by this aspect: There exists the possibility that a system "discovers" patterns to be used for configuration, and Section 3.2 provides an example of such an application.

We use Gabor filters for the detection of lines and edges. Gabor filters, however, are not intrinsic to the proposed method and other orientation-selective filters can also be used.

The configuration of a COSFIRE filter is based on the spatial arrangement of contour parts that lie along concentric circles of given radii around a specified point of interest. In the first two applications that we present we choose to configure COSFIRE filters with three values of the radius parameter ρ as they provide sufficient coverage of the corresponding features. However, for the third application we use nine values of the parameter ρ in order to configure COSFIRE filters that are selective for more complex patterns. The choice of the number of ρ values is related to the size and complexity of the local prototype pattern that is used to configure a filter. The number of ρ values used also controls the tradeoff between the selectivity and generalization ability of a filter: A COSFIRE filter becomes more selective and more discriminative with an increasing number of ρ values.

A COSFIRE filter uses three threshold parameters: t1, t2, and t3. The value of parameter t1 depends on the contrast of the image material involved in a given application and the presence of noise. It controls the level at which the response of a Gabor filter is supposed to indicate the presence of a line or an edge at a given position. For the first application, which concerns binary input images, we achieved good results for t1 = 0.2. Yet, for the second and third applications, which use grayscale input images, we obtained the best results for t1 = 0 and t1 = 0.1, respectively. The threshold parameter t2, which is used only in the configuration phase, is application-independent. It implements a condition that the selected responses are significant and comparable with the strongest possible response. We fix the value of this threshold to (t2 =) 0.75. The parameter t3 is optional. It may be used to suppress the responses of the COSFIRE filter that are below a given fraction of the maximum response value across all locations of the input image. For instance, in the first application we evaluate the performance of the COSFIRE filters with different values of the parameter t3, while for the second application we do not threshold (t3 = 0) the responses. In the third application
we threshold the responses at a given absolute value rather than use this threshold parameter.
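Both t1 and t3 are fractions of a maximum response: t1 suppresses weak Gabor responses and t3 suppresses weak COSFIRE responses. A minimal sketch of this fraction-of-maximum thresholding (the function name and the list-of-lists response maps are illustrative, not the paper's Matlab API):

```python
def threshold_responses(responses, t):
    """Set to zero every value below fraction t of the global maximum."""
    m = max(max(row) for row in responses)
    return [[v if v >= t * m else 0.0 for v in row] for row in responses]

# Toy Gabor response map; with t1 = 0.2 (the value used for the binary
# images of the first application) the weak responses are suppressed,
# while t1 = 0 (used for the grayscale MNIST images) keeps everything.
gabor = [[0.05, 0.60, 0.10],
         [0.00, 1.00, 0.20],
         [0.15, 0.80, 0.05]]
strong = threshold_responses(gabor, 0.2)
```

The same operation applied with t3 to a COSFIRE output map implements the optional suppression of weak filter responses.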
The proposed COSFIRE filters can be applied in various modes. For the detection of vascular bifurcations in retinal images we applied COSFIRE filters in rotation-, scale-, and reflection-invariant mode, while for the recognition of handwritten digits we only made use of partial rotation invariance, and for the detection and recognition of traffic signs in complex scenes we used noninvariant COSFIRE filters.
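Rotation invariance can be obtained by taking the maximum response over rotated versions of a configured filter. A hedged sketch, with the filter modeled as a set of subunits (lambda, theta, rho, phi); the tuple layout, the lookup callback, and the geometric-mean combination are illustrative stand-ins, not the paper's implementation:

```python
import math

def rotate_filter(subunits, psi):
    """Rotated version: add offset psi to every orientation and polar angle."""
    return [(lam, theta + psi, rho, phi + psi)
            for (lam, theta, rho, phi) in subunits]

def response(subunits, lookup):
    """Geometric mean of the subunit responses returned by lookup()."""
    prod = 1.0
    for s in subunits:
        prod *= lookup(s)
    return prod ** (1.0 / len(subunits))

def rotation_invariant_response(subunits, lookup, n=8):
    """Maximum response over n equally spaced rotated versions of the filter."""
    psis = [2.0 * math.pi * k / n for k in range(n)]
    return max(response(rotate_filter(subunits, psi), lookup) for psi in psis)
```

Scale and reflection invariance follow the same pattern, scaling the radii rho or mirroring the angles instead of rotating them.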
In the following, we highlight three main aspects in which the proposed COSFIRE filters can be distinguished from other keypoint detectors. First, a COSFIRE filter gives a response only when all parts of the filter-defining prototype feature are present. In contrast, dissimilarity-based approaches also give responses to parts of the prototype pattern. Second, while a COSFIRE filter combines the responses of Gabor filters at different scales, typical scale-invariant approaches, such as SIFT, use a single scale, the one at which the concerned keypoint is an extremum in a given scale space. Third, the area of support of a COSFIRE filter is adaptive: It is composed of the support of a number of orientation-selective filters whose geometrical arrangement around a point of interest is learned from a given local contour prototype pattern. In contrast, the area of support of other operators is typically related to the appropriate scale rather than to the shape properties of the concerned pattern. To the best of our knowledge, the proposed filters are the first ones that combine the responses of orientation-selective filters with their main area of support outside the point of interest. The presence of added noise around a pattern of interest has little or no effect on a COSFIRE filter response. For other operators, any added noise in the surroundings of a pattern of interest results in a descriptor that may differ substantially from the descriptor of the same but noiseless pattern.
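The robustness to surround noise follows directly from the sparse, learned support: a COSFIRE filter only samples the (blurred) Gabor responses at the positions determined during configuration, so values elsewhere in the image never enter the product. A small sketch of this point, ignoring the blurring step (which would only admit noise immediately adjacent to the learned positions); the dense response-map representation and the function name are illustrative:

```python
import random

def cosfire_response(resp_map, positions):
    """Geometric mean of the response map sampled at the learned positions."""
    prod = 1.0
    for (r, c) in positions:
        prod *= resp_map[r][c]
    return prod ** (1.0 / len(positions))

positions = [(2, 1), (2, 3), (0, 2)]       # learned contour-part locations
clean = [[0.0] * 5 for _ in range(5)]
for r, c in positions:
    clean[r][c] = 0.8                      # all pattern parts present

noisy = [row[:] for row in clean]
random.seed(0)
for r in range(5):
    for c in range(5):
        if (r, c) not in positions:
            noisy[r][c] = random.random()  # clutter in the surroundings

# The response is identical: the clutter lies outside the filter's support.
assert cosfire_response(noisy, positions) == cosfire_response(clean, positions)
```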
The computational cost of the configuration of a COSFIRE filter is proportional to the maximum value of the given set of values and to the size of the bank of Gabor filters used. In practice, for the parameter values that we used in the three applications, a COSFIRE filter is configured in less than half a second for a Matlab implementation that runs on a 3 GHz processor. The computational cost of the application of a COSFIRE filter depends on the computations of the responses of a bank of Gabor filters and their blurring and shifting. In practice, in the first application a retinal fundus image of size 564 × 584 pixels is processed in less than 45 seconds on a standard 3 GHz processor by six rotation-, scale-, and reflection-invariant COSFIRE filters. For the second application, a handwritten digit of size 28 × 28 pixels is described by 5,000 rotation-noninvariant COSFIRE filters in less than 10 seconds on a computer cluster.¹¹ Finally, in the third application, a complex scene of size 360 × 270 pixels is processed in less than 10 seconds on the same standard 3 GHz processor by three noninvariant COSFIRE filters. For this application we achieve the same performance as reported in [41] but with a much lower computational cost. We used a Matlab implementation¹² for all the experiments.
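The application cost just described comes from computing a bank of Gabor response maps once per image, then blurring each selected map, shifting it so that the expected contour-part position moves onto the point of interest, and combining the results pixel-wise. A minimal sketch under simplifying assumptions (a plain local-maximum blur instead of the paper's Gaussian-weighted maximum; all names illustrative):

```python
def blur(resp, radius=1):
    """Local-maximum blur (stand-in for a Gaussian-weighted maximum)."""
    n, m = len(resp), len(resp[0])
    return [[max(resp[rr][cc]
                 for rr in range(max(0, r - radius), min(n, r + radius + 1))
                 for cc in range(max(0, c - radius), min(m, c + radius + 1)))
             for c in range(m)] for r in range(n)]

def shift(resp, dr, dc):
    """Move the response expected at offset (dr, dc) onto the center pixel."""
    n, m = len(resp), len(resp[0])
    return [[resp[r + dr][c + dc] if 0 <= r + dr < n and 0 <= c + dc < m else 0.0
             for c in range(m)] for r in range(n)]

def combine(maps):
    """Pixel-wise geometric mean of the blurred-and-shifted subunit maps."""
    n, m = len(maps[0]), len(maps[0][0])
    out = [[1.0] * m for _ in range(n)]
    for mp in maps:
        for r in range(n):
            for c in range(m):
                out[r][c] *= mp[r][c]
    k = 1.0 / len(maps)
    return [[v ** k for v in row] for row in out]
```

Since the Gabor bank is shared by all filters, the per-filter cost is dominated by the blur and shift of its subunit maps, which matches the cost structure described above.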
The application of the proposed method to the recognition of handwritten digits contains an interesting aspect from a machine learning point of view. In traditional machine learning, the features to be used are fixed in advance and the machine learning aspect concerns the classification of observed feature vectors. If traditional machine learning is concerned with features at all, this is typically limited to the selection of predefined features or to using them to derive “new” features as (linear) combinations of the original ones. Examples are principal component analysis and generalized matrix learning vector quantization [43]. Traditional machine learning is typically not concerned with the question of how the original features are defined. This aspect of the problem is, however, crucial for success: Almost any machine learning method will perform well with good features. The interesting aspect we would like to point out is that in the proposed approach the appropriate prototype features are learned in the filter configuration process when a feature of interest is presented.

In our experiments, we do not analyze the discriminative ability of the individual COSFIRE filters because in this work we are not concerned with the optimization of the filters, but rather with showing their versatility. As a consequence, some of the configured filters that we used for the handwritten digit application might be redundant due to being selective for correlated patterns or for patterns with low distinctiveness. One way of dealing with such redundancy is to compute a dissimilarity measure between the prototype patterns used for the configuration of different COSFIRE filters. Moreover, a prototype feature selection method may also be incorporated in a machine learning algorithm, such as relevance learning vector quantization [44] or a support feature machine [45], to identify the most relevant COSFIRE filters.

The COSFIRE filters that we propose are inspired by the properties of one class of shape-selective neurons in area V4 of visual cortex [15], [16], [46], [47]. The selectivity exhibited by the COSFIRE filter that we configured in Section 2, evaluated on a dataset of elementary features (Fig. 6), is qualitatively similar to the selectivity of some V4 neurons studied in [15]. The way we determine the standard deviation of the blurring function in (1) is also motivated by neurophysiological evidence that the average diameter of the receptive fields¹³ of V4 neurons increases with eccentricity [48]. Since there is a considerable spread in behavior across neurons of the concerned type, different computational models may be needed to adequately cover the diversity of functional properties in that empirical space. In this respect, the proposed COSFIRE filter can be considered a computational model of shape-selective V4 neurons that is complementary to other models [49], [50], [51], [52], [53].

11. We executed the experiments for the MNIST dataset on a computer cluster of 255 multicore nodes (http://www.rug.nl/cit/hpcv/faciliteiten/HPCCluster/). We split the MNIST dataset of 70,000 images (60,000 training and 10,000 test digits) into 250 batches of 280 images each, and processed the 250 batches in parallel. In this way, computing the digit descriptors of one experiment using 5,000 rotation-noninvariant COSFIRE filters takes approximately (9.5 seconds × 280 images ≈) 45 minutes. An experiment with 5,000 partially rotation-invariant COSFIRE filters (five values of the parameter) takes five times as much.
12. Matlab scripts for the configuration and application of COSFIRE filters can be downloaded from http://matlabserver.cs.rug.nl/.
13. In neurophysiology, a receptive field refers to an area in the visual field which provides input to a given neuron. Its mathematical counterpart is the support of an operator.

The specific type of function that we use to combine the responses of afferent (Gabor) filters for the considered applications is the weighted geometric mean. This output
502 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 35, NO. 2, FEBRUARY 2013
function proved to give better results than various forms of addition. Furthermore, there is psychophysical evidence that human visual processing of shape is likely performed by multiplication [17]. In future work, we plan to experiment with functions other than the (weighted) geometric mean.
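The advantage of the multiplicative output function over additive ones can be seen in a small numeric comparison. A hedged sketch: the Gaussian weighting by the subunit distance rho and the sigma constant below are illustrative simplifications, not the paper's exact weighting:

```python
import math

def weighted_geometric_mean(responses, rhos, sigma=10.0):
    """Multiplicative output: collapses whenever one subunit response is zero."""
    weights = [math.exp(-rho ** 2 / (2 * sigma ** 2)) for rho in rhos]
    total = sum(weights)
    return math.prod(v ** (w / total) for v, w in zip(responses, weights))

def weighted_arithmetic_mean(responses, rhos, sigma=10.0):
    """Additive alternative: still responds when pattern parts are missing."""
    weights = [math.exp(-rho ** 2 / (2 * sigma ** 2)) for rho in rhos]
    return sum(v * w for v, w in zip(responses, weights)) / sum(weights)

rhos = [0, 5, 10, 10]            # subunit distances from the filter center
complete = [0.9, 0.8, 0.9, 0.7]  # all contour parts found
partial = [0.9, 0.8, 0.9, 0.0]   # one contour part missing

geo = weighted_geometric_mean(partial, rhos)   # exactly 0.0
add = weighted_arithmetic_mean(partial, rhos)  # still well above zero
```

This is the AND-like behavior discussed earlier: the multiplicative output rejects incomplete patterns that an additive output would still accept.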
The proposed COSFIRE filters are particularly useful due to their versatility and selectiveness, in that a COSFIRE filter can be configured by any given local feature and is built on top of other (here orientation-selective) simpler filters. Elsewhere, we have used other types of simple filters (Mexican hat operators) to build a contour operator, which we call Combination of Receptive Fields (CORF) [54]. We use the terms COSFIRE and CORF for the same design principle in an engineering and neuroscience context, respectively.
There are various directions for future research. One direction is to apply the proposed trainable COSFIRE filters in other computer vision tasks, such as geometric stereo calibration, image retrieval, and the recognition of handwritten characters, architectural symbols, and pedestrians. Another direction is to enrich the properties of a COSFIRE filter by including information about the color and texture distribution in a given local prototype pattern. A third direction is to extend the proposed approach to 3D COSFIRE filters that can be applied, for instance, to tubular organ registration and bifurcation detection in X-ray computed tomography medical images, or to video sequences.
5 CONCLUSIONS

We demonstrated that the proposed COSFIRE filters provide effective machine vision solutions in three practical applications: the detection of vascular bifurcations in retinal fundus images (98.50 percent recall and 96.09 percent precision), the recognition of handwritten digits (99.48 percent correct classification), and the detection and recognition of traffic signs in complex scenes (100 percent recall and precision).

For the first application, the proposed COSFIRE filters outperform other methods previously reported in the literature. For the second, their performance is close to that of the best application-specific method. For the third, they give the same performance as another method which has much higher computational complexity.

The novel COSFIRE filters are conceptually simple and easy to implement: The filter output is computed as the product of blurred and shifted Gabor filter responses. They are versatile detectors of contour-related features, as they can be trained with any given local contour pattern and are subsequently able to detect identical and similar patterns. The COSFIRE approach is not limited to the combination of Gabor filter responses: More generally, it can be applied to the responses of filters that provide information about texture, color, contours, and motion.
REFERENCES
[1] C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” Proc. Fourth Alvey Vision Conf., pp. 147-151, 1988.
[2] C. Schmid and R. Mohr, “Local Grayvalue Invariants for Image Retrieval,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 5, pp. 530-535, May 1997.
[3] T. Lindeberg, “Feature Detection with Automatic Scale Selection,” Int’l J. Computer Vision, vol. 30, no. 2, pp. 79-116, 1998.
[4] D.G. Lowe, “Object Recognition from Local Scale-Invariant Features,” Proc. Seventh IEEE Int’l Conf. Computer Vision, vol. 2, pp. 1150-1157, 1999.
[5] K. Mikolajczyk and C. Schmid, “Indexing Based on Scale Invariant Interest Points,” Proc. Eighth IEEE Int’l Conf. Computer Vision, vol. 1, pp. 525-531, 2001.
[6] K. Mikolajczyk and C. Schmid, “A Performance Evaluation of Local Descriptors,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, Oct. 2005.
[7] L. Florack, B. ter Haar Romeny, J. Koenderink, and M. Viergever, “General Intensity Transformations and Differential Invariants,” J. Math. Imaging and Vision, pp. 171-187, 1994.
[8] F. Mindru, T. Tuytelaars, L. Van Gool, and T. Moons, “Moment Invariants for Recognition under Changing Viewpoint and Illumination,” Computer Vision and Image Understanding, vol. 94, nos. 1-3, pp. 3-27, 2004.
[9] A. Baumberg, “Reliable Feature Matching Across Widely Separated Views,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 774-781, 2000.
[10] W. Freeman and E. Adelson, “The Design and Use of Steerable Filters,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, no. 9, pp. 891-906, Sept. 1991.
[11] G. Carneiro and A. Jepson, “Multi-Scale Phase-Based Local Features,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. I-736-I-743, 2003.
[12] D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” Int’l J. Computer Vision, vol. 60, pp. 91-110, 2004.
[13] Y. Ke and R. Sukthankar, “PCA-SIFT: A More Distinctive Representation for Local Image Descriptors,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. II-506-II-513, 2004.
[14] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, “Speeded-Up Robust Features (SURF),” Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, 2008.
[15] A. Pasupathy and C.E. Connor, “Responses to Contour Features in Macaque Area V4,” J. Neurophysiology, vol. 82, no. 5, pp. 2490-2502, Nov. 1999.
[16] A. Pasupathy and C.E. Connor, “Population Coding of Shape in Area V4,” Nature Neuroscience, vol. 5, no. 12, pp. 1332-1338, 2002.
[17] E. Gheorghiu and F.A.A. Kingdom, “Multiplication in Curvature Processing,” J. Vision, vol. 9, no. 2, pp. 23:1-23:17, 2009.
[18] N. Petkov, “Biologically Motivated Computationally Intensive Approaches to Image Pattern-Recognition,” Future Generation Computer Systems, vol. 11, nos. 4/5, pp. 451-465, 1995.
[19] N. Petkov and P. Kruizinga, “Computational Models of Visual Neurons Specialised in the Detection of Periodic and Aperiodic Oriented Visual Stimuli: Bar and Grating Cells,” Biological Cybernetics, vol. 76, no. 2, pp. 83-96, 1997.
[20] P. Kruizinga and N. Petkov, “Non-Linear Operator for Oriented Texture,” IEEE Trans. Image Processing, vol. 8, no. 10, pp. 1395-1407, Oct. 1999.
[21] S.E. Grigorescu, N. Petkov, and P. Kruizinga, “Comparison of Texture Features Based on Gabor Filters,” IEEE Trans. Image Processing, vol. 11, no. 10, pp. 1160-1167, Oct. 2002.
[22] N. Petkov and M.A. Westenberg, “Suppression of Contour Perception by Band-Limited Noise and Its Relation to Non-Classical Receptive Field Inhibition,” Biological Cybernetics, vol. 88, no. 10, pp. 236-246, 2003.
[23] C. Grigorescu, N. Petkov, and M.A. Westenberg, “The Role of Non-CRF Inhibition in Contour Detection,” J. Computer Graphics, Visualization, and Computer Vision, vol. 11, no. 2, pp. 197-204, 2003.
[24] C. Grigorescu, N. Petkov, and M.A. Westenberg, “Contour Detection Based on Nonclassical Receptive Field Inhibition,” IEEE Trans. Image Processing, vol. 12, no. 7, pp. 729-739, July 2003.
[25] C.D. Murray, “The Physiological Principle of Minimum Work: I. The Vascular System and the Cost of Blood Volume,” Proc. Nat’l Academy of Sciences USA, vol. 12, pp. 207-214, 1926.
[26] C.D. Murray, “The Physiological Principle of Minimum Work Applied to the Angle of Branching of Arteries,” J. General Physiology, vol. 9, pp. 835-841, 1926.
[27] T. Sherman, “On Connecting Large Vessels to Small: The Meaning of Murray’s Law,” J. General Physiology, vol. 78, no. 4, pp. 431-453, 1981.
[28] M. Zamir, J. Medeiros, and T. Cunningham, “Arterial Bifurcations in the Human Retina,” J. General Physiology, vol. 74, no. 4, pp. 537-548, 1979.
[29] M. Tso and L. Jampol, “Pathophysiology of Hypertensive Retinopathy,” Ophthalmology, vol. 89, no. 10, pp. 1132-1145, 1982.
[30] N. Chapman, G. Dell’omo, M.S. Sartini, N. Witt, A. Hughes, S. Thom, and R. Pedrinelli, “Peripheral Vascular Disease Is Associated with Abnormal Arteriolar Diameter Relationships at Bifurcations in the Human Retina,” Clinical Science, vol. 103, no. 2, pp. 111-116, Aug. 2002.
[31] N. Patton, T.M. Aslam, T. MacGillivray, I.J. Deary, B. Dhillon, R.H. Eikelboom, K. Yogesan, and I.J. Constable, “Retinal Image Analysis: Concepts, Applications and Potential,” Progress in Retinal and Eye Research, vol. 25, no. 1, pp. 99-127, Jan. 2006.
[32] J. Staal, M. Abramoff, M. Niemeijer, M. Viergever, and B. van Ginneken, “Ridge-Based Vessel Segmentation in Color Images of the Retina,” IEEE Trans. Medical Imaging, vol. 23, no. 4, pp. 501-509, Apr. 2004.
[33] A. Bhuiyan, B. Nath, J. Chua, and K. Ramamohanarao, “Automatic Detection of Vascular Bifurcations and Crossovers from Color Retinal Fundus Images,” Proc. Third IEEE Int’l Conf. Signal-Image Technologies and Internet-Based System, pp. 711-718, 2007.
[34] C. Liu, K. Nakashima, H. Sako, and H. Fujisawa, “Handwritten Digit Recognition: Benchmarking of State-of-the-Art Techniques,” Pattern Recognition, vol. 36, no. 10, pp. 2271-2285, 2003.
[35] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-Based Learning Applied to Document Recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[36] S. Belongie, J. Malik, and J. Puzicha, “Shape Matching and Object Recognition Using Shape Contexts,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 509-522, Apr. 2002.
[37] D. Oberhoff and M. Kolesnik, “Unsupervised Shape Learning in a Neuromorphic Hierarchy,” Pattern Recognition and Image Analysis, vol. 18, pp. 314-322, 2008.
[38] A. Borji, M. Hamidi, and F. Mahmoudi, “Robust Handwritten Character Recognition with Features Inspired by Visual Ventral Stream,” Neural Processing Letters, vol. 28, no. 2, pp. 97-111, 2008.
[39] M. Hamidi and A. Borji, “Invariance Analysis of Modified C2 Features: Case Study-Handwritten Digit Recognition,” Machine Vision and Applications, vol. 21, no. 6, pp. 969-979, 2010.
[40] M. Ranzato, C. Poultney, S. Chopra, and Y. LeCun, “Efficient Learning of Sparse Representations with an Energy-Based Model,” Advances in Neural Information Processing Systems, J. Platt et al., eds., MIT Press, 2006.
[41] C. Grigorescu and N. Petkov, “Distance Sets for Shape Filters and Shape Recognition,” IEEE Trans. Image Processing, vol. 12, no. 10, pp. 1274-1286, Oct. 2003.
[42] C. Grigorescu, N. Petkov, and M.A. Westenberg, “Contour and Boundary Detection Improved by Surround Suppression of Texture Edges,” Image and Vision Computing, vol. 22, no. 8, pp. 609-622, Aug. 2004.
[43] K. Bunte, M. Biehl, M.F. Jonkman, and N. Petkov, “Learning Effective Color Features for Content Based Image Retrieval in Dermatology,” Pattern Recognition, vol. 44, no. 9, pp. 1892-1902, 2011.
[44] B. Hammer and T. Villmann, “Generalized Relevance Learning Vector Quantization,” Neural Networks, vol. 15, nos. 8/9, pp. 1059-1068, 2002.
[45] S. Klement and T. Martinetz, “The Support Feature Machine for Classifying with the Least Number of Features,” Proc. 20th Int’l Conf. Artificial Neural Networks: Part II, pp. 88-93, 2010.
[46] A. Pasupathy and C.E. Connor, “Shape Representation in Area V4: Position-Specific Tuning for Boundary Conformation,” J. Neurophysiology, vol. 86, no. 5, pp. 2505-2519, Nov. 2001.
[47] J. Hegde and D.C. Van Essen, “A Comparative Study of Shape Representation in Macaque Visual Areas V2 and V4,” Cerebral Cortex, vol. 17, no. 5, pp. 1100-1116, 2007.
[48] R. Gattass, A.P. Sousa, and C.G. Gross, “Visuotopic Organization and Extent of V3 and V4 of the Macaque,” J. Neuroscience, vol. 8, no. 6, pp. 1831-1845, 1988.
[49] M. Riesenhuber and T. Poggio, “Hierarchical Models of Object Recognition in Cortex,” Nature Neuroscience, vol. 2, no. 11, pp. 1019-1025, Nov. 1999.
[50] T. Serre, M. Kouh, C. Cadieu, U. Knoblich, G. Kreiman, and T. Poggio, “A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex,” AI Memo 2005-036/CBCL Memo 259, Massachusetts Inst. of Technology, 2005.
[51] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio, “Robust Object Recognition with Cortex-Like Mechanisms,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 29, no. 3, pp. 411-426, Mar. 2007.
[52] C. Cadieu, M. Kouh, A. Pasupathy, C.E. Connor, M. Riesenhuber, and T. Poggio, “A Model of V4 Shape Selectivity and Invariance,” J. Neurophysiology, vol. 98, no. 3, pp. 1733-1750, Sept. 2007.
[53] S. Fidler and A. Leonardis, “Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 1-8, 2007.
[54] G. Azzopardi and N. Petkov, “A CORF Computational Model of a Simple Cell That Relies on LGN Input Outperforms the Gabor Function Model,” Biological Cybernetics, vol. 106, pp. 177-189, 2012, doi: 10.1007/s00422-012-0486-6.

George Azzopardi received the BSc degree with honours (first class) in computer science from Goldsmiths, University of London, in 2006 and was awarded an academic award. In 2007, he was awarded a government scholarship in order to pursue a master’s degree in advanced methods of computer science at Queen Mary University of London, where he graduated with distinction (ranked first) in 2008. Currently, he is working toward the PhD degree at the Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, The Netherlands. His current research interests include brain-inspired machine vision, which includes computational models of the visual system with applications to contour detection, feature, and shape recognition.

Nicolai Petkov received the Dr.sc.techn. degree in computer engineering (Informationstechnik) from Dresden University of Technology, Germany. He is a professor of computer science and head of the Intelligent Systems Group of the Johann Bernoulli Institute of Mathematics and Computer Science of the University of Groningen, The Netherlands. He is the author of two monographs and coauthor of another book on parallel computing, holds four patents, and has authored more than 100 scientific papers. His current research is in image processing, computer vision, and pattern recognition, and includes computer simulations of the visual system of the brain, brain-inspired computing, computer applications in health care and life sciences, and creating computer programs for artistic expression. He is a member of the editorial boards of several journals.