Forrest and Simmons 2002
Summary. We test whether attendances in team sports respond positively to the amount of
uncertainty of the outcome between the competing teams in a match. Our results show that
admissions at English soccer matches relate positively to the quality of teams involved and
negatively to a measure of the relative win probabilities of the competing teams. The uncertainty
measure is derived from a model of the betting market which corrects for specific biases tested for
and identified in the odds in our data set. Although supporters appear to favour an uncertainty of
outcome, a greater equality of strength across clubs may still yield a fall in aggregate attendance
because of the extent to which home field advantage generates an uneven contest between
similarly strong teams.
1. Introduction
‘Competitive balance’ and ‘uncertainty of outcome’ are two important, but easily conflated, concepts
used in the literature on the economics of sports leagues in North America and Europe.
By competitive balance is meant a league structure which has relatively equal playing strength
between league members. By uncertainty of outcome is meant a situation where a given contest
within a league structure has a degree of unpredictability about the result and, by extension, that
the competition as a whole does not have a predetermined winner at the outset of the competition.
Conventional wisdom proposed by analysts, administrators and fans’ groups suggests that
spectators at fixtures organized by sports leagues like uncertainty of outcome of matches and,
further, prefer a more balanced league to a less balanced league (Will, 1999). The underlying
hypothesis is that the audience for a sports league fixture will be larger (ceteris paribus) when
the competing teams are more evenly matched. The present paper sets out to test this hypothesis
using one season’s data from the English Football League. We set up a measure of uncertainty
of outcome, using betting odds of outcomes of matches set by bookmakers; unlike previous
researchers who have employed betting odds in this context, our measure takes into account
biases that may exist in odds setting and for which we test. Given suitable controls, we find that
soccer match attendances are indeed maximized where the uncertainty of outcome is greatest.
However, as we shall show, this does not imply that arguments in favour of greater equalization
of team strengths are actually valid. The concepts of uncertainty of outcome of matches and
balance in the competition should be carefully separated.
Address for correspondence: David Forrest, Centre for Sports Economics, Maxwell Building, University of
Salford, Salford, M5 4WT, UK.
E-mail: d.k.forrest@salford.ac.uk
2. Attendance data
English professional soccer comprises a hierarchy of four leagues with a total of 92 clubs,
making it the world’s largest professional soccer league structure. The top division, currently
with 20 teams, is separately owned and organized as the Football Association Premier League.
Below this lie divisions 1, 2 and 3 of the Football League. Mobility between all four divisions is
accomplished by the promotion or demotion of three or four teams, according to the division,
at the top or bottom of the standings at the end of the season. Within a division, each team
hosts each other team once so that each club plays a total of 38 matches (in the Premier League)
or 46 matches (in the Football League). The season runs from mid-August to early May.
We collected data from Rollin (1999) for all matches played on Saturdays (the most common
day for fixtures and the day for which betting data were always available) between October
1997 and May 1998. We excluded the August and September period because we intended to
use as regressors a summary measure of each team’s performance to date, namely the number
of league points won up to the time of the observed match as a proportion of the points that
were available to the team up to the time of that match. Early in the season, this measure can
vary wildly across teams in a way that reflects not team strength but just which opposing clubs
have been included at the beginning of their schedule. Early season games were therefore set
aside.
We also omitted from our analysis matches played in the top tier, the premiership. We consider
only the remaining 872 matches which took place as part of divisions 1, 2 and 3 of the Football
League. Our exclusion of premiership matches was based on two factors. One was that revenue
sharing in the premiership is important because of the very high money value of the television
contract (over half of the proceeds of which are shared out equally); in the Football League,
before the 2001 season when a new, more lucrative contract began, the importance of television
rights was small and so one could observe the effects on attendance at individual matches
played between clubs among whom the range of relative strength has not been modified by
special measures, such as revenue sharing. It should be noted that the data period postdates the
Bosman ruling in the European Court which pushed the sport strongly towards free agency, so
that, again, we are not observing a market that is severely distorted by labour market restrictive
practices.
232 D. Forrest and R. Simmons
A more fundamental reason for the omission of the Premier League from our data analysis is
that the stadium capacity constraint is binding in a high proportion of its matches. By contrast,
a sell-out is such a rarity in the lower divisions that no special allowance for it appears to be
necessary.
We argue that an attendance demand function cannot properly be estimated in the case where,
for a significant proportion of observations, the measure of demand observed reflects the number
of seats that are available rather than the number of tickets that the public would wish to buy. Of
course, the tobit regression model has been devised to allow estimation where the distribution
is censored and this method has been followed in some papers on sports attendance demand
(Kuypers, 1996; Welki and Zlatoper, 1994). However, notwithstanding that Greene (2000), pages
905–909, employed the example of a sports stadium (with a capacity constraint that is binding
on some occasions) to illustrate his exposition of the tobit method, its use is inappropriate in
this context of professional sport. The tobit method rests on the assumption that ‘true’ demand
is observed at events where the capacity of a stadium is not reached. However, true demand is
not observed in these cases. Clubs take advantage of the fact that some matches with favourable
characteristics will be sold out and that this is known ex ante. They respond by bundling tickets
(season tickets) such that, to be sure of attending the ‘big’ game, a fan must also purchase a
ticket for a minor game. Even for non-season ticket-holders, there is usually an incentive to go
to the minor game, for example because tickets for the important game are rationed by one’s
ability to produce ticket stubs from earlier in the season. Consequently, an attempt to measure
how demand responds to match characteristics using the tobit model is flawed by the problem
that observed demand is greater than the true demand for matches with weak characteristics
(as well as less than the true demand for matches with strong characteristics).
Accordingly, our data analysis relates to attendance at 872 matches played in 1997–1998 in
divisions 1, 2 and 3 of the Football League. In addition to collecting attendance figures, we
collected from Rollin (1999) our selected team strength variable for each team in each match.
This was the points ratio of the team before the match, where the points ratio is the number
of league points obtained divided by the number available to the team in the season to date. In
English soccer, 3 points are awarded for a win and 1 point for a draw (tie).
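The points ratio defined above can be illustrated with a minimal sketch. The function name `points_ratio` and the team records are hypothetical and serve only to show the arithmetic (3 points per win, 1 per draw, divided by 3 points per match played).

```python
def points_ratio(wins: int, draws: int, played: int) -> float:
    """Share of available league points won to date (3 per win, 1 per draw)."""
    if played <= 0:
        raise ValueError("the points ratio is undefined before a team has played")
    if wins < 0 or draws < 0 or wins + draws > played:
        raise ValueError("wins and draws must be non-negative and sum to at most 'played'")
    points_won = 3 * wins + draws
    points_available = 3 * played
    return points_won / points_available


# Hypothetical records, for illustration only.
early = points_ratio(wins=2, draws=0, played=2)   # two wins from two matches
later = points_ratio(wins=8, draws=4, played=16)  # 28 points from a possible 48
print(early, round(later, 3))
```

After only two matches the ratio can sit at either extreme, which is the instability that motivates setting aside the early season games.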
3. Betting data
Betting on soccer in the UK is organized in an unusual way. In contrast with the pari-mutuel
format of American betting on horse-racing (where odds are determined mechanistically by the
weight of bettors’ money) and with Nevada-style betting on team sports (where bookmakers
set odds or spreads but modify these according to the weight of money), British bookmakers
set the terms (odds) of soccer bets several days before a match and these then remain unaltered
through the betting period. The system is termed ‘fixed odds’ betting.
For each match in our sample, we collected the odds for a home team win, draw and away
team win from an electronic archive facility, Mabel’s Tables, that lists odds available from five
large firms. Unsurprisingly, we found that the odds across firms were highly correlated with
each other. In this study, we used the odds set by Super Soccer which is a specialist odds setting
organization whose odds are purchased for use by almost all the smaller bookmakers of the UK
(only the very largest bookmaking firms set their own soccer match odds). Because of the high
correlation of odds across firms, the choice of which set of odds to use was unimportant.
Some previous studies of sports attendance demand have used betting odds as a proxy for
the uncertainty of outcome. In North America, Knowles et al. (1992) and Rascher (1999)
incorporated information on Las Vegas betting lines in their studies of attendance demand for
Outcome Uncertainty and Attendance Demand 233
Major League baseball. For British soccer, Peel and Thomas (1988, 1992) first offered the insight
that odds might contain information that is useful in the estimation of the relationship between
attendance demand and uncertainty of outcome. They argued, convincingly, that bookmaker
odds might be expected to take into account the whole myriad of factors that are likely to
influence the outcome of matches (including special factors such as suspensions of players or
injuries) and are therefore a source of information on how closely fought potential spectators
may expect the match to be.
The results in Peel and Thomas (1988, 1992) are, however, difficult to interpret. For each
division separately, they regressed (log-) attendance at a match on the league positions of the
home and away teams and on the probability odds (and its square) that are offered for a home
victory by their chosen bookmaker (probability odds are the odds quoted by the bookmaker
but expressed in probability form, e.g. 3:1 becomes 0.25). In all divisions, these variables were
found to be significant determinants of attendance where control variables included the (log-)
attendance at the home club’s immediately preceding home fixture, the distance between the
grounds of the two clubs (to allow for the effect of away fans’ travel costs) and a dummy variable
representing whether the game was played on a public holiday. The attendance at a particular
match was found to increase as either the home or away team occupied a higher place in the
standings before the match. The relationship between attendance and the probability odds of a
home victory was reported to be U shaped.
A problem with this specification is that the betting odds that are used represent the probability
that the home team will win rather than the likely evenness of the contest. Although Peel and
Thomas did not report the turning-point in the quadratic relationship between attendance and
home probability odds, it is readily calculable from their results and, for all divisions except
the bottom, falls within the range 0.60–0.67. Allowing for the possibility of a draw, and for the
bookmaker’s overround (i.e. the margin by which the sum of the probability odds of the three
possible outcomes exceeds 1), odds in excess of 0.60 represent situations where the home team
has, according to the betting market, more than twice as much chance of winning as the visiting
team. Therefore, our interpretation, not theirs, of Peel and Thomas’s results is that, as the teams’
chances of winning grow less equal, attendances fall away with some levelling off or recovery
when the home team becomes more likely to win than is usually the case with home teams (clubs
at home win slightly more than twice as often as visiting clubs in English professional soccer).
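The conversion of quoted odds to probability odds, and the overround, can be sketched as follows. The quoted prices are hypothetical, and the proportional rescaling in `normalised` is a naive adjustment assumed here for illustration; it is not the bias correction developed in this paper.

```python
from fractions import Fraction


def probability_odds(against: int, on: int = 1) -> Fraction:
    """Convert odds of 'against:on' to probability form, e.g. 3:1 becomes 1/4."""
    return Fraction(on, against + on)


def overround(home: Fraction, draw: Fraction, away: Fraction) -> Fraction:
    """Margin by which the three probability odds sum to more than 1."""
    return home + draw + away - 1


def normalised(home: Fraction, draw: Fraction, away: Fraction):
    """Naive rescaling so that the three probabilities sum to 1 (an assumption)."""
    total = home + draw + away
    return home / total, draw / total, away / total


# Hypothetical quote for a single match.
home = probability_odds(4, 6)  # 4:6 (odds-on) -> 6/10
draw = probability_odds(5, 2)  # 5:2 -> 2/7
away = probability_odds(4, 1)  # 4:1 -> 1/5

total_overround = overround(home, draw, away)
print(float(home), float(total_overround), float(home / away))
```

With home probability odds of 0.60 in this hypothetical quote, the home side is priced at three times the visitors' chance, which is in line with the observation that odds above 0.60 imply a home team more than twice as likely to win as the away team.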
Our model will employ a variable that more explicitly measures the uncertainty of outcome.
Further, we shall allow for any bias in the betting odds offered by bookmakers. Peel and Thomas
(1988, 1992) assumed that the betting market is efficient so that the odds offered represent
the objective probabilities of the specified outcomes. However, this may not be valid in the
contemporary betting market.
$$
\begin{aligned}
a_0 &= 0, & a_1 &= 1, & a_2 &= 0; \\
b_0 &= 1, & b_1 &= -1, & b_2 &= 0; \\
c_0 &= 0, & c_1 &= 1, & c_2 &= 0.
\end{aligned}
$$
Estimates of these parameters were in fact as follows (absolute values of t-statistics are given
in parentheses; n = 872):
The point estimates of the coefficients on the BOOKPROB terms are below 1, which is
consistent with the particular short odds–long odds bias that has been noted for several betting
markets in team sports. However, they are never significantly different from 1. In contrast, the
estimated coefficient on DIFFATTEND is significant in the home and away win equations,
implying in line with Forrest and Simmons (2001) that bookmakers find it worthwhile to bias
the odds to offer less unfair bets for wagers in favour of better-supported clubs. Our conclusion is
that the soccer betting market in 1997–1998 was not fully efficient and it would be inappropriate
to assume efficiency when modelling attendance demand.
$$y^* = \beta' x + \varepsilon \qquad (5)$$
where $y^*$ is an unobserved latent variable, here the relative strength of the away team, $x$ is the
vector of explanatory variables, comprising PROB(H) and DIFFATTEND, and $\varepsilon$ is a normally
distributed error term. We observe
$$
\text{RESULT} =
\begin{cases}
0 & \text{if } y^* \le 0, \\
1 & \text{if } 0 < y^* \le \mu, \\
2 & \text{if } \mu \le y^*.
\end{cases}
\qquad (6)
$$
This is our ordered probit model, to be estimated by using Stata 7.0. In Table 1, we show the
coefficients and the marginal effects of changes in BOOKPROB(H) and in DIFFATTEND on
the probabilities of the possible outcomes.
The marginal effects are given by
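$$
\begin{aligned}
\partial \operatorname{Prob}(y = 0)/\partial x &= -\phi(\beta' x)\,\beta, \\
\partial \operatorname{Prob}(y = 1)/\partial x &= [\phi(-\beta' x) - \phi(\mu - \beta' x)]\,\beta, \qquad (8) \\
\partial \operatorname{Prob}(y = 2)/\partial x &= \phi(\mu - \beta' x)\,\beta.
\end{aligned}
$$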
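As a check on the marginal effects in equation (8), the following minimal sketch evaluates them for a single regressor and compares them with numerical derivatives of the outcome probabilities. The outcome probabilities are written in the standard textbook form for an ordered probit with one threshold, which is assumed here. The values of `CONST`, `BETA` and `MU` are hypothetical and are not estimates from Table 1.

```python
import math


def pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def cdf(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def outcome_probs(index: float, mu: float):
    """Probabilities of RESULT = 0, 1, 2 given the index beta'x and threshold mu."""
    p_home = cdf(-index)
    p_draw = cdf(mu - index) - cdf(-index)
    p_away = 1.0 - cdf(mu - index)
    return p_home, p_draw, p_away


def marginal_effects(index: float, mu: float, beta_k: float):
    """Equation (8) for one regressor with coefficient beta_k."""
    d_home = -pdf(index) * beta_k
    d_draw = (pdf(-index) - pdf(mu - index)) * beta_k
    d_away = pdf(mu - index) * beta_k
    return d_home, d_draw, d_away


# Hypothetical parameter values, for illustration only.
CONST, BETA, MU = 0.9, -1.5, 0.8


def index_at(x: float) -> float:
    return CONST + BETA * x


x0, h = 0.45, 1e-6
probs = outcome_probs(index_at(x0), MU)
effects = marginal_effects(index_at(x0), MU, BETA)
numeric = tuple(
    (up - down) / (2.0 * h)
    for up, down in zip(outcome_probs(index_at(x0 + h), MU),
                        outcome_probs(index_at(x0 - h), MU))
)
print(probs, effects, numeric)
```

Because the three probabilities sum to 1, the three marginal effects sum to 0. With a negative coefficient, as assumed here, a higher value of the regressor raises the probability of a home win and lowers that of an away win.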
The estimates of marginal effects displayed in Table 1 show that the higher the bookmaker’s
probability ratio of a home win the more likely there will be a home win and, conversely, the less
likely there will be an observed away win. The higher one team’s level of fan support (measured
by its mean home attendance in the previous season) the more likely it is to win given the level
of odds.
The ordered probit regression equation was used to generate estimated probabilities of home
wins and away wins. In the 872 matches, the predicted probability of an away win exceeded that
of a home win in only 72 (8.2%) cases. This may appear surprising but reflects the very strong
home field advantage in British soccer where twice as many matches are won by the home as by
the visiting side. Forrest and Simmons (2000) employed an ordered logit model to examine the
relationship between the outcomes of matches and team strength and form variables and found
that, for every single observation, a home win had been more likely than an away win.

Table 1. Ordered probit model†
†Dependent variable: result (0, home win; 1, draw; 2, away win); absolute t-statistics are given in
parentheses.

The estimated ratio of the probability of a home win to the probability of an away win, denoted
by PROBRATIO, is our measure of match uncertainty of outcome used to help to explain
attendance demand. The closer the value of PROBRATIO is to 1, the greater the uncertainty
attached to the question of which team will win. The matches where the estimated ratio of the
probability of a home win to the probability of an away win is close to 1 are expected to be
closely contested because the greater strength of the visitors is offset by the home advantage
enjoyed by the apparently weaker team.
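To make the measure concrete, the following minimal sketch computes PROBRATIO for three hypothetical matches. The probabilities are invented for illustration, and ranking matches by the absolute distance of PROBRATIO from 1 is an assumption of the sketch rather than a step in the estimation.

```python
def prob_ratio(p_home: float, p_away: float) -> float:
    """Ratio of the probability of a home win to that of an away win."""
    if p_away <= 0:
        raise ValueError("the away win probability must be positive")
    return p_home / p_away


# Hypothetical (home win, away win) probabilities; draws take the remainder.
matches = {
    "strong home side v weak visitor": (0.62, 0.14),
    "teams of equal strength": (0.46, 0.26),
    "weak home side v strong visitor": (0.36, 0.35),
}

ratios = {name: prob_ratio(*p) for name, p in matches.items()}
most_uncertain = min(ratios, key=lambda name: abs(ratios[name] - 1.0))

for name, ratio in ratios.items():
    print(f"{name}: PROBRATIO = {ratio:.2f}")
print("closest to 1:", most_uncertain)
```

In this illustration the match between teams of equal strength is not the most uncertain one, because home advantage tilts it towards the home side; the ratio is closest to 1 where a stronger visitor offsets that advantage.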
To illustrate the second stage of our estimation procedure, denote our dependent variable,
LOGATTENDANCE, by $A_i$, where $i$ is a home team identifier, and let $z_i$ be the vector of
explanatory variables with $\gamma$ as the coefficient vector to be estimated. Let $\nu_i$ be a
home-team-specific residual which differs between home teams but, for any particular home team, takes a
constant value and let $\varepsilon_i$ denote a random error term. Then, our fixed effects estimator applies
ordinary least squares to
$$A_i = \alpha + z_i \gamma + \nu_i + \varepsilon_i. \qquad (9)$$
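The role of the home team effect in equation (9) can be illustrated with a minimal sketch using one regressor. It uses the within transformation (subtracting each home team's mean), which gives the same slope as ordinary least squares with a dummy variable for each home team. The data, the team labels and the value of `GAMMA` are hypothetical and constructed without noise so that the result is easy to verify.

```python
from collections import defaultdict


def demean_by_group(groups, values):
    """Subtract each group's mean from its observations (within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for g, v in zip(groups, values)]


def ols_slope(x, y):
    """Slope from a one-regressor least squares fit with an intercept."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: two home teams whose fixed effects are correlated with z.
teams = ["A", "A", "A", "B", "B", "B"]
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
team_effect = {"A": 9.0, "B": 8.0}
GAMMA = 0.5
log_attendance = [team_effect[t] + GAMMA * v for t, v in zip(teams, z)]

pooled = ols_slope(z, log_attendance)
within = ols_slope(demean_by_group(teams, z),
                   demean_by_group(teams, log_attendance))
print(pooled, within)
```

The pooled slope is distorted because the team effects are correlated with the regressor, whereas the within slope recovers `GAMMA`.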
The squared term PROBRATIO² is entered to capture possible non-linearity in the attendance–
outcome uncertainty relationship. As with other match level attendance demand studies, we
include measures of how well each team has performed in the league to the date of the match.
Bruggink and Eaton (1996) and Rascher (1999) found that, for Major League baseball, the effect
of home team performance on attendance is greater than that of away team performance. This
may be a plausible hypothesis for English soccer: fans respond to a positive performance by their
own team and also, to some extent, to the quality of play offered by the visiting team. We test the
hypothesis but rather than use teams’ league positions before the match, as in Peel and Thomas
(1988, 1992), we employ the cardinal measure of the proportion of possible league points which
the team has won to the date of the fixture (HOMEPOINTS and AWAYPOINTS). This avoids
losing from the model information that is available to potential match patrons. For example,
the difference in relative cumulative performance between teams in positions n and n + 1 may
be much greater than that between teams in positions m and m + 1. We also follow Peel and
Thomas (1988, 1992) in including the distance between grounds of competing teams but enter
it as a quadratic term (DIST and DIST²) to capture the non-linearity that is likely to be present
if some fans travel with their team whatever the distance. Finally, we include month dummy
variables to capture the effects of weather, alternative seasonal attractions and the tendency for
interest in soccer to vary with the stage that the season has reached. The excluded category here
is November; as noted above, our sample ranges from October to May but April and May are
combined since there are few observations in May.
The uncertainty-of-outcome hypothesis is tested by the estimates of $\gamma_1$ and $\gamma_2$. If $\gamma_1 > 0$ and
$\gamma_2 > 0$ then the uncertainty-of-outcome hypothesis is rejected whereas if $\gamma_1 < 0$ and $\gamma_2 < 0$ the
hypothesis is accepted. An interpretation of the two remaining possible cases, $\gamma_1 < 0$ and $\gamma_2 > 0$
or $\gamma_1 > 0$ and $\gamma_2 < 0$, can only be made by an inspection of the position of the turning-point
in the attendance–uncertainty relationship. Our measure captures the relative quality of soccer
matches but we expect fans also to respond to the absolute quality of competing teams. We
predict $\gamma_3 > 0$ and $\gamma_4 > 0$. We also expect longer distance to be a deterrent for attendance by
away fans and so we predict $\gamma_5 < 0$.
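The sign rules above, and the turning-point that must be inspected in the mixed cases, can be sketched as follows. The sketch assumes that the first coefficient multiplies PROBRATIO and the second multiplies its square, so that the turning-point of the quadratic is at minus the first coefficient divided by twice the second. The coefficient values are hypothetical and are not estimates from this paper.

```python
def classify(g1: float, g2: float) -> str:
    """Apply the sign rules for the uncertainty-of-outcome hypothesis."""
    if g1 > 0 and g2 > 0:
        return "rejected"
    if g1 < 0 and g2 < 0:
        return "accepted"
    return "inspect turning-point"


def turning_point(g1: float, g2: float) -> float:
    """Value of r at which g1 * r + g2 * r**2 stops falling or rising."""
    if g2 == 0:
        raise ValueError("a linear relationship has no turning-point")
    return -g1 / (2.0 * g2)


# Hypothetical coefficients, for illustration only.
g1, g2 = -0.30, 0.05
verdict = classify(g1, g2)
ratio_at_turn = turning_point(g1, g2)
print(verdict, round(ratio_at_turn, 2))
```

In this hypothetical mixed case attendance falls as the ratio rises from 1 until the turning-point is reached, so the verdict depends on how much of the sample lies below that value.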
In contrast with Peel and Thomas (1988, 1992), we report a single equation across divisions
1–3 of the Football League, rather than for each division separately. This is reasonable because,
to control for unobserved influences on team attendance, we estimate by ordinary least squares
with fixed (home) team effects. The fixed effects will reflect the level (division) at which a team
plays as well as other club-specific factors such as the price of admission, playing style, market
size and degree of enthusiasm for soccer within that market.
Results from our ordinary least squares estimation with fixed effects appear in Table 2. The
performance indicator for the home club attracts the expected positive coefficient but that
on the away team indicator is insignificant. Hence, the absolute quality of the home team
in the season to date influences the match attendance. Admissions decline, at a diminishing
rate, with distance between the home bases of the two clubs. The quadratic specification of
distance captures the curvature of the relationship between attendance and distance rather than
any tendency for the relationship to become positive at high levels of distance (the apparent
turning-point of the relationship is at 215 miles but this in fact exceeds the distance between the
large majority of possible pairs of English clubs). The explanation for the relationship between
attendance and distance is likely to be that fewer away fans travel when travel costs are high.
Acknowledgements
We wish to acknowledge the help of Ron Dorsey for data collection and Pam Carroll for efficient
research assistance.
References
Bruggink, T. H. and Eaton, J. W. (1996) What takes me out to the ball game? In Baseball Economics: Current
Research (eds J. Fizel, E. Gustafson and L. Hadley). Westport: Greenwood.
Cain, M., Law, D. and Peel, D. (2000) The favourite-longshot bias and market efficiency in UK football betting.
Scot. J. Polit. Econ., 47, 25–36.
Dixon, M. J. and Pope, P. F. (1996) Inefficiency and bias in the UK association football betting market. Mimeo.
Lancaster University, Lancaster.
Dobson, S. and Goddard, J. (2001) The Economics of Football. Cambridge: Cambridge University Press.
El Hodiri, M. and Quirk, J. (1971) An economic model of a professional sports league. J. Polit. Econ., 79, 1302–
1319.
Forrest, D. and Simmons, R. (2000) Forecasting sport: the behaviour and performance of football tipsters. Int. J.
Forecast., 16, 317–331.
Forrest, D. and Simmons, R. (2001) Globalisation and efficiency in the fixed-odds soccer betting market. Mimeo.
University of Salford, Salford.
Fort, R. D. (2000) European and North American sports economics differences (?). Scot. J. Polit. Econ., 47,
431–455.
Fort, R. D. and Quirk, J. (1995) Cross-subsidization, incentives and outcomes in professional team sports leagues.
J. Econ. Lit., 33, 1265–1299.
Goddard, J. A. and Asimakopoulos, I. (2001) Forecasting football results and the efficiency of fixed-odds betting.
Mimeo. University of Wales, Swansea.
Golec, J. and Tamarkin, M. (1991) The degree of inefficiency in the football betting market. J. Finan. Econ., 30,
311–323.
Greene, W. H. (2000) Econometric Analysis, 4th edn. Upper Saddle River: Prentice Hall.
Knowles, G., Sherony, K. and Haupert, M. (1992) The demand for major league baseball: a test of the uncertainty
of outcome hypothesis. Am. Econ., 36, 72–80.
Kuypers, T. (1996) The beautiful game?: an econometric study of why people watch English football. Discussion
Paper 96-01. Department of Economics, University College London, London.
Peel, D. A. and Thomas, D. A. (1988) Outcome uncertainty and the demand for football. Scot. J. Polit. Econ., 35,
242–249.
Peel, D. A. and Thomas, D. A. (1992) The demand for football: some evidence on outcome uncertainty. Empir. Econ., 17, 323–331.
Peel, D. A. and Thomas, D. A. (1997) Handicaps, outcome uncertainty and attendance demand. Appl. Econ. Lett., 4, 567–570.
Quirk, J. and Fort, R. D. (1999) Hard Ball: the Abuse of Power in Pro Sports. Princeton: Princeton University
Press.
Rascher, D. (1999) A test of the optimal positive production network externality in Major League Baseball. In
Sports Economics: Current Research (eds J. Fizel, E. Gustafson and L. Hadley). Westport: Praeger.
Rollin, G. (ed.) (1999) Rothmans Football Yearbook 1998/9. London: Headline.
Welki, A. and Zlatoper, T. (1994) US professional football: the demand for game-day attendance in 1991. Mang.
Decsn Econ., 15, 489–495.
Will, D. H. (1999) The Federation’s viewpoint on the new transfer rules. In Competition Policy in Professional
Sports: Europe after the Bosman Case (eds S. Kesenne and C. Jeanrenaud). Antwerp: Standaard Editions.
Woodland, L. and Woodland, B. (1994) Market efficiency and the favourite-longshot bias: the baseball betting
market. J. Finan., 49, 269–279.
Woodland, L. and Woodland, B. (2001) Market efficiency and profitable wagering in the National Hockey League:
can betting score in longshots? S. Econ. J., 67, 983–995.
Zavoina, R. and McElvey, W. (1975) A statistical model for the analysis of ordinal level variables. J. Math. Sociol.,
2, 103–120.
Zellner, A. (1963) Estimators for seemingly unrelated regression equations: some exact finite sample results. J.
Am. Statist. Ass., 58, 977–992.