Cricket Paper-1

This document discusses using multi-objective optimization and decision making approaches to select the optimal cricket team. The author proposes using a multi-objective genetic algorithm called NSGA-II to optimize the batting and bowling strength of a team within a budget constraint. Additional criteria like fielding performance are also considered. Case studies using Indian Premier League player data are presented to demonstrate the approach. The methodology can be generalized to other team sports.


Applied Soft Computing 13 (2013) 402–414

Contents lists available at SciVerse ScienceDirect

Applied Soft Computing


journal homepage: www.elsevier.com/locate/asoc

Multi-objective optimization and decision making approaches to cricket team selection

Faez Ahmed a, Kalyanmoy Deb a,b,c, Abhilash Jindal a,∗

a Kanpur Genetic Algorithms Laboratory (KanGAL), Indian Institute of Technology Kanpur, Kanpur 208016, India
b Koenig Endowed Chair Professor, Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
c Aalto University School of Economics, Helsinki, Finland

Article info

Article history:
Received 9 September 2011
Received in revised form 18 April 2012
Accepted 19 July 2012
Available online 3 September 2012

Keywords:
Cricket team selection
Evolutionary multi-objective optimization
NSGA-II
Multi-criteria decision making
Trade-off solutions

Abstract

Selection of players for a sports team within a finite budget is a complex task which can be viewed as a constrained multi-objective optimization and a multiple criteria decision making problem. The task is especially challenging for the game of cricket, where a team requires players who are efficient in multiple roles. In the formation of a good and successful cricket team, batting strength and bowling strength of a team are major factors affecting its performance and an optimum trade-off needs to be reached. We propose a novel gene representation scheme and a multi-objective approach using the NSGA-II algorithm to optimize the overall batting and bowling strength of a team with 11 players as variables. Fielding performance and a number of other cricketing criteria are also used in the optimization and decision-making process. Using the information from the trade-off front obtained, a multi-criteria decision making approach is then proposed for the final selection of the team. Case studies using a set of players auctioned in the Indian Premier League (IPL) 4th edition are illustrated and players' current statistical data is used to define performance indicators. The proposed computational techniques are ready to be extended according to individualistic preferences of different franchises and league managers in order to form a preferred team within the budget constraints. It is also shown how such an analysis can help in dynamic auction environments, like selecting a team under player-by-player auction. The methodology is generic and can be easily extended to other sports like American football, baseball and other league games.

Published by Elsevier B.V.

∗ Corresponding author.
E-mail addresses: faez@iitk.ac.in (F. Ahmed), deb@iitk.ac.in, kdeb@egr.msu.edu (K. Deb), ajindal@iitk.ac.in (A. Jindal).
1568-4946/$ – see front matter. Published by Elsevier B.V.
http://dx.doi.org/10.1016/j.asoc.2012.07.031

1. Introduction

Formation of a good team for any sport is vital to its eventual success. Team selection in most sports is a subjective issue involving commonly accepted notions to form a good team. Application of objective methods for team selection is a relatively new phenomenon and the effect seems more pronounced after the involvement of various competitive leagues involving large budgets to buy players. In this study, we have chosen the game of cricket, for which different performance measures for choosing players have been suggested [10,3,13,14]. Cricket is a bat-and-ball game played between two teams of 11 players where one team bats, trying to score as many runs as possible, while the other team bowls and fields, trying to dismiss the batsmen one at a time and thus limiting the runs scored by the batting team [1]. Batting and bowling performance of a team are the major criteria affecting its success along with many other objective and subjective factors like fielding performance, captaincy (the ability of a captain to choose the right bowler and field placement according to the situation), home advantage etc. [11,12]. Optimization studies have been done in many sports in the past [18,9] and have also been applied to various issues in the game of cricket [17]. Just as in most league competitions, a pool of players is available along with their performance statistics. Each player is paid a certain amount of money by the team owners for playing for their team, which we refer to as the player's cost. League organizers impose an upper limit on budget for each franchise/club to avoid giving undue advantage to rich franchises. Player cost is either fixed by organizers, decided through auction or determined by some contract.

Twenty20 (T20) is the latest innovation in the game and is an even shorter version than One Day International cricket. The total duration of the T20 game is about 3 h, and each team gets to play a maximum of 20 overs. We have considered the Twenty20 cricket game and the Indian Premier League (IPL) as a test case for our analysis. IPL is a professional league for Twenty20 cricket competition in India and is gaining a lot of popularity [23,16]. As of now, only a few recent studies have explored team formation in this format since the concept of IPL and other such competitions is relatively new. Lemmer [13] analyzes the performance of players for the first T20 World Cup

series. Lourens [14] identifies three main categories of the game (batting, bowling, and fielding) and devises a number of performance criteria for evaluating players for each category of the game. Thereafter, a linear programming model is constructed and solved to build a team of 15 players (11 plus four reserve players) for the South African domestic Pro20 cricket team by maximizing a single objective calculated by adding the performance value of each player in the team. Another study [24] uses more sophisticated statistical performance measures and also employs a single-objective linear programming approach to construct a fantasy league team. These are nice studies in which some earlier efforts were also discussed and compared. However, one aspect clearly mentioned by these authors is that all three categories of the game involve multiple performance measures and often they are in conflict with each other. For example, a batsman may have an excellent record in terms of 'batting average' (average runs scored by the player in past games), but may have a poor 'strike rate' (rate at which the player scores runs), trading off the performance of the player in the batting category of the game. They are in conflict, since a slow but steady player will have a better batting average, but a poor strike rate. Ideally, the game requires a batsman with both qualities, but an important question then remains: 'What performance criterion should one use to select a batsman in the team?' Such questions become even more difficult to answer in choosing a batsman over a bowler or a batsman over a fielder. Although these past studies either choose one criterion for each category or use a priori preference of one criterion over another, the studies make one aspect amply clear: a decision-making task in choosing players is truly multi-objective in nature and a priori consideration of performance measures may result in a team that may not be agreeable to all. In [20], 35 all-rounders (who can bat, bowl and field well) are classified based on 'strike rate' (a batting aspect) and 'economy rate' (a bowling aspect) into four non-overlapping classes: performer, batting all-rounder, bowling all-rounder, and under-performer. Although no optimization study is performed, a Bayesian approach is used for the classification purpose. The key aspect of this study is the use of two conflicting criteria for classifying all-rounder players and the use of recent statistical techniques in ranking players. The above-mentioned studies make one aspect clear: players can be evaluated from their past performance data using well-established statistical and machine learning procedures for them to be included in a future team, and multiple criteria must be considered to analyze the data for the purpose of constructing a team for optimal performance. Interestingly, such a data-mining procedure can be generically applicable to different games for which players' past performance data is available. For example, in the soccer league matches, players can be chosen for their positions as goal-keeper, fullbacks, midfielders, and forwards depending on their past recorded performances in different positions. A team can be constructed from such information for two conflicting team performances: goals scored versus goals conceded.

In the Indian Premier League (IPL) of Twenty20 cricket, the franchises have the task of building a winning team within the budget cap, restricted by the IPL board. Individual players are bought by the franchises during a public auction of the players. Since the total number of players in the market pool is large, the challenge of finding a high-performing team becomes an increasingly complicated procedure. Data used for this work (downloaded from the public domain sources) has a pool of 129 players from IPL 4th edition and is tabulated elsewhere [2]. The player costs are the IPL 4th edition auction prices for respective players. We have used other performance statistics of each player from the public domain sources on the International Twenty20 version of the game.

With so many players available for choosing a team, the need for an effective optimization technique can be justified by making a rough calculation of the apparent size of the decision space. From the given data of only 129 players from IPL 4th edition auction, containing 10 captains and 15 wicket-keepers, the total number of possible teams under the constraint that at least one wicket-keeper and one captain are included in each team can be calculated as follows:

$$\text{total number of teams} = \binom{10}{1}\binom{15}{1}\binom{127}{9}. \tag{1}$$

Considering the large number of different possible team combinations (of the order of $10^{15}$), finding optimal teams under the added constraint of budget is not a trivial task. Currently most of the team selections are done using different heuristics, past experiences, or at most using some crude methodologies. For example, one strategy is to choose two or three high performance batsmen or bowlers and the remaining team slots are filled according to budget constraints. But this approach may not always give an optimal or a near-optimal solution, since matches are won by a team effort and not by having one or two star players in the team. Hence, our aim in this paper is to investigate the formation of an overall team that is optimal or near-optimal from the point of multiple criteria. Since batting and bowling performances of a team are generally considered to be in conflict with each other, we use these two criteria in our optimization study, while all other objective and subjective criteria, such as fielding performance, captaincy, wicket-keeper's performance, brand value of a team and others, are used during the subsequent decision making phase. It is difficult to argue whether a batting-dominant team is better or a bowling-dominant team is better. This is an unavoidable dilemma that the team selectors face while forming a team. Due to our consideration of both bowling and batting performances in forming a team, we attempt to find a number of high-performing teams having a good trade-off between these two main objectives. Thereafter, we argue and demonstrate that a consideration of a number of other criteria applied on the obtained high-performing teams makes it convenient for the team managers to identify a preferred team of their choice.

In this paper, we have explored the problem of building a high-performing team out of a set of players given their past performance statistics and suggested, for the first time, a computational and decision-making methodology from the perspective of a multi-objective consideration. A novel representation scheme of a team of 11 players is suggested to handle different constraints associated with the team selection problem. The elitist evolutionary multi-objective optimization algorithm (NSGA-II [6]) is extended to find multiple high-performing teams. A number of realistic decision-making considerations are then used to pick a suitable team.

In the remainder of this paper, we describe the optimization problem corresponding to the cricket team selection problem in Section 2. A novel scheme to represent an 11-player team that automatically satisfies a number of constraints is described next. The computing procedures of different objective functions are then discussed. Section 3 presents the results obtained through our multi-objective optimization study. Our obtained high-performing teams are compared against the IPL 4th edition winning team. The (theoretical) superiority of our teams is clear from the figure. The obtained solutions are analyzed for their sensitivity to the overall allowable budget and interesting conclusions are made. Section 4 then suggests a number of decision making techniques to choose a single preferred team from the set of trade-off teams. This includes the standard knee-point approach, a dynamic approach simulating the real auction procedure, and a couple of other interesting criteria. A multi-objective optimization study to find a set of teams providing a trade-off between batting and bowling performances and then a multi-criterion decision making analysis procedure to finally pick a preferred team remain the hallmark features of this study. The procedure is ready to be applied in practice with a

minimal fine-tuning needed to suit various other rules of the IPL team selection process. Conclusions of this study are made in Section 5.

2. Proposed methodology

In the game of cricket, player statistics have multiple parameters, like number of matches played, total runs made, batting strike rate, number of wickets taken by a bowler, number of overs bowled, etc. Importantly and interestingly, the values of these parameters for most active players are available. However, it is important to first identify the statistical parameters that reliably indicate a player's performance. The overall aim of a franchise is to build a team of 11 players with optimum bowling, batting, as well as fielding performance within budget and other rule constraints. Rule-based constraints, like the presence of at least one player capable of wicket-keeping and one captain, also have to be taken into account.

Considering the large amount of statistical data denoting various cricketing attributes that is available for each player, we first tend to reduce the dimension of data. One approach can be to use standard batting, bowling, and fielding ratings of cricketers obtained after exhaustive statistical analysis. Such ratings, like the ICC world cricket rating, take into account multiple factors of performance. But such a rating system is currently available only for one-day and test match versions of this game. Since the Twenty20 version of the game is very different and the performance of a player in one version of the game does not extrapolate to another version for most players, we cannot apply the available data for one-day matches and test matches to the Twenty20 format, which is the format of interest in this study. For simplicity, we have used the batting average and bowling average of a player in the past four editions of the international Twenty20 cricket as a measure of their performance in batting and bowling. For brevity, the data is taken from the public domain sources, compiled and stored in a website [2].

Each player is assigned a tag indicating the player's unique identity. Using the available data, we then formulate the team selection problem as a multi-objective optimization problem, as follows:

$$
\underset{t=\{c,w,p_1,p_2,\ldots,p_9\}}{\arg\max}
\begin{cases}
f_1(t) = \sum_{i=c,w,p_1,\ldots,p_9} \text{batting performance}(i),\\
f_2(t) = \sum_{i=c,p_1,\ldots,p_9} \text{bowling performance}(i),\\
f_3(t) = \sum_{i=c,w,p_1,\ldots,p_9} \text{fielding performance}(i).
\end{cases}
\tag{2}
$$

Note that the wicket-keeper (w) does not affect the bowling performance of a team. The team is subject to the following constraints:

$$g_1(t) \equiv c \in \text{captain list}, \tag{3}$$
$$g_2(t) \equiv w \in \text{wicket-keeper list}, \tag{4}$$
$$g_3(t) \equiv \text{no two players are identical in a team}, \tag{5}$$
$$g_4(t) \equiv \text{not more than four foreign players in a team}, \tag{6}$$
$$g_5(t) \equiv \sum_{i\in\{c,w,p_1,\ldots,p_9\}} \text{cost}(i) \le \text{TotalBudget}. \tag{7}$$

Here, t represents a team comprising c (the tag of the captain of the team chosen from a captain list (CL) having 10 names, presented in [2]), w (the tag of the wicket-keeper of the team from a wicket-keeper list (WL) having 15 names), and $p_1, \ldots, p_9$ (tags of nine other players of the team chosen from 129 total names arranged in a ranked list (RL), excluding the chosen captain and wicket-keeper). To have a distinct set of 11 players, no two players in a team can be identical. Players are also tagged as a foreigner (a non-resident Indian player) or not. As an IPL rule, a team should not have more than four foreign players. The final constraint indicates that the overall cost of the team must be within the specified upper limit. We now describe the computational procedure for each of the three objective functions.

2.1. Batting performance

A player's batting average is the total number of runs he has scored divided by the number of times he has been out. Since the number of runs a player scores and how often he gets out are primarily measures of his own playing ability and largely independent of the abilities of his team mates, batting average is a good metric for an individual player's skill as a batsman. The objective function in our analysis has been taken as the summation of batting averages of all players. The problem with this approach is that some new players, even bowlers, may have a batting average comparable to those of a few of the best established batsmen due to good performance in the few matches played. Hence, to avoid this scenario, the concept of primary responsibility of a player is used. A player is identified as a designated batsman only if he has scored at least 300 runs in the international Twenty20 format. In the calculation of a team's net batting performance, only the batting averages of players identified as designated batsmen are added to find the net batting average. Thus, the overall batting average of the team must be maximized, but here, we minimize the negative of the net batting average for the convenience of showing trade-off solutions.

2.2. Bowling performance

A bowler's bowling average is defined as the total number of runs conceded by the bowler divided by the number of wickets taken by the bowler. So, the lower the bowling average, the better the bowler's performance. Again, to avoid including misleading high or low averages resulting from a career spanning only a few matches, we qualify a player as a bowler only if he has taken at least 20 wickets in the Twenty20 format. Only the bowling averages of these qualified bowlers are summed to get the net bowling average of a team. The total bowling average of a team is taken as a measure of bowling performance and is minimized. Unlike in the batting performance, here, the lower the net bowling average, the better the bowling performance. Using such a strategy in optimization may result in a team that excludes all bowlers, so that the net bowling average of the team is zero. Hence, instead of assigning a zero value to a non-bowler, we assign an artificial penalty of 100 as the bowling average for each non-bowler. This value (100) is chosen so as to make it worse than a designated bowler having the worst (highest) bowling average. Hence, a team's bowling performance is computed by adding the true bowling averages of designated bowlers and 100 for each non-designated bowler.

2.3. Fielding performance

A player's fielding performance is calculated as follows:

$$\text{player's fielding performance} = \frac{\text{total catches taken}}{\text{total number of innings played}}. \tag{8}$$

A team's net fielding performance is the summation of all individual players' fielding performances. Other statistics such as runs saved and run-outs made can also be considered here. The number of stumpings by a wicket-keeper is taken as his wicket-keeper's performance measure.
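The three team-level measures of Sections 2.1 to 2.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Player` record and its field names are hypothetical, a player who has never been dismissed is assumed to contribute nothing to the batting sum, and the first objective is returned negated because the text minimizes the negative of the net batting average.

```python
from dataclasses import dataclass

# Thresholds and penalty quoted in Sections 2.1 and 2.2.
MIN_RUNS_FOR_BATSMAN = 300
MIN_WICKETS_FOR_BOWLER = 20
NON_BOWLER_PENALTY = 100.0


@dataclass
class Player:
    # Hypothetical record layout; the field names are illustrative only.
    name: str
    runs: int
    dismissals: int
    runs_conceded: int
    wickets: int
    catches: int
    innings: int


def batting_contribution(p):
    """Batting average, counted only for designated batsmen (>= 300 runs)."""
    # Assumption: a player with no dismissals has no defined average.
    if p.runs < MIN_RUNS_FOR_BATSMAN or p.dismissals == 0:
        return 0.0
    return p.runs / p.dismissals


def bowling_contribution(p):
    """Bowling average for qualified bowlers, penalty of 100 otherwise."""
    if p.wickets < MIN_WICKETS_FOR_BOWLER:
        return NON_BOWLER_PENALTY
    return p.runs_conceded / p.wickets


def fielding_contribution(p):
    """Catches per innings, as in Eq. (8)."""
    return p.catches / p.innings if p.innings else 0.0


def team_objectives(team, wicket_keeper):
    """Return (-net batting, net bowling, net fielding) for a list of players.

    The wicket-keeper is left out of the bowling sum, following Eq. (2).
    """
    batting = sum(batting_contribution(p) for p in team)
    bowling = sum(bowling_contribution(p) for p in team if p is not wicket_keeper)
    fielding = sum(fielding_contribution(p) for p in team)
    return -batting, bowling, fielding
```

With this convention the first two returned values are both to be minimized, while the fielding value is larger for better teams.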

2.4. Multi-objective formulation and NSGA-II

To solve the above problem, we employ the elitist non-dominated sorting genetic algorithm or NSGA-II [6]. We provide a brief description of the NSGA-II procedure here, but readers can refer to the original study for more information. To use NSGA-II for a team selection problem, one needs to have a meaningful representation scheme that will be processed favorably to create feasible teams by NSGA-II's recombination and mutation operators. For example, the player represented as a captain in one team should be compared with the captain of a mating team in order to create an offspring team that would choose one of the two captains from the mating teams or one from the list of captains as a viable captain. Similarly for the mutation operation, a player should be changed to another player within his own class: a batsman is replaced with another batsman most often and is replaced by a bowler with a small probability. Thus, the representation-operator combination is one of the most critical matters as it directly affects the convergence speed of a genetic algorithm. The representation scheme explained below attains the following major tasks:

• A generic scheme which can be directly employed in other sports where different roles of individuals are required in a team and they have to be chosen from a set of viable players.
• The scheme ensures meaningful GA operators acting on similar players.
• Teams formed by genetic operations are repaired so as to reduce the chance of creating infeasible teams.
• Caters to practicalities, such as one player acting in multiple roles (such as wicket-keeper and captain being the same player in a cricket game, and goal-keeper and captain being the same player in a soccer game).

As explained above, the first task in applying a GA is to choose a representation scheme that is suited to handle constraints and also to facilitate an easier use of genetic operators. To represent a team in our proposed NSGA-II and to also take care of the first three constraints (Eqs. 3 to 5), each player is assigned a tag, the details of which are presented in [2]. To assign a tag for a player, first, we sort all 129 players according to player cost and then assign a unique integer number indicating the player's rank in the range [1,129]. The sorted list is called the ranked list or RL. Next, from the given 129 players, we identify the captains and arrange them in ascending order of their price in a captain list, or CL. There are 10 captains in the list, thus the list contains integers in the range [1,10]. Similarly, the wicket-keepers are identified and put in ascending order of their price in the wicket-keeper list, or WL. There are 15 wicket-keepers in the list, so the list takes values in the range [1,15]. Note that a particular value in CL may refer to the same player corresponding to a value in WL, and all values in CL and WL correspond to players represented in RL. There is a reason for this complication of maintaining three lists in our representation scheme, which will be clear in the next paragraph. We now discuss the construction procedure of a random population member in the initial generation.

Every population member is represented with two vectors: (i) a code-vector and (ii) a variable-vector. We first discuss the procedure of creating a code-vector. The first element of the code-vector is created at random from CL. Thus, this element is a random integer in the range [1,10]. The corresponding name (say, 'c') is the captain of the team. The second element of the code-vector is created at random from WL and is an integer in the range [1,15]. The selected name (say 'w') is the wicket-keeper of the team. Thereafter, the remaining nine elements in the code-vector are selected at random from RL, so that there is no repetition among these nine elements. Thus, they are nine unique integers in the range [1,129]. These are the other nine team members. If 'c' or 'w' is identical to any of the other nine team members, another team member is chosen at random and the process is continued till the nine members are different from 'c' and 'w'. Thereafter, the tags of the nine members are arranged in ascending order of their values.

Next, the variable-vector is constructed from the code-vector by copying the tag number of every player one by one starting with the tag of the captain, then the tag of the wicket-keeper, followed by the tags of the nine team members. Thus, the only difference between a code-vector and its corresponding variable-vector is in the first two elements. In the code-vector, they are ranks of captain and wicket-keeper in CL and WL, respectively, and in the variable-vector they are tags of captain and wicket-keeper. We illustrate the representation procedure by creating a random population member in the following:

Solution 1:
Position:          1   2   3   4   5   6   7   8   9   10   11
Code-vector:       3   6  10  15  36  41  59  75  88  115  125
Variable-vector:  96  48  10  15  36  41  59  75  88  115  125

In the above team represented by the code-vector, the first element (3, created at random from [1,10]) represents the third player (Adam Gilchrist) in the captain list (CL). The second element (6, created at random from [1,15]) represents the sixth player (Parthiv Patel) from the wicket-keeper list (WL) shown in the same table. The remaining nine elements in the code-vector are chosen at random from the ranked list (RL) and presented in ascending order of the tag value. The variable-vector is then constructed from the code-vector. Interestingly, the variable-vector is identical to the code-vector except at the first two positions. The tag values of Adam Gilchrist and Parthiv Patel are taken from the RL and put in the first and second positions in the variable-vector, respectively. The variable-vector is then used to compute the objective and constraint values. The code-vector will be used for the genetic operations (recombination and mutation), as discussed later.

There is a possibility of duplication that we resolve in our representation scheme. In certain population members, the chosen first two elements of the code-vector may refer to the same player, who is a wicket-keeper captain. In this case, the variable-vector will indicate that its first two elements are identical and in reality there are only 10 players represented by the vectors. To resolve this issue, we simply replace the second element of the variable-vector with a random integer from RL and ensure that it is not identical to the 10 players already present in the variable-vector. Let us illustrate this scenario with an example population member:

Solution 2:
Position:           1   2   3   4   5   6   7   8    9   10   11
Code-vector:        2  10   5  16  54  68  77  95  101  113  122
Variable-vector:   86  86   5  16  54  68  77  95  101  113  122
Mod. var.-vector:  86  72   5  16  54  68  77  95  101  113  122

Here, the captain and wicket-keeper are the same player (Kumar Sangakkara). Thus, although the code-vector has 11 distinct elements, there are in fact 10 players. To resolve this problem, we change the second element in the variable-vector to a random tag from RL that is not identical to any element already present in the entire variable-vector. Say, we choose the player with the tag 72 in this case. The revised variable-vector is then formed and used for computing objective function and constraint values.

On an occasion in which the chosen captain is also a wicket-keeper (such as Adam Gilchrist, Kumar Sangakkara or M.S. Dhoni) but the chosen wicket-keeper is a different person (say Brendon McCullum or Parthiv Patel), we still consider the situation as if the wicket-keeper is identical to the captain (that is, the captain will also perform the job of wicket-keeping) and replace the chosen wicket-keeper with a random player (but non-identical to any other already chosen players) from RL.
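The code-vector to variable-vector mapping described above can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' code: the function names are hypothetical, `CL` and `WL` are partial rank-to-tag lookups holding only the four entries quoted in Solutions 1 and 2 (the full lists are in [2]), and the nine other members are drawn directly from the tags not used by the captain or wicket-keeper instead of re-drawing on a clash. Only the duplicate case illustrated by Solution 2 is repaired; the further rule for a captain who can also keep wicket is left out.

```python
import random

N_PLAYERS = 129  # size of the ranked list (RL) in the text


def random_code_vector(cl, wl, rng):
    """Draw [captain rank, keeper rank, nine sorted RL tags]."""
    c_rank = rng.choice(sorted(cl))
    w_rank = rng.choice(sorted(wl))
    taken = {cl[c_rank], wl[w_rank]}
    pool = [t for t in range(1, N_PLAYERS + 1) if t not in taken]
    return [c_rank, w_rank] + sorted(rng.sample(pool, 9))


def decode(code, cl, wl, rng):
    """Build the variable-vector (all RL tags) from a code-vector."""
    captain, keeper = cl[code[0]], wl[code[1]]
    others = list(code[2:])
    if keeper == captain:
        # Wicket-keeper captain: refill the second slot with an unused tag.
        used = {captain, *others}
        free = [t for t in range(1, N_PLAYERS + 1) if t not in used]
        keeper = rng.choice(free)
    return [captain, keeper] + others


# Only the rank -> tag pairs quoted in the text; other entries are omitted.
CL = {2: 86, 3: 96}
WL = {6: 48, 10: 86}

rng = random.Random(0)
solution1 = decode([3, 6, 10, 15, 36, 41, 59, 75, 88, 115, 125], CL, WL, rng)
solution2 = decode([2, 10, 5, 16, 54, 68, 77, 95, 101, 113, 122], CL, WL, rng)
print(solution1)
print(solution2)
```

Decoding Solution 1 reproduces the variable-vector quoted in the text. For Solution 2 the second position receives a randomly drawn unused tag, where the text's example happens to pick 72.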

It is interesting to note that the above procedure guarantees that the first three constraints are always satisfied in all population members. The variable-vector can now be used to compute the objective functions and the remaining constraints, as mentioned above. Instead of using any sophisticated method to satisfy the fourth and fifth constraints, we simply count the number of foreign players in the team and calculate the total budget for hiring the team. Thereafter, we check these two values against their allowable values. If they are within limits, we declare the team feasible; otherwise, we declare it infeasible and compute the constraint violation as follows:

\[
\mathrm{CV}(t) = \max\left(0,\ \frac{\text{number of foreign players}}{4} - 1\right) + \max\left(0,\ \frac{\sum_{i \in \{c, w, p_1, \ldots, p_9\}} \mathrm{cost}(i)}{\mathrm{TotalBudget}} - 1\right). \tag{9}
\]

The constraint violation value of a team will be used in NSGA-II’s selection operator, which we shall discuss a little later. Note that for a feasible solution, the constraint violation value is zero.

From the available bowling, batting, and fielding average values of each of the 11 chosen players for a team, we can then compute the overall bowling, batting and fielding strengths for the team. NSGA-II is capable of handling three objectives and we can employ it to find a trade-off frontier that is expected to have individual optimal solutions and also a number of different compromised solutions. A little problem knowledge will reveal that a team having an extremely good fielding ability but not enough batting and bowling skills may not be what a team selection committee is looking for. Thus, instead of treating the fielding performance as an objective of its own in our optimization study, we treat it as a secondary objective and use it to break ties among trade-off solutions. In the tournament selection operator comparing two solutions, if neither dominates the other with respect to the bowling and batting objectives, the team having the better fielding performance wins the tournament.

The NSGA-II procedure goes as follows. From the current set of N population members describing population P_t, a new set of N solutions (population Q_t) is created by repeated use of tournament selection, recombination and mutation operators. In the tournament selection comparing two population members, a feasible solution is preferred over an infeasible solution, and an infeasible solution with a smaller constraint violation wins over another infeasible solution. When two feasible solutions are compared, a hierarchy of decision making is performed. First, a solution dominating its tournament competitor wins. Second, if neither solution dominates the other, the one having the better fielding performance wins. Otherwise, if both solutions have the same fielding performance, the one residing in a less-crowded region of the objective space wins. This is achieved by computing the crowding distance operator [6]. Two such selected solutions are then mated under a recombination scheme (discussed in the next paragraph) using a probability p_c and two new offspring teams are created. If the recombination operation is not performed, the two selected solutions are saved as offspring solutions.

To achieve a meaningful recombination operation between two NSGA-II population members, we use the code-vectors of the two parents, instead of their variable-vectors. Thus, the code of the captain represented in one parent mates with that in the second parent so that two offspring codes in the range [1,10] are created. For example, if solutions 1 and 2 mentioned above are recombined at the first position as parents, code values 3 and 2 will get recombined as integer numbers in the range [1,10] and two offspring integers, representing the code values of the respective captains from CL, will be created. Similarly, the code of the wicket-keeper of one parent mates with that of the second parent so that two offspring codes in the range [1,15] are created. Note here that when solutions 1 and 2 mentioned above are recombined, code values 6 and 10 will be used as parent integers and two offspring integers will be created. Thus, in this situation, the wicket-keeper captain (Kumar Sangakkara) will participate twice in the recombination operations (if both positions are to be recombined): once as a captain for the first position and again as a wicket-keeper for the second position. The remaining nine code values mate element-wise from the third element to the 11th element and offspring code values in the range [1,129] are created. A little thought will reveal that since the remaining nine code values are arranged in ascending order of their tag values, a variable-wise recombination becomes meaningful in the sense that a low-cost player gets a chance to mate with another low-cost player, and a high-cost player mates with another high-cost player. We use the integer version of the SBX operator [5] as the recombination operator. In this operator, every element in the code-vector is recombined with a probability of 0.5; thus, on average, 50% of the elements get recombined in a crossover operation. Note that even if one parent involves a wicket-keeper captain and the other involves a different captain and wicket-keeper, a meaningful recombination will always take place, since they are recombined through their code-vectors. Each of the offspring solutions is then mutated with an integer version of the polynomial mutation operator [4] to slightly perturb it. This is done with a mutation probability p_m. After the mutation operation, the corresponding variable-vector of each offspring is constructed. In the event of a repetition of tag values, additional mutation operations are performed till a distinct 11-member team is formed. These operations guarantee that all created offspring solutions satisfy the first three constraints. We then rely on NSGA-II’s constraint handling capabilities to satisfy the fourth and fifth constraints and create feasible population members. After the new population Q_t is created, it is combined with P_t and a combined population R_t = P_t ∪ Q_t is formed. The combined population is then sorted based on non-domination level (with respect to the bowling and batting objectives). Thereafter, to create the next generation population P_{t+1}, solutions from the first sorted front are selected first, then the second front members are taken, and so on. This process is continued till no more fronts can be accommodated in the new population (recall that R_t has twice the size of P_{t+1}). The last front that cannot be fully accommodated in P_{t+1} is subjected to further criteria. First, the usual crowding distance values are assigned to each member. Second, the individual objective-extremes are chosen straightaway. Third, the front members are sorted by the third objective (fielding performance) and the remaining population slots are filled with solutions having higher fielding performance values. In the event of a tie, solutions are chosen based on their crowding distance values. This ends the operation of a generation and P_{t+1} is declared as the next generation population. NSGA-II operators are continued in this fashion till a predefined number of generations has elapsed.

3. Multi-objective optimization results

Here we present the simulation results of the above-mentioned algorithm applied to the player database. The budget constraint is set to TotalBudget = 6 million dollars. As mentioned before, each team must include at least one wicket-keeper and one captain, and at most four foreign players. We use the following standard parameter settings in all our simulations: population size = 400, maximum generations = 400, crossover probability = 0.9, mutation probability = 0.05, SBX distribution index = 10 [5], and mutation distribution index = 20 [4].

Fig. 1 shows the trade-off front obtained by our modified NSGA-II. Each point on the trade-off front represents a team of 11 players.
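The constraint violation of Eq. (9) and the tournament hierarchy described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `Team` record, its field names and the fielding scale are assumptions made for the example, the violation is written with `max` so that a feasible team scores zero as the text states, and ties between equally violated infeasible teams are resolved arbitrarily in favour of the first team.

```python
from dataclasses import dataclass

MAX_FOREIGN = 4            # limit on foreign players (fourth constraint)
TOTAL_BUDGET = 6_000_000   # dollars (fifth constraint)


@dataclass
class Team:
    # Hypothetical record; the field names are assumptions for this sketch.
    batting: float         # net batting average, to be maximized
    bowling: float         # net bowling average, to be minimized
    fielding: float        # fielding score, higher is better (assumed scale)
    n_foreign: int
    cost: float
    crowding: float = 0.0  # crowding distance, larger means less crowded


def constraint_violation(team: Team) -> float:
    """Normalized excess over the foreign-player and budget limits."""
    foreign_excess = max(0.0, team.n_foreign / MAX_FOREIGN - 1.0)
    budget_excess = max(0.0, team.cost / TOTAL_BUDGET - 1.0)
    return foreign_excess + budget_excess


def dominates(a: Team, b: Team) -> bool:
    """True if a is no worse than b in both objectives and better in one."""
    no_worse = a.batting >= b.batting and a.bowling <= b.bowling
    better = a.batting > b.batting or a.bowling < b.bowling
    return no_worse and better


def tournament(a: Team, b: Team) -> Team:
    """Feasibility first, then dominance, then fielding, then crowding."""
    cv_a, cv_b = constraint_violation(a), constraint_violation(b)
    if cv_a > 0 or cv_b > 0:
        # Feasible beats infeasible; otherwise the smaller violation wins.
        return a if cv_a <= cv_b else b
    if dominates(a, b):
        return a
    if dominates(b, a):
        return b
    if a.fielding != b.fielding:
        return a if a.fielding > b.fielding else b
    return a if a.crowding >= b.crowding else b


# Made-up teams: the second exceeds the 6 million dollar budget.
within_budget = Team(300.0, 500.0, 10.0, 4, 5_900_000)
over_budget = Team(400.0, 300.0, 20.0, 4, 7_500_000)
print(constraint_violation(over_budget))                  # 0.25
print(tournament(within_budget, over_budget) is within_budget)  # True
```

The over-budget team has better batting, bowling and fielding values, yet it loses the tournament because feasibility is checked before any objective is compared.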

Fig. 1. Multi-objective trade-off front obtained by NSGA-II is shown. CSK team is outperformed by Teams B and C (obtained in this study) on batting and bowling performances. (Axes: Net Bowling Average versus −Net Batting Average; marked points: Teams A, B, C, D and Chennai Super Kings.)

Fig. 2. Budget sensitivity analysis illustrating how the trade-off front changes with the upper limit on the total budget. (Axes: Net Bowling Average versus −Net Batting Average; fronts shown for 5 million USD, 6 million USD, 7 million USD and no budget.)
A few teams corresponding to the trade-off points marked on the figure are shown in Table 1. Each team has a captain, a wicket-keeper and at most four foreign players. Also, they all satisfy the stipulated budget constraint. The right extreme of the front marks the team having the highest overall batting average, while the left extreme shows the most bowling-dominant team. The trade-off between these two objectives is clear from this figure.

To compare our results, we consider a real team of 11 cricketers who played for the Chennai Super Kings (CSK) team in the last IPL (held in India from 8 April to 28 May 2011) and won the final. The bowling and batting performances of this team are calculated using the same procedure as described above. The total cost of hiring the CSK team is estimated to be around 7.5 million dollars. It is therefore not a feasible team in our representation, because it spends more than 6 million dollars in hiring the players. Despite spending more money, the CSK team has bowling and batting performances that are worse than those of a number of teams found by our NSGA-II. The point representing the CSK team is shown on the objective space in Fig. 1. It can be seen that the team is non-optimal as well as costlier. Similar observations were found for other past IPL teams as well. For brevity, we do not discuss them here.

We now perform a number of post-optimality analyses to decipher and understand useful insights about the team selection problem.

3.1. Budget sensitivity analysis

To analyze the effect of the budget constraint on a team’s performance, we have performed a sensitivity analysis where the optimization process is run for a range of budget constraints and the corresponding trade-off front obtained by our modified NSGA-II is plotted. It can be seen from Fig. 2 that the budget constraint affects batting-dominant teams more than bowling-dominant teams. This is because the price difference between batsmen with a high batting average and those with a low average is significant. The same effect does not exist among bowlers and hence the minimum bowling average region of the trade-off front is fairly unchanged by an increase or decrease in the TotalBudget value. To get an idea of the extreme trade-off front, we also perform a run in which no limit on the total budget is imposed. Such an analysis can give team managers an idea of how much gain in batting and bowling averages is possible by spending a certain amount of additional money in hiring a team. Interestingly, for the chosen players’ performance statistics and prices, a large investment need not make a better bowling team; however, a better batting team can be formed by investing more money.

Table 2 shows the best batting teams found by our proposed NSGA-II procedure for different budget restrictions. It is clear how

Table 1
Four teams chosen from the trade-off front (Fig. 1) obtained by NSGA-II. The first row marks the captain and the second row marks the wicket-keeper of a team. Foreign players in a team are shown in italics.

| Role | Team A | Team B | Team C | Team D |
|---|---|---|---|---|
| Captain | Sachin Tendulkar | Yuvraj Singh | Yuvraj Singh | Yuvraj Singh |
| Wicket-keeper | Wriddhiman Saha | Wriddhiman Saha | Wriddhiman Saha | Wriddhiman Saha |
| Players | Michael Hussey | N. McCullum | J.P. Duminy | R. Ashwin |
| | Manoj Tiwary | Manoj Tiwary | Sudeep Tyagi | Sudeep Tyagi |
| | Rahul Dravid | Ravindra Jadeja | Ravindra Jadeja | N. Rimmington |
| | Suresh Raina | Suresh Raina | Suresh Raina | Paul Collingwood |
| | Shaun Marsh | James Franklin | James Franklin | Steven Smith |
| | Aaron Finch | Brad Hodge | Brad Hodge | Pragyan Ojha |
| | A. McDonald | A. McDonald | A. McDonald | Shakib Al Hasan |
| | Shikhar Dhawan | Shikhar Dhawan | Jaidev Unadkat | Jaidev Unadkat |
| | Naman Ojha | Amit Mishra | Amit Mishra | Amit Mishra |
| Bat. avg. | 378.0 | 324.1 | 269.8 | 120.2 |
| Bowl. avg. | 949.5 | 506.6 | 341.0 | 277.9 |
| Cost ($) | 5,950,000 | 5,930,000 | 5,845,000 | 4,935,000 |
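As a quick consistency check on Table 1, the short Python sketch below verifies that Teams A to D are mutually non-dominated when the batting average is maximized and the bowling average is minimized. The batting and bowling values are copied from the table; the function names are illustrative choices, not part of the original study.

```python
# (batting average, bowling average) of Teams A-D, copied from Table 1.
TABLE_1 = {
    "A": (378.0, 949.5),
    "B": (324.1, 506.6),
    "C": (269.8, 341.0),
    "D": (120.2, 277.9),
}


def dominates(p, q):
    """p dominates q: batting is maximized, bowling average is minimized."""
    (bat_p, bowl_p), (bat_q, bowl_q) = p, q
    no_worse = bat_p >= bat_q and bowl_p <= bowl_q
    better = bat_p > bat_q or bowl_p < bowl_q
    return no_worse and better


def non_dominated(teams):
    """Names of the teams that no other team in the dictionary dominates."""
    return [
        name
        for name, point in teams.items()
        if not any(
            dominates(other, point)
            for other_name, other in teams.items()
            if other_name != name
        )
    ]


print(non_dominated(TABLE_1))  # ['A', 'B', 'C', 'D']
```

Moving from Team A toward Team D, the batting average falls while the bowling average improves at every step, which is why none of the four teams dominates another.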

Table 2
Best batting teams chosen from the trade-off frontiers (Fig. 2) obtained for different budget restrictions. First row indicates captains and second row indicates wicket-keepers. Foreign players are shown in italics.

| Role | 5 M$ | 6 M$ | 7 M$ | No limit |
|---|---|---|---|---|
| Captain | Sachin Tendulkar | Sachin Tendulkar | Sachin Tendulkar | Sachin Tendulkar |
| Wicket-keeper | Wriddhiman Saha | Wriddhiman Saha | Wriddhiman Saha | M.S. Dhoni |
| Players | V.V.S. Laxman | Suresh Raina | Shaun Marsh | Michael Hussey |
| | Rahul Dravid | Rahul Dravid | Rahul Dravid | Manoj Tiwary |
| | Aaron Finch | Shaun Marsh | Suresh Raina | Aaron Finch |
| | Shaun Marsh | Aaron Finch | Aaron Finch | Rohit Sharma |
| | Manoj Tiwary | Michael Hussey | Manoj Tiwary | Saurabh Tiwary |
| | Mohammad Kaif | Shikhar Dhawan | Michael Hussey | S. Badrinath |
| | Shikhar Dhawan | A. McDonald | A. McDonald | A. McDonald |
| | Michael Hussey | Naman Ojha | Shikhar Dhawan | Shaun Marsh |
| | Andrew McDonald | Manoj Tiwary | S. Badrinath | Suresh Raina |
| Bat. avg. | 364.7 | 378.0 | 389.3 | 403.1 |
| Bowl. avg. | 1022.4 | 949.5 | 949.5 | 874.7 |
| Cost ($) | 4,910,000 | 5,950,000 | 6,530,000 | 11,030,000 |
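One way to read Table 2 is to ask how much batting average each additional million dollars buys between successive budget levels. The Python sketch below does this arithmetic on the batting averages and team costs copied from the table; the function name and the per-million normalization are illustrative choices, not part of the original analysis.

```python
# (budget label, batting average, team cost in dollars), copied from Table 2.
TABLE_2 = [
    ("5 M$", 364.7, 4_910_000),
    ("6 M$", 378.0, 5_950_000),
    ("7 M$", 389.3, 6_530_000),
    ("No limit", 403.1, 11_030_000),
]


def marginal_batting_gain(rows):
    """Batting-average gain per extra million dollars between adjacent rows."""
    gains = []
    for (_, bat_lo, cost_lo), (label, bat_hi, cost_hi) in zip(rows, rows[1:]):
        extra_millions = (cost_hi - cost_lo) / 1e6
        gains.append((label, (bat_hi - bat_lo) / extra_millions))
    return gains


for label, gain in marginal_batting_gain(TABLE_2):
    print(f"{label}: {gain:.1f} batting-average points per extra million dollars")
# 6 M$: 12.8 batting-average points per extra million dollars
# 7 M$: 19.5 batting-average points per extra million dollars
# No limit: 3.1 batting-average points per extra million dollars
```

On these four data points, the step from the 7 M$ team to the unrestricted team yields the smallest batting gain per dollar spent.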

most teams include well-renowned batsmen. When no budget restriction is used, the team includes multiple expensive batsmen, such as Sachin Tendulkar and M.S. Dhoni. Although none of these teams does well in the bowling compartment of the game and a franchise may not choose any of these extreme teams, the purpose of presenting them is to demonstrate that the proposed NSGA-II is able to find such dream batting teams as the best batting teams for different budgetary restrictions. It also emphasizes the motivation for choosing an intermediate trade-off solution as the preferred team, which would have a good trade-off between the batting and bowling aspects of the game.

3.2. Player frequency table

Analyzing the trade-off front shown in Fig. 1 not only provides us with a set of high-performing teams but can also give us a better insight into team selection strategies under a multi-criteria decision making process and in a dynamic, auction-like situation. In this section, we reveal some important features of choosing a high-performing team based on the innovization principle recently suggested elsewhere [8].

To prepare a cricket team selection strategy, identification of key players is an important task. A usual approach for this purpose would be to identify the top batsmen or top bowlers as key players for the team [14], but it turns out that such players usually come at a high cost, and a team having a bunch of top domain-specific players may not be a high-performing and cost-effective team. To identify the key players in a team we suggest a novel concept here. All teams appearing on the trade-off front are considered for an analysis and the frequency with which each player appears in the teams on the trade-off front is computed. For example, say there are 50 teams on the trade-off front and the player Sachin Tendulkar appears in five of these teams. Hence, his frequency is 5/50 or 10%. Fig. 3 shows the frequency of all players from the dataset on a bar graph. The players are represented by the tags assigned to them in the optimization process. Names of a few high-frequency players are also marked on the bar graph. From the figure, it can be seen that many high-cost players appearing on the right half of the bar graph have a zero frequency. Secondly, only 35 of the 129 players have a frequency greater than zero and appear in the teams on the trade-off front, meaning that the other 94 players do not offer a good enough trade-off between batting, bowling and fielding performances under a fixed budget to appear in any of the high-performing teams. Some of the most frequently appearing key players can be identified from the bar graph and can be given high priority in forming a team during the auction process. A close examination of the bar graph indicates that our common-sense method of forming a team with top batsmen and bowlers (mentioned above) would fail to capture these commonly-appearing high-valued players on the trade-off front, and some famous and highly-priced specialist players may not have a good enough batting-bowling-fielding trade-off to appear as a good choice for an effective team. In a sense, the above analysis of identifying frequently appearing players on the entire trade-off front provides the names of effective players for all three compartments of the game. These players are commonly-appearing ingredients of high-performing teams and hence must be designated as the most valued players for the overall effective performance of a team. It should be noted that a player’s frequency is not a direct indicator of how good an individual player is in any one compartment of the game, but rather indicates his importance in the formation of a high-performing team considering budget constraints, all three aspects of the game (bowling, batting and fielding) and, importantly, the chemistry with the other players present in the team. A knowledge of such players may be useful in the dynamic auction process of choosing a team as well.

Fig. 3. Frequency of players appearing on teams from the trade-off front (Fig. 1). (Axes: Player Tag versus Frequency; labeled bars include Saha, Amit, Hodge, Raina, Yuvraj, McDonald, Unadkat, Manoj, Piyush, Dravid and Sachin.)

Another observation that can be made by an analysis of the obtained trade-off solutions is the cost of hiring each team under the given budget restriction. An important question to ask is whether, although a budget of 6 M$ was the restriction, all high-performing trade-off teams utilized all of that budget. Fig. 4 shows the amount of money spent to form high-performing teams on the trade-off front. The teams are arranged from high batting strength (on the left) to high bowling strength (on the right), or from high to small values of bowling strength. It is clear from the figure that teams having high batting strength must spend almost all the allocated budget in hiring the teams, as most good batsmen are more costly than good bowlers (see the resource containing players’ statistics [2]). Predominantly bowling teams, in contrast, require only about 80% of the cost needed to hire a good batting team. To investigate if this is a trend with the other two trade-off frontiers (having 7 M$ and unlimited budget), we plot the total amount of money needed to hire each of the high-performing teams with these budget restrictions in Figs. 5 and 6, respectively. A similar trend can be observed from these two figures as well. In the case of the 7 M$ budget, most teams utilize the stipulated budget, except that the optimal bowling teams use less. For the case of unlimited budget, the result is slightly different. Optimal batting teams require higher budgets compared to optimal bowling teams, but the teams with a good trade-off in both aspects of batting and bowling (intermediate teams) require a high budget of around 12–14 M$. This also reveals that if a rich franchise is willing to spend more money (say to the tune of 20 M$) to hire probably the best team around, the franchise will on the one hand be disappointed that there does not exist a high-performing team that costs so much, and on the other hand be happy that a high-performing team can be hired with much less money. Thus, the above analysis not only helps get essential qualitative information on how to form a high-performing team, but also provides quantitative information about specifics and, above all, the names of a real team with 11 players that would do the trick.

Fig. 4. Total investment in hiring each team on the trade-off front with 6 M$ budget restriction. (Axes: Bowling Strength of Pareto Solution, decreasing from 949.46 to 278.10, versus Team Cost in units of 10^6.)

When we investigate the number of foreign players in each of these high-performing teams across all three fronts, we observe that they all have exactly four foreign players, despite the fact that there could have been anywhere from zero to a maximum of four foreign players in any team. This tells us that one way to choose a high-performing team is to make sure there are four foreign players (the maximum number allowed) in the team, whatever the restriction on the overall budget. Since only a few foreign players are considered in the database, they are presumably good players, and it is interesting that the quota of four foreign players is filled in each optimal team.

Some of these insights are not obvious or intuitive before the optimization run is performed and the resulting solutions are analyzed.
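The player-frequency computation of Section 3.2 can be sketched as follows. This is a minimal illustration under stated assumptions: each team is represented simply as a list of player tags, the function names and the 50% threshold are hypothetical, and the toy front uses three-player "teams" purely to reproduce the 5-out-of-50 example from the text (tag 120 is the tag given for Sachin Tendulkar in the auction case study).

```python
from collections import Counter


def player_frequency(front):
    """Percentage of teams on the front in which each player tag appears."""
    counts = Counter(tag for team in front for tag in set(team))
    return {tag: 100.0 * n / len(front) for tag, n in counts.items()}


def key_players(front, threshold=50.0):
    """Tags whose frequency reaches the threshold, most frequent first."""
    freq = player_frequency(front)
    frequent = [tag for tag, f in freq.items() if f >= threshold]
    return sorted(frequent, key=lambda tag: -freq[tag])


# Toy front of 50 teams; tag 120 appears in exactly five of them.
toy_front = [[120, 1, 2]] * 5 + [[1, 2, 3]] * 45
freq = player_frequency(toy_front)
print(freq[120])  # 10.0
```

Counting each tag at most once per team (via `set`) keeps the result a true share of teams, so a frequency can never exceed 100%.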

Fig. 5. Total investment in hiring each team on the trade-off front with 7 M$ budget restriction. (Axes: Bowling Strength of Pareto Solution, decreasing from 949.46 to 277.85, versus Team Cost in units of 10^6.)

Fig. 6. Total investment in hiring each team on the trade-off front with no budget restriction. (Axes: Bowling Strength of Pareto Solution, decreasing from 874.65 to 277.41, versus Team Cost in units of 10^6.)

Clearly, the above optimization-cum-analysis procedure reveals crucial insights that can act as thumb-rules dictating what a team manager must do to put together a high-performing team.

4. Team selection as a decision making process

The objective of the entire process is to obtain a single high-performing team of 11 players, rather than just finding and suggesting a number of trade-off teams. Having identified a set of high-performing trade-off teams with different bowling, batting and fielding performance values, we shall now discuss a few pragmatic methodologies for choosing a particular team. Toward this goal, we use certain multi-criteria decision making (MCDM) methods [15] combined with certain subjective considerations. Decision-making in such a scenario is a difficult task, as the decision-makers need to keep in mind a number of different subjective yet important considerations, such as a bias toward zonal players to satisfy the fan base, the inclusion of star players despite their current poor performances, etc. After the initial optimization analysis, we have a trade-off front, similar to that shown in Fig. 1. Different methods can now be adopted for this purpose, but here we suggest the following two methods for the selection of a final team.

4.1. Knee region approach

The obtained trade-off front comprises a set of points in the objective space representing various teams. If the trade-off front has a knee region, it is advisable to select a solution that lies in the knee region. Such a solution is almost always preferred because deviating from the knee region means that a small gain in the value of one of the objectives would come from a large compromise in at least one other objective. A recent study suggested a number of viable ways of identifying a knee point in a two-objective front [7]. Although a knee region may not exist in all trade-off frontiers, a visual inspection of the trade-off fronts shown in Fig. 2 reveals that in general a batting-bowling trade-off front exhibits a knee in most fixed budget conditions for the game of Twenty20 cricket.

Applying the knee-finding methods on our trade-off front shown in Fig. 1, we observe that Teams B and C lie near the knee region, although the obtained front does not show a visually apparent knee. These teams are shown in Table 1. Teams B and C show a reasonable compromise between batting and bowling performances as compared to Teams A and D.

4.2. Multiple criteria non-dominated sorting approach

The knee region approach described above does not take into account many subjective factors that may, in reality, define a good team, such as the following:

1. Direct fielding performance.
2. Success rate of the captain.
3. Wicket-keeper’s performance.
4. Team expenditure.
5. Brand value of players, and others.

To take these factors into account, we consider the solution set obtained from the knee-region analysis and compute the above-mentioned factors. Thereafter, a ranking of importance of the above factors is gathered from the selector’s point of view. For example, the direct fielding performance may be the most important of the factors mentioned above in our team selection strategy. So, we sort the solution set with respect to fielding performance and pick the solution having the best fielding side. In the event of a tie on direct fielding performance, the next ranked factor can be used to choose a team in a lexicographic manner. Some other MCDM techniques, such as an aggregate voting strategy [15] or the analytical hierarchy process (AHP) [19], can also be used.

At the other extreme, if all the above factors are equally important, a domination approach can be applied to find the non-dominated teams with respect to all the above factors. The resultant best front solutions are then considered for further analysis. To illustrate, we use here the first three criteria of fielding, captaincy, and wicket-keeping performance for a non-dominated analysis. After identifying the non-dominated teams, we sort them according to the remaining factors, such as the team expenditure or the brand value of players. Using a non-dominated sorting of the teams in the knee region with respect to the first three factors, we obtain only the following two teams on the non-dominated front. The players in these teams are given below:

Team 1: Yuvraj Singh (C), Wriddhiman Saha (W), Suresh Raina, Manoj Tiwary, Roelof van der Merwe, Amit Mishra, Brad Hodge, Shikhar Dhawan, Nathan McCullum, Andrew McDonald, Ravindra Jadeja.

Team 2: Sachin Tendulkar (C), Wriddhiman Saha (W), Suresh Raina, Manoj Tiwary, James Franklin, Amit Mishra, Brad Hodge, Shikhar Dhawan, Ryan ten Doeschate, Andrew McDonald, Ravindra Jadeja.

To illustrate, we now use the team expenditure factor and observe that the cost of Team 1 is less than that of Team 2. Hence, Team 1 would be our final preference.

4.3. Customized criteria

Using the above multiple criteria non-dominated sorting approach, Team 1 is finally proposed as the high-performing and preferred team within the six million dollar cost constraint. From Table 1, it can also be observed that the obtained teams have overall high batting and bowling averages but lack famous good bowlers (such as Dimitri Mascarenhas (21), Nathan McCullum (25), and Piyush Chawla (101)). We observe that this problem originates from the statistical data considered, in which the difference in bowling average between a good bowler and a normal bowler is marginal, while the price differences are significant. In the Twenty20 format of the game, it can be argued that a few well-known experienced players can play a key role in the success of a match and that the overall strength of a team alone may not be enough. Such ‘star’ players are necessary for a wide fan base of a franchise as well. Hence, a particular team management may require at least a few star players in its team, which can be considered as an additional constraint. This then requires that instead of using a generic algorithmic technique, some customized criteria be included in choosing a single preferred team.

To demonstrate the effect of a customized criterion, we consider a scenario in which an emphasis on including star players is made. We define a star batsman as a player who has scored at least 1,500 runs with a batting average greater than 30 in the Twenty20 format and a star bowler as one who has taken at least 70 wickets with a bowling average less than 25. In our simulation, we also assume that a franchise decides to keep at least two star batsmen and three star bowlers in its team. The other constraints imposed are a six million dollar budget, a maximum of four foreign players, and at least one wicket-keeper and one captain. The resultant trade-off front is shown in Fig. 7. Again, using the multiple criteria non-dominated sorting decision making procedure outlined above, we observe that only Team Q lies on the non-dominated front using the primary criteria of fielding, captaincy and wicket-keeping. Team Q, along with a few other teams corresponding to points marked on the trade-off front, is shown in Table 3. Comparing the four teams in this table with the two teams (1 and 2) listed at the end of Section 4.2, it is clear that most of these new teams have star players, making them well accepted among the fans without having to compromise on the overall batting-bowling performance of the teams. Interestingly, all trade-off teams have exactly four foreign players, thereby making the fourth constraint active at the trade-off solutions.

Fig. 7. Trade-off front for teams with star players. (Axes: Net Bowling Average versus −Net Batting Average; marked points: Teams P, Q, R and S.)

4.4. Dynamic optimization

The above analysis focused on attaining the optimal set of teams and then using decision making approaches to identify key players and an optimal team. The first step resembles the selection of a fantasy cricket team. The knowledge obtained from such an analysis can also be directly extended to an auction scenario, which is currently in use in the IPL team selection process. This makes the player selection process a dynamic event. In an auction-based team selection system, as in the IPL, the entire pool of players is not available in one go for team selectors to choose from. Besides, the final price of players is also not fixed beforehand. Only a minimum base price for each player is usually specified. In [22], the IPL 2008 auction procedure is analyzed and interesting conclusions are drawn; the author throws light on different facets of the auction and proposes the use of a draft. Singh, Gupta and Gupta [21] formulate an integer programming model for an efficient bidding strategy for the franchises and implement it in the form of a spreadsheet for easy use. The auction broadly comprises two tasks: the assignment of players to teams and the determination of player salaries. The role of optimization comes into play mainly for the assignment of players to a team, and the player price factor acts as a constraint. Our proposal to handle such a scenario is as follows.

First, a pre-processing optimization analysis using the highest expected price of each player is done. The decision making procedure discussed above helps select one desired team from the

Table 3
Four teams chosen from the trade-off front with customized conditions on star players. The first row marks the captain of a team and second row marks the wicket-keeper, except in Team S the captain is also a wicket-keeper. Foreign players are shown in italics.

| Role | Team P | Team Q | Team R | Team S |
|---|---|---|---|---|
| Captain | Sachin Tendulkar | Yuvraj Singh | Yuvraj Singh | M.S. Dhoni |
| Wicket-keeper (player in Team S) | Wriddhiman Saha | Wriddhiman Saha | Wriddhiman Saha | Paul Collingwood |
| Players | S. Badrinath | Nathan McCullum | Kieron Pollard | R. Ashwin |
| | Manoj Tiwary | Manoj Tiwary | Manoj Tiwary | Pragyan Ojha |
| | Nathan McCullum | Brad Hodge | Dimitri Mascarenhas | Brad Hodge |
| | Suresh Raina | Suresh Raina | Suresh Raina | Munaf Patel |
| | Shaun Marsh | Piyush Chawla | Brad Hodge | Steven Smith |
| | Amit Mishra | Amit Mishra | Amit Mishra | Amit Mishra |
| | Andrew McDonald | Andrew McDonald | Andrew McDonald | Dimitri Mascarenhas |
| | Shikhar Dhawan | Shikhar Dhawan | Jaidev Unadkat | Jaidev Unadkat |
| | Dimitri Mascarenhas | Steven Smith | Sudeep Tyagi | Sudeep Tyagi |
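The star-player definitions of Section 4.3 translate directly into a feasibility check that could be added to the other constraints. The sketch below is illustrative only: the `Player` record, its field names and the sample statistics are hypothetical placeholders (they are not taken from the player database), and the sketch assumes that one player may count toward both quotas if he meets both definitions.

```python
from dataclasses import dataclass


@dataclass
class Player:
    # Hypothetical record of Twenty20 career statistics.
    name: str
    runs: int
    batting_avg: float
    wickets: int
    bowling_avg: float  # use float("inf") for a player with no wickets


def is_star_batsman(p: Player) -> bool:
    """At least 1,500 runs with a batting average greater than 30."""
    return p.runs >= 1500 and p.batting_avg > 30


def is_star_bowler(p: Player) -> bool:
    """At least 70 wickets with a bowling average less than 25."""
    return p.wickets >= 70 and p.bowling_avg < 25


def meets_star_quota(team, min_batsmen=2, min_bowlers=3) -> bool:
    """True if the team has enough star batsmen and star bowlers."""
    n_batsmen = sum(is_star_batsman(p) for p in team)
    n_bowlers = sum(is_star_bowler(p) for p in team)
    return n_batsmen >= min_batsmen and n_bowlers >= min_bowlers


# Placeholder players with made-up statistics.
batsmen = [
    Player("batsman 1", 2000, 35.0, 0, float("inf")),
    Player("batsman 2", 1500, 30.5, 0, float("inf")),
]
bowlers = [Player(f"bowler {i}", 100, 10.0, 80, 20.0) for i in range(1, 4)]
print(meets_star_quota(batsmen + bowlers))      # True
print(meets_star_quota(batsmen + bowlers[:2]))  # False
```

A team failing this check could be treated like any other infeasible team, for example by adding a normalized shortfall term to the constraint violation.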

obtained trade-off front. It may happen that in the auction pro- −100
cess, some players from the desired team are already bought by Phase 1
other franchises. Thus, team owners must know trade-offs and Phase 2
alternative choices during the auction process. Hence in such a −150
dynamic environment, a systematic approach needs to be followed.
Using the initial bar graph analysis (shown in Fig. 3), high valued

−Net Batting Average


players are first identified. These players must then be given pri- −200

ority for selection. After Phase 1 is completed, the optimization


and decision-making process need to be simulated again with the
−250
remaining players. Here, the selected players must be included in
each team considered during the optimization process and deci-
X: 556.5
sion making process. The new set of high valued players can then Y: −310.6
−300
be identified by using another bar graph analysis process and the Team 3
next set of players can be identified. At each phase, the selection of
players from the obtained trade-off front must be done judiciously −350
from the high frequency (valued) players to ensure a proper balance
of the team. Hence using such a procedure repeatedly in a dynamic Team 2
optimization mode a team with a high overall performance can be −400
obtained. We perform a case study in the following paragraph. 200 300 400 500 600 700 800 900 1000
Suppose that, in Phase 1 of the auction, 11 players with tags 27, 120, 121, 122, 123, 124, 125, 126, 127, 128, and 129 are considered in the auction pool, meaning that only these players are now available to be picked by any franchise. Let us also assume that after Phase 1 optimization and decision-making, franchise X selects Team 1 mentioned in Section 4.2 as the most preferred team. However, let us also consider a predicament where the captain of Team 1, Yuvraj Singh, has already been bought by another franchise and is no longer available for franchise X. Thus, Team 1 as a whole cannot be a preferred choice any more and franchise X now needs to look for another high-performing team from its multi-objective optimization results. Let us assume that after reviewing all the trade-off high-performing teams of Phase 1 that do not contain Yuvraj Singh, Team 2 with Sachin Tendulkar as captain (mentioned in Section 4.2) becomes the next preferred team for franchise X. Of the above-mentioned auction pool of 11 names, Sachin Tendulkar (tag number 120) is then opted for and finalized as a team member of franchise X. In the next phase, the remaining names in the above auction pool are taken to have been bought by other franchises and are no longer available for further consideration.

Fig. 8. Trade-off front obtained in Phase 2 is compared with that in Phase 1 during the auction-based decision-making case study. [Plot of −Net Batting Average against Net Bowling Average showing the Phase 1 and Phase 2 fronts, with Team 2 and Team 3 marked and a data point labeled X: 556.5, Y: −310.6.]

In this scenario, franchise X can perform another multi-objective optimization run with Sachin Tendulkar as captain and look for a team having 10 more players. From the set of 129 players, the Phase 1 auction pool players are also eliminated and the remaining 119 players become the pool from which 10 players are to be chosen. Other constraints remain and a new optimization run can be executed. The resulting trade-off front is shown in Fig. 8. It is not a surprise that, with a restricted set of (119) players available to choose from, the Phase 2 front is dominated by the Phase 1 front obtained with all 129 available players. Because one slot is fixed with Sachin Tendulkar (who is primarily a batsman), the team with the best bowling performance is somewhat worse than the best bowling team in Phase 1. However, the presence of a good batsman in the team means that the best batting team in Phase 2 performs only marginally worse than the best batting team in Phase 1. The slight deterioration in the batting performance comes from the unavailability of another batsman, Wriddhiman Saha (Tag 27), in Phase 2. After obtaining the trade-off front, we perform another bar-graph analysis to investigate the most valued players in Phase 2. The bar graph is shown in Fig. 9. It can be seen that the frequency of most players remains similar to the previous bar graph (Fig. 3), except for a few players for whom it rises sharply. For example, the frequency of Naman Ojha as a wicket-keeper shoots up, again due to the absence of Wriddhiman Saha (Tag 27). It can be observed that in Phase 1 most trade-off teams had Saha as a wicket-keeper, and his absence has increased the suitability of Naman Ojha as the next choice for a wicket-keeper. Again repeating the knee region and non-dominated sorting analysis with the new trade-off teams as before, we choose the following team as our next preferred team:

Team 3: Sachin Tendulkar (C), Naman Ojha (W), Roelof van der Merwe, Shikhar Dhawan, Manoj Tiwary, Piyush Chawla, Amit Mishra, Nathan McCullum, Andrew McDonald, Brad Hodge and Suresh Raina.

Fig. 9. Frequency of players during Phase 2 auction. [Bar graph of Frequency against Player Tag; labeled bars include Naman, Sachin, Amit, McDonald, Raina, Hodge, Unadkat, Sudeep, Manoj and Ashwin.]

Team 3 is shown on the objective space in Fig. 8 along with Team 2. Franchise X would then choose a player from Team 3 based on the next announced auction pool of players, and this process can continue till the complete team is formed. Since, at every phase, the previously chosen players are considered in both the optimization and decision-making processes, the players are picked as a team, and the dynamic formation of the team will produce the best team achievable given the auction nature of the team selection process. The above procedure is flexible: if, instead of a one-player-at-a-time auction, a multiple-player-at-a-time auction is played, the above procedure will allow a simple way to choose

a set of players to be picked at every phase. For example, if two players are allowed to be picked in every phase, then in Phase 2 the two top valued players based on the frequency or other decision-making criteria may be chosen.

It is worth mentioning here that the above dynamic optimization study took, on average, less than a minute to execute on a standard laptop; hence this procedure is realistic enough to be tried in a real auction environment. We have demonstrated how the procedure can be used in a dynamic manner with certain IPL rules and decision-making criteria, but any other IPL team selection procedures and other decision-making criteria can be simulated with the procedure to make the overall task realistic for use in a real scenario. Although many other subjective considerations may be important in making the final composition of a team, the procedures of this study can be repeated when a partial team with a set of pre-determined players is formed. Our multi-objective approach and post-optimality analysis procedure can then identify key high-performing players from the remaining list of players that would be most suitable for the formed partial team.

4.5. Generalization of proposed method to other games

The results for the proposed multi-objective optimization and decision-making methodologies are demonstrated for the game of cricket applied to the IPL Twenty20 format. However, they can be extended to many other sports with minor modifications. Any team selection problem where a few players are to be chosen from a large pool of players for which past performance and player price data are available can be dealt with using the proposed procedures of this paper. The team representation scheme and associated genetic operators suggested here are fairly generic and can be extended to other similar games. They ensure the creation of meaningful teams and in turn promise a fast computational outcome. The post-optimality analysis of identifying key contributing players and the different MCDM and subjective decision-making aspects should all carry over to other games. To illustrate, we take the example of American football. Players are divided into three separate units: the offense, the defense and the special team (kicker, punter etc.). Each team has 11 players on the field at a time and an NFL team has a limit of 53 players on its roster. Hence, from the available record of the performance and rating of individual football players, the overall offensive and defensive performances of a team played against several opponents can be taken as different objectives which the team selector would like to maximize. Thus, the batting and bowling objectives of Twenty20 cricket (used in this study) are similar to the offensive and defensive performances of a football team. The role of the quarterback as the leader of the offensive team in American football is similar to that of a captain in a cricket team. Constraints limiting the number of fullbacks, halfbacks, or wide receivers in a team are similar to restrictions on the number of wicket-keepers and foreign players in a Twenty20 cricket team. The ability of a player to play well in different positions is also valid in American football. Thus, with a little modification, our representation scheme and proposed genetic operators can be extended to team selection for American football. The obtained trade-off teams can then be analyzed to identify high-performing players for different positions. Knowing their current costs, auction-based formation of teams or any other currently regulated team selection scheme can be simulated to help build an efficient team.

5. Conclusions

We have proposed and used for the first time emergent computing methodologies for an objective evaluation of cricket team selection using a multi-objective genetic algorithm and multiple criteria decision making aids. Such problems usually must consider a number of objective and subjective criteria, all of which must be paid attention to. In this paper, we have suggested choosing a couple of main functional criteria (batting and bowling performances) during the initial multi-objective optimization study and using other more subjective criteria during the subsequent decision making phase. For the optimization task, a novel representation scheme has allowed feasible solutions to be found in a convenient manner and enabled simple genetic operators to be employed. Starting with a list of 129 players and their available statistics, the proposed modified NSGA-II procedure has been able to find a set of high-performing teams (only 35 found!) demonstrating a trade-off between their overall bowling and batting averages. In a comparison to the winning team of the IPL 4th edition (played in April–May, 2011), the teams obtained by our study are theoretically better in both bowling and batting performances and, importantly, the whole team also costs less to hire. Motivated by this result, we have then proposed a number of multiple criteria decision making analysis tools to come up with a preferred team by considering a set of subjective criteria in a systematic manner. It has been demonstrated that heuristics, such as the inclusion of star players to enhance the popularity of a team, or the use of the past captaincy record of the chosen captain of the team, can all be incorporated in the team selection process. To make the process more realistic, we have also demonstrated how a dynamic, auction-based player selection procedure used during the IPL 4th edition can be introduced in our approach and an effective, high-performing, and coherent team can be formed iteratively, keeping in mind the different performance criteria associated with forming a cricket team.

The procedures suggested and simulated in the paper clearly demonstrate the advantage of using a multi-objective computing methodology for the cricket team selection task for major league tournaments. The consideration of multiple objectives during optimization and during the decision-making process provides team selectors with a plethora of high-performing team choices before they select a single preferred team. As an added benefit, the availability of multiple high-performing teams can be exploited to identify key players appearing commonly in most trade-off teams. Since they appear in most trade-off teams, they bring in value from both the batting and bowling aspects of the game. In a dynamic selection of players one at a time, the team selectors may put more emphasis on selecting such players. Although each idea suggested here has been demonstrated with a simulation, the proposed methodology is now ready to be used in a real scenario with more realistic team selection rules and other criteria that a particular franchise may be interested in. With some fine-tuning of the proposed methodology, a GUI-based user-friendly software can be developed to customize a franchise’s options, which can then be used in practice without much knowledge of multi-objective optimization, genetic algorithms, or decision making aids. Importantly, the procedures developed here can be easily extended to other similar games of interest, a matter which we postpone as an immediate future study.

References

[1] An Explanation of Cricket, http://www.cs.purdue.edu/homes/hosking/cricket/explanation.htm
[2] Cricket Players’ Database and Statistics, 2011, http://www.iitk.ac.in/kangal/cricket
[3] G.D.I. Barr, B.S. Kantor, A criterion for comparing and selecting batsmen in limited overs cricket, Journal of the Operational Research Society 55 (2004) 1266–1274.
[4] K. Deb, Multi-objective Optimization Using Evolutionary Algorithms, Wiley, Chichester, UK, 2001.
[5] K. Deb, R.B. Agrawal, Simulated binary crossover for continuous search space, Complex Systems 9 (2) (1995) 115–148.

[6] K. Deb, S. Agrawal, A. Pratap, T. Meyarivan, A fast and elitist multi-objective [15] K. Miettinen, Nonlinear Multiobjective Optimization, Kluwer, Boston, 1999.
genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary Computation 6 [16] D. Parker, P. Burns, H. Natarajan, Player valuations in the Indian Premier League,
(2) (2002) 182–197. Frontier Economics (2008) 1–17.
[7] K. Deb, S. Gupta, Understanding knee points in bicriteria problems and their [17] I. Preston, J. Thomas, Batting strategy in limited overs cricket, Journal
implications as preferred solution principles, Engineering Optimization 43 (11) of the Royal Statistical Society: Series D (The Statistician) 49 (1) (2000)
(2011) 1175–1204. 95–106.
[8] K. Deb, A. Srinivasan, Innovization: innovating design principles through [18] J.C. Régin, Minimization of the number of breaks in sports scheduling problems
optimization, in: Proceedings of the Genetic and Evolutionary Computation using constraint programming, in: Constraint Programming and Large Scale
Conference (GECCO-2006), ACM, New York, 2006, pp. 1629–1636. Discrete Optimization: DIMACS Workshop Constraint Programming and Large
[9] A. Duarte, C. Ribeiro, S. Urrutia, E. Haeusler, Referee assignment in sports Scale Discrete Optimization, vol. 57, American Mathematical Society, 2001, pp.
leagues, in: Practice and Theory of Automated Timetabling VI, 2007, pp. 115–121.
158–173. [19] T.L. Saaty, Decision Making for Leaders: The Analytic Hierarchy Process for
[10] H. Gerber, G.D. Sharp, Selecting a limited overs cricket squad using an inte- Decisions in a Complex World, RWS Publications, Pittsburgh, PA, 2008.
ger programming model, South African Journal for Research in Sport, Physical [20] H. Saikia, D. Bhattacharjee, On classification of all-rounders of the indian
Education 90 (2006). premier league (IPL): A Bayesian approach, VIKALPA 36 (4) (2011)
[11] H. Lemmer, A measure for the batting performance of cricket players, South 51–66.
African Journal for Research in Sport, Physical Education and Recreation 26 [21] S. Singh, S. Gupta, V. Gupta, Dynamic bidding strategy for players auction
(2004) 55–64. in IPL, International Journal of Sports Science and Engineering 5 (1) (2011)
[12] H. Lemmer, A measure of the current bowling performance in cricket, South 3–16.
African Journal for Research in Sport, Physical Education and Recreation 28 (2) [22] T.B. Swartz, Drafts versus auctions in the Indian Premier League, in: Proceedings
(2006) 91–103. of the Statistical Concepts and Methods for the Modern World, 2011.
[13] H. Lemmer, An analysis of players’ performances in the first cricket Twenty20 [23] A. Vig, Efficiency of sports league: the economic implications of having two
world cup series, South African Journal for Research in Sport, Physical Education leagues in the Indian cricket market, Master’s Thesis, The University of Not-
and Recreation 30 (2) (2008) 71–77. tingham, United Kingdom, 2008.
[14] M. Lourens, Integer optimisation for the selection of a Twenty20 cricket team. [24] B. Warren, Integer optimisation for the selection of a fantasy league cricket
Master’s Thesis, Nelson Mandela Metropolitan University, Port Elizabeth, South team, Master’s Thesis, Nelson Mandela Metropolitan University, Port Elizabeth,
Africa, 2009. South Africa, 2010.
