Ch05 Software Effort Estimation
Chapter Five
Software effort estimation
Delivering:
• agreed functionality
• on time
• at the agreed cost
• with the required quality
Stages:
1. set targets
2. attempt to achieve targets
4
Project   Design         Coding         Testing        Total    SLOC
          wm     (%)     wm     (%)     wm     (%)     wm
a         3.9    23      5.3    32      7.4    44      16.7      6050
b         2.7    12      13.4   59      6.5    29      22.6      8363
c         3.5    11      26.8   83      1.9     6      32.2     13334
d         0.8    21      2.4    62      0.7    18       3.9      5942
e         1.8    10      7.7    44      7.8    45      17.3      3315
f         19.0   28      29.7   44      19.0   28      67.7     38988
g         2.1    21      7.4    74      0.5     5      10.1     38614
h         1.3     7      12.7   66      5.3    27      19.3     12762
i         8.5    14      22.7   38      28.2   47      59.5     26500
5
Project    Work-months    SLOC       Productivity (SLOC/month)
a 16.7 6,050 362
b 22.6 8,363 370
c 32.2 13,334 414
d 3.9 5,942 1,524
e 17.3 3,315 192
f 67.7 38,988 576
g 10.1 38,614 3,823
h 19.3 12,762 661
i 59.5 26,500 445
Overall 249.3 153,868 617
6
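The productivity figures in the table above are simply SLOC divided by work-months. A minimal Python sketch (a verification aid, not part of the original slides) that recomputes them:

```python
# Work-months and SLOC per project, transcribed from the table above.
projects = {
    "a": (16.7, 6050),  "b": (22.6, 8363),  "c": (32.2, 13334),
    "d": (3.9, 5942),   "e": (17.3, 3315),  "f": (67.7, 38988),
    "g": (10.1, 38614), "h": (19.3, 12762), "i": (59.5, 26500),
}
# Productivity = SLOC / work-months, rounded as in the table.
productivity = {p: round(sloc / wm) for p, (wm, sloc) in projects.items()}
```

Note how widely productivity varies (from 192 to 3,823 SLOC/month), one reason why simple size-based comparisons are unreliable.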
Over- and under-estimating
• Parkinson’s Law: ‘work expands to fill the time available’. An over-estimate is therefore likely to cause the project to take longer than it otherwise would.
• Weinberg’s Zeroth Law of Reliability: ‘a software project that does not have to meet a reliability requirement can meet any other requirement’.
8
A taxonomy of estimating methods
9
Parameters to be Estimated
• Duration
10
Measure of Work
11
Major Shortcomings of SLOC
12
Bottom-up versus top-down
• Bottom-up
• use when no past project data
13
Bottom-up estimating
1. Break the project into smaller and smaller components
2. Stop when you reach what one person can do in one or two weeks
3. Estimate costs for the lowest-level activities
4. At each higher level, calculate the estimate by adding the estimates for the lower levels
A procedural code-oriented approach
a) Envisage the number and type of modules in the final
system
b) Estimate the SLOC of each individual module
c) Estimate the work content
d) Calculate the work-days effort
14
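The bottom-up procedure above amounts to a simple roll-up of a work breakdown structure. In this sketch the component names and day figures are illustrative assumptions, not taken from the slides:

```python
# Lowest-level activity estimates (work-days) per component -- hypothetical.
wbs = {
    "invoicing": {"design": 5, "code": 8, "test": 6},
    "reporting": {"design": 3, "code": 5, "test": 4},
}
# Step 4: each higher level is the sum of the estimates below it.
component_effort = {c: sum(tasks.values()) for c, tasks in wbs.items()}
project_effort = sum(component_effort.values())  # total work-days
```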
Top-down estimates
Normally associated with parametric (or algorithmic) models.
• Produce an overall estimate using effort driver(s)
• Distribute proportions of the overall estimate to components
Example: an overall estimate of 100 days for a project might be distributed as design 30% (30 days), code 30% (30 days) and test 40% (40 days).
15
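The top-down distribution above can be expressed directly; the 100-day figure and the 30/30/40 split are the slide’s own example:

```python
# Distribute an overall estimate across phases by fixed proportions.
overall = 100  # days, the overall estimate from the effort driver(s)
proportions = {"design": 0.30, "code": 0.30, "test": 0.40}
allocation = {phase: overall * share for phase, share in proportions.items()}
```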
Algorithmic/Parametric models
17
Parametric models
• Some models focus on task or system size, e.g. Function Points
• FPs were originally used to estimate lines of code, rather than effort
• Size is driven by counts such as the number of file types and the numbers of input and output transaction types
18
Parametric models
• Other models focus on productivity, e.g. COCOMO
• Lines of code (or FPs etc.) are an input
• System size combined with productivity factors gives the estimated effort
19
Expert judgement
20
Estimating by analogy
(Case-based reasoning)
22
• Exercise:
Project B has 5 inputs and 10 outputs. What would be the Euclidean distance between this project and the target new project being considered on the previous slide? Is project B a better analogy with the target than project A?
23
Machine assistance for source selection (ANGEL)
The target project is compared with candidate source projects (Source A, Source B, …) on effort drivers such as the number of inputs and the number of outputs:
Euclidean distance = √((It − Is)² + (Ot − Os)²)
where It and Ot are the target’s input and output counts, and Is and Os are the source’s.
24
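The distance measure can be sketched as below. The target project’s counts are not given on this slide, so the figures used here (7 inputs, 15 outputs) are purely hypothetical:

```python
import math

def euclidean_distance(target, source):
    """Distance between two projects described as (inputs, outputs) pairs."""
    (it, ot), (is_, os_) = target, source
    return math.sqrt((it - is_) ** 2 + (ot - os_) ** 2)

# Hypothetical target (7 inputs, 15 outputs) vs. project B (5 inputs, 10 outputs).
d_b = euclidean_distance((7, 15), (5, 10))
```

The smaller the distance, the better the analogy.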
Stages: identify
25
Parametric models
26
Albrecht/IFPUG function points
• Albrecht worked at IBM and needed a way of
measuring the relative productivity of different
programming languages.
• Needed some way of measuring the size of an
application without counting lines of code.
• Identified five types of component or functionality in
an information system
• Counted occurrences of each type of functionality in
order to get an indication of the size of an information
system
27
Albrecht/IFPUG function points -
continued
28
Albrecht/IFPUG function points -
continued
29
Albrecht complexity multipliers
Table-1
External user type                 Low   Medium   High
EI   External input type             3      4       6
EO   External output type            4      5       7
EQ   External inquiry type           3      4       6
LIF  Logical internal file type      7     10      15
EIF  External interface file type    5      7      10
30
With FPs as originally defined by Albrecht, deciding whether an external user type is of low, average or high complexity is an intuitive judgement. In the case of logical internal files and external interface files, however, the boundaries shown in Table-2 below are used to decide the complexity level.
Table-2
Number of          Number of data types
record types       < 20      20 – 50    > 50
1                  Low       Low        Average
2 to 5             Low       Average    High
> 5                Average   High       High
31
Example
A logical internal file might contain data about purchase orders. These purchase orders might be organized into two separate record types: the main PURCHASE-ORDER details (purchase order number, supplier reference and purchase order date), and the PURCHASE-ORDER-ITEM details for each item specified in the order (product code, unit price and number ordered).
• The number of record types is 2
• The number of data types is 6
• According to Table-2, the file type is rated as ‘Low’
• According to Table-1, the FP count is therefore 7
32
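The classification in this example can be sketched as a table lookup; the values are transcribed from Table-1 and Table-2 above (‘Average’ in Table-2 is taken to correspond to ‘Medium’ in Table-1):

```python
def lif_complexity(record_types, data_types):
    """Classify a logical internal file using the Table-2 boundaries."""
    col = 0 if data_types < 20 else (1 if data_types <= 50 else 2)
    row = 0 if record_types == 1 else (1 if record_types <= 5 else 2)
    table2 = [["Low", "Low", "Average"],
              ["Low", "Average", "High"],
              ["Average", "High", "High"]]
    return table2[row][col]

# Table-1, LIF row: Low = 7, Medium/Average = 10, High = 15 FPs.
lif_fps = {"Low": 7, "Average": 10, "High": 15}

complexity = lif_complexity(record_types=2, data_types=6)  # "Low"
fp = lif_fps[complexity]                                   # 7 FPs
```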
Examples
A payroll application has:
1. A transaction to input, amend and delete employee details – an EI rated as of medium complexity
33
FP counts (refer to Table-1)
Medium complexity EI        4 FPs
High complexity EI          6 FPs
Medium complexity EO        5 FPs
Medium complexity LIF      10 FPs
Simple EIF                  5 FPs
Total                      30 FPs
If previous projects delivered 5 FPs a day, implementing the above should take 30/5 = 6 days
34
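Totalling the counts above is straightforward arithmetic; a quick check:

```python
# FP counts from the payroll example, using the Table-1 weights.
counts = {"EI medium": 4, "EI high": 6, "EO medium": 5,
          "LIF medium": 10, "EIF simple": 5}
total_fps = sum(counts.values())   # 30 FPs
days = total_fps / 5               # at 5 FPs/day: 6 days
```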
Function points Mark II
• Developed by Charles R. Symons
• ‘Software sizing and estimating - Mk II FPA’, Wiley &
Sons, 1991.
• Builds on work by Albrecht
• Work originally for CCTA:
• should be compatible with SSADM; mainly used in
UK
• has developed in parallel to IFPUG FPs
• A simpler method
35
Function points Mk II continued
FP count = Ni × 0.58 + Ne × 1.66 + No × 0.26
where Ni = number of input items, Ne = number of entities referenced, No = number of output items
36
• UFP – Unadjusted Function Points – as in the Albrecht method, information processing size is measured
• TCA – Technical Complexity Adjustment (the assumption is that an information system comprises transactions which have the basic structure shown on the previous slide)
• For each transaction the UFPs are calculated as:
Wi × (number of input data element types) +
We × (number of entity types referenced) +
Wo × (number of output data element types)
Wi, We, Wo are weightings derived by asking developers what proportions of effort were spent in previous projects developing the code dealing with inputs, accessing stored data, and producing outputs.
Wi = 0.58, We = 1.66, Wo = 0.26 (industry averages)
37
Exercise:
A cash receipt transaction in an accounts subsystem accesses two entity types, INVOICE and CASH-RECEIPT. The data inputs are: invoice number, date received, cash received. If an INVOICE record is not found for the invoice number then an error message is issued. If the invoice number is found then a CASH-RECEIPT record is created. The error message is the only output of the transaction. Calculate the unadjusted function points, using industry average weightings, for this transaction.
(0.58 × 3) + (1.66 × 2) + (0.26 × 1) = 5.32
38
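The Mk II calculation generalizes naturally to a small helper; applying it to the cash-receipt transaction reproduces the 5.32 figure above:

```python
# Industry-average Mk II weightings from the slides.
WI, WE, WO = 0.58, 1.66, 0.26

def mk2_ufp(inputs, entities, outputs):
    """Unadjusted Mk II function points for one transaction."""
    return WI * inputs + WE * entities + WO * outputs

ufp = mk2_ufp(inputs=3, entities=2, outputs=1)  # 5.32
```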
Exercise:
An annual maintenance contract subsystem has a transaction which sets up details of new annual maintenance contract customers. The inputs are:
1. Customer account number
2. Customer name
3. Address
4. Postcode
5. Customer type
6. Renewal date
All this information will be set up in a CUSTOMER record on the system’s database. If a CUSTOMER account already exists for the account number that has been input, an error message will be displayed to the operator.
Calculate the number of unadjusted Mark II function points for the transaction described above using the industry average weightings.
39
Answer:
The function types are:
Input data types     6
Entities accessed    1
Output data types    1
FP count = (0.58 × 6) + (1.66 × 1) + (0.26 × 1) = 5.40
40
Function points for embedded systems
• Mark II function points and IFPUG function points were designed for information systems environments
• They are not helpful for sizing real-time or embedded systems
• COSMIC-FFPs (Common Software Measurement Consortium full function points) attempt to extend the concept to embedded and real-time systems
• The FFP method originates in the work of two interlinked groups in Quebec, Canada
• Embedded software is seen as occupying a particular ‘layer’ in the system
• It communicates with other layers and also with other components at the same level
41
The argument is:
• the existing function point method is effective in assessing the work content of an information system
• the size of the internal procedures mirrors the external features
• in a real-time or embedded system, the features are hidden because the software’s users will probably not be human beings but hardware devices
42
• COSMIC deals with this by decomposing the system architecture into a hierarchy of software layers.
• The software component to be sized can receive requests for services from the layers above and can request services from those below.
• There may be separate software components that engage in peer-to-peer communication.
• Inputs and outputs are aggregated into data groups, where each data group brings together data items related to the same object.
43
Layered software
[Figure: the higher layers make a request for a service and receive the service in return from the lower layers]
44
COSMIC FPs
The following are counted (data groups can be moved in four ways):
• Entries (E): movement of data into software component
from a higher layer or a peer component
• Exits (X): movements of data out to a user outside its
boundary
• Reads (R): data movement from persistent storage
• Writes (W): data movement to persistent storage
47
Data movement Type
Incoming vehicles sensed E
Access vehicle count R
Signal barrier to be lifted X
Increment vehicle count W
Outgoing vehicle sensed E
Decrement vehicle count W
New maximum input E
Set new maximum W
Adjust current vehicle count E
Record adjusted vehicle count W
48
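In COSMIC sizing, each data movement counts as one COSMIC FP, so the size of this component is just a tally of the table above:

```python
from collections import Counter

# Movement types from the table: E = entry, X = exit, R = read, W = write.
movements = ["E", "R", "X", "W", "E", "W", "E", "W", "E", "W"]
by_type = Counter(movements)
cfp = len(movements)  # total size in COSMIC FPs
```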
COCOMO81
• Based on industry productivity standards; the database is constantly updated
• Allows an organization to benchmark its software development productivity
• Basic model: effort = c × size^k
• c and k depend on the type of system: organic, semi-detached, embedded
• Size is measured in KLOC, i.e. thousands of lines of code

System type                                                         c     k
Organic (broadly, information systems; small team,
highly familiar in-house environment)                              2.4   1.05
Semi-detached (intermediate between organic and embedded)          3.0   1.12
Embedded (tight hardware, software and operational constraints)    3.6   1.20
50
effort = c × (size)^k
51
Estimation of development time
Tdev = a × (Effort)^b
Organic:        Tdev = 2.5 × (Effort)^0.38
Semi-detached:  Tdev = 2.5 × (Effort)^0.35
Embedded:       Tdev = 2.5 × (Effort)^0.32
52
[Figure: Estimated effort vs. size for embedded, semi-detached and organic systems]
Effort vs. product size (effort is super-linear in the size of the software): the effort required to develop a product increases very rapidly with project size, and for a given size an embedded system needs more effort than a semi-detached one, which in turn needs more than an organic one.
[Figure: Nominal development time vs. size for embedded, semi-detached and organic systems]
54
Exercise: Two software managers separately estimated a given product to be of 10,000 and 15,000 lines of code respectively. Bring out the effort and schedule time implications of their estimates using COCOMO. For the effort estimation, use a coefficient value of 3.2 and an exponent value of 1.05. For the schedule time estimation, the corresponding values are 2.5 and 0.38 respectively. Assume all adjustment multipliers to be equal to unity.
55
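A sketch of the exercise’s arithmetic, using the coefficients given (effort = 3.2 × size^1.05, Tdev = 2.5 × effort^0.38, size in KLOC, all multipliers at unity):

```python
def effort_pm(kloc, c=3.2, k=1.05):
    """COCOMO effort in person-months for the given size in KLOC."""
    return c * kloc ** k

def tdev_months(effort, a=2.5, b=0.38):
    """COCOMO development time in months for the given effort."""
    return a * effort ** b

e10, e15 = effort_pm(10), effort_pm(15)        # ~35.9 pm and ~55.0 pm
t10, t15 = tdev_months(e10), tdev_months(e15)  # ~9.7 and ~11.5 months
```

Note that a 50% larger size estimate raises the effort by about 53% but the schedule by only about 18%.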
COCOMO II
An updated version of COCOMO:
• There are different COCOMO II models for estimating at different stages of a project (for example, the ‘early design’ and ‘post-architecture’ stages covered in the following slides)
56
COCOMO II scale factor
Boehm et al. have refined a family of cost estimation models; the key one is COCOMO II. It uses effort multipliers and an exponent based on five scale factors, which capture influences that appear to be particularly sensitive to system size.
58
Example of scale factor
• A software development team is developing an application which is very similar to previous ones it has developed, so PREC (precedentedness) is very high (score 1.24).
• A very precise software engineering document lays down very strict requirements, so FLEX (development flexibility) is very low (score 5.07).
• The good news is that these tight requirements are unlikely to change: RESL (risk resolution) is high (score 2.83).
• The team is tightly knit (TEAM has a high score of 2.19), but processes are informal (so PMAT, process maturity, is low and scores 6.24).
59
Scale factor calculation
60
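Assuming the standard COCOMO II form for the exponent, E = B + 0.01 × Σ(scale factor scores) with B = 0.91 (the published value, stated here as an assumption since the slides give no formula), the example’s scores give:

```python
# Scale factor scores from the example above.
scores = {"PREC": 1.24, "FLEX": 5.07, "RESL": 2.83,
          "TEAM": 2.19, "PMAT": 6.24}
# Exponent used in effort = A * size**E.
exponent = 0.91 + 0.01 * sum(scores.values())  # ~1.0857
```

An exponent only slightly above 1 means this project suffers mild diseconomies of scale.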
Exercise:
A new project has ‘average’ novelty for the software supplier that is going to execute it and is thus given a nominal rating on this account for precedentedness. Development flexibility is high; requirements may change radically, and so the risk resolution exponent is rated very low. The development team are all located in the same office and this leads to team cohesion being rated as very high, but the software house as a whole tends to be very informal in its standards and procedures, and the process maturity driver has therefore been given a rating of ‘low’.
(i) What would be the scale factor (sf) in this case?
(ii) What would the estimated effort be if the size of the
62
Effort multipliers
(COCOMO II - early design)
63
Effort multipliers
(COCOMO II - early design)
Table-3
        Extra low  Very low  Low    Nominal  High   Very high  Extra high
RCPX    0.49       0.60      0.83   1.00     1.33   1.91       2.72
64
Example
65
Example – continued
Refer to Table-3:
RCPX very high 1.91
PDIF very high 1.81
PERS extra high 0.50
PREX nominal 1.00
All other factors are nominal
Say estimate is 35.8 person months
With effort multipliers this becomes 35.8 x 1.91 x 1.81 x 0.5
= 61.9 person months
66
Exercise:
A software supplier has to produce an application that controls a piece of equipment in a factory. A high degree of reliability is needed as a malfunction could injure the operators. The algorithms to control the equipment are also complex. Product reliability and complexity are therefore rated as very high. The company would like to take the opportunity to exploit fully the investment it made in the project by reusing the control system, with suitable modifications, on future contracts. The reusability requirement is therefore rated as very high. Developers are familiar with the platform and the possibility of potential problems in that respect is regarded as low. The current staff are generally very capable and are rated as very high, but the project is in a somewhat novel application domain for them, so experience is rated as nominal. The toolsets available to the developers are judged to be typical for the size of company and are rated nominal, as is the degree of schedule pressure to meet the deadline.
67
Given the data table-3
68
Factor   Description                          Rating      Effort multiplier
RCPX     Product reliability and complexity   Very high   1.91
RUSE     Reuse                                Very high   1.15
PDIF     Platform difficulty                  Low         0.87
PERS     Personnel capability                 Very high   0.63
PREX     Personnel experience                 Nominal     1.00
FCIL     Facilities available                 Nominal     1.00
SCED     Required development schedule        Nominal     1.00
71
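The multipliers in the completed table combine by multiplication to scale the nominal effort estimate; a quick sketch:

```python
# Effort multipliers from the table above (nominal factors contribute 1.00).
multipliers = [1.91, 1.15, 0.87, 0.63, 1.00, 1.00, 1.00]
combined = 1.0
for m in multipliers:
    combined *= m
# combined is ~1.20: roughly a 20% uplift on the nominal estimate.
```

The very capable staff (PERS 0.63) largely offset the demanding product (RCPX 1.91).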
New development effort multipliers (dem)
72
COCOMO II post-architecture effort multipliers
Modifier type          Code    Effort multiplier
Product attributes     RELY    Required software reliability
                       DATA    Database size
                       DOCU    Documentation match to life-cycle needs
                       CPLX    Product complexity
                       RUSE    Required reusability
Platform attributes    TIME    Execution time constraint
                       STOR    Main storage constraint
                       PVOL    Platform volatility
Personnel attributes   ACAP    Analyst capability
                       AEXP    Application experience
                       PCAP    Programmer capabilities
                       PEXP    Platform experience
                       LEXP    Programming language experience
                       PCON    Personnel continuity
Project attributes     TOOL    Use of software tools
                       SITE    Multisite development
                       SCED    Schedule pressure
73
Staffing
• Norden was one of the first to investigate staffing patterns:
• He considered general research and development (R&D) projects and the efficient utilization of manpower.
• Norden concluded that the staffing pattern for any R&D project can be approximated by the Rayleigh distribution curve.
[Figure: Rayleigh-Norden curve – manpower plotted against time, peaking at time TD]
74
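The slides give no formula for the curve; one commonly quoted form of the Norden/Rayleigh staffing model (stated here as an assumption) is m(t) = (K/td²) · t · exp(−t²/2td²), where K is the total effort and td the time of peak staffing:

```python
import math

def rayleigh_staffing(t, K, td):
    """Staffing level at time t; the curve peaks at t = td."""
    return (K / td ** 2) * t * math.exp(-t ** 2 / (2 * td ** 2))

# Staffing rises to a peak at t = td and then tails off gradually.
levels = [rayleigh_staffing(t, K=100.0, td=10.0) for t in (5.0, 10.0, 15.0)]
```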
Putnam’s Work
75
Example
If the development time of a project is halved, the effort required rises roughly 16 times (under Putnam’s model, effort varies inversely as the fourth power of development time, and (1/2)⁻⁴ = 16).
Why?
76
• The extra effort can be attributed to the increased communication requirements and the free time of developers waiting for work.
• The project manager recruits a large number of developers hoping to complete the project early, but it becomes very difficult to keep these additional developers continuously occupied with work.
• Implicit in the schedule and duration estimates arrived at using the COCOMO model is the assumption that all developers can continuously be assigned work.
• However, when a large number of developers are hired to decrease the duration significantly, it becomes difficult to keep all of them busy all the time, because the scope for simultaneous work is restricted.
77
Exercise:
The nominal effort and duration of a project are estimated to be 1000 person-months and 15 months. The project is negotiated at £200,000, but the customer needs the product to be developed and delivered in 12 months’ time. What is the new cost that needs to be negotiated?
78
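A sketch of one way to answer this, assuming Putnam’s trade-off that effort varies as 1/td⁴ and that cost scales with effort (both assumptions, since the slide states neither):

```python
nominal_effort_pm = 1000   # person-months at the nominal 15-month schedule
nominal_cost = 200_000     # pounds
# Compressing the schedule from 15 to 12 months inflates effort by (15/12)**4.
factor = (15 / 12) ** 4                   # = 2.4414...
new_effort = nominal_effort_pm * factor   # ~2441 person-months
new_cost = nominal_cost * factor          # ~488,281 pounds
```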
Boehm’s Result
• There is a limit beyond which a software project cannot reduce its schedule by buying any more personnel or equipment.
• This limit occurs roughly at 75% of the nominal development time.
79
Capers Jones’ Estimating Rules of Thumb
• Empirical rules (published in an IEEE journal, 1996):
• Formulated based on observations; no scientific basis
• Because of their simplicity, these rules are handy for making off-hand estimates, though they are not expected to yield very accurate results
• They give an insight into many aspects of a project for which no formal methodologies exist yet
80
Capers Jones’ Rules
81
Illustration:
The size of a project is estimated to be 150 function points.
Rule 1: 150 × 125 = 18,750 SLOC
Rule 2: Development time = 150^0.4 = 7.42 ≈ 8 months
Rule 3: The original requirements will grow by 2% per month, i.e. 2% of 150 is 3 FPs per month.
o If the duration of requirements specification and testing is 5 months out of the total development time of 8 months, the total requirements creep will be roughly 3 × 5 = 15 function points.
o The total size of the project considering the creep will be 150 + 15 = 165 function points, and the manager needs to plan on 165 function points.
82
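The illustration’s arithmetic, written out (rules and figures as stated above):

```python
fp = 150
sloc = fp * 125            # Rule 1: ~125 SLOC per FP -> 18,750 SLOC
dev_months = fp ** 0.4     # Rule 2: ~7.42, rounded up to 8 months
monthly_creep = 0.02 * fp  # Rule 3: 2% of 150 = 3 FPs per month
creep = monthly_creep * 5  # over the 5 months of specification and testing
total_fp = fp + creep      # 165 FPs to plan for
```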
Capers Jones’ Rules
• Rule 4: Defect removal efficiency:
• Each software review, inspection, or test step will find and remove 30% of the bugs that are present. (Companies use a series of defect removal steps such as requirements review, code inspection and code walk-through, followed by unit, integration and system testing. A series of about ten consecutive defect removal operations must be utilized to achieve good product reliability.)
• Rule 5: Project manpower estimation:
• The size of the software (in function points) divided by 150 predicts the approximate number of personnel required for developing the application. (For a project size of 500 FPs, the number of development personnel will be roughly 500/150 ≈ 3, without considering other aspects such as the use of CASE tools, project complexity and programming languages.)
83
Capers Jones’ Rules
• Rule 6: Software development effort estimation:
• The approximate number of staff-months of effort required to develop a software product is given by the software development time multiplied by the number of personnel required. (Using Rules 2 and 5, the effort estimate for the project size of 150 FPs is 8 × 1 = 8 person-months.)
• Rule 7: Number of personnel for maintenance:
• Function points divided by 500 predicts the approximate number of personnel required for regular maintenance activities. (As per Rule 1, 500 function points is equivalent to about 62,500 SLOC of a C program; one maintenance person would be required to carry out minor fixes and functionality adaptations for an application of this size.)
84
Some conclusions: how to review
estimates
85