Cs615pe STM Unit-3
MOTIVATION:
o Flow graphs are an abstract representation of programs.
o Any question about a program can be cast into an equivalent question about an
appropriate flowgraph.
o Most software development, testing and debugging tools use flow graph analysis
techniques.
PATH PRODUCTS:
o Normally, flow graphs are used to denote only control-flow connectivity.
o The simplest weight we can give to a link is a name.
o Using link names as weights, we then convert the graphical flow graph into an
equivalent algebraic-like expression, which denotes the set of all possible paths from
entry to exit for the flow graph.
o Every link of a graph can be given a name.
o The link name will be denoted by lowercase italic letters.
o In tracing a path or path segment through a flow graph, you traverse a succession of
link names.
o The name of the path or path segment that corresponds to those links is expressed
naturally by concatenating those link names.
o For example, if you traverse links a,b,c and d along some path, the name for that path
segment is abcd. This path name is also called a path product. Figure 5.1 shows some
examples:
o The name of a path that consists of two successive path segments is conveniently
expressed by the concatenation or Path Product of the segment names.
o For example, if X and Y are defined as X = abcde and Y = fghij, then the path
corresponding to X followed by Y is denoted by
XY = abcdefghij
o Similarly,
YX = fghijabcde
aX = aabcde
Xa = abcdea
XaX = abcdeaabcde
o If X and Y represent sets of paths or path expressions, their product represents the set of
paths that can be obtained by following every element of X by any element of Y in all
possible ways. For example,
X = abc + def + ghi
Y = uvw + z
Then,
XY = abcuvw + defuvw + ghiuvw + abcz + defz + ghiz
o If a link or segment name is repeated, that fact is denoted by an exponent. The
exponent's value denotes the number of repetitions:
o a^1 = a; a^2 = aa; a^3 = aaa; a^n = aaa...a (n times). Similarly, if X = abcde then
X^1 = abcde
X^2 = abcdeabcde = (abcde)^2
X^3 = abcdeabcdeabcde = (abcde)^2 abcde
= abcde(abcde)^2 = (abcde)^3
o The path product is not commutative (that is, in general XY != YX).
o The path product is Associative.
RULE 1: A(BC)=(AB)C=ABC
where A,B,C are path names, set of path names or path expressions.
o The zeroth power of a link name, path product, or path expression is also needed for
completeness. It is denoted by the numeral "1" and denotes the "path" whose length is
zero - that is, the path that doesn't have any links.
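The path-product rules above can be tried out with a minimal sketch. It assumes that a set of paths is modelled as a Python set of strings, one character per link name, and that the empty string stands for the zero-length path "1". The helper names path_product and path_power are illustrative only and do not come from the notes.

```python
from itertools import product


def path_product(xs, ys):
    """Follow every path in xs by every path in ys, in all possible ways."""
    return {x + y for x, y in product(xs, ys)}


def path_power(xs, n):
    """Repeat the path set xs n times; the zeroth power is the zero-length path."""
    result = {""}  # "" plays the role of the path "1" (no links)
    for _ in range(n):
        result = path_product(result, xs)
    return result


X = {"abc", "def", "ghi"}   # X = abc + def + ghi
Y = {"uvw", "z"}            # Y = uvw + z

XY = path_product(X, Y)
YX = path_product(Y, X)

print(sorted(XY))
print(XY == YX)                      # the product is not commutative
print(path_power({"abcde"}, 3))      # (abcde)^3
```

Representing "1" as the empty string makes RULE 1 (associativity) come for free from string concatenation.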
PATH SUMS:
o The "+" sign was used to denote the fact that path names were part of the same set of
paths.
o The "PATH SUM" denotes paths in parallel between nodes.
o Links a and b in Figure 5.1a are parallel paths and are denoted by a + b. Similarly, links
c and d are parallel paths between the next two nodes and are denoted by c + d.
o The set of all paths between nodes 1 and 2 can be thought of as a set of parallel paths
and denoted by eacf+eadf+ebcf+ebdf.
o If X and Y are sets of paths that lie between the same pair of nodes, then X+Y denotes
the UNION of those sets of paths. For example, in Figure 5.2:
DISTRIBUTIVE LAWS:
o The product and sum operations are distributive, and the ordinary rules of
multiplication apply; that is
RULE 4: A(B+C)=AB+AC and (B+C)D=BD+CD
o Applying these rules to Figure 5.1a yields
e(a+b)(c+d)f = e(ac+ad+bc+bd)f = eacf+eadf+ebcf+ebdf
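As a quick check of the expansion above, here is a small sketch that treats a path sum as set union and a path product as pairwise concatenation. The helper names path_sum and path_product and the sample sets A, B, C are assumptions made for illustration.

```python
def path_product(xs, ys):
    """Concatenate every path in xs with every path in ys."""
    return {x + y for x in xs for y in ys}


def path_sum(xs, ys):
    """Parallel paths between the same nodes: the union of the two sets."""
    return xs | ys


e, f = {"e"}, {"f"}
ab = path_sum({"a"}, {"b"})   # a + b
cd = path_sum({"c"}, {"d"})   # c + d

# e(a+b)(c+d)f
all_paths = path_product(path_product(path_product(e, ab), cd), f)

# RULE 4 on arbitrary sample sets: A(B+C) = AB + AC
A, B, C = {"x", "yz"}, {"p"}, {"q", "rs"}
left = path_product(A, path_sum(B, C))
right = path_sum(path_product(A, B), path_product(A, C))

print(sorted(all_paths))
print(left == right)
```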
ABSORPTION RULE:
o If X and Y denote the same set of paths, then the union of these sets is unchanged;
consequently,
RULE 5: X+X=X (Absorption Rule)
o If a set consists of path names and a member of that set is added to it, the "new" name,
which is already in that set of names, contributes nothing and can be ignored.
o For example, if X = a + aa + abc + abcd + def, then
X+a = X+aa = X+abc = X+abcd = X+def = X
o It follows that any arbitrary sum of identical path expressions reduces to the same path
expression.
RULES 6 - 16:
o The following rules can be derived from the previous rules:
o In Rules 6, 8, 9 and 13, X^[n] stands for the exponent that the original notation writes
with an underscore: X^[n] = X^0 + X^1 + ... + X^n, that is, X taken at most n times.
RULE 6: X^[n] + X^[m] = X^[n] if n > m
        X^[n] + X^[m] = X^[m] if m > n
RULE 7: X^n X^m = X^(n+m)
RULE 8: X^[n] X* = X* X^[n] = X*
RULE 9: X^[n] X+ = X+ X^[n] = X+
RULE 10: X* X+ = X+ X* = X+
RULE 11: 1 + 1 = 1
RULE 12: 1X = X1 = X
Following or preceding a set of paths by a path of zero length does not change the set.
RULE 13: 1^n = 1^[n] = 1* = 1+ = 1
No matter how often you traverse a path of zero length, it is a path of zero length.
RULE 14: 1+ + 1 = 1* = 1
The null set of paths is denoted by the numeral 0. It obeys the following rules:
RULE 15: X+0 = 0+X = X
RULE 16: 0X = X0 = 0
If you block the paths of a graph fore or aft by a graph that has no paths, there won't be
any paths.
o Removing the loop and then node 6 results in the following expression:
a(bgjf)*b(c+gkh)d((ilhd)*imf(bjgf)*b(c+gkh)d)*(ilhd)*e
o You can practice by applying the algorithm on the following flowgraphs and generate
their respective path expressions:
o The arithmetic is ordinary algebra. The weight is the number of paths in each set.
o EXAMPLE:
Each link represents a single link and consequently is given a weight of "1" to
start. Let's say the outer loop will be taken exactly four times and the inner loop
can be taken zero to three times. Its path expression, with a little work, is:
Path expression: a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
A: The flow graph should be annotated by replacing the link name with the
maximum number of paths through that link (1) and also noting the number of times for
looping.
B: Combine the first pair of parallel links outside the loop and also the pair in
the outer loop.
C: Multiply the things out and remove nodes to clear the clutter.
D: Evaluate the inner loop, which can be taken zero to three times (written 1^[3]):
1^[3] = 1^0 + 1^1 + 1^2 + 1^3 = 1 + 1 + 1 + 1 = 4
E: Multiply the link weights inside the loop: 1 x 4 = 4
F: Evaluate the loop by multiplying the link weights: 2 x 4 = 8.
G: Simplifying the loop further results in the total maximum number of paths in
the flowgraph:
2 x 8^4 x 4 = 32,768.
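a(b+c)d{e(fi)*fgj(m+l)k}*e(fi)*fgh
= 1(1 + 1)1(1(1 x 1)^[3] 1 x 1 x 1(1 + 1)1)^4 1(1 x 1)^[3] 1 x 1 x 1
= 2(1^[3] 1 x (2))^4 1^[3]
= 2(4 x 2)^4 x 4
= 2 x 8^4 x 4 = 32,768
Here ^[3] marks a loop taken zero to three times, so 1^[3] = 1^0 + 1^1 + 1^2 + 1^3 = 4.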
This is the same result we got graphically. Actually, the outer loop should be taken exactly four times.
That doesn't mean it will be taken zero or four times. Consequently, there is a superfluous "4" on the
outlink in the last step. Therefore the maximum number of different paths is 8192 rather than 32,768.
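The arithmetic of this example can be replayed with a short sketch. It assumes every link has weight 1, the inner loop (fi) is taken zero to three times, and the outer loop is raised to the fourth power, as in the calculation above. The helper name bounded_loop and the variable names are illustrative only.

```python
def bounded_loop(weight, max_times):
    """Weight of a loop taken 0..max_times times: w^0 + w^1 + ... + w^max_times."""
    return sum(weight ** k for k in range(max_times + 1))


link = 1                                   # every link starts with weight 1
parallel = link + link                     # b + c, and likewise m + l
inner = bounded_loop(link * link, 3)       # (fi) taken zero to three times

prefix = link * parallel * link                                       # a(b+c)d
outer_body = link * inner * link * link * link * parallel * link      # e(fi)*fgj(m+l)k
tail = link * inner * link * link * link                              # e(fi)*fgh

as_computed = prefix * outer_body ** 4 * tail     # 2 x 8^4 x 4
# The text's correction: drop the superfluous factor of 4.
corrected = as_computed // 4

print(inner, outer_body, as_computed, corrected)
```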
STRUCTURED FLOWGRAPH:
Structured code can be defined in several different ways that do not involve ad-hoc rules such as not
using GOTOs.
A structured flowgraph is one that can be reduced to a single link by successive application of the
transformations of Figure 5.7.
The node-by-node reduction procedure can also be used as a test for structured code. Flow graphs that
contain none of the graphs shown below (Figure 5.8) as subgraphs are structured.
1. Jumping into loops
2. Jumping out of loops
3. Branching into decisions
4. Branching out of decisions
This question can be answered under suitable assumptions primarily that all probabilities involved are
independent, which is to say that all decisions are independent and uncorrelated. We use the same
algorithm as before: node-by-node removal of uninteresting nodes.
Weights, Notations and Arithmetic:
Probabilities can come into the act only at decisions (including decisions
associated with loops).
Annotate each outlink with a weight equal to the probability of going in that
direction.
Evidently, the sum of the outlink probabilities must equal 1.
For a simple loop, if the loop will be taken a mean of N times, the looping
probability is N/(N + 1) and the probability of not looping is 1/(N + 1).
A link that is not part of a decision node has a probability of 1.
The arithmetic rules are those of ordinary arithmetic.
Following the above rule, all we've done is replace the outgoing probability with 1 - so why the
complicated rule? After a few steps in which you've removed nodes, combined parallel terms,
removed loops, and the like, you might find something like this:
Let us do this in three parts, starting with case A. Note that the sum of the probabilities at each decision
node is equal to 1. Start by throwing away anything that isn't on the way to case A, and then apply the
reduction procedure. To avoid clutter, we usually leave out probabilities equal to 1.
CASE A:
This checks. It's a good idea when doing this sort of thing to calculate all the probabilities
and to verify that the sum of the routine's exit probabilities does equal 1.
If it doesn't, then you've made a calculation error or, more likely, you've left out some branch.
How about path probabilities? That's easy. Just trace the path of interest and multiply the
probabilities as you go.
Alternatively, write down the path name and do the indicated arithmetic operation.
Say that a path consisted of links a, b, c, d, e, and the associated probabilities were 0.2, 0.5, 1.0,
0.01, and 1 respectively. Path abcbcbcdeabddea would have a probability of 5 x 10^-10.
Long paths are usually improbable.
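The path-probability calculation above can be reproduced with a few lines. This is a sketch that assumes the link probabilities are independent, as stated earlier. The dictionary and function names are illustrative only.

```python
from math import isclose, prod

# Link probabilities from the example: a, b, c, d, e
link_probability = {"a": 0.2, "b": 0.5, "c": 1.0, "d": 0.01, "e": 1.0}


def path_probability(path, probabilities):
    """Multiply the probabilities of the links in the order they are traversed."""
    return prod(probabilities[link] for link in path)


p = path_probability("abcbcbcdeabddea", link_probability)
print(p)
print(isclose(p, 5e-10, rel_tol=1e-9))
```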
EXAMPLE:
1. Start with the original flow graph annotated with probabilities and processing time.
2. Combine the parallel links of the outer loop. The result is just the mean of the processing
times for the links because there aren't any other links leaving the first node. Also combine
the pair of links at the beginning of the flow graph.
PUSH/POP, GET/RETURN:
This model can be used to answer several different questions that can turn up in debugging. It can also
help decide which test cases to design.
The question is:
Given a pair of complementary operations such as PUSH (the stack) and POP (the stack),
considering the set of all possible paths through the routine, what is the net effect of the routine?
PUSH or POP? How many times? Under what conditions?
Here are some other examples of complementary operations to which this model applies:
GET/RETURN a resource block.
OPEN/CLOSE a file.
START/STOP a device or process.
EXAMPLE 1 (PUSH / POP):
Here is the Push/Pop Arithmetic:
G(G + R)G(GR)*GGR*R
= G(G + R)G^3 R*R
= (G + R)G^3 R*
= (G^4 + G^2)R*
This expression specifies the conditions under which the resources will be balanced on leaving the
routine.
If the upper branch is taken at the first decision, the second loop must be taken four times.
If the lower branch is taken at the first decision, the second loop must be taken twice.
For any other values, the routine will not balance. Therefore, the first loop does not have to be
instrumented to verify this behavior because its impact should be nil.
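The balance conditions can be checked by brute force. The sketch below assumes that G and R in the arithmetic above stand for a complementary pair such as GET and RETURN, counted as +1 and -1, and that the path expression is G(G + R)G(GR)*GGR*R. The function name net_effect and the search bound of 10 loop iterations are arbitrary choices for illustration.

```python
def net_effect(first_branch, first_loop_times, second_loop_times):
    """Net count (G = +1, R = -1) along one path of G(G + R)G(GR)*GGR*R."""
    weight = {"G": +1, "R": -1}
    path = (
        "G" + first_branch + "G"
        + "GR" * first_loop_times      # (GR)* : the first loop
        + "GG"
        + "R" * second_loop_times      # R*    : the second loop
        + "R"
    )
    return sum(weight[op] for op in path)


# For each alternative at the first decision, find the second-loop counts
# that leave the routine balanced (net effect 0).
balanced = {
    branch: [k for k in range(10) if net_effect(branch, 0, k) == 0]
    for branch in ("G", "R")
}
print(balanced)

# The first loop (GR)* contributes nothing, however often it is taken.
print(all(net_effect("G", i, 4) == 0 for i in range(10)))
```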
THE PROBLEM:
o The generic flow-anomaly detection problem (note: not just data-flow anomalies, but
any flow anomaly) is that of looking for a specific sequence of operations considering all
possible paths through a routine.
o Let the operations be SET and RESET, denoted by s and r respectively, and we want to
know if there is a SET followed immediately by a SET or a RESET followed immediately
by a RESET (an ss or an rr sequence).
o Some more application examples:
1. A file can be opened (o), closed (c), read (r), or written (w). If the file is read or
written to after it's been closed, the sequence is nonsensical. Therefore, cr and
cw are anomalous. Similarly, if the file is read before it's been written, just after
opening, we may have a bug. Therefore, or is also anomalous. Furthermore, oo
and cc, though not actual bugs, are a waste of time and therefore should also be
examined.
2. A tape transport can do a rewind (d), fast-forward (f), read (r), write (w), stop
(p), and skip (k). There are rules concerning the use of the transport; for
example, you cannot go from rewind to fast-forward without an intervening stop
or from rewind or fast-forward to read or write without an intervening stop. The
following sequences are anomalous: df, dr, dw, fd, and fr. Does the flowgraph
lead to anomalous sequences on any path? If so, what sequences and under what
circumstances?
3. The data-flow anomalies discussed in Unit 4 require us to detect the dd, dk, kk,
and ku sequences. Are there paths with anomalous data flows?
THE METHOD:
o Annotate each link in the graph with the appropriate operator or the null operator 1.
o Simplify things to the extent possible, using the fact that a + a = a and 1^2 = 1.
o You now have a regular expression that denotes all the possible sequences of operators
in that graph. You can now examine that regular expression for the sequences of
interest.
o HUANG's Theorem: Let A, B, and C be nonempty sets of character sequences whose smallest
string is at least one character long. Let T be a two-character string. Then
if T is a substring of (i.e., if T appears within) AB^nC, then T will appear in AB^2C.
o As an example, let
A = pp, B = srr, C = rp, T = ss
The theorem states that ss will appear in pp(srr)^n rp if it appears in pp(srr)^2 rp.
o As a less obvious example, let
A = p + pp + ps
B = psr + ps(r + ps)
C = rp
T = p^4
Is it obvious that there is a p^4 sequence in AB^nC? The theorem states that we have only
to look at
(p + pp + ps)[psr + ps(r + ps)]^2 rp
Multiplying out the expression and simplifying shows that there is no p^4 sequence.
o Incidentally, the above observation is an informal proof of the wisdom of looping twice
discussed in Unit 2. Because data-flow anomalies are represented by two-character
sequences, it follows from the above theorem that looping twice is what you need to do to
find such anomalies.
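The examples above are small enough to check by enumeration. The sketch below assumes A, B and C are finite sets of strings, builds every string in AB^nC for a given n, and searches for T. The function names are illustrative, and this brute-force search is a check on the examples, not a proof of the theorem.

```python
from itertools import product


def strings_of(A, B, C, n):
    """All strings in A B^n C, where each set holds alternative sequences."""
    factors = [A] + [B] * n + [C]
    return {"".join(parts) for parts in product(*factors)}


def appears(T, A, B, C, n):
    """True if T occurs inside at least one string of A B^n C."""
    return any(T in s for s in strings_of(A, B, C, n))


# First example: A = pp, B = srr, C = rp
A1, B1, C1 = {"pp"}, {"srr"}, {"rp"}
ss_found = appears("ss", A1, B1, C1, 2)
rs_by_n = [appears("rs", A1, B1, C1, n) for n in (1, 2, 3)]

# Second example: A = p + pp + ps, B = psr + ps(r + ps), C = rp
A2 = {"p", "pp", "ps"}
B2 = {"psr", "psps"}          # psr + psr + psps, with the duplicate absorbed
C2 = {"rp"}
p4_found = [appears("pppp", A2, B2, C2, n) for n in (1, 2, 3, 4)]

print(ss_found, rs_by_n, p4_found)
```

In the first example the sequence rs only shows up once the loop is taken at least twice, which is the situation the looping-twice advice is aimed at.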
LIMITATIONS:
o Huang's theorem can be easily generalized to cover sequences of greater length than
two characters. Beyond three characters, though, things get complex and this method
has probably reached its utilitarian limit for manual application.
o There are some nice theorems for finding sequences that occur at the beginnings and
ends of strings but no nice algorithms for finding strings buried in an expression.
o Static flow analysis methods can't determine whether a path is or is not achievable.
Unless the flow analysis includes symbolic execution or similar techniques, the impact
of unachievable paths will not be included in the analysis.
The flow-anomaly application, for example, doesn't tell us that there will be a flow anomaly -
it tells us that if the path is achievable, then there will be a flow anomaly. Such analytical
problems go away, of course, if you take the trouble to design routines for which all paths are
achievable.
INTRODUCTION:
o The functional requirements of many programs can be specified by decision tables, which provide a
useful basis for program and test design.
o Consistency and completeness can be analyzed by using boolean algebra, which can also be used as a
basis for test design. Boolean algebra is trivialized by using Karnaugh-Veitch charts.
o "Logic" is one of the most often used words in programmers' vocabularies but one of their least used
techniques.
o Boolean algebra is to logic as arithmetic is to mathematics. Without it, the tester or programmer is cut
off from many test and design techniques and tools that incorporate those techniques.
o Logic has been, for several decades, the primary tool of hardware logic designers.
o Many test methods developed for hardware logic can be adapted to software logic testing. Because
hardware testing automation is 10 to 15 years ahead of software testing automation, hardware testing
methods and their associated theory are a fertile ground for software testing methods.
o As programming and test techniques have improved, the bugs have shifted closer to the process front
end, to requirements and their specifications. These bugs range from 8% to 30% of the total and
because they're first-in and last-out, they're the costliest of all.
o The trouble with specifications is that they're hard to express.
o Boolean algebra (also known as the sentential calculus) is the most basic of all logic systems.
o Higher-order logic systems are needed and used for formal specifications.
o Much of logical analysis can be and is embedded in tools. But these tools incorporate methods to
simplify, transform, and check specifications, and the methods are to a large extent based on boolean
algebra.
The knowledge-based system (also expert system, or "artificial intelligence" system) has become the
programming construct of choice for many applications that were once considered very difficult.
Knowledge-based systems incorporate knowledge from a knowledge domain such as medicine, law,
or civil engineering into a database. The data can then be queried and interacted with to provide
solutions to problems in that domain.
One implementation of knowledge-based systems is to incorporate the expert's knowledge into a set
of rules. The user can then provide data and ask questions based on that data.
The user's data is processed through the rule base to yield conclusions (tentative or definite) and
requests for more data. The processing is done by a program called the inference engine.
Understanding knowledge-based systems and their validation problems requires an understanding of
formal logic.
o Decision tables are extensively used in business data processing; decision-table preprocessors as
extensions to COBOL are in common use; boolean algebra is embedded in the implementation
of these processors.
o Although programmed tools are nice to have, most of the benefits of boolean algebra can be
reaped by wholly manual means if you have the right conceptual tool: the Karnaugh-Veitch
diagram is that conceptual tool.
Figure 6.1 is a limited-entry decision table. It consists of four areas called the condition stub,
the condition entry, the action stub, and the action entry.
Each column of the table is a rule that specifies the conditions under which the actions named
in the action stub will take place.
The condition stub is a list of names of conditions.
Action 1 will take place if conditions 1 and 2 are met and if conditions 3 and 4 are not met (rule
1) or if conditions 1, 3, and 4 are met (rule 2).
"Condition" is another word for predicate.
Decision-table terminology uses "condition" and "satisfied" or "met". Let us use "predicate" and TRUE /
FALSE.
Now the above translations become:
1. Action 1 will be taken if predicates 1 and 2 are true and if predicates 3 and 4 are false
(rule 1), or if predicates 1, 3, and 4 are true (rule 2).
2. Action 2 will be taken if the predicates are all false, (rule 3).
3. Action 3 will take place if predicate 1 is false and predicate 4 is true (rule 4).
In addition to the stated rules, we also need a Default Rule that specifies the default action to
be taken when all other rules fail. The default rules for the table in Figure 6.1 are shown in
Figure 6.3.
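The four rules translated above can be written down as a small table-driven sketch. Figure 6.1 is not reproduced here, so the encoding is an assumption built only from the rule descriptions in the text: a predicate that a rule does not mention is treated as immaterial (None). The names RULES, select_action and "default action" are illustrative.

```python
# Each rule: (required values of predicates 1..4, action). None = immaterial.
RULES = [
    ((True, True, False, False), "action 1"),     # rule 1
    ((True, None, True, True), "action 1"),       # rule 2
    ((False, False, False, False), "action 2"),   # rule 3
    ((False, None, None, True), "action 3"),      # rule 4
]


def select_action(predicates, rules=RULES, default="default action"):
    """Return the action of the first satisfied rule, else the default action."""
    for pattern, action in rules:
        if all(want is None or want == got
               for want, got in zip(pattern, predicates)):
            return action
    return default


print(select_action((True, True, False, False)))    # matches rule 1
print(select_action((False, True, False, True)))    # matches rule 4
print(select_action((True, False, False, False)))   # no rule matches
```

With this encoding no two rules can be satisfied by the same predicate values, so the order in which the rules are tried does not change the selected action.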
DECISION-TABLE PROCESSORS:
o Decision tables can be automatically translated into code and, as such, are a higher-
order language
o Rule 1 is examined first. If the rule is satisfied, the corresponding action takes place.
o Otherwise, rule 2 is tried. This process continues until either a satisfied rule results in
an action or no rule is satisfied and the default action is taken.
o Decision tables have become a useful tool in the programmer's kit, in business data
processing.
1. The specification is given as a decision table or can be easily converted into one.
2. The order in which the predicates are evaluated does not affect interpretation of the rules
or the resulting action - i.e., an arbitrary permutation of the predicate order will not, or
should not, affect which action takes place.
3. The order in which the rules are evaluated does not affect the resulting action - i.e., an
arbitrary permutation of rules will not, or should not, affect which action takes place.
4. Once a rule is satisfied and an action selected, no other rule need be examined.
5. If several actions can result from satisfying a rule, the order in which the actions are
executed doesn't matter.
1. Consider the following specification whose putative flowgraph is shown in Figure 6.5:
1. If condition A is met, do process A1 no matter what other actions are taken or
what other conditions are met.
PATH EXPRESSIONS:
GENERAL:
o Logic-based testing is structural testing when it's applied to structure (e.g., control flow
graph of an implementation); it's functional testing when it's applied to a specification.
o In logic-based testing we focus on the truth values of control flow predicates.
BOOLEAN ALGEBRA:
o STEPS:
1. Label each decision with an uppercase letter that represents the truth value of
the predicate. The YES or TRUE branch is labeled with a letter (say A) and the
NO or FALSE branch with the same letter overscored (say Ā).
2. The truth value of a path is the product of the individual labels. Concatenation
or products mean "AND". For example, the straight- through path of Figure 6.5,
which goes via nodes 3, 6, 7, 8, 10, 11, 12, and 2, has a truth value of ABC. The
path via nodes 3, 6, 7, 9 and 2 has a value of .
3. If two or more paths merge at a node, the fact is expressed by use of a plus sign
(+) which means "OR".
o There are only two numbers in boolean algebra: zero (0) and one (1). One means
"always true" and zero means "always false".
In all of the above, a letter can represent a single sentence or an entire boolean algebra expression.
Individual letters in a boolean algebra expression are called literals (e.g., A, B). The product of several
literals is called a product term (e.g., ABC, DE).
An arbitrary boolean expression that has been multiplied out so that it consists of the sum of products
(e.g., ABC + DEF + GH) is said to be in sum-of-products form.
The result of simplifications (using the rules above) is again in the sum-of-products form and each
product term in such a simplified version is called a prime implicant. For example, ABC + AB
+ DEF reduces by rule 20 to AB + DEF; that is, AB and DEF are prime implicants. The path
expressions of Figure 6.5 can now be simplified by applying the rules.
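The reduction of ABC + AB + DEF to AB + DEF can be confirmed by comparing truth tables. The sketch below assumes each letter is an independent boolean variable. The helper name equivalent is illustrative only.

```python
from itertools import product


def equivalent(f, g, nvars):
    """True if f and g agree on every combination of nvars boolean inputs."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=nvars))


def original(A, B, C, D, E, F):
    return (A and B and C) or (A and B) or (D and E and F)   # ABC + AB + DEF


def reduced(A, B, C, D, E, F):
    return (A and B) or (D and E and F)                      # AB + DEF


def wrong(A, B, C, D, E, F):
    return (A and B and C) or (D and E and F)                # ABC + DEF drops AB


print(equivalent(original, reduced, 6))
print(equivalent(original, wrong, 6))
```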
The following are the laws of boolean algebra:
The deviation from the specification is now clear. The functions should have been:
Loops complicate things because we may have to solve a boolean equation to determine what
predicate value combinations lead to where.
KV CHARTS:
INTRODUCTION:
o If you had to deal with expressions in four, five, or six variables, you could get bogged
down in the algebra and make as many errors in designing test cases as there are bugs in
the routine you're testing.
o The Karnaugh-Veitch chart reduces boolean algebraic manipulations to graphical trivia.
o Beyond six variables these diagrams get cumbersome and may not be effective.
SINGLE VARIABLE:
o Figure 6.6 shows all the boolean functions of a single variable and their equivalent
representation as a KV chart.
THREE VARIABLES:
o KV charts for three variables are shown below.
o As before, each box represents an elementary term of three variables with a bar
appearing or not appearing according to whether the row-column heading for that box is
0 or 1.
o A three-variable chart can have groupings of 1, 2, 4, and 8 boxes.
o A few examples will illustrate the principles: