Electronic Communications of The EASST
Search-Based Refactoring based on Unfolding of GTS
Volume 38 (2010)
14 pages
1 Introduction
Refactoring has emerged as a successful technique to enhance object-oriented software
designs by a series of small, behaviour-preserving transformations [Fow99]. However, due
to the number of design choices and the complex dependencies and conflicts between
them, it is difficult to choose an optimal sequence of refactoring steps, maximising the
quality of the resulting design while minimising the cost of the transformation. In the
case of large systems the situation becomes acute because existing tools offer only limited
support for their automated application [MTR07]. Therefore, search-based approaches
have been suggested in order to provide automation in discovering appropriate refactoring
sequences [SSB06, HPJ01]. The idea is to see the design process as a combinatorial
optimisation problem, attempting to derive the best solution (with respect to a given
quality measure or objective function) from a given initial design [OM02].
Two obvious problems with search-based approaches are scalability, i.e., the ability to
handle large models [OC08], and traceability, i.e., the ability of the developer
to understand the changes suggested by the optimisation [HPJ01]. In particular, heavy
modifications make it difficult to relate the improvement to the original design, so that
developers will struggle to understand the new structure. We believe that both problems
can be mitigated by exploiting the local nature of refactoring operations, which affect only
a certain part of the design while leaving the context unchanged. In terms of scalability,
local operations permit the use of partial order models representing the behaviour of
a system by a set of actions (refactoring steps) equipped with relations of causality
and conflict. Such models provide an implicit representation of the states (designs) of
the system as conflict-free subsets of actions closed under causal dependencies, which
scales better than the explicit representation of reachable states. For traceability, causal
dependencies provide an explanation of why certain steps are required to perform
later steps, thus reducing the problem to understanding the benefits of the final steps in
a sequence.
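As a minimal sketch of this implicit representation, the following Python fragment checks whether a set of actions forms a valid state, i.e., whether it is conflict-free and closed under causal dependencies. The names `is_configuration`, `causes` and `conflicts`, as well as the three sample steps, are illustrative assumptions and are not taken from our implementation.

```python
from typing import Dict, FrozenSet, Set


def is_configuration(actions: Set[str],
                     causes: Dict[str, Set[str]],
                     conflicts: Set[FrozenSet[str]]) -> bool:
    """Return True if `actions` is causally closed and conflict-free."""
    # Causal closure: every direct cause of a chosen action is chosen as well.
    # Checking direct causes for all members implies transitive closure.
    for action in actions:
        if not causes.get(action, set()) <= actions:
            return False
    # Conflict-freeness: no conflicting pair is fully contained in the set.
    return not any(pair <= actions for pair in conflicts)


# Hypothetical example: t2 depends on t1, while t2 and t3 exclude each other.
causes = {"t2": {"t1"}}
conflicts = {frozenset({"t2", "t3"})}
```

Under these assumptions, `{"t1", "t2"}` is a state, whereas `{"t2"}` (missing cause) and `{"t1", "t2", "t3"}` (conflict) are not. Only the relations are stored; the states themselves are never enumerated.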
In this paper, we use a representation of object-oriented designs as graphs and of
refactoring operations as graph transformation rules [MTR07]. Such rules provide a local
description, identifying and changing a specific part of the design graph only. After
suitably encoding our rules into a hypergraph representation, this enables us to derive
a partial order structure of causality and conflict relations, using the approximated
unfolding of a graph transformation system [BCM99] and its implementation in Augur 2
[KK08]. The result is a structure called a Petri graph, presenting the behaviour in terms
of an over-approximation of its transformations and dependencies [BCK01]. Causal
dependencies and conflicts, derived directly from the Petri graph, serve as input to our
search problem.
Optimisation algorithms such as the Ant Colony Optimisation (ACO) metaheuristic [dor05]
rely on an explicit representation of the search space. Thus states and their local
neighbourhoods have to be reconstructed on the fly from the partial order representation.
The desired result is a sequence of transformations leading from the given design
to a design of high(er) quality, using only transformation steps that are necessary to
achieve that improvement.
A more detailed view of the approach is given by the diagram in Figure 1. Using UML
activity diagram notation, boxes represent artifacts while oval nodes are the actions
or transformations performed on them. The class structure of a given Java program
(excluding method bodies, but retaining call and data access dependencies) is encoded
in the GXL format required by Augur 2. This is achieved with the help of the inFusion
environment¹ and a subsequent transformation of the resulting MSE² file into GXL.
The result represents the start graph of the hypergraph grammar to be unfolded. The
rules of the grammar formalising the refactoring operations are derived from the standard
catalogue [Fow99] and are shared across all Java programs. Augur 2 constructs the
approximated unfolding of a system [BCM99], producing a Petri graph that serves as input
to the ACO-based search algorithm.
ACO is inspired by the behaviour of foraging ants, which search for food individually
and concurrently, but share information about food sources and paths leading towards
them by leaving pheromone trails. This amounts to a distributed traversal of a graph
whose paths represent possible solutions [DMG97]. In our case, the nodes of that graph
are the designs to be explored and its edges are the refactoring steps. Rather than
representing this so-called construction graph explicitly, we derive its nodes and edges
from the partial order structure as and when required. As a result, a path (refactoring
sequence) is produced representing the cheapest way to transform the given design into
an optimal one. Since the unfolding represents an over-approximation, the existence of
this sequence needs to be verified in the real model, possibly leading to a refinement of
the approximation. However, this step is beyond the scope of this paper.
The remainder of the paper is organised as follows. In Section 2, we review the
presentation of refactorings as graph transformations and introduce our example. Section 3
describes the partial order analysis based on unfolding. The mapping into an ACO
problem is addressed in Section 4. Finally, we evaluate our approach and conclude.
¹ http://www.intooitus.com/inFusion.html
² http://www.moosetechnology.org/docs/mse
• Extract Superclass, creating a common superclass for two existing classes, usually
in order to encapsulate shared features.
• Add Parameter, introducing a new parameter for a method to make data access
explicit.
The rule Extract Superclass is shown in Figure 3 in class diagram notation. Rules can be
applied in different orders and locations, giving rise to a number of refactoring sequences.
Below we describe and motivate some of these for future reference.
T1 : Extract superclass E from class B and class C, e.g., in order to encapsulate shared
methods.
[Figure: workflow diagram. Recoverable labels: Java Program; Class Level Refactoring Rules; Refine Unfolding; Petri graph; spurious.]
T3 : Move method from class B to class D, because it may be more tightly coupled to
that class (e.g., accessing its attribute).
T5 : Encapsulate attribute a1 in class D, making the attribute private and creating setter
and getter methods.
T7 : Add parameter p of type class D to method1 in class C, for the same motivation.
Note that the transformations listed are not part of a single sequence. For example,
T2 and T3 are potentially in conflict.
Figure 4: Result of applying rule Extract Superclass to the initial class model
tions [BCK01]. The result is an over-approximation of the behaviour, i.e., spurious
sequences may appear that do not exist in the actual behaviour. We use the Augur 2
implementation of this construction [KK08], where the initial hypergraph and rules are
presented in the Graph Exchange Language GXL [Tae01]. The output Petri graph produced
is in GXL format as well [DKSR04].
[Figure: hypergraph drawing. Recoverable edge labels: Package, Class, method, attribute, parameter, contains, type, gen, tgen, hasParam.]
The set of transitions t1, t2, ... representing refactoring steps, together with the
relations # (conflict) and < (causality), provides the input to our search for an optimal
sequence of refactorings.
[Figure: hypergraph rule drawing, with left-hand and right-hand side separated by an arrow. Recoverable edge labels: Package, Class, Contains, gen, tgen.]
5. An evaluation f(s) for each candidate solution s. For some problems it is possible
to calculate partial evaluations fp(x) associated with intermediate states x of the
problem.
Using the formulation above, artificial ants build solutions by performing randomised
walks on the connected graph G = (C, E), based on the following basic operations [DMG97].
• A state transition takes an ant from one node to another across an arc;
• A local update changes the pheromone deposit on the arc it currently walks on;
• A global update changes the pheromone deposits on all arcs an ant has traversed
when this ant successfully ends its trip.
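The two pheromone updates can be illustrated with the update rules commonly used in Ant Colony System. The sketch below is only an illustration under stated assumptions: the parameter names `rho`, `alpha` and `tau0`, their default values, and the use of a single `quality` value as the deposited amount are chosen for the example and are not prescribed by the text above.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def local_update(tau: Dict[Arc, float], arc: Arc,
                 rho: float = 0.1, tau0: float = 0.01) -> None:
    """Decay the pheromone on the arc just crossed towards the base level tau0."""
    tau[arc] = (1.0 - rho) * tau.get(arc, tau0) + rho * tau0


def global_update(tau: Dict[Arc, float], path: List[Arc], quality: float,
                  alpha: float = 0.1, tau0: float = 0.01) -> None:
    """Shift the pheromone on every arc of a finished path towards `quality`."""
    for arc in path:
        tau[arc] = (1.0 - alpha) * tau.get(arc, tau0) + alpha * quality
```

With these rules a high `quality` raises the deposit on the traversed arcs, while a `quality` below the current deposit lowers it, so the same function can express both reinforcement and penalisation.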
The problem is thus expressed as the search for an optimal path representing the best
sequence of refactoring steps applicable to the original system. The optimisation depends
on an evaluation of paths representing candidate solutions, which takes into account both
the cost of the refactoring transformations and the quality of the end result.
We use a so-called Hybrid Ant System [GD00], in which ACO is extended by local search.
Specifically, we build on the Java framework by Chiricom [Chi], which implements [DMG97]
and supports the implementation and solution of a variety of ACS problems.
We adapted this framework to an implicit representation of states based on our partial
order model, deriving states and their local neighbourhood on the fly.
• In each state s, an ant will determine its local neighbourhood by computing all
transitions ti enabled in s, with successor states si = s ∪ {ti}. It will select one of
its neighbouring states based on the states’ evaluation and the pheromone values.
• Moving to the selected state, the ant will update the pheromone deposit.
• The ant stops if there are no more new transitions to be added, i.e., all remaining
transitions are in conflict with transitions in the current state.
• A global update will take place to increase the pheromone deposits on all arcs
leading to success, or decrease them in case of failure.
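The steps above can be sketched as a single ant walk over implicitly represented states. This is a simplified illustration, not the adapted framework itself: the function names, the pheromone table keyed by (state, transition), the parameters `rho` and `tau0`, and the requirement that `evaluate` returns strictly positive values are all assumptions made for the example. The global update is omitted here.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

State = FrozenSet[str]


def enabled(state: State, transitions: Set[str],
            causes: Dict[str, Set[str]],
            conflicts: Set[FrozenSet[str]]) -> List[str]:
    """Transitions not in `state` whose causes are all in `state`
    and which conflict with no transition already in `state`."""
    result = []
    for t in sorted(transitions - state):
        if not causes.get(t, set()) <= state:
            continue  # a causal predecessor is still missing
        if any(frozenset({t, u}) in conflicts for u in state):
            continue  # excluded by a transition already taken
        result.append(t)
    return result


def ant_walk(transitions: Set[str], causes: Dict[str, Set[str]],
             conflicts: Set[FrozenSet[str]],
             evaluate: Callable[[State], float],
             tau: Dict[Tuple[State, str], float],
             rng: random.Random,
             rho: float = 0.1, tau0: float = 0.01) -> List[str]:
    """Extend the empty state until no transition is enabled; return the path."""
    state: State = frozenset()
    path: List[str] = []
    while True:
        candidates = enabled(state, transitions, causes, conflicts)
        if not candidates:
            return path  # every remaining transition is blocked
        # Weight each successor by pheromone and evaluation (assumed > 0).
        weights = [tau.get((state, t), tau0) * evaluate(state | {t})
                   for t in candidates]
        t = rng.choices(candidates, weights=weights, k=1)[0]
        # Local pheromone update on the arc just taken.
        tau[(state, t)] = (1.0 - rho) * tau.get((state, t), tau0) + rho * tau0
        path.append(t)
        state = state | {t}
```

Successor states are computed only for the state an ant currently occupies, so the construction graph is never built as a whole.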
5 Conclusion
Our approach involves a combination of graph transformation theory and the ACO
metaheuristic, aiming to improve search-based refactoring. Rather than representing the
search space of designs and refactorings explicitly, we use the unfolding as a more scalable
representation where designs (states) are given by sets of transformations closed under
causal dependencies. We can thus reconstruct states when needed, for example in order
to evaluate the objective function, but will deal with the more compact representation
when navigating the search space. To improve scalability further, we use the
approximated unfolding. Algorithmically, we follow a hybrid approach [GD00]
where the ACO metaheuristic is augmented with local search to improve its performance.
Hybrid ACO has been shown to be effective in situations of large and rugged search spaces
with complex constraints on solutions. In particular, the implicit representation of states
(by sets of transformations closed under causality and without conflicts) should allow
us to scale the search to larger problems, avoiding state-space explosion.
Traceability will be evaluated through experiments with smaller models, assessing the
effort it takes a human developer to understand the changes proposed by the search-
based approach. The use of dependency information between transformations allows us
to remove steps that are unrelated to the intended change, making each change relevant
and therefore easier to interpret.
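One way to picture this pruning, assuming the intended change is identified by a set of target transformations, is to keep only the targets and their transitive causes. The function names `causal_past` and `prune` and the sample data are hypothetical and serve only as an illustration.

```python
from typing import Dict, Iterable, List, Set


def causal_past(targets: Iterable[str],
                causes: Dict[str, Set[str]]) -> Set[str]:
    """Return the targets together with all their transitive causes."""
    needed: Set[str] = set()
    stack = list(targets)
    while stack:
        t = stack.pop()
        if t not in needed:
            needed.add(t)
            stack.extend(causes.get(t, set()))
    return needed


def prune(sequence: List[str], targets: Iterable[str],
          causes: Dict[str, Set[str]]) -> List[str]:
    """Drop every step that the target steps do not causally depend on."""
    needed = causal_past(targets, causes)
    return [t for t in sequence if t in needed]


# Hypothetical dependencies: t3 requires t1, and t4 requires t2.
causes = {"t3": {"t1"}, "t4": {"t2"}}
relevant = prune(["t1", "t2", "t3", "t4"], {"t3"}, causes)
```

Here `relevant` keeps `t1` and `t3` in their original order, since `t2` and `t4` do not contribute to `t3`.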
We have implemented the approach up to a point where it remains to check that
sequences produced in the approximated model are also executable in the full model. If
the sequence does not exist in the real model, then a refinement of the abstraction will
be required [KK06], which will lead to a more accurate unfolding and another round of
optimisation.
References
[BCK01] P. Baldan, A. Corradini, B. König. A Static Analysis Technique for Graph
Transformation Systems. In Proc. of CONCUR ’01. Pp. 381–395. Springer-
Verlag, 2001. LNCS 2154.
Blum.
doi:http://dx.doi.org/10.1016/j.artint.2005.03.003
[KK08] B. König, V. Kozioura. Augur 2—A New Version of a Tool for the Analy-
sis of Graph Transformation Systems. In Proc. of GT-VMT ’06 (Workshop
on Graph Transformation and Visual Modeling Techniques). ENTCS 211,
pp. 201–210. Elsevier, 2008.
[Tae01] G. Taentzer. Towards Common Exchange Formats for Graphs and Graph
Transformation Systems. Electr. Notes Theor. Comput. Sci. 44(4), 2001.