Recent Developments and Challenges in Surrogate Model Based Optimal Design of Engineering Systems
Abstract— Computer simulation has become the backbone of modern engineering design. Multidisciplinary optimal design through simulation uses sophisticated models that take long computing times, even on today's most powerful supercomputers. In this context, surrogate models come to the designers' rescue. These are basically cheap-to-compute models fitted to the data generated by the expensive-to-run computer simulation models. This paper addresses some of the basic issues in surrogate model development, reviews some of the recent research in this area and focuses on the challenges that lie ahead.

Keywords—Optimal Design, Surrogate Models, Engineering Systems, Computer Simulation.

I. INTRODUCTION

II. OPTIMAL DESIGN SCENARIO

The engineering design process is mainly concerned with making decisions on analysis which have a direct bearing on the end product. It often takes months of analysis by a dedicated team of engineers to arrive at key decisions at a given stage of a design project. An important aspect of the impact of the computational approach to engineering design is that hundreds of feasible designs can be evaluated, and design constraints emanating from more than one discipline can be taken care of, at the conceptual design stage itself. A greater potential now exists than ever before to use conventional optimization tools, with this emerging capability, to evolve optimum, cost-effective products and devices. A typical design of an aerospace vehicle, for example, can consider aerodynamic, structural, propulsion, weight, manufacturing and other aspects in the conceptual design stage.
A. ISSUES IN DEVELOPMENT

The basic idea in evolving a surrogate model is the judicious use of the available computational resources and budget. Investment in developing fast mathematical approximations to the data generated by a computer-intensive simulation model helps in gaining insight into the design problem at hand, in exploring many design trade-offs and in visualizing the intricacies of the overall design. The designer can take recourse to the high-fidelity simulation code runs to test the ideas so generated, and can also update and modify the approximation model itself. Thus the surrogate model is a cheap-to-compute model of a computer-intensive model – a model for a model.

While the basic concept of a surrogate model sounds logical and simple, there are many challenges in developing the best surrogate model for a given set of data. The pertinent basic questions are:

• At what sampling points in the design space is the computer-intensive model to be run for generating the data that are input to develop the surrogate model?

• Which approximation method most closely resembles the model data?

• Can the surrogate model be used to segregate the important parameters from the not-so-important ones in the design problem?

• Can we rely on the surrogate model to do trade-off studies and arrive at design decisions?

• How do we deal with noise in the computer-generated simulation model data?

• Can the surrogate model be continuously improved or updated by having recourse to computer-intensive original model runs during an optimization run?

• Can a single surrogate model serve the entire design space, or should we develop multiple surrogates in different local regions?

• Can a surrogate model be constructed to bridge the gap between the predictions of a high-fidelity model and a low-fidelity model for the same problem, so that high-fidelity accuracy is obtained at low-fidelity cost?

The above questions are the focal areas of active research in surrogate model development today.

B. APPLICATIONS

Surrogate models help us to gain increasing insight into a design problem. Such models seek to provide answers in the gaps between the limited analysis runs that can be afforded with the available computer resources. They can also be used to bridge seamlessly the various levels of sophistication built into varying-fidelity physics-based simulation codes. They may also unify data obtained partly from computer simulations and partly from field experiments. The main aim is to use all the available data pertaining to a given problem and evolve a simple yet powerful and usable model that can be used to back up design decisions.

The simplest and currently the most common use of surrogate models is to augment the data generated by a single expensive computer simulation code that needs to be run over a range of input parameter values. The basic idea is to use the surrogate model as a curve fit to the available data, so that data at any new design point can be predicted without recourse to the expensive simulation code runs. The underlying assumption is that, once built, the surrogate model is as good as the original, with good prediction accuracy, while being hundreds or thousands of times faster than the 'mother' code.

Another common use of surrogates is to act as a calibrator for prediction codes with limited accuracy. It is quite common that, while developing a software model for a physical process, a simplified approach is adopted to gain acceptable run times. For example, in computational flow simulation, a very rapid panel method of solution may be used in place of computer-intensive full Navier–Stokes models. A surrogate model may well be trained to bridge the two codes by using it to model the differences between the results from each code, one a fast simpler code and the other a slow complex code. The idea is to gain the accuracy of the sophisticated code at only the expense of running the faster code. Such multi-fidelity, multi-level approaches can be extended gainfully to deal with data coming from physical experiments and their known, established correlations with computational predictions.
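As a minimal illustration of this difference-modelling idea (a sketch, not taken from the paper: the two analysis "codes" are hypothetical one-dimensional functions and only numpy is assumed), a cheap correction surrogate can be fitted to the discrepancy between a handful of expensive high-fidelity runs and the corresponding low-fidelity predictions, and then added to the fast code everywhere else:

```python
import numpy as np

# Toy stand-ins for the two analysis codes (both hypothetical):
# f_hi -- the slow, accurate code;  f_lo -- the fast, approximate code.
def f_hi(x):
    return (6.0 * x - 2.0) ** 2 * np.sin(12.0 * x - 4.0)

def f_lo(x):
    return 0.5 * f_hi(x) + 10.0 * (x - 0.5) - 5.0

# A handful of expensive high-fidelity runs ...
x_hi = np.array([0.0, 0.4, 0.6, 1.0])
# ... and many cheap low-fidelity evaluations over the whole range.
x_all = np.linspace(0.0, 1.0, 101)

# Surrogate for the *difference* between the codes, built only where
# the expensive code has been run (a quadratic least-squares fit here).
diff_coeff = np.polyfit(x_hi, f_hi(x_hi) - f_lo(x_hi), deg=2)

# Corrected prediction: cheap code everywhere + learned correction.
y_corrected = f_lo(x_all) + np.polyval(diff_coeff, x_all)

rms_lo   = np.sqrt(np.mean((f_lo(x_all) - f_hi(x_all)) ** 2))
rms_corr = np.sqrt(np.mean((y_corrected - f_hi(x_all)) ** 2))
print(f"RMS error, low-fidelity alone : {rms_lo:.3f}")
print(f"RMS error, corrected surrogate: {rms_corr:.3f}")
```

Even a low-order correction of this kind can recover much of the high-fidelity trend at essentially the cost of the fast code, which is the point made above.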
Another important use of surrogates is their ability to deal with noisy or missing data. It is common that results coming from physical experiments are subject to random errors, and these have to be taken care of while using the data. It is also possible that some experiments fail to yield any results at all. The non-repeatable random error associated with physical experiments does not exist in computer simulations, as these are deterministic in nature. Computational noise stems instead from the fact that certain simulation runs fail to converge, as no numerical scheme is fool-proof and many will fail in unexpected ways. Surrogate models come in handy to act as fillers or filters to smooth the data, spanning any gaps.
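This "filler or filter" role can be illustrated with a small sketch (not from the paper; the response function, the noise level and the failed run are all hypothetical, and only numpy is assumed): a low-order regression surrogate is fitted through noisy observations, smoothing the scatter and spanning the gap left by a run that produced no result.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 1-D response with random "experimental" noise.
x = np.linspace(0.0, 1.0, 21)
y = np.sin(2.0 * np.pi * x) + 0.15 * rng.standard_normal(x.size)

# Pretend one run failed to converge: mark it missing and drop it.
y[13] = np.nan
ok = ~np.isnan(y)

# A low-order least-squares polynomial acts as a smoothing filter:
# it does not chase the noise and it spans the gap left by the failed run.
coeff = np.polyfit(x[ok], y[ok], deg=5)
x_new = np.linspace(0.0, 1.0, 201)
y_smooth = np.polyval(coeff, x_new)

print("prediction at the failed point x=%.2f : %.3f"
      % (x[13], np.polyval(coeff, x[13])))
```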
Finally, surrogate models can be used to gain insight into the functional relationships among the variables, to identify the important ones and to isolate the not-so-important ones. Surrogate models, based on appropriate methods, can be used to demonstrate which variables are important and have the most profound impact on the final product, along with the approximate functional form. This will help design engineers to focus on such parameters and understand them with greater clarity. Along with visualisation tools, contour maps and plots can be generated using the surrogates to better visualize the intricate relations between the parameters. This might not have been possible using only the computer-intensive simulation code runs.

Having set the background for the necessity of building surrogate models in the modern engineering design process, the various stages to be followed in building a good and reliable surrogate can be looked into. These are detailed in figure 1.

Figure 1. Surrogate model framework for engineering design optimisation: Sampling Plan (define the conditions of computer simulation and/or physical experiments) → High Fidelity Simulations / Observations (quantitative evaluation, generate data) → Construct Surrogate (kriging, RBF, ANN, polynomial) → Optimisation (gradient based / evolutionary algorithms).

IV. SAMPLING PLAN

The first step in constructing a cheap-to-evaluate surrogate model, say f*(x), that replicates an expensive-to-compute black-box function f(x) for a given engineering design problem is a well conceived sampling plan. Assume that the design problem is governed by a k-dimensional vector of design variables x ∈ D ⊂ R^k, where D is the design space or design domain. f(x) is assumed to be continuous and is considered the quality, cost or performance metric of the design problem. If the range of each of the k variables is non-dimensionalised to [0, 1], then the design space D represents a k-dimensional hypercube. Apart from the assumption of continuity, the only insight we can obtain about the function f is through n discrete observations or samples

{ x(i) → y(i) = f(x(i)), i = 1, 2, 3, …, n }.   (1)

As these simulations are expensive to compute, the sampling points have to be distributed judiciously. The number of sampling points is determined primarily by the available resources and budget, both in terms of computer simulation runs and/or physical experiments. The challenge here is to use this sparse set of observations or data to construct an approximation f* to f, which will subsequently be used as the cheap alternative for evaluating any design in x ∈ D.

A. Latin Hypercube Sampling

A mathematically well posed surrogate model need not necessarily generalize well, and it may still be very poor in predicting data at new or unseen locations in the k-dimensional design space. The ability to predict reasonably well depends strongly on the sampling plan. Some sampling plans need a fixed number of sampling points, and the designer has no choice in the matter. Suppose a certain level of accuracy is achieved by sampling a one-dimensional space at n locations; to achieve the same level of accuracy in k dimensions, one can intuitively infer that a minimum of n^k sampling points is needed. Thus sampling at every possible combination of each of the design variables becomes a laborious task. This sampling plan is referred to as full factorial design in the literature. The main drawback of full factorial design is that the projections of the sampling points onto the individual variable axes overlap; the sampling can be improved if it is made sure that these projections are as uniform as possible. This can be done by splitting the range of each variable into a relatively large number of equal-sized bins and generating random sub-samples within these bins.
Hence, to extract as much information as possible with a limited set of simulation data, modern sampling of simulation experiments uses methods that have a built-in feature known as the space-filling property. A natural development of this idea is to generate a sampling that is stratified in all k dimensions. This sampling scheme is known as Latin hypercube sampling (LHS). A major advantage of LHS is that the number of samples n can be tailored to match the available computational budget and resources; the number of samples n is not restricted to any power of k. This is especially useful if the dimension k of the design space is very large. A typical LHS plan for a 10-point, three-dimensional case is shown in figure 2. The method yields a randomized sampling plan which guarantees multi-dimensional stratification but does not by itself ensure good space-filling characteristics: for example, placing all n points along the main diagonal of the design space will not fill the available space uniformly.
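A minimal sketch of the bin-splitting construction described above (illustrative only; the function name latin_hypercube is hypothetical and only numpy is assumed) places one random sub-sample in each of the n equal bins of every variable and then shuffles the bin order independently in each dimension:

```python
import numpy as np

def latin_hypercube(n, k, seed=None):
    """Random Latin hypercube: n points in the k-dimensional unit cube,
    with exactly one point in each of the n equal bins of every variable."""
    rng = np.random.default_rng(seed)
    # One random sub-sample inside each of the n bins, per dimension ...
    u = (np.arange(n)[:, None] + rng.random((n, k))) / n
    # ... then shuffle the bin order independently in every dimension,
    # which gives the multi-dimensional stratification of LHS.
    for j in range(k):
        u[:, j] = u[rng.permutation(n), j]
    return u

# A 10-point plan in three dimensions, as in the example of figure 2.
plan = latin_hypercube(10, 3, seed=42)
print(plan)
```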
A certain measure of 'good' and 'bad' Latin hypercubes is therefore necessary. This measure is defined in the following way. Let d1, d2, …, dm denote the unique values of the distances between all possible pairs of points in a sampling plan S, sorted in ascending order, and let J1, J2, …, Jm be defined such that Jj is the number of pairs of points in S separated by the distance dj. We identify S as a maximin plan if it maximizes d1 and, among plans for which this is true, minimizes J1. This definition can be applied to any set of sampling plans, but since it is required to keep the desired properties of Latin hypercube sampling, we restrict the scope of all possible plans to a narrower set by requiring further that S maximizes d2, among plans for which this is true minimizes J2, and so on until we reach Jm. This method of identifying the best sampling plan becomes computationally intensive if we have a large number of sample points.
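The maximin ranking described above can be written down directly as a brute-force sketch (illustrative; distance_profile and maximin_key are hypothetical names and only numpy is assumed). It computes the sorted unique pair-wise distances d1, d2, … and their multiplicities J1, J2, …, and compares two candidate plans lexicographically: maximize d1, then minimize J1, then maximize d2, and so on. The all-pairs distance computation is what makes the procedure expensive for large n.

```python
import numpy as np
from itertools import combinations

def distance_profile(S, decimals=8):
    """Unique pairwise distances d1 < d2 < ... and their counts J1, J2, ..."""
    d = np.array([np.linalg.norm(a - b) for a, b in combinations(S, 2)])
    return np.unique(np.round(d, decimals), return_counts=True)

def maximin_key(S):
    """Lexicographic key implementing: maximize d1, then minimize J1,
    then maximize d2, minimize J2, and so on."""
    d, J = distance_profile(S)
    key = []
    for dj, Jj in zip(d, J):
        key.extend([dj, -Jj])   # larger dj is better, smaller Jj is better
    return tuple(key)

# The plan with the larger key is the better (more space-filling) one.
rng = np.random.default_rng(0)
S1, S2 = rng.random((10, 2)), rng.random((10, 2))
better = S1 if maximin_key(S1) > maximin_key(S2) else S2
print(better)
```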
A comparison of different sampling plans for a two-dimensional case with 100 sampling points is shown in the following figure.

B. Hammersley Sampling

An alternative space-filling plan is provided by the Hammersley sequence. Any integer n can be written in radix-R notation as

n = n_m n_(m-1) … n_2 n_1 n_0 = n_0 + n_1 R + n_2 R^2 + … + n_m R^m,   (2)

where m = [log_R n] = [ln n / ln R], the square brackets here representing the integral part. A unique fraction between 0 and 1, called the inverse radix number, can be constructed by reversing the order of the digits of n around the decimal point as follows:

φ_R(n) = 0 . n_0 n_1 n_2 … n_m = n_0 R^(-1) + n_1 R^(-2) + … + n_m R^(-(m+1)).   (3)

The Hammersley points in the k-dimensional unit cube are given by the following sequence

φ_R1(n), φ_R2(n), …, φ_R(k-1)(n),   n = 1, 2, …, N,   (4)

where R1, R2, …, R(k-1) are the first (k−1) prime numbers. The Hammersley points are then given by

x(n) = ( n/N, φ_R1(n), φ_R2(n), …, φ_R(k-1)(n) ),   n = 1, 2, …, N.   (5)
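Equations (2)–(5) translate almost line for line into code. The sketch below is illustrative only (the exact indexing convention for n varies between implementations, and the helper names are hypothetical): the inverse radix number φ_R(n) is built by reversing the radix-R digits of n, and n/N is prepended as the first coordinate of each point.

```python
def inverse_radix(n, R):
    """phi_R(n): reverse the radix-R digits of n about the decimal point."""
    phi, scale = 0.0, 1.0 / R
    while n > 0:
        n, digit = divmod(n, R)
        phi += digit * scale
        scale /= R
    return phi

def hammersley(N, k, primes=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)):
    """N Hammersley points in the k-dimensional unit cube:
    x_n = (n/N, phi_R1(n), ..., phi_R(k-1)(n)), with R_j the j-th prime.
    Extend the primes tuple for dimensions above eleven."""
    R = primes[:k - 1]
    return [[n / N] + [inverse_radix(n, r) for r in R] for n in range(1, N + 1)]

for point in hammersley(10, 3):
    print(point)
```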
In this section we describe the various forms of the function f*(x, w) that are used to approximate the actual function f(x). The simplest model for f* is the mth-order polynomial model (written here for a single variable x) given by

f*(x, w) = w_0 + w_1 x + w_2 x^2 + … + w_m x^m.

Consider the probability of the observed data having resulted from f, where ε is a small constant margin around each data point. If we assume that the errors are independently and randomly distributed according to a normal distribution with standard deviation σ, the probability of the data set is

P = ( 1 / (2πσ^2)^(n/2) ) exp[ − Σ_(i=1..n) ( y(i) − f*(x(i), w) )^2 / (2σ^2) ].   (11)

In the sense of maximum likelihood, the estimates of the parameters w are determined through the least-squares formulation

Φw = y,   (14)

where Φ is the matrix of the basis terms evaluated at the n sample points, so that w = (Φ^T Φ)^(−1) Φ^T y.

VI. RADIAL BASIS FUNCTION MODELS

A radial basis function model expresses f* as a weighted sum of n_c basis functions ψ centred at points c(i),

f*(x, w) = Σ_(i=1..n_c) w_i ψ( ‖x − c(i)‖ ).

The notable feature of the above equation is that it is linear in w, yet it can model highly non-linear multi-dimensional surfaces. The only condition that we impose is that Ψ is a square matrix, namely n_c = n. If the bases coincide with the data points, c(i) = x(i), then w can be determined from

Ψw = y,   (15)

where Ψ is the n × n matrix of basis values ψ( ‖x(i) − x(j)‖ ).
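A compact sketch of equation (15) in use (illustrative, not the paper's implementation; the Gaussian form of ψ, the width σ and the function names are assumptions) centres the bases on the sample points so that Ψ is square, solves Ψw = y for the weights and evaluates the model at new points:

```python
import numpy as np

def rbf_fit_predict(X, y, X_new, sigma=0.3):
    """Interpolating RBF surrogate with Gaussian bases centred at the data
    points (c_i = x_i), so the weights follow from the square system Psi w = y."""
    def psi(A, B):
        # Matrix of basis values psi(||a - b||) for all pairs of rows.
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    w = np.linalg.solve(psi(X, X), y)   # Psi w = y, equation (15)
    return psi(X_new, X) @ w

# Toy 1-D use: interpolate a few samples of a hypothetical response.
X = np.linspace(0.0, 1.0, 8)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
X_new = np.linspace(0.0, 1.0, 5)[:, None]
print(rbf_fit_predict(X, y, X_new))
```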
One of the main advantages of kriging is that the value of θ for each variable indicates the relative importance of that parameter in evaluating y, and hence non-significant parameters in a design problem can be identified.

Bayes' theorem gives the probability of a hypothesis H in the light of the observed data D as

P(H|D) = P(D|H) P(H) / P(D).   (20)

Since P(D) is a normalizing constant, we can see the essence of the Bayes approach by rewriting equation (20) for P(H|D) as

P(H|D) ∝ P(D|H) P(H).   (23)

One simple strategy is to place the update point where the estimated error s^2(x) is maximum. Another, more elegant, strategy for finding the update point is to use the probability of improvement, which can be defined as

P[I(x)] = ( 1 / (s(x)√(2π)) ) ∫ from −∞ to y_min of exp[ −( y − y(x) )^2 / (2 s^2(x)) ] dy,   (24)

where I(x) is the difference between y(x) and y_min. The update point is placed where P[I(x)] is maximum. The integral in the equation can be evaluated using the error function as

P[I(x)] = ½ [ 1 + erf( ( y_min − y(x) ) / ( s(x)√2 ) ) ].   (26)
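Once a surrogate supplies a prediction y(x) and an error estimate s(x) at candidate points, equation (26) is a one-line formula. The sketch below is illustrative only: the candidate points, predictions and error estimates are hypothetical placeholders rather than the output of an actual kriging model. It evaluates P[I(x)] over the candidates and selects the update point where it is largest.

```python
import numpy as np
from math import erf, sqrt

def prob_improvement(y_hat, s, y_min):
    """Equation (26): P[I(x)] = 0.5 * (1 + erf((y_min - y_hat) / (s * sqrt(2))))."""
    return np.array([0.5 * (1.0 + erf((y_min - m) / (si * sqrt(2.0))))
                     if si > 0.0 else float(m < y_min)
                     for m, si in zip(y_hat, s)])

# Hypothetical surrogate predictions and error estimates at candidate points.
x_cand = np.linspace(0.0, 1.0, 11)
y_hat = (x_cand - 0.3) ** 2          # predicted objective
s = 0.2 * np.ones_like(x_cand)       # estimated standard error s(x)
s[::5] = 0.0                         # zero error at already-sampled points
y_min = 0.05                         # best objective found so far

pi = prob_improvement(y_hat, s, y_min)
print("update point:", x_cand[np.argmax(pi)])
```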
Using the expected improvement index E[I(x)], the global optimum is correctly traced, as seen in Figure 6.

Figure 6. The Expected Improvement Function Distribution

X. CONCLUSIONS
ACKNOWLEDGMENT