IJPR MIM 2019 Revised 030720

This paper presents a machine learning approach using Long Short-Term Memory (LSTM) networks for demand forecasting in the context of the Physical Internet, specifically focusing on agricultural products in Thailand. The study compares the LSTM method with classical forecasting techniques such as ARIMAX, Support Vector Regression, and Multiple Linear Regression, demonstrating that LSTM outperforms these methods in scenarios with fluctuating demand. Additionally, a hybrid genetic algorithm and scatter search metaheuristic are introduced to optimize the tuning of LSTM hyperparameters, resulting in improved accuracy and efficiency over traditional trial-and-error methods.


Machine Learning for demand forecasting in the Physical Internet: a case study of agricultural products in Thailand

Anirut Kantasa-ard^a*, Maroua Nouiri^b, Abdelghani Bekrar^a, Abdessamad Ait el cadi^a, Yves Sallez^a

^a LAMIH, UMR CNRS 8201, Université Polytechnique Hauts-de-France, Valenciennes, France
^b LS2N, UMR CNRS 6004, Université de Nantes, Nantes, France

Abstract
Supply chains are complex, stochastic systems. Nowadays, logistics managers face
two main problems: increasingly diverse customer demand, and demand variability
that makes prediction difficult. Classical forecasting methods implemented in many business
units have limitations with regard to fluctuating demand and the complexity of
fully connected supply chains. Machine Learning methods have been proposed to
improve prediction. In this paper, a Long Short-Term Memory (LSTM) recurrent
neural network is proposed for demand forecasting in a Physical Internet supply
chain network. A hybrid genetic algorithm and scatter search metaheuristic is
also proposed to automate the tuning of the LSTM hyperparameters. To assess
the performance of the proposed method, a real case study on agricultural
products in a supply chain in Thailand was considered. Accuracy and the
coefficient of determination were the key performance indicators used to
compare the performance of the proposed method with other supervised learning
methods: ARIMAX, Support Vector Regression, and Multiple Linear Regression.
The results show the better forecasting efficiency of the LSTM method with
continuously fluctuating demand, whereas the other methods offer greater
performance with less varied demand. The empirical results for the hybrid
metaheuristic show better accuracy and computational time than the
trial-and-error method.

Keywords: Demand forecasting; Machine learning; Recurrent neural network;


Physical Internet; Distribution; Metaheuristic

*Corresponding author. Email: anirutka@gmail.com

1. Introduction

Nowadays, logistics organizations have to improve the efficiency, reliability, and


availability of their services to be more competitive. Many studies indicate that sales
forecasting, effective demand planning, and related activities have a major impact on
efficiency at every stage throughout the supply chain. Demand forecasting can help to
improve productivity, control flow distribution, and optimize stock control (Marien,
1999). However, poor control of this process can result in incorrect predictions and,
therefore, inadequate decisions.
The complexity of the demand forecasting process results from fluctuating
customer behaviour. Nowadays, the pattern of customer demand is varied and non-
linear (Aburto and Weber 2007).
Moreover, supply chains must deal with increasingly important economical,
societal, and environmental issues (Janvier-James 2011). Reducing logistics costs is also
a priority for many logistics companies. To address these issues, a new innovative
paradigm called the “Physical Internet” (PI) was proposed in 2011 as a solution to
improve the performance of global logistics. As explained in (Montreuil, Meller, and
Ballot 2013), the Physical Internet network represents an open global logistics supply
chain based on physical, digital, and operational interconnectivity through international
standards, interfaces, and protocols. The goal of PI is to form an efficient, sustainable,
resistant, adaptable, and flexible open global logistics network.
PI supply chain structures are huge, complex, and fully connected, making the
demand forecasting problem more complicated. In complex supply chains such as PI
networks, fluctuations in demand can induce heavy perturbations (e.g. stock-outs in
warehouses, a bullwhip effect for all parties, overstocking in distribution centres) (Janvier-
James 2011). In this context, demand forecasting can help ensure the availability of an
adequate quantity of raw materials for the production process and enough goods at the
distributors to serve customers. The forecasting problem is more complex and critical,
as predictions for each node in the network need to be considered. The challenging
point of this concept is the complexity of the connections between all the parties:
suppliers, distributors, and customers. As the PI paradigm is still new, the demand
forecasting problem in Physical Internet Supply chain networks still requires further
investigation.

In this study, the demand forecasting problem was tackled within the context of
the physical internet. The physical internet network was inspired by a real case study
relating to the distribution of agricultural products in Thailand. Given the increasing
variety and changing data flows, researchers have adopted innovative, hybrid methods.
As classical forecasting techniques have shown their limits, a new approach is proposed
based on learning techniques.
The main contributions of the paper are:
- A forecasting approach based on Long Short-Term Memory (LSTM) in the
context of the Physical Internet. This approach was compared with classical
ones.
- A genetic algorithm and scatter search hybrid metaheuristic to automate the
tuning of the LSTM hyperparameters to improve its efficiency.
- The proposed model was tested on a real case study of agricultural products
in a supply chain in Thailand.

This paper is divided into six sections. This section introduces the paper. Section
2 reviews the literature on forecasting models. Section 3 details the problem statements
and assumptions. Section 4 details the methodology, the implementation of the
proposed forecasting approach, and the parameter tuning technique. Section 5 presents
the comparison between the proposed approach and classical forecasting models, as
well as the results of the parameter tuning process. The conclusion and some future
lines of research are given in Section 6.

2. Literature review
The literature review is structured as follows. Firstly, some forecasting models
are presented with their advantages and limitations. Secondly, the main metaheuristics
used to improve the forecasting models are reviewed, especially those used to tune the
model hyperparameters. For each forecasting method, a short description and relevant
applications are presented.
2.1 Forecasting models
Demand forecasting is an important issue and a fundamental step in supply chain
management. It consists in estimating the consumption of products or services over
upcoming periods, making it possible to plan activities and thus, for example, reduce
delivery times, adjust stock levels, and optimize operating costs. Forecasting is not

easy, especially for complex, open systems such as PI. Indeed, there is no totally safe
and reliable method, and forecasting can affect many decisions. Forecasting methods
are primarily based on historical data (quantitative methods), assessments or estimates
(qualitative methods), or a mixture of both. Quantitative methods can be based on the
historical sequence of observed demand (time-series models), some exogenous
parameters that can affect the performance of the model (causal model), or both. Many
forecasting models are implemented and tested with time-series data. Classical methods
such as Moving Average, Naïve Approach, or Exponential Smoothing are commonly
used to forecast trends in time-series data (Box and Jenkins 1970). However, to
forecast non-linear trends, machine learning methods can perform better than
classical methods (Carbonneau, Laframboise, and Vahidov
2008). Time-series models are typically developed using historical values. They are
easy to model, can provide predictions over a specific period, and use the difference
between the predicted and real values in the immediate past to tune the model
parameters. However, some of them do not capture the effect of other factors that could
affect demand such as demand at other nodes in the PI network, stock levels in PI hubs,
unit price of each product. Neural Networks (NN) are designed to learn the relationship
between these factors and demand in a non-statistical approach. NN-based
methodologies do not require any predefined mathematical models but model tuning is
costly. If there are any patterns embedded in the data, NN come up with a result with
minimum errors. Other statistical methods have the advantage of providing relatively
inexpensive statistical forecasting models that only require historical data. However, the
accuracy of prediction of these models drops significantly when the time horizon is
extended, when the trends are not linear, or in the presence of some exogenous factors.
Two main groups of forecasting models were considered: Classical Forecasting
methods and Neural Networks. These were compared and show the relevance of the
NN-based group for an open logistics system in the context of PI.

2.1.1 Classical Forecasting group


The main classical forecasting methods (Auto-regressive Integrated Moving
Average with Exogenous factors (ARIMAX), Support Vector Regression (SVR), and
Multiple Linear Regression (MLR)) are detailed successively in this subsection. These
three methods are highlighted because they outperform the others in predicting non-
linear trends in customer demand while taking into account the effects of exogenous

factors. Furthermore, they have been widely used and implemented in real cases. For
example, the authors in (Carbonneau, Laframboise, and Vahidov 2008) implemented
SVR and MLR as benchmark models with a Recurrent Neural Network for demand
prediction of foundry data in Canada. The authors in (Aburto and Weber 2007; Ryu,
Noh, and Kim 2016) implemented an ARIMA model as a benchmark with a neural
network model to train and predict customer demand.
The mathematical formulation and some additional applications for ARIMAX,
SVR, MLR, and LSTM can be found in Appendix 1.

Autoregressive Integrated Moving Average with Exogenous factors (ARIMAX)


This model is an extension of the ARIMA model (Box and Jenkins 1970) with
exogenous factors, which are extra factors that could affect the predicted parameter
(Aburto and Weber 2007; Supattana 2014). This model performed well with seasonal
and stationary demand (Zhang and Qi 2005). Some research has implemented the
concepts of this model. Aburto and Weber (2007), for example, implemented the
ARIMA hybrid forecasting model and neural networks to forecast the future trend of
customer demand for a Chilean supermarket. The model predicted the trend based on
the variation in historical daily demand and multiple relevant factors. In the
transportation domain, Cools et al. (2009) investigated the variation in daily traffic
taking into account seasonal and holiday effects.
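The core idea of ARIMAX, autoregressive terms plus exogenous regressors, can be sketched with an ARX(1) model fitted by ordinary least squares on synthetic data. Full implementations (e.g. statsmodels' SARIMAX) add differencing and moving-average terms; the data-generating coefficients below are illustrative assumptions, not values from the case study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
price = rng.uniform(8, 12, n)                       # exogenous factor: unit price
demand = np.empty(n)
demand[0] = 50
for t in range(1, n):                               # AR(1) process driven by price
    demand[t] = 10 + 0.6 * demand[t - 1] + 2.0 * price[t] + rng.normal(0, 0.5)

# Fit demand_t = c + phi * demand_{t-1} + beta * price_t by ordinary least squares
A = np.column_stack([np.ones(n - 1), demand[:-1], price[1:]])
c, phi, beta = np.linalg.lstsq(A, demand[1:], rcond=None)[0]
print(round(phi, 2), round(beta, 2))                # estimates close to 0.6 and 2.0
```

With enough observations, the fitted coefficients recover the autoregressive weight and the effect of the exogenous price, which is exactly the relationship ARIMAX models exploit.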

Support Vector Regression (SVR)


Support Vector Regression (SVR), which is part of the Support Vector Machine
(SVM), is one of the most popular models used in the literature to predict time-series
data in the supply chain (Wang 2012). A Support Vector Machine uses hyperplanes to
classify data. The SVM computes the equation of the hyperplane that divides the dataset
into classes. SVR extends the approach to forecasting. SVR is used in many forecasting
problems, in particular to forecast customer demand. In a manufacturing context,
Carbonneau et al. (2008) implemented SVR to predict customer demand based on past
orders with approximately 200 order days. The results obtained demonstrated that this
model offers high performance equivalent to that obtained with another method using
recurrent neural networks. Wang (2012) implemented this model to forecast mass
customization demand in the shoe industry in China. In addition, he measured the
forecasting performance with the Relative Mean Square Error and found it was
better than that of an RBF neural network.
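As a rough illustration of how SVR is applied to this kind of demand data, the sketch below uses scikit-learn's `SVR` with an RBF kernel on synthetic features standing in for historical demand and unit price; the dataset and hyperparameters are assumptions of this sketch, not the configuration used in the cited studies.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, (150, 2))                  # columns: past demand, unit price (illustrative)
y = 20 + 2 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.5, 150)

scaler = StandardScaler().fit(X[:120])            # SVR is sensitive to feature scale
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(scaler.transform(X[:120]), y[:120])
pred = model.predict(scaler.transform(X[120:]))   # forecasts for the 30 held-out days
print(pred.shape)
```

The epsilon-insensitive loss makes SVR robust to small fluctuations, one reason it is a common benchmark against recurrent networks.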

Multiple Linear Regression (MLR)


The Multiple Linear Regression model (MLR) is an extension of the simple
linear regression model using multiple independent variables to train the model
(Carbonneau, Laframboise, and Vahidov 2008). In general, this model is reputed to
predict the baseline trend. Some research has implemented this model as a benchmark
against a neural network. Carbonneau et al. (2008), for example, proposed MLR as a
forecasting model with a neural network. Benkachcha, Benhra, and El Hassani (2008)
compared MLR with an artificial neural network for predicting future sales based on
multiple independent variables in the supply chain. The results obtained with the two
forecasting models were similar when compared using the Mean Absolute Percentage
Error (MAPE).
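An MLR benchmark of the kind used in these comparisons can be sketched as follows, including the MAPE score mentioned above; the three independent variables and their coefficients are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, (100, 3))                       # three hypothetical predictors
y = 5 + X @ np.array([1.0, 2.0, 0.5]) + rng.normal(0, 0.1, 100)

mlr = LinearRegression().fit(X[:80], y[:80])           # train on the first 80 days
pred = mlr.predict(X[80:])
mape = 100 * np.mean(np.abs((y[80:] - pred) / y[80:])) # MAPE, as in the cited comparison
print(round(mape, 2))
```

Because the underlying relation here is linear, MLR scores a very low MAPE; its limits only show up once the trend becomes non-linear, as discussed above.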

Other classical models

The Random Walk and Exponential Triple Smoothing (ETS) models are
interesting benchmarks for recurrent neural networks. However, some constraints make
these two models incompatible with this experiment. Firstly, the ETS (Taylor 2010; A.
2016) and the Random Walk (Tyree and Long 1995; Nag and Mitra 2002) models are
fitted with a univariate input factor and the inputs in this experiment are multivariate:
unit price and historical daily demand. Secondly, the Random Walk model only
considers the last observed value, whereas LSTM considers various time lags in its
predictions (A. 2016).

For benchmarking purposes, ARIMAX, SVR, and MLR are frequently


compared with neural network-based approaches, in particular recurrent neural
networks (RNN), according to the references mentioned above. Details are provided in
the next section.

2.1.2 Neural Network group


Neural networks (NNs) are modelling techniques with a wide range of
applications in all areas, including discrete classification, learning, pattern recognition,
control systems, and statistical modelling, and are often used in forecasting. NNs have

the main advantage of learning the patterns in the data and the relationship between
inputs and outputs using a non-statistical approach. NN-based approaches in forecasting
do not require any predefined mathematical models. They try to capture, memorize, and
use the inner patterns or relationships to make predictions.
NNs mimic how biological neurons operate, communicate, and learn. A NN is
made of several layers of interconnected neurons. A specific learning algorithm governs
the learning process. This training process changes the weights across the network until
the network is identified as an optimal model that explains the patterns and links
between the variables.
NN models are one of the most popular models for forecasting non-linear
behaviour in Supply Chains (Carbonneau, Laframboise, and Vahidov 2008). More
particularly, Recurrent Neural Networks exhibit good performances with complex
forecasting problems such as financial data, production capacity, retailer transactions, or
any complex time-series data. Long Short-Term Memory (LSTM) is one of the highest
performing recurrent neural network models. In LSTM, the concept of a memory cell
(Greff et al. 2017; Sagheer and Kotb 2019) is used to build the neural network structure.

Long Short-Term Memory (LSTM) Neural Networks, or more explicitly recurrent neural
networks (RNN) with short-term and long-term memory, are the most successful RNN
architectures. They have enjoyed enormous popularity in many applications and
domains, including forecasting problems.
Both LSTM and RNN are fundamentally different from traditional feed-forward
neural networks. They are trained by backpropagation through time (BPTT) (P.J.
Werbos 1990). These sequence-based models can establish temporal correlations
between the previous information and the current circumstances. This characteristic is
ideal for demand forecasting problems, as the effects of past demand and historical
values of exogenous factors on future demand can be modelled. Indeed, in a supply
chain, demand not only depends on past values but also on the present and past values
of other factors in the chain.
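The memory-cell mechanism behind LSTM can be sketched as a single forward step in NumPy. The weights below are random placeholders meant only to illustrate the gating equations, not a trained model from this study.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step: input, forget and output gates plus a candidate state."""
    z = W @ x + U @ h_prev + b                 # stacked pre-activations, shape (4*H,)
    H = h_prev.size
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    i, f, o = sig(z[:H]), sig(z[H:2*H]), sig(z[2*H:3*H])
    g = np.tanh(z[3*H:])                       # candidate cell state
    c = f * c_prev + i * g                     # memory cell carries long-term information
    h = o * np.tanh(c)                         # hidden state: the short-term output
    return h, c

rng = np.random.default_rng(3)
n_in, n_hid = 2, 4                             # e.g. inputs: daily demand and unit price
W = rng.normal(0, 0.1, (4 * n_hid, n_in))
U = rng.normal(0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(6, n_in)):           # unroll over a 6-step sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)
```

The forget gate `f` decides how much of the previous cell state survives each step, which is what lets LSTM link past demand to future demand over long lags.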
Much research has implemented recurrent neural networks, especially LSTM
models, for predictions with time-series data. Navya (2011) proposed an Artificial
Neural Network to forecast the future trading volume of agricultural commodities. In
terms of accuracy and inequality, the method outperformed the MLR and ARIMA
approaches. Sagheer and Kotb (2019) proposed an LSTM recurrent neural network to

forecast the future production rate of petroleum products. In (Kantasa-ard et al. 2019),
LSTM also outperformed other approaches in predicting white sugar consumption in
Thailand.
As stated before, few studies deal with the forecasting problem in the context of
the physical internet, especially using NN techniques. The authors in (Qiao, Pan, and
Ballot 2019), for example, proposed a dynamic pricing model based on forecasting the
quantity of transported requests in the next auction periods. The objective was to
maximize the total profit of the transportation rounds. In a previous study (Kantasa-ard
et al. 2019), LSTM was used to predict white sugar consumption in Thailand in the
context of a PI network.
The literature is full of studies on forecasting techniques, mainly quantitative
methods. Of these methods, the most important in classical regression are MLR,
ARIMAX, and SVR. Of the NN-based methods, LSTM performs best (Sagheer and
Kotb 2019; Chen, Zhou, and Dai 2015). Table 1 summarizes the characteristics of these
models. The first column provides the Model name, followed by its group in the second
column. The third column recaps the model characteristics. The last three columns
provide a comparison of the models according to the most commonly encountered
criteria in the literature (Cao, Li, and Li 2019; Aburto and Weber 2007; Carbonneau,
Laframboise, and Vahidov 2008): performance with complex data, training period, and
performance with a non-linear trend. Performance with complex data concerns the
accuracy as well as the ability of the model to handle many factors. The training period
relates to the computational time during the training phase. The performance with a
non-linear trend shows how a model can capture the patterns in the data, especially non-
linear relations. The number of “+” in Table 1 shows the quality of each indicator.
These three indicators are highlighted because of the characteristics of the agricultural
datasets used in our experiments.

Forecasting Model (references) | Model Group | Characteristics | Complex data | Training period | Non-linear trend
ARIMAX (Aburto and Weber 2007; Supattana 2014; Cools, Elke, and Geert 2009) | Classical | Developed from the ARIMA model but also considers exogenous factors | ++ | ++ | +
SVR (Carbonneau, Laframboise, and Vahidov 2008; Wang 2012; Cao, Li, and Li 2019) | Classical | Part of the support vector machine model | +++ | +++ | ++
MLR (Carbonneau, Laframboise, and Vahidov 2008; Benkachcha, Benhra, and El Hassani 2008; Ramanathan 2012) | Classical | Extension of simple linear regression | ++ | +++ | +
LSTM (Sagheer and Kotb 2019; Kantasa-ard et al. 2019; Navya 2011; Chen, Zhou, and Dai 2015) | Neural Network | Based on the concept of a memory cell | ++++ | + | +++

Table 1. Comparison of forecasting model characteristics

As exhibited in Table 1, the LSTM model is particularly suited to dealing with


complex data and non-linear trends. However, LSTM requires a longer training period
compared to the other models. In addition, to improve prediction it is necessary to tune
hyperparameters in the model to reduce the error gap between the predicted and real
values (Ojha, Abraham, and Snášel 2017). To address this problem, an automated
hyperparameter tuning method is needed. In the following, some metaheuristic methods
that could speed-up the tuning phase are presented.

2.2 Metaheuristic methods for Neural Network parameter tuning
Trial and error is most commonly used for hyperparameter tuning in forecasting
models. However, finding an appropriate set of parameters this way is time-consuming,
and there is no guarantee that the solution will be better (Kim and Shin 2007).
Metaheuristic methods are an interesting way of reducing the time spent on
hyperparameter tuning. Ojha and his research team, for instance, proposed that some
metaheuristics such as genetic algorithms, particle swarm optimization, and ant colony
optimization are good exploitation and exploration tools for tuning hyperparameters in
feed-forward neural networks (Ojha, Abraham, and Snášel 2017). However, no single
method can handle all tuning problems perfectly. Hybrid metaheuristics have therefore
been put forward to improve the performance of the tuning phase. Indeed, the tuning
problem is complex for NNs in general, and for RNNs in particular: many behaviours
must be captured, and collaboration between two or more heuristics should be
beneficial. In the following, the focus is on two metaheuristics: the Genetic Algorithm
and Scatter Search.

Appendix 2 provides more details on the principle of the Genetic Algorithm and
Scatter Search.

2.2.1 Genetic Algorithm


Genetic Algorithms (GA) are one of the most well known and popular
metaheuristics used, particularly in the context of the supply chain (Altiparmak et al.
2006). GAs are stochastic search methods inspired by the biological evolution of living
beings (Goldberg 1989; Melanie 1999). They belong to the family of evolutionary
algorithms and their goal is to obtain an approximate solution in a reasonable time.
Some research implements genetic algorithms to optimize the machine learning
structure. Blanco et al. (2000) optimized the recurrent neural network structure of
grammatical inference using this metaheuristic. Kim and Shin (2007) implemented a
genetic algorithm to define the patterns of a stock market prediction model (i.e. time
delays, network structure factors). This method performed better than the trial-and-error
method. Sagheer and Kotb (2019) also implemented GA to tune hyperparameters in an
LSTM (i.e. number of hidden neural units, number of epochs, and lag size).
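A minimal GA for choosing LSTM hyperparameters from a dictionary might look as follows. The search space mirrors the kinds of parameters tuned in the cited studies (hidden units, epochs, lag size), but the values and the synthetic fitness function, standing in for a validation error that would normally require training an LSTM, are illustrative assumptions.

```python
import random

# Hypothetical search space over LSTM hyperparameters
space = {"units": [16, 32, 64, 128], "epochs": [50, 100, 200], "lag": [2, 4, 6]}
keys = list(space)

def fitness(ind):
    # Placeholder for the validation error of an LSTM trained with these settings
    return abs(ind["units"] - 64) + abs(ind["epochs"] - 100) + abs(ind["lag"] - 4)

def random_ind():
    return {k: random.choice(space[k]) for k in keys}

def crossover(a, b):
    return {k: random.choice((a[k], b[k])) for k in keys}

def mutate(ind, p=0.2):
    return {k: (random.choice(space[k]) if random.random() < p else ind[k]) for k in keys}

random.seed(42)
pop = [random_ind() for _ in range(20)]
for _ in range(30):
    pop.sort(key=fitness)                     # lower (error) is better
    parents = pop[:10]                        # elitist selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(10)]
    pop = parents + children
best = min(pop, key=fitness)
print(best)
```

Because genes are drawn from the same small dictionary, the population can converge prematurely, which is exactly the weakness discussed next and the motivation for hybridizing with scatter search.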
These studies demonstrate the performance of GA in optimizing the structure of
neural networks but some problems remain. Neural network hyperparameters, for

instance, are chosen randomly from the hyperparameter dictionary. As the network
parameters are generated from similar components in the dictionary, premature
convergence or local minima can occur before reaching the best solution (Dib et al.
2017). Therefore, constructing a hybrid method is a promising way to improve the
quality of the network structure and prevent premature convergence. Scatter Search
is such a complementary method and is described in the next section.
2.2.2 Scatter Search
Scatter Search (SS) is another metaheuristic method for constructing new
solutions based on the integration of existing or reference solutions (Laguna and Marti
2003). The purpose is to improve the performance of the solutions generated with the
various elements in the solution space.
Many studies propose this heuristic to improve their NN. Laguna and Martí
(2006), for example, implemented the concept of Scatter Search to train a single hidden
layer of a feed-forward neural network. They also compared the performance of the
Scatter Search with the classical backpropagation and extended Tabu Search methods
for around 15 instances. The results show that Scatter Search performs better with a
higher number of instances. Cuéllar, Delgado, and Pegalajar (2007) benchmarked their
hybrid training method of a recurrent neural network against a scatter search. Their
method produced the same good results as the scatter search.
The potential of scatter search was exploited in this paper to build a hybrid
metaheuristic with a genetic algorithm for tuning the hyperparameters of the LSTM. In
this perspective, the problem statements and assumptions of this research are proposed
in the next section. In addition, the results of implementing Scatter Search and a Genetic
Algorithm are presented in the results and analysis section (section 5.1).
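The combination step at the heart of scatter search can be illustrated on a single integer hyperparameter. The reference set, objective, and averaging rule below are simplified assumptions for illustration, not the exact operators of the hybrid method proposed in this paper.

```python
import itertools

# Illustrative objective over one hyperparameter (e.g. hidden units)
def objective(u):
    return (u - 60) ** 2

ref_set = [16, 32, 128]                        # diverse reference solutions
# Combine every pair of reference solutions by averaging (a structured combination)
trial = {round((a + b) / 2) for a, b in itertools.combinations(ref_set, 2)}
candidates = sorted(set(ref_set) | trial, key=objective)
ref_set = candidates[:3]                       # keep the best as the new reference set
print(ref_set)  # → [72, 80, 32]
```

Unlike random mutation, the new points are constructed from existing good solutions, which is why scatter search can replace the GA mutation step to steer the search away from premature convergence.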

3. Problem statements and Assumptions


Regarding the issues raised in the introduction, three main problem statements are
linked to this research. Firstly, customer demand nowadays fluctuates more and changes
over time. Secondly, raw materials are lacking on the production lines as well as
finished goods at the distribution centres to serve customers. Thirdly, connections
between all the parties in the supply chain are more complex. In the northern region of
Thailand, for example, there are many suppliers, distributors, and retailers. For instance,
the Big C (superstore company) has many distributors and small retailers in the northern
region (reference: https://corporate.bigc.co.th/). However, the connections between the

distributors and the retailers are organized by city or sub-region. Recent experience
has shown that it is not practical to balance customer demand and stock levels at the
distributors in the region. The research question is how to balance customer demand and
stock levels between fully connected distribution centres and retailers in the supply
chain.
The concept of PI has never been implemented in the context of the agricultural
product supply chain in Thailand. Therefore, the quantity of commodity crops must be
anticipated accurately enough to serve the retailers in the region, based on the proposed
forecasting model. Furthermore, the distribution flow of demand forecasting with
agricultural products was simulated by implementing the concept of PI. The forecasting
details and the simulation model are described in the methodology section.
The experimental data were obtained from the Thai Office of Agriculture for the
period from January 2010 to December 2017 (OAE Thailand 2019). There were two
main assumptions for customer demand in this experiment.
- Firstly, daily demand was generated randomly from the monthly quantity of
commodity crops: pineapple, cassava, and corn.
- Secondly, the total daily demand generated was equal to the monthly
quantity of commodity crops, based on an equal probability each day.
Customer demand, in this experiment, included all retailers in the northern
region.
Regarding the distribution flow, the example network in a PI context presented in
figure 1 assumes one production line, three PI-hubs, and two retailers in the lower
northern region of Thailand. All the components (production line, PI-hubs, retailers)
are interconnected.

Figure 1. Example of a distribution network in the context of the Physical Internet in the
lower northern region of Thailand.

4. Methodology
In this section, Figure 2 provides an overview of the proposed approach based on the
aforementioned problem statements and assumptions. Three items can be distinguished:
- Firstly, an appropriate forecasting model (item #1) was investigated taking
into account fluctuation in demand. The concept of the forecasting model is
to plan sufficient resources for all the parties in the chain. An LSTM
Recurrent Neural Network was considered and was implemented using
Python language with Keras and Sci-kit libraries.
- Secondly, automated tuning of the relevant parameters was proposed to
improve the performance of the forecasting model (item #2). The hybrid
metaheuristic used was constructed using a combination of a Genetic
Algorithm (GA) and Scatter Search, which replaced the GA Mutation
process.
- Thirdly, a simulation of a Physical Internet network using forecasting data
was conducted to investigate how to plan resources in a complex chain (item
#3) and to assess the effectiveness of the forecast data on reducing holding
and transportation costs. The simulation was performed using the NetLogo
multi-agent platform (Nouiri, Bekrar, and Trentesaux 2018).

Figure 2. Research structure flow chart

The details of the proposal are investigated in the following section. As per
figure 2, this section details successively the forecasting model, parameter tuning using
hybrid metaheuristics, and the simulation of the Physical Internet network.

4.1 The forecasting model


This section presents the five steps to forecast future demand: data gathering, data pre-
processing, implementing the forecasting model, computing the forecast, and evaluating
the model.
4.1.1 Data gathering
OAE Thailand provided the data. However, the data was monthly, whereas daily data is
required to provide enough observations for neural networks. There were approximately
3,000 observation days for each product.
distribution method, which is a multivariate probability distribution (Bouguila, Ziou,
and Vaillancourt 2003). This distribution method works well for estimating uncertainty
probabilities for all variables in a model in both symmetric and asymmetric modes.
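Daily disaggregation with a Dirichlet distribution can be sketched as follows; the monthly total, month length, and concentration parameter are illustrative assumptions, not values from the OAE dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
monthly_total = 90_000            # e.g. kg of pineapple in one month (illustrative)
days = 30

# Dirichlet weights sum to 1, so the daily demands sum exactly to the monthly total.
# A symmetric concentration (alpha = 50) gives roughly equal days with some noise;
# an asymmetric alpha vector would skew demand towards particular days.
weights = rng.dirichlet(np.full(days, 50.0))
daily = monthly_total * weights
print(round(daily.sum()))
```

This matches the two assumptions stated above: daily demand is random, yet its total reproduces the monthly quantity.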
4.1.2 Data pre-processing
Once the data had been gathered, it was paramount to pre-process it before carrying
out any predictions, as pre-processing can reduce the effect of noise and improve
performance when training and forecasting with neural networks and machine learning
techniques (Zhang and Qi 2005; Cao, Li, and Li 2019). This step involved data cleaning, data
transforming, and error validating. For data transformation, the data was normalized to
the same scale before training and predicting. The dataset was then separated into two
sets: 80 percent for the training set and 20 percent for the test set. This ratio was chosen
as it provided the best accuracy according to the trial-and-error experiment conducted
with different percentages.
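The normalization and 80/20 split described above can be sketched with scikit-learn's `MinMaxScaler`; the synthetic two-column dataset, and the choice to fit the scaler on the training portion only, are assumptions of this sketch.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(4)
data = rng.uniform(100, 500, (1000, 2))   # columns: daily demand, unit price (illustrative)

split = int(0.8 * len(data))              # 80 % training, 20 % test, kept in time order
train, test = data[:split], data[split:]

scaler = MinMaxScaler()                   # fit on the training set only, to avoid leakage
train_s = scaler.fit_transform(train)     # scaled to [0, 1]
test_s = scaler.transform(test)
print(train_s.shape, test_s.shape)
```

Keeping the split in time order matters for time-series data: shuffling would leak future observations into the training set.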

4.1.3 Implementing the forecasting models


Following the data pre-processing step, this section presents the forecasting process
with an LSTM recurrent neural network and classical machine learning models, as
mentioned previously. These models were applied to the data generated for three
commodity crops. The dataset for each crop comprised input variables (x) and predicted
outputs (y). The input variables were the historical daily demand and unit price of each
product, and the output variable was the predicted output of all the products for the next
period. The lag time, one of the most powerful methods to estimate the transit time
between historical data and predicted data in the experiment, was also considered
(Delhez and Deleersnijder 2008). Lag times of 2, 4, and 6 days were considered. The
principle of lag time is that the historical data in the previous period affects the data in a
future period (Delhez and Deleersnijder 2008). Furthermore, the LSTM model worked
well with a long lag time (Hochreiter and Schmidhuber 1997).
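Building the lagged supervised dataset can be sketched as follows; the function name and toy series are illustrative, and the same construction applies for lags of 2, 4, and 6 days.

```python
# Sketch of turning a series into a supervised dataset with a fixed lag
# (window) size. Each input row holds the previous `lag` observations;
# the target is the next day's value.

def make_lagged(series, lag):
    X, y = [], []
    for i in range(lag, len(series)):
        X.append(series[i - lag:i])  # the `lag` previous days
        y.append(series[i])          # the value to predict
    return X, y

series = [10, 20, 30, 40, 50, 60]
X, y = make_lagged(series, lag=2)
print(X[0], y[0])  # [10, 20] 30
```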

4.1.4 Computing the forecast


To compute the forecast, the data had to be denormalized after training and predicting.
The scaled data was converted to raw data of expected daily demand using the
inverse_transform method of the MinMaxScaler function.
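The de-normalization step amounts to inverting the min-max scaling, which is what MinMaxScaler's inverse_transform does; a minimal sketch, with illustrative training bounds:

```python
# Sketch of the de-normalization step: map a value scaled to [0, 1] back
# to the original demand scale by inverting the min-max formula.

def inverse_min_max(scaled_value, lo, hi):
    """Map a [0, 1] prediction back to raw demand units."""
    return scaled_value * (hi - lo) + lo

lo, hi = 924.50, 1400.18             # illustrative min/max of training demand
print(inverse_min_max(0.0, lo, hi))  # 924.5
print(inverse_min_max(1.0, lo, hi))
```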

4.1.5 Model evaluation


Once the forecast had been computed, the performance of the forecasting model was
assessed using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE),
Mean Absolute Percentage Error (MAPE), and Mean Absolute Scale Error (MASE)
scores (see equations (1)-(5)).


RMSE = √( (1/n) ∑_{i=1}^{n} (X_i − Y_i)² )    (1)

MAE = (1/n) ∑_{i=1}^{n} |X_i − Y_i|    (2)

MAPE = (1/n) ∑_{i=1}^{n} ( |X_i − Y_i| / |X_i| ) × 100    (3)

MASE = MAE / Q    (4)

Q = (1/(T−1)) ∑_{t=2}^{T} |X_t − X_{t−1}|    (5)

where: X_i is the real demand at period i
Y_i is the forecast demand at period i
n is the forecasting period
T is the training period

These scores measure the accuracy and goodness of fit between the real and
predicted values (Shafiullah et al. 2008; Bala 2010; Acar and Gardner 2012): the
smaller the scores, the smaller the deviation between the real and predicted
values. R-squared (R²), another evaluation criterion, measures the degree of
association between two variables in such a model (Cao, Li, and Li 2019); in this
case, the variables are the real and predicted values (see equation (6)).
R² = 1 − ( ∑_{i=1}^{n} (X_i − Y_i)² ) / ( ∑_{i=1}^{n} (X_i − X̄)² )    (6)

where X̄ is the mean of the real demand.

The unit root score obtained using the ADF (Augmented Dickey-Fuller) test
determines whether the predicted data is stationary or non-stationary. The null
hypothesis of the ADF test is H₀: ρ = 1, which means the sequence is non-stationary
if the root ρ is equal to one. Therefore, to reject the null hypothesis and conclude
that the data is stationary, the root ρ should be less than one and the ADF statistic
should be more negative than the critical value.
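The evaluation scores of equations (1)-(6) can be implemented directly; a minimal plain-Python sketch with hand-checkable values (for MASE, the scaling factor Q is computed over a training series, here taken equal to the test series for simplicity):

```python
import math

# Plain-Python sketch of the evaluation scores in equations (1)-(6).
# X is the real demand, Y the forecast demand.

def rmse(X, Y):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(X, Y)) / len(X))

def mae(X, Y):
    return sum(abs(x - y) for x, y in zip(X, Y)) / len(X)

def mape(X, Y):
    return 100 * sum(abs(x - y) / abs(x) for x, y in zip(X, Y)) / len(X)

def mase(X, Y, X_train):
    # Q: mean absolute one-step change over the training period (equation (5))
    q = sum(abs(X_train[t] - X_train[t - 1])
            for t in range(1, len(X_train))) / (len(X_train) - 1)
    return mae(X, Y) / q

def r_squared(X, Y):
    mean_x = sum(X) / len(X)
    ss_res = sum((x - y) ** 2 for x, y in zip(X, Y))
    ss_tot = sum((x - mean_x) ** 2 for x in X)
    return 1 - ss_res / ss_tot

X = [100, 110, 120, 130]   # real demand (illustrative)
Y = [102, 108, 125, 127]   # forecast demand (illustrative)
print(round(mae(X, Y), 3))        # 3.0
print(round(mase(X, Y, X), 3))    # 0.3  (Q = 10 for this series)
print(round(r_squared(X, Y), 3))  # 0.916
```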

After constructing the forecasting model, another important task to consider was
tuning the model parameters.

4.2 Tuning the model parameters with a hybrid metaheuristic

As described previously, a relevant process for tuning the hyperparameters of the LSTM
model (number of hidden layers, number of neural units in each layer, activation
function, and optimizer function) is needed to optimise its efficiency. Choosing
appropriate parameters manually took a long time for each dataset, and the loss value
remained high. Hyperparameters are generally chosen by trial and error, which means
trying possible configurations one by one to tune the hyperparameters of the
forecasting model structure.
Some studies propose metaheuristics to tune neural network parameters, as
mentioned in the literature review. The principle of implementing a hybrid
metaheuristic is shown in figure 3 and the details are outlined below. The genetic
algorithm component was inspired by Harvey (2017). The input data
(historical daily demand and unit price of each product) and the predicted outputs (daily
demand for the next period) were used to choose the hyperparameters.

Figure 3. Flow chart of the hybrid metaheuristic

Firstly, the algorithm starts the solution encoding by randomly generating the
population of LSTM hyperparameter network structures. In this case, four
hyperparameters were considered to construct the network structure: the number of
hidden layers, the number of neural units in each layer, activation functions, and
optimizer functions. These hyperparameters are the main parameters affecting the
performance of the forecasting model. Once the set of hyperparameter networks has
been generated, all the networks are trained and the algorithm returns a fitness score,
which is the loss value of each network. The network structures are then ranked in
descending order of fitness score. The algorithm then checks whether the last
network generation has been reached. If not, the performance of all the networks is
improved through the selection, crossover, and mutation processes of the genetic
algorithm.
Details of the genetic algorithm are provided in figure 4.

(A)

(B)

Figure 4. Process overview of a Hybrid Genetic Algorithm and Scatter Search (A);
Example network structures in selection, crossover, and mutation (B)

In figure 4A, after initializing the population of network structures, the


algorithm chooses the network structures, starting with the highest fitness value. The
appropriate selection probability is usually between 0.5 and 1 (Blanco, Delgado, and
Pegalajar 2000). Parent chromosomes are then chosen randomly from the selected
structures to produce a set of children, as shown in figure 4B. Finally, some
children from the list are chosen for the mutation process, in which one of their
parameters is mutated.
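The selection, crossover, and mutation operators can be sketched on hyperparameter "chromosomes" as follows. The candidate values echo the options reported in table 2, but the function names, the search space, and the mutation rate are illustrative assumptions, not the paper's exact implementation.

```python
import random

# Hedged sketch of GA operators on LSTM hyperparameter "chromosomes".
# Each network is a dict of the four tuned hyperparameters.

SPACE = {
    "layers":     [1, 2, 3],
    "units":      [32, 64, 100, 128],
    "activation": ["sigmoid", "elu", "relu"],
    "optimizer":  ["adam", "rmsprop"],
}

def random_network():
    return {k: random.choice(v) for k, v in SPACE.items()}

def crossover(parent_a, parent_b):
    """Each gene is inherited from one parent, chosen at random."""
    return {k: random.choice([parent_a[k], parent_b[k]]) for k in SPACE}

def mutate(network, rate=0.2):
    """With probability `rate`, resample one random hyperparameter."""
    if random.random() < rate:
        gene = random.choice(list(SPACE))
        network[gene] = random.choice(SPACE[gene])
    return network

random.seed(0)
child = mutate(crossover(random_network(), random_network()))
print(sorted(child))  # ['activation', 'layers', 'optimizer', 'units']
```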
However, the difference between the classical GA and the hybrid GA is an
intensification step achieved via a Scatter Search technique. In the hybrid method,
the Diversification Generation Method gathers the list of hyperparameter networks
produced by the crossover process. These networks then improve their performance
through the Improvement Method: some hyperparameters in each network are updated
with different values after crossover. In this case, the average number of hidden
layers and the average number of neural units of the parent networks are used to
construct new values for the network parameters. A similar approach has been applied
to convolutional neural networks to improve the performance of neural network
structures (Araújo et al. 2017). Once the algorithm has finished improving the
hyperparameter networks, the updated networks are trained again. The set of networks
is trained and the values of the network parameters adjusted until the last
generation has been completed.
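The improvement step, replacing a child's numeric hyperparameters with the averages of its parents' values while keeping the categorical genes, might look like the following sketch; the function name and rounding choice are illustrative assumptions.

```python
# Sketch of the Scatter Search improvement step: after crossover, the
# numeric hyperparameters of a child are replaced by the averages of its
# parents' values (rounded to integers); categorical genes are kept.

def improve(child, parent_a, parent_b):
    improved = dict(child)
    improved["layers"] = round((parent_a["layers"] + parent_b["layers"]) / 2)
    improved["units"] = round((parent_a["units"] + parent_b["units"]) / 2)
    return improved

pa = {"layers": 1, "units": 128, "activation": "elu", "optimizer": "rmsprop"}
pb = {"layers": 2, "units": 64, "activation": "elu", "optimizer": "rmsprop"}
child = {"layers": 1, "units": 64, "activation": "elu", "optimizer": "rmsprop"}
improved = improve(child, pa, pb)
print(improved)
# {'layers': 2, 'units': 96, 'activation': 'elu', 'optimizer': 'rmsprop'}
```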
Once the last generation has been trained, the algorithm returns the top five
hyperparameter networks in terms of loss values. The best network to train and predict
future demand in each dataset is then chosen.

4.3 Simulation model


As mentioned before, a simulation model is proposed to assess the performance of the
proposed forecasting approach. The forecast retailer demand (output of the LSTM) is
tested and compared to the real data in terms of holding and transportation costs. A
small deviation in these KPIs means that demand is predicted well. The inputs of our
simulator were the forecast or real demand of retailers, the ROP (Reorder Point), the
distance between all nodes, and the stock levels at each hub, as mentioned in figure 2.
The physical internet network tested comprised five nodes: one production line,
three PI-hubs, and two retailers.

As shown in figure 5, the simulation provides the daily variation in holding and
transportation costs.

Figure 5. Screenshot of the simulation model in the physical internet supply chain
The holding and transportation costs of real and forecast demand were
compared. A small deviation between the real and forecast results proves the
effectiveness of our proposed approach. An effective forecasting model leads to good
resource planning and, therefore, a decrease in supply chain costs. The same simulator
and the same configuration were used to simulate the predicted and real demand. The
configurations of the simulation model are detailed below.
Details of the configuration of the simulation model
As the PI concept is based on full connectivity between PI-hubs, a replenishment rule
needs to be chosen. In our simulation model, the replenishment policy was the same in
both experiments. The closest hub was always selected as a good replenishment node to
fulfil retailer demand. There were three main assumptions for the simulation.
 The order quantity of each retailer on each day was equal to daily demand.
 Each distribution hub had its own trucks and managed them separately.
 The stock levels at PI-hubs were sufficient for all orders (i.e. the initial stock
level at each hub was greater than the total predicted quantity).
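Under these assumptions, the closest-hub replenishment rule can be sketched as follows; the hub names, distances, and the stock-sufficiency guard (included for generality, even though stock is assumed sufficient here) are illustrative.

```python
# Minimal sketch of the replenishment rule: among PI-hubs with sufficient
# stock, the hub closest to the retailer fulfils the order.

def select_hub(retailer, order_qty, hubs):
    """hubs: {name: (distance_to_retailer_km, stock_level)}."""
    candidates = {name: dist for name, (dist, stock) in hubs.items()
                  if stock >= order_qty}
    return min(candidates, key=candidates.get)

hubs = {"hub_A": (120, 5000), "hub_B": (45, 5000), "hub_C": (80, 300)}
print(select_hub("retailer_1", 1200, hubs))  # hub_B (closest with enough stock)
```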
In accordance with the assumptions in section 3, the predicted daily demand of
two retailers was used to calculate the transportation and holding costs for the predicted
daily demand in the simulation. After the delivery of retailer orders, the stock levels at
the hub were updated daily. The distance travelled by the trucks during delivery was
also updated. The holding and transportation costs were calculated and updated using
the equations below, where T is a daily period.

Holding cost:
 total_holding_cost = ∑_{t=1}^{T} daily_holding_cost_hub(t)
 daily_holding_cost_hub = inventory_stock × 180

Transportation cost:
 total_transportation_cost = ∑_{t=1}^{T} daily_transportation_cost_truck(t)
 daily_transportation_cost_truck = travelled_distance × demand_quantity × 1.85
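The daily cost updates above can be sketched directly, using the unit costs of 180 THB per m³ held and 1.85 THB per km per ton transported; the example quantities are illustrative.

```python
# Sketch of the daily cost updates used in the simulation model.

HOLDING_RATE = 180      # THB per m3 of stock per day
TRANSPORT_RATE = 1.85   # THB per km per ton transported

def daily_holding_cost(inventory_stock):
    return inventory_stock * HOLDING_RATE

def daily_transportation_cost(travelled_distance, demand_quantity):
    return travelled_distance * demand_quantity * TRANSPORT_RATE

def total_cost(daily_costs):
    # Sum over the daily period T
    return sum(daily_costs)

# Example: 100 m3 held at a hub, and a 10 km trip carrying 2 tons
print(daily_holding_cost(100))                    # 18000
print(round(daily_transportation_cost(10, 2), 2)) # 37.0
```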
The unit holding cost was equal to 180 THB or €5.20 per m³ (based on the
Integrated Logistics Services Thailand 2019) and the unit transportation cost was equal
to 1.85 THB or €0.053 per km per ton (based on the Bureau of Standards and
Evaluation 2016). The simulation model was tested based on the predicted demand over
16 days and over 31 days. Then, the results (holding and transportation costs based on
predicted demand) were compared to the costs of real demand for the same period. The
main reason for focusing on 16 days and 31 days was to validate the deviation between
predicted and real demand based on different volumes of daily demand. The model
evaluation is described in section 5.3.

The performance of the hybrid metaheuristic implementation of the LSTM
architecture, compared with all the forecasting benchmarks, and the total cost in the
simulation model are presented in the results and analysis section.

5. Results and Analysis


The results are divided into three parts: evaluation of hyperparameter tuning, evaluation
of forecasting model performance, and comparison of the total cost in the simulation
model of the global supply chain in the PI context. All details are provided below.

5.1 Evaluation of hyperparameter tuning


In table 2 below, the different tuning methods are compared based on the
performance of the LSTM recurrent neural network, using only the white sugar
consumption data (Kantasa-ard et al. 2019). Tuning the parameters with the hybrid
method, a combination of a genetic algorithm and a scatter search, offers the best
solution compared to the classical genetic algorithm and the trial-and-error method
from previous research (Kantasa-ard et al. 2019). The results show that the hybrid
method provides the lowest RMSE and MAPE scores with both the training and test
datasets. Furthermore, its execution time was lower than that of the other tuning
methods. The number of epochs was 500, taken from a previous study (Kantasa-ard et
al. 2019).

Tuning            Hyperparameters                Execution    Prediction Performance
Solution          (No. of layers, neural units,  Time         RMSE       RMSE      MAPE       MAPE
                  activation, optimizer)         (minutes)    (Training) (Test)    (Training) (Test)
Trial-and-error   (2, 100, sigmoid, adam)        480          239.31     439.69    2.68       7.08
GA                (1, 128, elu, rmsprop)         58           144.38     333.3     2.5        6.37
Hybrid GA & SS    (2, 64, elu, rmsprop)          23           143.41     317.82    2.5        6.13

Table 2. Comparison of the performance of the LSTM hyperparameter tuning methods


The hybrid method worked well in the previous study and was therefore also
applied to the other agricultural products. Based on the results in table 2, the
hyperparameter structure of the LSTM was constructed using the hybrid metaheuristic
method.

5.2 Evaluation of the forecasting model performance


The LSTM model was implemented with datasets for three commodity crops: cassava,
corn, and pineapple. In addition, the LSTM model was benchmarked against other
forecasting models: Multiple Linear Regression, Support Vector Regression, and
ARIMAX. Five means of evaluation were considered in this section: RMSE, MAE,
MAPE, and MASE for accuracy, and R-squared (R²) for the degree of association
between the predicted output and the expected real output. Furthermore, the ADF score
was used to assess whether the predicted demand was stationary. Details of all the
evaluation tools are provided in section 4.1.5 above. The first prediction concerns
pineapple production; the results are presented in table 3(A-C).

Day   Real Demand   LSTM      SVR       MLR       ARIMAX
0     1194.92       1203.92   1208.16   1241.61   1193.82
1     1271.00       1168.39   1179.93   1148.77   1151.10
2     1046.42       1228.89   1243.15   1228.11   1229.33
3     1204.37       1157.14   1142.08   1137.31   1107.48
4     924.50        1116.25   1172.98   1132.39   1139.39
5     1285.43       1079.12   1062.06   1038.25   1006.81
6     1137.67       1151.09   1187.99   1139.51   1153.65
7     1360.33       1098.63   1170.77   1208.80   1168.79
8     1250.50       1390.81   1254.63   1251.89   1267.26
9     1279.55       1208.25   1249.99   1301.39   1266.15
10    1278.62       1237.26   1239.64   1261.31   1250.31
11    1223.95       1226.49   1246.64   1274.72   1258.91
12    1400.18       1173.53   1217.75   1247.82   1222.93
13    1165.68       1402.03   1299.19   1306.89   1320.61
14    1324.80       1141.39   1216.87   1282.06   1223.43
15    1184.08       1340.30   1246.00   1245.50   1252.95
(A)

Data with time lag2
Forecasting   RMSE     RMSE     MAPE     MAPE     MAE      MAE      MASE     MASE
Model         Train    Test     Train    Test     Train    Test     Train    Test
SVR           179.86   152.31   13.01    11.15    105.19   100.46   0.890    0.850
MLR           187.58   153.04   13.01    11.25    109.04   101.87   0.923    0.862
ARIMAX        189.19   331.29   12.9     41.98    108.78   103.11   0.921    0.873
LSTM          173.92   158.45   11.91    12.18    102.5    106.9    0.868    0.905

Data with time lag4
Forecasting   RMSE     RMSE     MAPE     MAPE     MAE      MAE      MASE     MASE
Model         Train    Test     Train    Test     Train    Test     Train    Test
SVR           177.14   150.92   12.75    11.11    101.75   98.38    0.861    0.832
MLR           185.78   150.25   12.85    11       107.58   99.71    0.910    0.843
ARIMAX        186.36   150.26   12.75    11.04    107.5    99.82    0.909    0.844
LSTM          178.04   150.91   14.61    11.18    107.69   99.52    0.911    0.842

Data with time lag6
Forecasting   RMSE     RMSE     MAPE     MAPE     MAE      MAE      MASE     MASE
Model         Train    Test     Train    Test     Train    Test     Train    Test
SVR           177.21   151.14   12.71    11.1     101      98.35    0.854    0.831
MLR           185.7    150.16   12.85    10.97    107.45   99.56    0.908    0.841
ARIMAX        186.27   150.15   12.75    11       107.32   99.75    0.907    0.843
LSTM          185.43   149.24   14.04    10.97    109.18   98.05    0.923    0.829

(B)
Forecasting   Data with time lag2    Data with time lag4    Data with time lag6
Model         R² Train   R² Test     R² Train   R² Test     R² Train   R² Test
SVR           0.93       0.91        0.93       0.91        0.93       0.91
MLR           0.92       0.9         0.92       0.91        0.92       0.91
ARIMAX        0.92       0.57        0.92       0.91        0.93       0.91
LSTM          0.94       0.9         0.93       0.91        0.93       0.92

(C)
Table 3. Examples of real and predicted daily demand with relevant forecasting models
for pineapple with time lag2 (A); Performance of the forecasting model for future
demand of pineapple (B)-(C)

(A)

ADF statistic: -4.097


Confidence level Critical val.
95% -2.867
90% -2.569

(B)

Figure 6. Comparison of the trends in forecast and real demand using LSTM and SVR
models with time lag6 (A); ADF statistic score of LSTM demand forecasting with time
lag6 (B).

As shown in Table 3, both LSTM and Support Vector Regression (SVR)
performed well in terms of accuracy and the degree of association between the
predicted and real demand. Regarding accuracy, LSTM performed best with time lag6,
owing to its ability to carry the learned forecasting pattern over to the test
dataset and minimize the error there, whereas SVR had the best forecasting
performance with time lag2 and lag4. For the degree of association, LSTM provided
the best performance with time lag6. Furthermore, when the performance of each
dataset was considered, the models were more effective with time lag6 according to
the accuracy and coefficient of determination values, as shown in figure 6. In
addition, the predicted demand with time lag6 was stationary according to the ADF
score. This suggests that LSTM works well with longer time-series histories. Next,
the experiments with the other commodity crops are presented, as shown in tables 4-5
and figures 7-8.

Data with time lag2
Forecasting   RMSE      RMSE       MAPE     MAPE      MAE       MAE       MASE    MASE
Model         Train     Test       Train    Test      Train     Test      Train   Test
SVR           4262.6    11848.73   97.00    4673.00   2288.9    8808.87   1.128   4.341
MLR           4252.25   4458.47    25.00    34.39     1961.47   1825.98   0.967   0.900
ARIMAX        4291.83   6112.61    22.21    419.42    2000.98   3654.29   0.986   1.801
LSTM          4266.33   4979.95    55.34    81.65     2119.41   2574.58   1.044   1.269

Data with time lag4
Forecasting   RMSE      RMSE      MAPE     MAPE      MAE       MAE       MASE    MASE
Model         Train     Test      Train    Test      Train     Test      Train   Test
SVR           4294.44   8845.33   109.78   3363.86   2360.21   6563.01   1.165   3.239
MLR           4250.66   4445.96   24.63    26.55     1955.66   1810.97   0.965   0.894
ARIMAX        4272.16   4449.83   24.47    25.3      1965.69   1808.88   0.970   0.893
LSTM          4245.34   4699.5    47.77    48.49     2077.09   2392.05   1.025   1.180

Data with time lag6
Forecasting   RMSE      RMSE      MAPE     MAPE      MAE       MAE       MASE    MASE
Model         Train     Test      Train    Test      Train     Test      Train   Test
SVR           4241.51   7919.86   110.8    2802.22   2339.93   5776.72   1.157   2.855
MLR           4243.79   4496.93   24.4     23.1      1953.51   1829.13   0.966   0.904
ARIMAX        4237.65   5431.77   24.83    69.42     1955.94   2243.67   0.967   1.109
LSTM          4256.19   4779.04   60.19    89.46     2121.73   2421.65   1.049   1.197

(A)
Forecasting   Data with time lag2    Data with time lag4    Data with time lag6
Model         R² Train   R² Test     R² Train   R² Test     R² Train   R² Test
SVR           0.95       0.7         0.95       0.83        0.96       0.86
MLR           0.96       0.95        0.96       0.95        0.96       0.96
ARIMAX        0.96       0.92        0.96       0.96        0.96       0.94
LSTM          0.96       0.95        0.96       0.95        0.96       0.95

(B)

Table 4. Performance of the forecasting model for future demand of cassava (A)-(B)

(A)
ADF statistic: -3.191
Confidence level Critical val.
95% -2.867
90% -2.57

(B)
Figure 7. Comparison of the trends in forecast and real demand using LSTM and
ARIMAX models with time lag4 (A); The ADF statistic score for LSTM demand
forecasting with time lag4 (B).
The performance evaluation in table 4 shows that LSTM performs well even
though Multiple Linear Regression (MLR) and ARIMAX were better in terms of
accuracy and degree of association between predicted and real demand. In this dataset,
the accuracy scores for LSTM were similar to those of the MLR model with time lag2,
whereas ARIMAX performed better with time lag4 and lag6. Regarding the degree of
association, the LSTM scores were very good with all the time lags compared to the
best scores obtained with the ARIMAX and MLR models. In addition, the predicted
demand with time lag4 was stationary based on the ADF score. The best performance of
the LSTM model was the prediction pattern with time lag4, as shown in figure 7.

Data with time lag2
Forecasting   RMSE      RMSE      MAPE     MAPE     MAE       MAE       MASE    MASE
Model         Train     Test      Train    Test     Train     Test      Train   Test
SVR           1873.48   2447.49   53.03    91.57    1066.26   1312.98   1.027   1.264
MLR           1912.62   2384.45   20.06    28.09    986.79    1069.35   0.950   1.030
ARIMAX        1901.65   2407.09   19.49    24.34    975.99    1062.43   0.940   1.023
LSTM          1912.48   2329.84   35.2     53.18    1017.05   1155.15   0.979   1.112

Data with time lag4
Forecasting   RMSE      RMSE      MAPE     MAPE     MAE       MAE       MASE    MASE
Model         Train     Test      Train    Test     Train     Test      Train   Test
SVR           1864.16   2438.99   54.44    93.42    1064.91   1334.2    1.022   1.280
MLR           1909.91   2373.01   19.47    26.77    981.4     1045.92   0.942   1.004
ARIMAX        1906.17   2997.26   19.47    25.65    997.17    1361.22   0.957   1.306
LSTM          1904.34   2397.96   37.35    59.46    1039.32   1226.47   0.997   1.177

Data with time lag6
Forecasting   RMSE      RMSE      MAPE     MAPE     MAE       MAE       MASE    MASE
Model         Train     Test      Train    Test     Train     Test      Train   Test
SVR           1853.44   2465.06   55.12    92.89    1055.28   1341.78   1.012   1.287
MLR           1907.17   2380.15   19.39    26.27    980.82    1055.7    0.941   1.013
ARIMAX        1906.2    2383.93   19.45    26.62    978.43    1056.88   0.939   1.014
LSTM          1877.36   2365.03   40.15    61.68    1029.08   1146.26   0.987   1.099

(A)

Forecasting   Data with time lag2    Data with time lag4    Data with time lag6
Model         R² Train   R² Test     R² Train   R² Test     R² Train   R² Test
SVR           0.96       0.94        0.96       0.94        0.96       0.94
MLR           0.96       0.94        0.96       0.94        0.96       0.95
ARIMAX        0.96       0.94        0.96       0.91        0.96       0.94
LSTM          0.96       0.95        0.96       0.94        0.96       0.95

(B)

Table 5. Performance of the forecasting model for future demand of corn (A)-(B)

(A)

ADF statistic: -3.73
Confidence level Critical val.
95% -2.867
90% -2.57

(B)
Figure 8. Comparison of the trends in forecast and real demand using LSTM and MLR
models with time lag4 (A); The ADF statistic score for LSTM demand forecasting with
time lag6 (B).
The results of the performance evaluation are shown in Table 5. The RMSE and
R² scores demonstrate the good performance of the LSTM model for predicting demand
with time lag2 and lag6. The accuracy scores were better with the ARIMAX model with
time lag2 and the MLR model with time lag4 and lag6. In addition, the predicted
demand with time lag6 was stationary based on the ADF score. Moreover, the best
performance of the LSTM model was the prediction pattern with time lag6, as shown in
figure 8.
The overall performance of the forecasting models implemented for three
commodity crops with different dataset conditions is summarized. An overview is
shown in table 6 below.

Dataset     Pineapple                  Cassava                   Corn
            Accuracy    Degree of      Accuracy    Degree of     Accuracy    Degree of
                        Association                Association               Association
Time lag2   SVR         SVR            MLR         MLR, LSTM     ARIMAX      LSTM
Time lag4   SVR, MLR    LSTM, SVR      ARIMAX      ARIMAX        MLR         SVR, MLR, LSTM
Time lag6   LSTM        LSTM           MLR         MLR           MLR         MLR, LSTM

Table 6. The best performances of the forecasting models for future demand of all
commodity crops and relevant conditions

When considering the overall performance of these models, LSTM performed
similarly to Multiple Linear Regression (MLR) and Support Vector Regression (SVR) on
the pineapple dataset, particularly with time lag6. For the cassava and corn
datasets, MLR and ARIMAX were more accurate. Regarding the degree of association,
LSTM achieved strong data correlation with both the training and test datasets for
all the products, especially pineapple and corn. Besides, looking at the graph of
each product, LSTM performed well with continuous fluctuation whereas MLR and
ARIMAX performed well with discrete fluctuation. Moreover, the hybrid solution used
to tune the LSTM hyperparameters improved the training and prediction performance
more than the trial-and-error technique.

Regarding the prediction characteristics, all the product graphs are seasonal.
However, the trends for each product are different; pineapple was non-linear whereas
the other products were more linear. For this reason, LSTM performed well with
predicting demand for pineapple, and the other classical models, MLR and ARIMAX,
were good for predicting demand for cassava and corn.

Once the forecasting process was finished, the prediction results obtained with
the LSTM forecasting model were used as inputs to calculate the total cost in the
simulation model of the Physical Internet, as mentioned in the next section. The main
perspective was to demonstrate the performance of the distribution flow after
implementing demand forecasting.

5.3 Total cost comparison in the simulation model of a global supply chain in the PI
context
Based on the assumptions of the simulation model stated previously, the simulation
model proposed by Nouiri, Bekrar, and Trentesaux (2018) was adapted to simulate the
distribution flow in a physical internet network inspired by the distribution centres
in the northern region of Thailand. In the original model, demand was randomly
generated and the simulation was implemented to estimate the total distribution
cost.
The forecast demand for pineapple given by the LSTM model was compared
with the real demand via the multi-agent simulator. The holding and transportation costs

were used as KPI. These costs were also compared with those obtained when
considering real demand, as mentioned in table 7. The configuration details and
assumptions of the physical internet supply chain simulation are described in section 4.3
and the physical internet distribution flow is shown in figure 1 above.
The service level was based on sufficient stock levels to cope with daily demand
at each retailer. The holding costs and transportation costs are detailed in table 7.
       Forecast Demand                              Real Demand
Day    Total       Holding     Transportation      Total       Holding     Transportation
       Demand      Cost        Cost                Demand      Cost        Cost
0 1203.92 152578.8 5933.68 1194.92 152647.2 5890.6
1 1168.39 143434.8 5759.1 1271.00 142700.4 6264.7
2 1228.89 133815.6 6058.39 1046.42 134510.4 5158.24
3 1157.14 124758 5704.68 1204.37 125085.6 5935.95
4 1116.25 116020.8 5502.89 924.50 117849.6 4557.39
5 1079.12 107575.2 5319.23 1285.43 107791.2 6335
6 1151.09 98568 5672.93 1137.67 98888.4 5607.18
7 1098.63 89917.2 5414.45 1360.33 88243.2 6704.58
8 1390.81 79088.4 6854.23 1250.50 78458.4 6162.68
9 1208.25 69631.2 5956.36 1279.55 68443.2 6307.8
10 1237.26 59947.2 6099.2 1278.62 58435.2 6303.26
11 1226.49 50349.6 6044.77 1223.95 48855.6 6033.44
12 1173.53 41166 5784.03 1400.18 37897.2 6901.84
13 1402.03 30193.2 6910.91 1165.68 28774.8 5745.49
14 1141.39 21261.6 5625.31 1324.80 18406.8 6530
15 1340.30 10771.2 6607.08 1184.08 9140.4 5836.19
Total  19323.49    1329076.8   95247.24           19532.00    1316127.6   96274.34
(A)

Duration   Forecast Demand               Real Demand                  Deviation Percentage
(days)     Holding      Transportation   Holding      Transportation  Holding    Transportation
           Cost         Cost             Cost         Cost            Cost       Cost
16 days    1329076.8    95247.24         1316127.6    96274.34        0.98       1.07
31 days    4788446.4    187134.07        4838018      186583.28       1.02       0.3
(B)
Table 7. Comparison of holding costs and transportation costs between forecast and real
demand over 16 days (A); Deviation in holding cost and transportation cost between
forecast and real demand over 16 days and 31 days (B)

Regarding the results in table 7, the small deviations of 0.98% and 1.02% in the
holding cost and of 1.07% and 0.3% in the transportation cost, over 16 days and 31
days respectively, show that the forecasting model is effective even when the dataset
is large. These results could help companies plan the budget for storing and
transporting goods based on forecast demand.

6. Conclusion
There are three main contributions in this research. Firstly, the proposed LSTM model
performed well for demand forecasting compared with classical machine learning
methods, even though the ARIMAX and Multiple Linear Regression models performed
well for some products in terms of accuracy. In addition, the overall performance did
not differ greatly from that of the classical forecasting models. The prediction capability of
LSTM was good with continuous fluctuation such as with the pineapple dataset,
whereas the classical forecasting models were reasonably good with discrete
fluctuation. In terms of the degree of association, LSTM captured the patterns of future
demand and real demand better than the other models based on the coefficient of
determination. Secondly, the use of a hybrid metaheuristic was proposed to automate
the tuning of the hyperparameters in the LSTM model. The accuracy and the
computational time were better than with the trial-and-error method. Finally, for the
total distribution cost in the Physical Internet simulation, the holding cost varied by
approximately one percent between forecast and real demand, and the transportation
cost varied from 0.3 to 1 percent. Therefore, the demand forecasting was effective and
led to good resource planning and optimization of the total supply chain cost in the
context of the Physical Internet.
For future research, it would be interesting to focus more on the hybrid methods
of forecasting models. For instance, researchers could implement the concept of the
LSTM model with other regression models to improve the prediction of customer
demand in the supply chain. Next, for tuning the hyperparameters, researchers could
also consider implementing other metaheuristics to increase the performance of the
network structure using demand from the forecasting models. Moreover, researchers
should consider alternative ways of improving routing, which would further reduce
distribution costs in the Physical Internet. In addition, when the number of hubs and
retailers is larger, they can implement the concept of dynamic clustering (Kantasa-Ard
et al. 2019) to cluster a group of distribution hubs and retailers before constructing
connected routes and planning the budget for the distribution process.

Acknowledgements
Regarding the successful results in this research work, we would like to thank an internship
student, Niama Boumzebra, for preparing the dataset to test the forecasting performance with
classical forecasting methods. We would also like to thank the Office of Agricultural
Economics Thailand for providing the initial dataset for generating data in this experiment.
Finally, we would like to thank Campus France and Burapha University for their sponsorship.

References

Ashraf, A. 2016. “Using Multiple Seasonal Holt-Winters Exponential Smoothing to


Predict Cloud Resource Provisioning.” International Journal of Advanced
Computer Science and Applications 7 (11): 91–96.
https://doi.org/10.14569/ijacsa.2016.071113.

Aburto, Luis, and Richard Weber. 2007. “Improved Supply Chain Management Based
on Hybrid Demand Forecasts.” Applied Soft Computing Journal 7 (1): 136–44.
https://doi.org/10.1016/j.asoc.2005.06.001.

Acar, Yavuz, and Everette S. Gardner. 2012. “Forecasting Method Selection in a Global
Supply Chain.” International Journal of Forecasting 28 (4): 842–48.
https://doi.org/10.1016/j.ijforecast.2011.11.003.

Altiparmak, Fulya, Mitsuo Gen, Lin Lin, and Turan Paksoy. 2006. “A Genetic
Algorithm Approach for Multi-Objective Optimization of Supply Chain
Networks.” Computers and Industrial Engineering 51 (1): 196–215.
https://doi.org/10.1016/j.cie.2006.07.011.

Araújo, Teresa, Guilherme Aresta, Bernardo Almada-Lobo, Ana Maria Mendonça, and
Aurélio Campilho. 2017. “Improving Convolutional Neural Network Design via
Variable Neighborhood Search.” Lecture Notes in Computer Science (Including
Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in
Bioinformatics) 10317 LNCS: 371–79. https://doi.org/10.1007/978-3-319-59876-
5_41.

Bala, P.K. 2010. “Decision Tree Based Demand Forecasts for Improving Inventory
Performance.” IEEM2010 - IEEE International Conference on Industrial

Engineering and Engineering Management, 1926–30.
https://doi.org/10.1109/IEEM.2010.5674628.

Benkachcha, S., J. Benhra, and H. El Hassani. 2008. “Demand Forecasting in Supply


Chain: Comparing Multiple Linear Regression and Artificial Neural Networks
Approaches.” International Review on Modelling and Simulations 7 (2): 279–86.
https://doi.org/10.15866/iremos.v7i2.641.

Blanco, A., M. Delgado, and M. C. Pegalajar. 2000. “A Genetic Algorithm to Obtain


the Optimal Recurrent Neural Network.” International Journal of Approximate
Reasoning 23 (1): 67–83. https://doi.org/10.1016/S0888-613X(99)00032-8.

Bouguila, Nizar, Djemel Ziou, and Jean Vaillancourt. 2003. “Novel Mixtures Based on
the Dirichlet Distribution: Application to Data and Image Classification.” In
International Workshop on Machine Learning and Data Mining in Pattern
Recognition, 172–81. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-
45065-3_15.

Box, G.E.P., and G.M. Jenkins. 1970. Time Series Analysis: Forecasting and Control.
San Francisco: Holden-Day.

Bureau of Standards and Evaluation, THAILAND. 2016. “List of Truck Transportation


Cost.” 2016. http://hwstd.com/Uploads/Downloads/9/วิธีปฏิบัติ07.pdf.

Cao, Jian, Zhi Li, and Jian Li. 2019. “Financial Time Series Forecasting Model Based
on CEEMDAN and LSTM.” Physica A: Statistical Mechanics and Its Applications
519: 127–39. https://doi.org/10.1016/j.physa.2018.11.061.

Carbonneau, Real, Kevin Laframboise, and Rustam Vahidov. 2008. “Application of Machine Learning Techniques for Supply Chain Demand Forecasting.” European Journal of Operational Research 184 (3): 1140–54. https://doi.org/10.1016/j.ejor.2006.12.004.

Chen, Kai, Yi Zhou, and Fangyan Dai. 2015. “A LSTM-Based Method for Stock
Returns Prediction: A Case Study of China Stock Market.” Proceedings - 2015
IEEE International Conference on Big Data, IEEE Big Data 2015, 2823–24.
https://doi.org/10.1109/BigData.2015.7364089.

Cools, Mario, Elke Moons, and Geert Wets. 2009. “Investigating the Variability in Daily Traffic Counts through Use of ARIMAX and SARIMAX Models: Assessing the Effect of Holidays on Two Site Locations.” Transportation Research Record 2136 (1): 57–66. https://doi.org/10.3141/2136-07.

Cuéllar, M. P., M. Delgado, and M. C. Pegalajar. 2007. “Problems and Features of Evolutionary Algorithms to Build Hybrid Training Methods for Recurrent Neural Networks.” ICEIS 2007 - 9th International Conference on Enterprise Information Systems, Proceedings AIDSS: 204–11. https://doi.org/10.5220/0002383502040211.

Delhez, Éric J.M., and Éric Deleersnijder. 2008. “Age and the Time Lag Method.”
Continental Shelf Research 28 (8): 1057–67.
https://doi.org/10.1016/j.csr.2008.02.003.

Greff, Klaus, Rupesh K. Srivastava, Jan Koutnik, Bas R. Steunebrink, and Jurgen
Schmidhuber. 2017. “LSTM: A Search Space Odyssey.” IEEE Transactions on
Neural Networks and Learning Systems 28 (10): 2222–32.
https://doi.org/10.1109/TNNLS.2016.2582924.

Harvey, Matt. 2017. “Let’s Evolve a Neural Network with a Genetic Algorithm.”
Coastlineautomotion. 2017. https://blog.coast.ai/lets-evolve-a-neural-network-
with-a-genetic-algorithm-code-included-8809bece164?

Hochreiter, Sepp, and Jurgen Schmidhuber. 1997. “Long Short-Term Memory.” Neural Computation 9 (8): 1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.

Integrated Logistics Services Thailand, ILS. 2019. “Logistics Pricing.” 2019. http://www.ils.co.th/th/pricing/.

Janvier-James, Assey Mbang. 2011. “A New Introduction to Supply Chains and Supply
Chain Management: Definitions and Theories Perspective.” International Business
Research 5 (1): 194–208. https://doi.org/10.5539/ibr.v5n1p194.

Kantasa-ard, Anirut, Abdelghani Bekrar, Abdessamad Ait el cadi, and Yves Sallez. 2019. “Artificial Intelligence for Forecasting in Supply Chain Management: A Case Study of White Sugar Consumption Rate in Thailand.” In 9th IFAC Conference on Manufacturing Modelling, Management and Control MIM 2019, Berlin, Germany. Berlin: IFAC.

Kantasa-Ard, Anirut, Maroua Nouiri, Abdelghani Bekrar, Abdessamad Ait El Cadi, and Yves Sallez. 2019. “Dynamic Clustering of PI-Hubs Based on Forecasting Demand in Physical Internet Context.” In Studies in Computational Intelligence, 853:27–39. Springer Verlag. https://doi.org/10.1007/978-3-030-27477-1_3.

Kim, Hyun jung, and Kyung shik Shin. 2007. “A Hybrid Approach Based on Neural
Networks and Genetic Algorithms for Detecting Temporal Patterns in Stock
Markets.” Applied Soft Computing Journal 7 (2): 569–76.
https://doi.org/10.1016/j.asoc.2006.03.004.

Laguna, Manuel, and Rafael Martí. 2003. Scatter Search: Methodology and Implementations in C. Operations Research/Computer Science Interfaces Series. Boston, MA: Kluwer Academic Publishers.

Laguna, Manuel, and Rafael Martí. 2006. “Scatter Search.” In Metaheuristic Procedures for Training Neural Networks, 139–52. Springer US. https://doi.org/10.1007/0-387-33416-5_7.

Montreuil, Benoit, Russell D. Meller, and Eric Ballot. 2013. Physical Internet
Foundations. Studies in Computational Intelligence. Vol. 472. IFAC.
https://doi.org/10.1007/978-3-642-35852-4_10.

Nag, Ashok K., and Amit Mitra. 2002. “Forecasting Daily Foreign Exchange Rates
Using Genetically Optimized Neural Networks.” Journal of Forecasting 21 (7):
501–11. https://doi.org/10.1002/for.838.

Navya, Nanaiah. 2011. “Forecasting of Futures Trading Volume of Selected Agricultural Commodities Using Neural Networks.” University of Agricultural Sciences, Bengaluru.

Nouiri, Maroua, Abdelghani Bekrar, and Damien Trentesaux. 2018. “Inventory Control
under Possible Delivery Perturbations in Physical Internet Supply Chain Network.”
In 5th International Physical Internet Conference, 219–31. Groningen.

OAE Thailand, Office of Agricultural Economics. 2019. “The Information of Commodity Crops.” 2019. http://www.oae.go.th.

Ojha, Varun Kumar, Ajith Abraham, and Václav Snášel. 2017. “Metaheuristic Design
of Feedforward Neural Networks: A Review of Two Decades of Research.”
Engineering Applications of Artificial Intelligence 60 (April): 97–116.
https://doi.org/10.1016/j.engappai.2017.01.013.

Werbos, P.J. 1990. “Backpropagation Through Time: What It Does and How to Do It.” Proceedings of the IEEE, 78: 1550–60. http://ieeexplore.ieee.org/document/58337/?reload=true.

Qiao, Bin, Shenle Pan, and Eric Ballot. 2019. “Dynamic Pricing for Carriers in Physical
Internet with Peak Demand Forecasting.” IFAC-PapersOnLine 52 (13): 1663–68.
https://doi.org/10.1016/j.ifacol.2019.11.439.

Ramanathan, Usha. 2012. “Supply Chain Collaboration for Improved Forecast Accuracy of Promotional Sales.” International Journal of Operations & Production Management 32 (6): 676–95. https://doi.org/10.1108/01443571211230925.

Ryu, Seunghyoung, Jaekoo Noh, and Hongseok Kim. 2016. “Deep Neural Network
Based Demand Side Short Term Load Forecasting.” 2016 IEEE International
Conference on Smart Grid Communications, SmartGridComm 2016, 308–13.
https://doi.org/10.1109/SmartGridComm.2016.7778779.

Sagheer, Alaa, and Mostafa Kotb. 2019. “Time Series Forecasting of Petroleum
Production Using Deep LSTM Recurrent Networks.” Neurocomputing 323: 203–
13. https://doi.org/10.1016/j.neucom.2018.09.082.

Shafiullah, G. M., Adam Thompson, Peter J. Wolfs, and Shawkat Ali. 2008. “Reduction
of Power Consumption in Sensor Network Applications Using Machine Learning
Techniques.” IEEE Region 10 Annual International Conference,
Proceedings/TENCON. https://doi.org/10.1109/TENCON.2008.4766574.

Supattana, Natsupanun. 2014. “Steel Price Index Forecasting Using ARIMA and
ARIMAX Model.” National Institute of Development Administration.
http://econ.nida.ac.th/index.php?
option=com_content&view=article&id=3021%3Aarima-arimax-steel-price-index-
forecasting-using-arima-and-arimax-model-mfe2557&catid=129%3Astudent-
independent-study&Itemid=207&lang=th.

Taylor, James W. 2010. “Exponentially Weighted Methods for Forecasting Intraday Time Series with Multiple Seasonal Cycles.” International Journal of Forecasting 26 (4): 627–46. https://doi.org/10.1016/j.ijforecast.2010.02.009.

Tyree, Eric W, and J A Long. 1995. “Forecasting Currency Exchange Rates: Neural Networks and the Random Walk Model.” Proceedings of the Third International Conference on Artificial Intelligence Applications.

Wang, Guanghui. 2012. “Demand Forecasting of Supply Chain Based on Support Vector Regression Method.” Procedia Engineering 29: 280–84.

Zhang, G. Peter, and Min Qi. 2005. “Neural Network Forecasting for Seasonal and
Trend Time Series.” European Journal of Operational Research 160 (2): 501–14.
https://doi.org/10.1016/j.ejor.2003.08.037.

Appendix 1
Autoregressive Integrated Moving Average with Exogenous factors (ARIMAX)
Mathematical formulation: The ARIMAX model combines the ARIMA model with exogenous variables. It is composed of three parts: the autoregressive (AR) model, the moving-average (MA) model, and a linear model of the exogenous part (EX). The notation ARIMAX(p, q, d) refers to a model with p AR terms, q MA terms, and d EX terms. One mathematical formulation of the ARIMAX model is given in equation (7), where Y_t is the value to predict at time period t (in our case the demand), ε_t is the error at time t, and X_t is the vector of exogenous factors at time t. The monomial on the left side of the equal sign represents the AR model, the first monomial after the equal sign represents the MA model, and the second monomial after the equal sign represents the EX model. The parameters of these models are respectively {φ_1, φ_2, …, φ_p}, {θ_1, θ_2, …, θ_q}, and {η_1, η_2, …, η_d}, and L is the lag operator.
φ(L) Y_t = θ(L) ε_t + η(L) X_t        (7)

with:

φ(L) = 1 − ∑_{i=1}^{p} φ_i L^i
θ(L) = 1 + ∑_{i=1}^{q} θ_i L^i
η(L) = ∑_{i=1}^{d} η_i L^i

Applications: For trend and causal models, ARIMA can be hybridized with other techniques. For example, Supattana (2014) used an ARIMAX model to forecast the monthly Steel Price Index from around 58 months of historical data (2009-2014), with crude oil price and iron ore price as exogenous factors; the experiments showed that ARIMAX achieves better Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) scores. Likewise, Cools, Moons, and Wets (2009) included the trend and seasonality of the dataset in their ARIMAX models to capture their possible effect on daily traffic counts.
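To make equation (7) concrete, the following sketch (our illustration, not code from the study) computes the expected one-step ARIMAX forecast after moving the AR terms to the right-hand side and setting the current error ε_t to its expectation of zero; the coefficient values are arbitrary.

```python
def arimax_one_step(y_hist, eps_hist, x_hist, phi, theta, eta):
    """Expected Y_t from equation (7), rewritten as
    Y_t = sum_i phi_i*Y_{t-i} + sum_i theta_i*eps_{t-i} + sum_i eta_i*X_{t-i}.
    The *_hist lists are ordered oldest to newest."""
    ar = sum(p * y for p, y in zip(phi, reversed(y_hist)))   # AR part
    ma = sum(t * e for t, e in zip(theta, reversed(eps_hist)))  # MA part
    ex = sum(n * x for n, x in zip(eta, reversed(x_hist)))   # EX part
    return ar + ma + ex

# p = q = d = 1: Y_t = 0.6*Y_{t-1} + 0.2*eps_{t-1} + 1.5*X_{t-1}
print(arimax_one_step([100.0], [2.0], [10.0], [0.6], [0.2], [1.5]))  # 75.4
```

In a real study one would estimate {φ_i}, {θ_i}, {η_i} from data (e.g. by maximum likelihood) rather than fix them by hand.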

Support Vector Regression (SVR)


Mathematical formulation: SVR uses the same principles as the SVM, with only a few minor differences. Mainly, in the case of regression, a margin of tolerance ε is set in approximation to the SVM. The main idea, for the linear case, is to find the hyperplane that minimizes the error (Saed 2018). Equation (8) summarizes the SVR model in the linear case: Y is the value to predict (in our case the demand), ε is the error, and X is the vector of factors. The part in parentheses, (wX + b), is the equation of the hyperplane to determine, where w is its normal vector and b its bias parameter. For non-linear cases, equation (8) is adapted through the use of kernel functions.

Y = (wX + b) + ε        (8)

Applications: As part of a benchmark, Cao et al. (2019) compared this model with an LSTM recurrent neural network to forecast future stock market prices. The empirical results showed that LSTM with adaptive noise gave better performance than SVR.
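As a small illustration of the tolerance-margin idea (the function names are ours, not from any cited library), the linear prediction of equation (8) and the ε-insensitive loss that SVR optimizes can be written as:

```python
def svr_predict(w, x, b):
    # Linear SVR prediction: the hyperplane value w.x + b from equation (8).
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    # Deviations inside the tolerance tube of width eps cost nothing;
    # only the part of the error beyond eps is penalized.
    return max(abs(y_true - y_pred) - eps, 0.0)

y_hat = svr_predict([2.0, -1.0], [3.0, 1.0], 0.5)   # 2*3 - 1*1 + 0.5 = 5.5
print(eps_insensitive_loss(5.6, y_hat, eps=0.2))    # 0.0: inside the tube
print(eps_insensitive_loss(6.0, y_hat, eps=0.2))    # deviation 0.5 minus eps, ~0.3
```

Training then amounts to choosing w and b that keep most points inside the tube while keeping ||w|| small; kernel functions replace the dot product for non-linear cases.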

Multiple Linear Regression (MLR)


Mathematical formulation: The general mathematical formulation of MLR is the linear equation shown in equation (9). In this equation, Y is the value to predict (in our case the demand), ε is the error, and X is the vector of factors. The model aims to find the parameters β_0 and β such that a likelihood function is maximized; in general, the target to minimize is the sum of the squares of the deviations.

Y = β_0 + βX + ε        (9)

Applications: Ramanathan (2012) implemented MLR to predict the trend of soft drink demand in a company case study in the UK, in order to forecast promotional sales more accurately.
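A minimal sketch of fitting equation (9) by least squares with NumPy; the synthetic coefficients (5, 2, −3) are our assumptions, chosen noise-free so the recovered parameters can be checked by eye:

```python
import numpy as np

# Synthetic example: demand generated exactly as Y = 5 + 2*X1 - 3*X2,
# so minimizing the sum of squared deviations should recover beta_0 = 5
# and beta = (2, -3).
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 2))
y = 5.0 + X @ np.array([2.0, -3.0])

# Stack a column of ones so beta_0 is estimated alongside beta.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(beta, 6))  # approximately [5, 2, -3]
```

With noisy data the same call returns the ordinary-least-squares estimate, which coincides with the maximum-likelihood estimate under Gaussian errors.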

Long Short-Term Memory (LSTM) Neural Network:

Mathematical formulation:

In order to overcome the vanishing or exploding gradient problem, which limits RNNs in general (Bengio, Simard, and Frasconi 1994; Kolen and Kremer 2001), LSTMs contain a memory cell (c in Figure 9), introduced at their creation by Hochreiter and Schmidhuber (1997) and then improved by Gers, Schmidhuber, and Cummins (2000) with an additional forget gate (f in Figure 9). Thus, LSTMs are able to learn both long-term and short-term time correlations. For a more exhaustive review of LSTM, the reader can consult the work of Lipton, Berkowitz, and Elkan (2015), who presented a detailed review of the overall structure of the LSTM as well as the latest developments.
Figure 1 illustrates how the LSTM cell can process data sequentially and keep
its hidden state through time. In this figure, the operations graph is detailed for the step
time t . Weights and biases are not shown. The idea is that each computational unit is
linked not only to a hidden state s but also to a state c of the cell that plays the role of
memory. The passage from c t −1 to c t is done by transfer with constant gain, equal to 1.
In this way the errors propagate to the previous steps without phenomenon of
disappearance of gradient. The cell state can be modified through a door which
authorizes or blocks the update (input gate, i t ). Similarly, a gate controls whether the
cell status is communicated at the output of the LSTM unit (output gate o t ). The most
widespread version of LSTM also uses a door allowing the reset of the cell state (forget
gate, f t ) as shown in Figure 9.

Figure 9. The structure of the LSTM block (Sagheer and Kotb 2019)

In Figure 9, X_t is the input at time t and generally represents the exogenous factors; the operator ⊕ symbolizes pointwise addition; the operator ⊗ symbolizes the Hadamard (term-by-term) product; and the σ and τ symbols represent the sigmoid function and the hyperbolic tangent function respectively, although other activation functions are possible. First, the forget gate decides which information must be left out of the cell state. Second, the input gate decides which information must be admitted to the LSTM cell state. Next, the cell state value is updated. Then, the output gate filters which information in the cell state should be produced as output. After that, the value of the hidden state is constructed.
Applications: Chen et al. (2015) implemented this method to predict the trend of the Chinese stock market; the accuracy rate increased from 14 percent to 27 percent compared with a random forecasting method. In the same field of application, Long et al. (2019) compared the performance of their proposal (a multi-filter neural network) with that of LSTM for predicting stock price movements. Simoncini et al. (2018) also used LSTM to classify vehicle types from the Global Positioning System (GPS) data of each vehicle.
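The gate sequence described above can be sketched as one forward step of a generic LSTM cell; the stacked weight layout and parameter names below are our assumptions for illustration, not the configuration used in the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    # W (4h x d), U (4h x h) and b (4h,) stack the forget, input, output
    # and candidate transformations, in that order.
    z = W @ x_t + U @ h_prev + b
    f, i, o, g = np.split(z, 4)
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)  # the three gates (sigma)
    g = np.tanh(g)                                # candidate cell update (tau)
    c_t = f * c_prev + i * g   # forget part of the old state, admit the new
    h_t = o * np.tanh(c_t)     # output gate filters the exposed hidden state
    return h_t, c_t

# Smoke test: with all-zero parameters each gate outputs 0.5 and the
# candidate is 0, so the cell state is simply halved.
d, h = 3, 2
h_t, c_t = lstm_step(np.ones(d), np.zeros(h), np.ones(h),
                     np.zeros((4 * h, d)), np.zeros((4 * h, h)), np.zeros(4 * h))
print(c_t)  # [0.5 0.5]
```

The constant-gain path from c_{t−1} to c_t is visible in the line computing c_t: the old state enters additively, scaled only by the forget gate.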

Appendix 2
Genetic Algorithm
Genetic algorithms (GAs) use the concept of natural selection and apply it to a population of potential solutions. The approach is based on the postulate that important processes exist within a group of organisms of the same species which give rise to genetic mixing. These processes occur during the reproductive phase, when the chromosomes of two organisms fuse to create a new, possibly better, one. GAs imitate these operations in order to gradually evolve the populations of solutions.
The main steps of GAs are: (1) Selection: to determine which individuals are more likely to obtain the best results, a selection is made. This process is analogous to natural selection: the best-adapted individuals win the competition for reproduction, while the least adapted die before reproducing. (2) Crossover (recombination): during this operation, two individuals exchange parts of their DNA to produce one or more new individuals. (3) Mutation: randomly, a gene can be substituted for another. As with crossover, a mutation rate is defined during population changes. Mutation is used to avoid premature convergence of the algorithm.

In general, we start with a base population, most often generated randomly. Each solution is assigned a score that corresponds to its adaptation to the problem. Then a selection is made within this population, and the algorithm iterates until a certain convergence is obtained or a stopping criterion is reached. To solve a problem, GAs use the ingredients above together with a representation of a solution. This representation is called the solution's coding, and it also has an impact on GA performance. The convergence of GAs is rarely proven in practice, but the crossover operator is very often what gives the genetic algorithm its richness compared with other methods.
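The selection, crossover, and mutation steps can be sketched in a minimal GA over bit strings (a generic illustration with arbitrary rates, not the hybrid tuning algorithm proposed in the paper):

```python
import random

def genetic_algorithm(fitness, length=10, pop_size=20, generations=60,
                      crossover_rate=0.9, mutation_rate=0.05, seed=1):
    """Minimal GA: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            # Selection: binary tournament, the fitter individual reproduces.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: exchange tails at a random cut point.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: occasionally flip a gene to avoid premature convergence.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = genetic_algorithm(sum)  # "one-max" toy problem: fitness counts the 1 bits
print(sum(best))
```

In the paper's setting, the coding of a solution would be a vector of LSTM hyperparameters and the fitness a forecast-error score, rather than a bit count.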

Scatter Search
Scatter Search derives from strategies for combining decision rules and constraints (Laguna and Martí 2006). For example, new rules are generated by weighted combinations of existing rules. The algorithm is flexible and can be applied to many problems, with varying degrees of sophistication. The main ingredients for implementing scatter search are generally:
1. A Diversification Generation Method to generate a random set of trial solutions.
2. An Improvement Method, applied to the trial solutions to create enhanced ones (neither the input nor the output solutions are required to be feasible).
3. A Reference Set Update Method to build and maintain a reference set consisting of
the “best” solutions found. Solutions gain membership to the reference set
according to their quality or their diversity.
4. A Subset Generation Method to operate on the reference set, to produce a subset of
its solutions as a basis for creating combined solutions.
5. A Solution Combination Method to transform a given subset of solutions produced
by the Subset Generation Method into one or more combined solutions.
The process (elements 2 to 5) is repeated until the reference set no longer changes; element 1, the Diversification Generation Method, is then used to diversify. The algorithm stops when a specified iteration limit or other stopping criterion is reached. The notion of “best” in step 3 is not limited to a measure given exclusively by the fitness function; in particular, a solution may be added to the reference set if it improves the diversity of the set.
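The five methods can be arranged into a toy skeleton for minimizing a one-dimensional function; every design choice below (midpoint combination, random-tweak improvement, quality-only reference update) is a simplifying assumption for illustration, not the variant used in the paper:

```python
import random

def scatter_search(f, lb, ub, ref_size=5, iters=30, seed=0):
    """Skeleton of the five scatter-search methods, minimizing f on [lb, ub]."""
    rng = random.Random(seed)

    def diversify(n):                     # 1. Diversification Generation
        return [rng.uniform(lb, ub) for _ in range(n)]

    def improve(x):                       # 2. Improvement (accept a better tweak)
        y = min(max(x + rng.uniform(-0.1, 0.1), lb), ub)
        return y if f(y) < f(x) else x

    def update_ref(cands):                # 3. Reference Set Update (by quality)
        return sorted(set(cands), key=f)[:ref_size]

    def subsets(ref):                     # 4. Subset Generation (all pairs)
        return [(a, b) for i, a in enumerate(ref) for b in ref[i + 1:]]

    def combine(a, b):                    # 5. Solution Combination (midpoint)
        return (a + b) / 2.0

    ref = update_ref([improve(x) for x in diversify(10 * ref_size)])
    for _ in range(iters):
        cands = ref + [improve(combine(a, b)) for a, b in subsets(ref)]
        new_ref = update_ref(cands)
        if new_ref == ref:                # reference set stopped changing:
            ref = update_ref(ref + diversify(ref_size))  # re-diversify
        else:
            ref = new_ref
    return ref[0]

best = scatter_search(lambda x: (x - 2.0) ** 2, -10.0, 10.0)
print(round(best, 2))  # close to 2.0, the minimizer of (x - 2)^2
```

A quality-plus-diversity reference update, as described in step 3, would keep some solutions for being far from the current members rather than for their fitness alone.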
