Analyzing Forecast Results
6/24/2014
This document describes how to review and investigate forecast results to improve accuracy.
Level of Review
To review forecast performance effectively, first establish the right level of aggregation. Traditionally, accuracy is
measured at the warehouse-product level because operational forecasts drive supply planning. However, the level should
reflect how business decisions such as inventory policy and enterprise sourcing are made. For example, if multiple warehouses are
available to supply the same outlet, then accuracy could be evaluated at a product-division level, where a division is a collection of
warehouses. If two products are differentiated only by palletization and can meet the same demand, then reviewing their combined
accuracy is appropriate. The review timeframe should be weekly buckets, because weekly volumes drive deployment and production
decisions.
Once the level of review is determined, the next step is to choose the metrics to use.
Accuracy Metrics
When working with large amounts of data, it is important to quickly pinpoint areas of improvement. The key metrics are:
1. Net Absolute Error – This gives the total error at the level of aggregation. When sorted descending, it indicates the
largest contributors to overall error.
In Table 1, the calculation of Net Absolute Error is shown for a situation where the level of review is the Prod x
Whse x week level. The error is calculated at the aggregate level from the absolute differences of the values. When
reviewing forecasts, include the Net Absolute Error to determine where the largest error is occurring.
2. Net Accuracy to Actual – This gives the accuracy at the level of aggregation using actual sales as the base. Even if the
net absolute error is high, it doesn't mean that the forecasts are inaccurate. For large-volume products, the net absolute
error by itself could lead one astray. To combat this, the Net Accuracy to Actual should be reviewed.
In the example shown in Table 1, the Net Accuracy to Actual would be:
(1 − |Error| / Net Sales) × 100 = (1 − 30/370) × 100 = 91.9%.
The combos which have high Net Absolute Error and low Net Accuracy to Actual are the ones to spend time to improve
forecasts.
3. Percentage of Forecast Error (Bias) – This gives the error percentage (plus or minus), which indicates whether the forecasts
are biased below or above the sales. This metric is important for understanding whether you are consistently over-forecasting
(positive bias) or under-forecasting (negative bias), which will drive the approach to forecast tuning. A bias higher than 20%
indicates an issue that should be investigated if the net accuracy and net absolute error are not acceptable.
4. Volumes – The system-generated volume and live forecast volume should both be evaluated to determine the effectiveness
of manual forecast edits. The actual sales volume should be included when reviewing metrics.
Figure 1 provides a screenshot of the attributes mentioned that are helpful for forecast analysis.
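As a rough sketch, the metrics above can be computed from paired weekly forecasts and actuals. The weekly numbers below are invented so that net sales equal 370 and net absolute error equals 30, matching the worked example:

```python
# Sketch of Net Absolute Error, Net Accuracy to Actual, and Bias for one combo.
# The weekly values are invented for illustration.
forecasts = [110, 95, 100, 95]   # weekly forecast, one Prod x Whse combo
actuals   = [100, 90, 95, 85]    # weekly actual sales

net_abs_error = sum(abs(f - a) for f, a in zip(forecasts, actuals))
net_sales = sum(actuals)
net_accuracy = (1 - net_abs_error / net_sales) * 100       # 91.9%
bias = (sum(forecasts) - net_sales) / net_sales * 100      # positive = over-forecasting
```

Sorting combos descending by `net_abs_error` and then inspecting those with low `net_accuracy` reproduces the prioritization described above.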
Alternative Metric
An additional metric to use is ABC-XYZ analysis. Refer to the ABC-XYZ Profiles document for more detailed information on ABC-XYZ
analysis. Briefly, ABC analysis is a sales-volume classification technique in which products (or another level of aggregation) are sorted
in descending order according to volume and labeled A, B, or C accordingly.
The A products are the most important to review for having satisfactory metrics.
XYZ analysis is a forecast-accuracy classification technique in which products (or another level of aggregation) are labeled X, Y, or Z
based on the accuracy of the forecast.
The combination of ABC and XYZ helps to determine which products need improvement in forecasts.
A–X: Use Generated Number & identify outliers/exclusions
A–Y: Use Generated Number & monitor seasonality
A–Z: Market Intelligence to keep high accuracy
B–X: Use Generated Number & market intelligence & ensure promotions are captured
B–Y: Market intelligence & ensure in-and-out products are identified
B–Z: Use Generated Number & market intelligence
C–X: Use Generated Number & market intelligence & ensure promotions are captured
C–Y: Firm orders & Plan
C–Z: Firm orders & Plan
Table 2. ABC-XYZ Analysis Matrix
The ABC-XYZ analysis helps to group combos in a general way and provides rules of thumb for editing forecasts. However,
forecast analysis should still be performed.
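For illustration, a minimal ABC/XYZ classifier might look like the following. The 80%/95% cumulative-volume cut-offs and the 90%/75% accuracy cut-offs are invented for the example, not product defaults; the actual class boundaries should come from the ABC-XYZ Profiles document:

```python
# Hypothetical ABC/XYZ classification sketch; all cut-off values are assumptions.
def abc_classes(volumes, a_cut=0.80, b_cut=0.95):
    """Label each key A/B/C by its position in cumulative descending volume."""
    order = sorted(volumes, key=volumes.get, reverse=True)
    total = float(sum(volumes.values()))
    labels, running = {}, 0.0
    for key in order:
        share_before = running / total   # cumulative share before this item
        running += volumes[key]
        if share_before < a_cut:
            labels[key] = "A"
        elif share_before < b_cut:
            labels[key] = "B"
        else:
            labels[key] = "C"
    return labels

def xyz_class(net_accuracy, x_cut=90.0, y_cut=75.0):
    """Label a combo X/Y/Z by its Net Accuracy to Actual percentage."""
    if net_accuracy >= x_cut:
        return "X"
    if net_accuracy >= y_cut:
        return "Y"
    return "Z"
```

Combining the two labels (e.g. an A product with a Z accuracy class) points directly at the cells of Table 2.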
Analysis
Now that you have identified combos to investigate, the following sections describe symptoms along with approaches.
Over Forecasting
1. Review Watch Class – Go to Model-Level > Edit Model-Level Forecasts by Combo. Filter going into the editor for the
combos whose forecasts you would like to investigate.
a. Is the Watch Class Unbalanced? If it is and the current sales are much lower than earlier history, then update the
Model From Week to a week closer to the current week that is more representative of current sales patterns
while still retaining as many history points as possible.
b. Is the Watch Class New? If it is and you are over forecasting, then this is most likely due to a campaign you
created or to manual forecast edits. It represents an execution problem. Either the campaign should be retired
and re-added for the appropriate timing (Detail-Level > Edit Campaigns) or the system generated number should
be used until the product has started selling.
c. Is the Watch Class Sporadic? If it is and there doesn’t seem to be a definable pattern (e.g. end-of-period spikes)
then consider using a selection of median models for these combos. If it has a definable pattern, then consider
using a simpler hierarchy that contains the model that captures the pattern (e.g. LR on Week-of-Period for end-of-
period spikes).
i. Create a new hierarchy for these combos with different median models. For more information on
managing café hierarchies, refer to the Prevail8 Café document, or to the Simulations document for
assigning combos to different hierarchies.
2. Review Causal Data – Go to Model-Level > Edit Model-Level Forecasts by Summary Vista. Configure the vista at the level
for which the accuracy is generated (e.g. Whse-Prod).
a. Ensure correct timing of exceptions – If a history point was exceptionalized but the volume is lower
than the expected volume for the exception then this could represent an exception that didn’t occur but was
forecasted. If you perform the rollover routine weekly (Synchronize Data Across Time Boundary) then this
routine moves over future exceptions to historical exceptions to avoid double entry. This also aids in identifying
where an exception had been forecasted incorrectly. The historical exception should be corrected, and future
exceptions should be reviewed to ensure they will occur on the appropriate week.
©Areté Inc. Analyzing Forecast Results 3
b. Ensure correct percentage of exceptions – Review historical exception percentages versus forecasted exception
percentages. If there is a significant difference, this could lead to higher forecasts than expected. The
exception percentage represents the number of outlets participating in an exception (or the effect of the
exception).
exception). So if there was a lift of 100 cases for a percentage of 5% on a historical exception, and there is a
percentage of 80% in the future for that same exception then the lift could possibly be 1600. Most likely, the lower
percentage is incorrect and should be adjusted to a more reasonable value.
c. Review price data – If any linear regression pricing models are being used then it is important to have correct
future pricing information if pricing is correlated with sales. If the forecasts look higher than expected, review the
price to make sure the forecasted price is not significantly lower than the historical price.
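The exception-percentage arithmetic from step 2b can be sketched in a couple of lines, assuming the lift scales proportionally with the participation percentage:

```python
# Assumes lift scales proportionally with the exception participation percentage,
# as in the 100-case / 5% example above.
hist_lift, hist_pct = 100, 5      # 100-case lift observed at 5% participation
future_pct = 80                   # forecasted participation for the same exception
projected_lift = hist_lift / hist_pct * future_pct   # cases per point, scaled up
```

A jump from a 100-case lift to a 1600-case lift is usually a sign that the lower historical percentage is wrong, not that demand will actually increase sixteenfold.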
3. Review Café models – Go to Model-Level > Edit Model-Level Forecasts by Combo. Filter going into the editor for the
combos whose forecasts you would like to investigate. Click on Gen Hier to display the Generated Model Hierarchy grid.
This will provide insight into how the forecasts were calculated.
a. If the top estimate (based on the selected forecast record) is a reasonable number based on the history, then the
problem could be with seasonality. Overlay the seasonality curve onto the graph by clicking Overlays on the graph and
choosing Seasonality as an overlay to confirm. If the seasonality percentage is high (the actual number can be seen
in the Model Forecast/History grids), then the seasonality should be reviewed. Evaluate whether the combo matches the
group's seasonality pattern and whether the seasonality settings provide enough smoothing. For more information
on seasonality, refer to the Seasonality document. Briefly, smoothing can be increased by decreasing the neighborhood High
Factor, increasing the Radius, increasing the Compress %, or decreasing the Alpha value.
b. If the estimate appears reasonable for weeks with sales, but there are several weeks with no sales (or negative
sales), then this could be an issue with how missing weeks and negative sales are treated. At the root level, setting
the HistMissingTreatment as Zero rather than Exclude should bring down the forecasts for these situations. Note
that for products that are sporadic you normally should set this to Exclude to use Croston’s models.
c. If there are cases in the history where there are high sales followed by negative sales then change the root setting
for BackApplyNegatives from No to Yes.
d. If the estimate doesn’t appear reasonable then the models contributing to the estimate should be investigated.
Determine which model is causing a problem.
i. A Linear Regression on Price which is forecasting high may indicate the future price is underestimated and
should be updated.
ii. A Linear Regression on WMA by Exception may indicate a discrepancy between the percentages of an
exception in the past versus the future. Under File > Application Options on the Forecasting tab, there is
a section on setting the Baseline Week Threshold. This can be raised so that small percentages are not
considered an exception. It is recommended that this be at least 10%. If this is set as expected, then the
differences in the percentage in the past and future should be compared and corrected to be more
comparable. The Baseline Week Radius should typically be set to 1 to treat weeks before and after
exceptions as a non-baseline week.
iii. A Linear Regression on a Free Regressor (customized variable) may indicate incorrect input data that
should be updated.
4. Review Seasonality – Go to Set-up > Set-Up Seasonality > Edit Seasonality Groups. In this editor, make sure the combos in
question have a seasonality that is smooth, such that there are not large swings above or below 100%. A seasonality percentage
above 135% or below 65% may need compression. Review the Seasonality document for more information.
Under Forecasting
1. Review Watch Class – Go to Model-Level > Edit Model-Level Forecasts by Combo. Filter going into the editor for the
combos whose forecasts you would like to investigate.
a. Is the Watch Class Unbalanced? If it is and the current sales are much higher than earlier history, then update the
Model From Week to a week closer to the current week that is more representative of current sales patterns
while still retaining as many history points as possible.
2. Review Causal Data – Go to Model-Level > Edit Model-Level Forecasts by Summary Vista. Configure the vista at the level
for which the accuracy is generated.
a. Avoid over exceptionalizing. If there are too many historical points with exceptions then the effect of modeling
on exceptions will be diminished. It can also create issues with determining which weeks can be counted as a
baseline week.
b. Exclude any unusually low points that are due to stock-outs or cannot be explained. This ensures only
relevant history points are considered for modeling. To exclude points, put an X in the Exclude List for the week in
question.
c. Review price data – If any linear regression pricing models are being used then it is important to have correct
pricing information if pricing is correlated with sales. If the forecasts look lower than expected, review the price
to make sure the forecasted price is not significantly higher than the historical price.
3. Review Café models – Go to Model-Level > Edit Model-Level Forecasts by Combo. Filter going into the editor for the
combos whose forecasts you would like to investigate. Refer to step 3 in the Over Forecasting section for more
information on reviewing the generated hierarchy.
a. If there are any Linear Regression models on WMAs that are under-forecasting, then this could be due to the
MultipleofMostEver. This setting can be increased up to 1.7. However, avoid setting this higher because it could
then lead to over-forecasting.
b. If there are any models contributing to the hierarchy with negative or zero estimates, then this can be corrected
by setting the EstNegativeTreatment to Exclude and setting the EstZeroTreatment to Exclude.
c. If Linear Regressions are not passing, then the TTestConfidence and FTestConfidence can be relaxed to a lower
level such as 0.95 or 0.90. Also, the MinimumRequiredPoints could be reduced slightly; reducing it below 5
is not recommended.
d. Review which models are contributing to the forecast. If there are median models for a stable product, then the
issue could be where the median model sits in the hierarchy. Consider adding WMA models under the same grouping as the
median so that the system can compare their standard deviations of errors against each other.
4. Review Seasonality – Refer to step 4 of the Over Forecasting section for the data that should be evaluated. If there are low
points in history bringing down the seasonality, consider increasing the Neighborhood Low Factor in the application options.
Combo Status
1. If the total forecasts are significantly larger than total sales, then this could indicate a combo status problem. Ensure there
are no active combos that should be inactivated.
2. If the total sales are significantly larger than the total forecasts, then this could also indicate a combo status problem.
Ensure there are no inactive combos that should be activated.
Simulations / Outliers
To have the system automatically flag outliers, go to Model-Level > Generate Auto Exceptions and Exclusions.
1. Auto-Exceptionalize based on volume Outliers. This will flag historical sales that should be exceptionalized based
on the specified standard deviations away from the mean sales or percentage of the median sales. The Auto Exc
column can be seen in the Summary Vista, Dynamic Vista, and Forecasts by Combo editors. To configure the rules
for flagging, go to Set-Up > Set-Up Forecasting Entities > Edit Exceptions. Those exceptions marked as an A in
the Excclass are Auto-Exceptions. The Autosds represents the standard deviations away from the mean to be
flagged as that type of exception. In the example in Figure 5, the AUTOLOW, AUTOHI and AUTOHI+ are used based
on -1.50, 1.25 and 2.00 standard deviations, respectively. The Autosmedpct values for these exceptions are 10%,
150%, and 200% of the median, respectively.
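A minimal sketch of the standard-deviation flagging rule, using the thresholds from the example. This illustrates the idea only; the product's generation routine also supports the Autosmedpct percentage-of-median rule, which is omitted here:

```python
# Sketch of volume-outlier flagging by z-score against the mean.
# Thresholds mirror the AUTOLOW / AUTOHI / AUTOHI+ example (-1.50, 1.25, 2.00).
from statistics import mean, stdev

def auto_exceptions(sales, low_sd=-1.50, hi_sd=1.25, hiplus_sd=2.00):
    m, s = mean(sales), stdev(sales)
    flags = []
    for v in sales:
        z = (v - m) / s                  # standard deviations from the mean
        if z >= hiplus_sd:
            flags.append("AUTOHI+")
        elif z >= hi_sd:
            flags.append("AUTOHI")
        elif z <= low_sd:
            flags.append("AUTOLOW")
        else:
            flags.append("")             # baseline week, no auto-exception
    return flags
```

A single promotional spike in otherwise flat history would be flagged AUTOHI+ while the surrounding weeks stay unflagged.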
2. Find & Exclude Price Outliers. This is used to flag net/retail price regression outliers and exclude them. Cook’s
distance is a statistical technique used to determine outliers and/or high leverage data points. It is an evaluation
procedure that determines the effect of removing data points. Data points with high cook’s distance factors may
reduce the accuracy of a regression model. The following equation is used:
D_i = Σ_j ( Ŷ_j − Ŷ_j(i) )² / ( p · MSE )
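The equation above can be illustrated by refitting the regression with each point left out in turn. This is a generic sketch of Cook's distance for a simple one-variable regression with invented data, not the product's implementation:

```python
# Generic Cook's distance sketch for a price-vs-sales simple linear regression.
def fit(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b                # intercept a, slope b

def cooks_distances(xs, ys):
    n, p = len(xs), 2                    # p = number of model parameters
    a, b = fit(xs, ys)
    yhat = [a + b * x for x in xs]       # fitted values from the full model
    mse = sum((y - f) ** 2 for y, f in zip(ys, yhat)) / (n - p)
    ds = []
    for i in range(n):
        ai, bi = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        yhat_i = [ai + bi * x for x in xs]   # refit with point i removed
        ds.append(sum((f - fi) ** 2 for f, fi in zip(yhat, yhat_i)) / (p * mse))
    return ds
```

A point whose removal moves the fitted line a lot gets a large distance and is the candidate for exclusion.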
3. Exclude Short Weeks. This will automatically exclude short weeks from contributing to models. Short weeks could
contribute to unusual forecasts and should be excluded.
4. Locate Price Shifts. This will flag weeks in which there appears to be a change in the net/retail price. When it is
a business decision to reduce or increase the price on a go-forward basis, the sales may not be affected, but
the correlation between net/retail price and sales will weaken. This option will indicate to the user
where this situation is occurring. The Outlier List can be reviewed in the Summary Vista, Outline Vista and
Forecasts by Combo editors. Flagged weeks will have either an N for net price or an R for retail price in the Outlier
List.
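A simple way to sketch the idea of locating a persistent price shift. The 1% tolerance and three-week persistence window here are invented parameters, not product settings:

```python
# Sketch: flag the first week where the price moves away from the prior level
# and stays at the new level for `persist` consecutive weeks.
def locate_price_shift(prices, tol=0.01, persist=3):
    for i in range(1, len(prices) - persist + 1):
        prev = prices[i - 1]
        window = prices[i:i + persist]
        if all(abs(p - prev) / prev > tol for p in window):
            return i                     # index of the first week at the new price
    return None                          # no persistent shift found
```

Everything before the returned index would then be candidates for the "Exclude Points Prior to Most Recent Price Shift" option described next.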
5. Exclude Points Prior to Most Recent Price Shift. This will exclude pricing prior to the located shift point. This will
place an N for net price or R for retail price in the Exclude List.
6. Omit Previously Excluded Points from Calculations. This will omit all points that have been excluded.
7. Cook's Distance Factor – This factor, along with a Cook's distance threshold, is used to determine if a price point is
an outlier.
Preserve Options
a. Preserve All Price Exclusions. This will only add new exclusions and will not affect previously generated
or edited exclusions.
b. Preserve All Edited Price Exclusions. This will preserve any vista or direct edits for price exclusions.
This is the recommended setting to use.
c. Preserve Direct Edited Price Exclusions. This will only preserve price exclusions made from direct edits.
d. Overwrite All Price Exclusions. This will overwrite all exclusions during generation.
Simulations
1. Create multiple hierarchies to test against. For more information on managing café hierarchies, refer to Prevail Café
document. For more information on generating simulations, refer to Simulations document. Change the default hierarchy to
the one being tested before each simulation. This can be done by selecting the Root of the desired hierarchy and clicking
Make Default. By creating multiple hierarchies rather than modifying one, you can easily compare them against each other.
2. Run simulations against each hierarchy (changing the default hierarchy for each simulation) and store each simulation in
one of the 5 available simulations. Go to Investigate > Simulate Model-Level Forecasts to perform a simulation.
3. Compare the simulations against each other under Investigate > Meta-Query Model-Level Simulations, using the same accuracy
metrics used previously, with the simulations replacing the forecasts.