4.1.1 Input Modeling

The document discusses input modeling for simulation. It covers 4 main steps: 1) collecting raw input data from the real system, 2) identifying a probability distribution to represent the input process based on histograms of the data, 3) estimating parameters for the chosen distribution, and 4) evaluating the distribution fit through goodness-of-fit tests. Examples are provided on using histograms to identify distributions for vehicle arrival times and component lifetimes from sample data.

Uploaded by

Ansh Ganatra

Input modeling

Contents
• Data Collection
• Identifying the Distribution with Data
• Parameter Estimation
• Goodness-of-Fit Tests
• Fitting a Nonstationary Poisson Process
• Selecting Input Models without Data
• Multivariate and Time-Series Input Data
Purpose & Overview
• Input models provide the driving force for a simulation model.
• The quality of the output is no better than the quality of inputs.
• In this chapter, we will discuss the 4 steps of input model development:
(1) Collect data from the real system
(2) Identify a probability distribution to represent the input process
(3) Choose parameters for the distribution
(4) Evaluate the chosen distribution and parameters for goodness of fit
Data Collection
Data Collection
• One of the biggest tasks in solving a real problem
• GIGO: Garbage-In-Garbage-Out

[Figure: flow diagram linking raw data, input data, the system/simulation, output, and performance]

• Even when the model structure is valid, simulation results can be misleading if the input data are
• inaccurately collected
• inappropriately analyzed
• not representative of the environment
Data Collection
• Suggestions that may enhance and facilitate data
collection:
• Plan ahead: begin with a practice or pre-observation session,
watch for unusual circumstances
• Analyze the data as it is being collected: check
adequacy
• Combine homogeneous data sets: successive time periods,
during the same time period on successive days
• Be aware of data censoring: the quantity is not observed in
its entirety, danger of leaving out long process times
• Check for relationship between variables (scatter
diagram)
• Check for autocorrelation
• Collect input data, not performance data
Identifying the Distribution
Histograms
Histograms
• A frequency distribution or histogram is useful in
determining the shape of a distribution

• The number of class intervals depends on:

• The number of observations
• The dispersion of the data
• Suggested number of intervals: the square root of the
sample size

• For continuous data:

• Corresponds to the probability density function (pdf) of a theoretical
distribution
• For discrete data:
• Corresponds to the probability mass function (pmf)

• If few data points are available

• combine adjacent cells to eliminate the ragged appearance of the
histogram
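The square-root rule above can be sketched in a few lines of Python. This is only an illustration: the function name `histogram` and the exponentially distributed sample are assumptions made for the example and do not come from the slides.

```python
import math
import random


def histogram(data, k=None):
    """Return (edges, counts) for k equal-width class intervals.

    If k is not given, use the suggested rule: k = sqrt(sample size).
    """
    n = len(data)
    if k is None:
        k = max(1, round(math.sqrt(n)))
    lo, hi = min(data), max(data)
    width = (hi - lo) / k or 1.0  # guard against a zero-width range
    counts = [0] * k
    for x in data:
        # The maximum value is placed in the last interval.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts


# Hypothetical sample: 100 exponentially distributed lifetimes (mean 10).
rng = random.Random(0)
sample = [rng.expovariate(1 / 10) for _ in range(100)]
edges, counts = histogram(sample)
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:7.2f} to {right:7.2f}: {'*' * c}")
```

With 100 observations the rule gives 10 intervals; passing a different `k` shows how the apparent shape changes with the interval size.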
Histograms
• Same data with different interval sizes
[Figure: two histograms of the same data drawn with different interval sizes]
Histograms: Example
• Vehicle Arrival Example: The number of vehicles arriving at an intersection between 7 am and 7:05 am was monitored for 100 random workdays.
• There are ample data, so the histogram may have a cell for each possible value in the data range

Arrivals per Period | Frequency
0 | 12
1 | 10
2 | 19
3 | 17
4 | 10
5 | 8
6 | 7
7 | 5
8 | 5
9 | 3
10 | 3
11 | 1

[Figure: histogram of frequency versus number of arrivals per period]
Histograms: Example
• Life tests were performed on electronic components at 1.5
times the nominal voltage, and their lifetime was
recorded

Component Life | Frequency
0 ≤ x < 3 | 23
3 ≤ x < 6 | 10
6 ≤ x < 9 | 5
9 ≤ x < 12 | 1
12 ≤ x < 15 | 1
…
42 ≤ x < 45 | 1
…
144 ≤ x < 147 | 1
Histograms: Example
• Sample size 10000
• Histograms with different numbers of bins
[Figure: six histograms of the same sample drawn with 5, 10, 20, 50, 100, and 200 bins; horizontal axis from −6 to 6, vertical axis: Frequency]
Histograms: Example
Stanford University Mobile Activity Traces (SUMATRA)
• Target community: cellular
network research community
• Traces contain mobility as well
as connection information
• Available traces
• SULAWESI (S.U. Local Area Wireless
Environment Signaling Information)
• BALI (Bay Area Location
Information)
• BALI Characteristics
• San Francisco Bay Area
• Trace length: 24 hours
• Number of cells: 90
• Persons per cell: 1100
• Total persons: 99,000
• Active persons: 66,550
• Move events: 243,951
• Call events: 1,570,807
• Question: How to transform the BALI information so that it is usable with a network simulator, e.g., ns-2?
• Node number as well as connection number is too high for ns-2
Histograms: Example
• Analysis of the BALI Trace
• Goal: Reduce the amount of data by identifying user groups
• User group
• Between 2 local minima
• Communication characteristic is kept in the group
• A user represents a group
• Groups with different mobility characteristics
• Intra- and inter-group communication
• Interesting characteristic
• Number of people with an odd number of movements is negligible!
[Figure: 3-D histogram of people by number of calls and number of movements]
[Figure: histogram of number of people versus number of movements]
Identifying the Distribution
Scatter diagrams
Scatter Diagrams
• A scatter diagram is a quality tool that can show the
relationship between paired data
• Random Variable X = Data 1
• Random Variable Y = Data 2
• Draw random variable X on the x-axis and Y on the y-axis
[Figure: three scatter plots showing strong correlation, moderate correlation, and no correlation]
Scatter Diagrams
• Linear relationship
• Correlation: Measures how well data line up
• Slope: Measures the steepness of the data
• Direction
• Y intercept

[Figure: two scatter plots, one showing positive correlation and one showing negative correlation]
Identifying the Distribution
Selecting the Family of Distributions
Selecting the Family of Distributions
• A family of distributions is selected based on:
• The context of the input variable
• Shape of the histogram
• Frequently encountered distributions:
• Easier to analyze: Exponential, Normal, and Poisson

• Difficult to analyze: Beta, Gamma, and Weibull

Selecting the Family of Distributions
• Use the physical basis of the distribution as a guide, e.g.:
• Binomial: Number of successes in n trials
• Negative binomial and geometric: Number of trials to achieve
k successes
• Poisson: Number of independent events that occur in a fixed
amount of time or space
• Normal: Distribution of a process that is the sum of a number of
component processes
• Lognormal: Distribution of a process that is the product of a
number of component processes
• Exponential: Time between independent events, or a process
time that is memoryless
• Weibull: Time to failure for components
• Discrete or continuous uniform: Models complete
uncertainty
• Triangular: A process for which only the minimum, most likely,
and maximum values are known
• Empirical: Resamples from the actual data collected
Selecting the Family of Distributions
• Remember the physical characteristics of the process
• Is the process naturally discrete or continuous valued?

• Is it bounded?

• Value range?
• Only positive values
• Only negative values
• Interval of [-a, b]

• No “true” distribution for any stochastic input process

• Goal: obtain a good approximation

Identifying the Distribution
Quantile-Quantile Plots
Quantile-Quantile Plots

• Q-Q plot is a useful tool for evaluating distribution fit

• If X is a random variable with CDF F, then the q-quantile of X is the value $\gamma$ such that

$$F(\gamma) = P(X \le \gamma) = q, \quad \text{for } 0 < q < 1$$

• When F has an inverse, $\gamma = F^{-1}(q)$

• Let {xi, i = 1,2, …, n} be a sample of data from X
and {yj, j = 1,2, …, n} be this sample in ascending order:

$$y_j \text{ is approximately } F^{-1}\left(\frac{j - 0.5}{n}\right)$$

• where j is the ranking or order number
Quantile-Quantile Plots

• The plot of $y_j$ versus $F^{-1}((j - 0.5)/n)$ is

• Approximately a straight line if F is a member of an appropriate family of distributions
• The line has slope 1 if F is a member of an appropriate family of distributions with appropriate parameter values

[Figure: Q-Q plot of $y_j$ against $F^{-1}((j - 0.5)/n)$]
Quantile-Quantile Plots: Example

• Example: Door installation times of a robot follow a normal distribution.
• The observations are ordered from the smallest to the largest
• $y_j$ are plotted versus $F^{-1}((j - 0.5)/n)$, where F has a normal distribution with the sample mean (99.99 sec) and sample variance ($0.2832^2$ sec$^2$)

j | Value
1 | 99.55
2 | 99.56
3 | 99.62
4 | 99.65
5 | 99.79
6 | 99.98
7 | 100.02
8 | 100.06
9 | 100.17
10 | 100.23
11 | 100.26
12 | 100.27
13 | 100.33
14 | 100.41
15 | 100.47
Quantile-Quantile Plots: Example
• Example (continued): Check whether the door installation times follow
a normal distribution.
[Figure: Q-Q plot of the door installation times; the points lie close to a straight line, supporting the hypothesis of a normal distribution]
[Figure: histogram of the data with the superimposed density function of the normal distribution]
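The Q-Q construction for this example can be sketched as follows. The sketch uses only the 15 values listed in the table and fits the normal distribution to those values, so its mean and standard deviation are computed here rather than taken from the slide (which quotes 99.99 sec and 0.2832 sec). The correlation at the end is an added rough indicator of linearity, not part of the original procedure.

```python
from statistics import NormalDist, mean, stdev

# Door installation times (seconds) as listed in the example table.
times = [99.55, 99.56, 99.62, 99.65, 99.79, 99.98, 100.02, 100.06,
         100.17, 100.23, 100.26, 100.27, 100.33, 100.41, 100.47]

y = sorted(times)                      # y_j: ordered observations
n = len(y)
fitted = NormalDist(mean(y), stdev(y))  # assumed fit from these 15 values

# Theoretical quantiles F^{-1}((j - 0.5) / n) for j = 1, ..., n.
theoretical = [fitted.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]


def correlation(a, b):
    """Pearson correlation, used here only as a rough linearity check."""
    ma, mb = mean(a), mean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


for yj, q in zip(y, theoretical):
    print(f"{yj:8.2f}  {q:8.3f}")
print("linearity (correlation):", round(correlation(y, theoretical), 3))
```

Plotting `y` against `theoretical` with any plotting tool gives the Q-Q plot; points near a line of slope 1 support the fitted normal model.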
Quantile-Quantile Plots
• Consider the following while evaluating the linearity of a
Q-Q plot:
• The observed values never fall exactly on a straight line
• The ordered values are ranked and hence not independent, so the
points are unlikely to be scattered randomly about the line
• The variance of the extremes is higher than that of the middle. Linearity
of the points in the middle of the plot is more important.
Quantile-Quantile Plots

• Q-Q plot can also be used to check homogeneity

• It can be used to check whether a single distribution can represent
two sample sets
• Given two random variables
• X and x1, x2, …, xn
• Z and z1, z2, …, zn
• Plotting the ordered values of X and Z against each other reveals
approximately a straight line if X and Z are well represented by the
same distribution
Parameter Estimation
Parameter Estimation
• When raw data are unavailable (data are grouped into class intervals), the approximate sample mean and variance are:

$$\bar{X} = \frac{\sum_{j=1}^{c} f_j m_j}{n}, \qquad S^2 = \frac{\sum_{j=1}^{c} f_j m_j^2 - n\bar{X}^2}{n - 1}$$

• fj is the observed frequency in the j-th class interval
• mj is the midpoint of the j-th interval
• c is the number of class intervals

• A parameter is an unknown constant, but an estimator is a statistic.
Parameter Estimation: Example
• Vehicle Arrival Example (continued): The table in the histogram example on Slide 10 can be analyzed to obtain:

$$n = 100,\ f_1 = 12,\ X_1 = 0,\ f_2 = 10,\ X_2 = 1, \ldots, \quad \sum_{j=1}^{k} f_j X_j = 364, \quad \sum_{j=1}^{k} f_j X_j^2 = 2080$$

• The sample mean and variance are

$$\bar{X} = \frac{364}{100} = 3.64$$

$$S^2 = \frac{2080 - 100 \cdot (3.64)^2}{99} = 7.63$$

[Figure: histogram of frequency versus number of arrivals per period]

• The histogram suggests X to have a Poisson distribution
• However, note that sample mean is not equal to sample variance.
• Theoretically: Poisson with parameter $\alpha$: $\mu = \sigma^2 = \alpha$
• Reason: each estimator is a random variable, it is not perfect.
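The arithmetic of this example can be reproduced with a short Python sketch. The frequencies are those of the vehicle-arrival table; the variable names are chosen for illustration only.

```python
# Frequencies f_j for X_j = 0, 1, ..., 11 arrivals per period.
frequencies = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]

n = sum(frequencies)                                        # 100
sum_fx = sum(f * x for x, f in enumerate(frequencies))      # sum of f_j * X_j
sum_fx2 = sum(f * x * x for x, f in enumerate(frequencies))  # sum of f_j * X_j^2

sample_mean = sum_fx / n
sample_variance = (sum_fx2 - n * sample_mean ** 2) / (n - 1)

print(n, sum_fx, sum_fx2)
print(round(sample_mean, 2), round(sample_variance, 2))
```

The sums come out as 364 and 2080, so the sketch agrees with the values on the slide (mean 3.64, variance 7.63).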
Parameter Estimation
• Maximum-Likelihood Estimators (MLE)
• Discrete distribution with one parameter θ and probability mass function $p_\theta(x)$
• Given iid sample X1, X2, …, Xn
• Likelihood function L(θ) is defined as

L(θ) = pθ(X1) pθ(X2) … pθ(Xn)

• MLE of the unknown θ is θ’, the value of θ that maximizes L(θ), i.e.,
L(θ’) ≥ L(θ) for all values of θ
Parameter Estimation
• Maximum-Likelihood Estimators (MLE)
• Suggested estimators for distributions often used in
simulation

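Distribution | Parameter | Estimator
Poisson | $\alpha$ | $\hat{\alpha} = \bar{X}$
Exponential | $\lambda$ | $\hat{\lambda} = 1/\bar{X}$
Gamma | $\beta, \theta$ | $\hat{\theta} = 1/\bar{X}$
Normal | $\mu, \sigma^2$ | $\hat{\mu} = \bar{X},\ \hat{\sigma}^2 = S^2$
Lognormal | $\mu, \sigma^2$ | $\hat{\mu} = \bar{X},\ \hat{\sigma}^2 = S^2$ (after taking ln of data)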
Parameter Estimation
• Maximum-likelihood example: exponential distribution
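The slide body for this example did not survive, so the following is a standard derivation offered as a sketch, not a reconstruction of the original slide. It assumes the density $f_\lambda(x) = \lambda e^{-\lambda x}$ and uses the density in place of the probability mass function in the likelihood.

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda X_i}
            = \lambda^{n} \exp\Bigl(-\lambda \sum_{i=1}^{n} X_i\Bigr) \\
\ln L(\lambda) &= n \ln \lambda - \lambda \sum_{i=1}^{n} X_i \\
\frac{d}{d\lambda} \ln L(\lambda) &= \frac{n}{\lambda} - \sum_{i=1}^{n} X_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}}
\end{align*}
```

The second derivative, $-n/\lambda^2$, is negative, so the stationary point is a maximum; the result matches the exponential row of the estimator table.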
Goodness-of-Fit Tests
Goodness-of-Fit Tests
• Conduct hypothesis testing on input data distribution
using
• Kolmogorov-Smirnov test
• Chi-square test

• No single correct distribution exists in a real application
• If very little data are available, a test is unlikely to reject any
candidate distribution
• If a lot of data are available, a test is likely to reject all candidate
distributions
Goodness-of-Fit Tests
• Be aware of mistakes in decision making
• Type I Error: α
• Error of first kind, false positive
• Reject H0 when it is true
• Type II Error: β
• Error of second kind, false negative
• Retain H0 when it is not true

Statistical decision versus state of the null hypothesis:

Decision | H0 True | H0 False
Accept H0 | Correct | Type II Error: incorrectly accept H0 (false negative)
Reject H0 | Type I Error: incorrectly reject H0 (false positive) | Correct
Chi-Square Test
• Intuition: comparing the histogram of the data to the shape of
the candidate density or mass function
• Valid for large sample sizes when parameters are estimated by
maximum-likelihood
• Arrange the n observations into a set of k class intervals
• The test statistic is:

$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

• $O_i$ is the observed frequency in the i-th class
• $E_i = n \times p_i$ is the expected frequency, where $p_i$ is the theoretical probability of the i-th interval. Suggested minimum: $E_i = 5$

• $\chi_0^2$ approximately follows the Chi-square distribution with k-s-1 degrees of freedom
• s = number of parameters of the hypothesized distribution
estimated by the sample statistics.
Chi-Square Test
• The hypothesis of a Chi-square test is
• H0: The random variable, X, conforms to the distributional
assumption with the parameter(s) given by the estimate(s).
• H1: The random variable X does not conform.

Test result:

$$\begin{cases} \chi_0^2 \le \chi^2_{\alpha,\,k-s-1} & \text{Accept } H_0 \\ \chi_0^2 > \chi^2_{\alpha,\,k-s-1} & \text{Reject } H_0 \end{cases}$$

• If the distribution tested is discrete and combining adjacent cells is not required (so that Ei > minimum requirement):
• Each value of the random variable should be a class interval, unless combining is necessary, and

$$p_i = p(x_i) = P(X = x_i)$$
Chi-Square Test
• If the distribution tested is continuous:

$$p_i = \int_{a_{i-1}}^{a_i} f(x)\,dx = F(a_i) - F(a_{i-1})$$

• where ai-1 and ai are the endpoints of the i-th class interval
• f(x) is the assumed PDF, F(x) is the assumed CDF
• Recommended number of class intervals (k):

Sample size (n) | Number of class intervals (k)
20 | Do not use the chi-square test
50 | 5 to 10
100 | 10 to 20
> 100 | $\sqrt{n}$ to $n/5$

• Caution: Different grouping of data (i.e., k) can affect the hypothesis testing result.
Chi-Square Test: Example
• Vehicle Arrival Example (continued):
H0: the random variable is Poisson distributed.
H1: the random variable is not Poisson distributed.

Expected frequency: $E_i = n\,p(x_i) = n\,\dfrac{e^{-\alpha}\alpha^{x_i}}{x_i!}$

x_i | Observed Frequency, O_i | Expected Frequency, E_i | (O_i - E_i)^2/E_i
0 | 12 | 2.6 |
1 | 10 | 9.6 | 7.87 (cells 0 and 1 combined: O = 22, E = 12.2)
2 | 19 | 17.4 | 0.15
3 | 17 | 21.1 | 0.8
4 | 10 | 19.2 | 4.41
5 | 8 | 14.0 | 2.57
6 | 7 | 8.5 | 0.26
7 | 5 | 4.4 |
8 | 5 | 2.0 |
9 | 3 | 0.8 |
10 | 3 | 0.3 |
≥ 11 | 1 | 0.1 | 11.62 (cells 7 to ≥ 11 combined: O = 17, E = 7.6)
Total | 100 | 100.0 | 27.68

• Cells are combined because of the assumption of min Ei = 5, e.g., E1 = 2.6 < 5, hence combine with E2
• Degree of freedom is k-s-1 = 7-1-1 = 5, hence the hypothesis is rejected at the α = 0.05 level of significance:

$$\chi_0^2 = 27.68 > \chi^2_{0.05,\,5} = 11.1$$
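The test above can be sketched in Python using the frequencies from the vehicle-arrival histogram. Several choices here are assumptions: the helper `combine_cells` merges adjacent cells from left to right until the expected count reaches 5, the last cell is treated as "11 or more", and the critical value 11.1 is taken from the slide. Because the slide rounds the expected frequencies to one decimal, the statistic computed here lands near, but not exactly on, 27.68.

```python
import math

# Observed frequencies for 0, 1, ..., 11 arrivals per period (n = 100).
observed = [12, 10, 19, 17, 10, 8, 7, 5, 5, 3, 3, 1]
n = sum(observed)
alpha_hat = sum(x * f for x, f in enumerate(observed)) / n  # MLE: sample mean


def poisson_pmf(x, alpha):
    return math.exp(-alpha) * alpha ** x / math.factorial(x)


expected = [n * poisson_pmf(x, alpha_hat) for x in range(len(observed))]
# Assumption: the last cell collects the whole upper tail (11 or more).
expected[-1] = n - sum(expected[:-1])


def combine_cells(obs, exp, minimum=5.0):
    """Merge adjacent cells left to right until each has E_i >= minimum."""
    cells, o_acc, e_acc = [], 0.0, 0.0
    for o, e in zip(obs, exp):
        o_acc += o
        e_acc += e
        if e_acc >= minimum:
            cells.append((o_acc, e_acc))
            o_acc, e_acc = 0.0, 0.0
    if e_acc > 0 and cells:  # leftover tail joins the last complete cell
        o_last, e_last = cells.pop()
        cells.append((o_last + o_acc, e_last + e_acc))
    return cells


cells = combine_cells(observed, expected)
chi2_0 = sum((o - e) ** 2 / e for o, e in cells)
s = 1                          # one estimated parameter (alpha)
dof = len(cells) - s - 1
CRITICAL_0_05 = 11.1           # chi-square critical value for 5 dof (from the slide)
reject_h0 = chi2_0 > CRITICAL_0_05

print(f"k = {len(cells)}, dof = {dof}, chi2_0 = {chi2_0:.2f}, reject H0: {reject_h0}")
```

The merging rule reproduces the slide's grouping (cells 0 and 1 together, cells 7 and above together), giving k = 7 and 5 degrees of freedom.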
Kolmogorov-Smirnov Test
• Intuition: formalize the idea behind examining a Q-Q plot
• Recall
• The test compares the continuous CDF, F(x), of the hypothesized
distribution with the empirical CDF, SN(x), of the N sample
observations.
• Based on the maximum difference statistic:

D = max| F(x) - SN(x) |

• A more powerful test, particularly useful when:

• Sample sizes are small
• No parameters have been estimated from the data
• When parameter estimates have been made:
• Critical values are biased, too large.
• More conservative, i.e., smaller Type I error than specified.
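The maximum-difference statistic can be computed as in the sketch below. Since the empirical CDF is a step function, the maximum is checked just before and just after each jump. The five sample values and the uniform(0, 1) hypothesis are illustrative assumptions and do not come from the slides.

```python
def ks_statistic(sample, cdf):
    """D = max |F(x) - S_N(x)| over all x, for a continuous CDF F."""
    ordered = sorted(sample)
    n = len(ordered)
    # S_N jumps from i/n to (i+1)/n at the i-th ordered value (0-based i).
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(ordered))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(ordered))
    return max(d_plus, d_minus)


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


# Hypothetical sample of N = 5 observations.
sample = [0.44, 0.81, 0.14, 0.05, 0.93]
d = ks_statistic(sample, uniform_cdf)
print(round(d, 4))
```

The value of D is then compared with a tabulated critical value for the chosen significance level and sample size N.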
p-Values and “Best Fits”
• Hypothesis testing requires a significance level
• Significance level (α) is the probability of falsely rejecting H0
• Common significance levels
• α = 0.1
• α = 0.05
• α = 0.01
• Be aware that the significance level says nothing about the
subject of the test!
• Generalization of the significance level: p-value

[Figure: regions labeled "Reject H0", "Fail to reject H0", and "Reject H0"]

p-Values and “Best Fits”
• p-value for the test statistics
• The significance level at which one would just reject H0 for the given
test statistic value.
• A measure of fit, the larger the better
• Large p-value: good fit
• Small p-value: poor fit

• Vehicle Arrival Example (cont.):

• H0: data is Poisson
• Test statistic: $\chi_0^2 = 27.68$, with 5 degrees of freedom
• The p-value F(5, 27.68) = 0.00004, meaning we would reject H0 at the
0.00004 significance level, hence Poisson is a poor fit.
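The p-value in this example can be checked without a statistics library. The sketch below uses the standard closed form of the chi-square upper-tail probability for an odd number of degrees of freedom; the function name `chi2_sf_odd` is an assumption made for illustration.

```python
import math


def chi2_sf_odd(x, dof):
    """Upper-tail probability P(chi-square with dof > x) for odd dof.

    Closed form: erfc(sqrt(x/2)) + 2*phi(chi) * sum of chi^(2r-1)/(1*3*...*(2r-1)),
    with chi = sqrt(x), phi the standard normal density, r = 1..(dof-1)/2.
    """
    if dof < 1 or dof % 2 == 0:
        raise ValueError("this sketch handles odd degrees of freedom only")
    if x < 0:
        raise ValueError("x must be non-negative")
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2.0))
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    series, term = 0.0, chi
    for r in range(1, (dof - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)  # next term: chi^(2r+1) / (1*3*...*(2r+1))
    return tail + 2.0 * density * series


p_value = chi2_sf_odd(27.68, 5)
print(f"p-value = {p_value:.6f}")
```

The result is on the order of 0.00004, in line with the slide, and far below the common significance levels 0.1, 0.05, and 0.01.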
p-Values and “Best Fits”

• Many software use p-value as the ranking measure to

automatically determine the “best fit”.

• Things to be cautious about:

• Software may not know about the physical basis of the data,
distribution families it suggests may be inappropriate.
• Close conformance to the data does not always lead to the most
appropriate input model.
• p-value does not say much about where the lack of fit occurs

• Recommended: always inspect the automatic selection using

graphical methods.
Fitting a Non-stationary Poisson Process
Fitting a Non-stationary Poisson Process
• Fitting a NSPP to arrival data is difficult, possible
approaches:
• Fit a very flexible model with lots of parameters
• Approximate constant arrival rate over some basic interval of
time, but vary it from time interval to time interval.
• Suppose we need to model arrivals over time [0, T], our
approach is the most appropriate when we can:
• Observe the time period repeatedly
• Count arrivals / record arrival times
• Divide the time period into k equal intervals of length Δt
=T/k
• Over n periods of observation let Cij be the number of arrivals
during the i-th interval on the j-th period
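One common way to complete this approach is to estimate the arrival rate in the i-th interval as the average count divided by the interval length, $\hat{\lambda}_i = \frac{1}{n\,\Delta t}\sum_{j=1}^{n} C_{ij}$. That estimator is not stated on the slide and is assumed here; the counts below are illustrative values, not data from the deck.

```python
def estimate_rates(counts, dt):
    """Estimate a piecewise-constant arrival rate.

    counts[j][i] is C_ij: arrivals in the i-th interval on the j-th period.
    Returns one rate per interval (arrivals per unit time).
    """
    n = len(counts)        # number of observed periods
    k = len(counts[0])     # number of intervals per period
    return [sum(counts[j][i] for j in range(n)) / (n * dt) for i in range(k)]


# Hypothetical counts: 3 observed days, 4 half-hour intervals each.
counts = [
    [12, 23, 27, 20],
    [14, 26, 18, 13],
    [10, 32, 32, 12],
]
rates = estimate_rates(counts, dt=0.5)  # dt in hours, so rates are per hour
print([round(r, 1) for r in rates])
```

Each estimated rate applies only within its own interval; the simulation then generates arrivals with the rate that belongs to the current time.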
Selecting Models without Data
Selecting Models without Data
• If data is not available, some possible sources to obtain
information about the process are:
• Engineering data: often product or process has performance
ratings provided by the manufacturer or company rules
specify time or production standards.
• Expert opinion: people who are experienced with the process
or similar processes can often provide optimistic,
pessimistic, and most-likely times, and they may know the
variability as well.
• Physical or conventional limitations: physical limits on
performance, limits or bounds that narrow the range of the
input process.
• The nature of the process.
• The uniform, triangular, and beta distributions are often
used as input models.
• Speed of a vehicle?
Selecting Models without Data
• Example: Production planning simulation.
• Input of sales volume of various products is required, salesperson of product XYZ says that:
• No fewer than 1000 units and no more than 5000 units will be sold.
• Given her experience, she believes there is a 90% chance of selling more than 2000 units, a 25% chance of selling more than 2500 units, and only a 1% chance of selling more than 4500 units.
• Translating this information into a cumulative probability of being less than or equal to those goals for simulation input:

i | Interval (Sales) | PDF | Cumulative Frequency, ci
1 | 1000 ≤ X ≤ 2000 | 0.1 | 0.10
2 | 2000 < X ≤ 2500 | 0.65 | 0.75
3 | 2500 < X ≤ 4500 | 0.24 | 0.99
4 | 4500 < X ≤ 5000 | 0.01 | 1.00

[Figure: bar chart of the four sales intervals, vertical axis from 0.00 to 1.20]
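The cumulative probabilities of this example can drive a simulation input through the inverse-transform method, as in the sketch below. Interpolating linearly inside each interval is an assumption of this sketch (the slide only gives the interval endpoints and cumulative values), and the names `BREAKS`, `CUM`, and `inverse_cdf` are illustrative.

```python
import bisect
import random

# Interval endpoints (units sold) and cumulative probabilities c_i.
BREAKS = [1000, 2000, 2500, 4500, 5000]
CUM = [0.0, 0.10, 0.75, 0.99, 1.00]


def inverse_cdf(u):
    """Map u in [0, 1] to a sales volume by linear interpolation of the CDF."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    i = bisect.bisect_left(CUM, u)  # first index with CUM[i] >= u
    if i == 0:
        return float(BREAKS[0])
    fraction = (u - CUM[i - 1]) / (CUM[i] - CUM[i - 1])
    return BREAKS[i - 1] + fraction * (BREAKS[i] - BREAKS[i - 1])


rng = random.Random(1)
sales = [inverse_cdf(rng.random()) for _ in range(10000)]
share_above_2000 = sum(x > 2000 for x in sales) / len(sales)
print(round(share_above_2000, 2))  # should be close to the stated 90%
```

About 90% of the generated values exceed 2000 units, which matches the salesperson's statement.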
Multivariate and Time-Series Input Models
Multivariate and Time-Series Input Models
• The random variables discussed until now were considered to be
independent of any other variables within the context of the
problem
• However, variables may be related
• If they appear as input, the relationship should be investigated and
taken into consideration
• Multivariate input models
• Fixed, finite number of random variables X1, X2, …, Xk
• For example, lead time and annual demand for an inventory model
• An increase in demand results in lead time increase, hence variables
are dependent.
• Time-series input models
• Infinite sequence of random variables, e.g., X1, X2, X3, …
• For example, time between arrivals of orders to buy and sell stocks
• Buy and sell orders tend to arrive in bursts, hence, times between
arrivals are dependent.
Time-Series
• A time series is a sequence of random variables X1, X2, X3,…
which are identically distributed (same mean and
variance) but dependent.
• cov(Xt, Xt+h) is the lag-h autocovariance
• corr(Xt, Xt+h) is the lag-h autocorrelation
• If the autocovariance value depends only on h and not on t,
the time series is covariance stationary
• For covariance stationary time series, the shorthand for the lag-h
autocorrelation is used:
$\rho_h = corr(X_t, X_{t+h})$

• Notice
• autocorrelation measures the dependence between random
variables that are separated by h-1 others in the time
series
Multivariate Input Models
• If X1 and X2 are normally distributed, dependence between them
can be modeled by the bivariate normal distribution with $\mu_1$, $\mu_2$,
$\sigma_1^2$, $\sigma_2^2$ and correlation $\rho$
• To estimate $\mu_1$, $\mu_2$, $\sigma_1^2$, $\sigma_2^2$, see “Parameter Estimation”
• To estimate $\rho$, suppose we have n independent and identically
distributed pairs (X11, X21), (X12, X22), … (X1n, X2n),

• Then the sample covariance is

$$\widehat{cov}(X_1, X_2) = \frac{1}{n-1} \sum_{j=1}^{n} (X_{1j} - \bar{X}_1)(X_{2j} - \bar{X}_2)$$

• The sample correlation is

$$\hat{\rho} = \frac{\widehat{cov}(X_1, X_2)}{\hat{\sigma}_1 \hat{\sigma}_2}$$

where $\hat{\sigma}_1$ and $\hat{\sigma}_2$ are the sample deviations
Multivariate Input Models: Example
• Let X1 be the average lead time to deliver and X2 the annual
demand for a product.
• Data for 10 years is available.

Lead Time (X1) | Demand (X2)
6.5 | 103
4.3 | 83
6.9 | 116
6.0 | 97
6.9 | 112
6.9 | 104
5.8 | 106
7.3 | 109
4.5 | 92
6.3 | 96

$$\bar{X}_1 = 6.14,\ \hat{\sigma}_1 = 1.02, \qquad \bar{X}_2 = 101.8,\ \hat{\sigma}_2 = 9.93$$

$$\widehat{cov}_{sample} = 8.66, \qquad \hat{\rho} = \frac{8.66}{1.02 \times 9.93} = 0.86$$

• Lead time and demand are strongly dependent.

• Before accepting this model, lead time and demand should be
checked individually to see whether they are represented well by
normal distribution.
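The figures in this example can be reproduced from the ten data pairs with the sample covariance and correlation formulas. The sketch below uses only the standard library; note that the unrounded correlation is about 0.85, while the slide's 0.86 results from dividing the already rounded values 8.66, 1.02, and 9.93.

```python
from statistics import mean, stdev

lead_time = [6.5, 4.3, 6.9, 6.0, 6.9, 6.9, 5.8, 7.3, 4.5, 6.3]  # X1
demand = [103, 83, 116, 97, 112, 104, 106, 109, 92, 96]          # X2


def sample_covariance(a, b):
    """(1 / (n - 1)) * sum of (a_j - mean(a)) * (b_j - mean(b))."""
    ma, mb = mean(a), mean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


cov_hat = sample_covariance(lead_time, demand)
sigma1_hat, sigma2_hat = stdev(lead_time), stdev(demand)
rho_hat = cov_hat / (sigma1_hat * sigma2_hat)

print(round(mean(lead_time), 2), round(sigma1_hat, 2))
print(round(mean(demand), 1), round(sigma2_hat, 2))
print(round(cov_hat, 2), round(rho_hat, 3))
```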
Time-Series Input Models
• If X1, X2, X3,… is a sequence of identically distributed, but
dependent and covariance-stationary random variables, then
we can represent the process as follows:
• Autoregressive order-1 model, AR(1)
• Exponential autoregressive order-1 model, EAR(1)

• Both have the characteristics that:

$$\rho_h = corr(X_t, X_{t+h}) = \rho^h, \quad \text{for } h = 1, 2, \ldots$$

• Lag-h autocorrelation decreases geometrically as the lag
increases, hence, observations far apart in time are nearly
independent
Time-Series Input Models:
Autoregressive order-1 model AR(1)
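• Consider the time-series model:

$$X_t = \mu + \phi (X_{t-1} - \mu) + \varepsilon_t, \quad \text{for } t = 2, 3, \ldots$$

where $\varepsilon_2, \varepsilon_3, \ldots$ are i.i.d. normally distributed with $\mu_\varepsilon = 0$ and variance $\sigma_\varepsilon^2$

• If initial value X1 is chosen appropriately, then
• X1, X2, … are normally distributed with mean = $\mu$, and variance = $\sigma_\varepsilon^2 / (1 - \phi^2)$
• Autocorrelation $\rho_h = \phi^h$

• To estimate $\phi$, $\mu$, $\sigma_\varepsilon^2$:

$$\hat{\mu} = \bar{X}, \qquad \hat{\sigma}_\varepsilon^2 = \hat{\sigma}^2 (1 - \hat{\phi}^2), \qquad \hat{\phi} = \frac{\widehat{cov}(X_t, X_{t+1})}{\hat{\sigma}^2}$$

where $\widehat{cov}(X_t, X_{t+1})$ is the lag-1 autocovariance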
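The AR(1) model and its estimators can be sketched as follows. The parameter values, the seed, and the function names are illustrative assumptions. The first value is drawn from the stationary distribution (one way to choose X1 "appropriately"), and the lag-1 autocovariance is computed with divisor n − 1, which is one of several common conventions.

```python
import math
import random


def simulate_ar1(mu, phi, sigma_eps, n, seed=0):
    """Generate X_1..X_n with X_t = mu + phi * (X_{t-1} - mu) + eps_t."""
    rng = random.Random(seed)
    # Stationary start: variance sigma_eps^2 / (1 - phi^2).
    x = rng.gauss(mu, sigma_eps / math.sqrt(1.0 - phi ** 2))
    series = [x]
    for _ in range(n - 1):
        x = mu + phi * (x - mu) + rng.gauss(0.0, sigma_eps)
        series.append(x)
    return series


def fit_ar1(series):
    """Return (mu_hat, phi_hat, sigma_eps2_hat) using the estimators above."""
    n = len(series)
    mu_hat = sum(series) / n
    var_hat = sum((x - mu_hat) ** 2 for x in series) / (n - 1)
    cov1 = sum((series[t] - mu_hat) * (series[t + 1] - mu_hat)
               for t in range(n - 1)) / (n - 1)
    phi_hat = cov1 / var_hat
    sigma_eps2_hat = var_hat * (1.0 - phi_hat ** 2)
    return mu_hat, phi_hat, sigma_eps2_hat


# Hypothetical parameters for illustration.
series = simulate_ar1(mu=10.0, phi=0.7, sigma_eps=1.0, n=20000, seed=42)
mu_hat, phi_hat, sigma_eps2_hat = fit_ar1(series)
print(round(mu_hat, 2), round(phi_hat, 2), round(sigma_eps2_hat, 2))
```

With a long series the estimates should land close to the generating values, which gives a quick sanity check on both the generator and the estimators.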
Summary
• In this chapter, we described the 4 steps in developing input
data models:
(1) Collecting the raw data
(2) Identifying the underlying statistical distribution
(3) Estimating the parameters
(4) Testing for goodness of fit

Dote 2011 L1
No ratings yet
Dote 2011 L1
35 pages
9709 Learner Guide For Examination From 2020
No ratings yet
9709 Learner Guide For Examination From 2020
9 pages
Assignment 3 2025
No ratings yet
Assignment 3 2025
2 pages
Notes 4 - Confidence Intervals and Significance Tests
No ratings yet
Notes 4 - Confidence Intervals and Significance Tests
1 page
Untitled Libreoffice Writer Document
No ratings yet
Untitled Libreoffice Writer Document
1 page
Applied Econometrics 4th Edition Dimitrios Asteriou - The Complete Ebook Set Is Ready For Download Today
No ratings yet
Applied Econometrics 4th Edition Dimitrios Asteriou - The Complete Ebook Set Is Ready For Download Today
41 pages
Module 8 - Normal Distribution
No ratings yet
Module 8 - Normal Distribution
9 pages
Validation and Psychometric Properties of Suicide Behaviors Questionnaire-Revised (SBQ-R) in Iran - PubMed
No ratings yet
Validation and Psychometric Properties of Suicide Behaviors Questionnaire-Revised (SBQ-R) in Iran - PubMed
2 pages
(Ebook PDF) Statistics: Unlocking The Power of Data, 2nd Edition Download
100% (2)
(Ebook PDF) Statistics: Unlocking The Power of Data, 2nd Edition Download
51 pages
05 Handout 1
No ratings yet
05 Handout 1
4 pages
Técnicas Estadísticas para la Ciencia de Datos a través de R. Aprendizaje Supervisado: Análisis Discriminante, Árboles de Decisión, Redes Neuronales y Modelos Lineales Generalizados
From Everand
Técnicas Estadísticas para la Ciencia de Datos a través de R. Aprendizaje Supervisado: Análisis Discriminante, Árboles de Decisión, Redes Neuronales y Modelos Lineales Generalizados
César Pérez López
No ratings yet
02 Unit 1 - 2
No ratings yet
02 Unit 1 - 2
37 pages
Introduction To Data Science
No ratings yet
Introduction To Data Science
45 pages
Validity and Reliability CH 5
No ratings yet
Validity and Reliability CH 5
4 pages
ML 1 Lecture 2
No ratings yet
ML 1 Lecture 2
50 pages
Slide-04-Chapter2-Getting To Know Your Data
No ratings yet
Slide-04-Chapter2-Getting To Know Your Data
47 pages
Building Material Estimates and Rates Build Up: Second Edition
From Everand
Building Material Estimates and Rates Build Up: Second Edition
Moremi Mareka
No ratings yet
Da Notes
No ratings yet
Da Notes
4 pages
Data Mining 1
No ratings yet
Data Mining 1
29 pages
Iie 3017 02
No ratings yet
Iie 3017 02
35 pages
24 - Lya Afriasih - Gda Prepost Kontrol
No ratings yet
24 - Lya Afriasih - Gda Prepost Kontrol
3 pages
Chapter 9 - Review Input Analysis
No ratings yet
Chapter 9 - Review Input Analysis
24 pages
Ap Stats Cheat Sheet
No ratings yet
Ap Stats Cheat Sheet
1 page
Week2 1
No ratings yet
Week2 1
24 pages
SCSA1606 - Predictive and Advanced Analytics - Unit II
No ratings yet
SCSA1606 - Predictive and Advanced Analytics - Unit II
50 pages
Plot (Graphics)
No ratings yet
Plot (Graphics)
9 pages
Cpsc531 Input
No ratings yet
Cpsc531 Input
44 pages
SM Lect 07
No ratings yet
SM Lect 07
25 pages
Unit 5 - PS
No ratings yet
Unit 5 - PS
51 pages
Introduction To Panel Data
No ratings yet
Introduction To Panel Data
20 pages
Unit 8 Time Series
No ratings yet
Unit 8 Time Series
24 pages
Basic Forecasting Methods
No ratings yet
Basic Forecasting Methods
144 pages
Sampling and Sampling Distribution With Business Application - v2
No ratings yet
Sampling and Sampling Distribution With Business Application - v2
11 pages
Unit 6 Input Modeling: Collect Data From The Real System of Interest
No ratings yet
Unit 6 Input Modeling: Collect Data From The Real System of Interest
7 pages
Chapter 3 Forecasting
100% (1)
Chapter 3 Forecasting
87 pages
SMS Module 4 - Input Modeling
No ratings yet
SMS Module 4 - Input Modeling
33 pages
Verification and Validation of Simulation Models
No ratings yet
Verification and Validation of Simulation Models
37 pages
Input Modeling
No ratings yet
Input Modeling
10 pages
R22 Unit2 CH2
No ratings yet
R22 Unit2 CH2
28 pages
Ass 1 2019 RMBA
100% (3)
Ass 1 2019 RMBA
8 pages
Psy 230 Independent Samples T-Test: Figure 10-3 (P. 314)
No ratings yet
Psy 230 Independent Samples T-Test: Figure 10-3 (P. 314)
5 pages
Data Estrus SPSS
No ratings yet
Data Estrus SPSS
3 pages
Random Number Generation
No ratings yet
Random Number Generation
43 pages
Random Number Generation
No ratings yet
Random Number Generation
43 pages
BSM With SPSS
No ratings yet
BSM With SPSS
90 pages
MATERIAL01
No ratings yet
MATERIAL01
18 pages
CH 9
No ratings yet
CH 9
13 pages
Chapter 17 - Sampling and Estimation: Solutions To Exercise 17A
No ratings yet
Chapter 17 - Sampling and Estimation: Solutions To Exercise 17A
14 pages
Univariate and Bivariate
No ratings yet
Univariate and Bivariate
4 pages
Basic Statistics
No ratings yet
Basic Statistics
90 pages
4.2 Estimation of Absolute Performance
No ratings yet
4.2 Estimation of Absolute Performance
42 pages
Mann Whitney - Practical
No ratings yet
Mann Whitney - Practical
3 pages
Lec08 2025
No ratings yet
Lec08 2025
43 pages
7 Input Modeling 2024
No ratings yet
7 Input Modeling 2024
90 pages
Statistics Foundation Slider Team Group#1
No ratings yet
Statistics Foundation Slider Team Group#1
94 pages
Sop 9 Control Charts For Mass 20190506
No ratings yet
Sop 9 Control Charts For Mass 20190506
9 pages
Grade 3 Data Mining: Question Text
No ratings yet
Grade 3 Data Mining: Question Text
28 pages
Statistics: Descriptive Statistics and Present Data
No ratings yet
Statistics: Descriptive Statistics and Present Data
41 pages
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
No ratings yet
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
21 pages
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
No ratings yet
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
21 pages
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
No ratings yet
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
21 pages
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
No ratings yet
C++ Language: By:-Aakash Kaushik #9289817971, 98919893083
33 pages
Programming Fundamentals: Writing Code
No ratings yet
Programming Fundamentals: Writing Code
43 pages
WK4 - Input Data v1
No ratings yet
WK4 - Input Data v1
27 pages
Emgt 512 SP 2024
No ratings yet
Emgt 512 SP 2024
156 pages
PART I: (Please Answer On The QUESTION SHEET) : F E - GTP 8 Intake Index 1
No ratings yet
PART I: (Please Answer On The QUESTION SHEET) : F E - GTP 8 Intake Index 1
8 pages
Bioinfo 10
No ratings yet