Lbolytc Finals Notes XXXX - Compress
ALL LECTURE NOTES
LBOLYTC (Introduction to Analytics)
Sec. K35 | Prof. Wilson Cordova | De La Salle University TERM 1 AY 2022-2023
● < Ogive (HA: Upper CB; VA: <cf)
● > Ogive (HA: Lower CB; VA: >cf)
● Ogives (combined < and > ogive)

NUMERICAL DESCRIPTIVE MEASURES
Measures of Central Tendency
● Describes the “center” of a given data set. It is a single value about which the observations tend to cluster.
● Arithmetic Mean (or Mean)
○ Sum of all observations divided by the total number of observations, denoted by x̄
○ Properties : It always exists, is unique, and takes every observation into account, so it is easily affected by extreme values
● Median - the middle value of an array (the data arranged in order), denoted by Md (x̃)
○ Properties : Not easily affected by extreme values, always exists and is unique.
● Mode - the observation/s that occur most frequently in the given set of data, denoted by Mo.
○ Properties : no calculations required, may not exist, may not be unique.
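
A minimal Python sketch of the three measures above, using made-up scores (the data and the helper name `modes` are hypothetical, not from the lecture). It shows how a single extreme value pulls the mean much more than the median, and how the mode may not exist.

```python
from collections import Counter
from statistics import mean, median


def modes(data):
    """Return the most frequent value(s); an empty list means no mode exists."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        # Every value occurs once, so no observation is "most frequent".
        return []
    return [value for value, count in counts.items() if count == top]


scores = [2, 3, 3, 4, 5]          # hypothetical data set
with_outlier = scores + [100]     # same data plus one extreme value

print("mean:", mean(scores), "->", mean(with_outlier))
print("median:", median(scores), "->", median(with_outlier))
print("mode:", modes(scores), "->", modes(with_outlier))
```

The mean jumps when the extreme value is added while the median barely moves, which matches the properties listed for each measure.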
most information (distribution); Median is preferred if distribution is skewed.

Measures of Position
● Measures that discriminate a group of scores from another group in the same data set
● Quantile - Divides data into an equal number of parts
● Quartile - values that divide a set of data into four equal parts, denoted by Q
● Decile - values that divide a set of data into ten equal parts, denoted by D
● Percentile - values that divide a set of data into one hundred equal parts, denoted by P

*Ungrouped Measures of Position
To locate desired quantile:
● Pk = k (n+1) / 100 → position
● If Pk = k (n+1) / 100 is not exact (not a whole number), use interpolation
○ Interpolation computes a number between the 2 adjacent values in the array; the result is not necessarily in the middle
○ Subtract the 2 values based on Pk formula → multiply by the decimal part of the position → add the lower number

Measures of Variability (or Measures of Dispersion)
● Describes the extent to which the data are dispersed
● Variability is a descriptive statistic that describes how similar a set of scores are to one another
● Range - difference between the highest and lowest value in the data set (R = HV - LV)
○ Rarely used because of its sensitivity to extreme values
● Variance (s² or σ²) - the mean of the squared differences of the observations from their mean
○ Difference - deviate or deviation score
○ Deviate tells a user how far a given score is from the typical, or average, score; a measure of dispersion for a given score
● Standard Deviation (s or σ) - positive square root of the variance
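
The position rule Pk = k (n+1) / 100 with interpolation, and the range, variance, and standard deviation, can be sketched in Python as below. This is only an illustration: the function names and data are hypothetical, clamping positions that fall outside the array is an assumption the notes do not cover, and dividing by n - 1 for the sample variance is the usual convention rather than something stated in the notes.

```python
import math


def percentile(data, k):
    """k-th percentile via position k(n+1)/100 with linear interpolation."""
    values = sorted(data)                 # the "array": data in ascending order
    n = len(values)
    position = k * (n + 1) / 100          # 1-based position in the array
    lower = math.floor(position)
    fraction = position - lower           # decimal part of the position
    if lower < 1:                         # assumption: clamp to the lowest value
        return values[0]
    if lower >= n:                        # assumption: clamp to the highest value
        return values[-1]
    low_value, high_value = values[lower - 1], values[lower]
    # Subtract the 2 values, multiply by the decimal, add the lower number.
    return low_value + fraction * (high_value - low_value)


def data_range(data):
    """R = HV - LV."""
    return max(data) - min(data)


def variance(data, sample=True):
    """Mean squared deviation; divides by n - 1 for a sample, n for a population."""
    n = len(data)
    center = sum(data) / n
    squared_deviations = sum((x - center) ** 2 for x in data)
    return squared_deviations / (n - 1 if sample else n)


def std_dev(data, sample=True):
    """Positive square root of the variance."""
    return math.sqrt(variance(data, sample))


ordered = [2, 4, 6, 8, 10, 12, 14, 16, 18]      # hypothetical data, n = 9
print("Q1 = P25:", percentile(ordered, 25))      # position 2.5, between 4 and 6
print("D5 = P50:", percentile(ordered, 50))      # position 5, exact

spread = [2, 4, 4, 4, 5, 5, 7, 9]                # hypothetical data
print("range:", data_range(spread))
print("population variance:", variance(spread, sample=False))
print("population std dev:", std_dev(spread, sample=False))
```

Quartiles and deciles reuse the same function, since Qk = P(25k) and Dk = P(10k).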
Measures of Kurtosis
● Measures whether the scores are
spread out more or less than they would
be in a normal (Gaussian) distribution
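
A rough Python sketch of a kurtosis check. The notes' formula for K is not shown here, so this assumes the common moment-based definition, K = (mean of fourth-power deviations) / (population variance)², for which a normal distribution gives K = 3. The function names, the data, and the `tolerance` parameter are hypothetical.

```python
import math


def kurtosis(data):
    """Moment-based kurtosis: fourth central moment over squared variance."""
    n = len(data)
    center = sum(data) / n
    second_moment = sum((x - center) ** 2 for x in data) / n
    fourth_moment = sum((x - center) ** 4 for x in data) / n
    if second_moment == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return fourth_moment / second_moment ** 2


def classify_kurtosis(k, tolerance=1e-9):
    """Label K relative to 3, the value for a normal (Gaussian) distribution."""
    if math.isclose(k, 3.0, abs_tol=tolerance):
        return "mesokurtic"
    return "leptokurtic" if k > 3 else "platykurtic"


sample_scores = [1, 2, 3, 4, 5]   # hypothetical data
k = kurtosis(sample_scores)
print(k, classify_kurtosis(k))
```

Evenly spread values such as these give K below 3, which is labeled platykurtic.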
Note:
● Mesokurtic if K = 3
● Leptokurtic if K > 3
● Platykurtic if K < 3

SAMPLING TECHNIQUES

Population
● A set which includes all measurements of interest to the researcher

Sample
● A subset of the population

Why do sampling?
● Impossible to study the whole population
● Manageability of data
● Economic reasons
● Time and effort

Types of Sampling
● Probability Sampling - each member of the population is given an equal chance to become part of the sample
● Non-probability Sampling - each member of the population does not have an equal chance to become part of the sample

Probability Sampling
● Requires a complete sampling frame
● Can select a random sample from the population

Non-Probability Sampling
● Used when there isn’t an exhaustive population list available
● Not random
● Can be effective when trying to generate ideas and getting feedback
● More convenient and less costly

Types of Non-Probability Sampling
● Convenience sampling
○ Uses subjects that are readily available or includes people who are easy to reach
● Purposive Sampling
○ Looks for predefined groups that serve as samples

Types of Probability Sampling
● Simple Random Sampling - ALL members of the population have an equal chance of being part of the sample.
● Stratified Sampling - Used when the population can be subdivided into smaller groups (or strata) and then SRS is applied to get samples from each stratum
● Cluster Sampling - Employs the use of clusters (groups), instead of individuals, that are randomly chosen
● Systematic Sampling - Selects every nth member of the population with the starting point determined at random
● Multi-Stage Sampling

Sample Size, denoted by n
● To get a meaningful result, let n be at least 100.
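
The probability sampling types can be sketched with Python's standard `random` module. This is only an illustration under simple assumptions: the population is a list of ID numbers, strata and clusters are given as dictionaries, and all function names are hypothetical. Multi-stage sampling is not shown.

```python
import random


def simple_random_sample(population, n, rng):
    """SRS: every member has an equal chance; drawn without replacement."""
    return rng.sample(population, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Apply SRS inside each stratum and combine the results."""
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return [member for name in picked for member in clusters[name]]


def systematic_sample(population, step, rng):
    """Take every step-th member, starting from a random point."""
    start = rng.randrange(step)
    return population[start::step]


rng = random.Random(2022)                      # fixed seed so runs repeat
population = list(range(1, 101))               # hypothetical IDs 1..100
strata = {"first half": population[:50], "second half": population[50:]}
clusters = {f"group {i}": population[i * 10:(i + 1) * 10] for i in range(10)}

print(simple_random_sample(population, 10, rng))
print(stratified_sample(strata, 5, rng))
print(cluster_sample(clusters, 2, rng))
print(systematic_sample(population, 10, rng))
```

Passing the generator `rng` into each function keeps the draws reproducible, which is convenient when checking work by hand.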
Statistical Hypotheses
● A guess or prediction made by the researcher regarding the possible outcome of the study

Steps in Hypothesis Testing
1. Formulate Ho and Ha
2. Set the level of significance (α)
3. Formulate the decision rule; find the critical value or P-value
4. Compute the test statistic
5. Make your decision
6. Write a conclusion

2. One-tailed right directional test (Used if Ha uses >)
Criterion
1. One-tailed left directional test
“Reject Ho if Zc ≤ Zt”
2. One-tailed right directional test
“Reject Ho if Zc ≥ Zt”
3. Two-tailed test: Non-directional
“Reject Ho if Zc ≥ +Zt” or “Reject Ho if Zc ≤ -Zt”

Decisions made regarding Ho (Reject/Do not reject)
● If we reject Ho, it means the sample gives enough evidence against it at the chosen α
● If we accept (do not reject) Ho, it doesn’t mean it’s correct; we just don’t have enough evidence to reject it

Errors in Hypothesis Testing
Ho      Accept    Reject
True    ✔         Type I Error

Testing the hypothesized: Value of the mean
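
A hedged Python sketch of the decision criteria for a test on the mean. It reads Zc as the computed z and Zt as the critical (tabular) z, which is an assumption about the notes' notation. The test statistic used, z = (x̄ - μ0) / (σ / √n), is the standard one-sample z formula and is also assumed, since the notes' formula is not shown. The function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (Zc, Zt, reject) for a one-sample z-test on the mean."""
    z_computed = (sample_mean - mu0) / (sigma / math.sqrt(n))
    standard_normal = NormalDist()           # mean 0, standard deviation 1
    if tail == "left":                       # Ha uses <
        z_critical = standard_normal.inv_cdf(alpha)
        reject = z_computed <= z_critical
    elif tail == "right":                    # Ha uses >
        z_critical = standard_normal.inv_cdf(1 - alpha)
        reject = z_computed >= z_critical
    elif tail == "two":                      # Ha uses ≠
        z_critical = standard_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_computed) >= z_critical
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z_computed, z_critical, reject


# Hypothetical study: Ho: μ = 50, sample mean 52, σ = 10, n = 100, α = 0.05
for tail in ("left", "right", "two"):
    zc, zt, reject = z_test_mean(52, 50, 10, 100, alpha=0.05, tail=tail)
    decision = "Reject Ho" if reject else "Do not reject Ho"
    print(f"{tail}-tailed: Zc = {zc:.3f}, Zt = {zt:.3f} -> {decision}")
```

For the left-tailed case the critical value is negative, so a positive Zc cannot fall in the rejection region.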
Kinds of ANOVA
● One-way ANOVA - only 1 variable is involved
● Two-way ANOVA - 2 variables involved; column and row variables; used to know if there is a significant difference between and among columns and rows
● Three-way ANOVA - 3 variables involved

Why do we use ANOVA?
● To determine if there is a significant difference between and among the means of two or more independent groups.

When to use?
● If there is a normal distribution and when the level of measurement is expressed in interval or ratio (numerical) data

Use a table* that contains the following: x, y, x², y², xy

SIMPLE LINEAR REGRESSION
● Predicts the value of y given the value of x

When to use?
● When there is a relationship between x and y
● The data should be normally distributed, with the level of measurement expressed in interval or ratio (numerical) data

FORMULA
Y = dependent variable
X = independent variable
B = numerical constant
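
A minimal Python sketch of simple linear regression built from column sums of x, y, x², and xy. The notes' formula is not shown, so this assumes the usual least-squares line Y = a + bX, with b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and a = (Σy - bΣx) / n. The intercept name `a`, the function names, and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)              # the x² column total
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # the xy column total
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def predict(a, b, x):
    """Predicted value of the dependent variable Y for a given X."""
    return a + b * x


hours = [1, 2, 3, 4, 5]          # hypothetical X (independent variable)
scores = [3, 5, 7, 9, 11]        # hypothetical Y (dependent variable)
a, b = fit_line(hours, scores)
print("a =", a, "b =", b)
print("predicted Y at X = 6:", predict(a, b, 6))
```

The y² column is not needed for the line itself; it is typically used when computing the correlation coefficient.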