Statistics and Probability
Maloro, Tangub City, Misamis Occidental 7214
Mathematics

Descriptive statistics is concerned with the collection, organization, and presentation of data. Thus, descriptive statistical analysis aims to summarize some of the important features of a set of data. The construction of tables, charts, and graphs, and the computation of measures such as averages and percentages, fall within this area of statistics.

Inferential statistics is concerned with the formulation of conclusions or generalizations about a population based on an observation or a series of observations of a sample drawn from the population.

Variables are properties or characteristics of some event, object, or person that can take on different values or amounts.

Data are the values (measurements or observations) that the variables can assume. Data are facts, or a set of information gathered or under study.

Independent and dependent variables
When conducting research, experimenters often manipulate variables. For example, an experimenter might compare the effectiveness of four types of antidepressants. In this case, the variable is “the type of antidepressant.” When a variable is manipulated by an experimenter, it is called an independent variable.

The experiment seeks to determine the effect of the independent variable on relief from depression. In this example, relief from depression is called a dependent variable.

In an experiment on the effect of sleep on memory, the independent variable is
a. number of hours of sleep
b. recall score on a memory test
c. gender of the respondents
d. gender of the researcher

Qualitative variables are categories that differ according to the characteristics that they possess. They describe some characteristics of the person. These are attributes that cannot be subjected to meaningful arithmetic.
Example: gender, religion, occupation

Quantitative variables are variables that quantify an element of a population. They are numerical in nature, and therefore meaningful arithmetic can be done on them.
Example: age, salary, length of service

Nominal variable is a qualitative variable that characterizes/describes an element of a population. Any numbers used do not carry numerical meaning; they are just labels. It is the lowest level of measurement.
Example: SSS number, gender, hair color, hometown

Ordinal variable is a qualitative variable that incorporates an ordered position or ranking. These are numbers that are used to label and rank. The order/rank is meaningful.
Example: satisfaction ratings, class ranking, size of t-shirt

Interval numbers are used to label and rank, have equal units of interval, and do not have a true zero.
Example: temperature (in degrees Celsius or Fahrenheit)

Ratio numbers are used to label and rank, have equal units of interval, and do have a true zero.
Example: number of votes

Sampling Techniques

Probability Sampling
1.) Simple Random Sampling
This is the simplest form of random sampling, where every subset of size n of the population has an equal chance of being selected.

2.) Systematic Sampling
This is also called interval sampling. It means that there is a gap or interval between each selection. Samples are randomly chosen following certain rules set by the researchers. This involves choosing every kth member of the population, with k = N/n, but there should be a random start.

Example: Choose a sample of size 10 from N = 500, using systematic random sampling.
Step 1: Determine k (period); k = 500/10 = 50, so this means that you have to include every 50th member of N after choosing a random start.
Step 2: Put the random start at 15.
Step 3: Include in the sample the following: 15, 65, 115, 165, 215, 265, 315, 365, 415, and 465.

3.) Stratified Sampling
This method is used when the population is too big to handle, so dividing N into subgroups, called strata, is necessary. Samples per stratum are then randomly selected, but consideration must be given to the sizes of the random samples to be selected from the subgroups.

4.) Cluster Sampling
Cluster sampling is sometimes called area sampling because it is usually applied when the population is large. In this technique, groups or clusters, instead of individuals, are randomly chosen. Recall that in simple random sampling, you select members of the sample individually. In cluster sampling,
you will draw the members of the sample by group or cluster, and then you select a sample from each group or cluster individually.

Non-probability Sampling
1.) Convenience Sampling
This type is used because of the convenience it offers to the researcher.
Example: Gathering of data through the telephone.

2.) Quota Sampling
This is very similar to stratified random sampling. The only difference is that the selection of the members of the samples in stratified sampling is done randomly, whereas in quota sampling it is not.
Example: To get the most popular noontime show, each field researcher is given a quota of, say, 200 viewers per area.

3.) Purposive Sampling
Choosing the respondents based on pre-determined criteria set by the researcher.
Example: Suppose the research is all about the level of maturity of teenage parents in a particular school. Of course, only teenage parents in that school will be the respondents.

4.) Snowball Sampling
The survey subjects of snowball sampling are selected based on referrals from other survey respondents, that is, the sample is selected using networks.
Example: A researcher wanted to study the factors why some students occasionally use prohibited drugs. He intended to get 50 students, but he only knew 5 students who used them. By getting the cooperation of these 5 students, he was referred to other drug users, who in turn also provided additional contacts. In this way, he was able to get the number of students he needed.

Measures of Central Tendency
Mean – the sum of all the data values divided by the number of values.
Median – the number that separates the ordered list of data into two equal parts. To find the median, list the data in order from smallest to largest. If the number of data values is odd, the median is the middle number. If the number of data values is even, the median is the mean of the two middle numbers.
Mode – the number in the list that occurs the most frequently.

The median is 84.5.
Mode: Since 87 appears twice and each of the other numbers appears only once, the mode is 87.

Measures of Dispersion

Range: The range is the simplest measure of dispersion. It is the difference between the maximum and minimum values in a dataset. For example, if you have a dataset of test scores ranging from 60 to 90, the range is 90 - 60 = 30.

Interquartile Range (IQR): The IQR is the range of the middle 50% of data values when the dataset is ordered. It is calculated as the difference between the third quartile (Q3) and the first quartile (Q1) and is less sensitive to outliers than the range.

Variance: Variance measures the average of the squared differences between each data point and the mean. It provides a measure of how far data points deviate from the mean.

Standard Deviation: The standard deviation is the square root of the variance. It is used to measure the spread of data in the same units as the data. A lower standard deviation indicates less variability, while a higher standard deviation indicates greater variability.

Coefficient of Variation (CV): The coefficient of variation is the ratio of the standard deviation to the mean, expressed as a percentage. It is used to compare the relative variability of different datasets, especially when they have different units or scales.

Measures of Position
Measures of position provide a way to understand the position of a specific data point or observation within a dataset in relation to the rest of the data.

Percentiles:
Percentiles divide data into 100 equal parts. For example, the 25th percentile (P25) is the value below which 25% of the data falls.
If a student scores in the 75th percentile on a standardized test, it means they performed better than 75% of the test-takers.
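
The measures of central tendency, dispersion, and position described above can be tried out with a short Python sketch. It assumes Python 3.8 or later and uses only the standard statistics module. The scores list and the summarize function are hypothetical names made up for illustration; the data are not from this handout. Variance is computed as the population variance, to match the "average of the squared differences" definition. Quartiles and percentiles use the module's default method, and other textbook conventions can give slightly different values.

```python
import statistics

# Hypothetical test scores, chosen only for illustration (not from the handout).
scores = [60, 70, 80, 80, 90, 90, 90]


def summarize(data):
    """Return measures of central tendency, dispersion, and position."""
    # Quartile cut points Q1, Q2, Q3 (default 'exclusive' method).
    q1, _q2, q3 = statistics.quantiles(data, n=4)
    mean = statistics.mean(data)
    # Population standard deviation: square root of the population variance.
    std_dev = statistics.pstdev(data)
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": statistics.pvariance(data),
        "std_dev": std_dev,
        # Coefficient of variation, expressed as a percentage.
        "cv_percent": std_dev / mean * 100,
        # 25th percentile: the 25th of the 99 cut points for n=100.
        "p25": statistics.quantiles(data, n=100)[24],
    }


summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this hypothetical list the mean and median are both 80, the mode is 90, and the range is 90 - 60 = 30.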
Examples:
1.) What is the probability of rolling a 6 on a fair six-sided die?
Answer: a) 1/6
2.) In a deck of 52 playing cards, what is the probability of drawing a red card (hearts or diamonds)?
Answer: b) 1/2
3.) You are drawing cards from a deck without replacement. What is the probability of drawing two consecutive aces?
Answer: b) 1/221, since for a 52-card deck (4/52)(3/51) = 12/2652 = 1/221

Example: How many ways can you choose 2 players from a group of 10 to form a doubles tennis team?

Number of ways to arrange n distinct objects in a circle: (n-1)!
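
The answers above can be checked with exact fractions. This is a minimal sketch that assumes Python 3.8 or later and a standard 52-card deck; the variable names and the circular_arrangements helper are hypothetical, introduced only for illustration. The doubles-team count treats a team as an unordered pair of players, which is an assumption about what the question intends.

```python
from fractions import Fraction
from math import comb, factorial

# 1.) One favorable face out of six equally likely faces.
p_six = Fraction(1, 6)

# 2.) 26 red cards (hearts and diamonds) out of 52.
p_red = Fraction(26, 52)

# 3.) Two aces in a row without replacement:
#     4 aces out of 52 cards, then 3 aces out of the remaining 51.
p_two_aces = Fraction(4, 52) * Fraction(3, 51)

# Choosing 2 players from 10 when order does not matter (a combination).
doubles_teams = comb(10, 2)


def circular_arrangements(n):
    """Number of ways to arrange n distinct objects in a circle: (n-1)!"""
    return factorial(n - 1)


print(p_six, p_red, p_two_aces)
print(doubles_teams)
print(circular_arrangements(5))
```

Fraction reduces each result to lowest terms, so 26/52 prints as 1/2 and the two-aces product prints as 1/221.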