Collection of Data Part 2 Edited MLIS
Population
The age of all faculty members at the college.
Sample
Any subset of that population. For example, we might
select 10 faculty members and determine their ages.
Variable
The “age” of each faculty member.
EXAMPLE: A COLLEGE DEAN IS INTERESTED IN
LEARNING ABOUT THE AVERAGE AGE OF FACULTY.
IDENTIFY THE BASIC TERMS IN THIS SITUATION.
Data (singular)
The age of a specific faculty member.
Data (plural)
The set of ages recorded for the faculty members in the
sample.
Experiment
The method used to select the ages forming the
sample and to determine the actual age of each faculty
member in the sample.
Parameter
The “average” age of all faculty at the college.
Statistic
The “average” age for all faculty in the sample.
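The difference between a parameter and a statistic can be sketched in a few lines of Python. The ages below are invented for illustration (they are not data from any college), and the names `population_ages` and `sample_ages` are hypothetical.

```python
import random
from statistics import mean

# Hypothetical ages for a faculty of 20 people (invented for illustration).
population_ages = [29, 34, 38, 41, 43, 45, 47, 48, 50, 52,
                   53, 55, 56, 58, 59, 61, 62, 64, 66, 69]

# Parameter: the mean age of ALL faculty, computed from the whole population.
parameter = mean(population_ages)

# Statistic: the mean age of a sample of 10 faculty members.
random.seed(1)  # fixed seed only so that a rerun gives the same sample
sample_ages = random.sample(population_ages, 10)
statistic = mean(sample_ages)

print("parameter (population mean):", parameter)
print("statistic (sample mean):    ", statistic)
```

The parameter is a single fixed number for this population, while the statistic changes whenever a different sample is drawn.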
Two kinds of variables:
Qualitative, or Attribute, or Categorical,
Variable: A variable that categorizes or
describes an element of a population.
Note: Arithmetic operations, such as addition
and averaging, are not meaningful for data
resulting from a qualitative variable.
Quantitative, or Numerical, Variable: A
variable that quantifies an element of a
population.
Note: Arithmetic operations, such as addition
and averaging, are meaningful for data
resulting from a quantitative variable.
Example: Identify each of the following examples as
attribute (qualitative) or numerical (quantitative)
variables.
The residence hall for each student in a statistics class.
(Attribute)
The amount of gasoline pumped by the next 10
customers at the local Savemore.
(Numerical)
The amount of radon in the basement of each of 25
homes in a new development.
(Numerical)
The color of the baseball cap worn by each of 20
students.
(Attribute)
The length of time to complete a mathematics
homework assignment.
(Numerical)
The state in which each truck is registered when
stopped and inspected at a weigh station.
(Attribute)
Qualitative and quantitative variables may be further
subdivided:
Variable
    Qualitative: Nominal, Ordinal
    Quantitative: Discrete, Continuous
Nominal Variable: A qualitative variable that
categorizes (or describes, or names) an element of a
population.
Nominal scales are used for labeling
variables, without any quantitative value.
“Nominal” scales could simply be called
“labels.”
Ordinal Variable: A qualitative variable that
incorporates an ordered position, or ranking.
-With ordinal scales, it is the order of the
values that is important and significant, but
the differences between them are not really
known.
-Ordinal scales are typically measures of non-numeric
concepts like satisfaction, happiness, discomfort, etc.
-Advanced note: The best way to determine central
tendency on a set of ordinal data is to use the mode or
median; the mean cannot be defined from an ordinal
set.
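As a rough sketch of the note above, the mode and median of ordinal data can be found without ever averaging the labels. The satisfaction scale and the responses are hypothetical, invented for illustration.

```python
from statistics import mode, median_low

# Ordered categories of a hypothetical satisfaction scale, lowest to highest.
levels = ["very unhappy", "unhappy", "neutral", "happy", "very happy"]
rank = {label: i for i, label in enumerate(levels)}

# Hypothetical survey responses (invented for illustration).
responses = ["happy", "neutral", "happy", "very happy", "unhappy",
             "happy", "neutral", "very unhappy", "happy"]

# Mode: the most frequent category; it does not even need the ordering.
most_common = mode(responses)

# Median: order the responses by rank and take the middle one.
# median_low always returns an actual data value, so no labels are averaged.
middle_rank = median_low(rank[r] for r in responses)
median_label = levels[middle_rank]

print("mode:  ", most_common)
print("median:", median_label)
```

The ranks are used only to put the responses in order; the gaps between them carry no meaning, which is why the mean is avoided.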
Discrete Variable: A quantitative variable that
can assume a countable number of values.
Intuitively, a discrete variable can assume
values corresponding to isolated points along a
line interval. That is, there is a gap between any
two values.
Discrete Data can only take certain values.
Example:
1. the number of students in a class
2. the results of rolling 2 dice
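The dice example can be made concrete with a short Python sketch. It assumes the variable of interest is the sum of the two faces, which the slide does not state explicitly.

```python
from itertools import product
from collections import Counter

# All 36 outcomes of rolling two six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Assumed variable: the sum of the two faces.
sums = Counter(a + b for a, b in outcomes)

# The variable can take only these isolated values; nothing lies between them.
possible_values = sorted(sums)
print(possible_values)  # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

A sum such as 7.5 can never occur, which is the "gap between any two values" that marks a discrete variable.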
Continuous Variable: A quantitative variable that can assume
an uncountable number of values. Intuitively, a continuous
variable can assume any value along a line interval, including
every possible value between any two values.
Continuous Data can take any value (within a range).
Examples:
1. a person's height: could be any value (within the range of
human heights), not just certain fixed heights
2. time in a race: you could even measure it to fractions of a
second
3. a dog's weight
4. the length of a leaf
Collecting Data
1. Data from a designed experiment (primary
data)
2. Data from a survey (primary data)
3. Data from an observational study (primary
data)
4. Data from a published source (secondary data)
Definition: Representative Sample
A representative sample exhibits characteristics
typical of those possessed by the target population.
The most common way to satisfy the representative
sample requirement is to select a random sample.
Definition: Random Sample
A random sample is selected so that every subset of
fixed size in the population has the same chance of
being included in the sample.
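One way to draw a random sample in practice is Python's standard `random.sample`, which selects without replacement so that each subset of the requested size is equally likely. The list of 50 faculty ID numbers below is a hypothetical sampling frame, invented for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers for 50 faculty members.
population = list(range(1, 51))

# Draw 10 distinct members; every 10-member subset has the same chance.
sample = random.sample(population, 10)

print(sorted(sample))
```

Each run gives a different sample unless a seed is set with `random.seed`.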
Frequency Distributions (cont.)
Bar graphs
Smooth curve
If the scores in the population are measured on an
interval or ratio scale, it is customary to present the
distribution as a smooth curve rather than a jagged
histogram or polygon.
The smooth curve emphasizes the fact that the
distribution is not showing the exact frequency for
each category.
Frequency distribution graphs
Shape
A graph shows the shape of the distribution.
A distribution is symmetrical if the left side of the
graph is (roughly) a mirror image of the right side.
One example of a symmetrical distribution is the bell-
shaped normal distribution.
On the other hand, distributions are skewed when
scores pile up on one side of the distribution, leaving a
"tail" of a few extreme values on the other side.
Positively and Negatively
Skewed Distributions
In a positively skewed distribution, the scores tend to
pile up on the left side of the distribution with the tail
tapering off to the right.
In a negatively skewed distribution, the scores tend
to pile up on the right side and the tail points to the
left.
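The direction of skew can also be checked numerically. The sketch below uses the moment coefficient of skewness (the third standardized moment), a measure the slides themselves do not define; it is brought in here only as an illustration, and the score lists are invented.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Third standardized moment: positive for a right tail, negative for a left tail."""
    m = mean(scores)
    s = pstdev(scores)
    return sum(((x - m) / s) ** 3 for x in scores) / len(scores)

# Hypothetical scores (invented for illustration).
right_tail = [1, 1, 2, 2, 2, 3, 3, 4, 9]   # piles up on the left, tail to the right
left_tail = [10 - x for x in right_tail]   # mirror image: tail to the left
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]    # left side mirrors right side

for name, data in [("right_tail", right_tail),
                   ("left_tail", left_tail),
                   ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3))
```

A positive result corresponds to a positively skewed distribution, a negative result to a negatively skewed one, and a result near zero to a roughly symmetrical one.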
Time Series
(Paired data)
A time series is a data set composed of quantitative entries
taken at regular intervals over a period of time.
e.g., The amount of precipitation measured each day for
one month.
Use a time series chart to graph the data.
[Chart sketch: quantitative data on the vertical axis, time on
the horizontal axis]
[Figure 2-8: Time-Series Graph, Number of Screens at Drive-In
Movie Theaters]
Graphing Qualitative Data Sets
Pie Chart
A circle is divided into sectors that
represent categories.
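In a pie chart, each sector's central angle is the category's relative frequency times 360 degrees. A minimal sketch of that arithmetic follows; the category counts are hypothetical, invented for illustration.

```python
# Hypothetical category counts (invented for illustration).
counts = {"Dorm A": 12, "Dorm B": 6, "Dorm C": 4, "Off campus": 2}
total = sum(counts.values())

# Central angle of each sector = relative frequency * 360 degrees.
angles = {category: 360 * n / total for category, n in counts.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```

The angles always add up to 360 degrees, so the sectors fill the whole circle.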
Pareto Chart
• A vertical bar graph in which the
height of each bar represents
frequency or relative frequency.
[Chart sketch: frequency on the vertical axis, categories on
the horizontal axis]
Constructing Pareto Charts
Create a bar for each category, where the height of the bar can
represent frequency or relative frequency.
The bars are often positioned in order of decreasing height,
with the tallest bar positioned at the left.
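The construction steps above can be sketched in Python as a plain-text chart. The cap colors and their frequencies are hypothetical, invented for illustration.

```python
# Hypothetical category frequencies (invented for illustration).
frequencies = {"Red": 5, "Blue": 9, "Black": 4, "Green": 2}
total = sum(frequencies.values())

# Order the categories by decreasing frequency, tallest bar first.
ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Print one row per bar: category, bar, frequency, relative frequency.
for category, freq in ordered:
    rel = freq / total
    print(f"{category:<6} {'#' * freq:<10} {freq:>2}  ({rel:.2f})")
```

Here the bars run across the page instead of up, so the first row plays the role of the leftmost (tallest) bar.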