Chapter 1

Uploaded by

ashraf helmy

What is Meant by Statistics?

 “Statistics is a way to get information from data.”


 Statistics is a tool for creating new understanding from a set of numbers.
 In the more common usage, statistics refers to numerical information.
 We often present statistical information in graphical form to capture the reader's attention
and to portray a large amount of information.
 Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting
numerical data to assist in making more effective decisions.
Why Study Statistics?
1. Numerical information is everywhere.
2. Statistical techniques are used to make decisions that affect our daily lives.
3. The knowledge of statistical methods will help you understand how decisions are made and
give you a better understanding of how they affect you.
 No matter what line of work you select, you will find yourself faced with decisions where
an understanding of data analysis is helpful.
Types of Statistics – Descriptive Statistics and Inferential Statistics
 Descriptive Statistics - methods of organizing, summarizing, and presenting data in an
informative way.
 Descriptive Statistics consists of methods for organizing, displaying, and describing data
by using tables, graphs, and summary measures.
 Descriptive statistics describes the data set being analyzed but does not, by itself, allow us
to draw conclusions about the larger population from which the data came.
 Inferential Statistics: A decision, estimate, prediction, or generalization about a population,
based on a sample.
 Inferential statistics involves taking a sample from a population and making estimates
about the population based on the sample results.
 Inferential statistics is also a set of methods, but it is used to draw conclusions about
characteristics of populations based on data from a sample.
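The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the scores are invented, and the variable names (`sample_scores`, `estimated_population_mean`) are assumptions, not part of the chapter.

```python
import statistics

# Hypothetical sample: exam scores of 8 students drawn from a larger class.
sample_scores = [72, 85, 90, 66, 78, 85, 94, 70]

# Descriptive statistics: summary measures of the data we actually have.
sample_mean = statistics.mean(sample_scores)
sample_median = statistics.median(sample_scores)
sample_stdev = statistics.stdev(sample_scores)  # sample standard deviation (n - 1)

# Inferential statistics: use the sample to say something about the population.
# Here the sample mean serves as a point estimate of the class-wide mean.
estimated_population_mean = sample_mean

print(f"mean={sample_mean}, median={sample_median}, stdev={sample_stdev:.2f}")
```

The first three values only describe these eight scores. Treating the sample mean as an estimate of the class-wide mean is the inferential step.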
Definition
 A population is the collection of all possible individuals, objects, or measurements of interest
whose characteristics are being studied.
 The population that is being studied is also called the target population.
 A sample is a portion, or part, of the population of interest.
 A survey that includes every member of the population is called a census.
 The technique of collecting information from a portion of the population is called a sample
survey.
 A sample that represents the characteristics of the population as closely as possible is called a
representative sample.
 A sample drawn in such a way that each element of the population has a chance of being
selected is called a random sample. If all samples of the same size selected from a
population have the same chance of being selected, we call it simple random sampling.
 Parameter: A number that describes a population characteristic.
 Statistic: A number that describes a sample characteristic.
 An element or member of a sample or population is a specific subject or object (for example, a
person, firm, item, state, or country) about which the information is collected.
 A variable is a characteristic under study that assumes different values for different
elements. In contrast to a variable, the value of a constant is fixed.
 The value of a variable for an element is called an observation or measurement.
 A data set is a collection of observations on one or more variables.
 A single observation is called a data point.
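The relationship between population, sample, parameter, and statistic can be illustrated with a small sketch. The population here is simulated (ages of 1,000 hypothetical employees), and the sample size of 50 is an arbitrary assumption.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of all 1,000 employees of a firm.
population = [random.randint(20, 65) for _ in range(1000)]

# Parameter: a number describing the whole population (requires a census).
population_mean = statistics.mean(population)

# Sample: a portion of the population; here 50 elements chosen at random.
sample = random.sample(population, 50)

# Statistic: a number describing the sample, used to estimate the parameter.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean) = {population_mean:.2f}")
print(f"statistic (sample mean)     = {sample_mean:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is computed in its place.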

Types of Variables
A. Qualitative or Attribute variable - the characteristic being studied is nonnumeric.
B. Quantitative variable - information is reported numerically. It can be classified as either
discrete or continuous.
 Discrete variable: can assume only certain values, and there are usually “gaps” between the values.
 Continuous variable: can assume any value within a specified range.
Cross-Section Versus Time-Series Data
 Data collected on different elements at the same point in time or for the same period of
time are called cross-section data.
 Data collected on the same element for the same variable at different points in time or for
different periods of time are called time-series data.
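As a rough illustration, the same set of records can be viewed either way. The firms, years, and sales figures are invented for this sketch.

```python
# Hypothetical figures, invented for illustration only.
# Each record is one observation: (element, year, value of the variable "sales").
records = [
    ("Firm A", 2021, 120), ("Firm B", 2021, 95), ("Firm C", 2021, 143),
    ("Firm A", 2022, 131), ("Firm A", 2023, 138),
]

# Cross-section data: different elements, same period of time.
cross_section = {firm: sales for firm, year, sales in records if year == 2021}

# Time-series data: same element and same variable, different periods of time.
time_series = {year: sales for firm, year, sales in records if firm == "Firm A"}

print(cross_section)
print(time_series)
```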

Sources of Data
Data may be obtained from
 Internal Sources
 External Sources
 Surveys and Experiments
Four Levels of Measurement
Nominal level - data that is classified into categories and cannot be arranged in any particular
order. EXAMPLES: eye color, gender
Properties:
1. Observations of a qualitative variable can only be classified and counted.
2. There is no particular order to the labels.
Ordinal level - data arranged in some order, but the differences between data values cannot be
determined or are meaningless.
Properties:
1. Data classifications are represented by sets of labels or names (high, medium, low) that
have relative values.
2. Because of the relative values, the data classified can be ranked or ordered.
Interval level - similar to the ordinal level, with the additional property that meaningful amounts of
differences between data values can be determined. There is no natural zero point.
Properties:
1. Data classifications are ordered according to the amount of the characteristic they possess.
2. Equal differences in the characteristic are represented by equal differences in the
measurements.
Ratio level - the interval level with an inherent zero starting point. Differences and ratios are
meaningful for this level of measurement.
 Practically all quantitative data is recorded on the ratio level of measurement.
 Ratio level is the “highest” level of measurement.
Properties:
1. Data classifications are ordered according to the amount of the characteristics they possess.
2. Equal differences in the characteristic are represented by equal differences in the numbers
assigned to the classifications.
3. The zero point is the absence of the characteristic and the ratio between two numbers is
meaningful.

Summary of Four Levels of Measurement

Why Know the Level of Measurement of Data?


 The level of measurement of the data dictates the calculations that can be done to
summarize and present the data.
 It also determines which statistical tests should be performed on the data.
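A minimal sketch of how the level of measurement limits the summary that makes sense. The data values are invented, and pairing each level with one particular summary (mode, median, mean, ratio) is a common textbook convention assumed here, not something the notes state.

```python
import statistics

# Hypothetical data at each level of measurement.
eye_color = ["brown", "blue", "brown", "green", "brown"]               # nominal
rating = ["low", "high", "medium", "medium", "high", "medium", "low"]  # ordinal
temperature_c = [18.0, 21.5, 19.0, 24.5]                               # interval
weight_kg = [60.0, 75.0, 90.0]                                         # ratio

# Nominal: categories can only be classified and counted, so report the mode.
most_common_color = statistics.mode(eye_color)

# Ordinal: labels can be ranked, so a median is meaningful once an order is given.
order = {"low": 0, "medium": 1, "high": 2}
ranked = sorted(rating, key=order.get)
median_rating = ranked[len(ranked) // 2]  # odd count, so take the middle item

# Interval: differences are meaningful, so a mean can be computed.
mean_temperature = statistics.mean(temperature_c)

# Ratio: a true zero point makes ratios meaningful as well.
weight_ratio = max(weight_kg) / min(weight_kg)

print(most_common_color, median_rating, mean_temperature, weight_ratio)
```

Computing a mean of eye colors, or a ratio of Celsius temperatures, would run into exactly the limits listed for those levels.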
Data Collection
Observational study
• A researcher observes and measures characteristics of interest of part of a population.
Experiment
• A treatment is applied to part of a population and responses are observed.
Simulation
• Uses a mathematical or physical model to reproduce the conditions of a situation or process.
• Often involves the use of computers.
Survey
• An investigation of one or more characteristics of a population.
• Commonly done by interview, mail, or telephone.
Sampling Techniques
Simple Random Sample
 Every possible sample of the same size has the same chance of being selected.
 Assign a number to each member of the population.
 Generate random numbers using a random number table, a software program, or a calculator.
 Members of the population that correspond to these numbers become members of the sample.
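The procedure can be sketched with Python's standard `random` module. The population of 20 numbered members and the sample size of 5 are assumptions made for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Assign a number to each member of a hypothetical population of 20.
population = {number: f"member_{number}" for number in range(1, 21)}

# Generate random numbers. random.sample draws without replacement and gives
# every possible sample of size 5 the same chance of being selected.
chosen_numbers = random.sample(sorted(population), 5)

# The members that correspond to those numbers become the sample.
sample = [population[number] for number in chosen_numbers]

print(sample)
```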
Stratified Sample
Divide a population into groups (strata) and select a random sample from each group.
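A possible sketch of stratified sampling follows. The strata (departments) are hypothetical, and taking 10% from each stratum is one allocation choice among several; the notes only require a random sample from each group.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population divided into strata by department.
strata = {
    "sales": [f"sales_{i}" for i in range(30)],
    "engineering": [f"eng_{i}" for i in range(50)],
    "support": [f"support_{i}" for i in range(20)],
}

# Select a simple random sample from each stratum (assumed: 10% of each group).
stratified_sample = []
for name, members in strata.items():
    size = max(1, len(members) // 10)
    stratified_sample.extend(random.sample(members, size))

print(stratified_sample)
```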
Cluster Sample
Divide the population into groups (clusters) and select all of the members in one or more, but
not all, of the clusters.
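Cluster sampling might look like the sketch below. The six branches of four employees each are invented, and choosing two clusters is an arbitrary assumption.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical population divided into clusters (branches of a firm).
clusters = {
    f"branch_{b}": [f"branch_{b}_employee_{i}" for i in range(4)]
    for b in range(1, 7)
}

# Select one or more, but not all, of the clusters at random.
chosen_clusters = random.sample(sorted(clusters), 2)

# Every member of each chosen cluster is included in the sample.
cluster_sample = [member for c in chosen_clusters for member in clusters[c]]

print(chosen_clusters)
```

Unlike a stratified sample, which draws a few members from every group, a cluster sample takes all members from a few groups.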
Systematic Sample
Choose a starting value at random. Then choose every kth member of the population.
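A short sketch of systematic sampling, assuming a population of 100 numbered members and an interval of k = 10 (both values are illustrative):

```python
import random

random.seed(5)  # fixed seed so the sketch is repeatable

population = list(range(1, 101))  # hypothetical members numbered 1..100
k = 10                            # sampling interval (assumed value)

# Choose a starting position at random within the first k members.
start = random.randrange(k)

# Then take every kth member from that starting point.
systematic_sample = population[start::k]

print(systematic_sample)
```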
Convenience Sample
A convenience sample is a non-probability sample taken from a group of people who are easy
to contact or reach.

Confidence & Significance Levels


 The confidence level is the proportion of times that an estimating procedure will
be correct.
E.g., a confidence level of 95% means that estimates based on this form of statistical inference
will be correct 95% of the time.
 When the purpose of the statistical inference is to draw a conclusion about a population, the
significance level measures how frequently the conclusion will be wrong in the long run.
 E.g. a 5% significance level means that, in the long run, this type of conclusion will be wrong
5% of the time.
If we use α (Greek letter “alpha”) to represent the significance level, then the confidence level is 1 - α.
This relationship can also be stated as: Confidence Level + Significance Level = 1
The confidence level and the significance level are known as “measures of reliability”.
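The "proportion of times" reading of the confidence level can be checked with a small simulation. This is a sketch under stated assumptions: a normal population with known mean 50 and standard deviation 10, samples of size 30, and a z-based interval for the mean. None of these specifics come from the notes.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the sketch is repeatable

mu, sigma = 50.0, 10.0   # hypothetical population mean and standard deviation
n, trials = 30, 2000     # sample size and number of repeated samples
alpha = 0.05             # significance level
confidence = 1 - alpha   # confidence level = 1 - alpha

# Critical value for a two-sided interval (about 1.96 when alpha = 0.05).
z = NormalDist().inv_cdf(1 - alpha / 2)
margin = z * sigma / n ** 0.5

hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = mean(sample)
    # The interval estimate is "correct" when it contains the true mean.
    if x_bar - margin <= mu <= x_bar + margin:
        hits += 1

coverage = hits / trials
print(f"confidence level = {confidence:.2f}, observed coverage = {coverage:.3f}")
```

The observed coverage should come out close to 0.95, and the share of intervals that miss the true mean should be close to α.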
