01 WEEK1 EA Statistics Part1
Week 1
Part 1
Some contact information
Course Leader:
• Tibor Takács (takacs.tibor@uni-corvinus.hu)
Seminar Teacher:
• Esteban Muñoz (esteban.munoz@uni-corvinus.hu)
Room:
• E.1.107
Office:
• C707/A (Laboratory for Networks, Technology & Innovation –
NETI Lab)
• Thursday: 09:00 – 12:00
Information, requirements
Students’ achievement in the course is assessed based on two
compulsory exams and project work, as follows:
Recommended literature
• Essentials of Business Analytics, Second Edition, 2017, Cengage
Learning. Jeffrey D. Camm, James J. Cochran, Michael J. Fry, Jeffrey W.
Ohlmann, David R. Anderson, Dennis J. Sweeney, Thomas A. Williams.
• Statistics for Business and Economics, 2011, South Western College
Publishing. David R. Anderson, Dennis J. Sweeney, Thomas A. Williams.
Detailed class schedule
Week 1: 17-21 February
Qualitative and quantitative data. Frequency distribution table: frequencies, relative frequencies,
cumulative frequencies, cumulative relative frequencies. Bar chart. Pie chart. Dot Plot. Histogram.
Ogive. (Unit I, 1-5)
Ratios, proportions, and rates. Types of ratios. Ratios are used for temporal, geographic, and across-
group comparisons. Chain rule for temporal ratios. Proportions. Rates. Comparison of rates in absolute
and relative terms: differences, ratios, percentage changes.
Measures of central tendency. Mean, mode, median, percentiles. Exploratory data analysis. Box-plot
diagram. How to use the box-plot diagram: depicting the distribution, assessing its range of dispersion,
and detecting outliers. (Unit I, 6-9)
Measures of variability: range, interquartile range, variance, standard deviation, mean absolute
deviation, coefficient of variation. Distribution shape and properties. Normal distribution as a point of
reference. Measures of distribution shape: skewness, kurtosis. Standardization. Use of z-scores for
detecting outliers.
Cross-tabulation for qualitative data. Row and column percentages. Joint percentages. Analysis of
heterogeneous populations with graphical tools: clustered and stacked bar chart. Association between
qualitative data: Cramer’s V. (Unit I, 10-13)
Week 4: 10-14 March
Relationship between a qualitative and a quantitative variable: between-to-total variance ratio (Eta-
squared), correlation ratio. The linear relationship between quantitative data: covariance and correlation.
Rank correlation.
Scatter-plot diagram. Grouped scatter-plot diagram. Fitting trend lines to scatter-plot diagrams. Simple
linear regression analysis, coefficients and interpretation, coefficient of determination, sample correlation
coefficient. (Unit I, 14-16)
Introduction to interval estimation. The margin of error. Interval estimation of a population mean. Interval
estimation of a population proportion. t distribution. Determining the necessary sample size. (Unit II, 5-7)
Week 7: 31 March - 4 April
Interval estimation of the difference between two population means. Independent and matched samples.
Interval estimation of the difference between two population proportions. Interval estimation of
population variance. Chi-square distribution. (U II, 8-10)
Midterm exam 1
Introduction to hypothesis testing. Developing null and alternative hypotheses. Type I and Type II errors.
Lower-tailed, upper-tailed, and two-tailed tests. Approaches to hypothesis testing. Decision rule. P-
value. z test about a population mean.
t test about a population mean. z test about a population proportion. Introduction to non-parametric test
procedures. Binomial test about a population proportion. Sign test about a population mean. (U II, 23-
25)
z and t tests about the difference between two population means. Welch's d test. t tests with
independent and matched samples. z test about the difference between two population proportions.
Hypothesis testing and decision making. Power of the test. Calculating the probability of a Type II error.
Determining the necessary sample size. (U II, 11-13) t and F tests of regression. (U II 23-25)
Non-parametric tests for independent and matched samples. Mann-Whitney U test about the difference
between two population means. Matched samples binomial test for stochastic monotonicity. Tests about
population variances. F distribution. (U II, 14-16)
Introduction to Chi-Square Tests. The goodness of fit test for multinomial population proportions, the
goodness of fit test for normal distribution, and the test of independence. Fisher's exact test. Introduction
to Analysis of Variance (ANOVA). Testing for the equality of k population means. ANOVA Table. (U II, 17-
20)
Index numbers. Price relatives. Weighted aggregate price indexes: Laspeyres, Paasche, Fisher.
Calculation of aggregate price indexes as weighted averages. Practical use of price indexes in
economics and business. How to deflate nominal monetary values using a price index. Quantity
indexes. Quantity relatives. Weighted aggregate quantity indexes: Laspeyres, Paasche, Fisher.
Decomposing the change in nominal monetary values as the product of aggregate price and quantity
indexes. (U II, 21-22)
Week 12: 19-23 May
Midterm exam 2
Project presentations
Project papers
• The project papers should be submitted in written form, and the results should be
presented in Week 12.
• The project teams include 4 or 5 students.
• The instructor must approve the chosen databases by 31 March.
• Each team must prepare an oral presentation and a written paper (3 pages per
student, incl. tables and figures).
• The written paper must have a standard structure (introduction, problem statement,
data and methodology, discussion of results, and references).
• The submission deadline for the papers is 20 May 2025.
Information on Project work
30 points can be attained by developing and presenting a research paper. Small work
groups should be created, within which the students must collaborate, submit a research
paper and present the main results of their research.
Week 1
Part 1
Qualitative and Quantitative Data
Data are classified by the nature of the variable, the type of data
representation, and the scale of measurement:
• Qualitative: nominal or ordinal scale
• Quantitative (numeric): interval or ratio scale
■ Nominal
Example:
Students of a university are classified by the
school in which they are enrolled using a
nonnumeric label such as Business, Humanities,
Education, and so on.
Alternatively, a numeric code could be used for
the school variable (e.g. 1 denotes Business,
2 denotes Humanities, 3 denotes Education, and
so on).
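The coding idea in this example can be sketched in Python. The codes 1 to 3 follow the slide; the eight observations are made-up illustrative data, and the helper names `frequency_table` and `modal_label` are hypothetical. The sketch shows that counting and the mode are meaningful for nominal codes, while arithmetic on the codes is not.

```python
from collections import Counter

# Coding from the slide's example: 1 = Business, 2 = Humanities, 3 = Education
SCHOOL_LABELS = {1: "Business", 2: "Humanities", 3: "Education"}

# Made-up school codes for eight students (illustrative data only)
codes = [1, 3, 1, 2, 1, 3, 2, 1]


def frequency_table(values):
    """Count how many students fall into each school, keyed by label."""
    counts = Counter(values)
    return {SCHOOL_LABELS[code]: n for code, n in sorted(counts.items())}


def modal_label(values):
    """Return the label of the most frequent code (the mode)."""
    code, _ = Counter(values).most_common(1)[0]
    return SCHOOL_LABELS[code]


table = frequency_table(codes)
print(table)               # {'Business': 4, 'Humanities': 2, 'Education': 2}
print(modal_label(codes))  # Business

# The mean of the codes can be computed (1.75 here), but it identifies
# no school: the numbers are only labels.
print(sum(codes) / len(codes))  # 1.75
```

Swapping the codes (for instance, 3 for Business and 1 for Education) would change the mean but leave the frequency table and the modal school unchanged.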
Scales of Measurement
■ Ordinal
Example:
Students of a university are classified by their
course performance using a nonnumeric label
such as Distinction, Merit, Pass or Fail.
Alternatively, a numeric code could be used for
the class standing variable (e.g. 1 denotes
Distinction, 2 denotes Merit and so on).
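A minimal Python sketch of why rank-based summaries suit ordinal codes. The codes 1 and 2 follow the slide; 3 = Pass and 4 = Fail are an assumed continuation of "and so on", the seven grades are made-up data, and the alternative coding `RECODE` is hypothetical. The median category stays the same under any order-preserving recoding.

```python
from statistics import median_low

# 1 and 2 come from the slide; 3 and 4 are assumed to continue the same order
GRADE_LABELS = {1: "Distinction", 2: "Merit", 3: "Pass", 4: "Fail"}

# Made-up course results for seven students (illustrative data only)
grades = [2, 1, 3, 2, 4, 3, 2]

# median_low always returns one of the observed codes, so it maps to a label
median_grade = GRADE_LABELS[median_low(grades)]

# A different, order-preserving coding with unequal gaps (hypothetical)
RECODE = {1: 10, 2: 20, 3: 50, 4: 90}
INVERSE = {new: old for old, new in RECODE.items()}
recoded = [RECODE[g] for g in grades]
median_after_recoding = GRADE_LABELS[INVERSE[median_low(recoded)]]

print(median_grade)           # Merit
print(median_after_recoding)  # Merit
```

Only the order of the codes carries information here; the size of the gaps between them is arbitrary, which is why the sketch relies on the median rather than on differences between codes.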
■ Interval
Example:
The maximum temperature on Tuesday was 18°C,
while on Wednesday it was only 12°C. The peak
on Tuesday was 6°C higher than on Wednesday.
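The temperature example can be checked with a short Python sketch. The two readings come from the slide; the conversion to Fahrenheit and the helper name `c_to_f` are added here for illustration. Differences convert consistently between the two units, whereas the ratio of the readings changes with the unit, which is what the missing natural zero implies.

```python
import math


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


tue_c, wed_c = 18.0, 12.0  # readings from the slide
tue_f, wed_f = c_to_f(tue_c), c_to_f(wed_c)

diff_c = tue_c - wed_c   # 6.0 degrees Celsius
diff_f = tue_f - wed_f   # about 10.8 degrees Fahrenheit, i.e. 6 * 9/5

ratio_c = tue_c / wed_c  # 1.5
ratio_f = tue_f / wed_f  # about 1.2, not 1.5

print(diff_c, round(diff_f, 1))
print(ratio_c, round(ratio_f, 3))
print(math.isclose(diff_f, diff_c * 9 / 5))  # True: differences agree
print(math.isclose(ratio_c, ratio_f))        # False: ratios do not
```

So "6°C higher" is a meaningful statement on this scale, while "1.5 times as warm" is not.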
■ Ratio
Nominal:
Data are labels or names used to identify an attribute of the element (colour,
religion, family type, etc.)
Ordinal:
The rank of the data is meaningful (grade, positions in a competition, etc.)
Interval:
The distance between observations is expressed in terms of a fixed unit – the
scale does not have a natural zero point (time, temperature)
Ratio:
The ratio of two values is meaningful (height, weight, speed, etc.); this scale
must have a natural zero point.
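As a rough illustration of the ratio scale, the sketch below uses two made-up heights and converts them from centimetres to inches (the helper name `cm_to_inch` is hypothetical). Because both units share the same natural zero, the ratio of the two values does not depend on the unit.

```python
import math

CM_PER_INCH = 2.54  # exact by definition of the inch


def cm_to_inch(cm):
    """Convert a length from centimetres to inches."""
    return cm / CM_PER_INCH


# Made-up heights of an adult and a child (illustrative data only)
adult_cm, child_cm = 180.0, 90.0

ratio_cm = adult_cm / child_cm
ratio_inch = cm_to_inch(adult_cm) / cm_to_inch(child_cm)

print(ratio_cm)                            # 2.0
print(math.isclose(ratio_cm, ratio_inch))  # True
```

The statement "twice as tall" therefore holds in any unit of length, unlike a ratio of two temperatures in degrees Celsius.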
Unit 2 Data Acquisition and Analysis
■ Existing Sources
Time Requirement
• Searching for information can be time-consuming.
• Information may no longer be useful by the time it
is available.
Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
Data Errors
• Using any data that happen to be available or
that were acquired with little care can lead to poor
and misleading information.
Data Sources
■ Statistical Studies
• Google Account
• Applications
• Google Drive
• New
• Forms
https://docs.google.com/forms
PLANNED SURVEY QUESTIONS
What gender are you?
What color is your hair?
How tall are you? (in cm)
How much do you weigh? (in kg)
When were you born?
Age (in months)
How many siblings do you have?
What kind of locality did you live in as a child? (at the age of 8)
What is the population (in thousands) of the locality in which you lived as a child? (at the age of 8)
Which program are you enrolled in?
Have you ever studied statistics before? (in high school or college)
How would you rate your proficiency in using Excel?
What kind of animal would you most like to be reborn as in your next life?
If costs didn't matter, in which country would you most like to live for the next 3 months?
What would be your 1st preferred drink at a party?
What would be your 2nd preferred drink at a party?
QUESTIONS - answers
What gender are you?
male / female
What color is your hair?
black
dark brown
light brown
blond
red
gray
I've no hair
other:
What kind of locality did you live in as a child? (at the age of 8)
Township or village
What is the population (in thousands) of the locality in which you lived as a
child? (at the age of 8)
Have you ever studied statistics before? (in high school or college)
yes / no
If costs didn't matter, in which country would you most like to live
for the next 3 months?
3 separate items
Items:
I prefer to openly discuss my feelings and experiences with my friends rather
than keep them to myself.
Even my friends are unaware of my innermost feelings because I rarely express
how I think or feel.
I prefer to remain distant and detached with people.
I feel uncomfortable disclosing myself to other people, even to my friends.
I can cope better with nervousness in my friends' company than alone.
I prefer to keep my problems to myself.
Response categories:
1: very untrue of me
2: somewhat untrue of me
3: neutral
4: somewhat true of me
Can you open this link to our survey?
https://forms.gle/mPGkGsQW9q7ptxGf7