Estadistica Analisi

Uploaded by

gallardinii04

Quantitative Methods

Òscar Coromina
Population and Sample

A population is a collection of objects (the data of interest). A sample is any subset of the population. When the sample consists of the whole population, we use the term census.
Mean, Median and Mode

Several concepts and calculations try to identify the central value, or “average”, of a population.
Mean

The “mean”, also referred to as the “average”, is obtained by adding all the values and dividing by the number of values. In our example the mean salary of this group of students is 120.200 kr.

Name      Monthly Salary
Annika    1000000
Bojan     50000
Ayman     45000
Miguel    30000
Lars      23000
Malin     21000
Anders    20000
Ulla      5000
Erik      5000
Monika    3000
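As a minimal sketch, the calculation can be reproduced in Python. The variable names are illustrative and not part of the original slides.

```python
# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]

# Mean: add all the values, then divide by the number of values.
manual_mean = sum(salaries) / len(salaries)

print(manual_mean)  # 120200.0
```

Note how a single very large salary pulls the mean far above what most of the group earns.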
Median

The “median” is the middle value of a data set that has been arranged in order. With an even number of values, as here, it is the average of the two middle values (21000 and 23000). In our data set the median is 22.000 kr.
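A short sketch of the median calculation, assuming the same ten salaries. The manual branch is written out only to show the logic; the standard library's `statistics.median` does the same thing.

```python
from statistics import median

# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]

ordered = sorted(salaries)
n = len(ordered)

if n % 2 == 0:
    # Even number of values: average the two middle ones.
    manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
else:
    # Odd number of values: take the single middle one.
    manual_median = ordered[n // 2]

print(manual_median)     # 22000.0
print(median(salaries))  # 22000.0
```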
Mode

The “mode” is the value that appears most often in a data set. In our case it is 5.000 kr.
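One possible way to find the mode is to count how often each value appears. This is a sketch with illustrative names; in this data set only one value repeats, so there is no tie to resolve.

```python
from collections import Counter

# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]

# Count the occurrences of each value and take the most frequent one.
counts = Counter(salaries)
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)  # 5000 2
```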
Standard Deviation

The standard deviation is a measure that accounts for the variation and dispersion of a set of data. A lower standard deviation reflects that values are expected to be closer to the mean. In this example the standard deviation is extremely high: 309.546 kr.
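The slide does not say which formula was used. The sketch below assumes the sample standard deviation (dividing by n - 1), because that is the variant that reproduces the slide's figure; the population variant (dividing by n) gives a smaller number.

```python
from math import sqrt
from statistics import pstdev, stdev

# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]

n = len(salaries)
m = sum(salaries) / n

# Sample variance: squared distances from the mean, divided by n - 1 (assumption).
sample_variance = sum((x - m) ** 2 for x in salaries) / (n - 1)
sample_sd = sqrt(sample_variance)

print(sample_sd)         # approximately 309546.9
print(stdev(salaries))   # same result from the standard library
print(pstdev(salaries))  # population variant (divides by n), smaller
```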
Trimmed Mean

In some distributions, calculating central values is particularly unreliable. In these cases we can use a trimmed mean, which excludes the extremes of the data. In the example, the trimmed mean excluding 20% of our sample (the highest and the lowest value) is 24.875 kr.
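A minimal sketch of a trimmed mean. It assumes that "excluding 20%" means dropping 20% of the data points in total, split evenly between the two ends, since that convention reproduces the slide's result. The function `trimmed_mean` is hypothetical, written for this illustration.

```python
# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the points, half from each end."""
    ordered = sorted(values)
    # Number of points removed from each end (assumed convention).
    k = int(len(ordered) * proportion / 2)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


print(trimmed_mean(salaries, 0.2))  # 24875.0
```

With ten values and a proportion of 0.2, one value is removed from each end, so the 1000000 salary no longer dominates the result.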
Quartiles

Quartiles are values that divide the data into quarters. Following the same logic as the median, the data must be ordered: Q1 separates the lowest 25%, Q2 (the median) splits the data at 50%, and Q3 separates the highest 25%. In our data set Q1 is 8.750 and Q3 is 41.250.
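There are several conventions for computing quartiles, and the slides do not name one. As a sketch, the "inclusive" method of Python's `statistics.quantiles` (linear interpolation between neighbouring values) reproduces the figures shown here; other methods can give slightly different cut points.

```python
from statistics import quantiles

# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]

# n=4 asks for the three cut points that split the data into quarters.
# method="inclusive" is an assumption chosen to match the slide's values.
q1, q2, q3 = quantiles(salaries, n=4, method="inclusive")

print(q1, q2, q3)  # 8750.0 22000.0 41250.0
```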
Five-number summary

A common and very useful method in the preliminary investigation of a data set (e.g. for assessing what kind of distribution it presents) is the five-number summary: the minimum and maximum values, the lower and upper quartiles, and the median.

Min: 3.000
Q1: 8.750
Median: 22.000
Q3: 41.250
Max: 1.000.000
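The five numbers can be collected in one small helper. This is a sketch: `five_number_summary` is a hypothetical function, and the "inclusive" quartile method is an assumption chosen because it matches the values on the slide.

```python
from statistics import quantiles

# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]


def five_number_summary(values):
    """Return min, Q1, median, Q3 and max as a dictionary."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": max(values),
    }


summary = five_number_summary(salaries)
for name, value in summary.items():
    print(f"{name}: {value}")
```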
Whisker Plots

Boxplots (aka whisker plots) are one of the graphic methods to visualize distributions of numerical data through their quartiles and the five-number summary.

[Figure: boxplot with, from top to bottom, Max, Q3, Median, Q1 and Min marked.]
Tukey, J. W. (1977). Exploratory data analysis (Vol. 2, . 40)
Outliers

An outlier is a data point that differs significantly from the others. In this whisker plot, outliers are presented as dots.

[Figure: boxplot labelled, from top to bottom: Outliers, Max without outliers, Q3, Median, Q1, Min.]
https://en.wikipedia.org/wiki/Box_plot#/media/File:Box-Plot_mit_Interquartilsabstand.png
Outliers

There are several methods for identifying outliers. The most common and simple one is to consider as an outlier each data point that lies more than 1.5 times the IQR below Q1 or above Q3. IQR stands for interquartile range: Q3 - Q1.
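Applied to the salary data from the earlier slides, the 1.5 × IQR rule can be sketched as follows. The quartile method ("inclusive") is an assumption that matches the Q1 and Q3 values given on the quartile slide, and the fence names are illustrative.

```python
from statistics import quantiles

# Monthly salaries (kr) from the slide's table.
salaries = [1000000, 50000, 45000, 30000, 23000, 21000, 20000, 5000, 5000, 3000]

q1, _, q3 = quantiles(salaries, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: Q3 - Q1

# Points beyond these fences are treated as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in salaries if x < lower_fence or x > upper_fence]

print(iqr)                       # 32500.0
print(lower_fence, upper_fence)  # -40000.0 90000.0
print(outliers)                  # [1000000]
```

Under these assumptions only the salary of 1000000 falls outside the fences, which is consistent with how strongly it distorts the mean and the standard deviation.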
Linear Scale Log Scale

https://www.ncbi.nlm.nih.gov/core/lw/2.0/html/tileshop_pmc/tileshop_pmc_inline.html?title=Click%20on%20image%20to%20zoom&p=PMC3&id=7200843_S000842392000030X_fig1.jpg
Anscombe’s quartet

https://en.wikipedia.org/wiki/Anscombe%27s_quartet
[Figure: flow diagram with the axes Challenge/Expectations and Skills/resources and the regions Anxiety/Frustration, Flow and Boredom.]
Why research questions?
• Data analysis, as a research method, adds to our repertoire of
available approaches – more choices, more decisions.
• The research question helps us with making the decisions
needed to produce a workable brief by orienting us towards
specific approaches, datasets, methods, types of visualization,
etc.
• The research question also holds us accountable, because it
constitutes a benchmark for our analysis: Are we really
answering the question?
• Research questions are an evolving process rather than something
finished and unchanging.
Different types of research questions
• There are many different types of questions we can ask, for example:
  • Exploratory or descriptive ("What is happening here?")
  • Explanatory or causal ("Why is this happening?")
  • Interpretative ("What is the meaning of this?")
  • Operational ("What decision should I make?")
  • Strategic ("What should my course of action be?")

• In practice, a research/analysis project often combines aspects of these types of
questions.

• Even though the analysis will be guided by the questions, there are still many
decisions concerning the how: Qualitative? Quantitative? Comparative? Statistics?
Graph theory? Theoretical framework?
