Tableau by Mughammd Asrol Smallsize
Shade Tint
Using Hue
Qualitative / Categorical
Using Saturation
How to install?
Go to https://www.tableau.com/products/public/download
Download the installer
Install it on your computer
Tableau: Basic
Tableau Workflow
Input data or file
Tableau Public User Interface
Connecting a data source to Tableau
The dataset can be in many file formats: CSV, XLS, JSON, PDF, etc.
For this tutorial, we will start with an Excel or CSV dataset file.
Save the dataset file on your local computer, then import it into Tableau as follows.
Dataset from an Excel file on the local computer
1. Select the first table
2. Go to the worksheet
View your data
Please try again with another dataset, using a CSV file as the source
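Before connecting a CSV file to Tableau, it can help to check its columns and first rows. The sketch below is one way to do that with Python's standard `csv` module. The file content and column names (`order_id`, `country`, `category`, `sales`) are made up for illustration and are not the tutorial's dataset.

```python
import csv
import io

# Hypothetical CSV content standing in for a local file such as "orders.csv".
csv_text = """order_id,country,category,sales
1,UK,Furniture,120.5
2,UK,Technology,300.0
3,France,Furniture,80.0
"""

# csv.DictReader works on any file-like object. For a real file, replace
# io.StringIO(csv_text) with open("orders.csv", newline="").
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Column names, as they would appear in the Tableau data pane.
print(list(rows[0].keys()))

# Preview the rows, similar to "View your data" in Tableau.
for row in rows:
    print(row)
```

Note that `csv.DictReader` returns every value as a string, whereas Tableau infers a data type for each column when it connects.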
Our dataset for this section
Data elements content
Independent variables: Categorical / Dimensions
Dependent variables: Numerical / Measures
Data field
Data types
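The split between categorical fields (dimensions) and numerical fields (measures) can be illustrated with a small sketch. It assumes a simple rule: a column whose values are all numbers is treated as a measure, and anything else as a dimension. This is a simplification for illustration, not a description of Tableau's internal logic, and the rows and column names are hypothetical.

```python
# Hypothetical rows; column names and values are illustrative only.
rows = [
    {"country": "UK", "category": "Furniture", "sales": 120.5, "quantity": 3},
    {"country": "UK", "category": "Technology", "sales": 300.0, "quantity": 1},
    {"country": "France", "category": "Furniture", "sales": 80.0, "quantity": 2},
]


def classify_fields(rows):
    """Split column names into dimensions (categorical) and measures (numerical)."""
    dimensions, measures = [], []
    for name in rows[0]:
        values = [row[name] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        (measures if is_numeric else dimensions).append(name)
    return dimensions, measures


dimensions, measures = classify_fields(rows)
print(dimensions)  # ['country', 'category']
print(measures)    # ['sales', 'quantity']
```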
Combine dimensions and measures
Make your first graph
You can choose any type of chart
Right-click the green pill on the Columns shelf; this field was taken from the dimensions
Analysis without aggregation
Time-series dataset with detail
Change the row settings
Area chart for easy investigation
Filter
Quick filter
Double-click on the dataset and add another table
Connect dataset
Rename the dimension to ‘hierarchy’
Decompose the country
Inner join
Left join
We will join all three sheets: list of orders, order breakdown, and target sales.
Joining list of orders with order breakdown runs smoothly, but joining target sales is
difficult because it has a different level of granularity.
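The difference between an inner join and a left join can be sketched in plain Python. The tables below are hypothetical stand-ins for the list of orders and the order breakdown. The `join` helper is written only for this illustration and is not part of Tableau or any library.

```python
# Hypothetical tables; names and values are illustrative only.
orders = [
    {"order_id": 1, "customer": "Ann"},
    {"order_id": 2, "customer": "Budi"},
    {"order_id": 3, "customer": "Cara"},
]
breakdown = [
    {"order_id": 1, "product": "Chair", "sales": 100},
    {"order_id": 1, "product": "Desk", "sales": 250},
    {"order_id": 2, "product": "Lamp", "sales": 40},
]


def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is 'inner' or 'left'."""
    right_columns = [c for c in right[0] if c != key]
    result = []
    for l_row in left:
        matches = [r for r in right if r[key] == l_row[key]]
        if matches:
            # One output row per matching pair of rows.
            result.extend({**l_row, **m} for m in matches)
        elif how == "left":
            # Keep the unmatched left row; right-hand columns become None.
            result.append({**l_row, **dict.fromkeys(right_columns)})
    return result


inner = join(orders, breakdown, "order_id", how="inner")
left_joined = join(orders, breakdown, "order_id", how="left")

print(len(inner))        # 3: order 3 has no breakdown rows and is dropped
print(len(left_joined))  # 4: order 3 is kept with empty breakdown columns
```

The same sketch hints at the granularity problem. If a target table holds one row per category, joining it to order-level rows repeats each target value on every matching order, so summing the target afterwards overstates it.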
• Start your
visualization with
airline 1, like this.
Joining dataset
• Now, add
‘revenue’ from
the second
dataset
• You will see a
‘sign’ indicating that
the data is joined.
Joining on multiple fields
• Rename ‘year’ to
‘period’
• Your data will be
updated.
Primary
Secondary
What happens if you start joining and blending
the data from Airline2?
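One way to reason about the primary and secondary roles is to treat a blend as two steps: aggregate the secondary source to the level of the linking field, then attach the result to every row of the primary source, as in a left join. The sketch below follows that model. The values are hypothetical, and the field names `period` and `revenue` are borrowed from the steps above purely for illustration.

```python
from collections import defaultdict

# Hypothetical data: the primary source has one row per period,
# the secondary source has several rows per period.
primary = [
    {"period": 2020, "passengers": 500},
    {"period": 2021, "passengers": 650},
    {"period": 2022, "passengers": 700},
]
secondary = [
    {"period": 2020, "revenue": 10.0},
    {"period": 2020, "revenue": 15.0},
    {"period": 2021, "revenue": 30.0},
]


def blend(primary, secondary, link, measure):
    """Sum `measure` per `link` value in the secondary source, then attach it
    to every primary row (left-join behaviour)."""
    totals = defaultdict(float)
    for row in secondary:
        totals[row[link]] += row[measure]
    # .get() returns None for periods missing from the secondary source.
    return [{**row, measure: totals.get(row[link])} for row in primary]


blended = blend(primary, secondary, link="period", measure="revenue")
for row in blended:
    print(row)
```

In this model every primary row survives, while secondary rows without a match in the primary source never appear. That is why the result can change when the other dataset is used as the starting point.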
Data by category
Data and its target
• In this section, we will create a calculated field that combines fields from different
data sources.
• We will enrich our previous visualization with a deeper analysis by adding
a bar chart that shows the difference between the actual data and the target.
• First, we combine our 3 visualizations into a single visualization
using the filter function.
• We reuse our previous data and visualization.
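The calculation behind such a field is a per-category difference between the actual value and the target. The sketch below shows that arithmetic in Python. The category names and numbers are hypothetical, and `difference_to_target` is a helper defined only for this example.

```python
# Hypothetical actual sales and targets by category (illustrative values).
actual = {"Furniture": 1200.0, "Technology": 1500.0, "Office Supplies": 900.0}
target = {"Furniture": 1000.0, "Technology": 1600.0, "Office Supplies": 900.0}


def difference_to_target(actual, target):
    """Return actual minus target for every category present in both sources."""
    return {k: actual[k] - target[k] for k in actual if k in target}


diff = difference_to_target(actual, target)
for category, value in diff.items():
    status = "above" if value > 0 else "below" if value < 0 else "on"
    print(f"{category}: {value:+.1f} ({status} target)")
```

A positive value means the actual figure exceeded the target, which is what the bars in the difference chart would show.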
Filtered dataset visualization
Advanced calculated field
P1-UK-Bank-Customers
Geographical Dataset
• Go to each sheet
• Go to Worksheet > Tooltip and set it to
unchecked
Project Preparation
Data pre-processing
• Please choose one dataset from kaggle.com or another relevant
source.
• Prepare a PPT that explains the project you will work on as your
final assignment.
• The following is the PPT content you need to prepare today.
• Provide the project title, your name, and your NIM. The title must fit the context of the
data you will use.
• Provide the outline you will present in the PPT.
• Provide a summary of the data table.
Data pre-processing
• Motivation
To better understand the data: central tendency, variation, and
spread
Dispersion
Central tendency
Measuring Central Tendency
• Mean:
$\text{sample mean} = \dfrac{\text{sum of all values in the sample}}{\text{number of values in the sample}}$
Example
PT X is a car manufacturing company. Below are 8 samples of car production time:
90 88 84 91
84 85 80 89
$\text{sample mean} = \dfrac{90 + 88 + \dots + 89}{8} = \dfrac{691}{8} = 86.375$
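The mean of the eight production-time values can be checked with a few lines of Python. This is a minimal sketch; the variable names are chosen only for this illustration.

```python
from statistics import mean

# The eight production-time samples from the PT X example.
production_times = [90, 88, 84, 91, 84, 85, 80, 89]

total = sum(production_times)  # sum of all values in the sample
n = len(production_times)      # number of values in the sample
sample_mean = total / n

print(total, n, sample_mean)   # 691 8 86.375

# The standard library gives the same result.
assert sample_mean == mean(production_times)
```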
Measuring Central Tendency
• Median:
• The midpoint of the values after they have been ordered from the
smallest to the largest, or the largest to the smallest.
• Middle value if odd number of values,
or average of the middle two values otherwise
• For grouped data, estimated by interpolation:
$\text{median} = L_1 + \left(\dfrac{n/2 - \left(\sum \text{freq}\right)_l}{\text{freq}_{\text{median}}}\right) \times \text{width}$
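Both forms of the median can be sketched in Python. The first part uses the eight production-time values from the mean example. The second part implements the interpolation formula under the usual reading of its symbols: L1 is the lower boundary of the median class, n the total count, (Σ freq)_l the total frequency of all classes below the median class, freq_median the frequency of the median class, and width the class width. The grouped data is hypothetical, and `grouped_median` is a helper written for this example.

```python
from statistics import median

# Ungrouped data: an even number of values, so the median is the average
# of the two middle values (85 and 88) after sorting.
production_times = [90, 88, 84, 91, 84, 85, 80, 89]
print(median(production_times))  # 86.5


def grouped_median(lower_bounds, width, freqs):
    """Estimate the median of grouped data by linear interpolation.

    lower_bounds: lower boundary of each class, in ascending order
    width:        common class width
    freqs:        frequency of each class
    """
    n = sum(freqs)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive count")
    cumulative = 0  # total frequency of the classes below the current one
    for lower, freq in zip(lower_bounds, freqs):
        if cumulative + freq >= n / 2:
            # This is the median class: interpolate inside it.
            return lower + (n / 2 - cumulative) / freq * width
        cumulative += freq
    raise ValueError("lower_bounds and freqs must have the same length")


# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40.
estimate = grouped_median([0, 10, 20, 30], width=10, freqs=[5, 8, 12, 5])
print(round(estimate, 2))  # 21.67
```

Here n = 30, so n/2 = 15 falls in the 20-30 class, with 13 observations below it: 20 + (15 - 13) / 12 × 10 ≈ 21.67.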
[Figure: positions of the mean, median, and mode in symmetric and skewed distributions]
Variance: the arithmetic mean of the squared deviations from the mean.
• The variance and standard deviation are nonnegative and are zero only if all
observations are the same.
• For populations whose values are near the mean, the variance and standard
deviation will be small.
• For populations whose values are dispersed from the mean, the population
variance and standard deviation will be large.
• The variance overcomes the weakness of the range by using all the values in the
population.
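The definition above can be checked numerically. The sketch below treats the eight production-time values from the mean example as a whole population, which is an assumption made only for this illustration. A sample variance would divide by n - 1 instead of n.

```python
from math import sqrt
from statistics import pstdev, pvariance

# The eight production-time values, treated here as a full population.
values = [90, 88, 84, 91, 84, 85, 80, 89]

mu = sum(values) / len(values)  # population mean
# Arithmetic mean of the squared deviations from the mean.
variance = sum((x - mu) ** 2 for x in values) / len(values)
std_dev = sqrt(variance)

print(variance)  # 12.234375
print(std_dev)

# The standard library agrees with the manual calculation.
assert abs(variance - pvariance(values)) < 1e-9
assert abs(std_dev - pstdev(values)) < 1e-9

# Variance is zero only when all observations are the same.
assert pvariance([5, 5, 5]) == 0
```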
Measuring Dispersion of Data
Standard Deviation and Distribution
Symmetric vs. Skewed Data
• Median, mean and mode of symmetric,
positively skewed, and negatively skewed data
August 15, 2024 Data Mining: Concepts and Techniques
Content of your slides
• Title, name, NIM, data sources
• Slides outline
• Dataset and sources + analysis
• Data pre-processing
• Statistical description of the dataset
• Visualization plan for the dataset, mentioning the attributes to use
• Data visualization 1 + analysis
• Data visualization 2 + analysis
• Data visualization 3 + analysis
• Data visualization 4 + analysis
• Data visualization 5 + analysis
• Data visualization …n + analysis
• Dashboard + analysis and settings
• Data storytelling + analysis
• Key points of the dataset
• Link to the Tableau project
• Link to the video recording of the data storytelling
• Conclusion