DVL Notes

Data Visualization is the process of converting data into visual formats to enhance understanding and analysis, aiding in decision-making across various fields. The methodology involves stages such as understanding objectives, data collection, cleaning, analysis, and effective design, culminating in sharing the visualizations. Tools like Tableau, Power BI, and Google Data Studio facilitate this process, while best practices ensure clarity, accuracy, and audience engagement.

Module 1

Data Visualization Definition and Methodology

Data Visualization is the process of converting data into visual formats like charts, graphs, maps,
and dashboards so that it becomes easy to understand and analyse. It helps people quickly see
patterns, trends, and outliers in data, which can be hard to find in raw numbers or spreadsheets.

In simple words, Data Visualization helps to tell a story using visuals instead of only numbers or
text.

Importance of Data Visualization:

 Helps in faster decision making.

 Makes complex data simple and easy to understand.

 Useful in business, science, healthcare, and many other fields.

 Makes it easier to spot problems or opportunities.

 Helps people who are not data experts understand the insights.

Example:

Imagine a company has 10,000 sales records. By using a bar chart, they can easily see which product
sold the most. Without a chart, going through the records manually would take a lot of time.
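The idea can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the record layout (product name, units sold) and the sample values are invented, and a plain-text bar stands in for a real chart.

```python
from collections import defaultdict

# Hypothetical sales records: (product, units sold). Real data would have thousands of rows.
records = [
    ("Laptop", 3), ("Phone", 5), ("Tablet", 2),
    ("Phone", 4), ("Laptop", 1), ("Phone", 2),
]

def total_units_by_product(rows):
    """Sum the units sold for each product."""
    totals = defaultdict(int)
    for product, units in rows:
        totals[product] += units
    return dict(totals)

def text_bar_chart(totals):
    """Return one line per product, longest bar first."""
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [f"{name:<8}{'#' * value} {value}" for name, value in ordered]

totals = total_units_by_product(records)
for line in text_bar_chart(totals):
    print(line)
```

The longest bar appears first, so the best-selling product is visible at a glance without reading any individual record.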

Methodology of Data Visualization

1. Understanding the Objective of Visualization

Before starting the visualization, it is important to clearly understand the purpose of creating the
visual. This means identifying what message needs to be communicated and who the target
audience is. For example, a business manager might want to see monthly sales figures, while a
researcher might be interested in understanding trends or patterns in the data. Knowing the
objective helps in selecting the right data and visualization type.

2. Collecting the Relevant Data

In this step, the required data is collected from suitable and trusted sources such as spreadsheets,
databases, online platforms, or surveys. The data should be relevant to the objective identified in the
previous step. If the data is outdated or incorrect, it can lead to misleading visualizations and wrong
conclusions.

3. Cleaning and Preparing the Data

Once the data is collected, it must be cleaned to remove any errors, duplicates, or missing values.
During this step, the data is also arranged in a suitable format so that it can be used easily for
analysis. Clean and well-prepared data ensures that the final visualizations are accurate and
meaningful.
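A minimal sketch of this step in Python is shown here. The row layout and values are hypothetical, and dropping incomplete rows is just one possible choice; in practice missing values are sometimes filled in instead.

```python
# Hypothetical raw rows: (order_id, product, amount). None marks a missing value.
raw_rows = [
    (1, "Phone", 250.0),
    (2, "Laptop", None),      # missing amount
    (1, "Phone", 250.0),      # exact duplicate of the first row
    (3, " tablet ", 120.0),   # inconsistent spacing and capitalisation
]

def clean(rows):
    """Drop rows with missing values, normalise text, and remove duplicates."""
    seen = set()
    cleaned = []
    for order_id, product, amount in rows:
        if product is None or amount is None:
            continue  # skip incomplete rows
        row = (order_id, product.strip().title(), amount)
        if row in seen:
            continue  # skip rows already kept
        seen.add(row)
        cleaned.append(row)
    return cleaned

clean_rows = clean(raw_rows)
print(clean_rows)
```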

4. Analyzing the Data for Insights

After cleaning the data, it is carefully analyzed to find patterns, relationships, trends, or unusual
values. This step helps in understanding what the data is trying to tell. The analysis provides a base
for choosing the most appropriate visualization method and for deciding what parts of the data
should be highlighted.

5. Choosing the Right Type of Visualization

Based on the analysis and the objective, a suitable type of chart or graph is selected. For example, a
bar chart is useful for comparing different items, while a line chart is better for showing trends over
time. Choosing the right type of visualization is important because it directly affects how clearly the
information is understood by the audience.

6. Designing the Visualization Effectively

The visualization should be designed in a way that is simple, clear, and easy to read. Proper titles,
labels, legends, and color codes should be added to make the visual more informative. Unnecessary
decorations or too many colors should be avoided because they can confuse the viewers. A well-
designed visualization helps communicate the message quickly and effectively.

7. Adding Interactivity (if needed)

If the visualization is going to be used in a dashboard or online platform, interactive features such as
filters, tooltips, or clickable items can be added. Interactivity allows users to explore the data further
and view the specific details they are interested in. This makes the visualization more useful and
flexible.

8. Testing and Reviewing the Visualization

Before finalizing the visualization, it should be reviewed for any errors or design problems. The
creator should check if the data is correct, if the chart is readable, and if the message is clear. It is
also helpful to ask someone else to review it and give feedback. This step ensures that the final result
is of good quality.

9. Sharing or Publishing the Visualization

The final visualization can be shared through presentations, printed reports, websites, or
dashboards. It should be made available in a format that suits the audience. The goal is to ensure
that people can view, understand, and use the information in a meaningful way.

Seven stages of Data visualization

Data visualization is a process of turning raw data into graphical formats like charts, graphs, and
maps. It helps people understand patterns, trends, and insights quickly and easily.

1. Data Acquisition

In this first stage, the required data is collected from reliable sources. These sources can be
spreadsheets, databases, surveys, online platforms, or APIs. It is important to gather only the data
that is relevant to the purpose of the visualization. If the data collected is incorrect or incomplete, it
will affect the entire visualization process.

2. Data Cleaning

After data is collected, it must be cleaned to remove any errors, missing values, duplicates, or
formatting issues. This step is very important because raw data often contains mistakes that can lead
to wrong results. Data cleaning ensures that the dataset is correct, complete, and ready for analysis.

3. Data Transformation

Once the data is clean, it needs to be transformed into a suitable format for visualization. This may
involve converting data types, creating new columns, grouping data, or summarizing it using averages
or totals. The main goal of this step is to prepare the data in a way that supports effective
visualization.
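The operations named above (converting data types, creating a new column, grouping, and summarizing) might look like this in Python. The field names and figures are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows as they might arrive from a CSV file: every value is text.
rows = [
    {"region": "North", "units": "4", "unit_price": "10.0"},
    {"region": "South", "units": "2", "unit_price": "15.0"},
    {"region": "North", "units": "6", "unit_price": "10.0"},
]

def transform(rows):
    """Convert text to numbers and add a derived 'revenue' column."""
    result = []
    for row in rows:
        units = int(row["units"])
        price = float(row["unit_price"])
        result.append({"region": row["region"], "units": units, "revenue": units * price})
    return result

def summarize(rows):
    """Group by region and compute total and average revenue."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["region"]].append(row["revenue"])
    return {region: {"total": sum(v), "average": mean(v)} for region, v in groups.items()}

summary = summarize(transform(rows))
print(summary)
```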

4. Data Analysis

In this stage, the data is examined to identify key patterns, relationships, comparisons, and trends.
The results of the analysis help us understand what story the data is telling. This step also helps in
deciding what parts of the data should be visualized and what type of chart or graph will be most
suitable.

5. Data Representation (Visualization Design)

In this stage, the data is actually represented using charts, graphs, maps, or dashboards. The type of
visualization is chosen based on the analysis and the type of data. For example, bar charts are used
for comparisons, line charts for trends, and pie charts for showing proportions. This step focuses on
making the information easy to understand and visually appealing.

6. Data Refinement

After creating the first version of the visualization, it is important to review and improve it. This may
include fixing labels, adjusting colors, resizing charts, or correcting any mistakes. The goal of
refinement is to improve clarity, remove confusion, and make the message more direct. It may also
include adding interactive features like filters or tooltips.

7. Data Presentation and Interpretation

In the final stage, the completed visualization is shared with the intended audience. This may be
through a report, presentation, dashboard, or website. The audience then interprets the visual to
understand the insights. The success of this stage depends on how clearly and effectively the data is
presented.

Data Visualization Tools

1. Tableau

Tableau is one of the most powerful and user-friendly data visualization tools. It allows users to
create interactive dashboards and charts without writing any code. Tableau supports data from
various sources like Excel, SQL, and cloud services. It is used by businesses, researchers, and students
to analyze and present data clearly.

2. Microsoft Power BI

Power BI is a tool developed by Microsoft that helps users create detailed visual reports and
dashboards. It connects easily with Microsoft Excel, databases, and online services. Power BI allows
users to explore their data, apply filters, and share reports online. It is popular in both small
companies and large organizations.

3. Google Data Studio

Google Data Studio, renamed Looker Studio in 2022, is a free tool offered by Google that helps users turn data into informative
dashboards and reports. It integrates well with other Google products like Google Sheets, Google
Analytics, and BigQuery. It allows real-time sharing and makes it easy to collaborate with others.

4. Qlik Sense

Qlik Sense is another powerful data visualization and business intelligence tool. It allows users to
explore data using a drag-and-drop interface. Qlik Sense also has smart search and interactive
filtering features. It is known for its speed and flexibility in handling large datasets.

5. Excel

Microsoft Excel is a basic but still widely used data visualization tool. It allows users to create simple
charts such as bar charts, pie charts, and line graphs. Though Excel is not as advanced as other tools,
it is useful for small-scale data analysis and is available on almost all computers.

6. Looker

Looker is a modern data visualization tool now owned by Google Cloud. It helps users explore,
analyze, and visualize real-time data. It works well with cloud databases and provides powerful
options to build customized reports and dashboards.

Best practices for Data Visualization

Data visualization is the process of turning data into visual formats like charts and graphs. But simply
creating a visual is not enough. It should be clear, accurate, and meaningful to the audience.

1. Know Your Audience

Before creating any visual, it is important to know who will see it. The design, chart type, and level of
detail should be chosen based on the knowledge and interest of the audience. For example, a
technical team may prefer detailed data, while a manager may want a quick summary.

2. Choose the Right Chart Type

Selecting the correct chart type is very important. Use bar charts to compare values, line charts to
show trends over time, pie charts to show proportions, and maps to display geographic data. The
wrong chart type can confuse the viewers and lead to misinterpretation.

3. Keep It Simple

A good visualization is simple and clear. Avoid adding too many colors, labels, or decorative elements
that do not add meaning. Too much information in one chart can make it hard to understand. It is
better to break complex data into multiple visuals if needed.

4. Use Clear Labels and Titles


Every chart should have a clear and meaningful title. Labels on the axes and data points should be
easy to read. This helps the viewer understand what the data is showing without confusion. Avoid
using abbreviations unless the audience is familiar with them.

5. Maintain Data Accuracy

The data used in the visual should always be accurate and up to date. Mistakes in data can lead to
wrong decisions. It is also important to avoid changing the scale or cutting the axis in a way that
misleads the viewer.
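A short calculation shows why cutting the axis can mislead. The two values below are hypothetical; the function simply compares how tall the bars are drawn when the value axis starts at a given point.

```python
def visible_ratio(small, large, axis_start=0.0):
    """Ratio of the drawn bar heights when the value axis begins at axis_start."""
    if axis_start >= small:
        raise ValueError("axis_start must be below the smaller value")
    return (large - axis_start) / (small - axis_start)

# Hypothetical values: two products with nearly equal sales.
honest = visible_ratio(98, 100)                     # axis starts at zero
truncated = visible_ratio(98, 100, axis_start=95)   # axis cut at 95

print(f"axis from 0:  taller bar looks {honest:.2f}x as tall")
print(f"axis from 95: taller bar looks {truncated:.2f}x as tall")
```

With the axis starting at zero the bars differ by about 2 percent, which matches the data. With the axis cut at 95 the same two values are drawn with heights 3 and 5, so the second bar looks roughly two thirds taller.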

6. Use Colors Wisely

Colors should be used to highlight important points, not to decorate the chart. Use consistent colors
for the same type of data. Also, choose color combinations that are readable for people with color
blindness. Avoid using too many bright or similar colors.

7. Provide Context

A visualization should provide enough background so that the viewer understands the meaning. This
can include time periods, data sources, or explanations of units. Without context, the data may be
misunderstood.

8. Make It Interactive (If Needed)

In dashboards or reports, adding filters and drill-down options allows users to explore data more
deeply. Interactive visualizations are helpful for large datasets and allow users to find their own
insights.

9. Tell a Story

A good visualization tells a story. It should guide the viewer from the main point to the conclusion.
Charts should not just display data, but also help the user understand what action to take or what
message is being delivered.

10. Test and Get Feedback

After creating a visualization, show it to a few people and get feedback. Check if they can understand
it quickly. Make changes if needed to improve clarity. Sometimes small changes in color, label, or
chart type can make a big difference.

Architecture of Tableau

Tableau is one of the most popular data visualization tools used in business intelligence. It helps
people convert raw data into easy-to-understand visuals like graphs and dashboards. To do this
effectively, Tableau has a well-structured architecture.

1. Data Sources Layer


The first part of Tableau’s architecture is the data sources layer. Tableau can connect to a wide range
of data sources.

 Tableau connects to data stored in files like Excel, CSV, or JSON.

 It can also connect to databases such as MySQL, Oracle, SQL Server, and PostgreSQL.

 Tableau supports live connections and data extracts. A live connection means Tableau
directly connects to the source in real-time. Extracts are a copy of data stored locally in
Tableau for better speed.

This layer is responsible for gathering all the data needed for visualization.

2. Data Connectors

The data connectors are responsible for linking Tableau to the data sources.

 Tableau provides in-built connectors for almost every type of data, including local files, cloud
services, and big data platforms.

 These connectors make sure data is fetched securely and correctly.

This layer handles how Tableau communicates with different types of data formats and systems.

3. Data Engine

The data engine is one of the most important parts of Tableau. It helps in processing and managing
the data for analysis.

 It takes care of querying the data and returning results quickly.

 When extracts are used, the data engine stores and handles this data in a highly optimized
way.

 It performs fast calculations and filtering operations even on large datasets.

This engine is what allows Tableau to provide fast performance.

4. VizQL Server

VizQL stands for Visual Query Language. It is the brain behind Tableau’s visual power.

 The VizQL server converts user actions (like dragging and dropping fields) into SQL queries.

 These SQL queries fetch data from the source, and VizQL transforms the results into visuals
such as charts or graphs.

 It separates the data processing part from the visual presentation part.

This is the component that makes Tableau interactive and dynamic.
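The translation from a drag-and-drop action to a query can be pictured with a toy example. This is not Tableau's actual implementation, which these notes do not describe; the function name, table name, and column names are all hypothetical, and identifiers are not quoted or validated.

```python
def build_query(table, dimensions, measures):
    """Build a grouped SQL query from fields 'dropped' onto a view.

    dimensions: list of column names to group by
    measures:   list of (aggregation, column) pairs, e.g. ("SUM", "sales")
    """
    select_parts = list(dimensions)
    select_parts += [f"{agg}({col}) AS {agg.lower()}_{col}" for agg, col in measures]
    sql = f"SELECT {', '.join(select_parts)} FROM {table}"
    if dimensions:
        sql += f" GROUP BY {', '.join(dimensions)}"
    return sql

# Hypothetical user action: Region on Columns, SUM(Sales) on Rows.
query = build_query("orders", ["region"], [("SUM", "sales")])
print(query)
```

The point of the sketch is the separation of concerns: the user describes what should appear in the view, and a separate step works out the query needed to fetch it.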

5. Components of Tableau Server

If Tableau is used in a company environment through Tableau Server, then the architecture includes
server components as well.

 Gateway and Application Server: The gateway receives requests from browsers and other
clients and routes them to the right server process. The application server manages user
sessions, permissions, and access. It ensures users can log in and see only what they are
allowed to.

 Repository (PostgreSQL): Stores metadata, like workbook info, users, and permissions. It
helps Tableau remember user preferences and past work.

 Data Server: Maintains data connections and ensures consistent data access for all users. It
also manages data extracts.

 Backgrounder: Handles tasks like data refresh and sending alerts. It works in the background
without disturbing the user experience.

 Cache Server: Stores commonly used data to make Tableau faster. It avoids repeating the
same queries again and again.

These components allow Tableau to support multiple users in a secure and smooth way.

6. Client Layer (User Interface)

This is the front-end that users interact with.

 It includes Tableau Desktop (used for designing visuals), Tableau Server (for sharing), Tableau
Public (free version), and Tableau Online (cloud-based, since renamed Tableau Cloud).

 Users create visualizations by dragging and dropping fields.

 They can also apply filters, calculations, and formatting without writing code.

This layer is very user-friendly and is one of the reasons Tableau is so widely used.

7. Tableau Desktop and Tableau Server Communication

When a user designs a dashboard in Tableau Desktop and publishes it to Tableau Server, many
background processes occur.

 The data gets shared with the server, and permissions are applied.

 Other users can now access the dashboard through a browser or mobile app.

 Any changes made are reflected in real-time or during scheduled refreshes.

This communication makes it easy to collaborate in teams and organizations.

8. Security and User Management

Security is also an important part of Tableau architecture.

 Tableau supports user authentication using passwords, tokens, or external systems.

 Access control is handled through roles and permissions.

 Admins can control who can view, edit, or publish content.

This ensures that sensitive data is protected and only visible to the right users.

9. Mobile and Web Support

Tableau also supports mobile and web viewing.

 Dashboards designed in Tableau Desktop can be viewed in web browsers and mobile apps.

 The architecture includes mobile support for both iOS and Android.

 Responsive design ensures visuals adjust to different screen sizes.

This makes Tableau useful for business users who are on the move.

Interface of Tableau

1. Workspace Area

The workspace area is the central part of Tableau where visualizations are created. Here, users drag
and drop fields from their data sources to build charts and graphs. This area shows the current
worksheet or dashboard you are working on.

2. Data Pane

On the left side of the screen is the data pane. It displays all the fields from the connected data
source. These fields are divided into two types: dimensions and measures. Dimensions usually
include categorical data like names or dates, while measures include numeric data like sales or profit.
Users drag these fields from the data pane into the workspace to create visualizations.

3. Shelves and Cards

Above the workspace are shelves and cards where users place fields to define how the visualization
will look.

 The Columns and Rows shelves control the layout of the chart.

 The Filters shelf allows users to filter data to show only specific parts.

 The Marks card lets users control the type of mark used in the visualization, such as bars,
lines, or shapes. It also controls color, size, labels, and tooltips.

4. Toolbar

At the top of the interface is the toolbar. It contains buttons for common tasks such as saving the
workbook, undoing or redoing actions, zooming in or out, and adding new worksheets or
dashboards.

5. Menu Bar

The menu bar contains dropdown menus like File, Data, Worksheet, Dashboard, and Help. These
menus provide access to all of Tableau’s features, including connecting to new data sources,
exporting visuals, and customizing settings.

6. Sheets and Tabs

At the bottom of the interface, users find the sheets and tabs section. Each sheet represents a
different worksheet, dashboard, or story. Users can create multiple sheets and switch between them
easily.

7. Dashboard and Story Interface

Tableau also allows users to create dashboards by combining several visualizations on one screen.
The dashboard interface provides tools to arrange, resize, and add interactive controls like filters and
actions. The story interface is used to create a sequence of visualizations that tell a data-driven story.

8. Status Bar

At the very bottom, the status bar shows information about the data used in the current view,
including the number of records and the progress of any ongoing tasks.

Connecting to Data Sources

1. Types of Data Sources Supported

 Tableau supports connecting to many different types of data sources including Excel files,
CSV files, and text files.

 It can also connect directly to popular databases such as MySQL, Oracle, Microsoft SQL
Server, and PostgreSQL.

 Cloud-based data sources like Google Sheets, Salesforce, and Amazon Redshift are supported
as well.

 Tableau allows connections to big data platforms and web data connectors to expand its
reach.

2. Live Connection and Extracts

 Tableau offers two connection methods: live connection and data extract.

 A live connection allows Tableau to fetch real-time data directly from the source without
storing it locally.

 Data extracts are copies of the data stored locally in Tableau, which improve speed and
performance during analysis.

 Extracts are useful when working with large datasets or when the original data source has
limited access or slow response.

Live Connection:

 Connects directly to the database and fetches data in real time.

 Updates automatically whenever the source data changes.

 Suitable for high-performance databases that can handle frequent queries.

 Example: a sales dashboard connected to a live SQL database updates as soon as
new sales transactions are recorded.

Extract Connection:

 Stores a snapshot of the data locally as a .hyper file.

 Improves performance by reducing the load on the database.

 The user must refresh the extract to get updated data.

 Example: a monthly sales report uses an extract connection to load faster without
querying the database repeatedly.
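The difference between the two connection types can be imitated in plain Python. This is an analogy only, not how Tableau works internally: an in-memory SQLite database stands in for the source, a Python list stands in for the extract, and the table and function names are invented.

```python
import sqlite3

# An in-memory SQLite database stands in for the source database (assumption).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (amount REAL)")
source.executemany("INSERT INTO sales VALUES (?)", [(100.0,), (250.0,)])

def live_total(conn):
    """Live style: query the source each time the value is needed."""
    return conn.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

def take_extract(conn):
    """Extract style: copy the rows once into a local snapshot."""
    return [row[0] for row in conn.execute("SELECT amount FROM sales")]

extract = take_extract(source)

# A new sale reaches the source after the snapshot was taken.
source.execute("INSERT INTO sales VALUES (?)", (50.0,))

current_total = live_total(source)   # the live query sees the new sale
stale_total = sum(extract)           # the snapshot does not, until it is refreshed

extract = take_extract(source)       # refreshing means taking a new snapshot
refreshed_total = sum(extract)

print(current_total, stale_total, refreshed_total)
```

The live total includes the new sale immediately, while the snapshot only catches up after it is taken again. That is the trade-off described above: extracts avoid repeated queries but can fall behind the source.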

3. How to Connect to a Data Source


 Users start by opening Tableau Desktop and selecting the “Connect” pane on the start
screen.

 They then choose the appropriate data source type such as Excel, database, or cloud service.

 When connecting to a database, users enter the server name, database name, username,
and password to establish the connection.

 For files like Excel or CSV, users browse their computer or network to locate and select the
file.

4. Data Source Page in Tableau

 After connecting, Tableau displays the data source page where users can see all available
tables and fields.

 Users can preview data to check if the correct data has been loaded for analysis.

 They can rename fields, change data types, or hide unnecessary fields to clean up the data.

 Tableau supports joining multiple tables on this page. Data from different sources can also be blended in a worksheet.

5. Data Preparation Features

 Users can filter data to include only the relevant rows needed for analysis.

 Unused or irrelevant fields can be hidden to make the data easier to work with.

 Tableau provides options to split columns into multiple fields if needed.

 Calculated fields can be created to add new data based on formulas or conditions within
Tableau.

6. Refreshing Data

 When using a live connection, Tableau updates the data automatically every time the
visualization is opened.

 For data extracts, users must refresh the data manually to update the local copy with new
information.

 It is also possible to schedule automatic refreshes to keep the extract data current without
manual intervention.

Saving and publishing the data source

1. Saving the Data Source

 After connecting to and preparing a data source, users can save their Tableau workbook to
preserve their work.

 Saving the workbook stores the data connection details and any created visualizations in
one file.

 Tableau workbooks are saved with a “.twb” or “.twbx” file extension. A “.twb” file holds
only the workbook definition and does not contain the data itself, while a “.twbx” file is a
packaged workbook that also includes the data extract and any local data files.

 Users can save the workbook locally on their computer or on a shared network drive for
team access.

 Saving regularly is important to avoid loss of work due to unexpected errors or power
failures.

2. Publishing the Data Source

 Tableau allows users to publish data sources to Tableau Server or Tableau Online, enabling
sharing across an organization.

 Publishing the data source creates a centralized version that multiple users can connect to
and use in their own workbooks.

 Users must have appropriate permissions on Tableau Server or Tableau Online to publish
data sources.

 Publishing helps maintain data consistency and control, as updates to the published data
source automatically reflect in connected workbooks.

 Tableau provides options to schedule extract refreshes on the server, ensuring the published
data stays up to date without manual intervention.

3. Benefits of Saving and Publishing

 Saving workbooks locally helps users maintain personal copies of their analysis and data
connections.

 Publishing data sources promotes collaboration and reduces duplication of effort by sharing
a single, authoritative data source.

 It improves data governance by controlling who can access and modify the data source.

 Users can access published data sources from anywhere using Tableau Server or Online,
increasing flexibility and productivity.

Extract connection

In Tableau, an extract connection means that a snapshot or copy of the original data is saved locally
in a highly optimized format. Instead of querying the original data source every time you want to
visualize the data, Tableau uses this extract to improve performance and allow offline access.
Extracts are especially useful when working with large datasets or when the original data source is
slow or has limited connectivity.

 An extract connection creates a subset or full copy of the data in Tableau’s own data engine
format, which is compressed and optimized for faster querying.

 Users can refresh extracts manually or schedule automatic refreshes to keep the data
synchronized with the original source.

 Extracts allow Tableau users to work offline since the data is stored locally and does not
require continuous connection to the original data source.

 This connection type helps reduce the load on the original data source, which is beneficial
when many users access the same data.

 However, because extracts are snapshots, they may not always reflect real-time data unless
frequently refreshed.

Dimensions and measures, Filters

Dimensions

Dimensions are fields in Tableau that contain categorical or qualitative data. They usually represent
descriptive attributes such as names, dates, or geographic locations. Dimensions help to segment,
group, and categorize data in visualizations. For example, a “Region” or “Customer Name” field is a
dimension because it describes the data but is not used for calculations. Dimensions define the level
of detail in a view by breaking down measures into smaller parts.

 Dimensions are often placed on rows or columns to organize data in a table or chart.

 They are used to create headers or labels in visualizations.

 Dimensions do not aggregate values but instead split data into categories for analysis.

Measures

Measures are numerical fields that can be measured and aggregated. These fields typically contain
data such as sales amounts, profits, or quantities. Tableau automatically applies calculations like sum,
average, or count to measures to create meaningful insights. Measures are the values that are
analyzed and visualized using charts like bar graphs or line charts.

 Measures are usually placed on axes in visualizations to represent quantitative data.

 They can be aggregated or calculated using functions such as sum, average, min, or max.

 Measures help quantify data and show trends, comparisons, or distributions.
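The relationship between the two field types can be sketched in Python: dimensions decide how the rows are split, and the measure is aggregated inside each group. The field names and figures below are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: 'region' and 'category' are dimensions, 'sales' is a measure.
rows = [
    {"region": "East", "category": "Office", "sales": 100},
    {"region": "East", "category": "Tech",   "sales": 300},
    {"region": "West", "category": "Office", "sales": 200},
    {"region": "West", "category": "Office", "sales": 50},
]

def aggregate(rows, dimensions, measure, func=sum):
    """Split rows by the given dimensions, then aggregate the measure in each group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        groups[key].append(row[measure])
    return {key: func(values) for key, values in groups.items()}

by_region = aggregate(rows, ["region"], "sales")
by_region_category = aggregate(rows, ["region", "category"], "sales")
largest_sale = aggregate(rows, ["region"], "sales", func=max)

print(by_region)
print(by_region_category)
print(largest_sale)
```

Adding a second dimension produces more, smaller groups, which is what is meant by dimensions defining the level of detail. Swapping `sum` for `max` changes the aggregation without changing the grouping.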

Filters

Filters in Tableau are tools that allow you to restrict or refine the data shown in your visualizations.
By applying filters, you can focus on specific portions of the dataset and exclude irrelevant or
unwanted data. Filters can be applied to dimensions, measures, or date fields depending on the
analysis need.

 Dimension filters limit data based on categories, for example showing only sales from
specific regions.

 Measure filters allow you to include or exclude data based on numeric conditions, such as
sales greater than a certain amount.

 Date filters enable analysis for specific time periods like months or years.

 Filters can be applied at different levels: to a single worksheet, across a dashboard, or at the
data source level.

 Interactive filters allow users to dynamically change filter criteria while viewing dashboards.

Types of Filter in Tableau

Extract Filters

Extract filters are applied when you create a data extract from your data source.
They reduce the amount of data imported into Tableau by filtering rows at the time of data
extraction.
Using extract filters improves performance by limiting data size and only including relevant data in
the extract.

Data Source Filters

Data source filters restrict the data available from the data source itself.
They apply to all worksheets and dashboards using that data source, ensuring consistent filtering
across the workbook.
Data source filters are useful for security or compliance, limiting user access to sensitive data.

Context Filters

Context filters create a temporary subset of data that other filters reference.
They are applied before other filters and can improve performance when working with large
datasets.
Context filters are used when you want dependent filtering, for example when a Top N filter
should be calculated only within the subset that the context filter selects.

Dimension Filters

Dimension filters filter data based on categorical fields such as names, dates, or regions.
They allow selection of specific categories or members to include or exclude in the visualization.
Dimension filters can be applied as single or multiple selections, dropdowns, or search boxes.

Measure Filters

Measure filters filter data based on numeric fields or aggregated values like sales or profit.
They allow filtering data points by specifying conditions such as ranges (greater than, less than) or
specific values.
Measure filters help focus on data points meeting certain numeric criteria.

Top N Filters

Top N filters display only the top or bottom set of data points based on a measure.
For example, showing the top 10 customers by sales or bottom 5 products by profit.
They help in highlighting key performers or underperformers in the data.

Date Filters

Date filters allow filtering data based on date and time fields.
Users can select specific dates, relative dates (last month, last year), or ranges to analyze time-based
data.
Date filters are useful for trend analysis and time comparisons.

Table Calculation Filters

These filters are applied after table calculations, such as running totals or percent of total.
They filter data based on computed results rather than raw data values.
Table calculation filters enable dynamic filtering based on complex computations.
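The order in which filters run changes the result. In Tableau's order of operations, context filters are applied before Top N filters, and Top N filters are applied before ordinary dimension filters. The Python sketch below imitates that with two small functions; the customer names and figures are invented, and the functions are stand-ins rather than Tableau features.

```python
# Hypothetical rows: (customer, region, sales).
rows = [
    ("Asha", "East", 900), ("Bilal", "West", 800), ("Chen", "West", 700),
    ("Dina", "East", 300), ("Egor", "East", 200),
]

def top_n(rows, n):
    """Keep the n rows with the highest sales."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]

def by_region(rows, region):
    """Keep only rows from one region."""
    return [r for r in rows if r[1] == region]

# Region as an ordinary dimension filter: Top 2 is computed on all rows first.
without_context = by_region(top_n(rows, 2), "East")

# Region as a context filter: the subset is built first, then Top 2 runs inside it.
with_context = top_n(by_region(rows, "East"), 2)

print([r[0] for r in without_context])
print([r[0] for r in with_context])
```

Without the context filter only one East customer survives, because the overall top two include a West customer. With the region filter placed in context, the result is the top two customers within East.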

Module 2

Creation of univariate charts: Pie chart, Bar graph, Line graph, Histogram, Box plot

Creation of Univariate Charts

Univariate charts display the distribution or summary of a single variable. These charts help analyze
data patterns, frequencies, and trends for one data dimension or measure.

Pie Chart

A pie chart shows the proportion of categories as slices of a circle. It is useful for visualizing
percentage or part-to-whole relationships in categorical data.

 To create a pie chart, first select the dimension that defines categories and a measure to
determine slice sizes.

 In Tableau, set the mark type to Pie on the Marks card, then drag the dimension to the Color
shelf and the measure to the Angle shelf.

 Pie charts work best with a limited number of categories to avoid clutter.

 They help compare parts of a whole but do not show detailed distributions.

Bar Graph

Bar graphs represent categorical data with rectangular bars whose lengths correspond to values or
frequencies. They are simple and effective for comparing quantities across categories.

 To create a bar graph, drag the dimension to the Columns shelf and the measure to the Rows
shelf (or vice versa).

 Tableau automatically generates bars representing the sum or aggregation of the measure
for each category.

 Bar graphs can be vertical or horizontal and are easy to interpret.

 They are useful for comparing discrete categories clearly.

Line Graph

Line graphs show trends or changes over a continuous variable, often time. They connect data points
with lines to illustrate patterns or fluctuations.

 To create a line graph, place a date or continuous dimension on the Columns shelf and a
measure on the Rows shelf.

 Tableau connects the data points in chronological order by default.

 Line graphs are ideal for time series data or continuous measurements.

 They help identify trends, seasonality, and anomalies.

Histogram

Histograms display the distribution of a continuous variable by dividing it into intervals or bins. The
height of each bar represents the frequency of data points in that bin.

 To create a histogram, convert a continuous measure to bins and then drag the bin field to
the Columns shelf and the count measure to the Rows shelf.

 Tableau automatically calculates the frequency of data points in each bin.

 Histograms are useful for understanding the shape, spread, and skewness of data.

 They help detect outliers and data clustering.
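The binning step can be sketched in a few lines of plain Python (an illustration only, not what Tableau runs internally). The order amounts and the bin width of 20 are assumed values; bin_start is a hypothetical helper.

```python
from collections import Counter

# Hypothetical order amounts (a continuous measure).
amounts = [12, 18, 25, 31, 34, 47, 52, 58, 59, 95]
bin_size = 20  # assumed bin width, comparable to the bin size chosen in Tableau

def bin_start(value, size):
    """Return the lower edge of the bin that contains value."""
    return (value // size) * size

# Count how many values fall into each bin.
frequencies = Counter(bin_start(v, bin_size) for v in amounts)

# Print a small text histogram: one '#' per data point in the bin.
for start in sorted(frequencies):
    print(f"{start}-{start + bin_size}: {'#' * frequencies[start]}")
```

Each bin here covers values from its lower edge up to, but not including, the next edge.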

Box Plot

Box plots summarize the distribution of a continuous variable using five-number summary statistics:
minimum, first quartile, median, third quartile, and maximum. They highlight data spread and
outliers.

 To create a box plot, drag a dimension (optional) to the Columns shelf and a continuous
measure to the Rows shelf, then select the box-and-whisker plot from the Show Me panel.

 The box represents the interquartile range (IQR) and the line inside shows the median. By
default, whiskers extend to the most extreme data points within 1.5 times the IQR of the box;
they can also be set to extend to the maximum extent of the data.

 Outliers are plotted as individual points beyond whiskers.

 Box plots are excellent for comparing distributions across categories.
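To make the statistics behind a box plot concrete, here is a minimal Python sketch that computes quartiles, whiskers, and outliers, assuming the common rule that whiskers stop within 1.5 times the IQR. The sample values are invented, and the quartile method used here ("inclusive") may not match the interpolation any particular tool applies.

```python
from statistics import quantiles

# Hypothetical delivery times in days.
values = [2, 4, 5, 7, 8, 9, 11, 12, 30]

# Quartiles with the "inclusive" method; other tools may interpolate differently.
q1, median, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the edges of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [v for v in values if lower_fence <= v <= upper_fence]
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Whiskers stop at the most extreme points that are still inside the fences.
whisker_low, whisker_high = min(inside), max(inside)

print(q1, median, q3)
print(whisker_low, whisker_high, outliers)
```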

Showing Aggregate Measures in Tableau

 Tableau automatically aggregates numeric fields (measures) when you add them to a view,
usually using the SUM function by default.

 You can change the aggregation type by right-clicking the measure and selecting options like
Average, Minimum, Maximum, Count, or Count Distinct.

 Aggregation summarizes detailed data into meaningful totals, averages, or counts, which
helps in better understanding and comparing data.

 The level of aggregation depends on the dimensions included in the view; more dimensions
mean more detailed aggregation.

 You can add Grand Totals and Subtotals in tables to show summarized data across
categories.

 Tableau’s Level of Detail (LOD) Expressions allow customizing the aggregation level beyond
the default view settings.

 Aggregate values can be displayed on charts as labels by dragging the measure to the Label
shelf.

 Filters affect aggregate calculations by limiting data included in the summary dynamically.

 You can disable aggregation if needed by clearing Analysis → Aggregate Measures, but this is
rarely used.
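The idea that the aggregation level follows the dimensions in the view can be sketched in plain Python. This is only an analogy: the rows, the column order, and the aggregate helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical row-level data: (region, category, sales).
rows = [
    ("East", "Furniture", 100.0),
    ("East", "Technology", 300.0),
    ("West", "Furniture", 200.0),
    ("West", "Furniture", 50.0),
]

def aggregate(rows, key_indexes, func):
    """Group rows by the chosen dimension columns and aggregate sales."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        groups[key].append(row[2])
    return {key: func(values) for key, values in groups.items()}

def average(values):
    return sum(values) / len(values)

# One dimension in the "view": one total per region.
sum_by_region = aggregate(rows, [0], sum)
# Two dimensions: a more detailed total per region and category.
sum_by_region_category = aggregate(rows, [0, 1], sum)
# Same dimension, different aggregation type (like switching SUM to AVG).
avg_by_region = aggregate(rows, [0], average)

print(sum_by_region)
print(sum_by_region_category)
print(avg_by_region)
```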

Creation of Bivariate and Multivariate charts: Scatter plots, Area charts, Bullet graphs, Gantt chart,
Heat maps.

Bivariate and multivariate charts help analyze the relationship between two or more variables. These
charts reveal patterns, correlations, and comparisons that single-variable charts cannot show.

Scatter Plots

Scatter plots display data points for two continuous variables along X and Y axes to show
relationships or correlations.

 To create a scatter plot in Tableau, drag one measure to the Columns shelf and another
measure to the Rows shelf.

 You can add a dimension to the Detail or Color shelf to distinguish groups or categories.

 Scatter plots help identify trends, clusters, and outliers between two numeric variables.
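The strength of the linear relationship that a scatter plot shows visually can be summarized with the Pearson correlation coefficient. The sketch below computes it by hand in Python; the ad spend and sales numbers are invented, and pearson is a hypothetical helper name.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical measures for the X and Y axes.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

r_positive = pearson(ad_spend, sales)                  # points rise together
r_negative = pearson(ad_spend, list(reversed(sales)))  # one falls as the other rises
print(r_positive, r_negative)
```

A value near 1 or -1 corresponds to points lying close to a straight line; a value near 0 suggests no linear relationship.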

Area Charts

Area charts show quantitative data over time or ordered categories with the area under the line filled
in to emphasize volume.

 In Tableau, drag a date or dimension to the Columns shelf and a measure to the Rows shelf.

 Select the Area chart option from the Marks card to fill the area below the line.

 Area charts are useful for showing cumulative totals or comparing multiple categories over
time.

Bullet Graphs

Bullet graphs compare a single measure against a target or benchmark and show progress toward
goals.

 To create a bullet graph, use a measure as the main value and another measure or
parameter as the target.

 Tableau provides a Bullet Graph option in the Show Me panel.

 Bullet graphs are helpful in dashboards to track KPIs against predefined thresholds.

Gantt Charts

Gantt charts visualize schedules, timelines, or project plans by displaying bars that represent the
duration of tasks.

 In Tableau, drag a date field to the Columns shelf and a dimension (like Task) to the Rows
shelf.

 Use the Gantt Bar mark type and add a measure representing the duration to the Size shelf.

 Gantt charts are widely used in project management to track start dates and lengths of
activities.
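The duration measure placed on the Size shelf is simply the gap between a task's start and end. A small Python sketch of that calculation follows; the task names and dates are hypothetical.

```python
from datetime import date

# Hypothetical project tasks: (task, start date, end date).
tasks = [
    ("Design", date(2024, 1, 1), date(2024, 1, 10)),
    ("Build", date(2024, 1, 8), date(2024, 1, 29)),
    ("Test", date(2024, 1, 20), date(2024, 2, 2)),
]

# Duration in days, the value that sizes each Gantt bar.
durations = {name: (end - start).days for name, start, end in tasks}

# Draw a crude text Gantt chart, one character per day from the project start.
project_start = min(start for _, start, _ in tasks)
for name, start, end in tasks:
    offset = (start - project_start).days
    print(f"{name:<7}{' ' * offset}{'=' * durations[name]}")
```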

Heat Maps

Heat maps use color to represent values in a two-dimensional matrix, making it easy to spot patterns
and correlations.

 To create a heat map, place two dimensions on Rows and Columns shelves and a measure on
Color in the Marks card.

 The color intensity changes based on the measure’s value, highlighting highs and lows.

 Heat maps are effective for visualizing correlations, frequencies, or performance across
categories.
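Conceptually, a heat map aggregates a measure for every combination of two dimensions and then maps each value to a color intensity. The Python sketch below assumes a simple linear scale from 0 (lowest) to 1 (highest); the data is invented.

```python
# Hypothetical rows: (region, quarter, sales).
rows = [
    ("East", "Q1", 100.0),
    ("East", "Q2", 300.0),
    ("West", "Q1", 200.0),
    ("West", "Q2", 450.0),
    ("West", "Q2", 50.0),
]

# Sum the measure for each (row dimension, column dimension) cell.
cells = {}
for region, quarter, sales in rows:
    cells[(region, quarter)] = cells.get((region, quarter), 0.0) + sales

# Scale each cell to a 0..1 intensity; guard against all cells being equal.
low, high = min(cells.values()), max(cells.values())
span = (high - low) or 1.0
intensity = {key: (value - low) / span for key, value in cells.items()}

for key in sorted(intensity):
    print(key, cells[key], intensity[key])
```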

Working with Maps: Working with coordinate points, plotting longitude and latitude, Customizing
geocoding, creating dual axes maps and editing locations.

Maps in Tableau help visualize geographic data by plotting locations, showing spatial relationships,
and providing geographic insights. Tableau supports working with coordinate points, customizing
map data, and creating complex map visualizations.

Working with Coordinate Points

 Tableau can plot data points on a map using latitude and longitude values as coordinates.

 To work with coordinate points, you must have fields in your data source containing latitude
and longitude values.

 Drag the latitude field to the Rows shelf and the longitude field to the Columns shelf to plot
points.

 Tableau automatically recognizes these fields as geographic roles if they are named or
assigned properly.

Plotting Longitude and Latitude

 Longitude and latitude fields must be assigned geographic roles if Tableau doesn’t detect
them automatically.

 You can assign these roles by right-clicking the field in the Data pane, selecting Geographic
Role, then choosing Latitude or Longitude.

 Once assigned, dragging these fields to the appropriate shelves creates a map with plotted
points representing locations.

 You can add additional dimensions or measures to the Marks card for color, size, or detail to
enhance map visualization.

Customizing Geocoding

 Tableau uses built-in geocoding data for countries, states, cities, and postal codes.

 You can add custom geocoding by importing your own geographic data as CSV files (Map →
Geocoding → Import Custom Geocoding). Spatial files such as shapefiles or KML files are instead
connected as data sources.

 Custom geocoding is useful for regions or areas not covered by Tableau’s default database.

 You can edit or update locations manually by right-clicking a geographic field and selecting
Edit Locations to correct or assign missing geographic values.

Creating Dual Axes Maps

 Dual axes maps combine two layers of geographic data into a single map for richer
visualization.

 To create a dual axis map, place Longitude on the Columns shelf and Latitude on the Rows
shelf, then add a second copy of Latitude to Rows (or of Longitude to Columns) so that two
maps appear.

 Right-click the second axis and select Dual Axis to overlay the two maps.

 Synchronize axes to align the layers correctly by right-clicking an axis and selecting
Synchronize Axis.

 This allows combining layers like points and filled maps, or showing multiple data types on
the same geographic area.

Editing Locations

 Editing locations helps fix incorrect or missing geographic assignments.

 Use the Edit Locations option to manually assign correct geographic roles or rename
locations in your data.

 This ensures accurate plotting and better map clarity.

 You can also exclude locations or group them for cleaner visualization.

Difference Between Univariate and Bivariate Charts

Definition

 Univariate charts analyze only one variable at a time, focusing on its distribution or
frequency.
 Bivariate charts analyze the relationship between two variables, showing how one variable
changes with respect to the other.

Number of Variables

 Univariate charts display data for a single variable.


 Bivariate charts display data for two variables simultaneously.

Purpose

 Univariate charts help understand patterns like distribution, central tendency, and variation
of one variable.
 Bivariate charts help explore correlation, association, or comparison between two variables.

Examples

 Univariate charts include pie charts, bar charts, histograms, and box plots.
 Bivariate charts include scatter plots, line graphs (with two variables), and side-by-side bar
charts.

Use Cases

 Univariate charts are used when the focus is on summarizing or understanding one variable.
 Bivariate charts are used when the goal is to investigate relationships or interactions
between two variables.

Data Types

 Univariate charts work with one set of data values, which can be categorical or numerical.
 Bivariate charts require two sets of data values, which may both be numerical or one
numerical and one categorical.

Univariate, bivariate and multivariate difference

Univariate

1. It summarizes only a single variable at a time.

2. It does not deal with causes and relationships.

3. It does not contain any dependent variable.

4. The main purpose is to describe.

5. An example of univariate data is height.


Bivariate

1. It summarizes only two variables.

2. It deals with causes and relationships.

3. It contains only one dependent variable.

4. The main purpose is to explain.

5. An example of bivariate data is temperature and ice cream sales during the summer vacation.

Multivariate

1. It summarizes more than two variables.

2. It deals with causes and relationships among several variables at once.

3. It is similar to bivariate analysis but contains more than two variables.

4. The main purpose is to study the relationships among the variables.

5. Example: suppose an advertiser wants to compare the popularity of four advertisements on a
website. Their click rates could be measured for both men and women, and the relationships
between the variables can then be examined.

Module 3
Metadata, Tableau data types
Metadata in Tableau refers to the descriptive information about the data in
your data source. It includes details such as the names of fields, data types,
geographic roles, and any calculations or hierarchies defined on the data.
Metadata helps Tableau understand how to interpret and visualize the data
correctly.

Tableau Data Types


 String (Text): This data type represents text or alphanumeric characters.
It is used for fields like names, categories, or any data that is not
numerical.
 Number (Whole): This data type includes integer values without
decimals. It is used for counting or identifying whole numbers like
quantity or IDs.
 Number (Decimal): This represents numerical values with decimal
points. It is used for measurements, amounts, or any data requiring
precision.
 Date: This data type includes calendar dates without time information. It
is used for fields like birth dates, order dates, or event dates.
 Date & Time: This combines both date and time values. It is used when
you need to track precise timestamps or durations.
 Boolean: This data type represents true or false values, useful for binary
conditions such as yes/no or active/inactive.
 Geographic: Tableau assigns geographic roles like country, state, city, or
postal code to fields with geographic data. This data type enables
mapping and spatial analysis.

Connection to Excel in Tableau

Tableau makes it easy to connect to Excel files as a data source. Excel files are commonly used in
business reporting and analysis, and Tableau allows users to import sheets directly for visualization.

 When connecting to Excel, Tableau allows you to select specific worksheets from the Excel
file.

 Each sheet in the Excel file is treated as a separate table that you can drag and drop into the
canvas.

 Tableau also detects named ranges in Excel and displays them as options for connection.

 After connection, you can use the Data Source tab to preview, clean, rename columns, and
set data types.

Steps:

1. Open Tableau Desktop

2. Start Page -> Click "Microsoft Excel" under "Connect" on the left side.

3. Browse to and select your Excel file (e.g., sales_data.xlsx)

4. Tableau loads a Data Source tab, where:

4.1 Sheets from the Excel file are shown on the left

4.2 Drag the required sheet(s) to the canvas

5. After dragging, Tableau previews the data.

6. Click “Sheet 1” at the bottom to start building your visualization.

Management of Metadata in Tableau


Metadata refers to information about the structure and attributes of the data, such as field names,
data types, descriptions, and roles.

 Tableau allows you to rename fields, hide unnecessary columns, and change data types
directly in the Data Pane.

 You can assign geographic roles, set default aggregation types, or add field descriptions to
help in documentation.

 Fields can be grouped, aliased, or organized into folders or hierarchies for better navigation
and data management.

 Metadata management ensures clean, readable, and user-friendly datasets while creating
dashboards.

Extracts in Tableau

Extracts are snapshots of your data saved as Tableau Data Extract (.hyper) files, optimized for
performance and offline use.

 Creating an extract reduces the load on the original data source and speeds up dashboard
performance.

 Extracts can be filtered to include only a subset of data or scheduled for refresh on a regular
basis.

 Users can choose between live connection (real-time data access) and extract (static but
faster) depending on their use case.

 Extracts support incremental refresh, where only new rows are added, avoiding the need to
reload all data every time.

Management of metadata and extracts

Metadata Management in Tableau: Metadata refers to data about your data—like field names, data
types, and calculations.

Key Metadata Actions:

1. Rename Fields: Right-click field → Rename (affects only Tableau, not the original source).

2. Change Data Types: Right-click field → Change Data Type.

3. Create Aliases: Right-click dimension → Aliases to rename specific values (e.g., "NY" → "New
York").

4. Hide Unused Fields: Right-click → Hide to declutter the data pane.

5. Groups / Hierarchies / Folders: Organize fields for easier use.

6. Calculated Fields: Create new fields using formulas based on existing data.

Extract Management in Tableau: Extracts are snapshots of your data saved as .hyper files to improve
performance or allow offline access.

Creating an Extract:

1. On the Data Source page, under Connection, select "Extract" instead of "Live".

2. Optionally click Edit to set filters and aggregation that reduce the extract size.

3. Click Sheet 1; Tableau then prompts you to save the extract as a .hyper file.

4. To update the extract later, go to Data → [data source] → Extract → Refresh.

Managing Extracts:

1. Refresh: Data → Refresh All Extracts to update with latest source data.

2. Schedule (Tableau Server/Cloud): Automate refreshes for shared dashboards.

3. Edit Extract: Add filters, aggregate data, or set row limits before extracting.

4. Convert Back to Live: Switch from extract to live if needed (right-click the data source and clear Use Extract).

Data Preparation in Tableau

Data preparation is the process of cleaning, organizing, and structuring raw data before it is used for
visualization in Tableau. Well-prepared data helps ensure accuracy, improves performance, and
enhances the quality of insights drawn from dashboards and reports.

 Data preparation starts when you connect to a data source such as Excel, SQL, or a web
service. Tableau allows you to preview the data in the Data Source tab.

 Tableau provides options to rename columns, hide unnecessary fields, and change data types
directly within the workspace.

 You can split columns into multiple parts if a single column contains combined values, such
as "City, State".

 Tableau allows you to clean null values, filter unnecessary rows, and create calculated fields
to transform raw data into meaningful insights.

 If you are working with multiple tables, Tableau supports joins, unions, and blending to
combine datasets into a single structured view.

 Using Data Interpreter, Tableau can automatically detect headers, footers, and fix common
formatting issues in Excel files.

 Data can also be grouped, clustered, or organized into hierarchies to improve navigation and
support drill-down analysis in dashboards.

Steps

1. Connecting to Data

1.1 Connect to sources like Excel, SQL, Google Sheets, etc.

1.2 Combine multiple sources using:

1.2.1 Joins (within the same connection)


1.2.2 Blends (across different connections)

1.2.3 Relationships (more flexible than joins)

2. Cleaning the Data

2.1 Rename fields for clarity

2.2 Remove nulls or unwanted rows

2.3 Hide unused columns

2.4 Split columns (e.g., full name → first & last)

2.5 Change data types (text, number, date, etc.)

2.6 Pivot data if rows/columns need reshaping

2.7 Use Data Interpreter (for Excel/CSV) to auto-clean messy files.

3. Creating Calculated Fields

3.1 Derive new values using formulas (e.g., Profit Ratio = Profit / Sales)

3.2 Create conditional fields using IF, CASE, etc.

4. Grouping and Aliasing

4.1 Group similar values (e.g., "TV", "T.V." → "Television")

4.2 Assign aliases for cleaner labels in views

5. Creating Hierarchies

5.1 Organize fields into drill-down paths (e.g., Country → State → City)

6. Filtering

6.1 Apply data source filters to limit data loaded into Tableau

6.2 Use extract filters when creating an extract
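Two of the cleaning steps above, splitting a combined column and grouping variant spellings, can be sketched in plain Python to show what the transformation does to the values. This is an illustration rather than Tableau syntax; the sample values and helper names are hypothetical.

```python
# Hypothetical raw values in a combined "City, State" column.
raw_locations = ["Pune, Maharashtra", "Austin , Texas"]

def split_location(value):
    """Split 'City, State' into two cleaned parts."""
    city, state = (part.strip() for part in value.split(",", 1))
    return city, state

# Grouping: map variant spellings to one label, leave other values unchanged.
product_groups = {"TV": "Television", "T.V.": "Television"}

def group_product(name):
    return product_groups.get(name, name)

locations = [split_location(v) for v in raw_locations]
products = [group_product(p) for p in ["TV", "T.V.", "Radio"]]
print(locations)
print(products)
```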

Dealing with NULL Values in Tableau

In Tableau, NULL values represent missing or unknown data. These values can appear in a dataset
due to various reasons such as incomplete data entry, missing joins, or undefined calculations.
Handling NULL values correctly is important to ensure accurate and clean visualizations.

Reasons for NULL Values in Tableau:

 NULL values occur when a field in a record does not contain any data.

 They may appear due to data import issues, incorrect joins between tables, or missing values
in the original source.

 When using calculated fields or aggregations, some data combinations may result in NULL.

Techniques to Handle NULL Values in Tableau:


 Using ZN() Function:
You can replace NULL numeric values with 0 using the ZN() function.
Example: ZN([Sales]) returns 0 if the Sales field is NULL.

 Using IFNULL() Function:
The IFNULL() function allows you to replace NULL with a custom value.
Example: IFNULL([Region], 'Unknown') replaces missing region values with "Unknown".

 Filtering Out NULL Values:
You can drag the field to the Filters shelf and exclude NULL values manually.
This helps remove incomplete data from the view.

 Displaying NULL as Custom Label:
Tableau allows you to display NULLs with custom labels in views, making the chart more
understandable. You can right-click on NULL in the view and choose “Edit Alias” to rename it.

 Handling NULLs in Joins:
When joining tables, NULL values can arise if there are unmatched records. Using a left join
or full outer join may result in NULLs where matching data is missing. You can handle this by
creating calculated fields that fill in default values for missing data.

 Using ISNULL() Function:
This function checks whether a field is NULL and returns TRUE or FALSE. It is useful in
conditional calculations. Example: IF ISNULL([Profit]) THEN 0 ELSE [Profit] END.

 In Maps:
If geographical fields contain NULL or unrecognized values, Tableau shows an "unknown"
indicator in the lower-right corner of the map.
You can click on this indicator to edit the locations or filter the data out.
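The behavior of ZN(), IFNULL(), and ISNULL() can be mirrored in Python, with None standing in for NULL. The functions below are hypothetical stand-ins written for illustration; they are not part of any Tableau API.

```python
def zn(value):
    """Like ZN(): return 0 when the value is missing, else the value."""
    return 0 if value is None else value

def ifnull(value, default):
    """Like IFNULL(): return a custom replacement when the value is missing."""
    return default if value is None else value

def isnull(value):
    """Like ISNULL(): report whether the value is missing."""
    return value is None

# Hypothetical column values, with None playing the role of NULL.
sales = [120.0, None, 80.0]
regions = ["East", None, "West"]

total_sales = sum(zn(v) for v in sales)
clean_regions = [ifnull(r, "Unknown") for r in regions]
missing_flags = [isnull(v) for v in sales]
print(total_sales, clean_regions, missing_flags)
```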

Why Handling NULLs Is Important:

 Leaving NULL values untreated may lead to incorrect totals, averages, or blank spaces in
dashboards.

 It can affect the accuracy of analysis and mislead decision-makers.

 Replacing or removing NULL values ensures consistency and reliability of insights.

Data extraction, Refresh extraction, Incremental extraction

Data Extraction in Tableau

Data extraction in Tableau refers to the process of creating a local copy of your connected data
source. This extracted data is saved in a .hyper file format, which Tableau uses to improve
performance and enable offline analysis.

 Data extraction allows users to work with large datasets faster, since Tableau queries the
extracted file rather than the live database.

 Extracts can be created by selecting the Extract option in the Data Source tab and then
choosing the fields and filters to include.
 Extracted data is stored locally, and you can share it or publish it to Tableau Server or Tableau
Public for others to use.

Refresh Extraction

Refresh extraction refers to updating your extract file to match the most current version of your
original data source.

 A full refresh replaces the entire contents of the extract with new data from the source.

 This is useful when the underlying data changes frequently or contains updates to existing
records.

 On Tableau Server or Tableau Cloud, you can schedule automatic extract refreshes to keep
your dashboards up to date; in Tableau Desktop, extracts are refreshed manually.

Incremental Extraction

Incremental extraction allows you to add only the new rows to your extract instead of
reloading the entire dataset.

 You must specify a field like a date or an ID column that can identify new rows during the
extract process.

 Tableau then pulls only the records whose value in that field is higher than the highest value
already in the extract. Changes to rows that were extracted earlier are picked up only by a full
refresh.

 This method saves time and system resources, especially when dealing with large datasets
that don’t change completely every day.
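The difference between a full and an incremental refresh can be sketched in Python. The sketch assumes that the incremental refresh keys on the highest value of the chosen column already in the extract, so edits to previously extracted rows are not picked up. The data and function names are hypothetical.

```python
# Hypothetical extract taken earlier, and the source as it looks today.
extract = [
    {"order_id": 1, "sales": 100.0},
    {"order_id": 2, "sales": 250.0},
]
source = [
    {"order_id": 1, "sales": 999.0},  # edited after the extract was taken
    {"order_id": 2, "sales": 250.0},
    {"order_id": 3, "sales": 75.0},   # new row
    {"order_id": 4, "sales": 40.0},   # new row
]

def full_refresh(source):
    """Replace the extract with a complete copy of the source."""
    return list(source)

def incremental_refresh(extract, source, key):
    """Append only source rows whose key exceeds the highest key in the extract."""
    last_key = max((row[key] for row in extract), default=None)
    new_rows = [r for r in source if last_key is None or r[key] > last_key]
    return extract + new_rows

incremental = incremental_refresh(extract, source, "order_id")
full = full_refresh(source)
print(len(incremental), incremental[0]["sales"], full[0]["sales"])
```

In this sketch the incremental result gains the two new orders but still holds the old sales value for order 1, while the full refresh reflects the edit.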

Data Blending: Joins (Left, Right, inner and outer)

Data Blending in Tableau

Data Blending is a method used in Tableau to combine data from two different data sources that do
not share the same database. It is useful when data is stored in multiple places and cannot be joined
directly.

 Tableau performs data blending at the worksheet level, not the data source level.

 The primary data source is marked with a blue check mark, while the secondary data source is
marked with an orange check mark.

 Blending is used when you have two datasets with a common field, such as "Country" or
"Product ID", but from different databases like Excel and SQL.

 The common field is called the linking field, and it automatically matches values across both
sources.

 Blending results in a left join-like behavior, where all values from the primary data source
are preserved, and only matching values from the secondary source are shown.

Joins in Tableau

Joins in Tableau allow you to combine data from multiple tables within the same data source. Joins
work like in SQL and are performed in the Data Source tab by dragging tables into the canvas and
choosing the join type.

1. Inner Join

 An Inner Join returns only the rows that have matching values in both tables.

 It excludes any row from either table that has no match in the other table.

 This is used when you only want to analyze records that are complete in both datasets.

Example:
Table A (Customer ID, Name)
Table B (Customer ID, Orders)
Inner Join on Customer ID will return only customers who have placed orders.

2. Left Join

 A Left Join returns all records from the left table, and matching records from the right table.

 If there is no match, NULL values are shown for columns from the right table.

Example:
If the left table is Customer Details and the right table is Orders, a left join ensures all customers are
listed, even if they have not placed any orders.

3. Right Join

 A Right Join is the opposite of a left join.

 It returns all records from the right table, and only matching records from the left table.

Example:
If the right table contains product orders and the left table contains product details, a right join
ensures all ordered products are listed even if some are not in the product master list.

4. Full Outer Join

 A Full Outer Join returns all records from both tables.

 If there is no match, the result will contain NULLs for the missing side.

Example:
If Table A has customer data and Table B has feedback data, a full outer join will show all customers
and all feedback entries, even if some customers didn’t give feedback or some feedback was
anonymous.
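The four join types can be compared side by side with a small Python sketch. It is a simplification that assumes each Customer ID appears at most once per table (real joins can also match one row to many); the tables and the join helper are hypothetical, and None plays the role of NULL.

```python
# Hypothetical tables keyed by Customer ID.
customers = {1: "Asha", 2: "Bilal", 3: "Chen"}  # Table A: Customer ID -> Name
orders = {2: 5, 3: 1, 4: 7}                     # Table B: Customer ID -> Orders

def join(left, right, how):
    """Join two keyed tables; missing sides are filled with None (NULL)."""
    if how == "inner":
        keys = left.keys() & right.keys()
    elif how == "left":
        keys = left.keys()
    elif how == "right":
        keys = right.keys()
    elif how == "outer":
        keys = left.keys() | right.keys()
    else:
        raise ValueError(f"unknown join type: {how}")
    return {k: (left.get(k), right.get(k)) for k in sorted(keys)}

inner = join(customers, orders, "inner")
left = join(customers, orders, "left")
right = join(customers, orders, "right")
outer = join(customers, orders, "outer")

for name, result in [("inner", inner), ("left", left), ("right", right), ("outer", outer)]:
    print(name, result)
```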

Differences Between Data Blending and Joins


Criteria | Joins | Data Blending
Source | Same data source | Different data sources
Where it happens | Data Source tab | Worksheet level
Types available | Inner, Left, Right, Outer | Similar to Left Join only
Performance | Faster | Slightly slower due to separate queries
Flexibility | Works on table structure | Works even if structures are different

Cross database Joining and Union

1. Cross-Database Joining in Tableau

Cross-database joining allows you to join tables from different data sources (like MySQL, Excel,
PostgreSQL, or Oracle) within the same Tableau workbook. This is useful when your data is
distributed across multiple systems but you want to analyze it together without blending.

Explanation and Features:

 Tableau lets you drag and drop tables from different databases onto the data canvas and join
them just like you would with tables from the same source.

 When you add a second table from another source, Tableau automatically creates a cross-
database join using its data engine (Hyper).

 Cross-database joins are supported for inner, left, right, and full outer joins, just like normal
joins.

 This method works well when your tables have similar schemas and a common key (such as
"Product ID" or "Employee ID").

Example of Cross-Database Join:

 Suppose you have Sales Data in an Excel file and Customer Details in a MySQL database.

 You can join them in Tableau using a common field like Customer ID, and perform analysis
like "Sales by Customer Age Group".

Key Points:

 You can combine data from multiple platforms without writing complex SQL queries.

 Performance may vary based on the size and type of sources.

 Tableau creates temporary Hyper files in the background to process the joined data.

2. Union in Tableau

A Union is used in Tableau to combine data vertically — that is, to stack rows from two or more
tables that have the same column structure.

Explanation and Features:

 Union is used when you have data split across multiple tables or files, especially when the
schema (columns) is the same.

 When you perform a union, Tableau stacks the rows of one table under another.

 It is helpful for combining datasets like monthly sales files, regional reports, or yearly data
stored in separate sheets.

Example of Union:

 Suppose you have Sales_Jan.xlsx and Sales_Feb.xlsx, each with columns: Date, Product,
Revenue.

 Performing a union on these files will create one single table with all the rows from January
and February combined.

Steps to Perform Union:

1. Open Tableau and connect to your data source.

2. In the Data Source tab, drag and drop multiple tables or sheets into the canvas.

3. Drag the second table directly below the first until the “Drag table to union” drop area
appears, or double-click “New Union” in the left pane.

4. Tableau will create a new table with all rows combined and will add an extra column called
Table Name to indicate the source of each row.
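What a union produces can be sketched in Python: rows from tables with the same columns are stacked, and an extra column records where each row came from. The file names follow the example above; the row contents and the union helper are hypothetical.

```python
# Hypothetical monthly tables with identical columns.
sales_jan = [
    {"Date": "2024-01-05", "Product": "Pen", "Revenue": 20.0},
    {"Date": "2024-01-17", "Product": "Book", "Revenue": 35.0},
]
sales_feb = [
    {"Date": "2024-02-02", "Product": "Pen", "Revenue": 25.0},
]

def union(tables):
    """Stack rows from all tables and tag each row with its source table."""
    combined = []
    for table_name, rows in tables.items():
        for row in rows:
            combined.append({**row, "Table Name": table_name})
    return combined

all_sales = union({"Sales_Jan": sales_jan, "Sales_Feb": sales_feb})
for row in all_sales:
    print(row)
```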

Difference Between Join and Union

Feature | Join | Union
Orientation | Combines tables horizontally (columns) | Combines tables vertically (rows)
Schema requirement | Tables can have different columns | Tables must have same columns
Use case | Used when tables have related data | Used when data is split by time or type
Example | Customers + Orders | Sales_Jan + Sales_Feb

Use Cases

 Cross-Database Join: When data is stored in multiple systems like SQL Server and Excel but
must be analyzed together.
 Union: When monthly data is stored in different files or sheets with the same format and
needs to be combined into one dataset.

Module 4
Working with Calculations: Syntax for calculation and functions in Tableau

Tableau provides powerful calculation features that allow users to create new data from existing data
using formulas. These calculations are helpful for creating custom fields, defining logic, filtering,
sorting, or aggregating values in a dynamic way.

Syntax for Calculations in Tableau

The syntax used for creating calculated fields in Tableau is similar to mathematical or programming
expressions. You can create a calculated field by right-clicking in the data pane and selecting “Create
Calculated Field.”

Important syntax rules:

1. Calculated fields use a name followed by an expression, with field names in square brackets:

o Profit Ratio = SUM([Profit]) / SUM([Sales])

2. Comments can be added using // for single line and /* */ for multi-line comments.

3. String values must be enclosed in single or double quotes:

o "West" or 'Completed'

4. Field names are case-sensitive and must match exactly as they appear in the data.

5. Logical expressions can be created using IF, ELSEIF, ELSE, and END statements.

6. Brackets [] are used to refer to fields:

o [Sales], [Region], [Order Date]

Types of Calculations in Tableau

1. Basic Calculations – Performed row by row.


2. Aggregate Calculations – Work on summarized data.

3. Table Calculations – Calculated at the visualization level.

4. Level of Detail (LOD) Calculations – Advanced expressions to control data granularity.

Commonly Used Functions in Tableau

Tableau has several built-in functions grouped into categories like Number, String, Date, Logical, and
Aggregate functions.

1. Number Functions

These functions perform calculations on numerical values.

 ABS(number) – Returns the absolute value.

 ROUND(number, decimals) – Rounds a number to a given number of decimal places.

 CEILING(number) – Rounds up to the nearest integer.

2. String Functions

These manipulate text fields or return text-based output.

 LEFT(string, number) – Returns the left part of a string.

 LEN(string) – Returns the length of a string.

 UPPER(string) – Converts text to uppercase.

3. Date Functions

Used for working with dates, extracting parts of a date, or calculating time differences.

 DATEPART('month', [Order Date]) – Extracts the month from the date.

 DATEDIFF('year', [Order Date], [Ship Date]) – Finds the number of years between two dates.

 NOW() – Returns the current date and time.

4. Logical Functions

These return true/false or conditional values.

 IF [Sales] > 1000 THEN "High" ELSE "Low" END

 ISNULL([Discount]) – Checks if a field has a NULL value.

 IFNULL([Profit], 0) – Replaces NULL with zero.

5. Aggregate Functions

These operate on a group of values and return a single summary value.

 SUM([Sales]) – Adds all sales values.

 AVG([Profit]) – Calculates the average profit.

 MAX([Order Date]) – Returns the latest order date


Example of a Custom Calculation

Suppose we want to calculate Profit Margin as a new field:

[Profit] / [Sales]

To handle errors when Sales is 0, we can write:

IF [Sales] != 0 THEN [Profit] / [Sales] ELSE 0 END
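The zero guard, and the difference between a row-level ratio and a ratio of aggregates such as SUM([Profit]) / SUM([Sales]), can be sketched in Python. The rows are invented and profit_margin is a hypothetical helper.

```python
def profit_margin(profit, sales):
    """Row-level margin with the same guard as the IF expression above."""
    return profit / sales if sales != 0 else 0

# Hypothetical rows: (profit, sales). The last row has zero sales.
rows = [(10.0, 100.0), (90.0, 300.0), (5.0, 0.0)]

# Row-level calculation: one margin per row.
row_margins = [profit_margin(p, s) for p, s in rows]

# Aggregate calculation: total profit divided by total sales.
total_profit = sum(p for p, _ in rows)
total_sales = sum(s for _, s in rows)
overall_margin = profit_margin(total_profit, total_sales)

print(row_margins, overall_margin)
```

The overall margin is not the average of the row margins, which is why the choice between a row-level and an aggregate calculation matters.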

Level of Detail (LOD) Expressions – Brief Overview

LOD expressions allow you to compute values at different levels of granularity than what is shown in
the view.

 {FIXED [Region]: SUM([Sales])} – Calculates total sales per region, regardless of other
dimensions in the view.

 {INCLUDE [Customer Name]: AVG([Sales])} – Includes customer detail in the calculation even
if it is not in the view.
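To make the FIXED example more concrete, the sketch below emulates what {FIXED [Region]: SUM([Sales])} returns for each row. It is a Python illustration under assumed sample data, not Tableau syntax; the rows and variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical row-level data: (region, category, sales)
rows = [
    ("East", "Furniture", 100),
    ("East", "Technology", 300),
    ("West", "Furniture", 200),
]

# {FIXED [Region]: SUM([Sales])}: one total per region
region_total = defaultdict(int)
for region, _category, sales in rows:
    region_total[region] += sales

# Every row sees its region's total, even when the view is split by category
fixed = [(region, category, region_total[region]) for region, category, _ in rows]
print(fixed)
```

Both East rows carry the same regional total, which is what makes a FIXED value usable as a denominator for "share of region" style comparisons.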

Various types of calculations: Table, String, Date, Aggregate and Number, Levels of Details (LOD),
Quick table calculation

Calculations in Tableau are essential for analysing and transforming data beyond the raw dataset.
These calculations help create new fields, filter data, format values, or perform dynamic analysis.

1. Table Calculations

Table calculations are computed based on the data that is currently in the visualization. They are
applied after the data is aggregated and are dependent on the layout of the view.

 Table calculations work on the summary data displayed in the worksheet, not on the row-
level data in the dataset.

 Example: Calculating the percent difference between this year and last year’s sales.

 Common table calculations include Running Total, Percent of Total, Rank, Moving Average,
etc.

 These are easy to apply using the right-click option or from the drop-down in the Marks card.
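The table calculations listed above operate on the aggregated values already shown in the view. A minimal Python sketch of three of them follows; the monthly figures are hypothetical and the code only mirrors the arithmetic, not Tableau's own functions.

```python
from itertools import accumulate

# Hypothetical aggregated values already in the view: SUM(Sales) per month
monthly_sales = [100, 300, 200, 400]

# Running Total: cumulative sum along the table
running_total = list(accumulate(monthly_sales))

# Percent of Total: each value divided by the sum of all values in the table
total = sum(monthly_sales)
percent_of_total = [value / total for value in monthly_sales]

# Rank: 1 for the largest value (descending order, no ties in this sample)
ordered = sorted(monthly_sales, reverse=True)
rank = [ordered.index(value) + 1 for value in monthly_sales]

print(running_total, percent_of_total, rank)
```

Because the inputs are the displayed summaries, changing the layout of the view (for example, from months to quarters) changes the results, which is the layout dependence described above.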

2. String Calculations

String functions help manipulate text values, extract characters, and perform text formatting
operations.

 String calculations are useful for cleaning data, formatting customer names, or splitting
combined text fields.

 Example: LEFT([Customer Name], 5) returns the first 5 characters of the customer’s name.

 Functions include: LEN(string), UPPER(string), LOWER(string), REPLACE(string, old, new),
TRIM(string), etc.

 String functions can be used in calculated fields to group, format, or conditionally categorize
data.
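For readers who know Python, the string functions above correspond roughly to the operations below. This is an analogy only, with a hypothetical customer name, and Python's behaviour may differ from Tableau's in edge cases.

```python
# Hypothetical raw value with stray spaces, as might appear in [Customer Name]
customer_name = "  Alice Johnson  "

trimmed = customer_name.strip(" ")           # TRIM([Customer Name])
first_five = trimmed[:5]                     # LEFT([Customer Name], 5)
length = len(trimmed)                        # LEN([Customer Name])
upper = trimmed.upper()                      # UPPER([Customer Name])
replaced = trimmed.replace("Johnson", "J.")  # REPLACE([Customer Name], "Johnson", "J.")

print(first_five, length, upper, replaced)
```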

3. Date Calculations

Date functions are used to manipulate and compare date values. They are helpful for creating time-
based filters and trend analysis.

 Date calculations allow users to extract parts of a date (year, month, day), find differences, or
build date ranges.

 Example: DATEDIFF('year', [Order Date], [Ship Date]) calculates the number of year boundaries
crossed between the two dates.

 Functions include: DATEPART, DATENAME, NOW(), TODAY(), MAKEDATE(), and DATEADD.
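One point worth illustrating is that DATEDIFF with the 'year' date part counts calendar-year boundaries rather than full elapsed years. The Python sketch below mirrors that behaviour as commonly described; the helper names `datediff_year` and `datepart_month` and the dates are assumptions made for the example.

```python
from datetime import date


def datediff_year(start: date, end: date) -> int:
    """Count year boundaries crossed, in the spirit of DATEDIFF('year', start, end)."""
    return end.year - start.year


def datepart_month(d: date) -> int:
    """Rough equivalent of DATEPART('month', d)."""
    return d.month


order_date = date(2023, 12, 31)  # hypothetical [Order Date]
ship_date = date(2024, 1, 1)     # hypothetical [Ship Date]

years_between = datediff_year(order_date, ship_date)
order_month = datepart_month(order_date)
print(years_between, order_month)
```

Here only one day separates the two dates, yet the year difference is 1 because the dates fall in different calendar years. When elapsed time matters, a finer date part such as 'day' or 'month' is usually a better choice.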

4. Aggregate Calculations

Aggregate functions are used to summarize data by applying operations like sum, average, min, or
max.

 Aggregate calculations are often used when building dashboards or performing analysis on
group-level data.

 Example: AVG([Sales]) calculates the average sales for the selected category.

 Other examples include: SUM, MIN, MAX, COUNT, COUNTD (distinct count).

 Measures are aggregated automatically (SUM by default) when they are dragged into Rows,
Columns, or the Marks card.
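The difference between the aggregates above, in particular COUNT versus COUNTD, can be sketched in Python as follows. The rows are hypothetical, and `None` stands in for a NULL value, which aggregate functions skip.

```python
# Hypothetical rows: (customer, sales); None stands in for NULL
rows = [("Ann", 100), ("Bob", 300), ("Ann", 200), ("Cy", None)]

# Aggregates ignore NULL values
sales = [amount for _, amount in rows if amount is not None]

total = sum(sales)                                       # SUM([Sales])
average = total / len(sales)                             # AVG([Sales])
count = len(sales)                                       # COUNT([Sales])
distinct_customers = len({name for name, _ in rows})     # COUNTD([Customer])

print(total, average, count, distinct_customers)
```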

5. Number Calculations

Number functions help in performing mathematical operations on numeric data.

 These functions are essential for calculating ratios, margins, indexes, etc.

 Example: ROUND([Profit]/[Sales], 2) returns the profit ratio rounded to two decimal places.

 Functions include: ABS(), CEILING(), FLOOR(), POWER(), SQRT(), LOG(), and RANDOM().

6. Level of Detail (LOD) Expressions

LOD expressions are powerful features that allow you to control the level of aggregation independent
of the view’s dimensions.

 LOD calculations let you fix, include, or exclude certain dimensions in the calculation.

 {FIXED [Region] : SUM([Sales])} calculates total sales per region, ignoring other fields in the
view.

 Types of LOD expressions:

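o FIXED – Computes using only the specified dimensions, regardless of the dimensions in the view.

o INCLUDE – Computes using the specified dimensions in addition to those already in the view.

o EXCLUDE – Computes after removing the specified dimensions from the view's level of detail.
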
 LOD expressions are useful when comparing individual performance against a group or for
nested aggregation.
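INCLUDE and EXCLUDE are easier to follow with numbers. The Python sketch below emulates two cases on hypothetical data: an average of per-customer totals in a view by region (an INCLUDE-style calculation), and a regional total shown in a view by region and category (an EXCLUDE-style calculation). It illustrates the arithmetic only and is not Tableau syntax.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (region, customer, category, sales)
rows = [
    ("East", "Ann", "Furniture", 100),
    ("East", "Ann", "Technology", 300),
    ("East", "Bob", "Furniture", 200),
    ("West", "Cy", "Furniture", 400),
]

# INCLUDE-style: the view is by region, but sales are first summed per
# (region, customer) and then averaged up to the region level.
per_customer = defaultdict(int)
for region, customer, _category, sales in rows:
    per_customer[(region, customer)] += sales
totals_by_region = defaultdict(list)
for (region, _customer), customer_total in per_customer.items():
    totals_by_region[region].append(customer_total)
include_avg = {region: mean(totals) for region, totals in totals_by_region.items()}

# EXCLUDE-style: the view is by (region, category), but category is dropped
# from the calculation, so every cell shows its region's total.
region_total = defaultdict(int)
for region, _customer, _category, sales in rows:
    region_total[region] += sales
exclude_total = {(region, category): region_total[region]
                 for region, _customer, category, _sales in rows}

print(include_avg, exclude_total)
```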

7. Quick Table Calculations

Quick table calculations are prebuilt functions that can be applied to measures quickly without
writing formulas.

 These calculations are useful for trend analysis, comparison, and data patterns.

 Common quick table calculations:

o Running Total

o Difference

o Percent Difference

o Percent of Total

o Moving Average

o Rank

 To apply, right-click on the measure and select “Quick Table Calculation.”
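The arithmetic behind three of these quick table calculations can be sketched in Python. The yearly figures are hypothetical, and the moving average assumes a window of the current value plus the previous two, which is an assumption about the window setting rather than something stated in these notes.

```python
# Hypothetical aggregated values in the view: SUM(Sales) per year
yearly_sales = [100, 150, 120, 180]
pairs = list(zip(yearly_sales, yearly_sales[1:]))

# Difference and Percent Difference from the previous value;
# the first value has nothing to compare against, shown here as None.
difference = [None] + [current - previous for previous, current in pairs]
percent_difference = [None] + [(current - previous) / previous
                               for previous, current in pairs]


def moving_average(values, previous=2):
    """Average of each value and up to `previous` values before it."""
    result = []
    for i in range(len(values)):
        window = values[max(0, i - previous): i + 1]
        result.append(sum(window) / len(window))
    return result


moving_avg = moving_average(yearly_sales)
print(difference, percent_difference, moving_avg)
```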

Calculated Fields in Tableau

In Tableau, a calculated field is a custom field created by using a formula to modify or create new
data from existing fields. This allows users to perform custom computations, format values, create
dynamic categories, or filter data based on logical conditions.

Steps to Create a Calculated Field in Tableau

1. Right-click on any blank area in the Data pane and select "Create Calculated Field".
This option opens a dialog box where the user can enter the name and formula for the new
calculated field.

2. Give a name to the calculated field to identify it easily.
Naming the field helps organize and recognize it when used in visualizations.

3. Write a valid formula using Tableau’s functions, operators, and existing fields.
You can use functions like IF, SUM, AVG, DATEDIFF, LEFT, etc., depending on your
requirement.

4. Tableau checks the syntax of the formula automatically.
If the formula is correct, a green check mark appears. If not, Tableau highlights the error.

5. Click OK to save the calculated field.
The new field is added to the Data pane and can now be used like any other field in your
worksheets.

6. You can drag the calculated field into Rows, Columns, or the Marks card.
It behaves like a regular dimension or measure depending on the type of formula used.

Example of a Calculated Field

 To calculate a discounted price:
Discounted Price = [Sales] - ([Sales] * [Discount])

 To classify customers:
IF [Sales] > 1000 THEN "High Value" ELSE "Regular" END
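The two calculated fields above can be checked by hand with a small Python sketch. The function names are hypothetical, and the sketch assumes [Discount] is stored as a fraction (0.25 for 25%), which the notes do not state.

```python
def discounted_price(sales: float, discount: float) -> float:
    """Mirror [Sales] - ([Sales] * [Discount]); discount assumed to be a fraction."""
    return sales - sales * discount


def customer_class(sales: float) -> str:
    """Mirror IF [Sales] > 1000 THEN "High Value" ELSE "Regular" END."""
    return "High Value" if sales > 1000 else "Regular"


price = discounted_price(200.0, 0.25)
labels = [customer_class(amount) for amount in (1500, 1000, 250)]
print(price, labels)
```

Note that a sale of exactly 1000 is classified as "Regular" because the condition uses a strict greater-than comparison.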

Working with Parameters: Creating Parameters, Parameters in calculations, Using Parameters with
filters

In Tableau, parameters are dynamic values that can be used to control various aspects of a
visualization. They are similar to variables and allow users to input or select a value that influences
the view. Parameters increase interactivity and flexibility by enabling end users to modify
calculations, filters, or reference lines on dashboards without editing the worksheet.

1. Creating Parameters in Tableau

Parameters can be created manually by the user to serve as inputs that can drive logic or user
interaction.

 You can create a parameter by right-clicking in the Data pane and choosing “Create
Parameter”.

 Each parameter needs a name, data type (Integer, Float, Boolean, Date, Date & Time, or String),
and allowable values.

 Allowable values can be:

o A range (e.g., 1 to 100),

o A list of specific values (e.g., "High", "Medium", "Low"),

o Or any value (user-defined input).

 After creation, parameters do not automatically affect the visualization until they are used in
calculations or filters.

2. Using Parameters in Calculations

One of the most powerful uses of parameters is in calculated fields. You can write formulas where
the output changes depending on the parameter value.

 Parameters can be used with logical functions like IF, IIF, or CASE to control outputs.

 Example: You can create a calculated field like
IF [Select Measure] = "Sales" THEN [Sales] ELSE [Profit] END
where [Select Measure] is a parameter with choices "Sales" or "Profit".

 This lets users switch between different metrics without changing the underlying worksheet.

 You can also use parameters to control thresholds or reference lines, e.g.,
IF [Sales] > [Target Value Parameter] THEN "Above Target" ELSE "Below Target" END
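The effect of a parameter on a calculated field can be sketched in Python by treating the parameter as an ordinary function argument. The function names and sample row are hypothetical; the code mirrors the two formulas above and is not Tableau syntax.

```python
def selected_measure(row: dict, select_measure: str) -> float:
    """Mirror IF [Select Measure] = "Sales" THEN [Sales] ELSE [Profit] END."""
    return row["Sales"] if select_measure == "Sales" else row["Profit"]


def target_status(sales: float, target_value: float) -> str:
    """Mirror the threshold formula driven by [Target Value Parameter]."""
    return "Above Target" if sales > target_value else "Below Target"


row = {"Sales": 1200.0, "Profit": 300.0}  # hypothetical row

# Changing only the parameter value changes the result; the formula stays the same
print(selected_measure(row, "Sales"), selected_measure(row, "Profit"))
print(target_status(row["Sales"], 1000), target_status(row["Sales"], 1500))
```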

3. Using Parameters with Filters


Parameters can be combined with filters to make dynamic filtering options in the dashboard.

 You can create a calculated field using a parameter and use that field in the Filters shelf.

 Example: Suppose you have a parameter called Select Category with values like “Furniture”,
“Technology”, “Office Supplies”.
Then you can create a calculated field:
IF [Category] = [Select Category] THEN "Show" ELSE "Hide" END

 Drag this field into the Filters shelf and keep only the "Show" value.

 This allows users to filter the view using a parameter dropdown, not a traditional filter box.
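The Show/Hide technique above amounts to tagging each row and keeping only the rows tagged "Show". A minimal Python sketch, with hypothetical rows and a hypothetical helper name `show_or_hide`:

```python
def show_or_hide(category: str, select_category: str) -> str:
    """Mirror IF [Category] = [Select Category] THEN "Show" ELSE "Hide" END."""
    return "Show" if category == select_category else "Hide"


# Hypothetical rows: (category, sales)
rows = [
    ("Furniture", 100),
    ("Technology", 300),
    ("Office Supplies", 50),
    ("Furniture", 70),
]

select_category = "Furniture"  # value chosen in the parameter dropdown

# Filters shelf: keep only the rows flagged "Show"
visible = [row for row in rows if show_or_hide(row[0], select_category) == "Show"]
print(visible)
```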

4. Other Uses of Parameters

 Parameters can be used to control the size or number of bins in histograms.

 You can use parameters in Top N filtering, e.g., show Top 10 products where 10 is a
parameter.

 Parameters are also used for changing date ranges, such as showing data for last N days.

 You can use parameters to switch between different charts (e.g., line chart and bar chart)
using calculated fields and container layouts.
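Two of the uses above, Top N filtering and a "last N days" date range, can be sketched in Python with N passed in as the parameter. The function names, sample data, and the choice to treat the current day as inclusive are assumptions made for this example.

```python
from datetime import date, timedelta


def top_n_products(sales_by_product: dict, n: int) -> list:
    """Names of the n products with the highest sales."""
    ranked = sorted(sales_by_product, key=sales_by_product.get, reverse=True)
    return ranked[:n]


def last_n_days(orders: list, n: int, today: date) -> list:
    """Orders dated within the n days up to and including `today`."""
    cutoff = today - timedelta(days=n)
    return [(day, amount) for day, amount in orders if cutoff < day <= today]


sales_by_product = {"A": 500, "B": 900, "C": 100, "D": 700}  # hypothetical totals
orders = [(date(2024, 1, 1), 10), (date(2024, 1, 9), 20), (date(2024, 1, 10), 30)]

top_two = top_n_products(sales_by_product, 2)
recent = last_n_days(orders, 3, today=date(2024, 1, 10))
print(top_two, recent)
```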

Benefits of Using Parameters

 Parameters enhance interactivity of dashboards by letting end users control what they see.

 They reduce the need for multiple dashboards or worksheets by allowing dynamic
selection.

 Parameters make calculated fields more flexible by letting users control conditions or values.

Column selection parameters, Chart selection parameters

Column Selection Parameters

Column Selection Parameters allow users to dynamically choose which column (field) from the data
to display or analyze in the visualization.

 In Tableau, you create a parameter listing the names of columns (for example, “Sales,”
“Profit,” or “Quantity”) as selectable options.

 You then create a calculated field that uses this parameter to display the chosen column’s
data. For example,
CASE [Select Column] WHEN "Sales" THEN [Sales] WHEN "Profit" THEN [Profit] ELSE
[Quantity] END

 This calculated field can be placed in rows, columns, or marks, so the chart updates
automatically based on the selected column.

 This method is useful when you want a single chart to show different metrics without
creating separate sheets.

 It enhances interactivity and saves space in dashboards by letting users switch views easily.
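To show how one chart can serve several metrics, the Python sketch below mirrors the CASE expression above and then totals the chosen column per category, which is roughly what the view would display. The rows and function names are hypothetical.

```python
def select_column(row: dict, select_column_param: str) -> float:
    """Mirror the CASE expression; any unlisted choice falls through to Quantity."""
    if select_column_param == "Sales":
        return row["Sales"]
    if select_column_param == "Profit":
        return row["Profit"]
    return row["Quantity"]


# Hypothetical rows
rows = [
    {"Category": "Furniture", "Sales": 100, "Profit": 20, "Quantity": 3},
    {"Category": "Furniture", "Sales": 200, "Profit": 50, "Quantity": 5},
    {"Category": "Technology", "Sales": 300, "Profit": 90, "Quantity": 2},
]


def chart_values(rows: list, select_column_param: str) -> dict:
    """Total of the selected column per category (what the bars would show)."""
    totals = {}
    for row in rows:
        value = select_column(row, select_column_param)
        totals[row["Category"]] = totals.get(row["Category"], 0) + value
    return totals


print(chart_values(rows, "Sales"))
print(chart_values(rows, "Profit"))
```

Because of the ELSE branch, any parameter value other than "Sales" or "Profit" shows Quantity, so the list of allowable parameter values should match the branches of the expression.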

Chart Selection Parameters

Chart Selection Parameters allow users to switch between different chart types dynamically within
the same dashboard or worksheet.

 You create a parameter with a list of chart options like “Bar Chart,” “Line Chart,” “Pie Chart,”
or “Scatter Plot.”

 Then, create calculated fields or use sheet swapping techniques to show or hide charts
based on the parameter’s selected value.

 For example, you can create a separate sheet for each chart type, place the sheets together in a
layout container on the dashboard, and filter each sheet on a calculated field based on the
parameter so that only the chosen chart is displayed.

 Another approach is to use calculated fields that change the measure or dimension
depending on the parameter, which can visually mimic chart switching.

 This approach makes dashboards more interactive and user-friendly, allowing exploration of
data in multiple visual formats without clutter.
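The sheet swapping idea can be reduced to a simple rule: each sheet is kept only when the parameter matches the chart type it draws. The Python sketch below illustrates that rule; the sheet names and helper functions are hypothetical, and the dashboard layout itself is not modelled.

```python
# Hypothetical mapping of parameter values to the sheet that draws that chart
SHEET_FOR_CHART = {
    "Bar Chart": "Sales Bar",
    "Line Chart": "Sales Line",
}


def sheet_has_data(sheet_chart_type: str, chart_type_param: str) -> bool:
    """Filter condition applied to each sheet: keep it only for the chosen chart."""
    return sheet_chart_type == chart_type_param


def visible_sheets(chart_type_param: str) -> list:
    """Sheets that still have data, and so take up space, for the selected chart type."""
    return [sheet for chart, sheet in SHEET_FOR_CHART.items()
            if sheet_has_data(chart, chart_type_param)]


print(visible_sheets("Line Chart"))
```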

Difference Between Calculations and Parameters in Tableau

1. Definition:
Calculations in Tableau are formulas or expressions created to derive new values from
existing data fields.

Parameters, on the other hand, are dynamic input values that users can control to change
how data is displayed or analysed.

2. Purpose:
Calculations are used to perform operations like arithmetic, logical comparisons, or string
manipulation on data to generate new fields or insights.

Parameters are used to make visualizations interactive by letting users select or enter values
that influence calculations, filters, or other controls.

3. Usage:
Calculations automatically compute values based on the data and formula provided.

Parameters require user input or selection and do not compute values themselves but serve
as inputs to calculations or filters.

4. Interaction:
Users do not directly interact with calculations; they only see the results in the
visualization.

Parameters provide a control interface such as dropdowns, sliders, or input boxes that allow
users to influence calculations or filter data dynamically.

5. Flexibility:
Calculations are fixed formulas that update when data changes but remain static in logic
unless edited.

Parameters are flexible and can change at runtime, allowing users to modify calculations or
views without editing the workbook.

6. Example:
A calculation might be Profit Margin = [Profit] / [Sales].

A parameter could be “Select Region,” allowing users to choose a specific region to filter the
data shown.

Module 5
Creating Dashboards: Building and Formatting Dashboards using objects, size, views, filters and
legends

Dashboards in Tableau

A dashboard in Tableau is a collection of multiple views, such as charts, maps, and tables, arranged
on a single screen to provide a comprehensive visual analysis. Creating effective dashboards involves
combining different visual components and formatting them to make the information clear and
interactive.

Building Dashboards

 To create a dashboard, click the “New Dashboard” button at the bottom of the Tableau
interface.

 You can add various objects such as worksheets (views), images, web pages, text boxes, and
blank spaces by dragging them from the Dashboard pane onto the workspace.

 It is important to arrange these objects logically to tell a coherent data story and support
easy interpretation.

 Multiple views can be added side-by-side or stacked vertically, depending on how you want
to present the data.

Formatting Dashboards

 Size: You can set the dashboard to a fixed width and height, to a range, or to Automatic, which
resizes the dashboard to fit the window in which it is displayed. Fixed size is useful for precise
layout control, while automatic sizing adapts to different screens. Device layouts (desktop,
tablet, or phone) can be added separately.

 Objects: Objects like text, images, and web content help to add context or branding. They
can be resized and positioned anywhere on the dashboard.

 Views: These are the core charts or visualizations added to the dashboard. You can resize
views, move them around, and even layer them to create complex visual effects.

 Filters: Filters added to a dashboard can control one or multiple views simultaneously. By
adding filter controls to the dashboard, users can interactively refine the data shown across
charts.

 Legends: Legends explain the colors, sizes, or shapes used in views. You can add legends to
the dashboard and position them in a way that helps users quickly understand the meaning
of visual elements.

Additional Features

 You can add actions such as filter actions, highlight actions, or URL actions to increase
interactivity. For example, clicking a data point in one chart can filter another chart.

 Use floating objects for elements that should overlap other objects or sit at a fixed position and size on the dashboard.

 Consistent use of fonts, colors, and spacing improves readability and professional
appearance.

Best practices for making interactive dashboards

Creating interactive dashboards in Tableau requires careful design and thoughtful use of features to
ensure the dashboard is user-friendly, insightful, and efficient.

1. Keep the Dashboard Simple and Focused

Avoid clutter by including only the most important views and filters. A clean, focused layout helps
users understand the key insights without being overwhelmed. Limit the number of charts to those
that directly support the dashboard’s purpose.

2. Use Consistent and Clear Layouts

Organize dashboard elements logically, grouping related views together. Use consistent fonts, colors,
and sizes to maintain a professional and cohesive look. Align objects properly to improve readability
and visual appeal.

3. Provide Intuitive Filters and Controls


Use interactive filter controls (formerly called quick filters) and parameters to give users control over the data they want to explore.
Place filters in a visible, easily accessible area. Avoid excessive filters that confuse users, and use filter
actions to connect charts seamlessly.

4. Use Actions to Enhance Interactivity

Incorporate filter actions, highlight actions, and URL actions to let users interact with the dashboard
naturally. For example, clicking on a bar in one chart can filter data in other charts, helping users drill
down into details.

5. Optimize Performance

Reduce load times by limiting the number of complex calculations and data sources. Use extracts
instead of live connections when possible. Avoid showing too many marks or overly detailed views
that slow down responsiveness.

6. Use Tooltips Effectively

Enhance insights with tooltips that display additional information when users hover over data points.
Customize tooltips to show relevant details without cluttering the main view.

7. Make Use of Parameters for Flexibility

Parameters allow users to change measures, date ranges, or other inputs dynamically. Use
parameters to let users switch views or adjust thresholds, increasing the dashboard’s adaptability.

8. Design for Your Audience and Devices

Consider who will use the dashboard and on what devices. Use device layouts to optimize
dashboards for desktop, tablets, or phones. Ensure interactivity works well across all platforms.

9. Test and Iterate

After building the dashboard, test it with real users to get feedback on usability and performance.
Make adjustments based on their input to improve clarity and interactivity.

Creating Stories: Creating stories, adding annotations

Creating Stories in Tableau

Stories in Tableau are a sequence of sheets or dashboards arranged to convey a data-driven
narrative. They help present insights step-by-step, guiding viewers through the analysis in a
structured and engaging way.

Creating Stories

 To create a story, click the “New Story” tab at the bottom of the Tableau interface.

 Stories consist of story points, which are individual sheets or dashboards arranged in a
specific order.

 You add sheets or dashboards as story points by dragging them onto the story workspace.

 Each story point can be customized with a title or caption to explain what the view
represents.

 Users can navigate through the story points using arrows or a scrollbar, making the story easy
to follow.

Adding Annotations

 Annotations allow you to add text or notes directly on visualizations to highlight key insights
or explain specific data points.

 To add annotations, right-click on a data point or an empty area in a worksheet and select
options like “Annotate Mark,” “Annotate Point,” or “Annotate Area.”

 You can format the annotation text, adjust its position, and use arrows or boxes to draw
attention to specific parts of the visualization.

 In stories, annotations help clarify the message, making the narrative more informative and
understandable.

 Adding clear annotations improves communication and ensures that viewers grasp the
important aspects without confusion.

Highlight actions, URL actions and Filter actions

Highlight Actions

 Highlight Actions in Tableau are interactive features that emphasize related data points
across one or more worksheets when you hover over or select a mark.

 When you trigger a highlight action, Tableau visually brightens the selected data points and
dims the rest, helping users focus on specific information.

 Highlight actions are useful for spotting relationships and patterns without filtering out data.

 For example, hovering over a sales region in one chart can highlight the same region in other
charts on the dashboard.
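The key property of a highlight action is that no marks are removed; matching marks are emphasized and the rest are dimmed. A Python sketch of that behaviour follows, with hypothetical marks and a hypothetical `apply_highlight` helper.

```python
# Hypothetical marks from another chart on the same dashboard
marks = [
    {"Region": "East", "Sales": 400},
    {"Region": "West", "Sales": 200},
    {"Region": "East", "Sales": 150},
]


def apply_highlight(marks: list, field: str, value: str) -> list:
    """Flag marks that match the hovered value; every mark stays in the result."""
    return [{**mark, "highlighted": mark[field] == value} for mark in marks]


result = apply_highlight(marks, "Region", "East")
print(result)
```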

URL Actions

 URL Actions let you create clickable links within Tableau dashboards that open web pages,
documents, or other resources.

 You can configure URL actions to pass parameters from Tableau, such as a selected customer
ID, to dynamically generate specific web pages.

 This feature is helpful for integrating external data sources, detailed reports, or related
information accessible via the internet or intranet.

 For example, clicking a product name in Tableau could open its detailed product page on a
company website.
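A URL action typically combines a fixed address with field values taken from the selected mark, written as placeholders in angle brackets. The Python sketch below illustrates that substitution; the `example.com` address, the field name, and the `build_url` helper are assumptions made for the example.

```python
from urllib.parse import quote


def build_url(template: str, mark: dict) -> str:
    """Replace each <Field Name> placeholder with the URL-encoded field value."""
    url = template
    for field, value in mark.items():
        url = url.replace(f"<{field}>", quote(str(value)))
    return url


# Hypothetical URL template and selected mark
template = "https://www.example.com/products?name=<Product Name>"
selected_mark = {"Product Name": "Office Chair"}

link = build_url(template, selected_mark)
print(link)
```

Encoding the value matters because field values often contain spaces or other characters that are not valid in a URL.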

Filter Actions

 Filter Actions enable users to filter data in one or more worksheets by interacting with marks
in another worksheet.

 When a user clicks or selects data points in one view, the filter action updates other views to
show only related data.

 Filter actions help drill down into details and explore data dynamically without manually
adjusting filters.

 For instance, selecting a country on a map can filter a sales chart to display only sales from
that country.
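Unlike a highlight action, a filter action removes non-matching rows from the target view. The Python sketch below shows the idea with hypothetical data; returning all rows when nothing is selected is one possible setting and is an assumption here.

```python
# Hypothetical rows behind the target sales chart
sales_rows = [
    {"Country": "India", "Sales": 500},
    {"Country": "Japan", "Sales": 300},
    {"Country": "India", "Sales": 200},
]


def apply_filter_action(target_rows: list, field: str, selected_values: set) -> list:
    """Keep only rows whose field value was selected in the source view."""
    if not selected_values:
        return list(target_rows)  # assumption: clearing the selection shows all values
    return [row for row in target_rows if row[field] in selected_values]


filtered = apply_filter_action(sales_rows, "Country", {"India"})
unfiltered = apply_filter_action(sales_rows, "Country", set())
print(filtered)
```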

Dashboard examples using Tableau workspace

Dashboard Examples Using Tableau Workspace

Tableau workspace provides a flexible environment to create dashboards by combining multiple
visualizations and interactive elements. Below are some common examples of dashboards you can
build using Tableau’s workspace:

1. Sales Performance Dashboard

This dashboard displays overall sales metrics such as total sales, profit, and sales trends over time. It
typically includes:

 A line graph showing sales trends by month or quarter.

 A bar chart comparing sales across regions or product categories.

 A map view plotting sales by geographical location.

 Filters to allow users to select specific time periods, regions, or product types.

2. Customer Analysis Dashboard

This dashboard helps understand customer behavior and segmentation. Common views include:

 A pie chart showing customer distribution by demographics like age or gender.

 A scatter plot comparing customer spend and frequency.

 A table listing top customers by revenue or transactions.

 Interactive filters for customer segments, purchase dates, or loyalty levels.

3. Financial Overview Dashboard

Used by finance teams to monitor key financial indicators. It often features:

 A bullet graph displaying budget vs. actual expenses.

 A box plot highlighting variations in costs across departments.

 A heat map showing revenue intensity by product line and region.

 Date filters and drop-down menus for different fiscal years or quarters.

4. Marketing Campaign Dashboard

This dashboard tracks the effectiveness of marketing efforts with:


 A bar chart of campaign performance by channel (email, social media, ads).

 A line chart showing leads generated over time.

 A geographical map indicating lead sources by location.

 Filters for campaign dates, channels, and target audience segments.

Using Tableau Workspace for Dashboards

 Tableau workspace allows dragging and dropping worksheets and objects onto the
dashboard area.

 You can resize and arrange charts, images, and filters to create a clear layout.

 Filters and legends can be added to improve user interaction and understanding.

 Interactive actions like filter actions and highlight actions can be configured for dynamic data
exploration.
