Data science questions and answers
Tutorial 2
Exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main
characteristics, often using visual methods. It is a crucial step in data analysis that helps to
uncover underlying patterns, detect anomalies, and test assumptions.
1. Data Collection and Cleaning: Load the data and clean it by handling missing values,
duplicates, and inconsistencies.
2. Data Profiling: Understand the data types, unique values, and summary statistics.
3. Univariate Analysis: Analyze each variable individually using visualizations like
histograms or box plots.
4. Bivariate and Multivariate Analysis: Explore relationships between multiple variables
using scatter plots, correlation matrices, and pair plots.
5. Outlier Detection: Identify and handle outliers using statistical techniques (such as
the interquartile range rule or z-scores) or visual methods (such as box plots).
6. Feature Engineering: Create new features based on existing data to improve model
performance.
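The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive workflow: the sample records, the column names (id, age, income), the median fill for missing ages, the 1.5 x IQR outlier rule, and the age_group feature are all assumptions made for the example. The plotting steps (3 and 4) are left out because they need a charting library.

```python
import statistics

# Hypothetical sample records; None marks a missing value.
raw_rows = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": 29, "income": 48000},
    {"id": 2, "age": 29, "income": 48000},    # exact duplicate
    {"id": 3, "age": None, "income": 51000},  # missing age
    {"id": 4, "age": 41, "income": 50000},
    {"id": 5, "age": 38, "income": 250000},   # unusually high income
    {"id": 6, "age": 45, "income": 47000},
]


def clean(rows):
    """Step 1: drop exact duplicates, then fill missing ages with the median."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))
    known_ages = [r["age"] for r in unique if r["age"] is not None]
    fill_value = statistics.median(known_ages)
    for r in unique:
        if r["age"] is None:
            r["age"] = fill_value
    return unique


def profile(rows, column):
    """Step 2: basic summary statistics for one numeric column."""
    values = [r[column] for r in rows]
    return {
        "count": len(values),
        "unique": len(set(values)),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }


def iqr_outliers(values, k=1.5):
    """Step 5: flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def add_age_group(rows):
    """Step 6: derive a categorical feature from an existing column."""
    for r in rows:
        r["age_group"] = "40_plus" if r["age"] >= 40 else "under_40"
    return rows


cleaned = clean(raw_rows)
income_profile = profile(cleaned, "income")
outliers = iqr_outliers([r["income"] for r in cleaned])
featured = add_age_group(cleaned)

print(income_profile)
print("Income outliers:", outliers)
```

Whether a flagged outlier should be removed, capped, or kept depends on the data; the sketch only identifies it.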
Statistical analysis involves collecting and analyzing data to identify patterns and trends. It
provides the foundation for evidence-based decision-making in business intelligence.
Measures of Central Tendency: Mean, median, and mode describe the central point of a
data set.
Measures of Dispersion: Range, variance, and standard deviation show how spread out
the data is.
Frequency Distributions: Show how often each value, or range of values, occurs in
categorical and numerical data.
Confidence Intervals: Estimate a range that, at a stated confidence level (for example 95%),
is expected to contain a population parameter.
Correlation and Regression Analysis: Assessing relationships between variables.
ANOVA (Analysis of Variance): Comparing the means of multiple groups to test whether at
least one differs from the others.
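As a rough illustration of the techniques listed above, the following Python sketch computes each of them with the standard library. The data (scores, hours, and the three groups) are made up for the example, the confidence interval uses a normal approximation (for a sample this small a t-distribution would usually be preferred), and the ANOVA part computes only the one-way F statistic, not a p-value.

```python
import math
import statistics
from collections import Counter

# Hypothetical data: exam scores and hours studied for eight students.
scores = [72, 85, 85, 90, 68, 77, 85, 93]
hours = [5, 8, 7, 9, 4, 6, 8, 10]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (sample variance and standard deviation).
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)
stdev = statistics.stdev(scores)

# Frequency distribution: how often each score occurs.
freq = Counter(scores)

# 95% confidence interval for the mean (normal approximation, an assumption).
z = statistics.NormalDist().inv_cdf(0.975)
margin = z * stdev / math.sqrt(len(scores))
ci_low, ci_high = mean - margin, mean + margin


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def regression_slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


r = pearson_r(hours, scores)
slope = regression_slope(hours, scores)
f_stat = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

print(f"mean={mean}, median={median}, mode={mode}, range={data_range}")
print(f"95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
print(f"r={r:.3f}, slope={slope:.3f}, F={f_stat:.2f}")
```

A large F statistic suggests that the group means differ by more than the variation within groups would explain; turning it into a p-value requires the F distribution, which the standard library does not provide.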
Interactive dashboards allow users to explore data dynamically and gain insights through visual
interactions. Reports present data in a structured format for decision-makers.
Identify the Audience: Tailor the dashboard to the needs of its users.
Focus on Key Metrics: Highlight important metrics that align with business goals.
Use Appropriate Visuals: Choose the right charts for the data being presented.
Ensure Interactivity: Enable filters, drill-downs, and hover effects for deeper insights.
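Behind the scenes, a filter or drill-down is usually a filter-then-aggregate operation on the underlying data. The sketch below shows that idea in plain Python; the sales records and the helper names total_by and apply_filter are hypothetical and are not part of any dashboard tool's API.

```python
from collections import defaultdict

# Hypothetical sales records backing a dashboard.
sales = [
    {"region": "North", "product": "A", "revenue": 120},
    {"region": "North", "product": "B", "revenue": 80},
    {"region": "South", "product": "A", "revenue": 200},
    {"region": "South", "product": "B", "revenue": 50},
]


def total_by(rows, field):
    """Aggregate revenue by one field (the metric shown in a chart)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[field]] += row["revenue"]
    return dict(totals)


def apply_filter(rows, **criteria):
    """Keep only rows that match every selected filter value."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


# Top-level view: revenue per region.
overview = total_by(sales, "region")

# Drill-down: the user clicks "North" and sees revenue per product there.
north_detail = total_by(apply_filter(sales, region="North"), "product")

print(overview)
print(north_detail)
```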
4.2 Tools for Creating Interactive Dashboards
Tableau: Offers rich interactive features and supports various data sources.
Power BI: Integrates well with Microsoft products and supports real-time data.
Qlik Sense: Allows for associative data exploration and custom visuals.
Google Data Studio (now Looker Studio): A simple tool that integrates with Google Analytics
and other Google services.