Analytics 3,4,5
Analytics 3,4,5
1. Graphical Perception
Definition: Graphical perception focuses on how humans visually interpret data from
charts and graphs. It's based on cognitive psychology principles.
Key Concepts:
Colors and shapes are better for categorical data, while gradients or
sizes suit quantitative data.
Example:
A line chart is great for trends over time because humans easily perceive the
slope and continuity.
Definition: Interaction dynamics let users actively explore data by interacting with
visualizations.
Techniques:
Definition: Properly using the available area in visualizations to make them clear and
readable.
Strategies:
Aspect Ratios: Maintaining appropriate ratios (e.g., for line charts) avoids
distortion.
Example: Compare an overcrowded pie chart with a well-organized bar chart for the
same data.
4. Stacked Graphs
Definition: Graphs where categories are stacked on top of one another to show part-
to-whole relationships.
Types:
Stacked Bar Charts: Used for comparing cumulative totals across categories.
Stacked Area Charts: Visualize how parts of a whole change over time.
Challenges:
Example: A stacked graph showing total sales per year broken down by product
category.
Key Elements:
Geometry: The shapes and structures of the graph (e.g., circles, bars, lines).
Principles:
Example: Using color gradients for heatmaps or choosing harmonious color palettes.
Key Concepts:
Nodes and Edges: Nodes represent entities (e.g., people), and edges
represent relationships (e.g., friendships).
Types:
Challenges:
8. Text Visualization
Techniques:
1. Modeling Process
Definition: The end-to-end steps for building and applying machine learning models.
Steps:
2. Training Model
How it Works:
The algorithm adjusts its internal parameters (weights) based on input data
and desired outputs.
Involves multiple iterations (epochs) to reduce error.
3. Validating Model
Definition: Testing the model on unseen validation data to assess accuracy and
prevent overfitting.
Metrics:
Definition: Applying the trained model to new data for making predictions.
Definition: Learn from labeled datasets where input-output relationships are known.
Types:
Types:
Definition: A set of moral principles and guidelines to ensure data is used responsibly,
ethically, and transparently.
Key Aspects:
Challenges:
Balancing innovation with ethical constraints.
Addressing unintentional bias that may emerge during data processing or modeling.
Example: Avoiding racial bias in a hiring algorithm by ensuring training data includes diverse
candidates.
Definition: Using data science to solve societal problems and improve lives, rather than
solely pursuing profit-driven goals.
Applications:
Key Principle: Align projects with the broader benefit of society, considering long-term
consequences.
Definition: Understanding and respecting who owns the data and has the rights to it.
Key Considerations:
Open Data: Public datasets made available for research and innovation.
Ethical Issues:
Example: Social media platforms collecting and monetizing user data without transparent
consent.
Key Dimensions:
Data Minimization: Collecting only the data necessary for the task.
Contextual Integrity: Ensuring data is used in ways that align with its intended
purpose.
Legislation:
CCPA (California Consumer Privacy Act): U.S. law ensuring data transparency and
consumer rights.
Example: A health app encrypting user data to protect sensitive medical information.
Definition: Ensuring individuals are aware of and agree to how their data will be collected,
stored, and used.
Best Practices:
Opt-in Mechanisms: Allow users to actively consent rather than relying on default
opt-ins.
Revocation Rights: Let users withdraw consent and delete their data if desired.
Challenges:
Long, complex terms of service often discourage users from fully understanding the
implications.
Example: A mobile app explicitly asking for permission before accessing a user's location.
6. The Five Cs
1. Clarity: Make the purpose and use of data clear to all stakeholders.
Example: An organization following the Five Cs might prioritize user control over profit-
driven data collection practices.
Definition: Ensuring datasets, models, and outcomes are representative of all groups in
society to avoid perpetuating inequality.
Key Actions:
Inclusive Datasets: Use diverse training data that reflects different demographics,
regions, and languages.
Inclusive Teams: Encourage diversity within data science teams to bring varied
perspectives.
Challenges:
Example: Building a voice recognition system that works for all accents, not just a few
dominant ones.
8. Future Trends
Definition: Emerging directions and innovations shaping the field of data science.
Major Trends:
1. Real-time Analytics: With the rise of IoT, systems that analyze data in real-time are
becoming essential.
5. Augmented Reality (AR) and Virtual Reality (VR): Data visualization integrated into
immersive environments.
Challenges:
Balancing innovation with privacy and ethical concerns.
Example: Using AI-powered analytics for precision farming to minimize resource waste and
maximize crop yield.