Data Literacy
Key-concepts
Understanding of data literacy
Identify the difference between Quantitative (Numerical) and Qualitative (Categorical) Data
Impact of data literacy with the help of case studies and scenarios
Best practices for Cyber Security
Data Literacy
"Rahul rated the 3 films he watched consecutively as bad, best and average, respectively."
Can you filter the data from this statement? Are they of the same type?
_____________________________________________________________________________________
_____________________________________________________________________________________
_____________________________________________________________________________________
_________________________________________________________
Purpose: The purpose of this activity is to engage participants in various scenarios that involve collecting data
and analyzing its sources. Emphasizing the importance of validating data sources, the aim is to instill the concept
of data literacy. By understanding how authentic data sources contribute to reliable and unbiased decision-
making, participants will develop critical skills for navigating and interpreting data effectively.
Brief: [Pair Activity] Participants will search the internet for data sources, extracting key information to support
their decisions.
Rank the sources of the news articles from most accurate to least accurate, and state the reasons for
your choices.
So, we can conclude that every piece of data tells a story, but we must be careful before believing the story.
Data literacy is essential because it enables individuals to make informed decisions, think critically, solve
problems, and innovate.
2.1.3 How to become Data Literate?
Every piece of data tells a story, but we must be careful before believing the story. A data-literate person
is one who can interact with data to understand the world around them.
Data literacy helps people research products while shopping on the
internet.
How do you decide the following things when you are shopping online?
Which is the cheapest product available?
Which product is liked by the users the most?
Does a particular product meet all the requirements?
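The three shopping questions above can be answered directly from data. The sketch below is only an illustration: the product names, prices, ratings, and the requirements (waterproof, priced at 900 or less) are all made-up assumptions, not real listings.

```python
# Hypothetical product listings; every name and number here is made up for illustration.
products = [
    {"name": "Bag A", "price": 799, "rating": 4.1, "waterproof": True},
    {"name": "Bag B", "price": 649, "rating": 3.8, "waterproof": False},
    {"name": "Bag C", "price": 999, "rating": 4.6, "waterproof": True},
]

# Which is the cheapest product available?
cheapest = min(products, key=lambda p: p["price"])

# Which product is liked by the users the most?
best_rated = max(products, key=lambda p: p["rating"])

# Does a particular product meet all the requirements?
# Assumed requirements: waterproof and priced at 900 or less.
def meets_requirements(product, max_price=900):
    return product["waterproof"] and product["price"] <= max_price

suitable = [p["name"] for p in products if meets_requirements(p)]

print("Cheapest:", cheapest["name"])
print("Best rated:", best_rated["name"])
print("Meets requirements:", suitable)
```

Notice that the cheapest product, the best-rated product, and the product that meets the requirements can all be different, which is why a data-literate shopper looks at more than one number.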
The data literacy framework provides guidance on using data efficiently and with all levels of awareness.
Data literacy framework is an iterative process.
2.1.4 What are Data Security and Privacy? How are they related to AI?
Data Privacy and Data Security are often used interchangeably but they are different from each other.
Here are examples of two things which may compromise our data privacy
Why is it important?
A data breach at a government agency can put secret information in the hands of an enemy state.
A breach at a corporation can put proprietary data in the hands of a competitor.
A breach at a hospital can put personal health information in the hands of those who might misuse it.
The following best practices can help you ensure data privacy:
Understand what data you have collected, how it is handled, and where it is stored.
Collect only the data that is necessary for a project.
Give utmost importance to user consent during data collection.
Why is it important?
Due to the rising amount of data in the cloud, there is an increased risk of
cyber threats. With so much traffic being generated, the most appropriate step is to
control and protect the transfer of sensitive or personal information wherever it takes place.
The most likely reasons why data security is more important now are:
Cyber-attacks affect everyone.
Rapid technological change will lead to a boom in cyber-attacks.
Reference Links:
Video: https://www.youtube.com/watch?v=aO858HyFbKI
Use strong, unique passwords with a mix of characters for each account.
Activate Two-Factor Authentication (2FA) for added security.
Download software from trusted sources and scan files before opening.
Prioritize websites with "https://" for secure logins.
Keep your browser, OS, and antivirus updated regularly.
Adjust social media privacy settings for limited visibility to close contacts.
Always lock your screen when away.
Connect only with trusted individuals online.
Use secure Wi-Fi networks.
Report online bullying to a trusted adult immediately.
Avoid sharing personal info like real name or phone number.
Don't send pictures to strangers or post them on social media.
Don't open emails or attachments from unknown sources.
Ignore suspicious requests for personal info like bank account details.
Keep passwords and security questions private.
Don't copy copyrighted software without permission.
Avoid cyberbullying or using offensive language online.
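The first practice in the list above, using strong passwords with a mix of characters, can be illustrated with a short sketch. It uses Python's standard `secrets` module, which is designed for security-sensitive random choices. The function name `make_password` and the default length of 16 are assumptions made for this example, not an official rule.

```python
import secrets
import string

def make_password(length=16):
    """Return a random password with lowercase, uppercase, digit, and symbol characters."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character type")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        # Draw every character with a cryptographically secure random choice.
        password = "".join(secrets.choice(alphabet) for _ in range(length))
        # Keep the password only if it really contains the full mix of characters.
        if (any(c.islower() for c in password)
                and any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in string.punctuation for c in password)):
            return password

print(make_password())
```

Each call produces a different password, so a separate one can be generated for each account.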
Revision Time:
1. Cultivating Data Literacy means:
a) Utilize vocabulary and analytical skills
b) Acquire, develop, and improve data literacy skills
c) Develop skills in statistical methodologies
d) Develop skills in Math
2. Data Privacy and Data Security are often used interchangeably but they are different from each other
a) True
b) False
3. The _____________________ provides guidance on using data efficiently and with all levels of awareness.
a) data security framework
b) data literacy framework
c) data privacy framework
d) data acquisition framework
5. __________ is the practice of protecting digital information from unauthorized access, corruption, or theft
throughout its entire lifecycle.
a) data security
b) data literacy
c) data privacy
d) data acquisition
2.2 Acquiring Data, Processing, and Interpreting Data
Lesson Title: Acquiring Data, Processing, and Interpreting Data
Approach: Session + Activity
Summary: You will get an understanding of data processing, data interpretation and keywords related
to data.
Learning Objectives
Familiarizing youth with different data terminologies like data acquisition, processing, analysis,
presentation, and interpretation
Discussing different methods of data interpretation like qualitative and quantitative.
Understanding the different methods and techniques of data collection
Thinking critically about their advantages and disadvantages
Identifying various data presentation methods with examples and interpreting them
Gaining awareness of the advantages and impact of data interpretation on business growth
Learning Outcomes
Determine the best methods to acquire data.
Classify different types of data and enlist different methodologies to acquire it.
Define and describe data interpretation.
Enlist and explain the different methods of data interpretation.
Recognize the types of data interpretation.
Realize the importance of data interpretation
Pre-requisites: Acquaintance with data and its different types.
Key-concepts
Familiarizing with different data terminologies like data processing, analysis, presentation, and
interpretation
Quantitative and Qualitative Data Interpretation
Types of Data Interpretation: Textual, Tabular, and Graphical, with examples.
Activity
Session Preparation Logistics: For a class of 40 Students [Pair Activity]
Materials Required:
ITEM                         QUANTITY
Online Data Sources Clues    NA
Computers                    20
Purpose:
The purpose of this activity is to engage participants in acquiring data from online sources. The ability
to locate and access relevant data sources is crucial for AI Projects.
Brief: [Pair Activity] Participants will be locating an online dataset suitable for training an AI model.
They will conduct a search for weather forecast related datasets on various online platforms and then
paste images or screenshots of the datasets found.
2.2.1. Types of data
Data is the foundation of Artificial Intelligence. We come across different types of
information every day. Some common types of data include:
Data Acquisition, also known as acquiring data, refers to the procedure of gathering data. This involves
searching for datasets suitable for training AI models. The process typically comprises three key steps:
Acquiring Data Sample Data Discovery
Sources of Data
Various Sources for Acquiring Data:
Primary Data Sources- Some of the sources for primary data include surveys, interviews,
experiments, etc. The data generated from an experiment is an example of primary data.
Here is an Excel sheet showing the data collected for the students of a class.
Secondary Data Sources- Secondary data collection obtains information from external sources,
rather than generating it personally. Some sources for secondary data collection include:
2. Cleanliness- Clean data is free from duplicates, missing values, outliers, and other anomalies that
may affect its reliability and usefulness for analysis. In this particular example, duplicate values are
removed after cleaning the data.
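The idea of cleaning can be sketched in a few lines of code. The records below are hypothetical and only stand in for the example dataset; the sketch removes exact duplicate rows and rows with missing values, and does not attempt to detect outliers.

```python
# Hypothetical raw records: (name, marks). None marks a missing value.
raw_records = [
    ("Asha", 42),
    ("Ravi", 35),
    ("Asha", 42),     # exact duplicate of the first row
    ("Meena", None),  # missing value
    ("John", 48),
]

def clean(records):
    """Drop rows with missing values and exact duplicate rows, keeping the original order."""
    seen = set()
    cleaned = []
    for row in records:
        if any(value is None for value in row):
            continue  # skip rows that have a missing value
        if row in seen:
            continue  # skip rows we have already kept once
        seen.add(row)
        cleaned.append(row)
    return cleaned

cleaned_records = clean(raw_records)
print(cleaned_records)
```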
3. Accuracy- Accuracy indicates how well the data matches real-world values, ensuring reliability.
Accurate data closely reflects actual values without errors, enhancing the quality and
trustworthiness of the dataset.
In this particular example, we are comparing data gathered from measuring the length of a small box
in centimeters.
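One simple way to compare accuracy is to check how far each measurement is from the actual value, on average. The numbers below are assumptions chosen for illustration (an actual box length of 10.0 cm and two made-up sets of measurements); they are not the values from the example.

```python
# Hypothetical values: the actual length and two sets of measurements, in centimeters.
actual_length_cm = 10.0
measurements_a = [10.1, 9.9, 10.0, 10.2]
measurements_b = [11.0, 9.2, 10.9, 8.8]

def mean_absolute_error(measurements, actual):
    """Average distance between each measurement and the actual value."""
    return sum(abs(m - actual) for m in measurements) / len(measurements)

error_a = mean_absolute_error(measurements_a, actual_length_cm)
error_b = mean_absolute_error(measurements_b, actual_length_cm)

print(f"Set A error: {error_a:.2f} cm")
print(f"Set B error: {error_b:.2f} cm")
print("More accurate set:", "A" if error_a < error_b else "B")
```

The set with the smaller average error reflects the real-world value more closely, so it is the more accurate one.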
Kaggle assigns a usability score to the data sets that are present on the website based on scores
given by the users of that data.
Features of Data
Data features are the characteristics or properties of the data. They describe each piece of information in
a dataset. For example, in a table of student records, features could include things like the student's name,
age, or grade. In a photo dataset, features might be the colors present in each image. These features help
us understand and analyze the data.
Dependent features, on the other hand, are the outputs or results of the model; they're what we're
trying to predict.
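A small sketch can make the split between features concrete. The student records and the choice of "grade" as the value to predict are assumptions made for this example; the remaining columns are then treated as the inputs (independent features).

```python
# Hypothetical student records. Each key is a feature of the dataset.
students = [
    {"age": 14, "hours_studied": 5, "grade": "A"},
    {"age": 15, "hours_studied": 2, "grade": "C"},
]

# Assumption for this sketch: we want to predict "grade" from the other columns.
dependent_feature = "grade"
independent_features = [key for key in students[0] if key != dependent_feature]

# Split each record into the model inputs and the value to be predicted.
inputs = [{key: s[key] for key in independent_features} for s in students]
targets = [s[dependent_feature] for s in students]

print("Independent features:", independent_features)
print("Dependent feature:", dependent_feature)
print("Inputs:", inputs)
print("Targets:", targets)
```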
2.2.5 Data Processing and Data Interpretation
Data Processing
Data processing helps computers understand raw data.
Use of computers to perform different operations on data is
included under data processing.
Data Interpretation
It is the process of making sense out of data that has been
processed.
The interpretation of data helps us answer critical questions
using data.
Data Processing- After raw data is collected, data is processed to derive meaningful
information from it.
Data Analysis- Data analysis examines each component of the data
in order to draw conclusions.
Data Presentation- In this step, you select, organize, and group ideas and
evidence in a logical way.
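The three steps above can be sketched on a tiny example. The raw scores are hypothetical, and the choice of average, highest, and lowest as the analysis is just one possibility.

```python
# Hypothetical raw data as it might be collected: text with stray spaces and one blank entry.
raw_scores = [" 42", "35 ", "", "48", "27"]

# Data processing: turn raw text into numbers the computer can work with.
scores = [int(text.strip()) for text in raw_scores if text.strip()]

# Data analysis: examine the processed values to draw conclusions.
average = sum(scores) / len(scores)
highest = max(scores)
lowest = min(scores)

# Data presentation: organize the findings in a logical, readable way.
summary = f"{len(scores)} scores | average {average:.1f} | highest {highest} | lowest {lowest}"
print(summary)
```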
Based on the two types of data, there are two ways to interpret data-
Quantitative Data Interpretation
Qualitative Data Interpretation
Qualitative Data Interpretation
Qualitative data tells us about the emotions
and feelings of people.
Reviews by customers:
"Pizza toppings here are so tasty!"
"Veg farmhouse pizza is the best here!"
Qualitative data:
Jim and his friends are regular customers.
Veg farmhouse pizza is a popular choice.
Record keeping: This method uses existing reliable documents and other similar sources of information
as the data source. It is similar to going to a library.
Observation: In this method, the participants' behavior and emotions are observed carefully.
Case Studies: In this method, data is collected from case studies.
Focus groups: In this method, data is collected from a group discussion on a relevant topic.
Longitudinal Studies: This data collection method is performed on the same data source repeatedly over
an extended period.
One-to-One Interviews: In this method, data is collected using a one-to-one interview.
Purpose:
This activity will engage youth with longitudinal studies, that is, studies conducted over a considerable
amount of time to identify trends and patterns.
The ability to identify trends and patterns in datasets allows us to make informed decisions
about different tasks in our lives.
Activity Guidelines
Visit the link: https://trends.google.com/trends/?geo=IN (Google Trends)
Explore the website
Check what is trending in the year 2022 (Global)
Make a list of trending sports (top 5)
Make a list of trending movies (top 5)
Check what is trending globally in the year 2022
[Figure labels: Counter; Number of website visits; Cumulative Grade Point Average (CGPA)]
Textual DI
The data is presented in text form, usually in a paragraph.
Used when the data is not large and can be easily comprehended by reading.
Textual presentation is not suitable for large data.
Example:
In the Science Olympiad class of 45 students, 3 students obtained the perfect score of 50. 10 students got
a score of 45 and above, 15 students got a score of 40 and above, 8 students got a score of 30 and
above, 6 students got a score of 20 and above and 3 got 19 and below.
More than 60% of students scored more than 80% marks in the Olympiad!
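The headline figure in this example can be checked with a little arithmetic. The sketch below assumes that each student is counted in exactly one score band and that the maximum score is 50, so that 80% of the marks corresponds to a score of 40.

```python
# Counts taken from the Science Olympiad example. Assumption: each student
# belongs to exactly one band, so the bands do not overlap.
total_students = 45
max_score = 50
band_counts = {
    "perfect score (50)": 3,
    "45 and above": 10,
    "40 and above": 15,
    "30 and above": 8,
    "20 and above": 6,
    "19 and below": 3,
}

# The bands should account for every student in the class.
assert sum(band_counts.values()) == total_students

# 80% of the maximum score is 40, so the top three bands reach that mark.
threshold = 0.8 * max_score
students_at_or_above = (band_counts["perfect score (50)"]
                        + band_counts["45 and above"]
                        + band_counts["40 and above"])
share = 100 * students_at_or_above / total_students

print(f"{students_at_or_above} of {total_students} students scored "
      f"{threshold:.0f} or more ({share:.1f}%)")
```

Under these assumptions, 28 of the 45 students reach a score of 40, which is a little over 62% and supports the statement that more than 60% of the students reached the 80% mark.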
Tabular DI
Data is represented systematically in the form of rows and columns.
Title of the Table (Item of Expenditure) contains the description of the table content.
Column Headings (Year; Salary; Fuel and Transport; Bonus; Interest on Loans; Taxes) contain the
description of the information contained in the columns.
Graphical DI
Bar Graphs
In a Bar Graph, data is represented using vertical or horizontal bars.
Pie Charts
Pie Charts have the shape of a pie and each slice of the pie represents the portion of the entire
pie allocated to each category
It is a circular chart divided into various sections (think of a cake cut into slices)
Each section of the pie chart is proportional to the corresponding value
[Pie chart: Distribution of Math Score. Legend entries: Perfect Score (=50); 45 and Above (>=45); 40 and Above (>=40). Slice labels: 7%, 7%, 13%, 22%, 18%.]
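Because each slice is proportional to its value, the percentage and the angle of every slice can be worked out from the counts. The sketch below assumes the chart is built from the Olympiad band counts given in the textual example (45 students in all); the last three band labels are shortened names chosen for this illustration.

```python
# Band counts from the Olympiad example (45 students in total).
band_counts = {
    "Perfect Score (=50)": 3,
    "45 and Above": 10,
    "40 and Above": 15,
    "30 and Above": 8,
    "20 and Above": 6,
    "19 and Below": 3,
}
total = sum(band_counts.values())

# Each slice's share of the circle is proportional to its value.
slices = {
    band: {"percent": 100 * count / total, "angle": 360 * count / total}
    for band, count in band_counts.items()
}

for band, s in slices.items():
    print(f"{band:<20} {s['percent']:5.1f}%  {s['angle']:6.1f} degrees")
```

When rounded to whole numbers, these shares come to 7%, 22%, 33%, 18%, 13%, and 7%, and the angles always add up to the full 360 degrees of the circle.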
Line Graphs
A line graph is created by connecting various data points.
It shows the change in quantity over time.
Activity: Visualize and Interpret Data
Duration: 40 Minutes
Purpose
This activity will engage youth with data visualization and interpretation.
Visualization makes it easier for us to extract useful information contained in the dataset.
Activity Guidelines
The table shows the details of a class consisting of 50 students and their scores ranging in the listed
categories for 5 subjects: Math, Physics, Chemistry, Social Science, and Biology
Student Performance
Marks Range      Math   Physics   Chemistry   Social Science   Biology
Less than 20     6      3         1           0                0
Between 20-29    14     11        9           15               8
Between 30-40    17     20        21          22               19
Between 41-44    8      10        14          10               16
45 and Above     5      6         5           3                7
Total Students   50     50        50          50               50
Copy the table in an Excel sheet and create the following visualizations for the given data:
Make a bar graph showing the marks distribution for all 5 subjects
Make a pie chart showing the marks distribution for Physics
Make a line chart displaying the marks distribution for Chemistry
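The activity itself is meant to be done in Excel. As an optional alternative, the same table can be explored with a short program. The sketch below first checks that every subject column adds up to 50 students and then draws a simple text bar chart; the helper name `text_bar_chart` is an assumption made for this example.

```python
# Student Performance table from the activity: students per marks range, by subject.
marks_ranges = ["Less than 20", "Between 20-29", "Between 30-40", "Between 41-44", "45 and Above"]
performance = {
    "Math":           [6, 14, 17, 8, 5],
    "Physics":        [3, 11, 20, 10, 6],
    "Chemistry":      [1, 9, 21, 14, 5],
    "Social Science": [0, 15, 22, 10, 3],
    "Biology":        [0, 8, 19, 16, 7],
}

# Sanity check: every subject column should add up to the 50 students in the class.
totals = {subject: sum(counts) for subject, counts in performance.items()}

def text_bar_chart(subject):
    """Return a horizontal bar chart for one subject, using one '#' per student."""
    lines = [f"{subject} (total {totals[subject]})"]
    for label, count in zip(marks_ranges, performance[subject]):
        lines.append(f"{label:<14} {'#' * count} {count}")
    return "\n".join(lines)

print(totals)
print(text_bar_chart("Physics"))
```

The longest bar shows at a glance that most Physics students scored between 30 and 40, which is the kind of insight a bar graph is meant to give.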
Importance of Data Interpretation
Brief:
The following are questions for the quiz. You can either conduct a pen-and-paper quiz or use a
free online quiz portal such as Kahoot. For Kahoot: go to
https://kahoot.com/ and create your login ID. Then create your own kahoot by adding all
the given questions to it. Once it is created, you can start the quiz from your ID, and students can
participate by entering the Game PIN.
Quiz Questions
1. What are the basic building blocks of qualitative data?
a. Individuals
b. Units
c. Categories
d. Measurements
2. Which among these is not a type of data interpretation?
a. Textual
b. Tabular
c. Graphical
d. Raw data
3. Quantitative data is numerical in nature.
a. True
b. False
4. A Bar Graph is an example of?
a. Textual
b. Tabular
c. Graphical
d. None of the above
5. _____________ relates to the manipulation of data to produce meaningful insights.
a. Data Processing
b. Data Interpretation
c. Data Analysis
d. Data Presentation
Pre-requisites:
Meet the learning outcomes of the units learnt so far.
Basic computer skills.
Key-concepts
Mapping AI Project Cycle.
Data Literacy.
Sources of data.
Data acquisition.
Usability of data.
Data processing and interpretation.
Data visualization using Tableau.
Icebreaker Activity
Tic-Tac-Toe
Purpose:
To initiate the concept of data collection
Material required:
Instructions
Activity
Data Visualization Using Tableau
Your favorite songs
Think about songs! Which songs do you listen to? Which songs do you sing?
Do you have a favorite song, artist, album, or playlist?
Let's start thinking about the different aspects of a song, like instruments and lyrics.
Do your favorite songs have anything in common?
Instructions
Download Tableau Public with the help of an adult using this link -
https://public.tableau.com/en-us/s/download
Install the package via the install wizard.
Once installed, double click the program to open the Tableau Public Desktop application.
Now drag the sheet with your data to the "Drag tables here" section.
First, let's recreate the bar chart we made to visualize the number of songs per genre!
Click Sheet1 in the bottom left corner of the screen
the right, releasing it next to the word Columns when a little
orange arrow appears.
We can make the text a little more fun and easier to read.
To do that, click the label square.
This opens up a box that allows us to change the font and text size.
Let's change the font size to 12 and the font to "Chalkboard".
We have our complete bubble chart now!
https://www.youtube.com/watch?v=NLCzpPRCc7U
https://www.youtube.com/watch?v=_M8BnosAD78
Note: You may also use MS Excel or Datawrapper (https://www.datawrapper.de/) for the data
visualization instead of Tableau.
Revision Time:
1. At which stage of the AI project cycle does Tableau software prove useful?
2. Name any five graphs that can be made using Tableau software.
3. In the below excel sheet-