3.1 Data and Data Analysis
Information: provides context for the data and is measured in different units.
Information may answer questions such as 'who', 'what', 'where' and 'when'.
Knowledge: comes next and refers to when more meaning can be derived from
information, which is then applied to achieve a set goal.
Metadata: a set of data that describes and gives information about other data.
For example, a document may store details such as the author, the size of the file
and the date it was created.
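As a rough illustration, here is a minimal Python sketch that reads a few items of metadata for a file. The filename report.docx is a placeholder; note that an author field usually lives inside the document format itself rather than in the file system, so only size and date are shown here:

```python
import os
from datetime import datetime

def file_metadata(path):
    """Return a small dictionary of metadata describing a file."""
    stats = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": stats.st_size,                         # size of the file
        "modified": datetime.fromtimestamp(stats.st_mtime),  # date last changed
    }

print(file_metadata("report.docx"))  # hypothetical file
```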
Data mining: the term used to describe the process of finding patterns
and correlations, as well as anomalies, within large sets of data.
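One very simple pattern-finding idea can be sketched in Python: flagging values that sit far from the mean of a data set. The sales figures are invented for illustration; real data mining uses far more sophisticated techniques:

```python
from statistics import mean, stdev

# Invented daily sales figures; the final value is an anomaly.
sales = [120, 115, 130, 125, 118, 122, 410]

avg, spread = mean(sales), stdev(sales)

# Flag anything more than two standard deviations from the mean.
anomalies = [x for x in sales if abs(x - avg) > 2 * spread]
print(anomalies)  # [410]
```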
Data matching: The process of comparing two different sets of data with the
aim of finding data about the same entity.
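A minimal sketch of the idea in Python, matching records from two hypothetical systems on a shared key (here an email address, purely as an assumption):

```python
# Two data sets about customers, held by different (hypothetical) systems.
crm_customers = [
    {"email": "amy@example.com", "name": "Amy Lee"},
    {"email": "ben@example.com", "name": "Ben Cho"},
]
billing_accounts = [
    {"email": "ben@example.com", "plan": "premium"},
    {"email": "cara@example.com", "plan": "basic"},
]

# Index one data set by the shared key, then look up the other.
billing_by_email = {rec["email"]: rec for rec in billing_accounts}
matches = [
    (cust, billing_by_email[cust["email"]])
    for cust in crm_customers
    if cust["email"] in billing_by_email
]
print(matches)  # records describing the same entity (Ben)
```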
Primary data: Original data collected for the first time for a specific purpose.
Secondary data: Data that has already been collected by someone else for a
different purpose.
Validation: In databases, this means that only valid (suitable) data can be
entered.
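A minimal sketch of a validation rule in Python, combining a type check and a range check (the field and its limits are invented for illustration):

```python
def validate_age(value):
    """Accept only whole-number ages between 0 and 120."""
    try:
        age = int(value)          # type check: must be a whole number
    except ValueError:
        return False
    return 0 <= age <= 120        # range check: must be a sensible age

print(validate_age("34"))   # True  -- suitable data
print(validate_age("abc"))  # False -- rejected before entry
print(validate_age("999"))  # False -- numeric but outside the range
```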
Verification: In databases, these are checks that the data entered is the actual
data that you want, or that the data entered matches the original source of data.
Two common methods of data verification include (see the sketch below):
● double entry (for example, being asked to enter a password twice when
registering a username for a new website)
● having a second person check the data visually.
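A minimal sketch of the double-entry method in Python, using the standard library's getpass to hide what is typed:

```python
from getpass import getpass

def double_entry_password():
    """Ask for a password twice; accept it only if both entries match."""
    first = getpass("Enter a password: ")
    second = getpass("Re-enter the password: ")
    if first != second:
        raise ValueError("Entries do not match -- data not verified")
    return first
```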
Data visualization: the process by which large sets of data are converted
into charts, graphs or other visual presentations.
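For example, a few lines of Python with the matplotlib library turn a small (invented) data set into a bar chart:

```python
import matplotlib.pyplot as plt

# Invented data: support tickets logged per weekday.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
tickets = [34, 41, 29, 52, 38]

plt.bar(days, tickets)
plt.title("Support tickets per day")
plt.xlabel("Day")
plt.ylabel("Tickets")
plt.show()
```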
Symmetric key encryption: the key used to encode and decode the data is the
same. Both computers need to know the key to be able to communicate or
share data. This type of encryption is commonly used in wireless security,
the security of archived data and the security of databases.
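A minimal sketch in Python using the third-party cryptography package's Fernet recipe (one symmetric scheme among many; the message is invented):

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # the single shared key
cipher = Fernet(key)

# Both sides use the SAME key to encode and decode.
token = cipher.encrypt(b"account balance: 1,200")
print(cipher.decrypt(token))  # b'account balance: 1,200'
```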
Public key (asymmetric) encryption: uses two different keys to encode and
decode the data. The private key is kept secret by the destination computer,
while its matching public key is shared with any computer that wishes to
communicate with it. When sending data, the public key of the destination
computer is used to encode it; during transmission this data cannot be
understood without the private key. Once received by the destination
computer, the private key is used to decode the data.
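The same flow can be sketched with RSA from the cryptography package (an illustrative choice of algorithm and padding, not the only one; the message is invented):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Key pair belonging to the destination computer.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# The sender encodes with the destination's PUBLIC key...
ciphertext = public_key.encrypt(b"meet at 10:00", oaep)

# ...and only the matching PRIVATE key can decode it.
print(private_key.decrypt(ciphertext, oaep))  # b'meet at 10:00'
```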
Data deletion: Sending a file to the recycle bin, which removes the file's
icon and the pathway to its location.
Big data: Term used to describe large volumes of data, which may be
structured or unstructured.
Volume — big data consists of very large volumes of data that are created every
day from a wide range of sources, whether from human interaction with social
media or the collection of data on an internet of things (IoT) network.
Velocity — the speed at which data is generated, collected and analysed.
Variety — data consists of a wide variety of data types and formats, such as
social media posts, videos, photos and PDF files.
Veracity — refers to the accuracy and quality of the data being collected.
Data privacy: The ability of individuals to control their personal information.
Unreliable data
Biased data: This could be due to the use of biased data sets or to human bias
when selecting the data.
Viruses and malware: Stored data can be vulnerable to these external threats.
Data can be changed, and therefore lose its integrity, or be corrupted and
ultimately lost.
Reliability and validity of sources: Data can be generated from a number of
online sources; if these sources have not been evaluated, this can lead to
unreliable data being used by the IT systems.
Outdated data: Many IT systems collect and store data that is changing; if data
is not updated it becomes unreliable. Consider the telephone numbers of
parents held by a school: if a parent does not inform the school of a change
of number, this data cannot be relied on to contact them.
Human error and lack of precision: Any form of manual data entry is prone
to human error. Automating data entry is crucial for reducing these types of
errors. It is also easy for users to accidentally delete files, move them or even
forget the name of the file and where it was saved. Effective file management
procedures are essential to reduce these types of errors.
Uses of blockchain
● Microsoft's Authenticator app for digital identity.
● The healthcare industry is using blockchain technology for patient data.
● Blockchain technology can provide a single unchangeable vote per person
in digital voting (see the sketch after this list).
● The US Government is using blockchain to track weapon and gun
ownership.
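The property that makes records such as votes 'unchangeable' is hash chaining: each block's hash covers the previous block's hash. A minimal Python sketch with invented vote records:

```python
import hashlib

def block_hash(previous_hash, record):
    """Hash a record together with the previous hash, chaining the blocks."""
    return hashlib.sha256((previous_hash + record).encode()).hexdigest()

chain, prev = [], "0" * 64  # genesis value
for vote in ["voter1:A", "voter2:B", "voter3:A"]:  # invented records
    prev = block_hash(prev, vote)
    chain.append((vote, prev))

# Altering any earlier record changes every hash after it,
# which makes tampering easy to detect.
for vote, digest in chain:
    print(vote, digest[:16])
```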
Big data in banking and finance (source: Algorithm-X Lab)
Big data is allowing banks to see customer behavior patterns and market trends.
American Express is using big data to get to know its customers, applying
predictive models to analyze customer transactions. Big data is also being used
to monitor the efficiency of internal processes, optimizing performance and
reducing costs. JP
Morgan has used historical data from billions of transactions to automate trading.
A third use of big data has been to improve cybersecurity and detect fraudulent
transactions. Citibank has developed a real-time machine learning and predictive
modeling system that uses data analysis to detect potentially fraudulent
transactions.