3. Classification of Digital Data
• Evolution of Technology
• Types of Data
• Big Data: Definitional Aspects
• Big Data vs. Not Big Data
• Challenges of big data
Evolution of Technology
Reference: https://www.youtube.com/watch?v=zez2Tv-bcXY
Internet of Things
Reference: https://www.edureka.co/blog/big-data-tutorial
Social Media Usage
Classification of Digital Data
Digital Data
Structured data
• When do we say that data is structured?
• When the data conforms to a predefined schema/structure.
• Sources of structured data
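To make "conforms to a predefined schema" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `student` table, its columns, and the sample rows are hypothetical and are not taken from the slides. The point is only that the schema is declared first and the database rejects rows that do not fit it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: every row must have these columns and obey these constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " cgpa REAL CHECK (cgpa BETWEEN 0 AND 10))"
)

def try_insert(row):
    """Return True if the row conforms to the schema, False if it is rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert((1, "Asha", 8.7))               # fits the schema
rejected_null = try_insert((2, None, 7.5))      # violates NOT NULL on name
rejected_range = try_insert((3, "Ravi", 11.2))  # violates the CHECK on cgpa

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(ok, rejected_null, rejected_range, row_count)
```

Only the conforming row is stored; the other two are refused before they reach the table.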
Working with structured data
• Insert/update/delete
• Indexing
• Transaction processing
• Security
• Scalability
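The first three operations in the list above can be sketched with sqlite3 as follows. The `account` table, the index name, and the `transfer` helper are illustrative assumptions, not part of the original slides. Security and scalability depend on the deployment and are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY, owner TEXT, balance REAL CHECK (balance >= 0))"
)

# Insert / update / delete
with conn:
    conn.executemany(
        "INSERT INTO account VALUES (?, ?, ?)",
        [(1, "Asha", 500.0), (2, "Ravi", 100.0), (3, "Mina", 0.0)],
    )
    conn.execute("UPDATE account SET balance = balance + 50 WHERE id = 2")
    conn.execute("DELETE FROM account WHERE id = 3")

# Indexing: speeds up lookups by owner
conn.execute("CREATE INDEX idx_account_owner ON account(owner)")

# Transaction processing: a transfer is all-or-nothing
def transfer(src, dst, amount):
    """Move amount from src to dst; return False and undo everything on failure."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

transfer_ok = transfer(1, 2, 200.0)         # succeeds
transfer_failed = transfer(2, 1, 10_000.0)  # would overdraw account 2, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
print(transfer_ok, transfer_failed, balances)
```

In the failed transfer the credit to account 1 is executed first and then undone by the rollback, which is the behaviour transaction processing is meant to guarantee.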
Semi-structured data
• It does not conform to the data models that one typically associates
with relational databases or any other form of data tables
• It uses tags to segregate semantic elements
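As an illustration of tags separating semantic elements, the sketch below parses the same two hypothetical records from XML and from JSON using only the Python standard library. The field names and values are invented for the example. Note that the two records carry different fields, which a fixed relational table would not allow without empty columns.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical records: the tags name each element, but records need not share fields.
xml_doc = """
<students>
  <student><name>Asha</name><email>asha@example.com</email></student>
  <student><name>Ravi</name><phone>555-0100</phone><hobby>chess</hobby></student>
</students>
""".strip()

json_doc = (
    '[{"name": "Asha", "email": "asha@example.com"},'
    ' {"name": "Ravi", "phone": "555-0100", "hobby": "chess"}]'
)

# Turn each <student> element into a dict keyed by tag name.
xml_records = [
    {child.tag: child.text for child in student}
    for student in ET.fromstring(xml_doc).findall("student")
]
json_records = json.loads(json_doc)

# The set of fields differs from record to record.
fields_per_record = [sorted(record) for record in xml_records]
print(fields_per_record)
```

Both formats yield the same records, and the structure is discovered from the tags or keys at read time rather than declared up front.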
Sources of semi-structured data
Unstructured data
• Does not conform to any predefined data model
• The structure can be unpredictable.
Sources of unstructured data
How to deal with unstructured data?
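One simple way to impose some structure on free text is pattern-based extraction. This is offered as an assumption for illustration and is not necessarily the approach the original slide showed. The feedback text and both patterns below are hypothetical, and the patterns are deliberately simplistic.

```python
import re

# Hypothetical free-text feedback: no fields, no schema.
feedback = (
    "Visited the store on 2024-03-15, billing took 20 minutes. "
    "Contact me at ravi@example.com. Great discounts though!"
)

# Simplistic patterns, good enough for this sample only.
email_pattern = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"
date_pattern = r"\d{4}-\d{2}-\d{2}"

# Pull out the pieces that can be stored as structured fields.
extracted = {
    "emails": re.findall(email_pattern, feedback),
    "dates": re.findall(date_pattern, feedback),
}
print(extracted)
```

The extracted fields can then be stored and queried like structured data, while the remaining text still needs techniques such as text mining to interpret.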
In-class exercise
Solution
Let’s Discuss
• Why is email in the unstructured category?
• Where should we put CCTV footage?
You are at a city shopping mall. A few people are browsing
the items. Some of them are looking for discounts. Some are
filling in feedback forms. A few are at the billing counter. You
may also consider other things and events happening in this
scenario. Think for a while about the different types of data
generated. List each type and justify your classification.
You are at a university library. A few students are browsing the
library catalog on a kiosk. Librarians and other staff are issuing and
accepting returns of books, magazines, and journals. A few students are
also using the e-library service. Which types of data are generated in this
scenario? Support your answer with reference to big data.
Big Data: Definitional Aspects
Characteristics of Big data
Gartner’s 3Vs, proposed by Douglas Laney in 2001
Volume, Velocity, and Variety
Yuri Demchenko’s 5Vs
Volume, Velocity, Variety, Veracity, and Value
Microsoft’s 6Vs
Volume, Velocity, Variety, Veracity, Value, and Visibility
Volume
Velocity
Taken from: Hewlett-Packard Development Company, “Truths and Myths about Big Data”, 2013
Veracity
Value
What is big data about?
• It’s about the analytics (the insights gleaned from the data) and the
capacities, both human and technological, needed to produce them
• One step further: it’s about knowledge, such as getting near to the ‘true’
meaning of a Facebook status update
• It’s about sharing and diffusion, for example through visualizations
Big data Definition
Challenges with Big data
The problem is storing the colossal amount of data.
A big data analytics cycle can be described by the following stages:
1. Business Problem Definition
2. Data Identification
8. Analysis of Results
Classification of Data Analytics
Big Data Analytics: Case Studies
• Healthcare
Traditional vs. Big Data Approach
OLTP: Online Transaction Processing
• DBMSs
OLAP: Online Analytical Processing
• Data Warehousing
RTAP: Real-Time Analytics Processing
• Big Data Architecture & Technology
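The difference between OLTP and OLAP workloads can be sketched on a single hypothetical `sale` table with sqlite3. The table, regions, and amounts are assumptions made for illustration. A real data warehouse would normally live in a separate system, and RTAP is not shown because it needs a streaming platform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of individual sales.
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP style: many small transactions, each writing a single row.
for region, amount in [("north", 120.0), ("south", 80.0), ("north", 30.0)]:
    with conn:  # one short transaction per sale
        conn.execute(
            "INSERT INTO sale (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLAP style: one read-only query that scans and aggregates many rows.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sale GROUP BY region")
)
print(totals)
```

The writes are short and touch one row each, whereas the analytical query reads the whole table to answer a summary question.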