
Slide 1 Big Data Introduction

The document provides an overview of big data, including: 1) It defines big data and discusses the challenges of capturing, curating, storing, searching, sharing, transferring, analyzing and visualizing large and complex datasets. 2) It explains the 3Vs (volume, variety, velocity) and 5Vs (including value and veracity) frameworks for characterizing big data. 3) It provides examples of big data applications in various domains like weather forecasting, healthcare, logistics, travel and tourism, and government/law enforcement.

Big Data

(Understanding Big Data)


Trong-Hop Do
September 8th, 2021

S3Lab
Smart Software System Laboratory

“Without big data, you are blind and deaf
and in the middle of a freeway.”
– Geoffrey Moore

Evolution of Technology

IoT

Social media

Other factors

What is Big Data

● Big data is the term for a collection of data sets so large and complex
that it becomes difficult to process using on-hand database
management tools or traditional data processing applications.
● Challenges: Capture, Curation, Storage, Search, Sharing, Transfer,
Analysis, and Visualization.

Big Data: 3V’s

Volume (scale)

Earthscope - 67 terabytes of data
CERN’s Large Hadron Collider (LHC) generates 15 PB a year

Big Data: 3V’s
Variety (Complexity)

● Big data comes in three types:
○ Structured: data that can be stored and processed in a fixed format (fixed schema).
Ex. RDBMS tables.
○ Semi-structured: data that does not follow a formal data model but still has
organizational properties, such as tags and other markers that separate semantic
elements, which make it easier to analyze. Ex. XML files or JSON documents.
○ Unstructured: text files and multimedia content such as images, audio, and video.
Unstructured data is growing faster than the other types; experts say
that 80 percent of the data in an organization is unstructured.
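
The three types above can be illustrated with a minimal Python sketch. The sample records (names, ages, tags) are invented for illustration and do not come from the slides; the point is only how much structure each format hands to the program.

```python
import csv
import io
import json
import re

# Structured: fixed schema, every row has the same columns (toy CSV table).
structured_csv = "id,name,age\n1,An,30\n2,Binh,25\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
ages = [int(row["age"]) for row in rows]

# Semi-structured: no fixed schema, but keys mark the semantic elements.
# The second record simply omits the optional "tags" field.
semi_structured = '[{"id": 1, "name": "An", "tags": ["vip"]}, {"id": 2, "name": "Binh"}]'
records = json.loads(semi_structured)
tags_per_record = [record.get("tags", []) for record in records]

# Unstructured: free text; any structure must be inferred (here, a crude regex).
unstructured = "An, who is 30, called support. Binh (25) sent an email."
numbers_in_text = [int(n) for n in re.findall(r"\d+", unstructured)]

print(ages, tags_per_record, numbers_in_text)
```

The structured case can be queried by column name, the semi-structured case needs a default for missing fields, and the unstructured case needs a heuristic that may well be wrong on other text.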

● Semi-Structured, NoSQL

● Relational data (tables, transactions, legacy data)
● Text data (web, logs)
● Semi-structured data (XML)
● Graph data: social networks, Semantic Web (RDF), ...
● Streaming data: you can only scan the data once
● A single application can generate or collect many types of data
● Big public data (online, weather, finance, etc.)
To extract knowledge ➠ all these types of data need to be linked together
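
The "scan the data once" constraint on streaming data can be sketched as follows. This is an illustrative example, not a method from the slides: it uses Welford's online algorithm to keep a running mean and variance, and the `sensor_readings` generator and its values are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def running_stats(stream: Iterable[float]) -> Iterator[Tuple[int, float, float]]:
    """Yield (count, mean, population variance) after each item, in one pass."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in stream:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
        yield count, mean, m2 / count


def sensor_readings() -> Iterator[float]:
    # A generator can be consumed only once, like a real data stream.
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield value


final = None
for final in running_stats(sensor_readings()):
    pass  # each intermediate result is available as soon as the item arrives

print(final)
```

Only three numbers are kept in memory regardless of how long the stream is, which is what makes a single scan sufficient.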

Velocity (Speed)

● Data is being generated fast and needs to be processed fast
● Online data analytics
● Late decisions ➠ missed opportunities
● Examples
○ E-promotions: based on your current location, your purchase history, and what you like ➠ send
promotions right now for the store next to you
○ Healthcare monitoring: sensors monitor your activities and body ➠ any abnormal
measurement requires an immediate reaction
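
The healthcare monitoring example can be sketched as a check that runs on every reading as it arrives, instead of in a later batch job. This is a toy sketch: the `Reading` type, the patient IDs, and the `LOW_BPM`/`HIGH_BPM` bounds are hypothetical and are not clinical thresholds.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Reading:
    patient_id: str
    heart_rate_bpm: int


# Hypothetical bounds, for illustration only.
LOW_BPM = 40
HIGH_BPM = 150


def alerts(stream: Iterable[Reading]) -> Iterator[str]:
    """Emit an alert as soon as an out-of-range reading is seen."""
    for reading in stream:
        if not LOW_BPM <= reading.heart_rate_bpm <= HIGH_BPM:
            yield f"ALERT {reading.patient_id}: {reading.heart_rate_bpm} bpm"


incoming = [
    Reading("p1", 72),
    Reading("p2", 171),
    Reading("p1", 38),
    Reading("p2", 90),
]
raised = list(alerts(incoming))
print(raised)
```

Because `alerts` is a generator, each alert is produced the moment its reading is consumed, which mirrors the "immediate reaction" requirement.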

● Progress and innovation are no longer hindered by the ability to collect
data, but by the ability to manage, analyze, summarize, visualize, and
discover knowledge from the collected data in a timely manner and in a
scalable fashion

Big Data: 5V’s
Value

Veracity

Big Data: 4V’s

Big Data: 5V’s

Big Data: NV’s

● The image above depicts the five V’s of big data, but as data keeps
evolving, so will the V’s. I am listing five more V’s which have
developed gradually over time:
○ Validity: correctness of data
○ Variability: dynamic behaviour
○ Volatility: tendency to change over time
○ Vulnerability: exposure to breaches or attacks
○ Visualization: presenting data in a meaningful way

Big Data: Applications

Weather forecast

Media and entertainment

Health care

Logistics

Travel and tourism

Government and law enforcement

Big Data: Scale

Big Data: Evolution
● The model of generating and consuming data has changed
○ Old model: a few companies generate data; all others consume data
○ New model: all of us generate data, and all of us consume data

● OLTP: Online Transaction Processing (DBMSs)
● OLAP: Online Analytical Processing (Data Warehousing)
● RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
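
The difference between the three processing styles can be sketched on a toy scale with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `orders` table, the customer names, and the amounts are invented, and a single in-memory database stands in for what would be separate systems in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

events = [("an", 20.0), ("binh", 35.0), ("an", 5.0)]
running_totals = {}

for customer, amount in events:
    # OLTP style: one small write per business event, each in its own transaction.
    with conn:
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            (customer, amount),
        )
    # RTAP style: the analytic result is updated as each event arrives.
    running_totals[customer] = running_totals.get(customer, 0.0) + amount

# OLAP style: a read-only aggregate over the accumulated history.
olap_totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(olap_totals, running_totals)
```

Both analytic paths reach the same totals here; the difference is when the answer becomes available: after a query over stored history (OLAP) or continuously while data flows in (RTAP).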

● Big data is more real-time in nature than traditional DW applications
● Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
● Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
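
The shared-nothing idea can be sketched in a few lines: data is hash-partitioned so that each worker owns its keys exclusively, computes a local result, and the partial results are merged without any coordination. This is a single-process simulation under stated assumptions; the helper names (`partition_of`, `scatter`, `local_count`, `gather`) are hypothetical, and in a real cluster each partition would live on a separate node.

```python
import zlib
from collections import Counter
from typing import Dict, Iterable, List


def partition_of(key: str, num_partitions: int) -> int:
    # Stable hash, so a given key always lands on the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def scatter(words: Iterable[str], num_partitions: int) -> List[List[str]]:
    partitions: List[List[str]] = [[] for _ in range(num_partitions)]
    for word in words:
        partitions[partition_of(word, num_partitions)].append(word)
    return partitions


def local_count(partition: List[str]) -> Counter:
    # Each worker sees only its own partition: no shared memory or disk.
    return Counter(partition)


def gather(partials: Iterable[Counter]) -> Dict[str, int]:
    merged: Dict[str, int] = {}
    for partial in partials:
        merged.update(partial)  # key sets are disjoint, so nothing is overwritten
    return merged


words = "big data needs big clusters and big ideas".split()
partitions = scatter(words, 3)
result = gather(local_count(p) for p in partitions)
print(result)
```

Scaling out then means raising the number of partitions and adding workers; no worker ever needs another worker's data to finish its share.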

Big Data: Landscape

Big Data: Landscape (Open Source)

In this course:

Projects

Install Cloudera Quickstart VM

• Download Cloudera Quickstart VM for VirtualBox:
https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.13.0-0-virtualbox.zip
• Open Oracle VM VirtualBox. Click File → Import Appliance
• Choose “cloudera-quickstart-vm-5.13.0-0-virtualbox.ovf”

VMware

● Download Cloudera Quickstart VM for VMware:
https://downloads.cloudera.com/demo_vm/vmware/cloudera-quickstart-vm-5.13.0-0-vmware.zip

Q&A

Thank you for following along.

We hope to reach success together.
