Data Analytics

Syllabus Topics

Data Analytics Architecture and Life Cycle, Types of analysis, Analytical approaches, Data Analytics with Mathematical manipulations, Data Ingestion from different sources (CSV, JSON, HTML, Excel, MongoDB, MySQL, SQLite), Data cleaning, Handling missing values, Data imputation, Data transformation, Data Standardization, Handling categorical data with 2 and more categories, Statistical and graphical analysis methods, Hive Data Analytics.
4.1   Big Data Analytics Life Cycle is Divided into Nine Phases, Named as : .......... 4-2
      UQ. Explain different steps in Data Analytics project life cycle. (SPPU - Q. 5(a), Dec. 19, 8 Marks) .......... 4-2
4.2   Types of Big Data Analytics .......... 4-4
      UQ. Explain different types of analysis in detail with example. (SPPU - Q. 6(a), Dec. 18, 9 Marks) .......... 4-4
4.3   Analytical Approaches .......... 4-5
4.6   Data Cleaning .......... 4-9
4.7   Handling Missing Values .......... 4-9
      UQ. How missing values and categorical variables are preprocessed before building model? Explain with example. (SPPU - Q. 5(b), Dec. 18, 9 Marks) .......... 4-9
4.8   Data Imputation .......... 4-12
4.9   Data Transformation .......... 4-14
      UQ. Explain the different modes of data transformation in big data. (SPPU - Q. 5(b), May 18, 8 Marks) .......... 4-14
4.9.1 Benefits of Data Transformation .......... 4-14
4.10  Data Standardization .......... 4-14
4.11  Handling Categorical Data with 2 and More Categories .......... 4-15
4.12  Statistical and Graphical Analysis Methods .......... 4-16
      UQ. Explain pie chart and scatter plot. (SPPU - Q. 7(b), Dec. 19, 8 Marks) .......... 4-16
4.13  Hive Data Analytics .......... 4-17
      UQ. Explain Architecture of HIVE. (SPPU - Q. 6(a), Dec. 19, 8 Marks) .......... 4-17
Chapter Ends .......... 4-18
Data Science and Big Data Analytics (SPPU-Sem 6-IT)        (Big Data Analytics)...Page No. (4-2)

4.1 BIG DATA ANALYTICS LIFE CYCLE IS DIVIDED INTO NINE PHASES, NAMED AS :

UQ. Explain different steps in Data Analytics project life cycle. (SPPU - Q. 5(a), Dec. 19, 8 Marks)

Phases of Big Data Analytics :
1. Business Case/Problem Definition
2. Data Identification
3. Data Acquisition and Filtration
4. Data Extraction
5. Data Munging (Validation and Cleaning)
6. Data Aggregation & Representation (Storage)
7. Exploratory Data Analysis
8. Data Visualization (Preparation for Modeling and Assessment)
9. Utilization of analysis results

Let us discuss each phase :

Phase I : Business Problem Definition

In this stage, the team learns about the business domain, which presents the motivation and goals for carrying out the analysis. The problem is identified, and assumptions are made about how much potential gain the company will make after carrying out the analysis.

Important activities in this step include framing the business problem as an analytics challenge that can be addressed in subsequent phases. It helps the decision-makers understand the business resources that will be required, thereby determining the underlying budget needed to carry out the project.

Moreover, it can be determined whether the problem identified is a Big Data problem or not, based on the business requirements in the business case. To qualify as a big data problem, the business case should be directly related to one (or more) of the Big Data characteristics of volume, velocity, or variety.

Phase II : Data Identification

Once the business case is identified, it is time to find the appropriate datasets to work with. In this stage, analysis is done to see what other companies have done for a similar case.

Depending on the business case and the scope of analysis of the project being addressed, the sources of datasets can be either external or internal to the company. In the case of internal datasets, the datasets can include data collected from internal sources, such as feedback forms or existing software. For external datasets, the list includes datasets from third-party providers.

Phase III : Data Acquisition and Filtration

Once the source of data is identified, it is time to gather the data from such sources. This kind of data is mostly unstructured. It is then subjected to filtration, such as removal of corrupt or irrelevant data that is of no scope to the analysis objective. Here, corrupt data means data that may have missing records or records with incompatible data types.

After filtration, a copy of the filtered data is stored and compressed, as it can be of use in the future for some other analysis.

Phase IV : Data Extraction

Now the data is filtered, but there might be a possibility that some of the entries of the data are incompatible. To rectify this issue, a separate phase is created, known as the data extraction phase. In this phase, the data which doesn't match the underlying scope of the analysis is extracted and transformed into a suitable form.

Phase V : Data Munging

As mentioned in Phase III, the data is collected from various sources, which results in the data being unstructured. There might be a possibility that the data has constraints that are unsuitable and can lead to false results. Hence there is a need to clean and validate the data. This includes removing any invalid data and establishing complex validation rules. There are many ways to validate and clean the data.
(SPPU - New Syllabus w.e.f academic year 21-22)(P6-56)        Tech-Neo Publications...A SACHIN SHAH Venture
For example, a dataset might contain a few rows with null entries. If a similar dataset is present, then those entries are copied from that dataset; else those rows are dropped.

Phase VI : Data Aggregation & Representation (Storage)

The data is cleansed and validated against certain rules set by the enterprise. But the data might be spread across multiple datasets, and it is not advisable to work with multiple datasets. Hence, the datasets are joined together. For example, if there are two datasets, namely that of a ...

... The assumption is called the hypothesis.

Phase VIII : Data Visualization

A sort of representation is required to obtain value or some conclusion from the analysis. Hence, various tools are used to visualize the data in graphic form, which can easily be interpreted by business users. Visualization is said to influence the interpretation of the results. Moreover, it allows the users to discover answers to questions that are yet to be formulated.

Phase IX : Utilization of analysis results

The analysis is done, the results are visualized, and now it's time for the business users to make decisions to utilize the results.
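The Phase V example above (rows with null entries are either patched from a similar dataset or dropped) can be sketched in pandas. The DataFrames, column names, and values below are hypothetical, not from the text:

```python
import numpy as np
import pandas as pd

# A small raw dataset with one missing record (hypothetical).
raw = pd.DataFrame({
    "swimmer": ["A", "B", "C"],
    "time_s": [61.2, np.nan, 59.8],
})

# Option 1: no similar dataset exists, so rows with nulls are dropped.
clean = raw.dropna()
print(len(clean))  # 2

# Option 2: a similar dataset is present, so missing entries are copied
# from it (values align on the shared "swimmer" index).
reference = pd.DataFrame({
    "swimmer": ["A", "B", "C"],
    "time_s": [61.0, 60.5, 59.9],
})
patched = raw.set_index("swimmer")["time_s"].fillna(
    reference.set_index("swimmer")["time_s"])
print(patched["B"])  # 60.5
```

Note that `fillna` only touches the missing entry for swimmer B; the observed times for A and C are kept from the raw dataset.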
Types of Big Data Analytics :

1. Descriptive analytics : It simplifies the data and summarizes past data into a readable form.
2. Diagnostic analytics : It gives a detailed and in-depth insight into the root cause of a problem.
3. Predictive analytics : This type of analytics makes use of historical and present data to predict future events.
4. Prescriptive analytics : Prescriptive analytics allows business to determine the best possible solution to a problem.
2. Diagnostic Analytics

Diagnostic Analytics, as the name suggests, gives a diagnosis to a problem. It gives a detailed and in-depth insight into the root cause of a problem. Data scientists turn to this analytics craving for the reason behind a particular happening. Techniques like ...

3. Predictive Analytics

Predictive Analytics, as can be discerned from the name itself, is concerned with predicting future incidents. These future incidents can be market trends, consumer trends, and many such market-related events.
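A minimal predictive-analytics sketch: fit a linear trend to past observations and extrapolate one period ahead. The monthly sales figures are hypothetical, and a real forecast would of course be probabilistic rather than exact:

```python
import numpy as np

# Past monthly sales (hypothetical). Predictive analytics uses historical
# and present data to forecast what *might* happen next.
months = np.array([1, 2, 3, 4, 5])
sales = np.array([100.0, 110.0, 120.0, 130.0, 140.0])

# Fit a degree-1 (linear) trend and extrapolate to month 6.
slope, intercept = np.polyfit(months, sales, 1)
forecast = slope * 6 + intercept
print(round(forecast, 1))  # 150.0
```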
This type of analytics makes use of historical and present data to predict future events. This is the most commonly used form of analytics among businesses.

Predictive analytics doesn't only work for the service providers but also for the consumers. It keeps track of our past activities and, based on them, predicts what we may do next.

"The purpose of predictive analytics is NOT to tell you what will happen in the future. It cannot do that. In fact, no analytics can do that. Predictive analytics can only forecast what might happen in the future, because all predictive analytics are probabilistic in nature."
Michael Wu, Chief AI Strategist, PROS

Predictive analytics uses models like data mining, AI, and machine learning to analyze current data and forecast what might happen in specific scenarios. Examples of predictive analytics include churn risk and renewal risk analysis.

4. Prescriptive Analytics

Prescriptive analytics is the most valuable yet underused form of analytics. It is the next step in predictive analytics. The prescriptive analysis explores several possible actions and suggests actions depending on the results of descriptive and predictive analytics of a given dataset.

Prescriptive analytics is a combination of data and various business rules. The data of prescriptive analytics can be both internal (organizational inputs) and external (social media insights).

Prescriptive analytics allows businesses to determine the best possible solution to a problem. When combined with predictive analytics, it adds the benefit of manipulating a future occurrence, like mitigating future risk.

• Examples of prescriptive analytics for customer retention are next best action and next best offer analysis.

• A use case of prescriptive analytics can be the Aurora Health Care system. It saved $6 million by reducing readmission rates by 10%. Prescriptive analytics has good use in the healthcare industry. It can be used to enhance the clinical process of drug development, finding the right patients for trials, etc.

4.3 ANALYTICAL APPROACHES

1. Data fusion and data integration

By combining a set of techniques that analyse and integrate data from multiple sources and solutions, insights are more efficient and potentially more accurate than if developed through a single source of data.

2. Data mining

A common tool used within big data analytics, data mining extracts patterns from large data sets by combining methods from statistics and machine learning, within database management. An example would be when customer data is mined to determine which segments are most likely to react to an offer.

3. Machine learning

Well known within the field of artificial intelligence, machine learning is also used for data analysis. Emerging from computer science, it works with computer algorithms to produce assumptions based on data. It provides predictions that would be impossible for human analysts.

4. Natural language processing (NLP)

Known as a subspecialty of computer science, artificial intelligence, and linguistics, this data analysis tool uses algorithms to analyse human (natural) language.

5. Statistics

This technique works to collect, organise, and interpret data, within surveys and experiments.

Other data analysis techniques include spatial analysis, predictive modelling, association rule learning, network analysis, and many, many more. The technologies that process, manage, and analyse data form an entirely different and expansive field that similarly evolves and develops over time. Techniques and technologies aside, any form or size of data can be valuable. Managed accurately and effectively, it can reveal a host of business, product, and market insights.
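Approach 1 (data fusion and integration) can be sketched in pandas by joining two sources that describe the same customers; the fused view supports insights neither source gives alone. The tables, column names, and values below are hypothetical:

```python
import pandas as pd

# Two hypothetical sources about the same customers.
crm = pd.DataFrame({"customer_id": [1, 2, 3],
                    "segment": ["gold", "silver", "gold"]})
web = pd.DataFrame({"customer_id": [2, 3, 4],
                    "visits": [10, 3, 7]})

# Integrate on the shared key; only customers present in both survive
# an inner join.
fused = crm.merge(web, on="customer_id", how="inner")
print(fused.to_dict("records"))
# [{'customer_id': 2, 'segment': 'silver', 'visits': 10},
#  {'customer_id': 3, 'segment': 'gold', 'visits': 3}]
```

With the fused table, a question like "which segments visit most" becomes a one-line `groupby`, which is exactly the kind of mined insight the data mining example describes.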
4.4 DATA ANALYTICS WITH MATHEMATICAL MANIPULATIONS

What is Data Manipulation ?

Data Manipulation Meaning : Manipulation of data is the process of manipulating or changing information to make it more organized and readable. We use DML (Data Manipulation Language) to accomplish this. Manipulation of data allows you to do more with the data and make it digestible.

Consistent data : Data can be structured, read, and understood by providing it in a consistent format. You may not have a unified view when taking data from various sources, but with data manipulation commands, you can make sure that the data is structured and stored consistently.

Project data : It is paramount for organizations to be able to use historical data to project the future and provide more in-depth analysis, especially when it comes to finances.

Create more value from the data : Data becomes pointless by remaining static. But you will have straightforward insights to make better business decisions when you know how to use behaviour in real time and quickly push information to the fans. After all, the whole business depends on it.

Data Manipulation Examples

Here are some of the use-cases where data ingestion is required.

(i) Moving Massive Amounts of Big Data into Hadoop

This is the primary & the most obvious use case. As discussed above, Big Data from all the IoT devices, social apps & everywhere is streamed through data pipelines and moves into the most popular distributed data processing framework, Hadoop, for analysis & stuff.

(ii) Moving Data from Databases to Elastic Search Server

In the past, with a few of my friends, I wrote a product search software-as-a-service solution from scratch with Java, Spring Boot and Elastic Search. Speaking of its design, the massive amount of product data from the legacy storage solutions of the organization was streamed, indexed & stored to the Elastic Search server. The streaming process is more technically called the Rivering of data. As in, drawing an analogy from how water flows through a river, here the data moved through a data pipeline from legacy systems & got ingested into the Elastic Search server, enabled by a plugin specifically written to execute the task.

(iii) Log Processing, Running Log Analytics Systems

If your project isn't a hobby project, chances are it's running on a cluster. Monolithic systems are a thing of the past. With so many microservices running concurrently, there is a massive number of logs generated over a period of time. And logs are the ...

Let's talk about some of the challenges the development teams have to face while ingesting data.

Challenges Companies Face When Ingesting Data

(i) Slow Process

Guys, data ingestion is a slow process. How? I'll explain. When data is streamed from several different sources into the system, data coming from each & every different source has a different format, different syntax, and attached metadata. The data as a whole is heterogeneous. It has to be transformed into a common format like JSON or something to be understood by the analytics system.

The conversion of data is a tedious process. It takes a lot of computing resources & time. Flowing data has to be staged at several stages in the pipeline, processed & then moved ahead. Also, at each & every stage data has to be authenticated & verified to meet the organization's security standards. With the traditional data cleansing processes, it takes weeks if not months to get useful information in hand. Traditional data ingestion systems like ETL aren't that effective anymore.

(ii) Complex & Expensive

As already stated, the entire data flow process is resource-intensive. A lot of heavy lifting has to be done to prepare the data before being ingested into the system. Also, it isn't a side process; an entire dedicated team is required to pull off something like that.
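The heterogeneity problem described above (each source with its own format and syntax, normalized into one common shape) can be sketched in miniature with pandas, covering three of the syllabus's ingestion sources: CSV, JSON, and SQLite. All records, table names, and values are hypothetical:

```python
import io
import sqlite3
import pandas as pd

# The same kind of record arriving from three different source formats.
df_csv = pd.read_csv(io.StringIO("id,amount\n1,10.5\n2,20.0\n"))
df_json = pd.read_json(io.StringIO('[{"id": 3, "amount": 7.25}]'))

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
con.execute("INSERT INTO orders VALUES (4, 1.0)")
df_sql = pd.read_sql_query("SELECT id, amount FROM orders", con)

# Heterogeneous sources become one homogeneous table, i.e. the "common
# format" the analytics system can understand.
combined = pd.concat([df_csv, df_json, df_sql], ignore_index=True)
print(combined.shape)  # (4, 2)
```

Excel, MongoDB, and MySQL follow the same pattern via `pd.read_excel` and `pd.read_sql` with the appropriate driver; only the reader changes, not the downstream analysis.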
There are always scenarios where the tools & frameworks available in the market fail to serve your custom needs, and you are left with no option ...

This data is usually not necessary or helpful when it comes to analyzing data, because it may hinder the ...
Why do missing values occur in data ?

Missing values can occur in data for a number of reasons, such as survey non-responses or errors in data entry.

Missing completely at random (MCAR)

Data is missing completely at random if all observations have the same likelihood of being missing.

Some hypothetical examples of MCAR data include :
- Electronic time observations are missing, independent of what lane a swimmer is in.
- A scale is equally likely to produce missing values when placed on a soft surface or a hard surface (Van Buuren, 2018).

Missing at random (MAR)

When data is missing at random (MAR), the likelihood that a data point is missing is not related to the missing data itself, but may be related to other observed data.

Some hypothetical examples of MAR data include :
- A certain swimming lane is more likely to have missing electronic time observations, but the missing data isn't directly related to the actual time.

Missing not at random (MNAR)

When data is missing not at random (MNAR), the likelihood of a missing observation is related to its values. It can be difficult to identify MNAR data because the values of the missing data are unobserved. This can result in distorted data.

Some hypothetical examples of MNAR data include :
- When surveyed, people with more income are less likely to report their incomes.
- On a health survey, illicit drug users are less likely to respond to a question about illicit drug use.
- Individuals surveyed about their age are more likely to leave the age question blank when they are older.

Deletion

When data is MCAR or MAR, deletion may be a suitable method for dealing with missing values. However, when data is MNAR, deletion of missing observations can lead to bias.

In this section we cover three methods of data deletion for missing values :
(A) Listwise deletion   (B) Pairwise deletion   (C) Variable deletion
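Before choosing between deletion and imputation (the topic of Section 4.8), missing values first have to be detected. A minimal pandas sketch on hypothetical survey data, using simple mean imputation as the alternative to deletion:

```python
import numpy as np
import pandas as pd

# Hypothetical survey responses with one missing age and one missing income.
df = pd.DataFrame({
    "age":    [25, np.nan, 40, 35],
    "income": [30000, 52000, np.nan, 41000],
})

# Detect missing values per column.
print(df.isna().sum().to_dict())   # {'age': 1, 'income': 1}

# Simple (mean) imputation: replace each NaN with its column mean,
# keeping all four rows instead of deleting any.
imputed = df.fillna(df.mean())
print(int(imputed.isna().sum().sum()))  # 0
```

Mean imputation is only a reasonable default when the data is plausibly MCAR or MAR; under MNAR (e.g. high earners hiding their income) it inherits the same bias that deletion does.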
Table 4.7.1 : Methods of deletion for missing values

Listwise deletion
- Description : Delete all observations where the missing values occur.
- Advantages : Easy to implement.
- Disadvantages : Can result in biased estimates if the missing values are not MCAR. Wastes useful information. Can disrupt time series analysis by creating gaps in dates used for analysis.

Pairwise deletion
- Description : Uses all available data when computing means and covariances.
- Advantages : Simple to implement. Uses all available information.
- Disadvantages : Can result in biased estimates if the missing values are not MCAR. Results in different sample sizes being used for different computations. Requires that data follow a normal distribution.
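The first two methods in Table 4.7.1 can be sketched in pandas; the dataset below is a hypothetical toy example:

```python
import numpy as np
import pandas as pd

# Toy dataset with missing values in both columns.
df = pd.DataFrame({
    "height": [1.6, 1.7, np.nan, 1.8],
    "weight": [60.0, np.nan, 70.0, 80.0],
})

# Listwise deletion: drop every row that contains any missing value.
listwise = df.dropna()
print(len(listwise))  # 2 rows survive

# Pairwise deletion: pandas statistics use all available values per
# column (or per pair of columns for covariance), so different
# computations end up using different sample sizes.
means = df.mean()   # each column mean is computed from 3 observations
cov = df.cov()      # covariance uses only pairwise-complete rows
print(round(means["height"], 4), means["weight"])
```

This illustrates the table's disadvantage directly: the listwise mean of `height` would use 2 observations, while the pairwise mean uses 3, so the two methods can disagree.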
4.9 DATA TRANSFORMATION

UQ. Explain the different modes of data transformation in big data. (SPPU - Q. 5(b), May 18, 8 Marks)

Data transformation is the process of changing the format, structure, or values of data.

For data analytics projects, data may be transformed at two stages of the data pipeline. Organizations that use on-premises data warehouses generally use an ETL (extract, transform, load) process, in which data transformation is the middle step.

Today, most organizations use cloud-based data warehouses, which can scale compute and storage resources with latency measured in seconds or minutes. The scalability of the cloud platform lets organizations skip preload transformations and load raw data into the data warehouse, then transform it at query time, a model called ELT (extract, load, transform).

Processes such as data integration, data migration, data warehousing, and data wrangling all may involve data transformation.

Data transformation may be constructive (adding, copying, and replicating data), destructive (deleting fields and records), aesthetic (standardizing salutations or street names), or structural (renaming, moving, and combining columns in a database).

An enterprise can choose among a variety of ETL tools that automate the process of data transformation. Data analysts, data engineers, and data scientists also transform data using scripting languages such as Python or domain-specific languages like SQL.

4.9.1 Benefits of Data Transformation

Transforming data yields several benefits :
- Data is transformed to make it better-organized. Transformed data may be easier for both humans and computers to use.
- Properly formatted and validated data improves data quality and protects applications from potential landmines such as null values, unexpected duplicates, incorrect indexing, and incompatible formats.
- Data transformation facilitates compatibility between applications, systems, and types of data. Data used for multiple purposes may need to be transformed in different ways.

4.10 DATA STANDARDIZATION

Data Standardization is a data processing workflow that converts the structure of disparate datasets into a Common Data Format. As part of the Data Preparation field, Data Standardization deals with the transformation of datasets after the data is pulled from source systems and before it's loaded into target systems. Because of that, Data Standardization can also be thought of as the transformation rules engine in Data Exchange operations.

Data Standardization enables the data consumer to analyze and use data in a consistent manner. Typically, when data is created and stored in the source system, it's structured in a particular way that is often unknown to the data consumer. Moreover, datasets that might be semantically related may be stored and represented differently, thereby making it difficult for a data consumer to aggregate or compare the datasets.

Data Standardization Use Cases

There are two main use case categories in Data Standardization : Source-to-Target Mapping, and Complex Data Reconciliation. We typically divide the former into two sub-categories, thereby arriving at three use cases :

- Simple mapping from external sources : This use case handles on-boarding data from systems that are external to the organization, and mapping its keys and values to an output schema.
- Simple mapping from internal sources : This use case involves handling internal datasets that are based on inconsistent definitions and transforming them into a single trustworthy data set for the entire organization.
- Complex reconciliation : This use case involves the potential creation of complex calculated metrics that provide their own semantics based on defined business logic.
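Source-to-target mapping, the simplest standardization use case above, can be sketched in pandas as a structural transformation (renaming in Section 4.9's terms): each source's keys are mapped onto one output schema. The schemas and values are hypothetical:

```python
import pandas as pd

# Two hypothetical sources storing the same facts under different schemas.
source_a = pd.DataFrame({"cust_id": [1, 2], "amt_usd": [10.0, 20.0]})
source_b = pd.DataFrame({"CustomerID": [3], "Amount": [5.0]})

# Source-to-target mapping: every source key is renamed to the agreed
# Common Data Format before the datasets are combined.
target_schema = {"cust_id": "customer_id", "amt_usd": "amount",
                 "CustomerID": "customer_id", "Amount": "amount"}
standardized = pd.concat(
    [source_a.rename(columns=target_schema),
     source_b.rename(columns=target_schema)],
    ignore_index=True,
)
print(list(standardized.columns))  # ['customer_id', 'amount']
```

Once both sources share one schema, the aggregation and comparison problems described above disappear: `standardized["amount"].sum()` is now well defined across sources.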
4.11 HANDLING CATEGORICAL DATA WITH 2 AND MORE CATEGORIES

Categorical data is simply information aggregated into groups rather than being in numeric formats, such as Gender, Sex or Education Level. Categorical features are present in almost all real-life datasets, yet the current algorithms still struggle to deal with them. Take, for instance, XGBoost or most SKlearn models.

... deeply wrong once thought through. This is especially true for non-ordinal categorical data, meaning that the classes are not ordered (as they might be for Good = 0, Better = 1, Best = 2). A bit of clarity is needed to distinguish the approaches that Data Scientists should use from those that simply make the models run.

What Not To Do

(I) Label Encoding

This solution makes the models run, and it is most commonly used by aspiring Data Scientists. However, its simplicity comes with many issues.

(II) Distance and Order

Numbers hold relationships. For instance, four is twice two, and, when converting categories into numbers directly, these relationships are created despite not existing between the original categories. Looking at the example before, United Kingdom becomes twice France, and France plus United States equals Germany. Well, that's not exactly right... This is especially an issue for algorithms such as K...

(III) One-Hot Encoding

One-Hot Encoding is the most common, correct way to deal with non-ordinal categorical data. It consists of creating an additional feature for each group of the categorical feature and marking each observation as belonging (Value = 1) or not (Value = 0) to that group.

United States | France | Germany | United Kingdom
      1       |   0    |    0    |       0

... Increasing the number of features means that we might encounter cases of not having enough observations for each feature combination.
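Both approaches above can be shown side by side in pandas, using the country example from the text (the single-column DataFrame is hypothetical):

```python
import pandas as pd

# A non-ordinal categorical feature: the countries from the example.
df = pd.DataFrame({"country": ["United States", "France", "Germany",
                               "United Kingdom"]})

# What not to do: label encoding maps categories to integers, inventing
# an order and distances ("United Kingdom becomes twice France").
df["label"] = df["country"].astype("category").cat.codes

# One-hot encoding: one 0/1 column per category, no artificial order.
one_hot = pd.get_dummies(df["country"], dtype=int)
print(one_hot.loc[0].to_dict())
# {'France': 0, 'Germany': 0, 'United Kingdom': 0, 'United States': 1}
```

Note the trade-off stated above: one column became four, so with many categories the feature space grows quickly and some category combinations may have too few observations.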
Line Chart : This chart represents the change of data over a continuous interval of time.

Area Chart : This concept is based on the line chart. It fills the area between the polyline and the axis with color, thereby also representing better trend information.

Pie Chart : It is used to represent the proportion of different classifications. It is suitable for only one series of data. However, it can be made multi-layered to represent the proportion of data in different categories.

Funnel Chart : This chart represents the proportion of each stage and reflects the size of each module. It helps in comparing rankings.

Word Cloud Chart : It is a visual representation of text data. It requires a large amount of data, and the degree of discrimination needs to be high for users to perceive the most prominent items. It is not a very accurate analytical technique.

Gantt Chart : It shows the actual timing and the progress of the activity compared to the requirements.

Radar Chart : It is used to compare multiple quantized charts. It represents which variables in the data have higher values and which have lower values. A radar chart is used for comparing classification and series along with proportional representation.

Scatter Plot : It shows the distribution of variables as points over a rectangular coordinate system.

Map :
- Regional Map : It uses color to represent value distribution over a map.
- Point Map : It represents the geographical distribution of data as points on a geographical background. When the points are the same size, it becomes meaningless for single data, but if the points are drawn as bubbles, it also represents the size of the data in each region.
- Flow Map : It represents the relationship between an inflow area and an outflow area. It is drawn as a line connecting the geometric centers of gravity of the spatial elements. The use of dynamic flow lines helps reduce visual clutter.
- Heat Map : This represents the weight of each point in a geographic area. The color here represents the density.

4.13 HIVE DATA ANALYTICS

UQ. Explain Architecture of HIVE. (SPPU - Q. 6(a), Dec. 19, 8 Marks)

What is Hive ?

Hive, originally developed by Facebook and later owned by Apache, is a data storage system that was developed with a purpose to analyze organized data. Working under an open-source data platform called Hadoop, Apache Hive is an application system that ...

The concept of Structured Query Language or SQL software is involved in the process, which communicates with numerous databases and collects the required data. Understanding Hive big data through the lens of data analytics can help us get more insights into the working of Apache Hive.

By using a batch processing sequence, Hive generates data analytics in a much easier and organized form that also requires less time as compared to traditional tools. HiveQL is a language similar to SQL that interacts with the Hive database across various organizations and analyses necessary data in a structured format.

Why do we need it ?

Hive in big data is a milestone innovation that has eventually led to data analysis on a large scale. Even though small-scale organizations were able to manage medium-sized data and analyze it with traditional data analytics tools, big data could not be managed with such applications, and so there was a dire need for advanced software.

As data collection became a daily task and organizations expanded in all aspects, data collection became exponential and vast. Furthermore, data began to be dealt with in petabytes, which means vast storage of data. For this, organizations needed hefty equipment, and perhaps that is the reason why the release of a software like Apache Hive was necessary. Thus, Apache Hive was released with the purpose of analyzing big data and producing data-driven analogies.
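The two chart types asked about in the UQ for Section 4.12 (pie chart and scatter plot) can be produced with matplotlib. The data series, labels, and titles below are hypothetical; the figures are rendered off-screen here, though in practice one would call `plt.show()` or save to files:

```python
import io
import matplotlib
matplotlib.use("Agg")  # off-screen rendering, no display required
import matplotlib.pyplot as plt

# Pie chart: proportions of a single data series (hypothetical shares).
fig1, ax1 = plt.subplots()
ax1.pie([40, 30, 20, 10], labels=["North", "South", "East", "West"],
        autopct="%1.0f%%")
ax1.set_title("Sales share (pie chart)")

# Scatter plot: distribution of two variables over rectangular coordinates.
fig2, ax2 = plt.subplots()
ax2.scatter([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.0, 9.8])
ax2.set_xlabel("x")
ax2.set_ylabel("y")
ax2.set_title("Scatter plot")

# Render both figures to PNG bytes.
pie_png, scatter_png = io.BytesIO(), io.BytesIO()
fig1.savefig(pie_png, format="png")
fig2.savefig(scatter_png, format="png")
print(pie_png.getbuffer().nbytes > 0, scatter_png.getbuffer().nbytes > 0)
```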
Chapter Ends..
CHAPTER 5

Big Data Visualization

Syllabus Topics

Introduction to Data visualization, Challenges to Big data visualization, Conventional data visualization tools, Techniques used in Big data visualization, Data representations, Types of data visualization tools, Open-source data visualization tools, Tools used in data visualization, Visualizing Big Data, Case Study : Analysis ...

UQ. What is the need of data visualization? Also explain challenges to Data visualization and how to overcome these challenges. (Dec. 18, 8 Marks)