
Chap03 - Big Data and Data Retrieval

The document discusses big data, data retrieval, and related technologies. It defines big data as large volumes of structured and unstructured data that businesses deal with daily. Big data is characterized by its volume, velocity, and variety. The document outlines technologies like predictive analytics, NoSQL databases, and data integration that enable big data analytics. It also explains information retrieval as locating relevant information from resources and discusses data retrieval modes and key issues like security, searching, indexing, and retention.

Uploaded by

i21020791

BIG DATA & DATA RETRIEVAL

Unit Learning Objectives

 Understand the concept of Big Data
 Explain Big Data technologies
 Explain data retrieval and the challenges involved in it

Big Data

 Big data is a term that describes the large volume of data (structured
and unstructured) that inundates a business on a day-to-day basis.
 It can be analyzed for insights that lead to better decisions and strategic
business moves.

The THREE V’s of Big Data
Big data refers to datasets that cannot reasonably be handled by traditional
computers or tools because of their volume, velocity, and variety:
 Volume: Organizations collect data from a variety of sources, including
business transactions, social media, and sensor or machine-to-machine
data.
 Velocity: Data streams in at an unprecedented speed and must be
dealt with in a timely manner (e.g., RFID tags, sensors and smart
metering are driving the need to deal with torrents of data in near-real
time).
 Variety: Data comes in all types of formats, from structured, numeric data in
traditional databases to unstructured text documents, email, video, audio,
stock ticker data and financial transactions.
What can we do with Big Data?

 Take the data from any source and analyze it to find answers that
enable:
 Cost reductions
 Time reductions
 New product development and optimized offerings
 Smart decision making
 Many more…

When you combine big data with high-powered analytics,
you can accomplish business-related tasks, such as:

 Determining root causes of failures, issues and defects in near-real
time.
 Generating coupons at the point of sale based on the customer’s buying
habits
 Recalculating entire risk portfolios in minutes
 Detecting fraudulent behavior before it affects your organization
 Etc

Big data analytics
 Big data analytics is the process of collecting, organizing and analyzing
large sets of data (big data) to discover patterns and other useful information.
 Advantages to an organization:
 To better understand the information contained within the data
 To identify the data that is most important to the business and
future business decisions
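
As a minimal sketch of the collect, organize, analyze sequence described above, the following Python snippet groups a few made-up sales records by product and reports the best seller. The records, the field names, and the `top_products` helper are hypothetical and only illustrate the idea at toy scale; real big data workloads would typically run on distributed tooling.

```python
from collections import Counter

# Hypothetical transaction records (the "collected" data); all values are made up.
transactions = [
    {"customer": "c1", "product": "coffee", "amount": 3.50},
    {"customer": "c2", "product": "tea", "amount": 2.75},
    {"customer": "c1", "product": "coffee", "amount": 3.50},
    {"customer": "c3", "product": "bagel", "amount": 2.00},
    {"customer": "c2", "product": "coffee", "amount": 3.50},
]


def top_products(records, n=1):
    """Organize records by product, then return the n most frequent products."""
    counts = Counter(record["product"] for record in records)
    return counts.most_common(n)


print(top_products(transactions))  # [('coffee', 3)]
```

The pattern found here (coffee sells most often) is the kind of insight that tells a business which data matters most for its future decisions.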

Big Data & Key Technologies

Key Technologies that enable Big Data Analytics for businesses

 Predictive analytics: Big Data solutions that allow firms to
discover, evaluate, optimize, and deploy predictive models to
improve business performance or mitigate risk
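
To make the idea of a predictive model concrete, here is a small sketch that fits a straight line to past observations using ordinary least squares and then uses it to forecast an unseen value. The choice of a linear model, the function names `fit_line` and `predict`, and the sales figures are all assumptions made for illustration; commercial predictive analytics products use far richer models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: month number -> units sold (made-up numbers).
months = [1, 2, 3, 4]
units = [10, 20, 30, 40]

model = fit_line(months, units)   # discover the model from past data
forecast = predict(model, 5)      # deploy it on a month not yet observed
print(forecast)                   # 50.0
```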

 NoSQL databases: a mechanism for storage and retrieval of
data that is modeled in ways other than the tables of relational
databases (e.g., key-value, document, and graph databases)
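
The three data models named above can be sketched with plain Python dictionaries. This is only an in-memory illustration of how the data is shaped, not the API of any real NoSQL product; the store names, keys, and helper functions are hypothetical.

```python
# Key-value model: an opaque value looked up by a unique key.
kv_store = {"session:42": "user=alice;cart=3"}

# Document model: each record is a self-describing, nested document.
doc_store = {
    "order:1001": {
        "customer": {"name": "Alice"},
        "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
    }
}

# Graph model: nodes plus edges, stored here as an adjacency list.
graph_store = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": [],
}


def total_quantity(order):
    """Sum the quantities nested inside one order document."""
    return sum(item["qty"] for item in order["items"])


def friends_of_friends(graph, person):
    """Follow edges two hops out, excluding the person and direct neighbours."""
    direct = set(graph[person])
    reached = {second for first in direct for second in graph[first]}
    return reached - direct - {person}


print(kv_store["session:42"])                     # user=alice;cart=3
print(total_quantity(doc_store["order:1001"]))    # 3
print(friends_of_friends(graph_store, "alice"))   # {'carol'}
```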

 Stream analytics: an event data processing service providing
real-time analytics and insights from apps, devices, sensors,
and more
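
A common building block of stream analytics is a sliding window that is updated as each event arrives. The sketch below computes a rolling average over the most recent readings; the window size, the `rolling_average` name, and the sensor values are assumptions for illustration and do not represent any specific streaming service.

```python
from collections import deque


def rolling_average(events, window=3):
    """Yield the mean of the last `window` readings as each event arrives."""
    recent = deque(maxlen=window)  # older readings fall out automatically
    for value in events:
        recent.append(value)
        yield sum(recent) / len(recent)


# Hypothetical temperature readings arriving one at a time.
sensor_readings = [20.0, 22.0, 24.0, 30.0]
averages = list(rolling_average(sensor_readings))
print(averages[:3])  # [20.0, 21.0, 22.0]
```

Because the function is a generator, each result is available as soon as its event has been seen, without waiting for the full dataset.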

 Distributed file stores: a computer network where data is stored on
more than one node (in a replicated fashion) for redundancy and
performance
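
The following sketch shows one simple way replication across nodes could work: each block is hashed to a starting node and copied to the next few nodes in the list. The node names, the replication factor of 3, and the placement rule are assumptions chosen for illustration; real distributed file stores use their own placement policies.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical node names


def replica_nodes(block_id, nodes=NODES, replicas=3):
    """Pick `replicas` distinct nodes for a block, starting at a hashed position.

    Assumes replicas <= len(nodes), so the chosen nodes never repeat.
    """
    digest = hashlib.sha256(block_id.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


def readable(block_id, failed, nodes=NODES, replicas=3):
    """A block stays readable while at least one replica is on a live node."""
    return any(node not in failed
               for node in replica_nodes(block_id, nodes, replicas))


placement = replica_nodes("sales.csv#0")
print(placement)
# Losing two of the three replica nodes still leaves one readable copy.
print(readable("sales.csv#0", failed={placement[0], placement[1]}))  # True
```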

 Data virtualization: a technology that delivers information from
various data sources through a single access layer, without
requiring the data to be copied or moved first

 Data integration: tools for data orchestration across solutions such
as Amazon Elastic MapReduce (EMR), Apache Hive, Apache Pig,
Apache Spark, MapReduce, Couchbase, Hadoop, and MongoDB.

 Data preparation: software that eases the burden of sourcing,
shaping, cleansing, and sharing diverse and messy data sets to
accelerate data’s usefulness for analytics.
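
As a rough sketch of the shaping and cleansing steps, the snippet below trims whitespace, normalizes capitalization, parses amounts, and drops unusable or duplicate rows. The field names, the cleaning rules, and the `clean_records` helper are hypothetical; dedicated data preparation software covers many more cases.

```python
def clean_records(raw_rows):
    """Trim and normalize names, parse amounts, drop bad rows and duplicates."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        name = row.get("name", "").strip().title()
        try:
            # Remove thousands separators before converting to a number.
            amount = float(str(row.get("amount", "")).replace(",", ""))
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if not name or (name, amount) in seen:
            continue  # skip rows with no name and exact duplicates
        seen.add((name, amount))
        cleaned.append({"name": name, "amount": amount})
    return cleaned


# Hypothetical messy input with stray spaces, mixed case, and bad values.
messy = [
    {"name": "  alice ", "amount": "1,200.50"},
    {"name": "ALICE", "amount": "1200.5"},
    {"name": "bob", "amount": "n/a"},
    {"name": "", "amount": "10"},
    {"name": "carol", "amount": 75},
]

print(clean_records(messy))
# [{'name': 'Alice', 'amount': 1200.5}, {'name': 'Carol', 'amount': 75.0}]
```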

Information retrieval

 Information retrieval is the task of obtaining, from a collection of
information resources, those resources that are relevant to an
information need.
 Information retrieval can be divided into four main stages:
1. Identifying the precise subject to search.
2. Locating the search subject in a directory that directs the
searcher to the related documents.
3. Locating those documents.
4. Identifying where the required information is located in the
documents.
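
The four stages above can be sketched with a small inverted index, where the index plays the role of the directory. This is a minimal illustration under simplifying assumptions (single-word subjects, whitespace tokenization); the sample documents and the `build_index` and `retrieve` functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical document collection.
documents = {
    "doc1": "big data needs distributed storage",
    "doc2": "data retrieval depends on indexing",
    "doc3": "indexing speeds up searching big collections",
}


def build_index(docs):
    """Build the directory: term -> {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(position)
    return index


def retrieve(index, docs, subject):
    term = subject.strip().lower()        # stage 1: identify the subject
    postings = index.get(term, {})        # stage 2: look it up in the directory
    hits = []
    for doc_id in sorted(postings):
        text = docs[doc_id]               # stage 3: locate the document
        hits.append({
            "doc": doc_id,
            "positions": postings[doc_id],  # stage 4: where the term occurs
            "text": text,
        })
    return hits


index = build_index(documents)
for hit in retrieve(index, documents, "Indexing"):
    print(hit["doc"], hit["positions"])
# doc2 [4]
# doc3 [0]
```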
Data Retrieval Modes

 Different retrieval modes allow you to access the data stored
in a historian in different ways, e.g.:
 Multimedia mode: uses the Internet; data is accessed by
submitting a search query on a website.
 Documented mode: normally provides hard copies of data
as papers and documents.
 Verbal mode: an easy and spontaneous retrieval mode that
requires only a known language.

Key issues involved in data retrieval

 Security
 Searching
 Indexing
 Retention

Discussion
