1 Bda

The document provides an overview of Big Data, defining it as large and complex datasets characterized by volume, velocity, variety, veracity, and value. It discusses the sources, importance, and applications of Big Data across various industries, as well as key technologies like Hadoop and cloud computing that facilitate its processing and analysis. Additionally, it highlights the convergence of trends that have contributed to the rise of Big Data, including advancements in digital data generation and storage solutions.

Uploaded by

itstudents589

CCS334 - BIG DATA ANALYTICS

UNIT 1 - UNDERSTANDING BIG DATA 5

Introduction to big data – convergence of key trends – unstructured
data – industry examples of big data – web analytics – big data
applications – big data technologies – introduction to Hadoop – open
source technologies – cloud and big data – mobile business
intelligence – crowdsourcing analytics – inter and trans firewall
analytics.
1. Introduction to Big Data
• Big Data refers to large and complex sets of data that are difficult to
process and analyze using traditional data management tools and
techniques.
• Data that is very large in size is called Big Data. We normally work with
data measured in megabytes (Word documents, Excel sheets) or at most gigabytes
(movies, code), but data on the order of petabytes, i.e. 10^15 bytes, is called Big Data.
• The three main characteristics that define Big Data are commonly known
as the three Vs:
– Volume
– Velocity
– Variety
– Volume: Big Data involves massive quantities of data, ranging from terabytes
to petabytes and even exabytes. Traditional databases are often unable to
handle such enormous volumes of information.
– Velocity: The data in Big Data environments is generated and collected at a
high speed. This includes real-time data from social media platforms, online
transactions, and various other sources, which requires immediate processing
and analysis.
– Variety: Big Data is not limited to structured data like traditional databases; it
includes unstructured and semi-structured data as well. This data can take the
form of text, images, videos, audio files, log files, and more. Handling this
diverse data requires specialized tools and technologies.
• Beyond the three Vs, two additional characteristics are sometimes
considered:
– Veracity: This refers to the quality and reliability of the data. With large
volumes of data from different sources, ensuring the accuracy and credibility
of the data becomes crucial.
– Value: The ultimate goal of Big Data is to extract valuable insights and
knowledge from the vast amount of information. The value lies in the ability to
make data-driven decisions, uncover patterns, predict trends, and gain a
competitive advantage.
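To make the size scale mentioned above concrete, the following is a minimal sketch that converts byte counts into decimal units and counts how many files fit in one petabyte. The function names and the 2 GB file size are illustrative assumptions, not part of the original slides.

```python
# Decimal (SI) byte units, matching the text: 1 PB = 10**15 bytes.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]

def human_readable(num_bytes: int) -> str:
    """Express a byte count in the largest decimal unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

def files_per_petabyte(file_size_bytes: int) -> int:
    """How many files of the given size fit into one petabyte."""
    return 10**15 // file_size_bytes

print(human_readable(5 * 10**6))        # a document-sized file
print(human_readable(10**15))           # the Big Data scale from the text
print(files_per_petabyte(2 * 10**9))    # hypothetical 2 GB movie files
```

One petabyte holds half a million such 2 GB files, which illustrates why a single machine's storage is not enough at this scale.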
Sources of Big Data
These data come from many sources, such as:
• Social networking sites: Facebook, Google, and LinkedIn each generate huge
amounts of data every day, as they have billions of users worldwide.
• E-commerce sites: Sites like Amazon, Flipkart, and Alibaba generate huge
volumes of logs from which users' buying trends can be traced.
• Weather stations: Weather stations and satellites produce very large volumes
of data, which are stored and processed to forecast the weather.
• Telecom companies: Telecom giants like Airtel and Vodafone study user
trends and design their plans accordingly; for this they store the data of
their millions of users.
• Share market: Stock exchanges across the world generate huge amounts
of data through their daily transactions.
Importance of Big Data

• The importance of Big Data lies not in how much data there is, but in how it
can be used. Data can be taken from various sources and analyzed to find
answers that enable:
– Cost reduction.
– Time reduction.
– New product development with optimized offers.
– Smarter decision making.

Types of Big Data

Data generated in bulk and at high velocity can be categorized as:
– Structured Data: relational data organized in tables with fixed rows and columns.
– Semi-structured Data: self-describing data without a fixed schema, for example XML and JSON.
– Unstructured Data: Data of different formats: document files, multimedia files,
images, backup files, etc.
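The three categories can be illustrated with a short sketch. The sample records, field names, and review text below are made up for illustration; only the Python standard library is used.

```python
import csv
import io
import json

# Structured: fixed columns, as in a relational table (here a small CSV extract).
structured = io.StringIO("order_id,customer,amount\n1,Asha,250.0\n2,Ravi,99.5\n")
rows = list(csv.DictReader(structured))
total = sum(float(r["amount"]) for r in rows)

# Semi-structured: self-describing records whose fields may vary (JSON).
semi = json.loads('[{"user": "asha", "tags": ["sale"]}, {"user": "ravi"}]')
tagged_users = [rec["user"] for rec in semi if "tags" in rec]

# Unstructured: free text with no schema; any structure has to be derived.
review = "Great phone, battery lasts two days. Camera is average."
word_count = len(review.split())

print(total, tagged_users, word_count)
```

Structured data can be queried by column name directly, semi-structured data needs a check for optional fields, and unstructured data needs processing before anything can be measured.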
2. Big Data – Convergence of Key Trends

• Big data is the result of the convergence of several key trends that have
emerged in the field of data management and technology.

• These trends have collectively contributed to the generation, collection, and
analysis of massive amounts of data, leading to the concept of big data.
• Some of the key trends that have converged to create the big data phenomenon are:
1. Proliferation of Digital Data: The digital revolution has led to an explosion of
data from various sources, including social media, websites, mobile devices,
sensors, IoT devices, and more. The increased digitization of information has
contributed significantly to the volume and variety of data available.

2. Advancements in Data Storage: The cost of data storage has dramatically reduced
over the years, making it economically viable to store large amounts of data. This
has enabled organizations to accumulate and retain vast datasets for longer periods.
3. Distributed Computing: The development of distributed computing
frameworks like Apache Hadoop and Apache Spark has revolutionized
data processing capabilities. These technologies allow for the distributed
storage and parallel processing of massive datasets across clusters of
computers, providing the scalability needed for big data.
4. Cloud Computing: Cloud platforms offer scalable and cost-effective
solutions for storing and processing big data. They provide easy access to
large storage capacities and computing resources, allowing organizations to
handle big data without significant upfront investments.
5. Internet of Things (IoT): IoT devices generate enormous amounts of data
from sensors and connected devices. The integration of IoT technology has
contributed to the velocity and volume of data in big data environments.
6. Social Media and User-Generated Content: The widespread adoption of
social media platforms has led to the creation of vast amounts of user-
generated content, including text, images, videos, and more. This
unstructured data adds to the variety of big data.
7. Machine Learning and Artificial Intelligence: The rise of machine
learning and AI has enabled advanced data analysis, pattern recognition,
and predictive modeling, making it possible to extract valuable insights
from large and complex datasets.
8. Open Data Initiatives: Governments and organizations worldwide have
initiated open data projects, making large datasets publicly available. These
initiatives have contributed to the growth and accessibility of big data.
9. Data Democratization: Data democratization aims to make data accessible
to a broader audience within an organization, empowering users to access
and analyze data independently. This trend has led to more data-driven
decision-making and increased reliance on big data.
10. Mobile Technology: The widespread use of smartphones and mobile
applications has generated vast amounts of data related to user behavior,
location, preferences, and more, further contributing to big data.
1.4 INDUSTRY EXAMPLES OF BIG DATA
• Big Data has found applications in various industries, revolutionizing how
businesses operate and make decisions.
1. Retail and E-commerce:
- Customer Analytics: Retailers analyze vast amounts of customer data,
including purchase history, online behavior, and social media interactions,
to understand customer preferences and provide personalized shopping
experiences.
- Inventory Management: Big Data helps optimize inventory levels by
predicting demand patterns, minimizing stockouts, and reducing excess
inventory.
- Price Optimization: Retailers use Big Data analytics to dynamically adjust
prices based on market trends, competitor pricing, and customer demand.

Example: Customer Analytics and Personalization

2. Healthcare:
- Patient Care: Big Data analytics is used to monitor patient health, track
medical records, and identify patterns that can lead to better treatment
outcomes and more precise diagnoses.
- Drug Discovery: Big Data is leveraged to analyze vast biological datasets,
accelerating drug discovery and development processes.
- Public Health: Health agencies use Big Data to monitor and respond to
disease outbreaks, track healthcare trends, and optimize resource allocation.
Example: Electronic Health Records (EHR) and Patient Monitoring
3. Finance:
- Fraud Detection: Big Data analytics helps financial institutions identify
suspicious transactions and patterns to prevent fraud and enhance security.
- Risk Assessment: Banks use Big Data to assess credit risks, investment
opportunities, and market trends to make informed decisions.
- Algorithmic Trading: Financial firms employ Big Data and machine
learning to analyze market data in real-time and make high-frequency
trading decisions.
4. Manufacturing:
- Predictive Maintenance: Big Data is used to monitor equipment health in
real-time, enabling proactive maintenance to minimize downtime and
reduce costs.
- Supply Chain Optimization: Big Data analytics helps optimize supply
chain operations, improve logistics, and enhance overall efficiency.
- Quality Control: Manufacturers analyze production data to identify
defects, improve product quality, and optimize production processes.
Example: Internet of Things (IoT) and Supply Chain Optimization
5. Transportation and Logistics:
- Route Optimization: Big Data helps optimize transportation routes,
reduce delivery times, and minimize fuel consumption for logistics
companies.
- Fleet Management: Fleet operators use Big Data to monitor vehicle
health, driver behavior, and safety compliance.
- Real-time Traffic Analysis: Big Data analytics enables real-time traffic
monitoring and helps in managing traffic flow and congestion.
6. Marketing and Advertising:
- Targeted Advertising: Big Data allows advertisers to target specific
customer segments with personalized advertisements based on their
interests and behavior.
- Social Media Analytics: Companies analyze social media data to
understand customer sentiments, monitor brand reputation, and engage
with customers effectively.
Example: Social Media Analytics and Targeted Advertising

• In the marketing and advertising industry, Big Data is used for social
media analytics to track customer sentiments, opinions, and interactions on
platforms like Twitter, Facebook, and Instagram.
• Moreover, Big Data is employed in targeted advertising, where algorithms
analyze customer data to deliver personalized ads to specific
demographics, increasing the effectiveness of ad campaigns.
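As an illustration of the fraud detection use case listed under Finance, the sketch below flags a transaction whose amount deviates strongly from a customer's past spending. The z-score rule, the threshold of 3, and the sample amounts are assumptions chosen for illustration; production systems typically combine many more signals.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag an amount whose z-score against past amounts exceeds the threshold."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts are identical: anything different stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical past transaction amounts for one customer.
history = [42.0, 38.5, 51.0, 45.0, 40.0, 47.5]

print(is_suspicious(history, 44.0))    # close to the usual spending
print(is_suspicious(history, 900.0))   # far outside the usual range
```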
1.5 WEB ANALYTICS
• Web analytics is the process of analyzing the behavior of visitors to a
website.
• This involves tracking, reviewing and reporting data to measure web
activity, including the use of a website and its components, such as
webpages, images and videos.
• Key components of web analytics include:
1. Data Collection
2. Data Measurement
3. Data Analysis
4. Reporting
• Popular web analytics tools:
1. Google Analytics: One of the most widely used web analytics tools, provided
by Google. It offers a comprehensive set of features to track and analyze
website data.
2. Adobe Analytics: A robust analytics platform that provides in-depth insights
and reports for large enterprises and e-commerce websites.
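The four components listed above (collection, measurement, analysis, reporting) can be sketched over a tiny in-memory clickstream. The hit records and metric names are hypothetical, and this is not how any particular analytics product computes its metrics.

```python
from collections import Counter

# Data collection: hypothetical hit records of (visitor id, page).
hits = [
    ("v1", "/home"), ("v1", "/products"),
    ("v2", "/home"),
    ("v3", "/home"), ("v3", "/products"), ("v3", "/checkout"),
]

def summarize(hits):
    """Data measurement and analysis: pageviews, unique visitors, bounce rate."""
    pageviews = Counter(page for _, page in hits)
    pages_per_visitor = Counter(visitor for visitor, _ in hits)
    visitors = len(pages_per_visitor)
    # A bounce is counted here as a visitor who viewed exactly one page.
    bounces = sum(1 for n in pages_per_visitor.values() if n == 1)
    return {
        "pageviews": dict(pageviews),
        "unique_visitors": visitors,
        "bounce_rate": bounces / visitors,
    }

# Reporting: print each metric.
report = summarize(hits)
for metric, value in report.items():
    print(f"{metric}: {value}")
```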
Key Features of Google Analytics:
• Data Collection:
– Google Analytics uses a JavaScript tracking code installed on website pages to
collect data on user interactions, pageviews, events, and more.
– It can track visitors across sessions and devices, providing a comprehensive
view of user behavior.
• Real-time Reporting:
– Google Analytics offers real-time reporting, allowing users to monitor website
activity as it happens.
– This feature is particularly useful for tracking the immediate impact of
marketing campaigns or events.
• Audience Insights:
– Google Analytics provides valuable insights into the website's audience,
including demographics, interests, geographical location, and behavior.
– This data helps businesses understand their target audience better.
1.6 BIG DATA APPLICATIONS
1. Healthcare and Medical Research: Big Data is used to store and analyze
vast amounts of patient data, electronic health records, medical imaging,
and genomic data. This helps in disease diagnosis, drug development,
personalized medicine, and improving healthcare outcomes.
2. E-commerce and Retail: Big Data is applied to analyze customer behavior,
preferences, and purchase patterns. This data is used to offer personalized
product recommendations, optimize pricing strategies, and enhance the
overall shopping experience.
3. Financial Services: Big Data plays a crucial role in fraud detection, risk
assessment, and algorithmic trading. Financial institutions use data
analytics to analyze transaction data, customer behavior, and market trends
to make informed decisions.
4. Manufacturing and Industry 4.0: Big Data and IoT are utilized to monitor
and optimize manufacturing processes. Sensors collect real-time data from
machines, helping predict maintenance needs, improve efficiency, and
reduce downtime.
5. Transportation and Logistics: Big Data is employed in route optimization,
supply chain management, and fleet tracking. Analyzing data from GPS,
sensors, and weather forecasts helps streamline logistics operations and
reduce costs.
6. Telecommunications: Big Data is used to analyze call data records,
customer behavior, and network performance. This data is utilized to
improve network efficiency, optimize service offerings, and enhance
customer satisfaction.
7. Media and Entertainment: Big Data enables content recommendation
engines, personalized advertising, and audience analysis. Media companies
use data analytics to deliver tailored content and marketing campaigns to
their audiences.
8. Energy and Utilities: Big Data is applied to analyze energy consumption
patterns, monitor equipment performance, and optimize energy
distribution. This helps in energy conservation and improved resource
management.
1.7 BIG DATA TECHNOLOGIES
• Big data technology can be defined as the software utilities primarily
designed to analyze, process, and extract information from very large
datasets with extremely complex structures.
Types of Big Data Technology

1. Operational Big Data Technologies

This type of big data technology mainly covers the basic day-to-day data
that people generate and process.

• Examples
• Online ticket booking system, e.g., buses, trains, flights, and movies, etc.
• Online trading or shopping from e-commerce websites like Amazon,
Flipkart, Walmart, etc.
• Online data on social media sites, such as Facebook, Instagram, WhatsApp,
etc.
• The employees' data or executives' particulars in multinational companies.
2. Analytical Big Data Technologies
• Analytical big data technologies are commonly described as an advanced
layer on top of operational big data: they analyze the accumulated data to
support decisions.
• Examples
• Stock market data
• Weather forecasting data and time series analysis
• Medical health records, where doctors can monitor the health status of an
individual
• Space mission databases, where every piece of information about a
mission is important

We can categorize the leading big data technologies into the following four
sections:
• Data Storage
• Data Mining
• Data Analytics
• Data Visualization
Here are some prominent Big Data technologies:
• Apache Hadoop: distributed storage (HDFS) and batch processing
(MapReduce).
• Apache Spark: batch processing, real-time streaming, machine learning,
and graph processing.
• NoSQL Databases: non-relational databases such as MongoDB, Cassandra, and
HBase.
• Apache Flink: Flink is another real-time stream processing engine similar to
Apache Spark.
• Apache Hive: Hive is a data warehousing and SQL-like querying
framework built on top of Hadoop.
• Apache Pig: Pig is a high-level platform for processing and analyzing large
datasets in Hadoop.
• Apache HBase: HBase is a distributed, columnar NoSQL database built to
work on top of Hadoop.
• Apache Storm: Storm is a distributed real-time computation system for
processing streaming data.
1.8 INTRODUCTION TO HADOOP
• Hadoop is an open-source software framework for storing and processing
big data.
– It became an Apache Software Foundation project in 2006 and was inspired by
Google's papers describing the Google File System (GFS, 2003) and the MapReduce
programming model (2004).
• It allows users to store and process big data in a distributed environment
across clusters of computers using simple programming models.
• It is designed to scale up from single servers to thousands of machines,
each offering local computation and storage.
• It is used by many organizations, including Yahoo, Facebook, and IBM, for
a variety of purposes such as data warehousing, log processing, and
research.
• Hadoop has been widely adopted in the industry and has become a key
technology for big data processing.
Advantages
• Scalability: Hadoop can easily scale to handle large amounts of
data by adding more nodes to the cluster.
• Cost-effective: Hadoop is designed to work with commodity
hardware, which makes it a cost-effective option for storing and
processing large amounts of data.
• Fault-tolerance: Hadoop’s distributed architecture provides built-in
fault-tolerance, which means that if one node in the cluster goes
down, the data can still be processed by the other nodes.
• Flexibility: Hadoop can process structured, semi-structured, and
unstructured data, which makes it a versatile option for a wide range of
big data scenarios.
• Open-source: Hadoop is open-source software, which means that it is free
to use and modify. This also allows developers to access the source code
and make improvements or add new features.
• Large community: Hadoop has a large and active community of
developers and users who contribute to the development of the software,
provide support, and share best practices.
• Integration: Hadoop is designed to work with other big data technologies
such as Spark, Storm, and Flink, which allows for integration with a wide
range of data processing and analysis tools.

The key components of Hadoop are

1. Hadoop Distributed File System (HDFS)
2. MapReduce
3. YARN (Yet Another Resource Negotiator)
4. Hadoop Common
1. Hadoop Distributed File System (HDFS)
– HDFS is a distributed file system that stores data across multiple servers in a
Hadoop cluster.
– It breaks large files into smaller blocks and replicates them across different
nodes for fault tolerance.
– HDFS is highly scalable and fault-tolerant, making it suitable for storing
massive datasets.
2. MapReduce
– MapReduce is a programming model and processing engine in Hadoop for
parallel data processing.
– It divides the data processing task into two stages: the Map stage, where data
is transformed and filtered, and the Reduce stage, where the results of the Map
stage are aggregated to produce the final output.
– MapReduce enables distributed processing of data across multiple nodes in the
cluster, making it efficient for processing large-scale datasets.
3. YARN (Yet Another Resource Negotiator)
– YARN is a resource management layer in Hadoop that manages and allocates
resources (CPU and memory) to applications running on the cluster.
– It separates the resource management from job scheduling and execution,
providing more flexibility and efficient resource utilization.
4. Hadoop Common
– Hadoop Common provides the shared utilities and libraries that support the
other components of Hadoop.
– It includes various tools, libraries, and APIs that make it easier to develop and
manage Hadoop applications.
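The Map and Reduce stages described above can be sketched in plain Python with the classic word-count task. This is a single-process illustration of the programming model only; it does not use Hadoop's API, and the function names are chosen for this example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework does between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: aggregate all values for one key into a final count."""
    return key, sum(values)

def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

lines = ["big data needs big storage", "data drives decisions"]
counts = word_count(lines)
print(counts)
```

In a real cluster, each input split is mapped on a different node and each key group is reduced in parallel; the logic per record stays as simple as shown here.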
Hadoop is highly suitable for processing Big Data because of its
distributed nature, fault tolerance, and scalability. It allows organizations to
store and process vast amounts of data on commodity hardware, which is
more cost-effective compared to traditional storage solutions.
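The fault tolerance described for HDFS comes from splitting files into blocks and keeping several copies of each block on different nodes. The sketch below assumes a 128 MB block size and a replication factor of 3 as typical default values, and uses a simple round-robin placement; real HDFS placement is rack-aware and more involved. Node names are hypothetical.

```python
import math

def place_blocks(file_size_mb, nodes, block_size_mb=128, replication=3):
    """Split a file into fixed-size blocks and assign each to `replication` distinct nodes."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    placement = {}
    for block in range(num_blocks):
        # Simplified round-robin placement, not HDFS's actual policy.
        placement[block] = [nodes[(block + r) % len(nodes)] for r in range(replication)]
    return placement

def readable_after_failure(placement, failed_node):
    """A file stays readable if every block has at least one replica on a live node."""
    return all(any(n != failed_node for n in replicas) for replicas in placement.values())

placement = place_blocks(300, ["n1", "n2", "n3", "n4"])
print(placement)
print(readable_after_failure(placement, "n3"))
```

A 300 MB file becomes three blocks, and losing any single node still leaves two replicas of every block available.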
1.9 OPEN SOURCE TECHNOLOGIES
• Hadoop is a significant open-source technology used for processing and
storing large datasets in a distributed computing environment.
• It is a key player in the Big Data ecosystem, providing a powerful
framework for handling massive volumes of data efficiently and cost-
effectively.
Hadoop Ecosystem
• The Hadoop ecosystem consists of various other open-source projects and
tools that integrate with Hadoop, extending its capabilities.
– Some popular components of the Hadoop ecosystem include Apache Hive
(SQL-like querying), Apache Pig (data processing), Apache HBase (NoSQL
database), and Apache Spark (fast in-memory batch and stream processing).
• Hadoop's open-source nature has contributed to its widespread adoption
and continuous development, making it a fundamental technology for
handling Big Data in the modern data-driven world.
1.10 CLOUD AND BIG DATA
• Cloud computing and Big Data are two transformative technologies that
have become intertwined and complement each other.
• Cloud computing provides scalable and on-demand resources over the
internet, while Big Data deals with large volumes of data that require
specialized processing and storage.
• The integration of cloud and Big Data has enabled organizations to harness
the full potential of data analytics and unlock valuable insights.
• Here are some ways in which cloud and Big Data intersect:
– Scalability
– Storage and Data Management
– Distributed Processing
– Cost Optimization
• Cloud computing and Big Data are two powerful technologies that have
revolutionized how organizations manage and control data.

1. Data Storage and Scalability

• Cloud computing provides scalable and flexible storage options through
services like Amazon S3 (Simple Storage Service), Google Cloud Storage,
and Microsoft Azure Blob Storage.
• These cloud storage services allow organizations to store vast amounts of
data, including structured, semi-structured, and unstructured data, without
the need to invest in physical hardware.

2. On-Demand Computing Resources

• Big Data processing often requires substantial computing power and
resources.
• Cloud computing offers on-demand access to virtual machines,
containers, and computing clusters, enabling organizations to provision
resources as needed.
3. Data Processing and Analytics
– Big Data frameworks and tools, such as Apache Hadoop, Apache
Spark, and Apache Flink, can be deployed on cloud infrastructure.
– Cloud providers offer managed services for these technologies,
simplifying the setup and management of Big Data clusters.

4. Cost Efficiency
– Cloud computing's pay-as-you-go model allows organizations to pay only for
the resources they use.
– This cost efficiency is especially beneficial for Big Data workloads, which may
have varying processing needs over time.

5. Real-Time Data Processing

– Cloud-based streaming platforms, such as Amazon Kinesis, Google
Cloud Pub/Sub, and Azure Event Hubs, enable real-time data ingestion
and processing.
– These platforms process streaming data in real time, facilitating applications
like real-time analytics, fraud detection, and monitoring systems.
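The cost efficiency point above can be worked through with simple arithmetic. All prices below are hypothetical placeholders, not actual rates of any cloud provider; the sketch only shows how pay-as-you-go compares with a fixed cluster when a workload runs for a few hours a month.

```python
def on_demand_cost(hours_per_month, nodes, rate_per_node_hour):
    """Pay-as-you-go: pay only for the node-hours actually used."""
    return hours_per_month * nodes * rate_per_node_hour

def fixed_cluster_cost(nodes, monthly_cost_per_node):
    """Owned or reserved cluster: pay for every node all month, used or not."""
    return nodes * monthly_cost_per_node

def break_even_hours(monthly_cost_per_node, rate_per_node_hour):
    """Monthly usage above which the fixed cluster becomes cheaper."""
    return monthly_cost_per_node / rate_per_node_hour

# Hypothetical figures: 10 nodes, 0.50 per node-hour on demand, 200 per node per month fixed.
cloud = on_demand_cost(hours_per_month=40, nodes=10, rate_per_node_hour=0.50)
fixed = fixed_cluster_cost(nodes=10, monthly_cost_per_node=200)
threshold = break_even_hours(monthly_cost_per_node=200, rate_per_node_hour=0.50)

print(cloud, fixed, threshold)
```

With these assumed figures, a job that needs the cluster for 40 hours a month costs a tenth of the fixed cluster, and the fixed option only wins above 400 hours of use per month.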
• Real-time examples that demonstrate the synergy (combined action or
operation) between cloud computing and big data include:

1. Real-Time Social Media Analytics
2. IoT Data Processing for Smart Cities
3. Real-Time Fraud Detection in Financial Services
4. Real-Time Health Monitoring and Predictive Analytics
5. Real-Time Supply Chain Management
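Most of the real-time examples listed above rely on computing a metric over a recent time window of a stream. The following is a minimal sliding-window counter, assuming events arrive in timestamp order; the class name and the 60-second window are illustrative choices, and managed streaming services provide their own windowing facilities.

```python
from collections import deque

class SlidingWindowCounter:
    """Count events seen within the last `window_seconds` of a stream."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp):
        """Record an event and return how many events are still inside the window."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps)

counter = SlidingWindowCounter(window_seconds=60)
# Hypothetical event times in seconds, e.g. card transactions for one account.
window_counts = [counter.add(t) for t in [0, 10, 20, 65, 70, 200]]
print(window_counts)
```

A sudden rise in the windowed count (many transactions or sensor alerts within a minute) is the kind of signal a real-time fraud or monitoring system would act on.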
1.11 MOBILE BUSINESS INTELLIGENCE
• Mobile Business Intelligence (Mobile BI) refers to the ability to access and
interact with business intelligence data and reports on mobile devices, such
as smartphones and tablets.
• It allows decision-makers and business users to access critical information
anytime, anywhere, and make data-driven decisions on the go.
• Mobile BI leverages the power of mobile technology and data analytics to
provide real-time insights and enable better business outcomes.

Here are some key aspects of Mobile Business Intelligence:

1. Mobile BI Applications
• Mobile BI applications are specifically designed to deliver business
intelligence content to mobile devices.
• These applications can be native apps developed for specific mobile
platforms (iOS, Android, etc.) or responsive web apps that adapt to
different screen sizes and device types.
2. Real-Time Data Access
• Mobile BI allows users to access real-time data, KPIs (Key Performance
Indicators), and dashboards on their mobile devices.
• This real-time access empowers decision-makers to stay informed and
respond quickly to changing business conditions.

3. Interactive Dashboards and Reports

• Mobile BI tools provide interactive and user-friendly dashboards and reports
optimized for mobile screens.
• Users can drill down into data, apply filters, and perform data exploration and
analysis directly on their mobile devices.

4. Data Visualization
• Mobile BI emphasizes data visualization techniques to present complex data
in an easily digestible format on small screens.
• Visualizations such as charts, graphs, and maps help users understand trends,
patterns, and insights quickly.
5. Offline Access
• Some Mobile BI applications offer offline access to data, allowing users to
access and view reports even when they are not connected to the internet.

6. Secure Data Access

• Mobile BI solutions prioritize data security and provide authentication and
authorization mechanisms to ensure that only authorized users can access
sensitive business data on mobile devices.

7. Push Notifications
• Mobile BI applications can send push notifications to users to alert them
about important events or changes in data, prompting them to take immediate
action.

8. Location-Based Analytics
• Mobile BI can leverage GPS and location data to provide location-based
insights, particularly useful for field sales teams, delivery personnel, and
location-specific business analysis.
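The real-time KPI access and push notification aspects above can be combined in a small sketch: a server-side check that decides which alerts to push to mobile users. The rule structure, KPI names, and target values are hypothetical, and actual delivery to a device would go through a platform notification service that is not shown.

```python
from dataclasses import dataclass

@dataclass
class KpiRule:
    name: str
    minimum: float  # alert when the latest value falls below this target

def pending_alerts(latest_values, rules):
    """Return alert messages for KPIs whose latest value is below the rule's minimum."""
    alerts = []
    for rule in rules:
        value = latest_values.get(rule.name)
        if value is not None and value < rule.minimum:
            alerts.append(f"{rule.name} is {value}, below target {rule.minimum}")
    return alerts

rules = [KpiRule("daily_sales", 10000.0), KpiRule("on_time_delivery_pct", 95.0)]
latest = {"daily_sales": 12500.0, "on_time_delivery_pct": 91.5}

alerts = pending_alerts(latest, rules)
print(alerts)
```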
1.12 CROWD SOURCING ANALYTICS
• Crowdsourcing analytics, also known as collaborative analytics or collective
intelligence, is a method of gathering and analyzing data by leveraging the
collective knowledge, expertise, and efforts of a diverse group of individuals or a
crowd.
• Crowdsourcing analytics is particularly valuable when dealing with complex or
large-scale data analysis tasks.

Example: Sentiment Analysis for Product Feedback

• Imagine a company that manufactures and sells consumer electronics products.
They have recently launched a new smartphone model and want to gauge the
sentiment and feedback of customers who have purchased the product.
• Instead of relying solely on internal customer support data, they decide to
leverage crowdsourcing analytics to collect a broader range of opinions.
• The exercise proceeds through the following steps:
1. Designing the Survey
2. Engaging the Crowd
3. Data Collection
4. Data Analysis
5. Insights and Actionable Findings
6. Comparison to Internal Data
7. Decision Making
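The data analysis step in the example above can be sketched as aggregating the sentiment labels that several crowd contributors assign to each review. Majority voting is one common aggregation choice and is an assumption here, as are the review ids and labels.

```python
from collections import Counter

def majority_label(labels):
    """Return the most common label, or 'undecided' when first place is tied."""
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "undecided"
    return ranked[0][0]

def summarize_crowd(crowd_labels):
    """Compute a consensus label per review and the overall label distribution."""
    consensus = {review: majority_label(labels) for review, labels in crowd_labels.items()}
    return consensus, Counter(consensus.values())

# Hypothetical labels from crowd contributors for three product reviews.
crowd_labels = {
    "review_1": ["positive", "positive", "neutral"],
    "review_2": ["negative", "negative", "negative"],
    "review_3": ["positive", "negative"],
}

consensus, totals = summarize_crowd(crowd_labels)
print(consensus)
print(totals)
```

The resulting distribution is what would then be compared against internal customer support data in the later steps.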
1.13 INTER AND TRANS FIREWALL ANALYTICS
• Inter and Trans Firewall Analytics are two types of analytics used to
analyze network traffic and security data within and between firewalls.

1. Inter-Firewall Analytics
• Inter-Firewall Analytics refers to the analysis of network traffic and
security events across multiple firewalls within an organization's network.
• In large enterprises or complex network environments, there might be
multiple firewalls deployed to protect different segments or zones of the
network.

2. Trans Firewall Analytics

• Trans Firewall Analytics, on the other hand, focuses on analyzing traffic
and security events that traverse a specific firewall or a set of firewalls.
• It involves deep inspection and analysis of network packets passing
through the firewall to identify potential threats, anomalies, or policy
violations.
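Following the definitions given above, the sketch below contrasts the two views on a handful of simplified log records: a cross-firewall view that finds sources denied by several firewalls, and a per-firewall view that counts denied traffic at each one. The record format, firewall names, and addresses (taken from documentation-reserved ranges) are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical, simplified log records: (firewall, source_ip, action).
events = [
    ("fw-edge", "203.0.113.7", "deny"),
    ("fw-dmz", "203.0.113.7", "deny"),
    ("fw-edge", "198.51.100.4", "allow"),
    ("fw-dmz", "198.51.100.9", "deny"),
    ("fw-core", "203.0.113.7", "deny"),
]

def denied_across_firewalls(events, min_firewalls=2):
    """Cross-firewall view: sources denied by at least `min_firewalls` different firewalls."""
    seen = defaultdict(set)
    for firewall, source_ip, action in events:
        if action == "deny":
            seen[source_ip].add(firewall)
    return {ip: sorted(fws) for ip, fws in seen.items() if len(fws) >= min_firewalls}

def denies_per_firewall(events):
    """Per-firewall view: number of denied events passing through each firewall."""
    return Counter(firewall for firewall, _, action in events if action == "deny")

repeat_offenders = denied_across_firewalls(events)
per_firewall = denies_per_firewall(events)
print(repeat_offenders)
print(per_firewall)
```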
78 pages
11 Managed Services
No ratings yet
11 Managed Services
25 pages
SAS BASE A00 211 Sample Questions
No ratings yet
SAS BASE A00 211 Sample Questions
33 pages
3-1 Bigdata (Spark)
No ratings yet
3-1 Bigdata (Spark)
3 pages
Apache Hadoop
No ratings yet
Apache Hadoop
11 pages
Bda Lab Manual
No ratings yet
Bda Lab Manual
20 pages
Saurabh Verma
No ratings yet
Saurabh Verma
3 pages
Cloud Computing CS 15-319: Programming Models-Part III Lecture 6, Feb 1, 2012
No ratings yet
Cloud Computing CS 15-319: Programming Models-Part III Lecture 6, Feb 1, 2012
40 pages
Mobile Computing (KCS 713) Unit-5
No ratings yet
Mobile Computing (KCS 713) Unit-5
38 pages
!python Seminar
No ratings yet
!python Seminar
14 pages
6th Sem DS Syllabus 2022 Scheme
No ratings yet
6th Sem DS Syllabus 2022 Scheme
54 pages
(Ebook) Apache ZooKeeper Essentials - Saurav Haloi by 2015pdf Download
100% (4)
(Ebook) Apache ZooKeeper Essentials - Saurav Haloi by 2015pdf Download
55 pages
Aws Certified Data Engineer Associate 8
No ratings yet
Aws Certified Data Engineer Associate 8
16 pages
BDA Syllabus - Sem VII - Mumbai University
No ratings yet
BDA Syllabus - Sem VII - Mumbai University
3 pages
Unit 5
No ratings yet
Unit 5
6 pages
Notes Big Data
No ratings yet
Notes Big Data
106 pages
PowerScale OneFS Technical Specifications Guide 9.2.1.0
No ratings yet
PowerScale OneFS Technical Specifications Guide 9.2.1.0
18 pages
Codetru - Big Data
100% (1)
Codetru - Big Data
17 pages
Illumio White Paper How To Build A Micro Segmentation Strategy
No ratings yet
Illumio White Paper How To Build A Micro Segmentation Strategy
14 pages
Machine Learning Models and Algorithms For Big Data Classification - Suthaharan
100% (3)
Machine Learning Models and Algorithms For Big Data Classification - Suthaharan
30 pages
Unit-4-Unit-4-Bda EDIT
No ratings yet
Unit-4-Unit-4-Bda EDIT
16 pages
MANIRUL HALDER-profile
No ratings yet
MANIRUL HALDER-profile
6 pages
Farhan Data Engineer
No ratings yet
Farhan Data Engineer
9 pages
1 DataScience
No ratings yet
1 DataScience
91 pages
Logs
No ratings yet
Logs
12 pages
Big Data and Cognitive Computing
No ratings yet
Big Data and Cognitive Computing
10 pages
Ajay Kadiyala Resume 2023 PDF
No ratings yet
Ajay Kadiyala Resume 2023 PDF
6 pages
ICT30005 - Assignment 1 - Begum Bolu 6623433 - Big Data Analytics
No ratings yet
ICT30005 - Assignment 1 - Begum Bolu 6623433 - Big Data Analytics
7 pages
2023-24 M.SC I Computer Science Syllabus (NEP-2020 Pattern) (Affiliated Colleges) - 1
No ratings yet
2023-24 M.SC I Computer Science Syllabus (NEP-2020 Pattern) (Affiliated Colleges) - 1
34 pages



