1 Bda
• Big Data is less about how much data there is and more about how it can be
used. Data can be collected from various sources and analyzed to find
answers that enable:
– Cost reduction.
– Time reduction.
– New product development with optimized offers.
– Well-informed decision making.
• Big data is the result of the convergence of several key trends that have
emerged in the field of data management and technology.
2. Advancements in Data Storage: The cost of data storage has dramatically reduced
over the years, making it economically viable to store large amounts of data. This
has enabled organizations to accumulate and retain vast datasets for longer periods.
3. Distributed Computing: The development of distributed computing
frameworks like Apache Hadoop and Apache Spark has revolutionized
data processing capabilities. These technologies allow for the distributed
storage and parallel processing of massive datasets across clusters of
computers, providing the scalability needed for big data.
4. Cloud Computing: Cloud platforms offer scalable and cost-effective
solutions for storing and processing big data. They provide easy access to
large storage capacities and computing resources, allowing organizations to
handle big data without significant upfront investments.
5. Internet of Things (IoT): IoT devices generate enormous amounts of data
from sensors and connected devices. The integration of IoT technology has
contributed to the velocity and volume of data in big data environments.
6. Social Media and User-Generated Content: The widespread adoption of
social media platforms has led to the creation of vast amounts of user-
generated content, including text, images, videos, and more. This
unstructured data adds to the variety of big data.
7. Machine Learning and Artificial Intelligence: The rise of machine
learning and AI has enabled advanced data analysis, pattern recognition,
and predictive modeling, making it possible to extract valuable insights
from large and complex datasets.
8. Open Data Initiatives: Governments and organizations worldwide have
initiated open data projects, making large datasets publicly available. These
initiatives have contributed to the growth and accessibility of big data.
9. Data Democratization: Data democratization aims to make data accessible
to a broader audience within an organization, empowering users to access
and analyze data independently. This trend has led to more data-driven
decision-making and increased reliance on big data.
10. Mobile Technology: The widespread use of smartphones and mobile
applications has generated vast amounts of data related to user behavior,
location, preferences, and more, further contributing to big data.
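The distributed computing trend listed above rests on one idea: split the data into partitions, process each partition independently, then combine the partial results. The following is a minimal sketch of that idea in plain Python, assuming threads stand in for cluster nodes; the function names and sample data are hypothetical and are not part of Hadoop or Spark.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Split a list into roughly equal, contiguous partitions."""
    size = max(1, -(-len(records) // num_partitions))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def partial_sum(partition):
    """Work done independently on one partition (one 'node')."""
    return sum(partition)


def distributed_sum(records, num_partitions=4):
    """Sum all records by combining per-partition results."""
    partitions = split_into_partitions(records, num_partitions)
    # Threads stand in for cluster nodes in this sketch (an assumption).
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(partial_sum, partitions))
    return sum(partials)


readings = list(range(1, 101))  # hypothetical sample data: 1..100
total = distributed_sum(readings)
```

A real cluster would also move partitions between machines and recover from node failures, which this sketch leaves out.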
1.4 INDUSTRY EXAMPLES OF BIG DATA
• Big Data has found applications in various industries, revolutionizing how
businesses operate and make decisions.
1. Retail and E-commerce:
- Customer Analytics: Retailers analyze vast amounts of customer data,
including purchase history, online behavior, and social media interactions,
to understand customer preferences and provide personalized shopping
experiences.
- Inventory Management: Big Data helps optimize inventory levels by
predicting demand patterns, minimizing stockouts, and reducing excess
inventory.
- Price Optimization: Retailers use Big Data analytics to dynamically adjust
prices based on market trends, competitor pricing, and customer demand.
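As a rough illustration of the inventory management point above, demand can be forecast from recent sales and compared with stock on hand. This is a minimal sketch, assuming a simple moving average as the forecasting method; the function names, sales figures, and lead time are hypothetical, and production systems typically use far richer models.

```python
def moving_average_forecast(daily_sales, window=3):
    """Forecast next-day demand as the mean of the last `window` days."""
    if window <= 0 or len(daily_sales) < window:
        raise ValueError("need at least `window` days of sales history")
    recent = daily_sales[-window:]
    return sum(recent) / window


def units_to_reorder(forecast, lead_time_days, on_hand):
    """Units needed to cover forecast demand during the supplier lead time."""
    needed = forecast * lead_time_days
    return max(0, round(needed - on_hand))


sales = [12, 15, 11, 14, 16, 18]  # hypothetical units sold per day
forecast = moving_average_forecast(sales, window=3)  # mean of 14, 16, 18
reorder = units_to_reorder(forecast, lead_time_days=5, on_hand=40)
```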
• In the marketing and advertising industry, Big Data is used for social
media analytics to track customer sentiments, opinions, and interactions on
platforms like Twitter, Facebook, and Instagram.
• Moreover, Big Data is employed in targeted advertising, where algorithms
analyze customer data to deliver personalized ads to specific
demographics, increasing the effectiveness of ad campaigns.
1.5 WEB ANALYTICS
• Web analytics is the process of analyzing the behavior of visitors to a
website.
• This involves tracking, reviewing and reporting data to measure web
activity, including the use of a website and its components, such as
webpages, images and videos.
• Key components of web analytics include:
1. Data Collection
2. Data Measurement
3. Data Analysis
4. Reporting
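The four components above can be sketched as a small pipeline. This is only an illustration in plain Python, assuming hits have already been collected as (session, page) pairs; the data, function names, and the choice of bounce rate as the analysis metric are hypothetical and do not reflect any particular analytics tool.

```python
from collections import Counter, defaultdict

# 1. Data collection: hypothetical hits as (session_id, page) tuples.
hits = [
    ("s1", "/home"), ("s1", "/products"), ("s1", "/cart"),
    ("s2", "/home"),
    ("s3", "/products"), ("s3", "/home"),
]


def measure(hits):
    """2. Data measurement: count pageviews per page and pages per session."""
    pageviews = Counter(page for _, page in hits)
    pages_per_session = defaultdict(int)
    for session_id, _ in hits:
        pages_per_session[session_id] += 1
    return pageviews, dict(pages_per_session)


def analyze(pages_per_session):
    """3. Data analysis: bounce rate = single-page sessions / all sessions."""
    sessions = len(pages_per_session)
    bounces = sum(1 for n in pages_per_session.values() if n == 1)
    return bounces / sessions if sessions else 0.0


def report(pageviews, bounce_rate):
    """4. Reporting: format the metrics as plain text lines."""
    lines = [f"{page}: {count} pageviews" for page, count in pageviews.most_common()]
    lines.append(f"bounce rate: {bounce_rate:.0%}")
    return "\n".join(lines)


pageviews, pages_per_session = measure(hits)
bounce_rate = analyze(pages_per_session)
summary = report(pageviews, bounce_rate)
```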
• Popular web analytics tools:
1. Google Analytics: One of the most widely used web analytics tools, provided
by Google. It offers a comprehensive set of features to track and analyze
website data.
2. Adobe Analytics: A robust analytics platform that provides in-depth insights
and reports for large enterprises and e-commerce websites.
Key Features of Google Analytics:
• Data Collection:
– Google Analytics uses a JavaScript tracking code installed on website pages to
collect data on user interactions, pageviews, events, and more.
– It can track visitors across sessions and devices, providing a comprehensive
view of user behavior.
• Real-time Reporting:
– Google Analytics offers real-time reporting, allowing users to monitor website
activity as it happens.
– This feature is particularly useful for tracking the immediate impact of
marketing campaigns or events.
• Audience Insights:
– Google Analytics provides valuable insights into the website's audience,
including demographics, interests, geographical location, and behavior.
– This data helps businesses understand their target audience better.
1.6 BIG DATA APPLICATIONS
1. Healthcare and Medical Research: Big Data is used to store and analyze
vast amounts of patient data, electronic health records, medical imaging,
and genomic data. This helps in disease diagnosis, drug development,
personalized medicine, and improving healthcare outcomes.
2. E-commerce and Retail: Big Data is applied to analyze customer behavior,
preferences, and purchase patterns. This data is used to offer personalized
product recommendations, optimize pricing strategies, and enhance the
overall shopping experience.
3. Financial Services: Big Data plays a crucial role in fraud detection, risk
assessment, and algorithmic trading. Financial institutions use data
analytics to analyze transaction data, customer behavior, and market trends
to make informed decisions.
4. Manufacturing and Industry 4.0: Big Data and IoT are utilized to monitor
and optimize manufacturing processes. Sensors collect real-time data from
machines, helping predict maintenance needs, improve efficiency, and
reduce downtime.
5. Transportation and Logistics: Big Data is employed in route optimization,
supply chain management, and fleet tracking. Analyzing data from GPS,
sensors, and weather forecasts helps streamline logistics operations and
reduce costs.
6. Telecommunications: Big Data is used to analyze call data records,
customer behavior, and network performance. This data is utilized to
improve network efficiency, optimize service offerings, and enhance
customer satisfaction.
7. Media and Entertainment: Big Data enables content recommendation
engines, personalized advertising, and audience analysis. Media companies
use data analytics to deliver tailored content and marketing campaigns to
their audiences.
8. Energy and Utilities: Big Data is applied to analyze energy consumption
patterns, monitor equipment performance, and optimize energy
distribution. This helps in energy conservation and improved resource
management.
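Several of the applications above, such as fraud detection and predictive maintenance, come down to spotting values that deviate from the norm. The following is a minimal sketch, assuming a simple z-score rule; the threshold, function name, and transaction amounts are hypothetical, and real systems generally combine many signals.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=2.0):
    """Return indices of values more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts; the last one is deliberately extreme.
amounts = [20, 22, 19, 21, 20, 23, 18, 21, 20, 500]
suspicious = flag_outliers(amounts)
```

The same function could be applied to sensor readings from a machine, where a flagged index would suggest a maintenance check rather than a fraud review.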
1.7 BIG DATA TECHNOLOGIES
• Big data technology can be defined as software utilities designed to
analyze, process, and extract information from very large data sets with
extremely complex structures.
Types of Big Data Technology
1. Operational Big Data Technologies
• Examples:
• Online ticket booking systems, e.g., for buses, trains, flights, and movies.
• Online trading or shopping on e-commerce websites like Amazon,
Flipkart, Walmart, etc.
• Online data on social media sites, such as Facebook, Instagram, WhatsApp,
etc.
• Employee data or executives' particulars in multinational companies.
2. Analytical Big Data Technologies
• Analytical Big Data is commonly referred to as an improved version of
Big Data Technologies.
• Stock market data
• Weather forecasting data and time series analysis
• Medical health records, where doctors can personally monitor the health
status of an individual
• Space mission databases, where every piece of information about a
mission is important
We can categorize the leading big data technologies into the following four
sections:
• Data Storage
• Data Mining
• Data Analytics
• Data Visualization
Here are some prominent Big Data technologies:
• Apache Hadoop: distributed storage (HDFS) and MapReduce processing.
• Apache Spark: batch processing, real-time streaming, machine learning,
and graph processing.
• NoSQL Databases: non-relational databases such as MongoDB, Cassandra, and
HBase.
• Apache Flink: Flink is another real-time stream processing engine similar to
Apache Spark.
• Apache Hive: Hive is a data warehousing and SQL-like querying
framework built on top of Hadoop.
• Apache Pig: Pig is a high-level platform for processing and analyzing large
datasets in Hadoop.
• Apache HBase: HBase is a distributed, columnar NoSQL database built to
work on top of Hadoop.
• Apache Storm: Storm is a distributed real-time computation system for
processing streaming data.
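Stream processing engines such as those listed above commonly group incoming events into time windows and aggregate within each window. The following is a minimal sketch of a tumbling (fixed, non-overlapping) window count in plain Python; it does not use the Storm, Flink, or Spark APIs, it processes an in-memory list rather than a live stream, and the function name and click data are hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per key within fixed, non-overlapping time windows.

    `events` is an iterable of (timestamp_seconds, key) pairs.
    Returns {window_start: Counter({key: count})}.
    """
    windows = {}
    for timestamp, key in events:
        # Every timestamp maps to the start of exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        windows.setdefault(window_start, Counter())[key] += 1
    return windows


# Hypothetical click stream: (seconds since start, page)
stream = [(1, "/home"), (4, "/cart"), (9, "/home"), (12, "/home"), (17, "/cart")]
counts = tumbling_window_counts(stream, window_seconds=10)
```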
1.8 INTRODUCTION TO HADOOP
• Hadoop is an open-source software framework for storing and processing
big data.
– It became an Apache Software Foundation project in 2006 and is based on papers
published by Google describing the Google File System (GFS, 2003) and the MapReduce
programming model (2004).
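The MapReduce programming model mentioned above can be illustrated with the classic word count. This is a minimal single-machine sketch in plain Python, not Hadoop's actual API; the three function names and the sample input are hypothetical. In a real cluster the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


lines = ["big data needs big storage", "data drives decisions"]  # sample input
mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped).items())
```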
4. Cost Efficiency
– Cloud computing's pay-as-you-go model allows organizations to pay only for
the resources they use.
– This cost efficiency is especially beneficial for Big Data workloads, which may
have varying processing needs over time.
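The arithmetic behind the pay-as-you-go point can be sketched as follows. All rates, node counts, and hours here are hypothetical values chosen only to illustrate the comparison; they are not actual prices from any cloud provider.

```python
def pay_as_you_go_cost(hours_used_per_month, nodes, rate_per_node_hour):
    """Monthly cost when paying only for node-hours actually consumed."""
    return hours_used_per_month * nodes * rate_per_node_hour


def always_on_cost(nodes, rate_per_node_hour, hours_in_month=730):
    """Monthly cost of keeping the same cluster running all month."""
    return hours_in_month * nodes * rate_per_node_hour


# All figures are hypothetical and chosen only to illustrate the arithmetic.
RATE = 0.5   # currency units per node-hour
NODES = 10
on_demand = pay_as_you_go_cost(hours_used_per_month=60, nodes=NODES,
                               rate_per_node_hour=RATE)
always_on = always_on_cost(nodes=NODES, rate_per_node_hour=RATE)
savings = always_on - on_demand
```

With these assumed numbers, a cluster used 60 hours a month costs 300 units on demand versus 3650 units if left running, which is why bursty workloads tend to benefit most.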
4. Data Visualization
• Mobile BI emphasizes data visualization techniques to present complex data
in an easily digestible format on small screens.
• Visualizations such as charts, graphs, and maps help users understand trends,
patterns, and insights quickly.
1.11 MOBILE BUSINESS INTELLIGENCE
5. Offline Access
• Some Mobile BI applications offer offline access to data, allowing users to
access and view reports even when they are not connected to the internet.
7. Push Notifications
• Mobile BI applications can send push notifications to users to alert them
about important events or changes in data, prompting them to take immediate
action.
8. Location-Based Analytics
• Mobile BI can leverage GPS and location data to provide location-based
insights, particularly useful for field sales teams, delivery personnel, and
location-specific business analysis.
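One simple form of location-based insight is finding the site nearest to a user's current GPS position. The following is a minimal sketch using the haversine great-circle formula; the site list, coordinates (approximate), and function names are hypothetical and not taken from any specific Mobile BI product.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_site(user_location, sites):
    """Return the name of the site closest to the user's GPS position."""
    lat, lon = user_location
    return min(sites, key=lambda name: haversine_km(lat, lon, *sites[name]))


# Hypothetical customer sites (approximate city-centre coordinates).
sites = {"Mumbai": (19.08, 72.88), "Delhi": (28.61, 77.21), "Chennai": (13.08, 80.27)}
closest = nearest_site((18.52, 73.86), sites)  # a field rep located near Pune
```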
Example: Sentiment Analysis for Product Feedback
1. Inter-Firewall Analytics
• Inter-Firewall Analytics refers to the analysis of network traffic and
security events across multiple firewalls within an organization's network.
• In large enterprises or complex network environments, there might be
multiple firewalls deployed to protect different segments or zones of the
network.
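One basic analysis across multiple firewalls is to find source addresses that were denied by more than one of them. The following is a minimal sketch, assuming the logs have already been parsed into dictionaries; the field names (`firewall`, `src_ip`, `action`), firewall names, and events are all hypothetical, and the IP addresses come from documentation-reserved ranges.

```python
from collections import defaultdict


def sources_seen_on_multiple_firewalls(events, min_firewalls=2):
    """Return {source_ip: sorted firewall names} for IPs denied on several firewalls."""
    firewalls_by_source = defaultdict(set)
    for event in events:
        if event["action"] == "deny":
            firewalls_by_source[event["src_ip"]].add(event["firewall"])
    return {
        ip: sorted(names)
        for ip, names in firewalls_by_source.items()
        if len(names) >= min_firewalls
    }


# Hypothetical, already-parsed log events; field names are assumptions.
events = [
    {"firewall": "fw-dmz", "src_ip": "203.0.113.7", "action": "deny"},
    {"firewall": "fw-internal", "src_ip": "203.0.113.7", "action": "deny"},
    {"firewall": "fw-dmz", "src_ip": "198.51.100.4", "action": "deny"},
    {"firewall": "fw-internal", "src_ip": "192.0.2.10", "action": "allow"},
]
repeat_offenders = sources_seen_on_multiple_firewalls(events)
```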