Zoya Parasher - 2152916 - Big Data
On
Big Data
SUBMITTED BY:
Zoya Parasher
SUBMITTED TO:
Associate Professor
MBA PROGRAM
August 2022
Ques 1: Identify a key business initiative for organization ABC. What different business strategies related to
big data can be applied to improve customer retention and optimize customer behavior?
Netflix uses big data to create popular content. Because it interacts directly with its users and holds a wealth
of data on how viewers engage with its content, Netflix can readily establish what kind of material people
want.
Netflix wants to give content producers the opportunity to realize their most inventive ideas and to entertain
through creative storytelling (Smith et al., 2019). According to Brennan (2018), Netflix intends to concentrate
on local content, both for local audiences and for viewers around the world. It selects content carefully and
fosters artists' creativity. Data supports decision-making and predictive analytics in several areas: determining
which titles to stream; deciding which content to license or produce; customer journey analytics, from
prospect to member and beyond; marketing to the existing customer base and attracting new members;
portfolio mix effectiveness; and insights for implementing different strategies in different markets.
Netflix uses big data analytics to better understand its customer base, and it can offer customers a better
service or product by acting on these insights. Netflix gathers large volumes of data from a diverse subscriber
base, including each user's location, the content they watch, their interests, what they search for, and when
they watch.
According to Purkayastha et al. (2013) and Brennan (2018), Netflix uses big data in global content
development, content mix, user interface, licensing, marketing, payment partnerships, and device support. To
provide relevant, individualized suggestions and deliver compelling, successful experiences, Netflix analyses
data on individual viewing patterns, including the subtleties of when members pause, skip, stop, or switch.
Netflix has confidently moved into producing original, inventive content by drawing on insights into viewing
behavior, hand-picking directors, writers, and performers, and acquiring license rights (Brennan, 2018;
McCranken, 2013). The article goes on to say that, as part of its plan for exponential globalization, Netflix is
adapting to local customs and preferences, forming relationships with local businesses, and attracting viewers
from abroad.
Netflix was already very popular, and its popularity grew further during the pandemic, when people took the
slogan "Netflix and Chill" to heart. During the lockdown it gained even more significance. The company has
significantly increased its subscriber base, which now stands at 195 million users worldwide, and has made AI
and data the cornerstones of its corporate strategy.
How are they using Big Data?
• The first area is content creation. The data shows what viewers enjoy and dislike, where they spend the
most time watching, and which genres are popular among different age groups. As a result, Netflix has
been able to develop more content and understand specific markets.
• The second area, which is rather large, is the automatic recommendations we receive after finishing a
series. Recommendations account for 80% of what we watch, which suggests that Netflix can predict
our preferences almost as well as our mothers can! It achieved this by optimizing its algorithms.
• The third area, and the most fascinating, is the automatically generated thumbnails we see on screen.
In the five minutes it takes us to browse 30 titles, each thumbnail must be interesting, succinct, and
representative of the main idea of the series or film. The thumbnail could come from any scene, but
the data makes it more likely that it comes from one the individual viewer will particularly enjoy!
• The fourth area, which is also large, is streaming optimization. No one wants a movie to stall just as
the murderer is about to be revealed. Netflix uses AI to place the movies and shows you are likely to
watch on a local server.
To gather, store, and process massive amounts of information, Netflix uses data processing software and
traditional business intelligence tools such as Hadoop and Teradata, as well as its own open-source solutions
such as Lipstick and Genie (Sadeh, ClickZ, 2019).
Ques 2: Identify a "Big Data" application and write a report with a detailed description of the following:
• Application domain
• Big data-related problems (computing, storage, database, analytics, transfer, etc.)
• Existing solutions
Also, please mention the 10 Vs associated with the term “Big Data”.
According to Valuates Reports, the market for big data analytics is expected to rise at a CAGR of 13.5% from
2020 to 2030.
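To make the growth rate concrete, the short Python sketch below compounds a 13.5% CAGR over the period
from 2020 to 2030. The function name and the choice to treat the period as ten full compounding years are
illustrative assumptions; the market size itself is not quoted in this report, so only the growth multiple is
computed.

```python
def compound_growth_factor(cagr: float, years: int) -> float:
    """Return the multiple by which a value grows at a constant annual rate."""
    return (1.0 + cagr) ** years

# Assumption: 2020 to 2030 is treated as ten full compounding periods.
factor = compound_growth_factor(0.135, 10)
print(f"Growth multiple over 10 years: {factor:.2f}x")
```

Under these assumptions the market would be roughly three and a half times its 2020 size by 2030.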
The expansion of the big data market is being driven by the rising use of data analytics by various industries in
order to lower costs and deliver quicker and better decision-making by evaluating and acting on information in
a timely way.
Big data may help telecom firms become more profitable by enhancing customer experience, enhancing
network utilization and services, and improving security.
Big data opens up new opportunities for the telecommunications sector. It can enhance service quality and
enable more efficient traffic routing. Telecom businesses can also spot fraud and act on it promptly by
monitoring call data records in real time. Ultimately, this gives them a market advantage and helps them
discover untapped opportunities.
Big data is now crucial for advancing the telecommunications sector. With the right data analytics strategy,
telecommunications providers can significantly enhance their services and make their users happier.
Big data analytics can help businesses and organizations make more informed decisions, provide better
customer service, and run more smoothly.
• Network Improvement:
The telecom sector is beginning to use big data analytics to monitor and manage network capacity efficiently,
build predictive capacity models, and use those models for decisions on network growth.
The telecom service providers can identify extremely congested locations where network usage is approaching
its capacity thresholds to prioritize expansion for new capacity roll out using real-time data analytics.
They can create predictive capacity forecasting models based on real-time analytics and prepare for extra
capacity in case of disruptions.
Data analytics for telecom can also assist in the detection of abnormalities and the maintenance of the secure,
dependable, and effective operation of the network systems.
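The congestion check described above can be sketched in a few lines of Python. This is a minimal illustration
under stated assumptions, not an operator's actual system: the `CellReading` fields, the site names, the sample
figures, and the 80% utilization threshold are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CellReading:
    site: str             # hypothetical site identifier
    used_mbps: float      # traffic currently carried by the site
    capacity_mbps: float  # maximum traffic the site can carry

def congested_sites(readings, threshold=0.8):
    """Return (site, utilization) pairs at or above the threshold, busiest first."""
    flagged = [
        (r.site, r.used_mbps / r.capacity_mbps)
        for r in readings
        if r.capacity_mbps > 0 and r.used_mbps / r.capacity_mbps >= threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

# Made-up sample readings, for illustration only.
sample_readings = [
    CellReading("site-A", 450.0, 500.0),
    CellReading("site-B", 120.0, 400.0),
    CellReading("site-C", 390.0, 400.0),
]
expansion_priority = congested_sites(sample_readings)
print(expansion_priority)
```

Sorting by utilization gives a simple priority order for capacity roll-out; a real deployment would compute
this over streaming measurements rather than a fixed list.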
• Price Optimization
Because competition for customers has intensified, it is now essential for telecom operators to set the best
prices for their goods and services.
By examining consumers' responses to various pricing strategies, purchase histories, and competitor pricing,
telecom operators can use data analytics to gain precise insights and develop the best pricing plans.
Telecom providers can also increase their return on investment, determine the perceived value of their goods
or services, and boost the efficiency of their sales force.
By optimizing the price plan based on profit and revenue earned, operators can increase sales, attract new
customers, and, most importantly, retain loyal ones.
• Avoiding Fraud
According to industry estimates, telecoms lose 2.8% of their revenue each year to fraud and leakage, costing
the sector $40 billion yearly.
Big data analytics can protect the telecommunications sector against such fraud. It can detect spam calls and
messages and recognize terms that online fraudsters use frequently. As an illustration, a Chinese mobile
operator recently released an app called Sky Shield that uses big data and AI to stop fraud in the telecom
industry.
The police gave the developers a database of fraud cases. Using it, Sky Shield was able to identify fraudulent
communication patterns, tell them apart from legitimate communications, and block spam calls and texts.
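The idea of recognizing terms that recur in known fraud cases can be illustrated with a small Python sketch.
This is not Sky Shield's method, which the source does not describe; the function names, the sample messages,
and both thresholds are assumptions made purely for illustration.

```python
from collections import Counter

def learn_fraud_terms(fraud_messages, legitimate_messages, min_count=2):
    """Collect words that recur in fraud cases but never appear in legitimate messages."""
    fraud_counts = Counter(w for m in fraud_messages for w in m.lower().split())
    legit_words = {w for m in legitimate_messages for w in m.lower().split()}
    return {w for w, c in fraud_counts.items() if c >= min_count and w not in legit_words}

def looks_fraudulent(message, fraud_terms, min_hits=2):
    """Flag a message that contains at least min_hits distinct learned fraud terms."""
    hits = sum(1 for w in set(message.lower().split()) if w in fraud_terms)
    return hits >= min_hits

# Made-up examples standing in for a labelled database of fraud cases.
fraud_cases = [
    "urgent verify your account password now",
    "you won a prize verify password urgent",
    "urgent prize claim send password",
]
legitimate = [
    "your bill is ready to view",
    "meeting moved to friday send agenda",
    "your account statement is ready",
]
fraud_terms = learn_fraud_terms(fraud_cases, legitimate)
print(sorted(fraud_terms))
print(looks_fraudulent("urgent please verify your password", fraud_terms))
```

A production system would use far richer features (call patterns, sender history, trained classifiers), but the
sketch shows how labelled fraud cases can be turned into a blocking rule.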
• Recommendation Engines
A recommendation engine is a set of algorithms that captures the customer's behavior and, based on that
behavior, forecasts what customers will want in the future. Recommendation engines use both content-based
and collaborative filtering techniques.
Content-based filtering uses the attributes that link the customer profile to the product or service a customer
selects. Collaborative filtering, by contrast, is based on analyzing the preferences and behavior of other users
with similar tastes.
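The two filtering approaches can be contrasted with a minimal Python sketch. It assumes cosine similarity as
the matching measure and uses hypothetical plan names, attribute weights, users, and ratings; real
recommendation engines are considerably more elaborate.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def content_based(profile, catalog):
    """Rank products by how well their attributes match the customer's profile."""
    return sorted(catalog, key=lambda name: cosine(profile, catalog[name]), reverse=True)

def collaborative(target, ratings):
    """Suggest items rated by the most similar other user that the target has not tried."""
    others = [u for u in ratings if u != target]
    nearest = max(others, key=lambda u: cosine(ratings[target], ratings[u]))
    unseen = {i: r for i, r in ratings[nearest].items() if i not in ratings[target]}
    return sorted(unseen, key=unseen.get, reverse=True)

# Hypothetical telecom products described by attribute weights.
catalog = {
    "data_plus": {"data": 1.0, "video": 0.8},
    "talk_max": {"voice": 1.0, "roaming": 0.5},
}
profile = {"data": 0.9, "video": 0.7, "voice": 0.1}
content_ranking = content_based(profile, catalog)

# Hypothetical ratings (1 to 5) given by three users.
ratings = {
    "user_a": {"data_plus": 5, "music_pack": 4},
    "user_b": {"data_plus": 5, "music_pack": 5, "video_pass": 4},
    "user_c": {"talk_max": 5, "roaming_pack": 4},
}
collab_suggestions = collaborative("user_a", ratings)
print(content_ranking, collab_suggestions)
```

Content-based filtering needs only the customer's own profile and the product attributes, whereas
collaborative filtering needs the ratings of other users but no product attributes at all.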
Volume
Velocity
Variety
Veracity
• Veracity is also known as "data in doubt". It refers to the quality of the information.
• The reliability of the source data determines how accurate the analysis is.
• Veracity raises the question of whether the information is true and complete, and whether it comes from a
reputable source.
• Biases, noise, and irregularities are all examples of veracity problems in data.
• If the data is inaccurate, having a lot of it arriving quickly in diverse forms is useless.
Validity
• The term "validity" describes whether the data is correct, appropriate, and precise for its intended use,
including its accuracy with regard to time.
Value
Variability
Venue
• Different types of data come from various sources via various platforms, including personnel systems and
private and public clouds.
• Data science work takes place in varied settings: locally, on consumer workstations, and in the cloud.
Vocabulary
• Data science offers a vocabulary for solving many problems. Different modelling strategies address different
problem domains, and different validation methods strengthen these strategies in different applications.
Vagueness
• Regardless of the quantity of data available, the interpretation of the data found is frequently quite
ambiguous.
• Vagueness also covers information that is unclear about the truth or that shows little to no consideration
for what it would mean.
Ques 3: Can daily transactions be considered a part of HDFS? (Y/N?) Illustrate the difference with the existing
systems.
No. Data stored in HDFS files cannot be accessed at random, so Hadoop cannot be used as an OLTP database,
which is defined by INSERT, UPDATE, and DELETE operations on individual records. Instead, Hadoop makes it
possible to access historical data for analysis. MapReduce is a programming model whose jobs process the
data stored in the Hadoop Distributed File System (HDFS).
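The contrast can be illustrated with a small Python sketch. It uses the standard-library `sqlite3` module as a
stand-in for an OLTP database and a plain Python list as a stand-in for an append-only HDFS file; neither is
Hadoop itself, and the table, account numbers, and amounts are made up for illustration.

```python
import sqlite3
from collections import defaultdict

# OLTP-style store: individual rows are inserted and then updated in place.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
db.execute("INSERT INTO accounts VALUES (1, 100.0)")
db.execute("UPDATE accounts SET balance = balance - 30.0 WHERE id = 1")
oltp_balance = db.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
db.close()

# HDFS-style store: records are only appended, never edited in place.
transaction_log = []  # stand-in for a write-once, append-only file
transaction_log.append((1, 100.0))
transaction_log.append((1, -30.0))
transaction_log.append((2, 50.0))

def map_phase(records):
    """Emit one (account_id, amount) pair per logged transaction."""
    for account_id, amount in records:
        yield account_id, amount

def reduce_phase(pairs):
    """Sum the amounts for each account across the whole log."""
    totals = defaultdict(float)
    for account_id, amount in pairs:
        totals[account_id] += amount
    return dict(totals)

# Batch analysis scans the full history instead of looking up a single row.
batch_balances = reduce_phase(map_phase(transaction_log))
print(oltp_balance, batch_balances)
```

Both approaches arrive at the same balance for account 1, but the OLTP store reaches it by modifying one row
immediately, while the HDFS-style store reaches it by scanning the accumulated history in a batch job.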