Unit 2&3 - 250421 - 215911

The document provides an overview of key concepts in machine learning, focusing on linear regression, Bayes' theorem, and support vector machines (SVM). It explains linear regression's predictive capabilities for continuous variables, differentiating between simple and multiple linear regression, and discusses the importance of cost functions and gradient descent. Additionally, it covers Bayes' theorem for probability updates and SVM's role in classification, including linear and non-linear approaches, as well as kernel methods for enhancing classification accuracy.

Unit 2

1. Linear Regression in Machine Learning


Linear regression is one of the simplest and most popular machine learning algorithms. It is a statistical method
used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables
such as sales, salary, age, product price, etc. The algorithm models a linear relationship between a
dependent variable (y) and one or more independent variables (x), hence the name linear regression. Because the
relationship is linear, the model describes how the value of the dependent variable changes with
the value of the independent variable. The linear regression model provides a sloped straight line representing
the relationship between the variables.

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,

y = Dependent variable (target variable)


x = Independent variable (predictor variable)
a0 = Intercept of the line (gives an additional degree of freedom)
a1 = Linear regression coefficient (scale factor applied to each input value)
ε = Random error

The values of the x and y variables form the training data used to fit the linear regression model.

Types of Linear Regression

Linear regression can be further divided into two types of the algorithm:

o Simple Linear Regression: If a single independent variable is used to predict the value of a numerical
dependent variable, then such a Linear Regression algorithm is called Simple Linear Regression.
o Multiple Linear Regression: If more than one independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression algorithm is called Multiple Linear
Regression.
Linear Regression Line
A linear line showing the relationship between the dependent and independent variables is called a regression
line. A regression line can show two types of relationship:

o Positive Linear Relationship: If the dependent variable increases on the Y-axis as the independent
variable increases on the X-axis, then such a relationship is termed a positive linear relationship.

o Negative Linear Relationship: If the dependent variable decreases on the Y-axis as the independent
variable increases on the X-axis, then such a relationship is called a negative linear relationship.

Finding the best fit line: When working with linear regression, our main goal is to find the best fit line, i.e., the
line for which the error between the predicted values and the actual values is minimized. The best fit line will have the
least error. Different values of the weights or coefficients of the line (a0, a1) give different regression lines,
so we need to calculate the best values of a0 and a1 to find the best fit line; to calculate these, we use the cost
function.

Cost function-

o Different values for the weights or coefficients of the line (a0, a1) give different regression lines, and
the cost function is used to estimate the values of the coefficients for the best fit line.
o The cost function optimizes the regression coefficients or weights. It measures how well a linear regression model
is performing.
o We can use the cost function to find the accuracy of the mapping function, which maps the input
variable to the output variable. This mapping function is also known as the Hypothesis function.
For Linear Regression, we use the Mean Squared Error (MSE) cost function, which is the average of the squared
errors between the predicted values and the actual values. For the above linear equation, the MSE can be calculated as:

MSE = (1/N) Σi (Yi − (a1xi + a0))²

Where,

N = Total number of observations


Yi = Actual value
(a1xi + a0) = Predicted value

Residuals: The distance between the actual value and the predicted value is called the residual. If the observed points
are far from the regression line, the residuals will be large and so the cost function will be high. If the scatter points
are close to the regression line, the residuals will be small and hence the cost function will be low.

Gradient Descent:

o Gradient descent is used to minimize the MSE by calculating the gradient of the cost function.
o A regression model uses gradient descent to update the coefficients of the line by reducing the cost
function.
o It is done by randomly selecting initial values for the coefficients and then iteratively updating them to
reach the minimum of the cost function, as illustrated in the sketch below.
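
As a minimal sketch of the ideas above (assuming NumPy and a tiny synthetic dataset invented purely for illustration), the following Python snippet fits a0 and a1 by gradient descent on the MSE cost function:

import numpy as np

# Tiny synthetic dataset (illustrative values, roughly y = 2x + 1 plus noise)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

a0, a1 = 0.0, 0.0      # intercept and slope, initialised arbitrarily
lr = 0.01              # learning rate
n = len(x)

for _ in range(5000):
    y_pred = a0 + a1 * x
    error = y_pred - y
    # Gradients of MSE = (1/n) * sum((a0 + a1*x - y)^2)
    grad_a0 = (2.0 / n) * error.sum()
    grad_a1 = (2.0 / n) * (error * x).sum()
    a0 -= lr * grad_a0
    a1 -= lr * grad_a1

mse = ((a0 + a1 * x - y) ** 2).mean()
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}, MSE = {mse:.4f}")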

2. Multivariate linear regression


Multivariate (multiple) linear regression extends simple linear regression to several independent variables. Instead of
having just one feature (X) to predict a target (Y), we have several features (X1, X2, ..., Xn). The purpose stays
the same: to find the best linear relationship between the independent variables and the target variable.

The general formula of multivariate linear regression is

Y = b0 + b1*X1 + b2*X2 + b3*X3 + ... + bn*Xn + ε

Here, Y is the target variable, X1, X2, X3, ..., Xn are the independent variables, b0 is the intercept,
b1, b2, b3, ..., bn are the coefficients, and ε represents the error term.
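
As a hedged sketch (the data values below are made up purely for illustration), the coefficients b0 ... bn of the formula above can be estimated with ordinary least squares in NumPy:

import numpy as np

# Illustrative data: 5 observations, 2 independent variables
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
Y = np.array([8.1, 7.9, 15.2, 14.8, 20.1])

# Add a column of ones so the intercept b0 is estimated together with b1, b2
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimises the squared error term
coeffs, *_ = np.linalg.lstsq(X_design, Y, rcond=None)
b0, b1, b2 = coeffs
print(f"Y ≈ {b0:.2f} + {b1:.2f}*X1 + {b2:.2f}*X2")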

Assumptions of the Regression Model:

1. Homoscedasticity: The variance of the errors must be constant.


2. Linearity: The relationship between the dependent and independent variables must be linear.
3. Lack of multicollinearity: It is assumed that there is little or no multicollinearity in the data.
4. Multivariate normality: Multiple regression assumes that the residuals are normally distributed.

3. Bayes' theorem
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning; it determines the
probability of an event under uncertain knowledge. In probability theory, it relates the conditional probability and
the marginal probabilities of two random events. Bayes' theorem was named after the British mathematician Thomas
Bayes. Bayesian inference is an application of Bayes' theorem and is fundamental to Bayesian statistics.
It is a way to calculate the value of P(B|A) with the knowledge of P(A|B).

Bayes' theorem allows updating the probability prediction of an event by observing new information from the real
world.

Example: If cancer corresponds to one's age then by using Bayes' theorem, we can determine the probability of
cancer more accurately with the help of age.

Bayes' theorem can be derived using product rule and conditional probability of event A with known event B:

As from product rule we can write:

P(A ⋀ B) = P(A|B) P(B)


Similarly, the probability of event B with known event A:

P(A ⋀ B) = P(B|A) P(A)


Equating the right-hand sides of both equations, we get:

P(A|B) = P(B|A) P(A) / P(B)        ...(a)

The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI
systems for probabilistic inference.

It shows the simple relationship between joint and conditional probabilities. Here,

P(A|B) is known as the posterior, which we need to calculate; it is read as the probability of hypothesis A given that
evidence B has occurred.

P(B|A) is called the likelihood: assuming the hypothesis is true, it is the probability of the evidence.

P(A) is called the prior probability, the probability of the hypothesis before considering the evidence.

P(B) is called the marginal probability, the probability of the evidence on its own.

In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence Bayes' rule can be written as:

P(Ai|B) = P(Ai) P(B|Ai) / [P(A1) P(B|A1) + P(A2) P(B|A2) + ... + P(An) P(B|An)]

Where A1, A2, A3,........, An is a set of mutually exclusive and exhaustive events.

Applying Bayes' rule:

Bayes' rule allows us to compute the single term P(B|A) in terms of P(A|B), P(B), and P(A). This is very useful in
cases where we have good estimates of these three terms and want to determine the fourth one. Suppose we
perceive the effect of some unknown cause and want to compute that cause; then Bayes' rule becomes:

P(cause|effect) = P(effect|cause) P(cause) / P(effect)

Question 1: What is the probability that a patient has the disease meningitis, given a stiff neck?

Given Data:

A doctor is aware that the disease meningitis causes a patient to have a stiff neck 80% of the time. He
is also aware of some more facts, which are given as follows:

o The Known probability that a patient has meningitis disease is 1/30,000.


o The Known probability that a patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b be the proposition that the patient has meningitis, so we can
calculate the following:

P(a|b) = 0.8

P(b) = 1/30000

P(a) = 0.02

Applying Bayes' rule:

P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 ≈ 0.00133 = 1/750

Hence, we can assume that about 1 patient out of 750 patients with a stiff neck has meningitis.

Example-2:

Question: From a standard deck of playing cards, a single card is drawn. The probability that the card is
a king is 4/52. Calculate the posterior probability P(King|Face), i.e., the probability that the drawn face card is a
king.

Solution:

P(king): probability that the card is King= 4/52= 1/13

P(face): probability that a card is a face card= 3/13

P(Face|King): probability of face card when we assume it is a king = 1

Putting all the values into Bayes' rule, we get:

P(King|Face) = P(Face|King) P(King) / P(Face) = (1 × 1/13) / (3/13) = 1/3
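
As an added illustration (the helper name posterior is invented for this sketch), the two worked examples can be checked by plugging their numbers into Bayes' rule in Python:

def posterior(likelihood, prior, evidence):
    # Bayes' rule: P(hypothesis | evidence) = likelihood * prior / evidence
    return likelihood * prior / evidence

# Example 1: P(meningitis | stiff neck) = P(a|b) * P(b) / P(a)
print(posterior(likelihood=0.8, prior=1 / 30000, evidence=0.02))   # ~0.00133, i.e. about 1/750

# Example 2: P(King | Face) = P(Face|King) * P(King) / P(Face)
print(posterior(likelihood=1.0, prior=1 / 13, evidence=3 / 13))    # 0.333..., i.e. 1/3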

Application of Bayes' theorem in Artificial intelligence:

Following are some applications of Bayes' theorem:

o It is used to calculate the next step of the robot when the already executed step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.

4.Support Vector Machine Algorithm


Support Vector Machine (SVM) is one of the most popular supervised learning algorithms, used for
classification as well as regression problems. However, it is primarily used for classification problems in
machine learning. The goal of the SVM algorithm is to create the best line or decision boundary that can segregate
n-dimensional space into classes so that we can easily put new data points in the correct category in the future.
This best decision boundary is called a hyperplane. SVM chooses the extreme points/vectors that help in creating
the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed Support
Vector Machine.

The SVM algorithm can be used for face detection, image classification, text categorization, etc.

Types of SVM

SVM can be of two types:

o Linear SVM: Linear SVM is used for linearly separable data, which means if a dataset can be classified
into two classes by using a single straight line, then such data is termed linearly separable data, and the
classifier used is called the Linear SVM classifier.
o Non-linear SVM: Non-linear SVM is used for non-linearly separable data, which means if a dataset
cannot be classified by using a straight line, then such data is termed non-linear data and the classifier
used is called the Non-linear SVM classifier.

Hyperplane and Support Vectors in the SVM algorithm:

Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-dimensional space, but
we need to find the best decision boundary that helps to classify the data points. This best boundary is known
as the hyperplane of SVM. The dimensions of the hyperplane depend on the number of features present in the dataset:
if there are 2 features, then the hyperplane is a straight line, and if there are 3 features, then the hyperplane is a
2-dimensional plane. We always create the hyperplane that has the maximum margin, which means the maximum
distance between the hyperplane and the nearest data points.

Support Vectors: The data points or vectors that are closest to the hyperplane and which affect the position
of the hyperplane are termed support vectors. Since these vectors support the hyperplane, they are called support
vectors.

Linear SVM:

The working of the SVM algorithm can be understood with an example. Suppose we have a dataset that has
two tags (green and blue), and the dataset has two features, x1 and x2. We want a classifier that can classify the
pair (x1, x2) of coordinates as either green or blue.

Since this is a 2-d space, we can easily separate these two classes by just using a straight line. But there can be
multiple lines that can separate these classes.

Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary or region is called
a hyperplane. The SVM algorithm finds the closest points of the lines from both classes. These points are called
support vectors. The distance between the vectors and the hyperplane is called the margin, and the goal of SVM
is to maximize this margin. The hyperplane with the maximum margin is called the optimal hyperplane.
Non-Linear SVM:

If data is linearly arranged, then we can separate it by using a straight line, but for non-linear data, we cannot draw
a single straight line.

So to separate these data points, we need to add one more dimension. For linear data, we have used the two dimensions
x and y, so for non-linear data, we will add a third dimension z. It can be calculated as:

z = x² + y²

By adding the third dimension, the sample space becomes three-dimensional, and SVM can now divide the dataset
into classes with a separating plane.

Since we are in 3-d space, the decision boundary looks like a plane parallel to the x-axis. If we convert it back to
2-d space with z = 1, the boundary becomes a circle.

Hence we get a circumference of radius 1 in the case of non-linear data.
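
To make the z = x² + y² trick concrete, here is a hedged sketch using scikit-learn's make_circles dataset (an assumption introduced only for this illustration): a linear SVM struggles in the original 2-d space but separates the classes almost perfectly once the third dimension is added.

import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric classes that no single straight line can separate
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear SVM in the original 2-d space performs poorly
linear_2d = SVC(kernel="linear").fit(X, y)
print("linear SVM, 2-d accuracy:", linear_2d.score(X, y))

# Add the third dimension z = x^2 + y^2, as described above
z = (X ** 2).sum(axis=1).reshape(-1, 1)
X3 = np.hstack([X, z])

linear_3d = SVC(kernel="linear").fit(X3, y)
print("linear SVM, 3-d accuracy:", linear_3d.score(X3, y))  # close to 1.0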

5.SVM Kernel
A set of techniques known as kernel methods are used in machine learning to address classification, regression,
and other prediction problems. They are built around the idea of kernels, which are functions that gauge how similar
two data points are to one another in a high-dimensional feature space. The fundamental premise of kernel methods is
to convert the input data into a high-dimensional feature space, which makes it simpler to distinguish between
classes or generate predictions. Kernel methods employ a kernel function to implicitly map the data into the feature
space, as opposed to manually computing the feature space. The most popular kind of kernel approach is
the Support Vector Machine (SVM), a binary classifier that determines the best hyperplane that most effectively
divides the two groups. In order to efficiently locate the ideal hyperplane, SVMs map the input into a higher-
dimensional space using a kernel function. Other examples of kernel methods include kernel ridge regression,
kernel PCA, and Gaussian processes. Since they are strong, adaptable, and computationally efficient, kernel
approaches are frequently employed in machine learning. They are resilient to noise and outliers and can handle
sophisticated data structures like strings and graphs.

Kernel Method in SVMs

Support Vector Machines (SVMs) use kernel methods to transform the input data into a higher-dimensional
feature space, which makes it simpler to distinguish between classes or generate predictions. Kernel approaches
in SVMs work on the fundamental principle of implicitly mapping input data into a higher-dimensional feature
space without directly computing the coordinates of the data points in that space.

The kernel function in SVMs is essential in determining the decision boundary that divides the various classes. In
order to calculate the degree of similarity between any two points in the feature space, the kernel function
computes their dot product.

The most commonly used kernel function in SVMs is the Gaussian or radial basis function (RBF) kernel. The
RBF kernel maps the input data into an infinite-dimensional feature space using a Gaussian function. This kernel
function is popular because it can capture complex nonlinear relationships in the data.

Other types of kernel functions that can be used in SVMs include the polynomial kernel, the sigmoid kernel, and
the Laplacian kernel. The choice of kernel function depends on the specific problem and the characteristics of the
data.

Basically, kernel methods in SVMs are a powerful technique for solving classification and regression problems,
and they are widely used in machine learning because they can handle complex data structures and are robust to
noise and outliers.

Characteristics of Kernel Function

Kernel functions used in machine learning, including in SVMs (Support Vector Machines), have several important
characteristics, including:

o Mercer's condition: A kernel function must satisfy Mercer's condition to be valid. This condition
ensures that the kernel function is positive semi-definite, which means that the kernel (Gram) matrix it
produces always has non-negative eigenvalues.
o Positive definiteness: A kernel function is positive definite if it is always greater than zero except for
when the inputs are equal to each other.
o Non-negativity: A kernel function is non-negative, meaning that it produces non-negative values for all
inputs.
o Symmetry: A kernel function is symmetric, meaning that it produces the same value regardless of the
order in which the inputs are given.
o Reproducing property: A kernel function satisfies the reproducing property if it can be used to
reconstruct the input data in the feature space.
o Smoothness: A kernel function is said to be smooth if it produces a smooth transformation of the input
data into the feature space.
o Complexity: The complexity of a kernel function is an important consideration, as more complex kernel
functions may lead to overfitting and reduced generalization performance.
Basically, the choice of kernel function depends on the specific problem and the characteristics of the data, and
selecting an appropriate kernel function can significantly impact the performance of machine learning algorithms.

Major Kernel Function in Support Vector Machine

In Support Vector Machines (SVMs), there are several types of kernel functions that can be used to map the input
data into a higher-dimensional feature space. The choice of kernel function depends on the specific problem and
the characteristics of the data.

Here are some most commonly used kernel functions in SVMs:

Linear Kernel

A linear kernel is a type of kernel function used in machine learning, including in SVMs (Support Vector
Machines). It is the simplest and most commonly used kernel function, and it defines the dot product between the
input vectors in the original feature space.

The linear kernel can be defined as:


K(x, y) = x · y
Where x and y are the input feature vectors. The dot product of the input vectors is a measure of their similarity
or distance in the original feature space.

When using a linear kernel in an SVM, the decision boundary is a linear hyperplane that separates the different
classes in the feature space. This linear boundary can be useful when the data is already separable by a linear
decision boundary or when dealing with high-dimensional data, where the use of more complex kernel functions
may lead to overfitting.

Polynomial Kernel

The polynomial kernel is another kind of kernel function used in machine learning, including in SVMs (Support
Vector Machines). It is a nonlinear kernel function that employs polynomial functions to map the input data
into a higher-dimensional feature space.

The polynomial kernel can be defined as:

K(x, y) = (x · y + c)^d

where x and y are the input feature vectors, c is a constant term, and d is the degree of the polynomial. The constant
term is added to the dot product of the input vectors, and the result is raised to the degree of the polynomial.
The decision boundary of an SVM with a polynomial kernel is a nonlinear hyperplane, so it can capture more
intricate relationships between the input features. The degree of the polynomial determines the degree of
nonlinearity in the decision boundary.
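
The following hedged sketch (the dataset and hyperparameter values are arbitrary choices for illustration) compares the linear, polynomial, and RBF kernels discussed above using scikit-learn's SVC:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Illustrative non-linearly separable dataset
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Compare the kernels discussed above; C, degree and gamma are arbitrary here
for kernel in ["linear", "poly", "rbf"]:
    clf = SVC(kernel=kernel, degree=3, gamma="scale", C=1.0).fit(X_train, y_train)
    print(f"{kernel:>6} kernel test accuracy: {clf.score(X_test, y_test):.2f}")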

6.Time series forecasting methods


Time series forecasting is a vital part of data analysis, used across many industries to anticipate future
values based on historical data. Whether forecasting sales, stock prices, or weather patterns,
understanding the different forecasting techniques is important for making informed decisions.
This section explores the key techniques applied in time series forecasting, highlighting their
applications, strengths, and weaknesses.

Understanding Time Series Data

Time series data is a sequence of data points collected or recorded at specific time intervals. Unlike
other data types, in which observations are independent of each other, time series data has an inherent
temporal ordering. This makes it unique and requires specific attention when analyzing it and forecasting future values.
Understanding the trends and components of time series data is important for effective analysis and
prediction.

What is Time Series Data?

Time series data consists of observations made sequentially over time, frequently at regular intervals
such as daily, monthly, or yearly. These data points can represent diverse
phenomena, such as stock prices, temperature readings, earnings figures, or even website traffic.
The key feature of time series data is its chronological order, which must be maintained throughout the
analysis to preserve the temporal relationships among observations.

Key Components of Time Series Data

Time series data can be decomposed into several key components that help us understand the underlying patterns:

Trend

Definition: A trend is the long-term movement or direction in the data over time. It represents the
general tendency of the data to increase, decrease, or stay stable over an extended period.
Example: A regular upward trend in annual revenue over several years suggests consistent business
growth.

Seasonality

Definition: Seasonality refers to periodic fluctuations or patterns that repeat at regular intervals, frequently driven
by seasonal factors like weather, holidays, or economic cycles.

Example: Retail sales peaking during the holiday season each year is a classic instance of seasonality.

Cyclic Patterns

Definition: Cyclic patterns are fluctuations that arise over longer, irregular durations, unlike seasonality, which has
a fixed periodicity. These cycles are frequently influenced by external economic or social factors.

Example: Business cycles, where periods of economic expansion are followed by
recessions, are an example of cyclic patterns.

Noise (Irregular Component)

Definition: Noise refers to random variations or fluctuations in the data that cannot be attributed to the
trend, seasonality, or cyclic patterns. Noise is often considered the "error" or "residual" component of
the time series.

Example: Sudden spikes in stock prices due to unexpected news or events constitute noise in financial time series
data.

Different Approaches for Time Series

Time Series Modeling Techniques

To capture these components, there are a number of popular time series modelling techniques. This section
gives a brief introduction to each technique; we will discuss them in detail in the upcoming
chapters −

Naïve Methods

These are simple estimation techniques, such as setting the predicted value equal to the mean of the
preceding values of the time-dependent variable, or to the previous actual value. They are used for comparison with
more sophisticated modelling techniques.

Auto Regression

Auto regression predicts the values of future time periods as a function of the values at previous time periods.
Predictions of auto regression may fit the data better than those of naïve methods, but it may not be able to
account for seasonality.

ARIMA Model

An auto-regressive integrated moving-average (ARIMA) model expresses the value of a variable as a linear function of previous
values and residual errors at previous time steps of a stationary time series. However, real-world data may be
non-stationary and have seasonality, so Seasonal ARIMA and Fractional ARIMA were developed. ARIMA
works on univariate time series; to handle multiple variables, VARIMA was introduced.
Exponential Smoothing

It models the value of a variable as an exponentially weighted linear function of previous values. This statistical
model can handle trend and seasonality as well.

LSTM

Long Short-Term Memory (LSTM) is a recurrent neural network used for time series modelling to account
for long-term dependencies. It can be trained with large amounts of data to capture the trends in multivariate
time series.
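
As a minimal sketch of two of these techniques (a naïve forecast and an ARIMA model), assuming the pandas and statsmodels libraries and a synthetic monthly series invented for this illustration:

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic monthly series with an upward trend plus noise
idx = pd.date_range("2020-01-01", periods=48, freq="MS")
values = np.linspace(100, 200, 48) + np.random.default_rng(0).normal(0, 5, 48)
series = pd.Series(values, index=idx)

# Naive method: repeat the last observed value
naive_forecast = series.iloc[-1]
print("naive forecast:", round(naive_forecast, 2))

# ARIMA(1, 1, 1): AR and MA terms applied to the differenced (stationary) series
model = ARIMA(series, order=(1, 1, 1)).fit()
print("ARIMA forecast for the next 6 months:")
print(model.forecast(steps=6).round(2))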

7. LINEAR AND NON LINEAR SYSTEM

A linear system is a mathematical model of a system based on the use of a linear operator. Linear systems typically
exhibit features and properties that are much simpler than the nonlinear case. As a mathematical abstraction or
idealization, linear systems find important applications in automatic control theory, signal processing,
and telecommunications. For example, the propagation medium for wireless communication systems can often be
modeled by linear systems.

A nonlinear system (or non-linear system) is a system in which the change of the output is not proportional to
the change of the input. Nonlinear problems are of interest to engineers, biologists, physicists, mathematicians,
and many other scientists, since most systems are inherently nonlinear in nature. Nonlinear dynamical systems,
describing changes in variables over time, may appear chaotic, unpredictable, or counterintuitive, contrasting with
much simpler linear systems.

8. RULE INDUCTION
Rule induction is a data mining process of deducing if-then rules from a data set. These symbolic decision rules
explain an inherent relationship between the attributes and class labels in the data set. Many real-life experiences
are based on intuitive rule induction. For example, we can proclaim a rule that states “if it is 8 a.m. on a weekday,
then highway traffic will be heavy” and “if it is 8 p.m. on a Sunday, then the traffic will be light.” These rules are
not necessarily right all the time. 8 a.m. weekday traffic may be light during a holiday season. But, in general,
these rules hold true and are deduced from real-life experience based on our everyday observations. Rule
induction provides a powerful classification approach that can be easily understood by a general audience. Rule
induction is a machine-learning technique that involves the discovery of patterns or rules in data. It aims to extract
explicit if-then rules that can accurately predict or classify instances based on their features or attributes.

The process of rule induction typically involves the following steps:

Data Preparation: The input data is prepared by organizing it into a structured format, such as a table or a matrix, where each row represents an instance or observation, and each column represents a feature or attribute.

Rule Generation: The rule generation process involves finding patterns or associations in the data that can be expressed as if-then rules. Various algorithms and methods can be used for rule generation, such as decision tree algorithms (e.g., C4.5, CART), association rule mining algorithms (e.g., Apriori), and logical reasoning approaches (e.g., inductive logic programming).

Rule Evaluation: Once the rules are generated, they need to be evaluated to determine their quality and usefulness. Evaluation metrics can include accuracy, coverage, support, confidence, lift, and other measures depending on the specific application and domain.

Rule Selection and Pruning: Depending on the complexity of the rule set and the specific requirements, rule selection and pruning techniques can be applied to refine the rule set. This process involves removing redundant, irrelevant, or overlapping rules to improve interpretability and efficiency.

Rule Application: Once a set of high-quality rules is obtained, they can be applied to new, unseen instances for prediction or classification. Each instance is evaluated against the rules, and the applicable rule(s) with the highest confidence or support is used to make predictions or decisions.

Rule induction has been widely used in various domains, such as data mining, machine learning, expert systems, and decision support systems. It provides interpretable and human-readable models, making it useful for generating understandable insights and explanations from data. While rule induction can be effective in capturing explicit patterns and associations in the data, it may struggle with capturing complex or non-linear relationships. Additionally, rule induction algorithms may face challenges when dealing with large and high-dimensional datasets, as the search space of possible rules can become exponentially large.
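
As a hedged illustration of rule induction via a decision tree algorithm (one of the rule-generation approaches listed above), the sketch below fits a shallow tree on scikit-learn's bundled iris dataset and prints the induced if-then rules:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Fit a shallow decision tree and read its branches back as if-then rules
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text prints the induced rules in a human-readable if-then form
print(export_text(tree, feature_names=list(iris.feature_names)))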

9. NEURAL NETWORKS
The term "Artificial Neural Network" is derived from Biological neural networks that develop the structure of a
human brain. Similar to the human brain that has neurons interconnected to one another, artificial neural networks
also have neurons that are interconnected to one another in various layers of the networks. These neurons are
known as nodes.

The given figure illustrates the typical diagram of Biological Neural Network.

The typical Artificial Neural Network looks something like the given figure.
Dendrites in the biological neural network represent inputs in artificial neural networks, the cell nucleus represents
nodes, synapses represent weights, and the axon represents the output.

Relationship between Biological neural network and artificial neural network:

Biological Neural Network Artificial Neural Network

Dendrites Inputs

Cell nucleus Nodes

Synapse Weights

Axon Output

An artificial neural network, in the field of artificial intelligence, attempts to mimic the network of
neurons that makes up the human brain so that computers have an option to understand things and make decisions
in a human-like manner. The artificial neural network is designed by programming computers to behave simply
like interconnected brain cells. There are on the order of 86 billion neurons in the human brain. Each neuron has
association points numbering somewhere in the range of 1,000 to 100,000. In the human brain, data is stored in a
distributed manner, and we can extract more than one piece of this data in parallel from our memory
when necessary. We can say that the human brain is made up of incredibly amazing parallel processors. We can
understand the artificial neural network with the example of a digital logic gate that takes an
input and gives an output. Consider an "OR" gate, which takes two inputs. If one or both inputs are "On," then we get "On"
as output. If both inputs are "Off," then we get "Off" as output. Here the output depends directly on the input. Our brain
does not perform the same task: the output-to-input relationship keeps changing because the neurons in our
brain are "learning."

The architecture of an artificial neural network:

To understand the architecture of an artificial neural network, we have to understand what a neural
network consists of. A neural network consists of a large number of artificial neurons, termed units,
arranged in a sequence of layers. Let us look at the various types of layers available in an artificial
neural network.

Artificial Neural Network primarily consists of three layers:

Input Layer:

As the name suggests, it accepts inputs in several different formats provided by the programmer.

Hidden Layer:

The hidden layer sits between the input and output layers. It performs all the calculations to find hidden
features and patterns.

Output Layer:

The input goes through a series of transformations using the hidden layer, which finally results in output that is
conveyed using this layer.

The artificial neural network takes the inputs, computes the weighted sum of the inputs, and adds a bias. This
computation is represented in the form of a transfer function.

Feedforward Neural Network

A Feedforward Neural Network (FNN) is a type of artificial neural network where connections between the
nodes do not form cycles. This characteristic differentiates it from recurrent neural networks (RNNs). The
network consists of an input layer, one or more hidden layers, and an output layer. Information flows in one
direction—from input to output—hence the name "feedforward."

Structure of a Feedforward Neural Network


1. Input Layer: The input layer consists of neurons that receive the input data. Each neuron in the input
layer represents a feature of the input data.
2. Hidden Layers: One or more hidden layers are placed between the input and output layers. These layers
are responsible for learning the complex patterns in the data. Each neuron in a hidden layer applies a
weighted sum of inputs followed by a non-linear activation function.
3. Output Layer: The output layer provides the final output of the network. The number of neurons in this
layer corresponds to the number of classes in a classification problem or the number of outputs in a
regression problem.
Each connection between neurons in these layers has an associated weight that is adjusted during the training
process to minimize the error in predictions.
Feed Forward Neural Network

Activation Functions
Activation functions introduce non-linearity into the network, enabling it to learn and model complex data
patterns. Common activation functions include:
 Sigmoid: σ(x) = 1 / (1 + e^(−x))
 Tanh: tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
 ReLU (Rectified Linear Unit): ReLU(x) = max(0, x)
 Leaky ReLU: LeakyReLU(x) = max(0.01x, x)
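
A minimal NumPy sketch of these activation functions (the input values are arbitrary examples):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)                      # (e^x - e^-x) / (e^x + e^-x)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    return np.maximum(alpha * x, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh),
                 ("ReLU", relu), ("leaky ReLU", leaky_relu)]:
    print(f"{name:>10}: {np.round(fn(x), 3)}")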

Backward Pass (Backpropagation)

Backpropagation is a powerful algorithm in deep learning, primarily used to train artificial neural networks,
particularly feed-forward networks. It works iteratively, minimizing the cost function by adjusting weights
and biases. In each epoch, the model adapts these parameters, reducing the loss by following the error gradient.
Backpropagation often utilizes optimization algorithms like gradient descent or stochastic gradient descent.
The algorithm computes the gradient using the chain rule from calculus, allowing it to effectively navigate
complex layers in the neural network to minimize the cost function.

A simple illustration of how the backpropagation works by adjustments of weights


Why is Backpropagation Important?
Backpropagation plays a critical role in how neural networks improve over time. Here's why:
1. Efficient Weight Update: It computes the gradient of the loss function with respect to each weight using
the chain rule, making it possible to update weights efficiently.
2. Scalability: The backpropagation algorithm scales well to networks with multiple layers and complex
architectures, making deep learning feasible.
3. Automated Learning: With backpropagation, the learning process becomes automated, and the model
can adjust itself to optimize its performance.
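
As a hedged, from-scratch sketch of backpropagation (a tiny one-hidden-layer network trained on the XOR problem; the layer size, learning rate, and iteration count are arbitrary choices for this illustration):

import numpy as np

rng = np.random.default_rng(0)

# XOR problem: not linearly separable, so a hidden layer is needed
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer with 4 units; weights initialised randomly, biases at zero
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))
lr = 0.5

for _ in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: chain rule applied to the squared error
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient descent updates
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))   # should approach [[0], [1], [1], [0]]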

10. PRINCIPAL COMPONENT ANALYSIS


Principal Component Analysis is an unsupervised learning algorithm that is used for dimensionality reduction
in machine learning. It is a statistical process that converts the observations of correlated features into a set of
linearly uncorrelated features with the help of an orthogonal transformation. These new transformed features are
called the Principal Components. It is one of the popular tools used for exploratory data analysis and
predictive modeling. It is a technique for drawing strong patterns from a given dataset by reducing its dimensionality.

PCA generally tries to find the lower-dimensional surface onto which to project the high-dimensional data.

PCA works by considering the variance of each attribute, because an attribute with high variance shows a good split
between the classes, and hence it guides the dimensionality reduction. Some real-world applications of PCA are image
processing, movie recommendation systems, and optimizing the power allocation in various communication channels. It is a
feature extraction technique, so it keeps the important variables and drops the least important variables.

The PCA algorithm is based on some mathematical concepts such as:

o Variance and Covariance


o Eigenvalues and Eigenvectors
Some common terms used in PCA algorithm:

o Dimensionality: It is the number of features or variables present in the given dataset. More easily, it is
the number of columns present in the dataset.
o Correlation: It signifies how strongly two variables are related to each other, such that if one changes,
the other variable also changes. The correlation value ranges from -1 to +1. Here, -1 occurs if the
variables are inversely proportional to each other, and +1 indicates that the variables are directly proportional
to each other.
o Orthogonal: It defines that variables are not correlated to each other, and hence the correlation between
the pair of variables is zero.
o Eigenvectors: If there is a square matrix M and a non-zero vector v, then v is an eigenvector of M
if Mv is a scalar multiple of v.
o Covariance Matrix: A matrix containing the covariance between the pair of variables is called the
Covariance Matrix.

Problem-01:

Given data = { 2, 3, 4, 5, 6, 7 ; 1, 5, 3, 6, 7, 8 }.
Compute the principal component using PCA Algorithm.

OR
Consider the two dimensional patterns (2, 1), (3, 5), (4, 3), (5, 6), (6, 7), (7, 8).

Compute the principal component using PCA Algorithm.

OR

Compute the principal component of following data-

CLASS 1

X=2,3,4
Y=1,5,3

CLASS 2

X=5,6,7
Y=6,7,8

Solution

Step-01:

Get data.

The given feature vectors are-

 x1 = (2, 1)
 x2 = (3, 5)
 x3 = (4, 3)
 x4 = (5, 6)
 x5 = (6, 7)
 x6 = (7, 8)

Step-02:

Calculate the mean vector (µ).


Mean vector (µ)
= ((2 + 3 + 4 + 5 + 6 + 7) / 6, (1 + 5 + 3 + 6 + 7 + 8) / 6)

= (4.5, 5)

Thus, µ = (4.5, 5).

Step-03:

Subtract mean vector (µ) from the given feature vectors.

 x1 – µ = (2 – 4.5, 1 – 5) = (-2.5, -4)


 x2 – µ = (3 – 4.5, 5 – 5) = (-1.5, 0)
 x3 – µ = (4 – 4.5, 3 – 5) = (-0.5, -2)
 x4 – µ = (5 – 4.5, 6 – 5) = (0.5, 1)
 x5 – µ = (6 – 4.5, 7 – 5) = (1.5, 2)
 x6 – µ = (7 – 4.5, 8 – 5) = (2.5, 3)

Feature vectors (xi) after subtracting mean vector (µ) are-

Step-04:

Calculate the covariance matrix.


The covariance matrix is given by-

Covariance matrix = (m1 + m2 + m3 + m4 + m5 + m6) / 6

where each mi = (xi – µ)(xi – µ)ᵀ is the 2×2 matrix formed from the corresponding mean-subtracted feature vector.

On adding the above matrices and dividing by 6, we get-

Covariance matrix =
| 2.92   3.67 |
| 3.67   5.67 |


Step-05:

Calculate the eigen values and eigen vectors of the covariance matrix.

λ is an eigen value for a matrix M if it is a solution of the characteristic equation |M – λI| = 0.

So, we have-

| 2.92 – λ      3.67     |
| 3.67          5.67 – λ |  = 0

From here,
(2.92 – λ)(5.67 – λ) – (3.67 × 3.67) = 0

16.56 – 2.92λ – 5.67λ + λ² – 13.47 = 0

λ² – 8.59λ + 3.09 = 0

Solving this quadratic equation, we get λ = 8.22, 0.38

Thus, the two eigen values are λ1 = 8.22 and λ2 = 0.38.

Clearly, the second eigen value is very small compared to the first eigen value.
So, the second eigen vector can be left out.

Eigen vector corresponding to the greatest eigen value is the principal component for the given
data set.
So, we find the eigen vector corresponding to the eigen value λ1.

We use the following equation to find the eigen vector-


MX = λX
where-

 M = Covariance Matrix
 X = Eigen vector
 λ = Eigen value

Substituting the values in the above equation, we get-

| 2.92   3.67 | | X1 |          | X1 |
| 3.67   5.67 | | X2 |  = 8.22  | X2 |

Solving these, we get-

2.92X1 + 3.67X2 = 8.22X1

3.67X1 + 5.67X2 = 8.22X2

On simplification, we get-
5.3X1 = 3.67X2 ………(1)

3.67X1 = 2.55X2 ………(2)

From (1) and (2), X1 = 0.69X2

From (2), the eigen vector (up to scale) is-

X = (X1, X2) = (0.69, 1)

Thus, the principal component for the given data set is the direction (0.69, 1), i.e., the eigen vector corresponding to the
largest eigen value λ1 = 8.22.

Lastly, we project the mean-subtracted data points onto this new one-dimensional subspace.
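
The worked example above can be checked numerically with NumPy; this is an illustrative sketch that reproduces the covariance matrix, eigenvalues, and principal component computed by hand:

import numpy as np

# The six feature vectors from the worked example
X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)

# Steps 02-03: mean vector and mean-centred data
mu = X.mean(axis=0)                 # (4.5, 5.0)
Xc = X - mu

# Step 04: covariance matrix, dividing by N (= 6) as in the example
cov = (Xc.T @ Xc) / len(X)          # approx [[2.92, 3.67], [3.67, 5.67]]

# Step 05: eigenvalues and eigenvectors (eigh returns them in ascending order)
eigvals, eigvecs = np.linalg.eigh(cov)
print("eigenvalues:", np.round(eigvals, 2))          # approx [0.38, 8.21]; the rounded hand calculation gives 8.22
print("principal component:", np.round(eigvecs[:, -1], 2))   # up to sign, proportional to (0.69, 1)

# Projection of the centred data onto the principal component
print("projected data:", np.round(Xc @ eigvecs[:, -1], 2))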

11. FUZZY LOGIC


The word 'fuzzy' refers to things that are not clear or are vague. Sometimes, in real life, we cannot decide whether
a given problem or statement is true or false. In such cases, this concept provides many values between
true and false and gives the flexibility to find the best solution to the problem.

Example of Fuzzy Logic as compared to Boolean Logic

Characteristics of Fuzzy Logic

Following are the characteristics of fuzzy logic:

1. This concept is flexible, and we can easily understand and implement it.
2. It helps to minimize the logic created by humans.
3. It is the best method for finding the solution of those problems which are suitable for approximate or
uncertain reasoning.
4. Rather than offering only the two crisp values true and false, it allows degrees of truth between the two
possible extremes for a problem or statement.
5. It allows users to build or create functions which are non-linear and of arbitrary complexity.
6. In fuzzy logic, everything is a matter of degree.
7. In fuzzy logic, any system which is logical can be easily fuzzified.
8. It is based on natural language processing.
9. It is also used by quantitative analysts for improving the execution of their algorithms.
10. It also allows users to integrate it with programming.

Architecture of a Fuzzy Logic System

In the architecture of the Fuzzy Logic system, each component plays an important role. The architecture consists
of the different four components which are given below.

1. Rule Base
2. Fuzzification
3. Inference Engine
4. Defuzzification
Following diagram shows the architecture or process of a Fuzzy Logic system:

1. Rule Base

Rule Base is a component used for storing the set of rules and the If-Then conditions given by the experts, which are used
for controlling the decision-making systems. Many updates have come in fuzzy theory recently,
which offer effective methods for designing and tuning fuzzy controllers. These updates or developments
decrease the number of fuzzy rules.

2. Fuzzification

Fuzzification is a module or component for transforming the system inputs, i.e., it converts crisp numbers into
fuzzy sets. The crisp numbers are those inputs which are measured by sensors; fuzzification then passes
them to the control system for further processing. This component divides the input signals into the following five
states in any Fuzzy Logic system:

o Large Positive (LP)


o Medium Positive (MP)
o Small (S)
o Medium Negative (MN)
o Large negative (LN)

3. Inference Engine

This component is the main component in any Fuzzy Logic system (FLS), because all the information is processed
in the Inference Engine. It allows users to find the matching degree between the current fuzzy input and the rules.
Based on the matching degree, the system determines which rules are to be fired for the given input.
When all the fired rules are combined, the control actions are developed.

4. Defuzzification

Defuzzification is a module or component, which takes the fuzzy set inputs generated by the Inference Engine,
and then transforms them into a crisp value. It is the last step in the process of a fuzzy logic system. The crisp
value is a type of value which is acceptable by the user. Various techniques are present to do this, but the user has
to select the best one for reducing the errors.

Membership Function
The membership function is a function which represents the graph of fuzzy sets, and allows users to quantify
the linguistic term. It is a graph which is used for mapping each element of x to the value between 0 and 1.

This function is also known as the indicator or characteristic function.

The membership function was introduced in the first papers on fuzzy sets by Zadeh. For a fuzzy set B, the
membership function on X is defined as μB: X → [0,1]. This function maps each element of X to
a value between 0 and 1, called the degree of membership or membership value.

Classical and Fuzzy Set Theory

To learn about classical and Fuzzy set theory, firstly you have to know about what is set.

Set

A set is a collection of unordered or ordered elements. Following are various examples of a
set:

1. A set of all-natural numbers


2. A set of students in a class.
3. A set of all cities in a state.
4. A set of upper-case letters of the alphabet.

Types of Set:

There are following various categories of set:

1. Finite
2. Empty
3. Infinite
4. Proper
5. Universal
6. Subset
7. Singleton
8. Equivalent Set
9. Disjoint Set

Classical Set

It is a type of set which collects the distinct objects in a group. The sets with the crisp boundaries are classical
sets. In any set, each single entity is called an element or member of that set.

Mathematical Representation of Sets

Any set can be easily denoted in the following two different ways:

1. Roster Form: This is also called the tabular form. In this form, the set is represented in the following way:

Set_name = { element1, element2, element3, ......, element N}


The elements in the set are enclosed within braces and separated by commas.

Following are two examples which describe a set in Roster or Tabular form:

Example 1:
Set of Natural Numbers: N = {1, 2, 3, 4, 5, 6, 7, ......, n}.
Example 2:

Set of Prime Numbers less than 50: X={2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47}.
2. Set Builder Form: Set Builder form defines a set with the common properties of an element in a set. In this
form, the set is represented in the following way:

A = {x:p(x)}
The following example describes the set in the builder form:

The set {2, 4, 6, 8, 10, 12, 14, 16, 18} is written as:
B = {x:2 ≤ x < 20 and (x%2) = 0}

Operations on Classical Set

Following are the various operations which are performed on the classical sets:

1. Union Operation
2. Intersection Operation
3. Difference Operation
4. Complement Operation

1. Union:

This operation is denoted by (A ∪ B). A ∪ B is the set of those elements which exist in either of the two sets A and
B. This operation combines all the elements from both sets and makes a new set. It is also called a Logical OR
operation.

It can be described as:

A ∪ B = { x | x ∈ A OR x ∈ B }.
Example:

Set A = {10, 11, 12, 13}, Set B = {11, 12, 13, 14, 15}, then A ∪ B = {10, 11, 12, 13, 14, 15}
2. Intersection

This operation is denoted by (A ∩ B). A ∩ B is the set of those elements which are common to both sets A and B.
It is also called a Logical AND operation.

It can be described as:

A ∩ B = { x | x ∈ A AND x ∈ B }.
Example:

Set A = {10, 11, 12, 13}, Set B = {11, 12, 14} then A ∩ B = {11, 12}
3. Difference Operation

This operation is denoted by (A - B). A-B is the set of only those elements which exist only in set A but not in set
B.

It can be described as:

A - B = { x | x ∈ A AND x ∉ B }.
4. Complement Operation: This operation is denoted by (A`). It is applied on a single set. A` is the set of
elements which do not exist in set A.
It can be described as:

A′ = {x|x ∉ A}.
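
These four classical set operations map directly onto Python's built-in set type; a small sketch (the universal set X is chosen arbitrarily for the complement):

A = {10, 11, 12, 13}
B = {11, 12, 13, 14, 15}
X = set(range(10, 21))          # universal set, assumed for the complement

print("A union B       :", A | B)    # {10, 11, 12, 13, 14, 15}
print("A intersection B:", A & B)    # {11, 12, 13}
print("A difference B  :", A - B)    # {10}
print("complement of A :", X - A)    # elements of X not in A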

Properties of Classical Set

There are following various properties which play an essential role for finding the solution of a fuzzy logic
problem.

1. Commutative Property:

This property provides the following two states which are obtained by two finite sets A and B:

A∪B=B∪A
A∩B=B∩A
2. Associative Property:

This property also provides the following two states but these are obtained by three different finite sets A, B, and
C:

A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
3. Idempotency Property:

This property also provides the following two states but for a single finite set A:

A∪A=A
A∩A=A
4. Absorption Property

This property also provides the following two states for any two finite sets A and B:

A ∪ (A ∩ B) = A
A ∩ (A ∪ B) = A
5. Distributive Property:

This property also provides the following two states for any three finite sets A, B, and C:

A∪ (B ∩ C) = (A ∪ B)∩ (A ∪ C)
A∩ (B ∪ C) = (A∩B) ∪ (A∩C)
6. Identity Property:

This property provides the following four states for any finite set A and Universal set X:

A ∪ φ =A
A∩X=A
A∩φ=φ
A∪X=X
7. Transitive property

This property provides the following state for the finite sets A, B, and C:

If A ⊆ B ⊆ C, then A ⊆ C
8. Involution property

This property provides the following state for any finite set A:

(A′)′ = A

9. De Morgan's Law

This law gives the following rules for the contradiction and tautology:

(A ∪ B)′ = A′ ∩ B′
(A ∩ B)′ = A′ ∪ B′
Fuzzy Set

Classical set theory is a subset of fuzzy set theory. Fuzzy logic is based on fuzzy set theory, which is a
generalisation of the classical theory of sets (i.e., crisp sets) introduced by Zadeh in 1965.

A fuzzy set is a collection of elements whose membership values exist between 0 and 1. Fuzzy sets are denoted or represented by the
tilde (~) character. Fuzzy set theory was introduced in 1965 by Lotfi A. Zadeh and Dieter Klaua. In a
fuzzy set, partial membership also exists. This theory was released as an extension of classical set theory.

Mathematically, a fuzzy set (Ã) is a pair (U, M), where U is the universe of discourse
and M is the membership function, which takes values in the interval [0, 1]. The universe of discourse (U) is
also denoted by Ω or X.

Operations on Fuzzy Set

Let à and B̃ be two fuzzy sets, and let X be the universe of discourse, with respective membership
functions μA(x) and μB(x).

The operations of Fuzzy set are as follows:

1. Union Operation: The union operation of a fuzzy set is defined by:

μA∪B(x) = max (μA(x), μB(x))

Example:

Let's suppose A is a set which contains following elements:

A = {( X1, 0.6 ), (X2, 0.2), (X3, 1), (X4, 0.4)}


And, B is a set which contains following elements:

B = {( X1, 0.1), (X2, 0.8), (X3, 0), (X4, 0.9)}


then,

AUB = {( X1, 0.6), (X2, 0.8), (X3, 1), (X4, 0.9)}


Because, according to this operation

For X1

μA∪B(X1) = max (μA(X1), μB(X1))


μA∪B(X1) = max (0.6, 0.1)
μA∪B(X1) = 0.6
For X2

μA∪B(X2) = max (μA(X2), μB(X2))


μA∪B(X2) = max (0.2, 0.8)
μA∪B(X2) = 0.8
For X3

μA∪B(X3) = max (μA(X3), μB(X3))


μA∪B(X3) = max (1, 0)
μA∪B(X3) = 1
For X4

μA∪B(X4) = max (μA(X4), μB(X4))


μA∪B(X4) = max (0.4, 0.9)
μA∪B(X4) = 0.9
2. Intersection Operation:The intersection operation of fuzzy set is defined by:

μA∩B(x) = min (μA(x), μB(x))

Example:

Let's suppose A is a set which contains following elements:

A = {( X1, 0.3 ), (X2, 0.7), (X3, 0.5), (X4, 0.1)}


And, B is a set which contains following elements:

B = {( X1, 0.8), (X2, 0.2), (X3, 0.4), (X4, 0.9)}


then,

A∩B = {( X1, 0.3), (X2, 0.2), (X3, 0.4), (X4, 0.1)}


Because, according to this operation

For X1

μA∩B(X1) = min (μA(X1), μB(X1))


μA∩B(X1) = min (0.3, 0.8)
μA∩B(X1) = 0.3
For X2

μA∩B(X2) = min (μA(X2), μB(X2))


μA∩B(X2) = min (0.7, 0.2)
μA∩B(X2) = 0.2
For X3
μA∩B(X3) = min (μA(X3), μB(X3))
μA∩B(X3) = min (0.5, 0.4)
μA∩B(X3) = 0.4
For X4

μA∩B(X4) = min (μA(X4), μB(X4))


μA∩B(X4) = min (0.1, 0.9)
μA∩B(X4) = 0.1
3. Complement Operation: The complement operation of fuzzy set is defined by:

μĀ(x) = 1-μA(x),

Example:

Let's suppose A is a set which contains following elements:

A = {( X1, 0.3 ), (X2, 0.8), (X3, 0.5), (X4, 0.1)}


then,

Ā= {( X1, 0.7 ), (X2, 0.2), (X3, 0.5), (X4, 0.9)}


Because, according to this operation

For X1

μĀ(X1) = 1-μA(X1)
μĀ(X1) = 1 - 0.3
μĀ(X1) = 0.7
For X2

μĀ(X2) = 1-μA(X2)
μĀ(X2) = 1 - 0.8
μĀ(X2) = 0.2
For X3

μĀ(X3) = 1-μA(X3)
μĀ(X3) = 1 - 0.5
μĀ(X3) = 0.5
For X4

μĀ(X4) = 1-μA(X4)
μĀ(X4) = 1 - 0.1
μĀ(X4) = 0.9
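
A minimal Python sketch of the fuzzy union, intersection, and complement defined above, representing fuzzy sets as dictionaries from elements to membership values (the values reuse the union example):

A = {"X1": 0.6, "X2": 0.2, "X3": 1.0, "X4": 0.4}
B = {"X1": 0.1, "X2": 0.8, "X3": 0.0, "X4": 0.9}

union        = {x: max(A[x], B[x]) for x in A}   # muAUB(x) = max(muA(x), muB(x))
intersection = {x: min(A[x], B[x]) for x in A}   # muA∩B(x) = min(muA(x), muB(x))
complement_A = {x: 1 - A[x] for x in A}          # muĀ(x)  = 1 - muA(x)

print("A U B:", union)          # {'X1': 0.6, 'X2': 0.8, 'X3': 1.0, 'X4': 0.9}
print("A ∩ B:", intersection)   # {'X1': 0.1, 'X2': 0.2, 'X3': 0.0, 'X4': 0.4}
print("Ā    :", complement_A)   # {'X1': 0.4, 'X2': 0.8, 'X3': 0.0, 'X4': 0.6}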

Classical Set Theory vs. Fuzzy Set Theory

1. Classical set theory is a class of sets having sharp boundaries; fuzzy set theory is a class of sets having un-sharp (fuzzy) boundaries.
2. Classical set theory is defined by exact membership values of only 0 and 1; fuzzy set theory is defined by ambiguous boundaries.
3. In classical set theory, there is no uncertainty about the location of a set's boundary; in fuzzy set theory, there always exists uncertainty about the location of a set's boundary.
4. Classical set theory is widely used in the design of digital systems; fuzzy set theory is mainly used for fuzzy controllers.

Applications of Fuzzy Logic

Following are the different application areas where the Fuzzy Logic concept is widely used:

1. It is used in businesses for decision-making support systems.


2. It is used in automotive systems for controlling traffic and speed, and for improving the efficiency
of automatic transmissions. Automotive systems also use the shift-scheduling method for automatic
transmissions.
3. This concept is also used in defence in various areas. Defence mainly uses fuzzy logic systems
for underwater target recognition and the automatic target recognition of thermal infrared images.
4. It is also widely used in pattern recognition and classification in the form of fuzzy-logic-based
recognition and handwriting recognition. It is also used in the searching of fuzzy images.
5. Fuzzy logic systems are also used in securities.
6. It is also used in microwave ovens for setting the power and cooking strategy.
7. This technique is also used in the area of modern control systems such as expert systems.
8. Finance is another application area, where this concept is used for predicting the stock market and for
managing funds.
9. It is also used for controlling brakes.
10. It is also used in the chemical industry for controlling pH and the chemical distillation process.
11. It is also used in the manufacturing industry for the optimization of milk and cheese production.
12. It is also used in vacuum cleaners and in the timing of washing machines.
13. It is also used in heaters, air conditioners, and humidifiers.

Advantages of Fuzzy Logic

Fuzzy Logic has various advantages or benefits. Some of them are as follows:

1. The methodology of this concept works similarly to human reasoning.


2. Any user can easily understand the structure of Fuzzy Logic.
3. It does not need a large memory, because the algorithms can be easily described with fewer data.
4. It is widely used in all fields of life and easily provides effective solutions to the problems which have
high complexity.
5. This concept is based on the set theory of mathematics, so that's why it is simple.
6. It allows users to control machines and consumer products.
7. The development time of fuzzy logic is short as compared to conventional methods.
8. Due to its flexibility, any user can easily add and delete rules in the FLS system.

Disadvantages of Fuzzy Logic

Fuzzy Logic has various disadvantages or limitations. Some of them are as follows:

1. The run time of fuzzy logic systems is slow, and they take a long time to produce outputs.
2. Users can understand fuzzy systems easily only if the systems are simple.
3. The possibilities produced by the fuzzy logic system are not always accurate.
4. Many researchers give various ways for solving a given statement using this technique which leads to
ambiguity.
5. Fuzzy logics are not suitable for those problems that require high accuracy.
6. The systems of a Fuzzy logic need a lot of testing for verification and validation.
Fuzzy Decision Tree

Steps for Decision Making

Let us now discuss the steps involved in the decision making process −

 Determining the Set of Alternatives − In this step, the alternatives from which the decision has to be
taken must be determined.
 Evaluating Alternative − Here, the alternatives must be evaluated so that the decision can be taken
about one of the alternatives.
 Comparison between Alternatives − In this step, a comparison between the evaluated alternatives is
done.

Types of Decision Making

We will now understand the different types of decision making.

Individual Decision Making

In this type of decision making, only a single person is responsible for taking decisions. The decision making
model in this kind can be characterized as −

 Set of possible actions


 Set of goals Gi (i ∈ Xn);
 Set of constraints Cj (j ∈ Xm).

The goals and constraints stated above are expressed in terms of fuzzy sets.

Now consider a set A. Then, the goal and constraints for this set are given by −

Gi(a) = composition [Gi(a)] = Gi^1(Gi(a)), with Gi^1

Cj(a) = composition [Cj(a)] = Cj^1(Cj(a)), with Cj^1, for a ∈ A

The fuzzy decision in the above case is given by −

FD = min [ inf(i ∈ Xn) Gi(a), inf(j ∈ Xm) Cj(a) ]
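
As a minimal sketch (the goal and constraint membership values below are made-up numbers for illustration only), the fuzzy decision for each alternative is the minimum of its goal and constraint memberships, and the best alternative is the one with the largest such value:

# Minimal sketch of individual fuzzy decision making.
# Membership values of each alternative in the goals (G) and constraints (C)
# are hypothetical numbers chosen only for illustration.
goals = {            # G_i(a) for alternatives a1, a2, a3
    "a1": [0.8, 0.6],
    "a2": [0.9, 0.4],
    "a3": [0.7, 0.7],
}
constraints = {      # C_j(a)
    "a1": [0.5],
    "a2": [0.8],
    "a3": [0.6],
}

# Fuzzy decision FD(a) = min of all goal and constraint memberships of a
fd = {a: min(goals[a] + constraints[a]) for a in goals}
best = max(fd, key=fd.get)
print(fd)      # {'a1': 0.5, 'a2': 0.4, 'a3': 0.6}
print(best)    # 'a3' -> the alternative with the highest fuzzy decision value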

Multi-person Decision Making

Decision making in this case includes several persons so that the expert knowledge from various persons is
utilized to make decisions.

Calculation for this can be given as follows −

Number of persons preferring xi to xj = N(xi, xj)

Total number of decision makers = n

Then, SC(xi, xj) = N(xi, xj) / n
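
For example (with made-up numbers), if 6 out of n = 10 decision makers prefer alternative x1 to x2, then SC(x1, x2) = N(x1, x2) / n = 6/10 = 0.6.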

Multi-objective Decision Making

Multi-objective decision making occurs when there are several objectives to be realized. There are following
two issues in this type of decision making −

 To acquire proper information related to the satisfaction of the objectives by various alternatives.
 To weigh the relative importance of each objective.

Mathematically we can define a universe of n alternatives as −

A = [a1, a2, ..., ai, ..., an]

And the set of "m" objectives as O = [o1, o2, ..., oi, ..., om]

Multi-attribute Decision Making

Multi-attribute decision making takes place when the evaluation of alternatives can be carried out based on
several attributes of the object. The attributes can be numerical data, linguistic data and qualitative data.

Mathematically, the multi-attribute evaluation is carried out on the basis of linear equation as follows −

Y=A1X1+A2X2+...+AiXi+...+ArXr
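
For example (illustrative numbers only), with attribute ratings X1 = 0.7, X2 = 0.5, X3 = 0.9 and weights A1 = 0.5, A2 = 0.3, A3 = 0.2, the evaluation is Y = 0.5(0.7) + 0.3(0.5) + 0.2(0.9) = 0.35 + 0.15 + 0.18 = 0.68.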

UNIT 3
1. Data Stream in Data Analytics

Data stream refers to the continuous flow of data generated by various sources in real time. It plays a crucial role in modern technology, enabling applications to process and analyze information as it arrives, leading to timely insights and actions. In this article, we discuss the concept of the data stream in data analytics in detail: what data streams are, their importance, and how they are used in fields like finance, telecommunications, and IoT (Internet of Things). A data stream is a continuous, ordered (implicitly by arrival time or explicitly by timestamp) chain of items. It is unfeasible to control the order in which items arrive, nor is it feasible to locally store the stream in its entirety. It consists of enormous volumes of data, with items arriving at a high rate.

Types of Data Streams

 Data stream –
A data stream is a (possibly unbounded) sequence of tuples. Each tuple is comprised of a set of attributes, similar to a row in a database table.
 Transactional data stream –
It is a log of interactions between entities:
1. Credit card – purchases by consumers from merchants
2. Telecommunications – phone calls by callers to the dialed parties
3. Web – accesses by clients of information at servers

 Measurement data streams –


1. Sensor Networks – a physical natural phenomenon, road traffic
2. IP Network – traffic at router interfaces
3. Earth climate – temperature, humidity level at weather stations

Examples of Stream Sources


1. Sensor Data –
Sensor data is used in navigation systems. Imagine a temperature sensor floating in the ocean, sending back to the base station a reading of the surface temperature each hour. The data generated by this sensor is a stream of real numbers. With 3.5 terabytes arriving every day, we need to think about what can be kept for continued processing and what can only be archived.
2. Image Data –
Satellites frequently send down to Earth streams containing many terabytes of images per day. Surveillance cameras generate images with lower resolution than satellites, but there can be numerous of them, each producing a stream of images at intervals of one second.
3. Internet and Web Traffic –
A switching node in the middle of the internet receives streams of IP packets from many inputs and routes them to its outputs. Websites receive streams of heterogeneous types. For example, Google receives a hundred million search queries per day.

Characteristics of Data Streams

1. Large volumes of continuous data, possibly infinite.
2. Constantly changing, and requiring a fast, real-time response.
3. The data stream model captures our data processing needs of today nicely.
4. Random access is expensive, so single-scan algorithms are used.
5. Only a summary of the data seen so far is stored.
6. Most stream data are at a low level of abstraction or multidimensional in nature, and so need multilevel and multidimensional processing.

Applications of Data Streams

1. Fraud detection
2. Real-time trading of goods
3. Customer engagement
4. Monitoring and reporting on internal IT systems

Advantages of Data Streams

 This data is helpful in improving sales
 Helps in recognizing errors and fraud
 Helps in minimizing costs
 It provides the details needed to react swiftly to risk

Disadvantages of Data Streams

 Lack of security of data in the cloud
 Dependence on the cloud provider
 Off-premises storage of data introduces the potential for disconnection

2. Database Management System (DBMS)


Database Management System (DBMS) is a software system that is designed to manage and organize data in a
structured manner. It allows users to create, modify, and query a database, as well as manage the security and
access controls for that database.

The Data Stream Management System manages continuous streams of data with very fast changes in real time. Unlike traditional databases, which hold static data, its sources might include sensors or social media feeds. It thus offers real-time insight and rapid decision-making for applications that need immediate data analysis and reporting.

3.DSMS Architecture

DSMS stands for Data Stream Management System. It is a software application, just like a DBMS (database management system), but it involves the processing and management of a continuously flowing data stream rather than static data such as Excel or PDF files. It is generally used to deal with data streams from various sources, which include sensor data, social media feeds, financial reports, etc. Just like a DBMS, a DSMS also provides a wide range of operations like storage, processing, analysis and integration, and it also helps to generate visualizations and reports, but only for data streams. There is a wide range of DSMS applications available in the market, among them Apache Flink, Apache Kafka, Apache Storm, Amazon Kinesis, etc. A DSMS processes two types of queries: standard (continuous) queries and ad hoc queries.

Data stream Management system architecture

A DSMS consists of various layers, each dedicated to a particular operation, as follows:
1. Data Source Layer
The first layer of a DSMS is the data source layer. As its name suggests, it comprises all the data sources, which include sensors, social media feeds, financial markets, stock markets, etc. In this layer the capturing and parsing of the data stream happens. Basically, it is the collection layer which collects the data.
2. Data Ingestion Layer
You can consider this layer as a bridge between the data source layer and the processing layer. The main purpose of this layer is to handle the flow of data, i.e., data flow control, data buffering and data routing.
3. Processing Layer
This layer is considered the heart of the DSMS architecture; it is the functional layer of DSMS applications. It processes the data streams in real time. To perform processing it uses processing engines like Apache Flink or Apache Storm. The main function of this layer is to filter, transform, aggregate and enrich the data stream, in order to derive insights and detect patterns.
4. Storage Layer
Once data is processed, we need to store the processed data in some storage unit. The storage layer consists of various stores such as NoSQL databases, distributed databases, etc. It helps to ensure data durability and the availability of data in case of system failure.
5. Querying Layer
As mentioned above, a DSMS supports two types of queries: ad hoc queries and standard queries. This layer provides the tools which can be used for querying and analyzing the stored data stream. It also has SQL-like query languages or programming APIs. These queries can be questions like: how many entries were made? which type of data was inserted? etc.
6. Visualization and Reporting Layer
This layer provides tools for visualization like charts, pie charts, histograms, etc. On the basis of this visual representation it also helps to generate reports for analysis.
7. Integration Layer
This layer is responsible for integrating the DSMS application with traditional systems, business intelligence tools, data warehouses, ML applications and NLP applications. It helps to improve already-running applications.
These layers are responsible for the working of DSMS applications. Together they provide a scalable and fault-tolerant application which can handle huge volumes of streaming data. The layers can change according to business requirements: some deployments may include all layers, some may exclude a few. A minimal sketch of how such a layered pipeline fits together follows.
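
As referenced above, a rough, library-agnostic sketch of how these layers fit together can be written with plain Python generators; the sensor readings and the alert threshold below are hypothetical, and a real deployment would use an engine such as Flink or Kafka instead:

import random, time

def data_source(n=10):
    # Data source layer: emits a stream of (timestamp, temperature) tuples
    for _ in range(n):
        yield (time.time(), 20 + random.random() * 10)

def ingest(stream):
    # Ingestion layer: buffering / flow control would happen here
    for record in stream:
        yield record

def process(stream, threshold=27.0):
    # Processing layer: filter and enrich the stream in real time
    for ts, temp in stream:
        if temp > threshold:                     # simple filtering rule
            yield {"ts": ts, "temp": temp, "alert": True}

storage = []                                     # stand-in for the storage layer
for event in process(ingest(data_source())):
    storage.append(event)                        # persist processed events
    print("ALERT:", event)                       # visualization / reporting layer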

4. Difference Between DBMS and DSMS


DBMS                                                      DSMS

DBMS refers to Data Base Management System.               DSMS refers to Data Stream Management System.

Data Base Management System deals with                    Data Stream Management System deals with
persistent data.                                          stream data.

In DBMS random data access takes place.                   In DSMS sequential data access takes place.

It is based on the Query Driven processing model,         It is based on the Data Driven processing model,
i.e. the pull-based model.                                i.e. the push-based model.

In DBMS the query plan is optimized once, at the          DSMS is based on adaptive query plans.
beginning (fixed).

The data update rate in DBMS is relatively low.           The data update rate in DSMS is relatively high.

In DBMS the queries are one-time queries.                 In DSMS the queries are continuous.

In DBMS a query gives an exact answer.                    In DSMS a query gives an exact or approximate
                                                          answer.

DBMS provides no real-time service.                       DSMS provides real-time service.

DBMS uses an unbounded disk store, i.e. unlimited         DSMS uses bounded main memory, i.e. limited
secondary storage.                                        main memory.

5.Data Stream Sampling


Data Sampling is a statistical method that is used to analyze and observe a subset of data from a larger dataset and extract meaningful information from that subset, which helps in gaining insight into, or drawing conclusions about, the larger (parent) dataset.
 Sampling in data science helps in finding better and more accurate results, and works best when the data size is big.
 Sampling helps in identifying the overall pattern on which the subset of the dataset is based, and on the basis of that smaller dataset, the entire population is presumed to hold the same properties.
 It is a quicker and more effective method to draw conclusions.

Why is Data Sampling important?

Data sampling is important for a couple of key reasons:


1. Cost and Time Efficiency: Sampling allows researchers to collect and analyze a subset of data rather
than the entire population. This reduces the time and resources required for data collection and analysis,
making it more cost-effective, especially when dealing with large datasets.
2. Feasibility: In many cases, it's impractical or impossible to analyze the entire population due to
constraints such as time, budget, or accessibility. Sampling makes it feasible to study a representative
portion of the population while still yielding reliable results.
3. Risk Reduction: Sampling helps mitigate the risk of errors or biases that may occur when analyzing the
entire population. By selecting a random or systematic sample, researchers can minimize the impact of
outliers or anomalies that could skew the results.
4. Accuracy: In some cases, examining the entire population might not even be possible. For instance,
testing every single item in a large batch of manufactured goods would be impractical. Data sampling
allows researchers to get a good understanding of the whole population by examining a well-chosen
subset.

Types of Data Sampling Techniques

There are mainly two types of Data Sampling techniques which are further divided into 4 sub-categories each.
They are as follows:

Probability Data Sampling Technique

The probability data sampling technique involves selecting data points from a dataset in such a way that every data point has a known, non-zero chance of being chosen. Probability sampling techniques ensure that the sample is representative of the population from which it is drawn, making it possible to generalize the findings from the sample to the entire population with a known level of confidence. A small code sketch of these four techniques follows this list.

1. Simple Random Sampling: In simple random sampling, every data point has an equal chance or probability of being selected. For example, the selection of head or tail: both outcomes of the event have equal probabilities of being selected.
2. Systematic Sampling: In systematic sampling, a regular interval is chosen, after which each data point is taken into the sample. It is easier and more regular than the previous method of sampling, and reduces inefficiency while improving speed. For example, in a series of 10 numbers, we sample every 2nd number; here we use systematic sampling.
3. Stratified Sampling: In stratified sampling, we follow a divide-and-conquer strategy. We divide the data into groups (strata) on the basis of similar properties and then perform sampling within each group. This ensures better representativeness. For example, in workplace data, the total number of employees is divided between men and women.
4. Cluster Sampling: Cluster sampling is more or less like stratified sampling. However, in cluster sampling we form groups (clusters) and randomly choose whole clusters, whereas in stratified sampling an orderly division into strata takes place and we sample from every stratum. For example, picking all the users of a few randomly chosen networks from the total set of users.
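
As referenced above, the four probability techniques can be sketched in a few lines of Python; the population of 100 numbers and the sample sizes are purely illustrative:

import random

population = list(range(100))          # toy dataset, for illustration only

# Simple random sampling: every element has an equal chance
simple = random.sample(population, 10)

# Systematic sampling: pick every k-th element after a random start
k = 10
start = random.randrange(k)
systematic = population[start::k]

# Stratified sampling: split into strata, then sample from each stratum
strata = {"low": [x for x in population if x < 50],
          "high": [x for x in population if x >= 50]}
stratified = [x for group in strata.values() for x in random.sample(group, 5)]

# Cluster sampling: form clusters, then pick whole clusters at random
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = random.sample(clusters, 2)
cluster_sample = [x for c in chosen_clusters for x in c]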

Non-Probability Data Sampling

Non-probability data sampling means that the selection happens on a non-random basis; it depends on the individual as to which data they want to pick. There is no random selection, and every selection is made with a thought and an idea behind it.

1. Convenience Sampling: As the name suggests, the data collector selects the data based on his/her convenience. They may choose the data sets that require fewer calculations and save time, while bringing results roughly at par with probability data sampling techniques. For example, in a dataset about recruitment in the IT industry, the convenient choice would be the most recent data, or the data which covers youngsters more.
2. Voluntary Response Sampling: As the name suggests, this sampling method depends on the voluntary response of the audience. For example, if a survey is being conducted on the blood groups found in the majority at a particular place, and only the people who are willing to take part respond, the resulting sampling is referred to as voluntary response sampling.
3. Purposive Sampling: A sampling method that involves a special purpose falls under purposive sampling. For example, if we need to assess the need for education, we may conduct a survey in rural areas and then create a dataset based on people's responses. Such sampling is called purposive sampling.
4. Snowball Sampling: The snowball sampling technique works via contacts. For example, if we wish to conduct a survey of people living in slum areas, and one person puts us in contact with the next, and so on, this is called snowball sampling.

6.Data Sampling Process

The process of data sampling involves the following steps:

 Find a Target Dataset: Identify the dataset that you want to analyze or draw conclusions about. This
dataset represents the larger population from which a sample will be drawn.
 Select a Sample Size: Determine the size of the sample you will collect from the target dataset. The sample
size is the subset of the larger dataset on which the sampling process will be performed.
 Decide the Sampling Technique: Choose a suitable sampling technique from options such as Simple
Random Sampling, Systematic Sampling, Cluster Sampling, Snowball Sampling, or Stratified Sampling.
The choice of technique depends on factors such as the nature of the dataset and the research objectives.
 Perform Sampling: Apply the selected sampling technique to collect data from the target dataset. Ensure
that the sampling process is carried out systematically and according to the chosen method.
 Draw Inferences for the Entire Dataset: Analyze the properties and characteristics of the sampled data
subset. Use statistical methods and analysis techniques to draw inferences and insights that are
representative of the entire dataset.
 Extend Properties to the Entire Dataset: Extend the findings and conclusions derived from the sample
to the entire target dataset. This involves extrapolating the insights gained from the sample to make broader
statements or predictions about the larger population.

Advantages of Data Sampling

 Data sampling helps draw conclusions, or inferences, about larger datasets using a smaller sample space that represents the entire dataset.
 It helps save time and is a quicker, faster approach.
 It is better in terms of cost effectiveness, as it reduces the cost of data collection, observation and analysis: it amounts to gathering the data, applying a sampling method and drawing the conclusion.
 When done properly, it still gives accurate results and conclusions.

Disadvantages of Data Sampling

 Sampling Error: This is the difference between the whole population and the smaller sampled dataset. Differences in characteristics or properties between the two reduce accuracy, and the sample set becomes unable to represent the larger body of information. Sampling error mostly occurs by chance rather than through a mistake in procedure.
 It becomes difficult in a few data sampling methods, such as forming clusters of similar properties.
 Sampling Bias: This is the result of choosing a sample set which does not represent the entire population as a whole. It occurs mostly due to an incorrect sampling method, and leads to errors because the sampled dataset cannot properly support conclusions about the larger set of data.

7.FILTERING STREAM
Bloom Filters


A Bloom Filter is a data structure that can do this job. It is mainly a space-optimized version of hashing in which we may have false positives. The idea is not to store the actual keys but only bit positions derived from hash values. It is a probabilistic, space-optimized form of hashing where fewer than 10 bits per key are required for a 1% false positive probability, independent of the size of the individual keys.

A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. For example, checking the availability of a username is a set membership problem, where the set is the list of all registered usernames. The price we pay for efficiency is that the structure is probabilistic in nature, which means there might be some false positive results. A false positive means it might tell us that a given username is already taken when actually it is not.

Interesting Properties of Bloom Filters


 Unlike a standard hash table, a Bloom filter of a fixed size can represent a set with an arbitrarily large
number of elements.
 Adding an element never fails. However, the false positive rate increases steadily as elements are added
until all bits in the filter are set to 1, at which point all queries yield a positive result.
 Bloom filters never generate false negative result, i.e., telling you that a username doesn’t exist when it
actually exists.
 Deleting elements from the filter is not possible, because if we delete a single element by clearing the bits at the indices generated by its k hash functions, it might cause deletion of a few other elements. For example, if we delete "geeks" (in the example given below) by clearing the bits at indices 1, 4 and 7, we might end up deleting "nerd" as well, because the bit at index 4 becomes 0 and the Bloom filter then claims that "nerd" is not present.

Working of Bloom Filter


An empty Bloom filter is a bit array of m bits, all set to zero.

We need k number of hash functions to calculate the hashes for a given input. When we want to add an item in
the filter, the bits at k indices h1(x), h2(x), … hk(x) are set, where indices are calculated using hash functions.
Example – Suppose we want to enter “geeks” in the filter, we are using 3 hash functions and a bit array of length
10, all set to 0 initially. First we’ll calculate the hashes as follows:

h1(“geeks”) % 10 = 1
h2(“geeks”) % 10 = 4
h3(“geeks”) % 10 = 7
Note: These outputs are random for explanation only.
Now we will set the bits at indices 1, 4 and 7 to 1

Again we want to enter “nerd”, similarly, we’ll calculate hashes


h1(“nerd”) % 10 = 3
h2(“nerd”) % 10 = 5
h3(“nerd”) % 10 = 4
Set the bits at indices 3, 5 and 4 to 1

Now suppose we want to check whether "geeks" is present in the filter or not. We do the same process, but this time as a lookup: we calculate the respective hashes using h1, h2 and h3 and check whether all of these indices are set to 1 in the bit array. If all the bits are set, then we can say that "geeks" is probably present. If any of the bits at these indices is 0, then "geeks" is definitely not present.

False Positive in Bloom Filters


The question is why we said “probably present”, why this uncertainty. Let’s understand this with an example.
Suppose we want to check whether “cat” is present or not. We’ll calculate hashes using h1, h2 and h3
h1(“cat”) % 10 = 1
h2(“cat”) % 10 = 3
h3(“cat”) % 10 = 7
If we check the bit array, the bits at these indices are set to 1, but we know that "cat" was never added to the filter. The bits at indices 1 and 7 were set when we added "geeks", and bit 3 was set when we added "nerd".
So, because the bits at the calculated indices were already set by some other items, the Bloom filter erroneously claims that "cat" is present, generating a false positive result. Depending on the application, this could be a huge downside or relatively okay.
We can control the probability of getting a false positive by controlling the size of the Bloom filter. More space means fewer false positives. If we want to decrease the probability of false positives, we have to use more hash functions and a larger bit array. This adds some latency when adding items and checking membership.

Operations that a Bloom Filter supports

 insert(x): to insert an element into the Bloom filter.
 lookup(x): to check whether an element is already present in the Bloom filter, with some false positive probability.
NOTE: We cannot delete an element from a Bloom filter.
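
A minimal Bloom filter along these lines can be sketched in Python; the two methods map directly to insert(x) and lookup(x), and the salted uses of Python's built-in hash below are illustrative stand-ins for proper independent hash functions such as murmur or FNV:

class BloomFilter:
    def __init__(self, m=20, k=3):
        self.m, self.k = m, k              # m bits, k hash functions
        self.bits = [0] * m

    def _indices(self, item):
        # Derive k indices by salting Python's built-in hash (illustrative only;
        # real implementations use independent hashes such as murmur or FNV).
        return [hash((i, item)) % self.m for i in range(self.k)]

    def insert(self, item):
        for idx in self._indices(item):
            self.bits[idx] = 1

    def lookup(self, item):
        # True  -> item is *probably* present (false positives possible)
        # False -> item is definitely not present (no false negatives)
        return all(self.bits[idx] for idx in self._indices(item))

bf = BloomFilter()
bf.insert("geeks")
bf.insert("nerd")
print(bf.lookup("geeks"))   # True (probably present)
print(bf.lookup("cat"))     # usually False, but may be a false positive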

Probability of False positivity: Let m be the size of bit array, k be the number of hash functions and n be the
number of expected elements to be inserted in the filter, then the probability of false positive p can be calculated
as:
P = (1 − [1 − 1/m]^(kn))^k

Size of Bit Array: If expected number of elements n is known and desired false positive probability is p then
the size of bit array m can be calculated as :

m = −(n · ln P) / (ln 2)^2

Optimum number of hash functions: The number of hash functions k must be a positive integer. If m is size
of bit array and n is number of elements to be inserted, then k can be calculated as :

k = (m / n) · ln 2
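
Plugging concrete numbers into these formulas (say n = 1,000,000 expected elements and a desired false positive probability p = 1%) gives the required bit-array size and hash count:

import math

n, p = 1_000_000, 0.01
m = - n * math.log(p) / (math.log(2) ** 2)   # size of bit array
k = (m / n) * math.log(2)                    # optimum number of hash functions
print(round(m))   # about 9,585,059 bits (roughly 1.2 MB)
print(round(k))   # about 7 hash functions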

Space Efficiency

If we want to store a large list of items in a set for the purpose of set membership, we can store it in a hashmap, a trie, or a simple array or linked list. All these methods require storing the item itself, which is not very memory efficient. For example, if we want to store "geeks" in a hashmap, we have to store the actual string "geeks" as a key-value pair {some_key : "geeks"}. Bloom filters do not store the data item at all. As we have seen, they use a bit array which allows hash collisions. Without hash collisions, it would not be compact.

Choice of Hash Function

The hash functions used in Bloom filters should be independent and uniformly distributed. They should also be as fast as possible. Fast, simple, non-cryptographic hashes which are independent enough include murmur, the FNV series of hash functions, and Jenkins hashes. Generating hashes is the major operation in Bloom filters. Cryptographic hash functions provide stability and guarantees but are expensive to compute. As the number of hash functions k increases, the Bloom filter becomes slower. Although non-cryptographic hash functions do not provide the same guarantees, they give a major performance improvement.

8.COUNTING DISTINCT ELEMENTS IN A STREAM


Flajolet Martin Algorithm

The Flajolet-Martin algorithm is a probabilistic algorithm which is mainly used to estimate the number of unique elements in a stream or database. This algorithm was invented by Philippe Flajolet and G. Nigel Martin in 1983 and has since been used in various applications such as data mining and database management. The basic idea on which the Flajolet-Martin algorithm is based is to use a hash function to map the elements in the given dataset to binary strings, and to use the length of the longest run of trailing zeros in those binary strings as an estimator for the number of unique elements.
The steps for the Flajolet-Martin algorithm are:
 First, choose a hash function that can be used to map the elements in the database to fixed-length binary strings. The length of the binary string can be chosen based on the accuracy desired.
 Next, apply the hash function to each data item in the dataset to get its binary string representation.
 Next, determine the number of trailing zeros in each binary string (the position of the rightmost set bit).
 Next, compute the maximum number of trailing zeros over all binary strings.
 Now estimate the number of distinct elements in the dataset as 2 raised to the power of the maximum number of trailing zeros calculated in the previous step.
The accuracy of the Flajolet-Martin algorithm is determined by the length of the binary strings and the number of hash functions it uses. Generally, increasing the length of the binary strings or using more hash functions can increase the algorithm's accuracy.
The Flajolet-Martin algorithm is especially useful for big datasets that cannot be kept in memory or analysed with regular methods. By using good probabilistic techniques, this algorithm can provide a good estimate of the number of unique elements in the dataset while using little memory and computation.

Example problem on the Flajolet-Martin (FM) Algorithm to count distinct elements in a stream.

To estimate the number of different elements appearing in a stream, we can hash elements to integers interpreted as binary numbers. 2 raised to the power of the longest run of trailing 0's seen in the hash value of any stream element is an estimate of the number of different elements.
Eg. Stream: 4, 2, 5, 9, 1, 6, 3, 7
Hash function, h(x) = (ax + b) mod 32
a) h(x) = (3x + 7) mod 32
b) h(x) = (x + 6) mod 32

a) h(x) = 3x + 7 mod 32
h(4) = 3(4) + 7 mod 32 = 19 mod 32 = 19 = (10011)
h(2) = 3(2) + 7 mod 32 = 13 mod 32 = 13 = (01101)
h(5) = 3(5) + 7 mod 32 = 22 mod 32 = 22 = (10110)
h(9) = 3(9) + 7 mod 32 = 34 mod 32 = 2 = (00010)
h(1) = 3(1) + 7 mod 32 = 10 mod 32 = 10 = (01010)
h(6) = 3(6) + 7 mod 32 = 25 mod 32 = 25 = (11001)
h(3) = 3(3) + 7 mod 32 = 16 mod 32 = 16 = (10000)
h(7) = 3(7) + 7 mod 32 = 28 mod 32 = 28 = (11100)
Trailing zeros: {0, 0, 1, 1, 1, 0, 4, 2}
R = max [trailing zeros] = 4
Output = 2^R = 2^4 = 16

b) h(x) = (x + 6) mod 32
h(4) = (4) + 6 mod 32 = 10 mod 32 = 10 = (01010)
h(2) = (2) + 6 mod 32 = 8 mod 32 = 8 = (01000)
h(5) = (5) + 6 mod 32 = 11 mod 32 = 11 = (01011)
h(9) = (9) + 6 mod 32 = 15 mod 32 = 15 = (01111)
h(1) = (1) + 6 mod 32 = 7 mod 32 = 7 = (00111)
h(6) = (6) + 6 mod 32 = 12 mod 32 = 12 = (01100)
h(3) = (3) + 6 mod 32 = 9 mod 32 = 9 = (01001)
h(7) = (7) + 6 mod 32 = 13 mod 32 = 13 = (01101)
Trailing zeros: {1, 3, 0, 0, 0, 2, 0, 0}
R = max [trailing zeros] = 3
Output = 2^R = 2^3 = 8
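
The calculation in part (a) can be reproduced with a short script; the hash parameters are the same as above (treating a hash of 0 as having zero trailing zeros is a simplifying convention):

def trailing_zeros(x):
    # Number of trailing zero bits in the binary form of x
    if x == 0:
        return 0
    count = 0
    while x % 2 == 0:
        x //= 2
        count += 1
    return count

stream = [4, 2, 5, 9, 1, 6, 3, 7]
hashes = [(3 * x + 7) % 32 for x in stream]     # h(x) = (3x + 7) mod 32
R = max(trailing_zeros(h) for h in hashes)
print(hashes)        # [19, 13, 22, 2, 10, 25, 16, 28]
print(R, 2 ** R)     # 4 16 -> estimated number of distinct elements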

9.COUNTING THE NUMBER OF 1’s IN THE DATA STREAM

DGIM algorithm (Datar-Gionis-Indyk-Motwani Algorithm)

Designed to find the number 1’s in a data set. This algorithm uses O(log²N) bits to represent a window of N bit,

allows to estimate the number of 1’s in the window with and error of no more than 50%.

So this algorithm gives a 50% precise answer. In DGIM algorithm, each bit that arrives has a timestamp, for the

position at which it arrives. if the first bit has a timestamp 1, the second bit has a timestamp 2 and so on.. the

positions are recognized with the window size N (the window sizes are usually taken as a multiple of 2).The

windows are divided into buckets consisting of 1’s and 0's.

RULES FOR FORMING THE BUCKETS:

1. The right end of a bucket should always be a 1 (if it is a 0, it is not counted in any bucket). E.g. 1001011 → a bucket of size 4, having four 1's and ending with a 1 at its right end.

2. Every bucket should have at least one 1, else no bucket can be formed.

3. All bucket sizes should be powers of 2.

4. The bucket sizes cannot decrease as we move to the left (they are in non-decreasing order towards the left).

Let us take an example to understand the algorithm: estimating the number of 1's and counting the buckets in a given data stream (the bucket formation in the original figure follows the rules above). In the given data stream, assume the new bit arrives from the right.

When the new bit = 0: after the new bit (0) arrives with timestamp 101, there is no change in the buckets. But if the new bit that arrives is 1, then we need to make changes:

· Create a new bucket with the current timestamp and size 1.

· If there was only one bucket of size 1, then nothing more needs to be done. However, if there are now three buckets of size 1 (the buckets with timestamps 100, 102 and 103 in the example), we fix the problem by combining the leftmost (earliest) two buckets of size 1. To combine any two adjacent buckets of the same size, replace them by one bucket of twice the size. The timestamp of the new bucket is the timestamp of the rightmost of the two buckets.

Sometimes combining two buckets of size 1 may create a third bucket of size 2. If so, we combine the leftmost two buckets of size 2 into a bucket of size 4. This process may ripple through the bucket sizes. How long can we continue doing this? We can continue as long as (current timestamp − leftmost bucket timestamp of the window) < N (here N = 24). E.g. 103 − 87 = 16 < 24, so we continue; if it becomes greater than or equal to N, we stop. Finally, the answer to the query "How many 1's are there in the last 20 bits?": counting the sizes of the buckets in the last 20 bits, we say there are 11 ones. A simplified code sketch of this bucket bookkeeping follows.
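
As referenced above, here is a simplified, illustrative sketch of the DGIM bookkeeping in Python; it keeps (timestamp, size) buckets, merges when three buckets of a size appear, and halves the oldest partially covered bucket at query time (the example bit stream is made up):

class DGIM:
    def __init__(self, window_size):
        self.N = window_size
        self.time = 0
        self.buckets = []          # list of (timestamp, size), newest first

    def add_bit(self, bit):
        self.time += 1
        # Drop buckets that have fallen completely outside the window
        self.buckets = [(t, s) for (t, s) in self.buckets if self.time - t < self.N]
        if bit != 1:
            return
        self.buckets.insert(0, (self.time, 1))
        size = 1
        # If three buckets of the same size exist, merge the two oldest of them
        while sum(1 for _, s in self.buckets if s == size) > 2:
            idx = [i for i, (_, s) in enumerate(self.buckets) if s == size]
            i, j = idx[-2], idx[-1]                     # two oldest buckets of this size
            merged = (self.buckets[i][0], size * 2)     # keep the newer timestamp
            self.buckets = [b for n, b in enumerate(self.buckets) if n not in (i, j)]
            self.buckets.append(merged)
            self.buckets.sort(key=lambda b: b[0], reverse=True)
            size *= 2

    def count_ones(self, k):
        # Estimate of the number of 1's among the last k bits: sum the bucket sizes
        # inside the range, counting only half of the oldest (partial) bucket.
        inside = [(t, s) for (t, s) in self.buckets if self.time - t < k]
        if not inside:
            return 0
        return sum(s for _, s in inside) - inside[-1][1] // 2

d = DGIM(window_size=24)
for b in [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1]:
    d.add_bit(b)
print(d.buckets)          # current (timestamp, size) buckets
print(d.count_ones(10))   # approximate count of 1's in the last 10 bits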

10.Real -Time Analytics in Big Data


Real-Time Analytics:

Real-time analysis of data allows users to view, analyse and understand data in the system the moment it is entered. Mathematical reasoning and logic are incorporated into the data processing, which gives users a real-time view of the data for making decisions.

Real-time analytics allows organizations to gain awareness and actionable information immediately, or as soon as the data has entered their systems. Real-time analytics responses are completed within seconds to minutes. Such systems can process a huge amount of data in a short time, with high speed and low response time. For instance, real-time big-data analytics uses financial market data to inform traders' decisions. Analytics may be performed on demand or continuously: on-demand analytics delivers results when the user asks for them, while continuous analytics updates results as events occur. It can also be programmed to respond to specific circumstances automatically. For instance, real-time web analytics could alert the administrator if page-load performance is not within preset boundaries.

Examples -

o Monitoring orders as they take place, to trace them better and to determine demand for types of clothing.
o Continuously updating customer interaction metrics, such as the number of page views and shopping cart usage, to better understand user behaviour.
o Identifying customers with more advanced shopping habits in a shop, and influencing decisions in real time.

The Operation of Real-time Analytics

Real-time analytics tools can either push or pull data. Streaming requires the ability to push huge amounts of fast-moving data. When streaming would consume too many resources or is not practical, data can instead be pulled at intervals ranging from a few seconds to a few hours. The choice between the two depends on business requirements and must be made so as not to interrupt the flow of work. The time to react in real-time analysis can vary from nearly instantaneous to a few seconds or minutes. The key components of real-time analytics comprise the following.

o Aggregator
o Broker
o Analytics engine
o Stream processor

Benefits of Real-time Analytics

Speed is the primary benefit of real-time data analysis. The less time a business has to wait between data arriving and being processed, the sooner it can use the resulting insights to make changes and act on crucial decisions.

Similarly, real-time analytics tools allow companies to see how users interact with a product after its release, so there is no delay in understanding user behaviour and making the necessary adjustments.
Advantages of Real-time Analytics:

Real-time analytics provides the benefits over traditional analytics.

o Create your own interactive analytics tools.
o Share information through transparent dashboards.
o Monitor behaviour in a customized way.
o Perform immediate adjustments when necessary.
o Make use of machine learning.

Real-time analytics in Big Data provides the ability to extract useful insights quickly from massive
datasets. Real-time analytics stands at the forefront of this transformation, enabling organizations to analyze
data streams as they are generated, rather than relying on historical analysis alone. This capability not only
enhances decision-making processes but also empowers businesses to respond dynamically to changing market
conditions, customer behaviors, and operational challenges.

Real-Time Analytics – working

Real-time analytics involves a comprehensive and intricate process that encompasses several critical
components and steps. Here’s a more detailed breakdown of how it operates:

Data Ingestion
 Continuous Data Collection: Real-time analytics systems continuously collect data from various
sources, such as sensors, IoT devices, social media feeds, transaction logs, and application databases.
This data can come in various formats, including structured, semi-structured, and unstructured data.
 Stream Processing: Data is ingested as streams, meaning it is captured and processed in real-time as it
arrives. Technologies like Apache Kafka, RabbitMQ, and Amazon Kinesis are commonly used for data
ingestion due to their ability to handle high-throughput data streams reliably.
Data Processing Engines
 Stream Processing Platforms: Once ingested, the data is processed by stream processing engines such
as Apache Flink, Apache Storm, or Spark Streaming. These platforms are designed to handle continuous
data flows and perform complex event processing, transformations, aggregations, and filtering in real-
time.
 In-Memory Processing: To ensure low-latency processing, many real-time analytics solutions use in-
memory computing frameworks. This allows data to be processed directly in memory rather than being
written to disk, significantly speeding up the processing time.
 Parallel Processing: Real-time analytics systems often employ parallel processing techniques,
distributing the workload across multiple nodes or processors to handle large volumes of data efficiently.
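The filtering, aggregation and windowing that these engines perform can be illustrated without any particular engine; the following is a minimal, engine-agnostic sketch of a tumbling-window average over a stream of hypothetical metric values:

import statistics

def tumbling_window_avg(stream, window=5):
    # Group the incoming stream into fixed-size windows and emit one
    # aggregate (here, the mean) as soon as each window closes.
    buffer = []
    for value in stream:
        buffer.append(value)
        if len(buffer) == window:
            yield statistics.mean(buffer)
            buffer = []

readings = [12, 15, 11, 14, 18, 21, 19, 22, 20, 23]   # hypothetical metric values
for avg in tumbling_window_avg(readings):
    print(avg)    # 14 then 21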
Real-Time Querying
 Low-Latency Query Engines: Real-time query engines like Apache Druid, ClickHouse, and Amazon
Redshift Spectrum allow users to run queries on streaming data with minimal delay. These engines are
optimized for low-latency query execution, providing near-instantaneous results.
 Complex Queries: Users can perform complex queries and analytical operations on streaming data, such
as joins, aggregations, window functions, and pattern matching, enabling sophisticated real-time analysis.
Data Storage
 Time-Series Databases: Real-time analytics often involves storing data in time-series databases like
InfluxDB, TimescaleDB, or OpenTSDB. These databases are optimized for handling time-stamped data
and can efficiently store and retrieve real-time data points.
 NoSQL Databases: For unstructured or semi-structured data, NoSQL databases like MongoDB,
Cassandra, and HBase provide flexible storage solutions that can scale horizontally to accommodate
large data volumes.
Visualization Tools
 Dashboards and BI Tools: Real-time data is visualized using dashboards and business intelligence (BI)
tools like Tableau, Power BI, Grafana, and Kibana. These tools provide interactive and customizable
visualizations that allow users to monitor and analyze data in real-time.
 Alerts and Notifications: Real-time analytics systems can be configured to trigger alerts and
notifications based on predefined conditions or thresholds. This enables proactive responses to critical
events, such as system failures, security breaches, or significant business metrics.
Benefits and Advantages of using Real-Time Analytics
Immediate Insights
 Faster Decision-Making: Real-time analytics provides instant access to data insights, allowing businesses
to make informed decisions quickly.
 Proactive Problem Solving: By continuously monitoring data streams, organizations can identify and
address issues as they arise, preventing potential problems from escalating.
Enhanced Operational Efficiency
 Optimized Processes: By analyzing data as it is generated, real-time analytics lets businesses streamline processes, reduce waste, and improve overall productivity.
 Resource Allocation: Organizations can optimize the allocation of resources, such as labor, inventory, and
energy, based on real-time demand and usage patterns.
Improved Customer Experience
 Personalized Interactions: Real-time analytics enables businesses to tailor their interactions with
customers based on current data.
 Responsive Service: By analyzing customer behavior and feedback in real-time, businesses can quickly
address issues and adapt their services to meet customer expectations.
Competitive Advantage
 Market Responsiveness: Businesses that leverage real-time analytics can quickly adapt to changing
market conditions and emerging trends. This allows them to stay ahead of competitors and capitalize on
new opportunities.
 Innovation: Real-time data insights can drive innovation by highlighting emerging trends and customer
preferences. Businesses can use these insights to develop new products and services that meet market
demands.
Enhanced Risk Management
 Fraud Detection: Real-time analytics is critical for identifying and preventing fraudulent activities. By
continuously monitoring transactions and behavior patterns, businesses can detect anomalies and take
immediate action to mitigate risks.
 Operational Risk Management: Real-time monitoring of operations allows businesses to identify and
address potential risks before they cause significant disruptions.
Improved Financial Performance
 Revenue Optimization: Real-time analytics can help businesses identify and capitalize on revenue
opportunities. For example, dynamic pricing models can adjust prices based on current demand and market
conditions, maximizing revenue.
 Cost Reduction: By optimizing operations and resource allocation, real-time analytics can lead to
significant cost savings. Businesses can reduce waste, improve efficiency, and lower operational expenses.
Enhanced Collaboration and Communication
 Data-Driven Culture: Real-time analytics helps build a data-driven culture: employees across different departments can access and analyze real-time data, leading to more informed decisions and better collaboration.
 Transparent Operations: Real-time data visualization tools, such as dashboards and reports, provide a
clear and transparent view of operations.
Regulatory Compliance
 Real-Time Monitoring: For industries with stringent regulatory requirements, real-time analytics provides
continuous monitoring and reporting capabilities.
 Audit Trails: Real-time analytics systems can maintain detailed audit trails of data access and
modifications, aiding in compliance and accountability.

Challenges of Real-Time Analytics in Big Data


Data Latency
 Minimizing Delay: Ensuring minimal delay in data processing and analysis can be challenging due to the
sheer volume and velocity of Big Data. Achieving true real-time processing requires robust infrastructure
and optimized algorithms.
Scalability
 Handling Massive Data: As data volumes grow, maintaining the scalability of real-time analytics systems
becomes difficult. Organizations need to invest in scalable infrastructure and distributed processing
frameworks to manage the increasing load.
Data Quality
 Ensuring Accuracy: Maintaining high data quality and accuracy in real-time environments is critical.
Inconsistent or erroneous data can lead to incorrect insights and decisions, impacting business outcomes
negatively.
Integration Complexity
 Seamless Integration: Integrating real-time analytics with existing systems and processes can be complex.
Organizations need to ensure seamless data flow between various sources, analytics platforms, and
applications.
Resource Intensive
 High Computational Demands: Real-time analytics requires significant computational resources for
processing and storing data. This can lead to increased costs and the need for advanced hardware and
software solutions.
Data Security
 Protecting Data: Ensuring the security and privacy of data in real-time analytics is crucial. Real-time
systems are often more vulnerable to cyber-attacks due to continuous data transmission and processing.
Technical Expertise
 Skilled Professionals: Implementing and maintaining real-time analytics systems require skilled
professionals with expertise in Big Data technologies, data science, and system integration. Finding and
retaining such talent can be difficult.
Cost Implications
 Financial Investment: The infrastructure, tools, and human resources needed for real-time analytics can
be expensive. Organizations must weigh the benefits against the costs to justify the investment.
Addressing these challenges is essential for successfully implementing real-time analytics in Big Data environments, enabling organizations to leverage timely insights for better decision-making and competitive advantage.

Applications of Real-Time Analytics

Real-time analytics is a powerful tool that finds applications across various industries. Here are some key
applications:
Predictive Maintenance: Manufacturing, utilities, and transportation sectors utilize real-time analytics to
monitor equipment health and predict failures before they occur.
Fraud Detection: Financial services, e-commerce platforms, and insurance companies continuously monitor
transactions and user behavior, to identify anomalies and take immediate action to mitigate fraud risks.
Customer Experience Management: Retailers, hospitality providers, and online services analyze customer
interactions and feedback in real-time, businesses can personalize services, optimize marketing campaigns, and
promptly address customer issues, leading to higher satisfaction and loyalty.
Smart Cities: Urban planners and city administrations employ real-time analytics in traffic management,
public transportation optimization, and real-time monitoring of public safety and environmental conditions.
Healthcare: Healthcare providers use real-time analytics to monitor patient vitals, manage hospital resources,
and provide timely interventions. For instance, real-time analysis of patient data can alert medical staff to
potential emergencies, improving patient outcomes and operational efficiency.
Financial Trading: Financial institutions and traders rely on real-time analytics to make quick, informed
trading decisions. By analyzing market data as it happens, traders can identify trends, detect anomalies, and
execute trades at the optimal moment to maximize profits.
Supply Chain Management: Logistics and supply chain companies use real-time analytics to track shipments,
manage inventory, and optimize delivery routes. This ensures timely deliveries, reduces costs, and improves
overall supply chain efficiency.
Telecommunications: Telecom operators use real-time analytics to monitor network performance, detect
outages, and manage bandwidth. This helps in maintaining service quality, reducing downtime, and enhancing
customer satisfaction.
Energy Management: Utility companies and large enterprises employ real-time analytics for energy
consumption monitoring and optimization. By analyzing real-time data from smart meters and sensors,
businesses can optimize energy usage, reduce costs, and support sustainability initiatives.
Marketing and Advertising: Marketers and advertisers use real-time analytics to measure the effectiveness
of campaigns and adjust strategies on the fly. Real-time insights into customer behavior and engagement help
in creating targeted and impactful marketing efforts.
Retail and E-commerce: Retailers and e-commerce platforms leverage real-time analytics to manage
inventory, optimize pricing strategies, and enhance the shopping experience. Analyzing real-time sales data
and customer interactions helps in making informed decisions that drive sales and improve customer
satisfaction.

11. Case Study of Real-Time Sentiment Analysis

Real-Time Sentiment Analysis For Live Social Feeds

Real-time sentiment analysis is an important artificial intelligence-driven process that is used by organizations

for live market research for brand experience and customer experience analysis purposes. In this article, we

explore what is real-time sentiment analysis and what features make for a really brilliant live social feed analysis

tool.

What Is Real-Time Sentiment Analysis?

Real-time Sentiment Analysis is a machine learning (ML) technique that automatically recognizes and extracts

the sentiment in a text whenever it occurs. It is most commonly used to analyze brand and product mentions in

live social comments and posts. An important thing to note is that real-time sentiment analysis can be done only

from social media platforms that share live feeds like Twitter does.

The real-time sentiment analysis process uses several ML tasks such as natural language
processing, text analysis, semantic clustering, etc to identify opinions expressed about brand
experiences in live feeds and extract business intelligence from them.

Why Do We Need Real-Time Sentiment Analysis?

Real-time sentiment analysis has several applications for brand and customer analysis. These
include the following.

1. Live social feeds from video platforms like Instagram or Facebook

2. Real-time sentiment analysis of text feeds from platforms such as Twitter. This is

immensely helpful in prompt addressing of negative or wrongful social mentions as

well as threat detection in cyberbullying.

3. Live monitoring of Influencer live streams.

4. Live video streams of interviews, news broadcasts, seminars, panel discussions,


speaker events, and lectures.
5. Live audio streams such as in virtual meetings on Zoom or Skype, or at product

support call centers for customer feedback analysis.

6. Live monitoring of product review platforms for brand mentions.

7. Up-to-date scanning of news websites for relevant news through keywords and
hashtags along with the sentiment in the news.

How Is Real-Time Sentiment Analysis Done?

Live sentiment analysis is done through machine learning algorithms that are trained to
recognize and analyze all data types from multiple data sources, across different languages,
for sentiment.

A real-time sentiment analysis platform needs to be first trained on a data set based on your
industry and needs. Once this is done, the platform performs live sentiment analysis of real-
time feeds effortlessly.

Below are the steps involved in the process.

Step 1 - Data collection

To extract sentiment from live feeds from social media or other online sources, we first need
to add live APIs of those specific platforms, such as Instagram or Facebook. In case of a
platform or online scenario that does not have a live API, such as can be the case of Skype or
Zoom, repeat, time-bound data pull requests are carried out. This gives the solution the ability
to constantly track relevant data based on your set criteria.

Step 2 - Data processing

All the data from the various platforms thus gathered is now analyzed. All text data in
comments are cleaned up and processed for the next stage. All non-text data from live video
or audio feeds is transcribed and also added to the text pipeline. In this case, the platform
extracts semantic insights by first converting the audio, and the audio in the video data, to
text through speech-to-text software.
This transcript has timestamps for each word and is indexed section by section based on
pauses or changes in the speaker. A granular analysis of the audio content like this gives the
solution enough context to correctly identify entities, themes, and topics based on your
requirements. This time-bound mapping of the text also helps with semantic search.

Even though this may seem like a long drawn-out process, the algorithms complete this in
seconds.

Step 3 - Data analysis

All the data is now analyzed using native natural language processing (NLP), semantic
clustering, and aspect-based sentiment analysis. The platform derives sentiment from aspects
and themes it discovers from the live feed, giving you the sentiment score for each of them.

It can also give you an overall sentiment score in percentile form and tell you sentiment based
on language and data sources, thus giving you a break-up of audience opinions based
on various demographics.

Step 4 - Data visualization

All the intelligence derived from the real-time sentiment analysis in step 3 is now showcased
on a reporting dashboard in the form of statistics, graphs, and other visual elements. It is from
this sentiment analysis dashboard that you can set alerts for brand mentions and keywords in
live feeds as well.
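
As an illustrative sketch of the analysis step, the snippet below scores a few made-up live comments with NLTK's VADER sentiment model; it is a stand-in for the proprietary NLP pipeline described above, not the actual platform's code:

# pip install nltk   (the VADER lexicon is downloaded once on first run)
import nltk
nltk.download('vader_lexicon', quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
live_comments = [                      # hypothetical comments from a live feed
    "Loving the new update, great job!",
    "The app keeps crashing, this is terrible.",
    "It's okay, nothing special.",
]
for comment in live_comments:
    scores = analyzer.polarity_scores(comment)
    print(scores['compound'], comment)   # compound sentiment score in [-1, 1]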


What Are The Most Important Features Of A Real-Time Sentiment Analysis Platform?

A live feed sentiment analysis solution must have certain features that are necessary to extract
and determine real-time insights. These are:

 Multiplatform

One of the most important features of a real-time sentiment analysis tool is its ability to
analyze multiple social media platforms. This multiplatform capability means that the tool is
robust enough to handle API calls from different platforms, which have different rules and
configurations so that you get accurate insights from live data.

This gives you the flexibility to choose whether you want to have a combination of platforms
for live feed analysis such as from a Ted talk, live seminar, and Twitter, or just a single
platform, say, live Youtube video analysis.

 Multimedia

Being multi-platform also means that the solution needs to have the capability to process
multiple data types such as audio, video, and text. In this way, it allows you to discover brand
and customer sentiment through live TikTok social listening, real-time Instagram social
listening, or live Twitter feed analysis, effortlessly, regardless of the data format.

 Multilingual

Another important feature is a multilingual capability. For this, the platform needs to have
part-of-speech taggers for each language that it is analyzing. Machine translations can lead to
a loss of meanings and nuances when translating non-Germanic languages such as Korean,
Chinese, or Arabic into English. This can lead to inaccurate insights from live conversations.

 Web scraping

While metrics from a social media platform can tell you numerical data like the number of followers, posts,

likes, dislikes, etc, a real-time sentiment analysis platform can perform data scraping for more qualitative

insights. The tool’s in-built web scraper automatically extracts data from the social media platform you want

to extract sentiment from. It does so by sending HTTP requests to the different web pages it needs to target for

the desired information, downloads them, and then prepares them for analysis.

It parses the saved data and applies various ML tasks such as NLP, semantic classification, and sentiment

analysis. And in this way gives you customer insights beyond the numerical metrics that you are looking for.

 Alerts

The sentiment analysis tool for live feeds must have the capability to track and simplify complex data sets as it

conducts repeat scans for brand mentions, keywords, and hashtags. These repeat scans, ultimately, give you live

updates based on comments, posts, and audio content on various channels. Through this feature, you can set
alerts for particular keywords or when there is a spike in your mentions. You can get these notifications on your

mobile device or via email.

 Reporting

Another major feature of a real-time sentiment analysis platform is the reporting dashboard. The insights

visualization dashboard is needed to give you the insights that you require in a manner that is easily

understandable. Color-coded pie charts, bar graphs, word clouds, and other formats make it easy for you to

assess sentiment in topics, aspects, and the overall brand, while also giving you metrics in percentile form.

The user-friendly customer experience analysis solution, Repustate IQ, has a very comprehensive reporting

dashboard that gives numerous insights based on various aspects, topics, and sentiment combinations. In

addition, it is also available as an API that can be easily integrated with a dashboard such as Power BI or

Tableau that you are already using. This gives you the ability to leverage a high-precision sentiment analysis

API without having to invest in yet another end-to-end solution that has a fixed reporting dashboard.

12.CASE STUDY OF STOCK PRICE PREDICTION

The stock market is the collection of markets where stocks and other securities are bought and sold by

investors. Publicly traded companies offer shares of ownership to the public, and those shares can be

bought and sold on the stock market. Investors can make money by buying shares of a company at a low

price and selling them at a higher price. The stock market is a key component of the global economy,

providing businesses with funding for growth and expansion. It is also a popular way for individuals to

invest and grow their wealth over time.

Importance of Stock Market

Importance                 Description

Capital Formation          It provides a source of capital for companies to raise funds for growth and expansion.

Investment Opportunities   Investors can potentially grow their wealth over time by investing in the stock market.

Economic Indicators        The stock market can indicate the overall health of the economy.

Job Creation               Publicly traded companies often create jobs and contribute to the economy's growth.

Corporate Governance       Shareholders can hold companies accountable for their actions and decision-making processes.

Risk Management            Investors can use the stock market to manage their investment risk by diversifying their portfolio.

Market Efficiency          The stock market helps allocate resources efficiently by directing investments to companies with promising prospects.

What is Stock Market Prediction?

Let us see the data on which we will be working before we begin implementing the software to anticipate

stock market values. In this section, we will examine the stock price of Microsoft Corporation (MSFT) as

reported by the National Association of Securities Dealers Automated Quotations (NASDAQ). The stock

price data will be supplied as a Comma Separated File (.csv) that may be opened and analyzed in Excel or

a Spreadsheet.

MSFT’s stocks are listed on NASDAQ, and their value is updated every working day of the stock market. Note that the market does not allow trading on Saturdays and Sundays, so there are gaps between consecutive dates in the data. The Opening Value of the stock, the Highest and Lowest values of that stock

on the same day, as well as the Closing Value at the end of the day are all indicated for each date. Analyzing

this data can be useful for stock market prediction using machine learning techniques. The Adjusted Close Value reflects the stock's value after adjusting for dividends and stock splits. Furthermore, the total

volume of the stocks traded in the market is provided. With this information, it is the job of a Machine Learning/Data Scientist to look at the data and develop algorithms that can extract patterns from

the historical data of the Microsoft Corporation stock.
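
As a quick illustration, here is a minimal sketch (in Python, using pandas) of how such a file could be loaded and inspected. The file name MSFT.csv and the exact column names are assumptions based on the usual NASDAQ/Yahoo-style export format, not something fixed by this case study.

import pandas as pd

# Load the historical MSFT prices (hypothetical file name "MSFT.csv").
df = pd.read_csv("MSFT.csv", parse_dates=["Date"], index_col="Date")

print(df.shape)            # number of trading days x number of columns
print(df.head())           # Open, High, Low, Close, Adj Close, Volume per date
print(df.isnull().sum())   # check for missing values before modelling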

Stock Market Prediction Using the Long Short-Term Memory Method. We will use the Long Short-Term Memory (LSTM) method to create a Machine Learning model that forecasts Microsoft Corporation stock values. LSTM is a deep learning recurrent neural network (RNN) architecture whose gates make small, controlled changes to the information flowing through the network by multiplying and adding, which allows it to remember patterns over long sequences.

The main steps are:

1: Importing the Libraries

2: Getting to Visualising the Stock Market Prediction Data

3: Checking for Null Values by Printing the DataFrame Shape

4: Plotting the True Adjusted Close Value

5: Setting the Target Variable and Selecting the Features

6: Scaling

7: Creating a Training Set and a Test Set for Stock Market Prediction

8: Data Processing For LSTM

9: Building the LSTM Model for Stock Market Prediction

10: Training the Stock Market Prediction Model

11: Making the LSTM Prediction

12: Comparing Predicted vs True Adjusted Close Value – LSTM
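
To make the workflow concrete, the sketch below walks through steps 5 to 12 with Keras (steps 7 and 8 appear here in a slightly different order). The 60-day look-back window, the 90/10 train-test split, the layer sizes, and the column names are illustrative assumptions rather than values prescribed by this case study.

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

df = pd.read_csv("MSFT.csv", parse_dates=["Date"], index_col="Date")
target = df[["Adj Close"]].values                 # step 5: target variable

scaler = MinMaxScaler()                           # step 6: scale to [0, 1]
scaled = scaler.fit_transform(target)

window = 60                                       # look-back window (assumption)
X, y = [], []
for i in range(window, len(scaled)):              # step 8: build input sequences
    X.append(scaled[i - window:i, 0])
    y.append(scaled[i, 0])
X = np.array(X).reshape(-1, window, 1)
y = np.array(y)

split = int(0.9 * len(X))                         # step 7: train/test split
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = Sequential([                              # step 9: build the LSTM model
    Input(shape=(window, 1)),
    LSTM(50),
    Dense(1),
])
model.compile(optimizer="adam", loss="mean_squared_error")
model.fit(X_train, y_train, epochs=10, batch_size=32)      # step 10: train

pred = scaler.inverse_transform(model.predict(X_test))     # step 11: predict
true = scaler.inverse_transform(y_test.reshape(-1, 1))     # step 12: compare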

13. Decaying Window Algorithm


This algorithm allows you to identify the most popular elements (trending, in other words) in an incoming data
stream.

The decaying window algorithm not only tracks the most recurring elements in an incoming data stream, but
also discounts any random spikes or spam requests that might have boosted an element’s frequency. In a
decaying window, you assign a score or weight to every element of the incoming data stream. Further, you need
to calculate the aggregate sum for each distinct element by adding all the weights assigned to that element. The
element with the highest total score is listed as trending or the most popular.

1. Assign each element with a weight/score.


2. Calculate aggregate sum for each distinct element by adding all the weights assigned to that element.

In a decaying window algorithm, you assign more weight to newer elements. When a new element arrives, you first reduce the weight of every existing element by multiplying it by a constant factor (1 − c) and then assign the new element its own weight. If the stream seen so far is a_1, a_2, ..., a_t (with a_t the most recent element), the aggregate sum of the exponentially decayed weights is:

Sum = Σ (i = 0 to t−1) a_(t−i) · (1 − c)^i

Here, c is a small constant, typically of the order of 10^−6 or 10^−9. Whenever a new element, say a_(t+1), arrives in the data stream, you perform the following steps to obtain the updated sum:

1. Multiply the current sum/score by the value (1−c).

2. Add the weight corresponding to the new element.

Weight decays exponentially over time

In a data stream consisting of various elements, you maintain a separate sum for each distinct element. For every
incoming element, you multiply the sum of all the existing elements by a value of (1−c). Further, you add the
weight of the incoming element to its corresponding aggregate sum.

A threshold can also be set so that elements whose aggregate weight falls below it are ignored.

Finally, the element with the highest aggregate score is listed as the most popular element.
Example

For example, consider the sequence of Twitter tags below:

fifa, ipl, fifa, ipl, ipl, ipl, fifa

Let each element in the sequence have a weight of 1, and let c be 0.1. On every arrival, each tag's running score is multiplied by (1 − 0.1) = 0.9, and 1 is then added to the score of the tag that just arrived. The aggregate sum of each tag at the end of the above stream is calculated as follows:

Score of fifa, updated at each arrival:

fifa arrives: 0 * 0.9 + 1 = 1
ipl arrives: 1 * 0.9 = 0.9 (nothing is added because the current tag is not fifa)
fifa arrives: 0.9 * 0.9 + 1 = 1.81
ipl arrives: 1.81 * 0.9 = 1.629
ipl arrives: 1.629 * 0.9 = 1.4661
ipl arrives: 1.4661 * 0.9 = 1.31949
fifa arrives: 1.31949 * 0.9 + 1 ≈ 2.1875

Score of ipl, updated at each arrival:

fifa arrives: 0 * 0.9 = 0
ipl arrives: 0 * 0.9 + 1 = 1
fifa arrives: 1 * 0.9 = 0.9 (nothing is added because the current tag is not ipl)
ipl arrives: 0.9 * 0.9 + 1 = 1.81
ipl arrives: 1.81 * 0.9 + 1 = 2.629
ipl arrives: 2.629 * 0.9 + 1 = 3.3661
fifa arrives: 3.3661 * 0.9 ≈ 3.0295

At the end of the sequence, fifa's score is about 2.19 while ipl's score is about 3.03, so ipl is the more trending tag. Note that even though fifa is the most recent tag in the stream, ipl's more frequent occurrences keep its decayed score higher.
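
A minimal code sketch of this computation is given below; the stream and the constant c = 0.1 are taken from the example above, while the function name, the threshold option, and the other identifiers are purely illustrative.

from collections import defaultdict

def decaying_window_scores(stream, c=0.1, weight=1.0, threshold=None):
    # One running sum per distinct element seen in the stream.
    scores = defaultdict(float)
    for element in stream:
        # Decay every existing sum by a factor of (1 - c).
        for key in list(scores):
            scores[key] *= (1 - c)
            # Optionally drop elements whose score has fallen below the threshold.
            if threshold is not None and scores[key] < threshold:
                del scores[key]
        # Add the weight of the newly arrived element to its own sum.
        scores[element] += weight
    return dict(scores)

stream = ["fifa", "ipl", "fifa", "ipl", "ipl", "ipl", "fifa"]
print(decaying_window_scores(stream, c=0.1))
# fifa ends near 2.19 and ipl near 3.03, so ipl is reported as trending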

Advantages of Decaying Window Algorithm:

1. Sudden spikes or spam data are automatically discounted, since their contribution decays quickly.

2. Newer elements are given more weight, which produces a more accurate picture of what is currently trending.
