AI Unit 5
1. Classification Accuracy
Classification Accuracy is what we usually mean when we use the term accuracy. It is the ratio
of the number of correct predictions to the total number of input samples.
It works well only if there are an equal number of samples belonging to each class.
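For concreteness, here is a minimal Python sketch of this ratio; the label lists are made-up placeholders, not data from this unit.

```python
# Minimal sketch: accuracy = correct predictions / total samples.
# y_true and y_pred are illustrative placeholder labels.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)
print(accuracy)  # 5 correct out of 6 -> 0.833...
```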
2. Logarithmic Loss
Logarithmic Loss, or Log Loss, works by penalizing false classifications. It works well for
multi-class classification. When working with Log Loss, the classifier must assign a probability
to each class for all the samples. Suppose there are N samples belonging to M classes; then the
Log Loss is calculated as below:

Log Loss = -(1/N) * Σ_i Σ_j [ y_ij * log(p_ij) ]   (i = 1..N, j = 1..M)

where,
y_ij indicates whether sample i belongs to class j or not
p_ij indicates the probability of sample i belonging to class j
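As a quick illustration of the formula above, the following sketch computes Log Loss directly from one-hot labels and predicted probabilities; the arrays are invented placeholders.

```python
import math

# Hedged sketch of multi-class Log Loss for N samples and M classes.
# y[i][j] is 1 if sample i belongs to class j, else 0; p[i][j] is the
# predicted probability of class j for sample i (each row sums to 1).
y = [[1, 0, 0],
     [0, 1, 0]]
p = [[0.7, 0.2, 0.1],
     [0.1, 0.8, 0.1]]

eps = 1e-15  # clip probabilities to avoid log(0)
N, M = len(y), len(y[0])
log_loss = -sum(
    y[i][j] * math.log(max(p[i][j], eps))
    for i in range(N) for j in range(M)
) / N
print(log_loss)
```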
3. Confusion Matrix
The Confusion Matrix, as the name suggests, gives us a matrix as output and describes the complete
performance of the model.
Let's assume we have a binary classification problem, with samples belonging to two
classes: YES or NO. We also have our own classifier, which predicts a class for a given input
sample. On testing our model on 165 samples, we get the following result.
(Figure: confusion matrix of the 165 test results)
There are 4 important terms:
• True Positives: The cases in which we predicted YES and the actual output was also YES.
• True Negatives: The cases in which we predicted NO and the actual output was NO.
• False Positives: The cases in which we predicted YES and the actual output was NO.
• False Negatives: The cases in which we predicted NO and the actual output was YES.
Accuracy for the matrix can be calculated by taking the sum of the values lying across the "main
diagonal" divided by the total number of samples, i.e.

Accuracy = (TP + TN) / (TP + TN + FP + FN)

The Confusion Matrix forms the basis for the other types of metrics.
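As an illustration, the sketch below builds a confusion matrix with scikit-learn's confusion_matrix and derives accuracy from it; the YES/NO labels are invented placeholders, and the 165-sample counts from the original figure are not reproduced.

```python
from sklearn.metrics import confusion_matrix

# Placeholder labels for a small binary YES/NO problem.
y_true = ["YES", "NO", "YES", "YES", "NO", "NO", "YES", "NO"]
y_pred = ["YES", "NO", "NO",  "YES", "YES", "NO", "YES", "NO"]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=["YES", "NO"])
print(cm)

tp, fn = cm[0][0], cm[0][1]   # actual YES row
fp, tn = cm[1][0], cm[1][1]   # actual NO row
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 6 of 8 correct -> 0.75
```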
4. F1 Score
F1 Score is used to measure a test's accuracy.
F1 Score is the Harmonic Mean between precision and recall. The range for F1 Score is [0, 1].
It tells you how precise your classifier is (how many of the instances it labels positive are
actually positive), as well as how robust it is (it does not miss a significant number of positive instances).
High precision but low recall gives you an extremely accurate classifier, but one that misses a large
number of instances that are difficult to classify. The greater the F1 Score, the better the
performance of our model. Mathematically, it can be expressed as:

F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
F1 Score tries to find the balance between precision and recall.
• Precision: It is the number of correct positive results divided by the number of
positive results predicted by the classifier.

Precision = TP / (TP + FP)
• Recall: It is the number of correct positive results divided by the number
of all relevant samples (all samples that should have been identified as positive).

Recall = TP / (TP + FN)
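The following sketch ties the three formulas together; the counts tp, fp, and fn are arbitrary placeholders chosen only for illustration, and the commented scikit-learn calls show an equivalent library route.

```python
# Precision, recall, and F1 computed from confusion-matrix counts.
tp, fp, fn = 80, 20, 10   # placeholder counts

precision = tp / (tp + fp)                          # correct positives / predicted positives
recall = tp / (tp + fn)                             # correct positives / actual positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(precision, recall, f1)  # 0.8, 0.888..., 0.842...

# Equivalent scikit-learn calls, given label arrays y_true and y_pred:
# from sklearn.metrics import precision_score, recall_score, f1_score
# precision_score(y_true, y_pred); recall_score(y_true, y_pred); f1_score(y_true, y_pred)
```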
Model Selection
Model selection is an essential phase in the development of powerful and precise predictive
models in machine learning. It is the process of deciding which algorithm and model architecture
is best suited for a particular task or dataset, and it entails contrasting various models,
assessing their efficacy, and choosing the one that most effectively addresses the issue at hand.
The choice of an appropriate machine learning model is crucial, since models differ in their
complexity, underlying assumptions, and capabilities. A model that performs effectively on a
single dataset or problem may still fail to generalize to new, unseen data. Finding the right
balance between model complexity and generalization is therefore key to model selection.
Choosing a model typically entails a number of steps. The first step is to define
a suitable evaluation metric that matches the objectives of the particular situation. Depending
on the nature of the problem, this metric may be precision, recall, accuracy, F1-score, or any
other relevant measure.
Several candidate models are then selected in accordance with the problem at hand and the data
that are available. These models might be as straightforward as decision trees or linear
regression, or as sophisticated as deep neural networks, random forests, or support vector
machines. During the selection process, it is important to take into account the assumptions,
constraints, and hyperparameters that are unique to each model.
Once selected, the candidate models are trained and evaluated using a suitable methodology, such
as cross-validation. To do this, the available data are divided into training and validation
sets, with each model fitted on the training set and then evaluated on the validation set. The
models are compared using their performance metrics, and the model with the highest performance
is chosen.
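A minimal sketch of this train-and-compare loop, assuming scikit-learn and a synthetic placeholder dataset, might look like this:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic placeholder data standing in for the real dataset.
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Score each candidate with 5-fold cross-validation and keep the best mean accuracy.
scores = {name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "-> selected:", best)
```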
Model selection is, however, an ongoing process. Making sound choices frequently calls for an
iterative cycle of testing several models and hyperparameters. This iterative process refines
the models and helps in choosing the ideal mix of algorithms and hyperparameters.
Model Selection
In machine learning, the process of selecting the top model or algorithm from a list of potential
models to address a certain issue is referred to as model selection. It entails assessing and
contrasting various models according to how well they function and choosing the one that
reaches the highest level of accuracy or prediction power.
Because different models have varied levels of complexity, underlying assumptions, and
capabilities, model selection is a crucial stage in the machine-learning pipeline. Finding a
model that fits the training set of data well and generalizes well to new data is the objective.
While a model that is too complex may overfit the data and be unable to generalize, a model
that is too simple could underfit the data and do poorly in terms of prediction.
The following steps are frequently included in the model selection process:
• Problem formulation: Clearly express the issue at hand, including the kind of
predictions or task that you'd like the model to carry out (for example, classification,
regression, or clustering).
• Candidate model selection: Pick a group of models that are appropriate for the issue
at hand. These models can include straightforward methods like decision trees or linear
regression as well as more sophisticated ones like deep neural networks, random
forests, or support vector machines.
• Performance evaluation: Establish measures for assessing how well each model
performs. Common measures include accuracy, precision, recall, F1-score, and mean squared
error. The type of problem and its particular requirements will determine which metrics
are used.
• Training and evaluation: Each candidate model should be trained using a subset of
the available data (the training set), and its performance should be assessed using a
different subset (the validation set or via cross-validation). The established evaluation
measures are used to gauge the model's effectiveness.
• Model comparison: Evaluate the performance of the various models and determine which
one performs best on the validation set. Take into account elements like data-handling
capabilities, interpretability, computational complexity, and accuracy.
• Hyperparameter tuning: Before training, many models require that certain
hyperparameters, such as the learning rate, regularization strength, or the number of
hidden layers in a neural network, be configured. Use methods like grid search,
random search, or Bayesian optimization to identify these hyperparameters' ideal
values (a grid-search sketch follows this list).
• Final model selection: After the models have been analyzed and fine-tuned, pick the
model that performs the best. Then, this model can be used to make predictions based
on fresh, unforeseen data.
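As referenced in the hyperparameter-tuning step above, here is a hedged grid-search sketch using scikit-learn's GridSearchCV; the parameter grid and synthetic dataset are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Candidate hyperparameter values to try (placeholder grid).
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 5, 10],
}

# Exhaustively evaluate every combination with 5-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
# search.best_estimator_ is the refit final model, ready for new data.
```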
There are numerous important considerations to bear in mind while selecting a model for
machine learning. These factors help ensure that the chosen model effectively addresses the
problem at hand and has a good chance of performing well. Here are some crucial things to
remember:
• The complexity of the issue
• Data Availability & Quality
• Model Assumptions
• Scalability and Efficiency
• Domain Expertise
• Resource Constraints
• Evaluation and Experimentation
Ensemble methods
Ensemble methods are machine learning techniques that combine several base models in
order to produce one optimal predictive model. To better understand this definition, let's take a
step back to the ultimate goal of machine learning and model building.
A Decision Tree determines the predictive value based on a series of questions and conditions.
For instance, consider a simple Decision Tree that determines whether an individual should play
outside or not. The tree takes several weather factors into account and, given each factor,
either makes a decision or asks another question.
Similar to bagging, bootstrapped subsamples are pulled from a larger dataset and a decision tree
is formed on each subsample. However, each decision tree is split on a different random subset
of the features (in the original diagram the features are represented by shapes). This is the
idea behind a Random Forest, sketched in the code below.
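A short sketch of this idea, assuming scikit-learn and a synthetic placeholder dataset: bagged decision trees versus a random forest that also restricts the features considered at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=1)

# Bagging: each tree sees a bootstrapped subsample, all features available at each split.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=1)

# Random forest: bootstrapped subsamples plus a random feature subset per split.
forest = RandomForestClassifier(n_estimators=50, max_features="sqrt", random_state=1)

for name, model in [("bagging", bagging), ("random_forest", forest)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```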
3. Boosting
Boosting is a sequential process, where each subsequent model attempts to correct the
errors of the previous model. The succeeding models are dependent on the previous model.
Let’s understand the way boosting works in the below steps.
1. A subset is created from the original dataset.
2. Initially, all data points are given equal weights.
3. A base model is created on this subset.
4. This model is used to make predictions on the whole dataset.
5. Errors are calculated using the actual values and predicted values.
6. The observations which are incorrectly predicted are given higher weights.
(In the original illustration, the three misclassified blue-plus points are given higher weights.)
7. Another model is created and predictions are made on the dataset.
(This model tries to correct the errors from the previous model)
8. Similarly, multiple models are created, each correcting the errors of the previous
model.
9. The final model (strong learner) is the weighted mean of all the models (weak
learners).
Thus, the boosting algorithm combines a number of weak learners to form a strong learner.
The individual models would not perform well on the entire dataset, but they work well
for some part of the dataset. In this way, each model actually boosts the performance of the
ensemble.
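As a concrete, hedged example of this weak-learner combination, the sketch below boosts decision stumps with scikit-learn's AdaBoostClassifier on a synthetic placeholder dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=2)

# Each weak learner is a decision stump; successive stumps focus on the
# samples the previous ones misclassified, and the ensemble is a weighted vote.
weak_learner = DecisionTreeClassifier(max_depth=1)
boosted = AdaBoostClassifier(weak_learner, n_estimators=100, random_state=2)

print(cross_val_score(boosted, X, y, cv=5).mean())
```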
Deep generative models
A Generative Model is a powerful way of learning any kind of data distribution using
unsupervised learning, and it has achieved tremendous success in just a few years.
Deep generative models are a class of artificial intelligence algorithms used in machine
learning and specifically in the field of generative modeling. These models aim to learn
the underlying distribution of a dataset in order to generate new data samples that resemble
the original dataset. Deep generative models leverage deep learning techniques, typically
using neural networks with multiple layers to capture complex patterns and relationships
within the data.
Some popular types of deep generative models include:
1. Variational Autoencoders (VAEs)
2. Generative Adversarial Networks (GANs)
3. Autoregressive Models
4. Flow-Based Models
Deep generative models have a wide range of applications, including image generation,
text generation, data augmentation, and anomaly detection.
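As one hedged illustration, the sketch below outlines a minimal Variational Autoencoder in PyTorch; the 784-dimensional input (for example, flattened 28x28 images), layer sizes, and latent dimension are illustrative assumptions rather than details from this unit.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=20):
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.enc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of q(z|x)
        self.enc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of q(z|x)
        self.dec_hidden = nn.Linear(latent_dim, hidden_dim)
        self.dec_out = nn.Linear(hidden_dim, input_dim)

    def encode(self, x):
        h = F.relu(self.enc(x))
        return self.enc_mu(h), self.enc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps so gradients flow through mu and logvar.
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def decode(self, z):
        h = F.relu(self.dec_hidden(z))
        return torch.sigmoid(self.dec_out(h))

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(recon_x, x, mu, logvar):
    # Reconstruction term plus KL divergence to the unit Gaussian prior.
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# Usage sketch on a random batch standing in for real data scaled to [0, 1].
model = VAE()
x = torch.rand(32, 784)
recon, mu, logvar = model(x)
print(vae_loss(recon, x, mu, logvar).item())
```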
A Boltzmann machine is an unsupervised deep learning model in which every node is
connected to every other node. It is a type of recurrent neural network, and the nodes make
binary decisions with some level of bias.
These machines are not deterministic deep learning models; they are stochastic, or generative,
deep learning models. They are representations of a system.
A Boltzmann machine has two kinds of nodes:
• Visible nodes:
These are nodes that can be measured and are measured.
• Hidden nodes:
These are nodes that cannot be measured or are not measured.
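As a hedged illustration of these stochastic visible and hidden units, the NumPy sketch below performs one contrastive-divergence (CD-1) update for a restricted Boltzmann machine, the commonly used variant in which connections run only between the visible and hidden layers; all sizes and the data vector are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3

W = rng.normal(0, 0.1, size=(n_visible, n_hidden))  # visible-hidden weights
b_v = np.zeros(n_visible)                           # visible biases
b_h = np.zeros(n_hidden)                            # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Nodes make stochastic binary decisions with probability p.
    return (rng.random(p.shape) < p).astype(float)

v0 = rng.integers(0, 2, size=n_visible).astype(float)  # placeholder data vector

# Up pass: probabilities and samples of hidden units given visible units.
p_h0 = sigmoid(v0 @ W + b_h)
h0 = sample(p_h0)

# Down pass and second up pass (one step of Gibbs sampling).
p_v1 = sigmoid(h0 @ W.T + b_v)
v1 = sample(p_v1)
p_h1 = sigmoid(v1 @ W + b_h)

# CD-1 update: difference between data and reconstruction correlations.
lr = 0.1
W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
b_v += lr * (v0 - v1)
b_h += lr * (p_h0 - p_h1)
```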
Deep learning for time series data
Time series data refers to sequential data points collected over time, where each data point is
associated with a specific timestamp. Deep learning techniques have been increasingly applied
to analyze and model time series data due to their ability to capture complex temporal
dependencies and patterns.
Here are several common approaches to using deep learning for time series data:
Recurrent Neural Networks (RNNs): RNNs are a type of neural network architecture
specifically designed to handle sequential data. They have recurrent connections that allow
information to persist over time. Long Short-Term Memory (LSTM) networks and Gated
Recurrent Units (GRUs) are popular variants of RNNs that are capable of capturing long-range
dependencies and mitigating the vanishing gradient problem.
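A minimal sketch of an LSTM forecaster in PyTorch, assuming a univariate series split into fixed-length windows; the layer sizes and window length are illustrative choices.

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, input_size=1, hidden_size=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                    # x: (batch, time_steps, input_size)
        output, _ = self.lstm(x)
        return self.head(output[:, -1, :])   # predict the next value from the last step

model = LSTMForecaster()
batch = torch.randn(8, 30, 1)                # 8 placeholder sequences of 30 time steps
print(model(batch).shape)                    # torch.Size([8, 1])
```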
Convolutional Neural Networks (CNNs): While CNNs are traditionally used for image data,
they can also be applied to time series data by treating the temporal dimension as a spatial
dimension. This is achieved by using one-dimensional convolutions over the time axis. CNNs
can capture local patterns and are particularly effective when there are spatial-temporal patterns
in the data.
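A hedged sketch of the same idea with 1-D convolutions in PyTorch; the channel counts, kernel size, and pooling head are illustrative choices.

```python
import torch
import torch.nn as nn

# Conv1d expects input shaped (batch, channels, time_steps).
cnn = nn.Sequential(
    nn.Conv1d(in_channels=1, out_channels=16, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),   # pool over the time axis
    nn.Flatten(),
    nn.Linear(16, 1),          # e.g. next-value regression
)

x = torch.randn(8, 1, 30)      # 8 placeholder univariate series of 30 time steps
print(cnn(x).shape)            # torch.Size([8, 1])
```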
Hybrid Models: Some approaches combine multiple architectures, such as combining CNNs
and RNNs or CNNs and Transformers, to leverage the strengths of each model type for
different aspects of the time series data.
Deep learning models for time series data have been applied in various domains, including
finance (for stock price prediction and algorithmic trading), healthcare (for patient monitoring
and disease diagnosis), energy (for demand forecasting and anomaly detection), and many
others. However, it's important to carefully design the architecture, preprocess the data
appropriately, and tune the hyperparameters to achieve optimal performance for a specific task.