3rd Unit ML
Representation
Neural networks are computational models, loosely inspired by the human brain, that learn from
data. They consist of layers of nodes, or artificial neurons, which process input data and produce output.
Each neuron receives multiple inputs, multiplies each by a weight, sums them, and passes the result
through an activation function to determine its output. A typical network has three kinds of layers:
1. *Input Layer*: This layer receives the raw input data (features) and passes it on to the rest of the
network.
2. *Hidden Layers*: These layers perform computations and feature extraction. The number of hidden
layers and the number of neurons in each layer can vary depending on the complexity of the problem.
3. *Output Layer*: This layer produces the final output of the network.
Neural networks can represent nonlinear relationships and are particularly useful for tasks such as
classification, regression, and pattern recognition. They learn by adjusting the weights of connections
based on the error of the output compared to the expected result, using techniques like
backpropagation.
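The neuron computation described above can be sketched in a few lines of NumPy. The input, weight, and bias values below are arbitrary and chosen only for illustration, and sigmoid is used here as one common choice of activation function.

```python
import numpy as np

def sigmoid(z):
    # Squash the weighted sum into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values chosen only for illustration
x = np.array([0.5, -1.2, 3.0])   # inputs x1, x2, x3
w = np.array([0.4, 0.7, -0.2])   # weights w1, w2, w3
b = 0.1                          # bias

z = np.dot(w, x) + b             # weighted sum of inputs plus bias
output = sigmoid(z)              # neuron's activation (its output)
print(output)
```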
Problems
Neural networks can encounter several problems during their development and deployment. Here are
some common issues:
1. *Overfitting*: This occurs when the model learns the training data too well, including noise and
outliers, leading to poor generalization on unseen data. Techniques like regularization, dropout, and
using more training data can help mitigate overfitting.
2. *Underfitting*: This happens when the model is too simple to capture the underlying patterns in the
data, resulting in poor performance on both training and test datasets. Increasing model complexity or
adding more features can help.
3. *Vanishing Gradients*: In deep networks, gradients can become very small during backpropagation,
making it difficult for the model to learn. This is often addressed by using activation functions like ReLU
or techniques like batch normalization.
4. *Exploding Gradients*: Conversely, gradients can become excessively large, leading to unstable
training. Techniques such as gradient clipping can help manage this issue (see the sketch after this list).
5. *Data Quality and Quantity*: Neural networks require large amounts of high-quality data to perform
well. Insufficient or poor-quality data can lead to inaccurate models. Data augmentation and
preprocessing can improve data quality.
6. *Interpretability*: Neural networks, especially deep ones, are often seen as "black boxes," making it
challenging to understand how they arrive at specific decisions or predictions. Techniques like SHAP or
LIME can help improve interpretability.
7. *Bias in Data*: If the training data contains biases, the neural network may learn and perpetuate
these biases in its predictions, leading to unfair or unethical outcomes.
8. *Hyperparameter Tuning*: Finding the right hyperparameters (e.g., learning rate, batch size, number
of layers) can be challenging and often requires extensive experimentation.
9. *Deployment Challenges*: Once trained, deploying neural networks in real-world applications can
pose challenges related to scalability, integration with existing systems, and maintaining performance
over time.
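As a minimal sketch of the gradient-clipping technique mentioned under exploding gradients, the snippet below rescales gradients whose global L2 norm exceeds a threshold. The gradient arrays and the max_norm value are made-up illustrations, not values from the text.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    # Compute the global L2 norm across all gradient arrays
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    # If the norm exceeds the threshold, scale every gradient down proportionally
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# Hypothetical gradients from two layers, one with an exploding magnitude
grads = [np.array([3.0, -4.0]), np.array([120.0, -50.0])]
clipped = clip_by_global_norm(grads, max_norm=5.0)
print([np.linalg.norm(g) for g in clipped])
```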
Perceptron
A perceptron is the simplest type of artificial neural network and serves as a foundational building block
for more complex neural networks. It is primarily used for binary classification tasks. Here are the key
components and characteristics of a perceptron:
1. *Structure*:
- *Inputs*: A perceptron receives multiple input values (features), denoted as x1, x2, ..., xn.
- *Weights*: Each input is associated with a weight (w1, w2, ..., wn) that signifies its importance in the
decision-making process.
- *Bias*: A bias term (b) is added to the weighted sum to adjust the output independently of the input
values.
2. *Operation*:
- *Weighted Sum*: The perceptron computes a weighted sum of the inputs: z = (w1*x1 + w2*x2 + ... +
wn*xn) + b.
- *Activation*: The sum z is passed through a step (threshold) function, outputting 1 if z is greater than
0 and 0 otherwise; this binary value is the class prediction.
3. *Learning*:
- The perceptron learns by adjusting its weights based on the error in its predictions, using the
perceptron learning algorithm: after each training example, the weights are updated in proportion to the
difference between the predicted output and the actual label (a code sketch follows this list).
4. *Limitations*:
- A single-layer perceptron can only classify linearly separable data. For more complex tasks, multi-
layer perceptrons (MLPs) with hidden layers are used, allowing the network to learn non-linear
relationships.
5. *Applications*:
- On their own, perceptrons handle simple, linearly separable binary classification problems; as building
blocks of larger networks, they underpin applications such as image recognition and natural language
processing.
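Below is a minimal sketch of the perceptron learning algorithm on an AND gate (a linearly separable problem). The learning rate, epoch count, and dataset are assumptions chosen only to keep the example small.

```python
import numpy as np

def step(z):
    # Threshold activation: 1 if z > 0, else 0
    return 1 if z > 0 else 0

# AND-gate data: linearly separable, so a single perceptron can learn it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)      # weights w1, w2
b = 0.0              # bias
lr = 0.1             # learning rate (assumed value)

for epoch in range(20):
    for xi, target in zip(X, y):
        pred = step(np.dot(w, xi) + b)
        error = target - pred          # perceptron learning rule
        w += lr * error * xi           # nudge weights toward the target
        b += lr * error                # nudge the bias as well

print([step(np.dot(w, xi) + b) for xi in X])   # expected: [0, 0, 0, 1]
```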
Multilayer Network
A multilayer neural network, often referred to as a multi-layer perceptron (MLP), consists of multiple
layers of neurons, including an input layer, one or more hidden layers, and an output layer. This
architecture allows the network to learn complex patterns and relationships in the data.
1. *Input Layer*: This layer receives the input features and passes them on to the first hidden layer.
2. *Hidden Layers*: These layers perform computations and feature extraction. The number of hidden
layers and the number of neurons in each layer can vary depending on the complexity of the problem.
3. *Output Layer*: This layer produces the final output of the network. For binary classification, it
typically has one neuron with a sigmoid activation function, while for multi-class classification, it may
have multiple neurons with a softmax activation function.
*Diagram Representation*:
- Arrows represent the connections between neurons, where each connection has an associated weight.
- The input layer feeds data into the hidden layers, which process the information and pass it to the
output layer.
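Below is a minimal sketch of a forward pass through such a multilayer network, with a sigmoid hidden layer and a softmax output layer for multi-class classification. The layer sizes, random weights, and example input are assumptions for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)

# Assumed architecture: 4 inputs -> 5 hidden neurons -> 3 output classes
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)
W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)

x = np.array([0.2, -0.4, 1.5, 0.7])   # one example input (arbitrary values)

h = sigmoid(W1 @ x + b1)              # hidden layer: weighted sums + activation
y_hat = softmax(W2 @ h + b2)          # output layer: class probabilities

print(y_hat, y_hat.sum())             # the probabilities sum to 1
```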
Back Propagation
Backpropagation is a widely used algorithm for training artificial neural networks. It is essential for
optimizing the weights of the network by minimizing the error between the predicted output and the
actual target values. Here’s a breakdown of how backpropagation works:
1. *Forward Pass*:
- The input data is passed through the network layer by layer; each neuron computes a weighted sum of
its inputs and applies its activation function.
- This process continues until the output layer produces the final prediction.
2. *Loss Calculation*:
- After obtaining the output, the loss (error) is calculated using a loss function (e.g., mean squared
error for regression or cross-entropy for classification).
- The loss quantifies how far the predicted output is from the actual target.
3. *Backward Pass*:
- The backpropagation algorithm calculates the gradient of the loss function with respect to each
weight in the network using the chain rule of calculus.
- It starts from the output layer and moves backward through the network, layer by layer.
- For each neuron, the algorithm computes how much each weight contributed to the error and
adjusts the weights accordingly.
4. *Weight Update*:
- The weights are updated using an optimization algorithm, typically stochastic gradient descent (SGD)
or one of its variants (e.g., Adam, RMSprop). Each weight is moved a small step against its gradient:
w = w - learning_rate * (gradient of the loss with respect to w).
- The learning rate determines the step size for these updates.
5. *Iteration*:
- The forward and backward passes are repeated for multiple epochs (iterations over the entire
training dataset) until the network converges to a solution with minimal error.
Backpropagation is crucial for training deep neural networks, enabling them to learn complex patterns in
data. However, it can face challenges such as vanishing gradients in very deep networks, which can
hinder learning.
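The five steps above can be traced in a small NumPy sketch that trains a one-hidden-layer network on XOR with mean squared error and plain gradient descent. The dataset, layer sizes, learning rate, and epoch count are all assumptions chosen to keep the example compact; the numbered comments map to the steps listed above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)

# XOR data: a small non-linear problem (assumed for illustration)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # hidden -> output
lr = 0.5                                             # learning rate (assumed)

for epoch in range(5000):
    # 1. Forward pass
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)

    # 2. Loss calculation (mean squared error)
    loss = np.mean((y_hat - y) ** 2)
    if epoch % 1000 == 0:
        print(epoch, round(loss, 4))                 # loss should generally decrease

    # 3. Backward pass: error signals via the chain rule
    d_out = (y_hat - y) * y_hat * (1 - y_hat)        # error signal at the output layer
    d_hid = (d_out @ W2.T) * h * (1 - h)             # error signal at the hidden layer

    # 4. Weight update (gradient descent)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0, keepdims=True)

# 5. Iteration: with these assumed settings the predictions typically approach the XOR targets
print(np.round(y_hat, 2))
```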
Advanced topics
Advanced topics in neural networks and machine learning encompass a variety of concepts and
techniques that go beyond the basics. Here are some key advanced topics:
1. *Deep Learning*: This involves neural networks with many layers (deep networks) that can learn
complex representations of data. Techniques include convolutional neural networks (CNNs) for image
processing and recurrent neural networks (RNNs) for sequential data.
2. *Transfer Learning*: This technique involves taking a pre-trained model on a large dataset and fine-
tuning it on a smaller, specific dataset. It is particularly useful when data is scarce.
3. *Generative Adversarial Networks (GANs)*: GANs consist of two neural networks (a generator and a
discriminator) that compete against each other. The generator creates fake data, while the discriminator
tries to distinguish between real and fake data.
4. *Reinforcement Learning*: This area focuses on training agents to make decisions by maximizing
cumulative rewards through trial and error. Techniques include Q-learning and deep reinforcement
learning.
5. *Natural Language Processing (NLP)*: Advanced NLP techniques involve using neural networks for
tasks such as sentiment analysis, machine translation, and text generation. Models like Transformers
and BERT have revolutionized this field.
6. *Explainable AI (XAI)*: This field focuses on making AI models interpretable and understandable to
humans. Techniques include SHAP values and LIME (Local Interpretable Model-agnostic Explanations).
7. *Neural Architecture Search (NAS)*: This involves automating the design of neural network
architectures to find the best-performing models for a given task.
8. *Hyperparameter Optimization*: Techniques such as grid search, random search, and Bayesian
optimization are used to find the optimal hyperparameters for training neural networks (a small
random-search sketch follows this list).
9. *Federated Learning*: This is a decentralized approach to training models across multiple devices
while keeping data localized, enhancing privacy and security.
10. *Adversarial Attacks and Defenses*: This area studies how neural networks can be fooled by small
perturbations in input data and explores methods to make models robust against such attacks.
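As a small illustration of hyperparameter optimization (item 8), here is a random-search sketch. The search ranges and the scoring function are placeholders; in practice the score would come from training a model with the sampled hyperparameters and measuring its accuracy on held-out validation data.

```python
import random

def validation_score(learning_rate, hidden_units):
    # Placeholder objective: in practice, train a model with these
    # hyperparameters and return its validation accuracy.
    return -(learning_rate - 0.01) ** 2 - (hidden_units - 64) ** 2 / 1e4

random.seed(0)
best_score, best_params = float("-inf"), None

for trial in range(20):
    # Sample hyperparameters from assumed search ranges
    lr = 10 ** random.uniform(-4, -1)            # log-uniform in [1e-4, 1e-1]
    hidden = random.choice([16, 32, 64, 128])    # candidate hidden-layer sizes
    score = validation_score(lr, hidden)
    if score > best_score:
        best_score, best_params = score, (lr, hidden)

print(best_params, best_score)
```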
Maximum likelihood: https://www.youtube.com/watch?v=NdD9UuXKMjU&list=PLmAmHQ-_5ySyQeEryrlomrEvOGNYN3TAL&index=41
Minimum description length principle: https://www.youtube.com/watch?v=tRHpFG3P2k8&list=PLmAmHQ-_5ySyQeEryrlomrEvOGNYN3TAL&index=42