MACHINE LEARNING - UNIT 1
REVIEW OF LINEAR ALGEBRA FOR MACHINE LEARNING
1. Basics:
•Scalars: single values, denoted as lowercase letters (e.g., a).
•Vectors: ordered lists of numbers, denoted as bold lowercase (e.g., v).
•Matrices: 2D arrays of numbers, denoted as bold uppercase (e.g., A).
•Tensors: generalizations of vectors and matrices to higher dimensions.
2. Key Operations:
•Addition and scalar multiplication of vectors and matrices.
•Dot (inner) product of vectors.
•Matrix multiplication and transpose.
•Matrix inverse (for invertible square matrices).
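These objects and operations can be sketched in a few lines of NumPy (the values below are arbitrary illustrations):

```python
import numpy as np

a = 2.5                                   # scalar
v = np.array([1.0, 2.0, 3.0])             # vector
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])                # 2x2 matrix
T = np.zeros((2, 3, 4))                   # rank-3 tensor (higher-dimensional array)

print(a * v)             # scalar-vector multiplication
print(v @ v)             # dot product -> 14.0
print(A.T)               # transpose
print(A @ A)             # matrix-matrix product
print(np.linalg.inv(A))  # inverse (exists since det(A) = -2 != 0)
```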
VC (VAPNIK-CHERVONENKIS) DIMENSION
The VC dimension measures a model’s capacity, or its ability to classify a variety of
data patterns. It’s defined as the maximum number of points a model can shatter,
meaning the model can perfectly classify all possible labelings of those points. Higher
VC dimensions imply more complex models that may overfit, while lower VC dimensions
suggest simpler models that may underfit. It’s crucial for understanding a model’s
generalization ability.
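As a rough empirical sketch (assuming scikit-learn is available; the points are invented for illustration), one can check that linear classifiers in the plane shatter three non-collinear points, consistent with their VC dimension of 3:

```python
import numpy as np
from itertools import product
from sklearn.svm import LinearSVC

# Three non-collinear points in the plane
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

shattered = True
for labels in product([0, 1], repeat=len(points)):
    labels = np.array(labels)
    if labels.min() == labels.max():
        continue  # an all-same labeling is trivially separable
    # Large C approximates a hard-margin linear separator
    clf = LinearSVC(C=1e6, max_iter=10_000).fit(points, labels)
    if (clf.predict(points) != labels).any():
        shattered = False
print("3 points shattered:", shattered)  # expected: True
```

Repeating the check with four points would fail for some labelings (e.g., the XOR pattern), which is why the VC dimension of 2D linear classifiers is 3, not 4.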
PROBABLY APPROXIMATELY CORRECT (PAC) LEARNING
Probably Approximately Correct (PAC) Learning is a framework in machine learning that quantifies a
model's ability to learn from data. In PAC learning, a model is considered successful if, with high
probability (the "Probably" part), it can learn a hypothesis that is approximately correct—that is, close
enough to the true function or distribution generating the data.
Key points:
•Probably: The model will produce an accurate hypothesis with a high probability (e.g., 95%).
•Approximately Correct: The hypothesis may not be perfect, but its error is within an acceptable margin
(ε).
•Efficiency: PAC learning also requires that such a hypothesis be found efficiently,
using a reasonable amount of data and computation time.
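For a finite hypothesis space in the realizable setting, the standard sample-complexity bound makes the "probably" (δ) and "approximately" (ε) parts concrete; the numbers below are purely illustrative:

```python
import math

def pac_sample_bound(h_size, epsilon, delta):
    """Standard PAC bound for a finite, realizable hypothesis space:
    m >= (1/epsilon) * (ln|H| + ln(1/delta)) samples suffice so that, with
    probability >= 1 - delta, the learned hypothesis has error <= epsilon."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / epsilon)

# |H| = 1000 hypotheses, 5% error margin, 95% confidence
print(pac_sample_bound(1000, epsilon=0.05, delta=0.05))  # -> 199
```

Note that tightening δ increases the bound only logarithmically, while tightening ε increases it linearly in 1/ε.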
HYPOTHESIS SPACE
The hypothesis space in machine learning is the set of all possible models or
functions that a learning algorithm can choose from to fit a given dataset. It
includes every potential hypothesis (or function) that could map inputs to outputs
based on the training data.
Why It’s Essential
•Defines Learning Scope: The hypothesis space determines the complexity and
flexibility of the models, influencing what patterns or relationships the model can
learn from the data.
•Affects Generalization: A too-large hypothesis space can lead to overfitting,
where the model learns noise instead of patterns. A too-small hypothesis space may
underfit, missing important data relationships.
•Guides Model Selection: Choosing an appropriate hypothesis space helps balance model
complexity against generalization, guiding the choice of algorithm and model class (a
sketch follows below).
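A minimal sketch with an invented toy dataset: a tiny, finite hypothesis space of 1D threshold classifiers, where learning reduces to picking the hypothesis with the lowest training error:

```python
import numpy as np

# Toy 1D dataset labeled by an unknown threshold rule
X = np.array([0.10, 0.25, 0.40, 0.60, 0.85])
y = (X > 0.5).astype(int)

# Finite hypothesis space: threshold classifiers h_t(x) = 1 if x > t else 0
thresholds = np.linspace(0.0, 1.0, 21)

# "Learning" = choosing the hypothesis with the lowest training error
errors = [np.mean((X > t).astype(int) != y) for t in thresholds]
best_t = thresholds[int(np.argmin(errors))]
print(f"chosen threshold: {best_t:.2f}, training error: {min(errors):.2f}")
```

Enlarging this space (e.g., allowing arbitrary subsets of the inputs as positive) would let the learner fit any labeling, including noise, which is the overfitting risk noted above.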
INDUCTIVE BIAS
Inductive bias is the set of assumptions a learning algorithm makes in order to
generalize from finite training data to unseen examples. These assumptions shape the
bias-variance behavior described below.
Bias:
• Represents the error introduced by approximating a possibly complex real-world
problem with a simpler model.
• High bias models tend to make strong assumptions about the data and may
oversimplify it.
• Leads to underfitting, where the model is too simple and fails to capture the data
patterns.
Variance:
• Represents the model's sensitivity to fluctuations in the training data.
• High variance models capture noise along with the underlying data patterns.
• Leads to overfitting, where the model performs well on training data but poorly
on new, unseen data.
Trade-Off:
• Increasing model complexity typically lowers bias but raises variance, and vice versa.
• The goal is to balance the two so that total error on new, unseen data is minimized.
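A rough simulation sketch (synthetic data and settings invented for illustration) that estimates bias² and variance of polynomial fits of increasing degree at a single test point:

```python
import numpy as np

rng = np.random.default_rng(42)

def true_fn(x):
    return np.sin(2 * np.pi * x)

x_test = 0.3                    # single test input for the decomposition
n_trials, n_samples = 200, 20   # many resampled training sets

for degree in (1, 3, 9):
    preds = []
    for _ in range(n_trials):
        x = rng.uniform(0.0, 1.0, n_samples)
        y = true_fn(x) + rng.normal(scale=0.3, size=n_samples)
        coeffs = np.polyfit(x, y, degree)      # fit polynomial of given degree
        preds.append(np.polyval(coeffs, x_test))
    preds = np.array(preds)
    bias_sq = (preds.mean() - true_fn(x_test)) ** 2
    variance = preds.var()
    print(f"degree {degree}: bias^2 = {bias_sq:.3f}, variance = {variance:.3f}")
```

As the degree grows, bias² typically shrinks while variance grows, which is the trade-off the visualization below depicts.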
[Figure: bias-variance trade-off visualization]
THANK YOU..