IT 323 Lectures by Ruchika Pharswan Till Midterm V2
ML: Machine Learning
Conducted by Ruchika Pharswan
at Delhi Technological University
8 Aug 2024
Brief About Me
Email : ruchikapharswan2024@gmail.com
If there is a mass bunk, then in the next lecture a few of you will be randomly selected to present assigned topics.
Topic presentations (10 marks) and assignments (5 marks) will carry weightage in the CWS component.
Also, feel free to ask any doubts in class; after class you can drop your query in the AIES
WhatsApp group. Kindly send the rest of your requests and queries to the CR, and refrain from individual
messages unless important. Thanks.
DO NOT UPLOAD THIS PPT ANYWHERE!
Alan Turing
The computer scored high only when the questions were formulated as queries that could be answered with "Yes" or "No",
or that related to a narrow field of knowledge.
When questions were open-ended and needed conversational answers, the computer scored lower.
PEAS is a model on which an AI agent works. When we define an AI agent or rational agent,
we can group its properties under the PEAS representation model (Performance measure, Environment,
Actuators, Sensors). Here the performance measure is the objective for the success of the agent's behaviour.
Complicated
Voluminous
Dynamic
Challenging to identify accurately
2. Inference:
1. After training, the model can take new, unseen input data and apply the learned rules or patterns to
predict an output.
Model Output: after training, the model can make predictions, classify data, or even generate new
content based on what it has learned.
Supervised learning involves learning a function that maps input data to output labels, based on a
set of input-output pairs. The model is trained on labeled data, where the correct output is provided
for each input in the training set.
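The idea above can be sketched in a few lines. This is a minimal, hypothetical example (the tiny labelled dataset is made up): a 1-nearest-neighbour classifier that learns the input-to-label mapping directly from labelled pairs.

```python
# A minimal supervised-learning sketch: a 1-nearest-neighbour classifier.
# The labelled (input, output) pairs below are made up for illustration.

training_data = [
    ([1.0, 1.0], "cat"),
    ([1.2, 0.8], "cat"),
    ([4.0, 4.2], "dog"),
    ([4.5, 3.9], "dog"),
]

def predict(x):
    """Return the label of the closest training example (Euclidean distance)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    _, label = min(training_data, key=lambda pair: dist(pair[0], x))
    return label

print(predict([1.1, 0.9]))  # close to the "cat" examples -> "cat"
print(predict([4.2, 4.0]))  # close to the "dog" examples -> "dog"
```

Every prediction is grounded in the labelled training pairs, which is exactly what distinguishes supervised learning.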
•Classification:
•Clustering:
• Objective: Group similar data points together based on some notion of similarity.
• Example: Customer segmentation, grouping similar documents.
• Algorithms: k-Means, Hierarchical Clustering, DBSCAN, Gaussian Mixture Models (GMM).
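The k-Means assign-then-update loop can be shown in a minimal sketch (k = 2, made-up 1-D data; real use would call a library implementation such as scikit-learn's KMeans):

```python
# A minimal k-means sketch (k = 2): repeatedly assign points to the nearest
# centroid, then move each centroid to the mean of its assigned points.

points = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centroids = [0.0, 10.0]          # crude initial guesses

for _ in range(10):              # a few assign/update iterations
    clusters = [[], []]
    for p in points:             # 1. assign each point to its nearest centroid
        nearest = min(range(2), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    for i in range(2):           # 2. move each centroid to its cluster mean
        if clusters[i]:
            centroids[i] = sum(clusters[i]) / len(clusters[i])

print(sorted(round(c, 2) for c in centroids))  # two well-separated centroids
```

The two centroids end up near 1.0 and 8.07, i.e. the two natural groups in the data.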
•Anomaly Detection:
• Objective: Identify rare items, events, or observations that do not conform to the general distribution of the data.
• Example: Fraud detection, network security.
• Algorithms: Isolation Forest, One-Class SVM, Autoencoders.
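As a much simpler stand-in for the algorithms named above (not Isolation Forest itself), the core idea of flagging points that don't conform to the general distribution can be sketched with z-scores on made-up data:

```python
# A minimal anomaly-detection sketch using z-scores: flag any point more than
# 2 standard deviations from the mean. The data (and the planted anomaly 25.0)
# are made up for illustration.

values = [10.1, 9.8, 10.3, 9.9, 10.0, 25.0]

mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5

anomalies = [v for v in values if abs(v - mean) / std > 2]
print(anomalies)  # only the planted outlier is flagged
```

Production systems (fraud detection, network security) use the sturdier algorithms listed above, but the goal is the same: isolate the observations that deviate from the bulk of the data.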
2. Avoids Bias:
Lack of diversity can lead to biased models.
For example, if a facial recognition system is trained only on images of people with a certain skin tone, it
may perform poorly on images of people with different skin tones.
•Specific to General: The model starts with specific instances (training data) and generalizes from these
examples to learn a broader pattern or rule.
•Learning from Observations: Inductive learning relies on learning patterns from observed data without assuming
the data fits into a pre-defined theory or structure.
How It Works
1. Training Data
2. Generalization
3. Prediction (tasks)
X Y
2 3
4 7
6 5
8 10
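The three steps above can be run on the (X, Y) table just shown: fit a line to the specific training pairs by least squares (generalization), then predict Y for an unseen X (prediction). This is a minimal sketch, not the lecture's exact working.

```python
# Specific-to-general on the table above: fit y = slope*x + intercept by
# least squares, then predict for an unseen input.

xs = [2, 4, 6, 8]
ys = [3, 7, 5, 10]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)          # 0.95 1.5
print(slope * 10 + intercept)    # prediction for the unseen input X = 10
```

The fitted rule (slope 0.95, intercept 1.5) generalizes beyond the four observed pairs, which is the essence of inductive learning.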
Goodness of Fit refers to how well a model fits the observed data. It measures the discrepancy between the
observed data and the values predicted by the model. In simple terms, it helps us assess how well the chosen
model explains the variability in the data.
X Y
1 1
2 1
3 2
4 2
5 4
Coefficient of Determination
Coefficient of Correlation
This means that 91.4% of the variability in the dependent variable Y can be explained by the independent variable X.
The remaining 8.6% is due to factors not captured by the model.
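The coefficient of determination can be computed directly from a fitted line. The sketch below uses the (X, Y) table shown earlier (X = 1..5, Y = 1, 1, 2, 2, 4); note that for that particular table the procedure gives R² ≈ 0.82, so the 91.4% figure quoted above presumably comes from a different worked example.

```python
# R^2 = 1 - SSE/SST for a least-squares line on the table
# X = 1..5, Y = 1, 1, 2, 2, 4.

xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

preds = [slope * x + intercept for x in xs]
sse = sum((y - p) ** 2 for y, p in zip(ys, preds))   # unexplained variation
sst = sum((y - mean_y) ** 2 for y in ys)             # total variation
r_squared = 1 - sse / sst

print(round(r_squared, 4))  # 0.8167
```

The same 1 − SSE/SST recipe applies to any dataset; only the numbers change.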
A residual is the individual difference between the observed value and the predicted value for each data point
in a regression model. Residuals show how much each data point deviates from the regression line.
Standard Error of the Estimate (s), or SEE:
The SEE of 0.60 means that, on average, the actual values of the dependent variable Y (observed values) differ
from the predicted values Ŷ by about 0.60 units.
In other words, the typical prediction error or residual is 0.60 units away from the actual data points.
X Y
1 2
2 3
3 5
4 4
5 6
Given: s = 0.6055
t (calculated value):
Since |3.654| > 3.182 (the critical t value for 3 degrees of freedom at the 5% level), we reject the null hypothesis: the slope is statistically significant.
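The quoted values s = 0.6055 and t ≈ 3.654 can be reproduced from the table X = 1..5, Y = 1, 1, 2, 2, 4 used in the worked example above (the small difference in the last digit of t is rounding):

```python
# Reproducing s (standard error of the estimate) and the t statistic for the
# slope, using the table X = 1..5, Y = 1, 1, 2, 2, 4.

xs = [1, 2, 3, 4, 5]
ys = [1, 1, 2, 2, 4]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x

sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
s = (sse / (n - 2)) ** 0.5        # standard error of the estimate
t = slope / (s / sxx ** 0.5)      # t statistic for the slope

print(round(s, 4), round(t, 3))   # 0.6055 3.656
```

With |t| well above the critical value 3.182 for 3 degrees of freedom, the slope is significant at the 5% level.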
Likewise :
It requires calculating the gradients over the whole dataset to perform just one update.
BGD can therefore be very slow and is intractable for datasets that don't fit in memory; it also doesn't allow us to update the
model online, i.e. BGD cannot be used on datasets that are updated continuously.
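The point about "the whole dataset per update" can be seen in a minimal sketch of batch gradient descent for a one-variable linear model (the tiny dataset is made up; with big data, the two sums inside the loop would touch every row on every single step):

```python
# Batch gradient descent for y = w*x + b with mean-squared-error loss.
# NOTE: each update computes the gradient over ALL n points (the whole batch).

xs = [2, 4, 6, 8]
ys = [3, 7, 5, 10]
n = len(xs)

w, b = 0.0, 0.0
lr = 0.01

for _ in range(5000):
    preds = [w * x + b for x in xs]
    # Gradients of the MSE, summed over the entire dataset:
    grad_w = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
    grad_b = (2 / n) * sum(p - y for p, y in zip(preds, ys))
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges to the least-squares line
```

After enough iterations the parameters converge to the closed-form least-squares solution (w ≈ 0.95, b ≈ 1.5); stochastic and mini-batch variants exist precisely to avoid the full-dataset pass per update.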
For a binary classification problem, the confusion matrix is a 2x2 matrix. It consists of four components:
False Negative:
You predicted that a woman is not pregnant but she actually is.
The above equation can be explained by saying: of all the instances we predicted as positive, how many
are actually positive.
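The four components and this precision formula can be computed directly from paired label lists (the labels below are made up for illustration):

```python
# Building a 2x2 confusion matrix and computing precision
# ("of everything predicted positive, how much was actually positive?").

actual    = [1, 0, 1, 1, 0, 1, 0, 0]   # 1 = positive class
predicted = [1, 0, 1, 0, 0, 1, 1, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)

precision = tp / (tp + fp)
print(tp, fp, fn, tn)   # 3 1 1 3
print(precision)        # 0.75
```

Here 4 instances were predicted positive and 3 of them really are, giving a precision of 3/4 = 0.75.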