ML UT 1 Merged

Machine Learning (ML) is a subset of Artificial Intelligence that enables systems to learn from data and make predictions without explicit programming. It is categorized into Supervised Learning, Unsupervised Learning, and Reinforcement Learning, each with distinct methodologies and applications. ML is widely applied across various domains such as healthcare, finance, and transportation, enhancing efficiency and decision-making through data-driven insights.

ML QB

1. Define Machine Learning. Explain how machine learning is different from conventional programming?
Machine Learning (ML) is a subset of Artificial Intelligence (AI) that enables
systems to learn from data and make decisions or predictions without being
explicitly programmed. It involves training models on data to identify patterns
and improve performance over time based on experience.
Difference Between Machine Learning and Conventional Programming:
• Approach – ML: learns patterns from data and improves performance over time. Conventional: follows explicitly written rules and logic.
• Data Dependency – ML: requires large amounts of data for training. Conventional: works based on predefined rules, less reliant on data.
• Flexibility – ML: can adapt and improve with more data. Conventional: fixed rules; requires manual updates.
• Handling Complexity – ML: suitable for complex tasks like image recognition, NLP, and recommendation systems. Conventional: works well for rule-based systems and deterministic tasks.
• Example – ML: predicting diseases from patient records using ML algorithms. Conventional: writing if-else logic to classify a patient's condition.
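To make the Example row concrete, here is a minimal sketch contrasting a hand-written rule with a model that learns its decision boundary from data. The glucose threshold of 126 mg/dL and the tiny dataset are illustrative assumptions, and scikit-learn is assumed to be available.

from sklearn.tree import DecisionTreeClassifier

# Conventional programming: the rule is written by hand.
def classify_patient_rule(glucose_mg_dl):
    return "Diabetic" if glucose_mg_dl >= 126 else "Non-Diabetic"  # fixed, manually chosen threshold

# Machine learning: the rule is learned from labelled examples (toy data).
X = [[90], [110], [130], [150], [100], [160]]   # glucose readings
y = ["Non-Diabetic", "Non-Diabetic", "Diabetic", "Diabetic", "Non-Diabetic", "Diabetic"]
model = DecisionTreeClassifier().fit(X, y)

print(classify_patient_rule(140))      # rule-based answer
print(model.predict([[140]])[0])       # data-driven answer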

2. What are the types of Machine Learning? Explain the types in brief with
examples?
Machine Learning is broadly categorized into three types:
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
1. Supervised Learning
In supervised learning, the model is trained using labeled data, meaning each
input has a corresponding correct output. The model learns the mapping
between inputs and outputs.
Examples:
• Regression: Predicting house prices based on features like size, location,
and number of rooms. (Algorithm: Linear Regression)
• Classification: Identifying whether an email is spam or not. (Algorithm:
Decision Tree, SVM, Neural Networks)
Use Cases:
• Fraud detection
• Disease prediction
• Sentiment analysis
2. Unsupervised Learning
In unsupervised learning, the model is trained on unlabeled data. The system
learns patterns and relationships in the data without explicit supervision.
Examples:
• Clustering: Grouping customers based on purchasing behavior.
(Algorithm: K-Means Clustering)
• Dimensionality Reduction: Compressing high-dimensional data while
retaining meaningful patterns. (Algorithm: PCA – Principal Component
Analysis)
Use Cases:
• Customer segmentation
• Market basket analysis
• Anomaly detection (e.g., fraud detection in banking)
3. Reinforcement Learning (RL)
In reinforcement learning, an agent learns by interacting with an environment
and receiving rewards or penalties based on its actions. The goal is to maximize
cumulative rewards over time.
Examples:
• Game Playing: AI mastering chess or Go. (Algorithm: Deep Q-Networks,
Policy Gradient)
• Robotics: Training robots to walk or perform tasks autonomously.
Use Cases:
• Self-driving cars
• Automated trading
• Personalized recommendations

3. Explain Supervised Learning with examples.


Supervised Learning is a type of Machine Learning where the model is trained
on labelled data. This means that each input (features) has a corresponding
correct output (labels). The model learns the relationship between inputs and
outputs to make predictions on new data.
Types of Supervised Learning
Supervised Learning is divided into two main types:
1. Regression – Predicting continuous values
2. Classification – Predicting discrete categories

1. Regression (Continuous Output)


Regression is used when the output is a continuous value, such as predicting
prices, temperature, or stock prices.
Examples:
• House Price Prediction:
o Input (Features): Size of the house, number of rooms, location, etc.
o Output (Label): Predicted price of the house.
o Algorithm Used: Linear Regression, Decision Tree Regression
• Weather Prediction:
o Input: Temperature, humidity, wind speed
o Output: Expected rainfall in mm
o Algorithm Used: Multiple Linear Regression

2. Classification (Categorical Output)


Classification is used when the output belongs to predefined categories, such as
"Spam or Not Spam," "Disease or No Disease," etc.
Examples:
• Email Spam Detection:
o Input: Email content, subject line, sender details
o Output: "Spam" or "Not Spam"
o Algorithm Used: Naïve Bayes, Random Forest
• Disease Prediction (e.g., Diabetes Detection):
o Input: Blood sugar level, age, BMI
o Output: "Diabetic" or "Non-Diabetic"
o Algorithm Used: Logistic Regression, Decision Tree

How Supervised Learning Works


1. Training Phase: The model learns patterns from labelled training data.
2. Testing Phase: The model is evaluated on unseen test data.
3. Prediction: The trained model predicts outcomes for new inputs.
Key Algorithms Used in Supervised Learning
• Regression: Linear Regression, Decision Tree Regression
• Classification: Decision Tree, SVM, Naïve Bayes, Random Forest
Supervised Learning is widely used in areas like fraud detection, speech
recognition, recommendation systems, and medical diagnosis.
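As a rough illustration of the three phases above, the following sketch (assuming scikit-learn is available; the Iris dataset is used only as a stand-in for labelled data) trains a classifier, tests it on held-out data, and predicts on a new input.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Labelled data: features X and known outputs y
X, y = load_iris(return_X_y=True)

# 1. Training phase: learn patterns from labelled training data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = DecisionTreeClassifier().fit(X_train, y_train)

# 2. Testing phase: evaluate on unseen test data
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 3. Prediction: classify a new, unlabelled sample
print("Predicted class:", model.predict([[5.1, 3.5, 1.4, 0.2]])[0])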

4. Explain Unsupervised Learning with examples.


Unsupervised Learning is a type of Machine Learning where the model is
trained on unlabelled data. The system tries to learn patterns, structures, or
relationships within the data without predefined labels.
Types of Unsupervised Learning
Unsupervised Learning is mainly categorized into:
1. Clustering – Grouping similar data points
2. Association Rule Learning – Finding relationships between data items
3. Dimensionality Reduction – Reducing the number of features while
preserving information

1. Clustering (Grouping Similar Data)


Clustering is used to segment data into meaningful groups based on similarity.
Examples:
• Customer Segmentation:
o Input: Customer purchase history, demographics, browsing
behaviour
o Output: Groups customers into clusters (e.g., Budget Shoppers,
Premium Buyers)
o Algorithm Used: K-Means, Hierarchical Clustering
• Image Segmentation:
o Input: Pixels of an image
o Output: Groups similar pixel regions (used in object detection)
o Algorithm Used: K-Means

2. Association Rule Learning (Finding Relationships in Data)


This technique discovers relationships between variables in large datasets.
Examples:
• Market Basket Analysis:
o Input: Transaction data from a supermarket
o Output: Identifies which products are frequently bought together
(e.g., people who buy bread often buy butter)
o Algorithm Used: Apriori, FP-Growth
• Movie Recommendation Systems:
o Input: User’s past watch history
o Output: Suggests movies based on frequently watched genres
o Algorithm Used: Apriori, Association Rules

3. Dimensionality Reduction (Feature Reduction)


This method reduces the number of features while preserving essential
information.
Examples:
• Image Compression:
o Input: High-dimensional image data
o Output: Compressed images with minimal loss of quality
o Algorithm Used: PCA (Principal Component Analysis)
• Text Analysis:
o Input: Large text documents
o Output: Extracts key topics and removes redundant information
o Algorithm Used: t-SNE, PCA

How Unsupervised Learning Works


1. The model analyzes raw data without labelled outputs.
2. It finds hidden patterns and groups data points based on similarities.
3. The results help in decision-making, insights, and automation.

Key Algorithms Used in Unsupervised Learning


• Clustering: K-Means, DBSCAN, Hierarchical Clustering
• Association Rule Learning: Apriori, FP-Growth
• Dimensionality Reduction: PCA, t-SNE
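As a small illustration of clustering on unlabelled data, the sketch below (assuming scikit-learn; the two-feature customer values are invented for illustration) lets K-Means find two groups on its own:

import numpy as np
from sklearn.cluster import KMeans

# Unlabelled data: [annual spend, visits per month] for a few customers (toy values)
X = np.array([[200, 2], [250, 3], [220, 2],
              [1500, 12], [1600, 15], [1450, 11]])

# Ask K-Means to find 2 groups; no labels are provided
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print("Cluster assignments:", kmeans.labels_)      # e.g. budget vs premium shoppers
print("Cluster centres:\n", kmeans.cluster_centers_)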

5. Explain Reinforcement Learning with examples.


Reinforcement Learning (RL) is a type of Machine Learning where an agent
learns by interacting with an environment and receiving rewards or penalties
based on its actions. The goal of the agent is to maximize cumulative rewards
over time by improving its decision-making strategy.
Key Components of Reinforcement Learning
1. Agent – The learner or decision-maker (e.g., a robot, AI model).
2. Environment – The world in which the agent interacts (e.g., a game, real-
world system).
3. Actions (A) – The possible moves the agent can take.
4. State (S) – The current situation or condition of the agent in the
environment.
5. Reward (R) – Feedback from the environment that guides learning
(positive or negative).
6. Policy (π) – The strategy that the agent follows to decide actions.

How Reinforcement Learning Works


1. The agent takes an action (A) in the environment.
2. The environment responds by providing a new state (S') and a reward
(R).
3. The agent learns from this reward and updates its policy (π).
4. The process repeats until the agent learns an optimal strategy.

Types of Reinforcement Learning


1. Positive Reinforcement: Reward is given for good actions, encouraging
repetition.
2. Negative Reinforcement: A penalty is applied for bad actions,
discouraging them.

Examples of Reinforcement Learning


1. Game Playing (AI in Games)
• Example: AlphaGo and Chess-playing AI
• How it works:
o The agent plays a game and learns the best moves by trial and
error.
o It receives positive rewards for winning moves and negative
rewards for losing moves.
o The AI eventually masters the game through repeated play.
• Algorithms Used: Deep Q-Network (DQN), Policy Gradient

2. Self-Driving Cars
• Example: Tesla’s Autonomous Driving AI
• How it works:
o The car (agent) interacts with the road (environment).
o It gets rewards for following lanes and penalties for crossing lines
or hitting obstacles.
o The AI continuously improves by learning the best driving actions.
• Algorithms Used: Deep Q-Learning, Reinforcement Learning with Neural
Networks

3. Robotics
• Example: Teaching a robotic arm to pick and place objects
• How it works:
o The robot learns by trying different movements.
o It gets rewards for successfully picking up and placing an object.
o Over time, it learns the most efficient way to perform the task.
• Algorithms Used: Q-Learning, Actor-Critic Models

4. Personalized Recommendations
• Example: Netflix and YouTube video recommendations
• How it works:
o The system observes user behavior (watch history, clicks).
o It gets rewards if a user watches a recommended video and
penalties if the user skips it.
o Over time, the AI learns to suggest better videos.
• Algorithms Used: Multi-Armed Bandit, Deep Reinforcement Learning

Key Algorithms in Reinforcement Learning


1. Q-Learning – Learns the best action for each state.
2. Deep Q-Networks (DQN) – Uses deep learning to approximate Q-values.
3. Policy Gradient Methods – Directly optimizes the policy for decision-
making.
4. Actor-Critic Methods – Combines value-based and policy-based learning.
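The agent-environment loop and the Q-Learning update can be sketched with tabular Q-learning on a tiny made-up corridor environment; the states, rewards, and hyperparameters below are illustrative assumptions, not part of the question bank.

import random

# Toy corridor: states 0..4, reaching state 4 gives reward +1, each step costs -0.01
n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount, exploration rate

for episode in range(500):
    s = 0
    while s != 4:
        # epsilon-greedy action selection (explore vs exploit)
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda act: Q[s][act])
        s_next = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s_next == 4 else -0.01
        # Q-learning update: move Q(s,a) toward reward + discounted best future value
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

print("Learned policy:", ["right" if Q[s][1] > Q[s][0] else "left" for s in range(4)])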

6. Write applications of machine learning in different domains. Elaborate with an example how machine learning is useful in solving the problem?
Machine Learning (ML) is transforming various industries by automating tasks,
improving efficiency, and enabling data-driven decision-making. Below are
some key domains where ML is widely applied, along with detailed examples of
how it helps solve real-world problems.

1. Healthcare
Application: Disease Diagnosis and Prediction
Example: Predicting Diabetes using ML
• Problem: Early diagnosis of diabetes is crucial, but traditional methods
require extensive tests.
• ML Solution: ML models like Logistic Regression, Decision Trees, and
Neural Networks analyze patient data (age, BMI, glucose levels) to
predict diabetes risk.
• Impact:
o Early detection enables timely treatment.
o Reduces the need for expensive and time-consuming tests.
Other Use Cases:
• Cancer detection (using Deep Learning on medical images).
• Personalized medicine recommendations.
• AI-powered drug discovery.

2. Finance
Application: Fraud Detection in Banking
Example: Detecting Credit Card Fraud
• Problem: Fraudsters use stolen credit cards for unauthorized
transactions.
• ML Solution: Algorithms like Random Forest and Neural Networks
analyze transaction patterns, location, and spending behavior to detect
anomalies.
• Impact:
o Identifies fraudulent transactions in real-time.
o Reduces financial losses for banks and customers.
Other Use Cases:
• Stock price prediction using ML models.
• Customer credit risk analysis for loan approval.
• Automated financial portfolio management.

3. Retail and E-commerce


Application: Product Recommendation Systems
Example: Amazon’s Personalized Recommendations
• Problem: Customers struggle to find relevant products among millions of
choices.
• ML Solution: Algorithms like Collaborative Filtering and Content-Based
Filtering analyze past purchases, browsing history, and user preferences
to recommend products.
• Impact:
o Increases sales and customer satisfaction.
o Drives user engagement by suggesting personalized content.
Other Use Cases:
• Dynamic pricing (adjusting prices based on demand and trends).
• Customer sentiment analysis from reviews.
• Inventory management using demand forecasting.

4. Manufacturing and Industry 4.0


Application: Predictive Maintenance
Example: Machine Failure Prediction in Factories
• Problem: Unexpected machine failures cause production downtime and
financial losses.
• ML Solution: Time Series Analysis and Deep Learning models analyze
sensor data to predict when machines might fail.
• Impact:
o Reduces unplanned downtime.
o Optimizes maintenance schedules, saving costs.
Other Use Cases:
• Supply chain optimization using ML forecasts.
• Quality control in manufacturing.
• Automated defect detection using image processing.

5. Agriculture
Application: Crop Recommendation System
Example: AI-powered Crop Selection
• Problem: Farmers struggle to choose the right crops based on soil,
climate, and water availability.
• ML Solution: ML models like Decision Trees and Random Forest analyze
soil pH, temperature, and rainfall data to recommend the best crops.
• Impact:
o Increases agricultural yield.
o Reduces wastage of resources like water and fertilizers.
Other Use Cases:
• Pest and disease detection using image recognition.
• Smart irrigation systems using ML-based weather forecasting.
• Yield prediction based on climate and soil conditions.

6. Transportation and Autonomous Vehicles


Application: Self-Driving Cars
Example: Tesla’s Autopilot System
• Problem: Human errors cause accidents, traffic congestion, and
inefficiencies.
• ML Solution: Reinforcement Learning and Computer Vision help self-
driving cars recognize obstacles, pedestrians, and road signs in real-time.
• Impact:
o Reduces road accidents.
o Improves traffic efficiency and fuel consumption.
Other Use Cases:
• Route optimization for delivery services (Google Maps, Uber).
• Traffic prediction using ML-powered analytics.
• Automated drone navigation for deliveries.

7. Entertainment and Media


Application: Personalized Content Recommendation
Example: Netflix Movie Recommendations
• Problem: Users find it difficult to discover relevant content from vast
libraries.
• ML Solution: Collaborative Filtering and Deep Learning analyze viewing
history and preferences to recommend TV shows and movies.
• Impact:
o Enhances user engagement and retention.
o Increases watch time and subscription renewals.
Other Use Cases:
• AI-generated music and art.
• Automatic video summarization.
• Deepfake detection in media.
8. Cybersecurity
Application: Malware Detection and Prevention
Example: AI-based Antivirus Systems
• Problem: Traditional antivirus software struggles to detect new and
evolving threats.
• ML Solution: Anomaly Detection and Neural Networks identify
suspicious behaviors in files and processes to detect malware.
• Impact:
o Improves cybersecurity defense mechanisms.
o Detects zero-day attacks before they spread.
Other Use Cases:
• AI-driven phishing email detection.
• Real-time intrusion detection in networks.
• Biometric authentication security enhancements.

9. Education and E-Learning


Application: AI-powered Tutoring Systems
Example: Personalized Learning on Khan Academy
• Problem: Students have different learning speeds and needs.
• ML Solution: Natural Language Processing (NLP) and Adaptive Learning
Models provide customized lessons based on student performance.
• Impact:
o Enhances learning experiences.
o Helps teachers identify struggling students faster.
Other Use Cases:
• Automated essay grading using NLP.
• AI-powered chatbots for student queries.
• Smart content generation for adaptive learning.

10. Smart Assistants and Chatbots


Application: AI-powered Virtual Assistants
Example: Google Assistant and Alexa
• Problem: Users need hands-free, voice-based assistance for tasks.
• ML Solution: Speech Recognition and NLP allow virtual assistants to
understand and respond to human speech.
• Impact:
o Improves accessibility and convenience.
o Enhances automation in smart homes.
Other Use Cases:
• AI chatbots for customer support.
• Smart home automation using voice commands.
• AI-driven email response suggestions.

7. Differentiate between Supervised and Unsupervised Learning?


8. Differentiate between the Supervised, Unsupervised and Reinforcement
Learning with example?
9. What type of machine learning problem is,
a) Predicting the survival of a passenger in the Titanic disaster
b) Recognizing handwritten digit
c) Forecasting sales for next 6 months for D-Mart
d) Suggesting songs on Spotify
e) Identifying a fraudulent transaction
10. Explain ROC and AUC curve in ML with examples?
ROC and AUC Curve in Machine Learning
1. ROC (Receiver Operating Characteristic) Curve
The ROC curve is a graphical representation used to evaluate the performance
of a classification model at different threshold values. It plots:
• True Positive Rate (TPR) (Sensitivity/Recall) on the Y-axis
• False Positive Rate (FPR) on the X-axis
The curve shows how well the model distinguishes between classes by varying
the decision threshold.
2. AUC (Area Under the Curve)
The AUC measures the overall ability of the model to separate positive and
negative classes.
• AUC value ranges from 0 to 1:
o AUC = 1: Perfect classifier
o AUC > 0.9: Excellent model
o AUC = 0.5: Random guess (No discrimination)
o AUC < 0.5: Poor model (worse than random guessing)
3. How ROC-AUC Helps in Model Evaluation
• A higher AUC means the model is better at distinguishing between
positive and negative classes.
• It is useful when the dataset is imbalanced because it considers both TPR
and FPR.
4. Example: Spam Email Detection
Imagine a binary classification model that predicts whether an email is spam
(1) or not spam (0).
• If the threshold is high, fewer emails are marked as spam (low FPR, but
may miss actual spam emails).
• If the threshold is low, more emails are marked as spam (higher TPR, but
may have more false positives).
• The ROC curve helps find the best trade-off.
• The AUC score tells us how well the model performs across all
thresholds.
5. Conclusion
• ROC Curve: Visualizes the trade-off between TPR and FPR.
• AUC Score: A single metric to compare model performance.
• Higher AUC = Better Classification Performance.
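As a small illustration, the sketch below (assuming scikit-learn; the labels and predicted spam probabilities are made-up numbers) computes the ROC points and the AUC score for a toy spam classifier:

from sklearn.metrics import roc_curve, roc_auc_score

# True labels (1 = spam, 0 = not spam) and the model's predicted spam probabilities (toy values)
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7]

fpr, tpr, thresholds = roc_curve(y_true, y_score)   # one (FPR, TPR) point per threshold
auc = roc_auc_score(y_true, y_score)                # area under that curve

for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
print("AUC =", round(auc, 3))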

11. Define
a) Accuracy
Definition:
Accuracy is the ratio of correctly predicted instances to the total instances in
the dataset.
Formula:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
where:
• TP (True Positives): Correctly predicted positive cases
• TN (True Negatives): Correctly predicted negative cases
• FP (False Positives): Incorrectly predicted positive cases
• FN (False Negatives): Incorrectly predicted negative cases
Example:
If an email spam classifier correctly classifies 90 out of 100 emails, the accuracy
is 90%.
Limitation:
Accuracy is not reliable for imbalanced datasets (e.g., predicting rare diseases).

b) Precision (Positive Predictive Value - PPV)


Definition:
Precision measures how many of the predicted positive instances were actually
correct.
Formula:
Precision = TP / (TP + FP)
Example:
If a spam classifier predicts 100 emails as spam, but only 80 are actually spam,
the precision is 80%.
Use Case:
Useful when false positives need to be minimized (e.g., fraud detection).

c) Recall (Sensitivity / True Positive Rate - TPR)


Definition:
Recall measures how many actual positive cases were correctly identified by
the model.
Formula:
Recall = TP / (TP + FN)
Example:
If there are 100 spam emails in total, and the model correctly identifies 80 but
misses 20, recall is 80%.
Use Case:
Important when false negatives are costly (e.g., medical diagnosis).

d) F1-Score
Definition:
F1-score is the harmonic mean of precision and recall, balancing both.
Formula:
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
Example:
If precision = 80% and recall = 60%, the F1-score is:
F1 = 2 × (0.8 × 0.6) / (0.8 + 0.6) ≈ 0.686
Use Case:
Useful when both false positives and false negatives matter (e.g., spam
detection).
e) Specificity (True Negative Rate - TNR)
Definition:
Specificity measures how well the model identifies actual negative cases.
Formula:
Specificity = TN / (TN + FP)
Example:
If there are 100 non-spam emails, and the model correctly classifies 90 of
them, specificity is 90%.
Use Case:
Important when false positives need to be minimized (e.g., avoiding wrongful
arrests in crime detection).
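All five metrics follow directly from the confusion-matrix counts. The sketch below uses assumed counts (TP = 80, FP = 20, FN = 20, TN = 90), loosely matching the spam examples in this answer:

# Assumed confusion-matrix counts for a spam classifier (illustrative only)
TP, FP, FN, TN = 80, 20, 20, 90

accuracy    = (TP + TN) / (TP + TN + FP + FN)
precision   = TP / (TP + FP)
recall      = TP / (TP + FN)            # sensitivity / TPR
f1_score    = 2 * precision * recall / (precision + recall)
specificity = TN / (TN + FP)            # TNR

print(f"Accuracy={accuracy:.2f}  Precision={precision:.2f}  Recall={recall:.2f}")
print(f"F1={f1_score:.2f}  Specificity={specificity:.2f}")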

12. Explain Confusion Matrix with an example?


13. Write a short note on learning rate. Explain how it affects convergence
with examples?
Learning Rate in Gradient Descent
The learning rate (α) is a hyperparameter in gradient descent that controls the step size of parameter updates. It determines how quickly or slowly the algorithm moves toward the optimal solution of the cost function.

Effect of Learning Rate on Convergence

1. Small Learning Rate (α is too low)
o Steps are very small, leading to slow convergence.
o The algorithm may take too long to reach the minimum.
o Risk of getting stuck in local variations of the cost function.
Example: If α = 0.0001, it may take thousands of iterations to converge.
Visualization: Slow movement → ● ● ● ● ● ● ● ● (Minimum)
2. Optimal Learning Rate (α is just right)
o The algorithm moves efficiently toward the minimum.
o It converges in a reasonable number of iterations.
Example: If α = 0.01, it reaches the minimum quickly without overshooting.
Visualization: Fast but stable descent → ● ● ● ● ● (Minimum)
3. Large Learning Rate (α is too high)
o Steps are too large, causing the algorithm to overshoot the minimum.
o It may fail to converge or oscillate indefinitely.
Example: If α = 1, the cost function value may jump around without settling.
Visualization: Oscillations → ● ● ● ● ● (Diverges)

Choosing the Right Learning Rate


• Start with a small value and increase if convergence is slow.
• Use adaptive learning rates (e.g., Adam optimizer) for better control.
• Plot the cost function vs. iterations to detect issues like divergence or
slow learning.
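A minimal sketch of these three regimes, assuming the simple cost function J(θ) = θ² with gradient 2θ (the exact values that count as "small" or "large" depend on the cost function, so the rates below differ slightly from the ones quoted above):

def gradient_descent(alpha, theta=5.0, iterations=25):
    """Minimize J(theta) = theta**2, whose gradient is 2*theta."""
    for _ in range(iterations):
        theta = theta - alpha * 2 * theta    # step against the gradient
    return theta

for alpha in (0.001, 0.1, 1.1):              # too small, reasonable, too large
    print(f"alpha={alpha}: theta after 25 steps = {gradient_descent(alpha):.4f}")
# alpha=0.001 barely moves, alpha=0.1 converges toward 0, alpha=1.1 overshoots and diverges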

14. Write a short note on learning rate (α) in linear regression. How is a
trained model debugged with respect to the learning rate?
The learning rate (α) in linear regression is a crucial hyperparameter in gradient descent, controlling how much the model's parameters are updated at each iteration. It determines the step size towards minimizing the cost function J(θ).
• A small α results in slow convergence, requiring more iterations to reach the minimum.
• A large α can cause overshooting or divergence, preventing the model from finding the optimal solution.
• An optimal α allows for fast and stable convergence to the global minimum.
Debugging a Trained Model with Respect to Learning Rate
To debug a trained model based on α, check:
1. Cost Function Plot
o If J(θ) decreases smoothly → α is good.
o If J(θ) decreases too slowly → α is too small.
o If J(θ) oscillates or increases → α is too large.
2. Convergence Behavior
o Slow learning: Increase α slightly.
o Divergence: Decrease α to stabilize training.
3. Use Adaptive Learning Rates
o Optimizers like Adam or RMSprop adjust α dynamically for efficient training.

15. Explain the decision tree in brief?


Decision Tree: A Brief Explanation
A Decision Tree is a supervised learning algorithm used for both classification
and regression tasks. It mimics human decision-making by splitting data into
branches based on feature values.
1. Structure of a Decision Tree
A decision tree consists of:
• Root Node: The starting point containing the entire dataset.
• Internal Nodes: Represent decisions based on feature conditions.
• Branches: Paths from one node to another based on conditions.
• Leaf Nodes: Final output (class label or predicted value).
2. Working of a Decision Tree
• The dataset is split recursively based on the feature that best separates
the data (using criteria like Gini impurity or entropy).
• This process continues until all data is classified or a stopping condition
is met (e.g., max depth).
3. Advantages of Decision Trees
• Easy to interpret & visualize
• Handles both numerical & categorical data
• No need for feature scaling

4. Disadvantages & Solutions
• Overfitting (solved by pruning)
• Biased if class distribution is imbalanced
16. How does CART solve the regression problem?
The goal of a regression tree is to split the dataset into smaller subsets by minimizing the
variance of the target variable in each subset.
Steps in Regression Tree Construction:
1. Select the Best Feature & Split Point
o The dataset is split based on an input feature and a threshold value that
minimizes variance.
o The split is chosen to minimize the mean squared error (MSE) or variance.
2. Continue Splitting Until Stopping Condition is Met
o Stopping criteria could be maximum depth, minimum samples per leaf, or
variance threshold.
3. Final Prediction at Leaf Nodes
o The prediction for a new input is the average of all training samples in that
leaf node.

2. Splitting Criteria: Minimizing Variance


Regression Trees use Variance Reduction or Mean Squared Error (MSE) to decide the best
split.
The variance of the target values in a node is:
Variance = (1/n) * Σ_{i=1 to n} (y_i − ȳ)^2
where:
• y_i = actual value
• ȳ = mean of the target values in the node
• n = number of samples in the node
At each split, the variance reduction is computed as:
Variance Reduction = Variance(parent) − Σ ( |child| / |parent| × Variance(child) )
where |child| and |parent| are the numbers of samples in the child and parent nodes. The split that maximizes variance reduction is chosen.

3. Example: Predicting House Prices Using a Regression Tree


Dataset:
Size (sq.ft) | Bedrooms | Price ($)
1000 | 2 | 200,000
1200 | 3 | 250,000
1500 | 3 | 270,000
1800 | 4 | 320,000
2000 | 4 | 340,000
Decision Tree Splitting Example:
1. First Split: Based on House Size (sq.ft)
o If size < 1500, go left
o If size ≥ 1500, go right
2. Second Split (If Needed)
o Further splits may occur based on Bedrooms or other features.
Final Prediction:
• If a new house (1300 sq.ft, 3 bedrooms) is evaluated, it will fall into the left branch,
and the model will predict the average price of houses in that region.

4. Advantages of Regression Trees (CART)


• Handles non-linearity well (unlike Linear Regression)
• Automatically selects important features
• Can model interactions between variables
5. Limitations & Solutions
• Overfitting → Use pruning or set a minimum leaf size
• Sensitive to outliers → Use ensemble methods (Random Forest, Gradient Boosting)
6. Conclusion
CART solves regression problems by:
• Splitting data based on variance reduction.
• Predicting continuous values at leaf nodes.
• Providing interpretable models for decision-making.
Regression Trees are widely used in finance (stock price prediction), real estate (house
prices), and healthcare (predicting patient recovery time).
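A minimal regression-tree sketch (assuming scikit-learn) on the small house-price table above; the prediction for a new 1300 sq.ft house is the average price of the training houses that land in the same leaf:

from sklearn.tree import DecisionTreeRegressor

# Dataset from the example above: [size in sq.ft, bedrooms] -> price ($)
X = [[1000, 2], [1200, 3], [1500, 3], [1800, 4], [2000, 4]]
y = [200_000, 250_000, 270_000, 320_000, 340_000]

# CART regression tree: splits are chosen to minimize MSE/variance in each child node
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

print("Predicted price for 1300 sq.ft, 3 bedrooms:", tree.predict([[1300, 3]])[0])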
ML Test 2
MOD 6 –
Q. Write a short note on dimensionality reduction?
Dimensionality Reduction is a technique used in data science and machine learning to reduce
the number of input variables or features in a dataset while retaining as much relevant
information as possible. High-dimensional data can be complex, computationally expensive,
and prone to overfitting—this is known as the curse of dimensionality.
There are two main types of dimensionality reduction:
1. Feature Selection – Choosing a subset of the most important features from the original
dataset.
2. Feature Extraction – Transforming the data into a lower-dimensional space (e.g., using
methods like Principal Component Analysis (PCA) or t-SNE).
Benefits:
• Improves model performance and training speed.
• Reduces storage and memory requirements.
• Makes data visualization easier (especially for 2D or 3D plotting).
• Helps eliminate noise and redundant features.
Overall, dimensionality reduction simplifies models, enhances interpretability, and improves
generalization.

Q. Define dimensionality reduction. Write advantages of dimensionality reduction?


Dimensionality Reduction is the process of reducing the number of input variables or features
in a dataset while preserving the essential information. It involves transforming data from a
high-dimensional space into a lower-dimensional space to simplify analysis, improve
efficiency, and reduce noise.
Advantages of Dimensionality Reduction:
1. Reduces Overfitting: By removing irrelevant or redundant features, it helps in building
more generalizable models.
2. Improves Model Performance: Fewer input features often lead to faster training and
testing of machine learning models.
3. Enhances Visualization: High-dimensional data can be reduced to 2D or 3D, making it
easier to visualize and understand patterns.
4. Decreases Computational Cost: Less data means less memory usage and faster
computations.
5. Removes Multicollinearity: Helps eliminate highly correlated features that can
negatively impact model performance.
6. Simplifies Models: Makes models easier to interpret and maintain.
Q. Write a short note on Principal Component Analysis?
Principal Component Analysis (PCA) is a popular dimensionality reduction technique used in
data analysis and machine learning. It transforms the original features of a dataset into a new
set of uncorrelated variables called principal components. These components are ordered by
the amount of variance they capture from the data.
The first principal component captures the maximum variance, the second captures the next
highest variance, and so on. PCA helps reduce the number of dimensions while preserving the
most important information.
Key Points:
• PCA is an unsupervised linear transformation technique.
• It is commonly used for noise reduction, visualization, and speeding up machine
learning algorithms.
• It works by computing the eigenvectors and eigenvalues of the data's covariance
matrix.
Applications of PCA:
• Image compression
• Pattern recognition
• Exploratory data analysis
• Preprocessing for machine learning models
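A minimal PCA sketch (assuming scikit-learn; the Iris data is just a convenient 4-feature stand-in) that reduces the data to two principal components and reports the variance each one captures:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)              # 150 samples x 4 features
X_scaled = StandardScaler().fit_transform(X)   # PCA is sensitive to feature scale

pca = PCA(n_components=2)                      # keep the top-2 principal components
X_2d = pca.fit_transform(X_scaled)

print("Reduced shape:", X_2d.shape)                            # (150, 2)
print("Variance explained:", pca.explained_variance_ratio_)    # share of variance per component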

Q. What is the steepest descent method? Explain with an example. (NB)


The Steepest Descent Method (also known as Gradient Descent) is an iterative optimization
algorithm used to minimize a function by moving in the direction of the negative gradient
(i.e., the direction of steepest descent).
It is widely used in machine learning and numerical optimization to find the minimum of a
function.
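Since the question asks for an example, here is a worked sketch under the assumption f(x, y) = x^2 + y^2 (a simple convex bowl): the update rule is (x, y) ← (x, y) − α∇f(x, y), and repeated steps move the point toward the minimum at (0, 0).

# Steepest descent on f(x, y) = x**2 + y**2, whose gradient is (2x, 2y)
alpha = 0.1                      # step size (learning rate)
x, y = 4.0, 3.0                  # starting point

for k in range(20):
    grad_x, grad_y = 2 * x, 2 * y
    x, y = x - alpha * grad_x, y - alpha * grad_y    # move against the gradient
    if k % 5 == 0:
        print(f"step {k:2d}: x={x:.4f}, y={y:.4f}, f={x**2 + y**2:.6f}")
# The iterates approach the minimizer (0, 0), where the gradient vanishes.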

MOD 5 –
Q. Define optimization. Explain the different types of optimizations?
Optimization is the process of finding the best solution from all feasible solutions for a given
problem. It involves maximizing or minimizing an objective function by systematically
choosing input values from a defined set and computing the corresponding output.
Mathematically, it is expressed as:
Minimize or Maximize: f(x), subject to constraints
Where:
• f(x) is the objective function,
• x is the variable or vector of variables.
Types of Optimizations:
1. Linear Optimization (Linear Programming)
• Objective function and constraints are linear.
• Example:
Maximize z = 3x + 2y, subject to x + y ≤ 4 and x, y ≥ 0
2. Non-Linear Optimization
• Either the objective function or constraints (or both) are non-linear.
• Used in complex real-world problems like portfolio optimization, machine learning,
etc.
3. Unconstrained Optimization
• No constraints on variables.
• Example: Minimize f(x) = x^2 + 2x + 3
4. Constrained Optimization
• Includes equality or inequality constraints.
• Example: Minimize f(x) = x^2, subject to x ≥ 1
5. Convex Optimization
• The objective function is convex and the feasible region is a convex set.
• Global minimum is guaranteed.
• Common in machine learning and control systems.
6. Combinatorial Optimization
• Deals with discrete variables and finding the best combination (e.g., scheduling,
traveling salesman problem).
• Often solved using heuristics or approximation algorithms.
7. Stochastic Optimization
• Involves randomness or uncertainty in data or model.
• Used in scenarios where exact information is not available (e.g., financial modelling,
reinforcement learning).
8. Multi-objective Optimization
• More than one objective function to optimize simultaneously.
• Trade-offs are considered to find Pareto optimal solutions.

Q. Write a short note on derivative free optimization methods?


Derivative-Free Optimization (DFO) methods are optimization techniques used when the
derivatives of the objective function are unavailable, unreliable, or expensive to compute.
These methods are ideal for problems where the function is noisy, discontinuous, or computed
through simulations or black-box models.
Key Characteristics:
• Do not require gradients or Hessians.
• Useful for non-differentiable or complex functions.
• Work well for black-box or simulation-based optimization.
Common Derivative-Free Methods:
1. Genetic Algorithms (GA):
Inspired by natural selection; uses crossover, mutation, and selection.
2. Particle Swarm Optimization (PSO):
Swarm-based method that adjusts solutions based on the best-found positions.
3. Nelder-Mead Method (Simplex):
Uses a simplex of points to search for the minimum without derivatives.
4. Simulated Annealing:
Probabilistic technique that allows occasional uphill moves to escape local minima.
5. Random Search:
Tries random points in the domain and selects the best.
Applications:
• Hyperparameter tuning in machine learning
• Engineering design
• Simulation-based optimization
• Optimization under uncertainty
Advantages:
• Can handle noisy, non-smooth, or black-box functions.
• Simple to implement.
• No need for analytical gradients.

Q. List advantages and disadvantages of derivative based optimization techniques.


These are optimization methods that use the derivative (or gradient) of a function to guide
the search for a minimum or maximum. Examples include:
• Gradient Descent
• Newton's Method
• Conjugate Gradient
• Quasi-Newton Methods (like BFGS, L-BFGS)
These methods rely on information like:
• First derivatives (gradients)
• Sometimes second derivatives (Hessians) for faster convergence
Q. State differences between derivative based and derivative free methods.
Q. Write a short note on Random Search Method and Newton’s method.
Random Search Method
The Random Search Method is a derivative-free optimization technique used for finding the
global minimum or maximum of an objective function. It works by randomly sampling points
in the search space and evaluating the objective function at these points.
Key Features:
• Does not require gradient information.
• Suitable for non-differentiable, discontinuous, or black-box functions.
• Simple to implement and useful for global search, especially when the search space is
large or complex.
Limitations:
• Can be inefficient and slow, especially for high-dimensional problems.
• No guarantee of finding the global optimum, but good for exploration.
Newton’s Method
Newton’s Method is a second-order derivative-based optimization technique used to find
local minima or maxima of a function. It uses both the gradient (first derivative) and the
Hessian matrix (second derivative) to iteratively update the variable values.
Update Formula:
x_{k+1} = x_k − [H(x_k)]^(-1) ∇f(x_k)
where ∇f(x_k) is the gradient and H(x_k) is the Hessian matrix at the current point x_k.
Key Features:
• Fast convergence (quadratic) near the optimum.
• Suitable for smooth and differentiable functions.
• Efficient in low to moderate dimensions.
Limitations:
• Requires computation of second derivatives (Hessian), which can be expensive.
• Not suitable for non-differentiable or noisy functions.
• May fail if the Hessian is singular or poorly conditioned.
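A one-dimensional sketch of Newton's method, assuming f(x) = x^2 + 2x + 3 (the unconstrained example from the optimization question): with f'(x) = 2x + 2 and f''(x) = 2, the update x_{k+1} = x_k − f'(x_k)/f''(x_k) lands on the minimizer x = −1 in a single step, since the function is quadratic.

def newton_minimize(f_prime, f_double_prime, x0, steps=5):
    """1-D Newton's method: x_{k+1} = x_k - f'(x_k) / f''(x_k)."""
    x = x0
    for _ in range(steps):
        x = x - f_prime(x) / f_double_prime(x)
    return x

# f(x) = x**2 + 2x + 3  ->  f'(x) = 2x + 2, f''(x) = 2, minimum at x = -1
x_min = newton_minimize(lambda x: 2 * x + 2, lambda x: 2.0, x0=5.0)
print("Estimated minimizer:", x_min)   # -1.0, exact for a quadratic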

Q. Describe Downhill Simplex method. Why is it called the Derivative free method?
The Downhill Simplex Method, also known as the Nelder-Mead Method, is a derivative-free
optimization algorithm used to minimize a scalar-valued function of one or more variables. It
is particularly useful when the function is non-differentiable, noisy, or complex, where
derivatives are unavailable or unreliable.
• Why is it Called a Derivative-Free Method?
The Downhill Simplex Method is called a derivative-free method because it does not require
the calculation of gradients (first derivatives) or Hessians (second derivatives) of the
objective function. Instead, it relies solely on function evaluations to explore the search space
and determine the direction of improvement.
• Concept of a Simplex
A simplex is a geometric figure consisting of n + 1 points in an n-dimensional space.
• In 1D → simplex is a line segment (2 points)
• In 2D → simplex is a triangle (3 points)
• In 3D → simplex is a tetrahedron (4 points)
These points represent possible solutions, and the method iteratively moves and reshapes the
simplex to find the minimum.

Advantages
• No need for derivative information
• Simple to implement
• Effective for small to medium-sized problems
• Works well on non-smooth or noisy functions
Disadvantages
• Slow for high-dimensional problems
• May converge to a local minimum
• Performance depends on initial simplex
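A minimal sketch using SciPy's Nelder-Mead implementation (assuming SciPy is available) on the Rosenbrock function, a common derivative-free test case; note that only function values are evaluated, never gradients:

import numpy as np
from scipy.optimize import minimize

def rosenbrock(p):
    x, y = p
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2   # minimum at (1, 1)

# Nelder-Mead builds and reshapes a simplex of 3 points in 2-D, using only f-evaluations
result = minimize(rosenbrock, x0=np.array([-1.5, 2.0]), method="Nelder-Mead")

print("Minimizer found:", result.x)       # close to [1, 1]
print("Function evaluations:", result.nfev)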
