Ludic - Workshop - Iris - Copie

Building Your First Machine Learning Model

This series of exercises guides you and your group step by step through features,
feature selection, and the implementation of a machine learning model using the
Iris dataset. The exercises build foundational knowledge and culminate in
implementing and refining a model in Google Colab.
Part 1: Understanding Features and Feature Selection
Exercise 1: Introduction to Features
Objective: Help students understand the importance of features in machine learning.
Question:
You are provided with a dataset of flowers, each with measurements such as petal length,
petal width, sepal length, and sepal width. If you wanted to predict the species of the flower
based on its measurements, which of the following would be considered features?
Choices:
a) Petal length, petal width, sepal length, sepal width
b) Flower species name
c) The color of the dataset file
Exercise 2: Importance of Feature Selection
Objective: Understand why some features may be more important than others.
Question:
Suppose the Iris dataset also included a 'flower color' column. If you wanted to predict
flower species, why might you consider excluding that feature?
Choices:
a) Flower color is not numerical
b) Flower color is unrelated to species prediction
c) Machine learning models only use numerical features
Part 2: Getting Started with Google Colab and Datasets
Exercise 3: Loading the Iris Dataset in Google Colab
Objective: Introduce students to loading datasets in Colab.
Task:
In Google Colab, load the Iris dataset with scikit-learn and store it in a pandas DataFrame.

Code:
```python
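import pandas as pd
from sklearn.datasets import load_iris

# Load the copy of the dataset bundled with scikit-learn
iris = load_iris()
# Build a DataFrame with one column per measurement
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Add the target as integer codes (0, 1, 2), one per species
iris_df['species'] = iris.target
# As the last expression in a Colab cell, the result is displayed automatically
iris_df.head()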
```
Question:
What does the call `iris_df.head()` return?
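The `species` column created above holds integer codes rather than names. As a minimal sketch, the codes can be mapped to the names that scikit-learn stores in `iris.target_names`. The column name `species_name` is an assumption introduced only for this illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['species'] = iris.target

# 'species_name' is a hypothetical helper column, not part of the exercise.
# Indexing target_names with the integer codes yields one name per row.
iris_df['species_name'] = iris.target_names[iris.target]

print(iris_df.shape)                           # (number of rows, number of columns)
print(iris_df['species_name'].value_counts())  # number of flowers per species
```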
Exercise 4: Feature Inspection
Objective: Teach students to inspect the features of a dataset.
Task:
Inspect the first few rows of the dataset and describe the features. What are the units of the
measurements (e.g., centimeters, inches)?
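Beyond `head()`, a few standard pandas calls can support this inspection. The following is one possible sketch, not the only way to approach the task.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)

print(iris.feature_names)    # column names as stored by scikit-learn
print(iris_df.dtypes)        # data type of each column
print(iris_df.describe())    # count, mean, std, min, quartiles, max per column
print(iris_df.isna().sum())  # number of missing values per column
```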
Part 3: Building Your First Model
Exercise 5: Splitting the Data
Objective: Teach students to split data into training and testing sets.
Question:
Why do we split the dataset into a training set and a testing set?
Choices:
a) To prevent the model from memorizing the data
b) To increase the size of the dataset
c) To make the model faster
Task:
Using scikit-learn, split the dataset into training and testing sets.

```python
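from sklearn.model_selection import train_test_split
# Features: every column except the target
X = iris_df.drop('species', axis=1)
# Target: the species codes
y = iris_df['species']
# Hold out 30% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)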
```
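It can help to check the result of the split. The sketch below repeats the split with one variation that the exercise does not use: the optional `stratify=y` argument, which keeps the proportion of each species the same in both sets. Treat it as an illustration rather than a required step.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='species')

# stratify=y is an addition for illustration; the exercise splits without it
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)           # sizes of the two sets
print(y_train.value_counts().sort_index())   # species counts in the training set
print(y_test.value_counts().sort_index())    # species counts in the testing set
```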
Exercise 6: Choosing a Model
Objective: Introduce basic models like Decision Trees.
Question:
Which of the following algorithms is suitable for classifying species in the Iris dataset?
Choices:
a) Linear Regression
b) Decision Tree
c) K-Means Clustering
Exercise 7: Training the Model
Objective: Show how to train a basic decision tree model.
Task:
Train a Decision Tree model on the Iris dataset.
Code:
```python
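from sklearn.tree import DecisionTreeClassifier
# random_state fixes how ties between equally good splits are broken, so runs are reproducible
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)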
```
Question:
What does `model.fit(X_train, y_train)` do?
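Once a model has been fitted, it can be asked for predictions on rows it has not seen. The sketch below is self-contained, so it repeats the loading and splitting steps on the raw NumPy arrays; fixing `random_state` on the classifier is an assumption made here for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Predict the species codes for the first five test rows
predicted = model.predict(X_test[:5])

# Translate codes to names so the comparison is easier to read
print("Predicted:", iris.target_names[predicted])
print("Actual:   ", iris.target_names[y_test[:5]])
```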

Part 4: Evaluating the Model
Exercise 8: Evaluating Model Performance
Objective: Introduce evaluation metrics like accuracy.
Question:
Which metric would you use first to evaluate a classification model's overall performance?
Choices:
a) Mean Squared Error
b) Accuracy
c) Precision
Task:
Evaluate the model’s accuracy on the test set.
```python
# For a classifier, score() returns the fraction of test samples predicted correctly
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy:.2f}")
```
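Accuracy is a single number. A confusion matrix and a per-class report show which species, if any, get confused with each other. The following sketch uses `sklearn.metrics`; the variable names `y_pred`, `acc`, and `cm` are chosen only for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# accuracy_score gives the same value as model.score(X_test, y_test)
acc = accuracy_score(y_test, y_pred)

# Rows are the true species, columns are the predicted species
cm = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {acc:.2f}")
print(cm)
# Precision, recall, and F1 score for each species
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```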
Exercise 9: Visualizing the Decision Tree
Objective: Teach students to visualize the model’s decision-making process.
Task:
Use scikit-learn's `plot_tree` together with matplotlib to visualize the decision tree.
```python
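from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 7))
# Label nodes with feature and species names; filled=True colors each node by its majority class
plot_tree(model, filled=True, feature_names=iris.feature_names,
          class_names=list(iris.target_names))
plt.show()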
```
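When a plot is hard to read, the same tree can be printed as indented text rules with `export_text` from `sklearn.tree`. This is an optional alternative, sketched here on a freshly trained model so the example runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Each indented line is one test on a feature; leaves show the predicted class code
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```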
Part 5: Exploring Advanced Topics
Exercise 10: Feature Importance
Objective: Help students understand the importance of each feature.
Task:

Determine which features are most important in the decision tree.
```python
feature_importances = model.feature_importances_
print("Feature importances:", feature_importances)
```
Question:
Why might petal length be more important than sepal length for predicting species?
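The raw array of importances is easier to interpret when each value is paired with its feature name. One way to do this, assuming a pandas Series is acceptable for display, is sketched below.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)

# Pair each importance with its feature name and sort from most to least important
importances = pd.Series(
    model.feature_importances_, index=iris.feature_names
).sort_values(ascending=False)

print(importances)  # the values sum to 1.0
```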
Exercise 11: Hyperparameter Tuning
Objective: Introduce basic hyperparameter tuning concepts.
Task:
Change the `max_depth` of the Decision Tree and observe its effect on model performance.
```python
model = DecisionTreeClassifier(max_depth=3)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy}")
```
Question:
What is the impact of limiting the tree’s depth?
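To observe the effect more systematically, several depths can be compared in a loop. The sketch below records training and testing accuracy for each setting; the list of depths and the `results` dictionary are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

results = {}
# max_depth=None lets the tree grow until every leaf is pure
for depth in [1, 2, 3, 4, None]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=42)
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    results[depth] = (train_acc, test_acc)
    print(f"max_depth={depth}: train={train_acc:.2f}, test={test_acc:.2f}")
```

Comparing the two columns shows whether a deeper tree improves performance on unseen data or only on the training set.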
