ML Lab Manual

The document outlines various Python programming experiments focused on statistical analysis, machine learning, and data visualization. It includes implementations of central tendency measures, linear regression, decision trees, KNN, logistic regression, and K-Means clustering, utilizing libraries such as NumPy, Pandas, and Scikit-learn. Each section provides code examples and expected outputs for better understanding of the concepts.

Uploaded by Sofia tarannum
Copyright: © All Rights Reserved

NAME OF THE EXPERIMENT

1. Python program to compute Central Tendency Measures (Mean, Median, Mode) and Measures of Dispersion (Variance, Standard Deviation)
2. Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy
3. Study of Python Libraries for ML application such as Pandas and Matplotlib
4. Python Program for Simple Linear Regression
5. Implementation of Multiple Linear Regression for House Price Prediction using sklearn
6. Implementation of Decision tree using sklearn and its parameter tuning
7. Implementation of KNN using sklearn
8. Implementation of Logistic Regression using sklearn
9. Implementation of K-Means Clustering
10. Performance analysis of Classification Algorithms

Program 1: Python program to compute Central Tendency Measures (Mean, Median, Mode) and Measures of Dispersion (Variance, Standard Deviation)

import statistics as stats

def central_tendency_dispersion(data):
    # Central Tendency Measures
    mean = stats.mean(data)
    median = stats.median(data)
    try:
        # Python 3.8+ returns the first of several equally common values;
        # older versions raise StatisticsError when there is no unique mode.
        mode = stats.mode(data)
    except stats.StatisticsError:
        mode = "No unique mode found"

    # Measures of Dispersion (sample variance and sample standard deviation)
    variance = stats.variance(data)
    std_dev = stats.stdev(data)

    # Display results
    print(f"Mean: {mean}")
    print(f"Median: {median}")
    print(f"Mode: {mode}")
    print(f"Variance: {variance}")
    print(f"Standard Deviation: {std_dev}")

# Example data
data = [10, 15, 14, 10, 15, 18, 20, 25, 30]
central_tendency_dispersion(data)

OUTPUT:

Mean: 17.444444444444443
Median: 15
Mode: 10
Variance: 44.52777777777778
Standard Deviation: 6.672913739722534
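
The variance and standard deviation printed above are the sample versions (divisor n - 1). As a minimal sketch, assuming Python 3.8 or later for multimode, the same data can be used to compare them with the population variance and to list every mode:

```python
import statistics as stats

data = [10, 15, 14, 10, 15, 18, 20, 25, 30]

# Sample variance divides by n - 1; population variance divides by n.
sample_var = stats.variance(data)
population_var = stats.pvariance(data)

# multimode returns every value tied for the highest count,
# in the order each first appears in the data.
modes = stats.multimode(data)

print(f"Sample variance: {sample_var}")
print(f"Population variance: {population_var}")
print(f"All modes: {modes}")
```

Here 10 and 15 both occur twice, so the single value "Mode: 10" in the output above is only the first of two modes.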

2. Study of Python Basic Libraries such as Statistics, Math, Numpy and Scipy

Python provides a wide range of basic libraries that are essential for various computational
tasks. These libraries offer functionality to handle statistical calculations, mathematical
operations, and scientific computing. Here is an overview:

Statistics Module

• Used for statistical computations such as mean, median, mode, variance, etc.
• Example:

import statistics

data = [1, 2, 2, 3, 4]
print("Mean:", statistics.mean(data))
print("Median:", statistics.median(data))
print("Mode:", statistics.mode(data))

Math Module

• Provides mathematical functions such as trigonometric calculations, logarithms, factorials, and more.
• Example:

import math

print("Square root of 16:", math.sqrt(16))
print("Factorial of 5:", math.factorial(5))
print("Cosine of 45 degrees:", math.cos(math.radians(45)))
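
The logarithm and trigonometric functions mentioned above could be tried with a short sketch like the following; the chosen inputs are arbitrary illustrations:

```python
import math

# Natural logarithm and base-10 logarithm
print("ln(e):", math.log(math.e))
print("log10(1000):", math.log10(1000))

# Trigonometric functions take radians, so convert degrees first
angle = math.radians(30)
print("sin(30 degrees):", math.sin(angle))

# Floating-point results are best compared with a tolerance, not ==
print("sin(30 degrees) is close to 0.5:", math.isclose(math.sin(angle), 0.5))
```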

Numpy Library

• Widely used for numerical computations with arrays, matrices, and linear algebra functions.
• Example:

import numpy as np

array = np.array([1, 2, 3, 4, 5])
print("Mean of array:", np.mean(array))
print("Sum of array:", np.sum(array))
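
The example above covers only array reductions. As a small sketch of the matrix and linear algebra side, the following solves a 2x2 linear system with np.linalg.solve; the system itself is an arbitrary illustration:

```python
import numpy as np

# Coefficient matrix and right-hand side of the system
#   2x +  y = 5
#    x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

solution = np.linalg.solve(A, b)   # x = 1, y = 3
print("Solution [x, y]:", solution)

# Multiplying back should reproduce the right-hand side
print("A @ solution:", A @ solution)
```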

Scipy Library

• Built on Numpy, it provides additional functionality for optimization, integration, and scientific computations.
• Example:

from scipy import integrate

# Integrate x**2 over [0, 1]; quad returns the value and an error estimate
result, _ = integrate.quad(lambda x: x**2, 0, 1)
print("Integral of x^2 from 0 to 1:", result)
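
Optimization, also mentioned above, can be sketched in a similar way. The function f below is a made-up example whose minimum is known to be at x = 2:

```python
from scipy import optimize

# Hypothetical objective: a parabola with its minimum value 1 at x = 2
def f(x):
    return (x - 2) ** 2 + 1

# minimize_scalar searches for a local minimum of a one-variable function
result = optimize.minimize_scalar(f)
print("Minimum at x =", result.x)
print("Minimum value:", result.fun)
```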

3. Study of Python Libraries for ML application such as Pandas and Matplotlib

For machine learning and data analysis, Python libraries like Pandas and Matplotlib are
essential for data manipulation and visualization.

Pandas

• Provides data structures like Series and DataFrame for handling and analyzing data efficiently.
• Example:

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
print(df)
print("Mean Age:", df['Age'].mean())
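
The example above shows only a DataFrame. A brief sketch of a Series, row filtering, and summary statistics, reusing the same illustrative names and ages:

```python
import pandas as pd

# A Series is a single labelled column of values
ages = pd.Series([25, 30, 35], index=['Alice', 'Bob', 'Charlie'], name='Age')
print(ages)
print("Bob's age:", ages['Bob'])

# Boolean indexing keeps only the rows where the condition holds
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})
older = df[df['Age'] > 28]
print(older)

# describe() reports count, mean, std, min, quartiles and max
print(df['Age'].describe())
```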

Matplotlib

• A visualization library used for creating static, interactive, and animated plots.
• Example:

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 30, 35]

plt.plot(x, y, marker='o', linestyle='--', color='r')
plt.title("Sample Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()

Program 4: Python Program for Simple Linear Regression

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Generate some example data
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")
print(f"R-squared: {r2:.2f}")

# Plotting the results
plt.scatter(X_test, y_test, color="black", label="Actual data")
plt.plot(X_test, y_pred, color="blue", linewidth=2, label="Fitted line")
plt.xlabel("X")
plt.ylabel("y")
plt.title("Simple Linear Regression")
plt.legend()
plt.show()

OUTPUT:

[Figure: scatter plot of the actual test data with the fitted regression line]

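
For a single feature, the least-squares line also has a closed form, which can help in understanding what LinearRegression computes. The sketch below is an illustration, not part of the program above: it regenerates the same synthetic data and, as a simplifying assumption, fits on all 100 points instead of the training split.

```python
import numpy as np

# Same synthetic data as in the program above
np.random.seed(0)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

x = X.ravel()
t = y.ravel()

# Ordinary least squares for one feature:
#   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
slope = np.sum((x - x.mean()) * (t - t.mean())) / np.sum((x - x.mean()) ** 2)
intercept = t.mean() - slope * x.mean()

print(f"Slope: {slope:.3f}")
print(f"Intercept: {intercept:.3f}")
```

The estimates should land near the values 3 and 4 used to generate the data, with the gap caused by the added noise.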
Program 5: Implementation of Multiple Linear Regression for House Price Prediction using sklearn

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load the dataset
data = pd.read_csv('house_prices.csv')

# Display the first few rows of the dataset
print(data.head())

# Selecting features and target variable
X = data[['Size', 'Bedrooms', 'Age']]
y = data['Price']

# Handling missing data by filling gaps with the column mean
X = X.fillna(X.mean())
y = y.fillna(y.mean())

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Creating and training the model
model = LinearRegression()
model.fit(X_train, y_train)

# Making predictions on the testing set
y_pred = model.predict(X_test)

# Evaluating the model's performance
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f'Mean Squared Error: {mse}')
print(f'R-squared: {r2}')

# Model coefficients
print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)
coefficients = pd.DataFrame(model.coef_, index=X.columns, columns=['Coefficient'])
print(coefficients)
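
The file house_prices.csv is not included with this manual. As a hedged sketch, the same workflow can be tried on synthetic data: the DataFrame below is hypothetical, uses the same column names (Size, Bedrooms, Age, Price), and its generating coefficients are invented purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200

# Hypothetical stand-in for house_prices.csv (all numbers are made up)
data = pd.DataFrame({
    'Size': rng.uniform(500, 3500, n),     # square feet
    'Bedrooms': rng.integers(1, 6, n),     # 1 to 5 bedrooms
    'Age': rng.uniform(0, 50, n),          # years
})
noise = rng.normal(0, 10_000, n)
data['Price'] = (50_000 + 150 * data['Size'] + 10_000 * data['Bedrooms']
                 - 500 * data['Age'] + noise)

X = data[['Size', 'Bedrooms', 'Age']]
y = data['Price']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = LinearRegression().fit(X_train, y_train)
r2 = r2_score(y_test, model.predict(X_test))

print("R-squared on test set:", r2)
print(pd.Series(model.coef_, index=X.columns, name='Coefficient'))
```

Because the data are generated from a known linear rule, the fitted coefficients should come out close to 150, 10,000 and -500, which gives a quick sanity check of the pipeline.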

6. Implementation of Decision tree using sklearn and its parameter tuning

# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.datasets import load_iris

# Load dataset (for example, the Iris dataset)
data = load_iris()
X = data.data
y = data.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize a basic DecisionTreeClassifier
clf = DecisionTreeClassifier(random_state=42)

# Fit the model with the training data
clf.fit(X_train, y_train)

# Predict on the test set
y_pred = clf.predict(X_test)

# Evaluate model performance
print("Accuracy without tuning: ", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

# Parameter tuning using GridSearchCV
# ('auto' is omitted from max_features: it was removed in scikit-learn 1.3)
param_grid = {
    'criterion': ['gini', 'entropy'],         # Different criteria for splitting
    'splitter': ['best', 'random'],           # Split strategy
    'max_depth': [None, 10, 20, 30],          # Depth of tree
    'min_samples_split': [2, 5, 10],          # Minimum number of samples to split a node
    'min_samples_leaf': [1, 2, 4],            # Minimum number of samples to be at a leaf node
    'max_features': [None, 'sqrt', 'log2'],   # Number of features to consider for the best split
}

# Using GridSearchCV for parameter tuning
grid_search = GridSearchCV(estimator=clf, param_grid=param_grid, cv=5, n_jobs=-1, verbose=1)

# Fit GridSearchCV
grid_search.fit(X_train, y_train)

# Best parameters from GridSearchCV
print("Best Parameters: ", grid_search.best_params_)

# Predict with the best estimator from grid search
best_clf = grid_search.best_estimator_
y_pred_best = best_clf.predict(X_test)

# Evaluate performance with the tuned model
print("Accuracy with tuning: ", accuracy_score(y_test, y_pred_best))
print("Classification Report:\n", classification_report(y_test, y_pred_best))

OUTPUT:

Accuracy with tuning: 1.0

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
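
To see what a parameter such as max_depth controls, it can help to print the rules a small tree has learned. The sketch below is an illustration separate from the tuned model above: it assumes max_depth=2 and, for brevity, fits on the whole Iris dataset, then prints the tree with sklearn.tree.export_text.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A deliberately shallow tree so the printed rules stay readable
clf = DecisionTreeClassifier(max_depth=2, random_state=42)
clf.fit(iris.data, iris.target)

# export_text renders the learned splits as indented if/else style rules
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
print("Depth:", clf.get_depth(), "Leaves:", clf.get_n_leaves())
```

Raising max_depth or lowering min_samples_leaf lets the tree add more splits, which is the trade-off the grid search explores.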

7. Implementation of KNN using sklearn

# Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the dataset (Iris dataset)
iris = load_iris()
X = iris.data    # Features
y = iris.target  # Target labels

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create the KNN model with k=3
knn = KNeighborsClassifier(n_neighbors=3)

# Train the model
knn.fit(X_train, y_train)

# Make predictions
y_pred = knn.predict(X_test)

# Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")

OUTPUT:

Accuracy: 100.00%
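
The program above fixes k=3. One common way to choose k, sketched here as an assumption and not as part of the original experiment, is to compare several candidate values with 5-fold cross-validation; the candidate list is arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Candidate neighbour counts (an arbitrary illustrative choice)
k_values = [1, 3, 5, 7, 9, 11]
mean_scores = {}

for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds
    scores = cross_val_score(knn, X, y, cv=5)
    mean_scores[k] = scores.mean()
    print(f"k={k}: mean accuracy {mean_scores[k]:.3f}")

best_k = max(mean_scores, key=mean_scores.get)
print("Best k:", best_k)
```

Averaging over several folds gives a steadier estimate than the single 70/30 split, where one lucky split can report 100% accuracy.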

8. Implementation of Logistic Regression using sklearn

# Import necessary libraries
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.datasets import load_iris

# Load a sample dataset
# Here, we're using the Iris dataset for simplicity.
# We'll use only two classes (binary classification) for logistic regression.
iris = load_iris()
X = iris.data
y = iris.target

# For binary classification, we'll select only two classes (e.g., class 0 and 1)
X = X[y != 2]  # Select only class 0 and 1
y = y[y != 2]  # Select only class 0 and 1

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create a Logistic Regression model
log_reg = LogisticRegression()

# Train the model
log_reg.fit(X_train, y_train)

# Make predictions on the test set
y_pred = log_reg.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)

# Print the results
print("Accuracy:", accuracy)
print("\nConfusion Matrix:\n", conf_matrix)
print("\nClassification Report:\n", class_report)

OUTPUT:

Accuracy: 1.0

Confusion Matrix:
 [[17  0]
 [ 0 13]]

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        17
           1       1.00      1.00      1.00        13

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
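
In the binary case, the model's class-1 probability is the sigmoid of a linear score. The following sketch, which for brevity fits on both classes without a train/test split, recomputes that probability by hand from coef_ and intercept_ and compares it with predict_proba:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

iris = load_iris()
mask = iris.target != 2                 # keep only classes 0 and 1
X, y = iris.data[mask], iris.target[mask]

log_reg = LogisticRegression().fit(X, y)

# Linear score z = w . x + b, then sigmoid(z) = 1 / (1 + exp(-z))
z = X @ log_reg.coef_.ravel() + log_reg.intercept_[0]
manual_prob = 1.0 / (1.0 + np.exp(-z))

# Column 1 of predict_proba is the probability of class 1
sklearn_prob = log_reg.predict_proba(X)[:, 1]
print("Max difference:", np.max(np.abs(manual_prob - sklearn_prob)))
```

predict then labels a sample as class 1 when this probability exceeds 0.5, which is the same as z being positive.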

9. Implementation of K-Means Clustering

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate synthetic data with 4 clusters
X, y_true = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

# Create a KMeans model with the number of clusters set to 4
kmeans = KMeans(n_clusters=4, random_state=0)

# Fit the model to the data
kmeans.fit(X)

# Predict the cluster labels for each data point
y_kmeans = kmeans.predict(X)

# Plotting the clusters and their centroids
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, s=50, cmap='viridis')

# Marking the centroids
centers = kmeans.cluster_centers_
plt.scatter(centers[:, 0], centers[:, 1], c='red', s=200, alpha=0.75, marker='X')

plt.title("K-Means Clustering")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()

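OUTPUT:

[Figure: scatter plot of the four clusters (Feature 1 vs. Feature 2) with the centroids marked as red X symbols]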
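
The program above sets the number of clusters to 4 because the data were generated with four centers. When the right number is unknown, one common heuristic is the elbow method, sketched here as an addition to the experiment: fit K-Means for several values of k and watch how the inertia (within-cluster sum of squared distances) falls. The range of k and n_init=10 are assumptions chosen for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as in the program above
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)

inertias = {}
for k in range(1, 8):
    # n_init=10 restarts the algorithm 10 times and keeps the best run
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_
    print(f"k={k}: inertia {km.inertia_:.1f}")
```

The inertia keeps shrinking as k grows, so the value to look for is the k after which the improvement flattens out.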
