27 KrishParasShah
Task Given: Write a python program to read a csv file and determine mean, variance
and standard deviation of the given data set. Use SPYDER only.
VES Institute of Technology
Department of Automation and Robotics
AIM: Write a python program to determine mean, variance and standard deviation.
CODE:
import pandas as pd
import numpy as np
file_path = 'data_set.csv'
df = pd.read_csv(file_path)
column_name = 'age'
mean = df[column_name].mean()
variance = df[column_name].var()
std_dev = df[column_name].std()
print(f"Mean of '{column_name}': {mean}")
print(f"Variance of '{column_name}': {variance}")
print(f"Standard Deviation of '{column_name}': {std_dev}")
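Note: pandas computes var() and std() with ddof=1 by default, i.e. the sample variance that divides by n - 1, whereas NumPy's np.var() and np.std() default to ddof=0. The sketch below shows the same calculation by hand. The function name describe and the list of ages are hypothetical and are not taken from data_set.csv.

```python
import math

def describe(values):
    """Return (mean, sample variance, sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance divides by n - 1 (Bessel's correction),
    # which matches the pandas default of ddof=1.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance, math.sqrt(variance)

ages = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, not from data_set.csv
mean, variance, std_dev = describe(ages)
print(mean, variance, std_dev)
```

If the population variance is wanted instead, the pandas calls can be given ddof=0.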
OUTPUT:
Task Given: Write a python program to implement linear regression with one
variable for the given data set (without using python libraries). Use SPYDER only.
Program Code:
X = [1, 2, 3, 4, 5, 6]
Y = [2, 4, 5, 4, 5, 7]
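The listing above survives only as the two data lists. One possible way to complete it without libraries is the ordinary least-squares formula for a single variable, slope = Sxy / Sxx and intercept = mean(Y) - slope * mean(X). This is a minimal sketch under that assumption; the function name fit_line is hypothetical and may not match the original program.

```python
X = [1, 2, 3, 4, 5, 6]
Y = [2, 4, 5, 4, 5, 7]

def fit_line(xs, ys):
    """Fit y = b0 + b1 * x by ordinary least squares, no libraries."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

b0, b1 = fit_line(X, Y)
print(f"Y = {b0:.4f} + {b1:.4f} * X")  # prints: Y = 1.8000 + 0.7714 * X
```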
(Assignment-04)
Task Given: Write a python program to perform multivariate linear regression. Use
SPYDER only.
Theory:
β0 is Y-intercept
X1 = [1, 2, 6, 8, 15]
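Only X1 survives from the original data, so the following is a sketch and not the assignment's program. It assumes a model with two predictors, Y = β0 + β1*X1 + β2*X2, and solves for the coefficients by least squares. The second feature X2 and the targets Y are hypothetical values chosen so that the expected coefficients are known in advance.

```python
import numpy as np

X1 = np.array([1, 2, 6, 8, 15], dtype=float)  # from the assignment
X2 = np.array([2, 1, 4, 3, 7], dtype=float)   # hypothetical second feature
Y = 1 + 2 * X1 + 3 * X2                       # hypothetical targets (b0=1, b1=2, b2=3)

# Design matrix: a column of ones for the intercept, then one column per feature.
A = np.column_stack([np.ones_like(X1), X1, X2])

# Least-squares solution of A @ beta = Y.
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
b0, b1, b2 = beta
print(f"b0={b0:.3f}, b1={b1:.3f}, b2={b2:.3f}")
```

Because the hypothetical targets were generated without noise, the fit recovers the intercept and both slopes; with real data the coefficients would only approximate the underlying relationship.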
Aim: Write a python program to perform logistic regression on the given data set (read from csv
file). Use SPYDER only.
Theory:
Logistic regression is a statistical method for binary classification problems. It models the
relationship between a binary dependent variable and one or more independent variables. Its
output is a probability, so it is restricted to the range 0 to 1.
Unlike linear regression, it does not require a linear relationship between the inputs and the output.
This is because the linear combination of the inputs is passed through the nonlinear logistic (sigmoid)
function; equivalently, the log-odds of the outcome are modelled as a linear function of the inputs.
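The theory above can be illustrated with a short sketch of the sigmoid function and its inverse, the log-odds. The coefficients b0 and b1 are hypothetical and are not fitted to the assignment's CSV file.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, b0, b1):
    """Probability of class 1 for a single input x."""
    return sigmoid(b0 + b1 * x)

def log_odds(p):
    """Inverse of the sigmoid: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

b0, b1 = -4.0, 1.0  # hypothetical coefficients, not fitted to real data
for x in (0, 4, 8):
    p = predict_proba(x, b0, b1)
    # The log-odds column is linear in x even though p is not.
    print(x, round(p, 4), round(log_odds(p), 4))
```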
Code:
Output:
Graph:
Aim: Write a python program to perform k-means clustering on a given image for compression.
Theory:
K-means image compression is a technique that reduces the size of an image by reducing the number
of unique colors in the image, while maintaining its visual appearance. The method relies on K-means
clustering, a popular unsupervised machine learning algorithm, to group similar colors and represent
them with a smaller set of "centroids" (average colors of the clusters). This process results in a
compressed image with fewer colors, effectively reducing the amount of data needed to represent it.
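The procedure described above can be sketched with NumPy alone. This is a minimal illustration and not the report's original code: the function name kmeans_compress, the value of k, the iteration count, and the random test image are all assumptions, and a real run would load an image file instead.

```python
import numpy as np

def kmeans_compress(image, k=4, n_iters=10, seed=0):
    """Quantize an (H, W, 3) uint8 image to at most k colors with plain K-means."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 3).astype(float)
    # Initialise the centroids from k distinct, randomly chosen pixels.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign each pixel to its nearest centroid (squared Euclidean distance).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean color of its cluster;
        # an empty cluster keeps its previous centroid.
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    # Replace every pixel by the centroid of the cluster it was assigned to.
    compressed = centroids[labels].reshape(image.shape)
    return compressed.round().astype(np.uint8), centroids

# Hypothetical 16x16 random RGB image standing in for a real photo.
image = np.random.default_rng(1).integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
compressed, centroids = kmeans_compress(image, k=4)
n_colors = len(np.unique(compressed.reshape(-1, 3), axis=0))
print(compressed.shape, n_colors)
```

The saving comes from storing only the k centroid colors plus one small cluster index per pixel instead of three full color channels per pixel.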
Code:
Input Image:
Output Image:
Code:
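import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Import missing from the listing; these functions live in scipy.cluster.hierarchy
from scipy.cluster.hierarchy import dendrogram, linkage, fcluster
# Load CSV
data = pd.read_csv("C:/Users/DELL/Desktop/VESIT/SEM_6/ML/customers.csv")
# Extract features
X = data[['AnnualIncome', 'SpendingScore']].values
# Build the linkage matrix (step missing from the listing; Ward linkage is an assumption)
linked = linkage(X, method='ward')
# Step 2: Dendrogram
plt.figure(figsize=(10, 5))
dendrogram(linked, orientation='top',
           distance_sort='descending',
           show_leaf_counts=True)
plt.grid(True)
plt.show()
# Cut the tree into 3 clusters (the variable name is an assumption; the listing omits it)
clusters = fcluster(linked, 3, criterion='maxclust')
# Plot call missing from the listing; a scatter plot colored by cluster is assumed
plt.scatter(X[:, 0], X[:, 1], c=clusters)
plt.show()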
Output:
Aim: Write a Python program to implement an ANN for handwritten digit recognition.
Code:
import tensorflow as tf
# [...] dataset
# [...] encoding
y_train_cat = to_categorical(y_train)
y_test_cat = to_categorical(y_test)
# Step 2: [...]
model = Sequential()
# [...] the model
for i in range(5):
    plt.imshow([...], cmap='gray')
    plt.axis('off')
    plt.show()
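Since most of the listing above is lost, the following NumPy sketch only illustrates two ideas that the surviving fragments point to: one-hot encoding of the labels (what to_categorical does) and a forward pass through a small fully connected network. The 784-128-10 layer sizes, the ReLU and softmax activations, and the random inputs are assumptions; the weights are untrained, so the outputs are not meaningful predictions.

```python
import numpy as np

def one_hot(labels, num_classes):
    """One-hot encode integer class labels, similar to Keras to_categorical."""
    return np.eye(num_classes)[np.asarray(labels)]

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, w1, b1, w2, b2):
    """Forward pass: dense layer with ReLU, then dense layer with softmax."""
    hidden = np.maximum(0.0, x @ w1 + b1)
    return softmax(hidden @ w2 + b2)

rng = np.random.default_rng(0)
x = rng.random((3, 784))  # three fake flattened 28x28 images (assumed input size)
w1, b1 = rng.normal(0, 0.01, (784, 128)), np.zeros(128)  # assumed hidden size 128
w2, b2 = rng.normal(0, 0.01, (128, 10)), np.zeros(10)    # ten digit classes
probs = forward(x, w1, b1, w2, b2)
print(one_hot([3, 0, 9], 10).shape, probs.shape)
```

Training would adjust w1, b1, w2 and b2 to minimise the cross-entropy between probs and the one-hot labels, which is the part a framework such as Keras handles.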
Output:
Aim: Write a Python program to implement an SVM-based spam email classifier.
Code:
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
url = "https://raw.githubusercontent.com/justmarkham/pycon-2016-tutorial/master/data/sms.tsv"
df = pd.read_table(url, header=None, names=["label", "message"])
df['label'] = df['label'].map({'ham': 0, 'spam': 1})
vectorizer = TfidfVectorizer(stop_words='english')
X = vectorizer.fit_transform(df['message'])
y = df['label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = SVC(kernel='linear')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(" Spam Detection using SVM")
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))
Output: