
MACHINE LEARNING WITH PYTHON

LABORATORY
JNTUA COLLEGE OF ENGINEERING (AUTONOMOUS)
ANANTAPUR

Department of Computer Science and Engineering


Master of Computer Applications (MCA), R20

Prepared by
Potte Thumucherla Khasim Baba
Admission No: - 21001F0056



Experiment No.  Experiment Name

1   Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .csv file.

2   For a given set of training data examples stored in a .csv file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.

3   Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.

4   Write a Python program to implement the k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions.

5   Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.

6   Write a program to implement the naive Bayesian classifier for a sample training data set stored as a .csv file. Compute the accuracy of the classifier, considering a few test data sets.

7   Write a Python program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.

8   Assuming a set of documents that need to be classified, use the naive Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision and recall for your data set.

9   Apply the EM algorithm to cluster a set of data stored in a .csv file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering using Python programming.

10  Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.


Experiment 1:
Implement and demonstrate the FIND-S algorithm for finding the most specific
hypothesis based on a given set of training data samples. Read the training data from a .csv
file.

Code: -
import csv

a = []
print("The given Training Data Set")
with open('enjoysport.csv', 'r') as csvfile:
    for row in csv.reader(csvfile):
        a.append(row)
print(a)
print("\nThe total number of training instances are : ", len(a))

num_attribute = len(a[0]) - 1
print("\nThe initial hypothesis is : ")
hypothesis = ['0'] * num_attribute   # '0' marks the maximally specific (empty) value
print(hypothesis)

for i in range(0, len(a)):
    # generalise the hypothesis only on positive examples; FIND-S ignores negatives
    if a[i][num_attribute] == 'yes':
        for j in range(0, num_attribute):
            if hypothesis[j] == '0' or hypothesis[j] == a[i][j]:
                hypothesis[j] = a[i][j]
            else:
                hypothesis[j] = '?'
    print("\nThe hypothesis for the training instance {} is :\n".format(i + 1), hypothesis)

print("\nThe Maximally specific hypothesis for the training instance is ")
print(hypothesis)
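
As a quick check, the learned hypothesis can classify an unseen instance: an instance is positive exactly when it agrees with every attribute the hypothesis still constrains. A minimal sketch (the helper and the test instance below are made up for illustration, not part of the original listing):

def matches(hypothesis, instance):
    # '?' matches anything; a concrete value must agree exactly
    return all(h == '?' or h == v for h, v in zip(hypothesis, instance))

test = ['sunny', 'warm', 'high', 'strong', 'cool', 'change']
print("EnjoySport prediction:", 'yes' if matches(hypothesis, test) else 'no')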


Dataset: -
Sky AirTemp Humidity Wind Water Forecast Enjoysport

sunny warm normal strong warm same yes

sunny warm high strong warm same yes

rainy cold high strong warm change no

sunny warm high strong cool change yes

enjoysport.csv: -
sunny warm normal strong warm same yes

sunny warm high strong warm same yes

rainy cold high strong warm change no

sunny warm high strong cool change yes


Output: -
The given Training Data Set
[['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'yes'], ['sunny', 'warm', 'high', 'strong',
'warm', 'same', 'yes'], ['rainy', 'cold', 'high', 'strong', 'warm', 'change', 'no'], ['sunny', 'warm',
'high', 'strong', 'cool', 'change', 'yes']]

The total number of training instances are :  4

The initial hypothesis is :
['0', '0', '0', '0', '0', '0']

The hypothesis for the training instance 1 is :
['sunny', 'warm', 'normal', 'strong', 'warm', 'same']

The hypothesis for the training instance 2 is :
['sunny', 'warm', '?', 'strong', 'warm', 'same']

The hypothesis for the training instance 3 is :
['sunny', 'warm', '?', 'strong', 'warm', 'same']

The hypothesis for the training instance 4 is :
['sunny', 'warm', '?', 'strong', '?', '?']

The Maximally specific hypothesis for the training instance is
['sunny', 'warm', '?', 'strong', '?', '?']


Experiment 2:
For a given set of training data examples stored in a .csv file, implement and demonstrate
the Candidate-Elimination algorithm to output a description of the set of all hypotheses
consistent with the training examples.

Code: -
import numpy as np
import pandas as pd

data = pd.read_csv('enjoysport.csv')
print("The training data is : \n", data)
concepts = np.array(data.iloc[:, 0:-1])
print("\nThe concepts are :\n", concepts)
target = np.array(data.iloc[:, -1])
print("\nThe targets of concepts are :\n", target)

def learn(concepts, target):
    print("\nInitialization of specific_h and general_h :")
    specific_h = concepts[0].copy()
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print("General Hypothesis :\n", general_h)
    print("Specific Hypothesis :\n", specific_h)
    for i, h in enumerate(concepts):
        if target[i] == "yes":
            # positive example: generalise S and drop conflicting constraints from G
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "no":
            # negative example: specialise G just enough to exclude it
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("\nStep", i + 1, "of Candidate Elimination Algorithm :")
        print("\nGeneral Hypothesis is :\n", general_h)
        print("\nSpecific Hypothesis is :\n", specific_h)
    # discard the fully general rows left over in G
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("\nFinal General Hypothesis :", g_final, sep="\n")
print("\nFinal Specific Hypothesis :", s_final, sep="\n")

Dataset: -

enjoysport.csv:
Sky AirTemp Humidity Wind Water Forecast Enjoysport

sunny warm normal strong warm same yes

sunny warm high strong warm same yes

rainy cold high strong warm change no

sunny warm high strong cool change yes


Output: -
The training data is :
Sky AirTemp Humidity Wind Water Forecast Enjoysport
0 sunny warm normal strong warm same yes
1 sunny warm high strong warm same yes
2 rainy cold high strong warm change no
3 sunny warm high strong cool change yes

The concepts are :
[['sunny' 'warm' 'normal' 'strong' 'warm' 'same']
 ['sunny' 'warm' 'high' 'strong' 'warm' 'same']
 ['rainy' 'cold' 'high' 'strong' 'warm' 'change']
 ['sunny' 'warm' 'high' 'strong' 'cool' 'change']]

The targets of concepts are :
['yes' 'yes' 'no' 'yes']

Initialization of specific_h and general_h :
General Hypothesis :
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]
Specific Hypothesis :
['sunny' 'warm' 'normal' 'strong' 'warm' 'same']

Step 1 of Candidate Elimination Algorithm :
General Hypothesis is :
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]
Specific Hypothesis is :
['sunny' 'warm' 'normal' 'strong' 'warm' 'same']

Step 2 of Candidate Elimination Algorithm :
General Hypothesis is :
[['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]
Specific Hypothesis is :
['sunny' 'warm' '?' 'strong' 'warm' 'same']

Step 3 of Candidate Elimination Algorithm :
General Hypothesis is :
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', 'same']]
Specific Hypothesis is :
['sunny' 'warm' '?' 'strong' 'warm' 'same']

Step 4 of Candidate Elimination Algorithm :
General Hypothesis is :
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?', '?']]
Specific Hypothesis is :
['sunny' 'warm' '?' 'strong' '?' '?']

Final General Hypothesis :
[['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]

Final Specific Hypothesis :
['sunny' 'warm' '?' 'strong' '?' '?']


Experiment 3:
Write a program to demonstrate the working of the decision tree based ID3 algorithm.
Use an appropriate data set for building the decision tree and apply this knowledge to classify
a new sample.
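
For reference, the two quantities the code below computes are the entropy of a set of examples S and the information gain of an attribute A (these are the standard ID3 definitions, with p and n the fractions of positive and negative examples):

$Entropy(S) = -p \log_2 p - n \log_2 n$
$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v)$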

Code: -
import pandas as pd
import math
import numpy as np

data = pd.read_csv("decisiontreedataset.csv")
features = [feat for feat in data]
features.remove("answer")

class Node:
    def __init__(self):
        self.children = []
        self.value = ""
        self.isLeaf = False
        self.pred = ""

def entropy(examples):
    # entropy of a set of examples with a yes/no "answer" column
    pos = 0.0
    neg = 0.0
    for _, row in examples.iterrows():
        if row["answer"] == "yes":
            pos += 1
        else:
            neg += 1
    if pos == 0.0 or neg == 0.0:
        return 0.0
    else:
        p = pos / (pos + neg)
        n = neg / (pos + neg)
        return -(p * math.log(p, 2) + n * math.log(n, 2))

def info_gain(examples, attr):
    # reduction in entropy achieved by splitting on attr
    uniq = np.unique(examples[attr])
    gain = entropy(examples)
    for u in uniq:
        subdata = examples[examples[attr] == u]
        sub_e = entropy(subdata)
        gain -= (float(len(subdata)) / float(len(examples))) * sub_e
    return gain

def ID3(examples, attrs):
    # recursively grow the tree, always splitting on the highest-gain attribute
    root = Node()
    max_gain = 0
    max_feat = ""
    for feature in attrs:
        gain = info_gain(examples, feature)
        if gain > max_gain:
            max_gain = gain
            max_feat = feature
    root.value = max_feat
    uniq = np.unique(examples[max_feat])
    for u in uniq:
        subdata = examples[examples[max_feat] == u]
        if entropy(subdata) == 0.0:
            # pure subset: make a leaf
            newNode = Node()
            newNode.isLeaf = True
            newNode.value = u
            newNode.pred = np.unique(subdata["answer"])
            root.children.append(newNode)
        else:
            dummyNode = Node()
            dummyNode.value = u
            new_attrs = attrs.copy()
            new_attrs.remove(max_feat)
            child = ID3(subdata, new_attrs)
            dummyNode.children.append(child)
            root.children.append(dummyNode)
    return root

def printTree(root: Node, depth=0):
    for i in range(depth):
        print("\t", end="")
    print(root.value, end="")
    if root.isLeaf:
        print(" -> ", root.pred)
    print()
    for child in root.children:
        printTree(child, depth + 1)

def classify(root: Node, new):
    for child in root.children:
        if child.value == new[root.value]:
            if child.isLeaf:
                print("Predicted Label for new example", new, " is:", child.pred)
                return
            else:
                classify(child.children[0], new)

root = ID3(data, features)
print("Decision Tree is:")
printTree(root)
print("--------------------------------------------------------------------------------")
new = {"outlook": "sunny", "temperature": "hot", "humidity": "normal", "wind": "strong"}
classify(root, new)

Dataset: -
decisiontreedataset.csv:
outlook temperature humidity wind answer
sunny hot high weak no
sunny hot high strong no
overcast hot high weak yes
rain mild high weak yes
rain cool normal weak yes
rain cool normal strong no
overcast cool normal strong yes
sunny mild high weak no
sunny cool normal weak yes
rain mild normal weak yes
sunny mild normal strong yes
overcast mild high strong yes
overcast hot normal weak yes
rain mild high strong no


Output: -
Decision Tree is:
outlook
    overcast ->  ['yes']

    rain
        wind
            strong ->  ['no']

            weak ->  ['yes']

    sunny
        humidity
            high ->  ['no']

            normal ->  ['yes']

--------------------------------------------------------------------------------
Predicted Label for new example {'outlook': 'sunny', 'temperature': 'hot', 'humidity': 'normal', 'wind': 'strong'}  is: ['yes']


Experiment 4:
Write a Python program to implement the k-Nearest Neighbour algorithm to classify the
iris data set. Print both correct and wrong predictions.

Code: -
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn import metrics

names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
dataset = pd.read_csv("irisdataset.csv", names=names)
x = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]
print(x.head())

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.10)
classifier = KNeighborsClassifier(n_neighbors=5).fit(xtrain, ytrain)
ypred = classifier.predict(xtest)

i = 0
print("\n-------------------------------------------------------------------------")
print('%-25s %-25s %-25s' % ('Original Label', 'Predicted Label', 'Correct/Wrong'))
print("-------------------------------------------------------------------------")
for label in ytest:
    print('%-25s %-25s' % (label, ypred[i]), end="")
    if label == ypred[i]:
        print(' %-25s' % 'Correct')
    else:
        print(' %-25s' % 'Wrong')
    i = i + 1
print("-------------------------------------------------------------------------")
print("\nConfusion Matrix:\n", metrics.confusion_matrix(ytest, ypred))
print("-------------------------------------------------------------------------")
print("\nClassification Report:\n", metrics.classification_report(ytest, ypred))
print("-------------------------------------------------------------------------")
print('Accuracy of the classifier is %0.2f' % metrics.accuracy_score(ytest, ypred))
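
Once fitted, the same classifier can label a single new flower as well; a minimal sketch (the measurements below are invented for illustration):

sample = pd.DataFrame([[5.9, 3.0, 4.2, 1.5]], columns=names[:-1])   # one hypothetical flower
print("Predicted class for the sample is:", classifier.predict(sample)[0])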

Dataset: -
irisdataset.csv: - (the header row below is shown for reference only; the .csv file itself should not contain a heading row, since the program supplies column names via names=)
sepal-length sepal-width petal-length petal-width class
5.1 3.5 1.4 0.2 Iris-setosa
4.9 3 1.4 0.2 Iris-setosa
4.7 3.2 1.3 0.2 Iris-setosa
4.6 3.1 1.5 0.2 Iris-setosa
5 3.6 1.4 0.2 Iris-setosa
5.4 3.9 1.7 0.4 Iris-setosa
4.6 3.4 1.4 0.3 Iris-setosa
5 3.4 1.5 0.2 Iris-setosa
4.4 2.9 1.4 0.2 Iris-setosa
4.9 3.1 1.5 0.1 Iris-setosa
7 3.2 4.7 1.4 Iris-versicolor
6.4 3.2 4.5 1.5 Iris-versicolor
6.9 3.1 4.9 1.5 Iris-versicolor
5.5 2.3 4 1.3 Iris-versicolor
6.5 2.8 4.6 1.5 Iris-versicolor
5.7 2.8 4.5 1.3 Iris-versicolor
6.3 3.3 4.7 1.6 Iris-versicolor
4.9 2.4 3.3 1 Iris-versicolor
6.6 2.9 4.6 1.3 Iris-versicolor
5.2 2.7 3.9 1.4 Iris-versicolor
6.3 3.3 6 2.5 Iris-virginica
5.8 2.7 5.1 1.9 Iris-virginica
7.1 3 5.9 2.1 Iris-virginica
6.3 2.9 5.6 1.8 Iris-virginica
6.5 3 5.8 2.2 Iris-virginica
7.6 3 6.6 2.1 Iris-virginica
4.9 2.5 4.5 1.7 Iris-virginica
7.3 2.9 6.3 1.8 Iris-virginica
6.7 2.5 5.8 1.8 Iris-virginica
7.2 3.6 6.1 2.5 Iris-virginica

Output: -
sepal-length sepal-width petal-length petal-width
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
-------------------------------------------------------------------
Original Label Predicted Label Correct/Wrong
-------------------------------------------------------------------
Iris-virginica Iris-versicolor Wrong
Iris-virginica Iris-virginica Correct
Iris-versicolor Iris-versicolor Correct
-------------------------------------------------------------------
Confusion Matrix:
[[1 0]
[1 1]]
-------------------------------------------------------------------
Classification Report:
precision recall f1-score support
Iris-versicolor 0.50 1.00 0.67 1
Iris-virginica 1.00 0.50 0.67 2

accuracy 0.67 3
macro avg 0.75 0.75 0.67 3
weighted avg 0.83 0.67 0.67 3
-------------------------------------------------------------------
Accuracy of the classifier is 0.67


Experiment 5:
Build an Artificial Neural Network by implementing the Backpropagation algorithm
and test the same using appropriate data sets.
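
The listing below trains a 2-3-1 network with sigmoid units. As a sketch of the standard backpropagation equations it follows (with learning rate $\eta$):

$\sigma(z) = \frac{1}{1 + e^{-z}}$, whose derivative in terms of its output $a$ is $\sigma'(a) = a(1 - a)$
$\delta_{out} = (y - \hat{y}) \cdot \sigma'(output)$
$\delta_{hidden} = (\delta_{out} W_{out}^{T}) \cdot \sigma'(a_{hidden})$
$W \leftarrow W + \eta \, a^{T} \delta$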

Code: -
import numpy as np

x = np.array(([1, 2], [3, 4], [5, 6]), dtype=float)
y = np.array(([30], [60], [90]), dtype=float)
x = x / np.amax(x, axis=0)   # normalise each input column to [0, 1]
y = y / 100                  # scale the outputs to [0, 1]

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def derivatives_sigmoid(x):
    # derivative of the sigmoid, expressed in terms of its output
    return x * (1 - x)

epoch = 5000
lr = 0.1
inputlayer_neurons = 2
hiddenlayer_neurons = 3
output_neurons = 1

wh = np.random.uniform(size=(inputlayer_neurons, hiddenlayer_neurons))
bh = np.random.uniform(size=(1, hiddenlayer_neurons))
wout = np.random.uniform(size=(hiddenlayer_neurons, output_neurons))
bout = np.random.uniform(size=(1, output_neurons))

for i in range(epoch):
    # forward pass
    hinp = np.dot(x, wh) + bh
    hlayer_act = sigmoid(hinp)
    outinp = np.dot(hlayer_act, wout) + bout
    output = sigmoid(outinp)
    # backward pass
    EO = y - output
    d_output = EO * derivatives_sigmoid(output)
    EH = d_output.dot(wout.T)
    d_hiddenlayer = EH * derivatives_sigmoid(hlayer_act)
    wout += hlayer_act.T.dot(d_output) * lr
    bout += np.sum(d_output, axis=0, keepdims=True) * lr       # update the output bias as well
    wh += x.T.dot(d_hiddenlayer) * lr
    bh += np.sum(d_hiddenlayer, axis=0, keepdims=True) * lr    # update the hidden bias as well

print("Input: \n", str(x))
print("Actual Output: \n", str(y))
print("Predicted Output: \n", output)


Output: -
Input:
[[0.2 0.33333333]
[0.6 0.66666667]
[1. 1. ]]
Actual Output:
[[0.3]
[0.6]
[0.9]]
Predicted Output:
[[0.33497618]
[0.64084238]
[0.80288884]]


Experiment 6:
Write a program to implement the naive Bayesian classifier for a sample training data
set stored as a .csv file. Compute the accuracy of the classifier, considering a few test data sets.
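
The classifier rests on Bayes' theorem together with the "naive" assumption that attributes are conditionally independent given the class, so a new instance with attribute values $x_1, \dots, x_n$ is assigned the class

$\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)$

GaussianNB, used below, models each $P(x_i \mid c)$ as a normal distribution fitted to the label-encoded attribute values.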

Code: -
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
data = pd.read_csv('playtennis.csv')
print("The first 5 values of data is :\n",data.head())
x = data.iloc[:,:-1]
print("\nThe First 5 values of train data is\n",x.head())
y = data.iloc[:,-1]
print("\nThe first 5 values of Train output is\n",y.head())
le_outlook = LabelEncoder()
x.Outlook = le_outlook.fit_transform(x.Outlook)
le_Temperature = LabelEncoder()
x.Temperature = le_Temperature.fit_transform(x.Temperature)
le_Humidity = LabelEncoder()
x.Humidity = le_Humidity.fit_transform(x.Humidity)
le_Windy = LabelEncoder()
x.Windy = le_Windy.fit_transform(x.Windy)
print("\nNow the Train data is :\n",x.head())
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n",y)
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.20)

classifier = GaussianNB()
classifier.fit(x_train,y_train)
print("Accuracy is:",accuracy_score(classifier.predict(x_test),y_test))

Dataset: -
playtennis.csv:
Outlook Temperature Humidity Windy PlayTennis
rainy cold high weak no
rainy cold normal mild no
sunny warm normal weak yes
rainy cold normal weak no
sunny mild mild strong yes
cloudy cold high weak yes
rainy warm normal strong no
sunny mild mild mild yes
cloudy warm mild mild yes
cloudy cold high high no


Output: -
The first 5 values of data is :
Outlook Temperature Humidity Windy PlayTennis
0 rainy cold high weak no
1 rainy cold normal mild no
2 sunny warm normal weak yes
3 rainy cold normal weak no
4 sunny mild mild strong yes

The First 5 values of train data is
  Outlook Temperature Humidity   Windy
0   rainy        cold     high    weak
1   rainy        cold   normal    mild
2   sunny        warm   normal    weak
3   rainy        cold   normal    weak
4   sunny        mild     mild  strong

The first 5 values of Train output is
0     no
1     no
2    yes
3     no
4    yes
Name: PlayTennis, dtype: object

Now the Train data is :
   Outlook  Temperature  Humidity  Windy
0        1            0         0      3
1        1            0         2      1
2        2            2         2      3
3        1            0         2      3
4        2            1         1      2

Now the Train output is
[0 0 1 0 1 1 0 1 1 0]
Accuracy is: 0.5


Experiment 7:
Write a Python program to construct a Bayesian network considering medical data. Use
this model to demonstrate the diagnosis of heart patients using the standard Heart Disease
Data Set.

Code: -
import numpy as np
import pandas as pd
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination
heartDisease = pd.read_csv('naivebayesnetworkdataset.csv')
heartDisease = heartDisease.replace('?',np.nan)
print('Sample instances from the dataset are:')
print(heartDisease.head())
print('\n Attributes and datatypes of dataset are:')
print(heartDisease.dtypes)
model=BayesianNetwork([('age','heartdisease'),('sex','heartdisease'),('exang','heartdisease'),
('cp','heartdisease'),('heartdisease','restecg'),('heartdisease','chol')])
print('\nLearning CPD using Maximum Likelihood Estimator')
model.fit(heartDisease,estimator=MaximumLikelihoodEstimator)
print('\n Inferencing with Bayesian Network:')
HeartDiseaseTest_infer = VariableElimination(model)
print('\n 1. Probability of HeartDisease given evidence= restecg')
q1=HeartDiseaseTest_infer.query(variables=['heartdisease'],evidence={'restecg':1})
print(q1)
print('\n 2. Probability of HeartDisease given evidence= cp ')
q2=HeartDiseaseTest_infer.query(variables=['heartdisease'],evidence={'cp':2})
print(q2)
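
VariableElimination also accepts several evidence variables at once, so richer diagnostic questions can be posed. A hedged extra query (not part of the original experiment) conditioning on both restecg and cp:

q3 = HeartDiseaseTest_infer.query(variables=['heartdisease'], evidence={'restecg': 1, 'cp': 2})
print(q3)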


Dataset: -
naivebayesnetworkdataset.csv:

age sex cp trestbps chol fbs restecg thalach exang oldpeak slope ca thal heartdisease
62 1 2 128 208 1 2 140 0 0 1 0 3 0
57 1 4 110 201 0 0 126 1 1.5 2 0 6 0
58 1 4 146 218 0 0 105 0 2 2 1 7 1
64 1 4 128 263 0 0 105 1 0.2 2 1 7 0
51 0 3 120 295 0 2 157 0 0.6 1 0 3 0
43 1 4 115 303 0 0 181 0 1.2 2 0 3 0
42 0 3 120 209 0 0 173 0 0 2 0 3 0
67 0 4 106 223 0 0 142 0 0.3 1 2 3 0
76 0 3 140 197 0 1 116 0 1.1 2 0 3 0
70 1 2 156 245 0 2 143 0 0 1 0 3 0
57 1 2 124 261 0 0 141 0 0.3 1 0 7 1
44 0 3 118 242 0 0 149 0 0.3 2 1 3 0
58 0 2 136 319 1 2 152 0 0 1 2 3 3
60 0 1 150 240 0 0 171 0 0.9 1 0 3 0
44 1 3 120 226 0 0 169 0 0 1 0 3 0
61 1 4 138 166 0 2 125 1 3.6 2 1 3 4
42 1 4 136 315 0 0 125 1 1.8 2 0 6 2
52 1 4 128 204 1 0 156 1 1 2 0 ? 2
59 1 3 126 218 1 0 134 0 2.2 2 1 6 2
40 1 4 152 223 0 0 181 0 0 1 0 7 1
58 0 4 130 197 0 0 131 0 0.6 2 0 3 0
57 1 4 110 335 0 0 143 1 3 2 1 7 2
47 1 3 130 253 0 0 179 0 0 1 0 3 0
55 0 4 128 205 0 1 130 1 2 2 1 7 3
35 1 2 122 192 0 0 174 0 0 1 0 3 0
61 1 4 148 203 0 0 161 0 0 1 1 7 2
58 1 4 114 318 0 1 140 0 4.4 3 3 6 4
58 0 4 170 225 1 2 146 1 2.8 2 2 6 2
58 1 2 125 220 0 0 144 0 0.4 2 ? 7 0
56 1 2 130 221 0 2 163 0 0 1 0 7 0


Output: -
Sample instances from the dataset are:
age sex cp trestbps chol ... oldpeak slope ca thal heartdisease
0 62 1 2 128 208 ... 0.0 1 0 3 0
1 57 1 4 110 201 ... 1.5 2 0 6 0
2 58 1 4 146 218 ... 2.0 2 1 7 1
3 64 1 4 128 263 ... 0.2 2 1 7 0
4 51 0 3 120 295 ... 0.6 1 0 3 0

[5 rows x 14 columns]

Attributes and datatypes of dataset are:
age int64
sex int64
cp int64
trestbps int64
chol int64
fbs int64
restecg int64
thalach int64
exang int64
oldpeak float64
slope int64
ca object
thal object
heartdisease int64
dtype: object

Learning CPD using Maximum Likelihood Estimator

Inferencing with Bayesian Network:


1. Probability of HeartDisease given evidence= restecg
+-----------------+---------------------+
| heartdisease | phi(heartdisease) |
+=================+=====================+
| heartdisease(0) | 0.0793 |
+-----------------+---------------------+
| heartdisease(1) | 0.0000 |
+-----------------+---------------------+
| heartdisease(2) | 0.0000 |
+-----------------+---------------------+
| heartdisease(3) | 0.4404 |
+-----------------+---------------------+
| heartdisease(4) | 0.4803 |
+-----------------+---------------------+

2. Probability of HeartDisease given evidence= cp
+-----------------+---------------------+
| heartdisease | phi(heartdisease) |
+=================+=====================+
| heartdisease(0) | 0.2352 |
+-----------------+---------------------+
| heartdisease(1) | 0.2180 |
+-----------------+---------------------+
| heartdisease(2) | 0.1663 |
+-----------------+---------------------+
| heartdisease(3) | 0.2142 |
+-----------------+---------------------+
| heartdisease(4) | 0.1663 |
+-----------------+---------------------+


Experiment 8:
Assuming a set of documents that need to be classified, use the naive Bayesian Classifier
model to perform this task. Built-in Java classes/API can be used to write the program.
Calculate the accuracy, precision and recall for your data set.

Code: -
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score

msg = pd.read_csv('naivebayestextdoc.csv', encoding='unicode_escape', names=['message', 'label'])
print("Total Instances of Dataset: ", msg.shape[0])
msg['labelnum'] = msg.label.map({'pos': 1, 'neg': 0})
x = msg.message
y = msg.labelnum

xtrain, xtest, ytrain, ytest = train_test_split(x, y)

count_v = CountVectorizer()
xtrain_dm = count_v.fit_transform(xtrain)   # learn the vocabulary from the training split
xtest_dm = count_v.transform(xtest)         # reuse the same vocabulary for the test split
df = pd.DataFrame(xtrain_dm.toarray(), columns=count_v.get_feature_names_out())
print('\nFeatures for first 5 training instances are:\n')
print(df[0:5])

clf = MultinomialNB()
clf.fit(xtrain_dm, ytrain)
pred = clf.predict(xtest_dm)

print('\nClassification results of testing samples are:\n')
for doc, p in zip(xtest, pred):             # pair each test document with its prediction
    p = 'pos' if p == 1 else 'neg'
    print("%s -> %s" % (doc, p))

print('\nAccuracy Metrics: \n')
print('Accuracy: ', accuracy_score(ytest, pred))
print('Recall: ', recall_score(ytest, pred))
print('Precision: ', precision_score(ytest, pred))
print('Confusion Matrix: \n', confusion_matrix(ytest, pred))
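
The fitted vectorizer and classifier can also score a brand-new sentence; a minimal sketch (the sentence is invented for illustration):

new_doc = ["I really enjoyed this place"]
new_dm = count_v.transform(new_doc)   # reuse the vocabulary learned from the training split
print(new_doc[0], '->', 'pos' if clf.predict(new_dm)[0] == 1 else 'neg')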

Dataset: -
naivebayestextdoc.csv: (the header row below is shown for reference only; the .csv file itself should not contain a heading row, since the program supplies column names via names=)

Message Label
I love this sandwich pos
This is an amazing place pos
I feel very good about these beers pos
This is my best work pos
What an awesome view pos
I do not like this restaurant neg
I am tired of this stuff neg
I can’t deal with this neg
He is my sworn enemy neg
My boss is horrible neg
This is an awesome place pos
I do not like the taste of this juice neg
I love to dance pos
I am sick and tired of this place neg
What a great holiday pos
That is a bad locality to stay neg
We will have good fun tomorrow pos
I went to my enemy’s house today neg


Output: -
Total Instances of Dataset: 18

Features for first 5 training instances are:

about am an and awesome bad ... view we what will with work
0 0 0 0 0 0 0 ... 0 0 0 0 0 1
1 0 0 0 0 0 0 ... 0 0 0 0 0 0
2 0 0 0 0 0 0 ... 0 0 0 0 0 0
3 1 0 0 0 0 0 ... 0 0 0 0 0 0
4 0 0 0 0 0 0 ... 0 0 1 0 0 0

[5 rows x 47 columns]

Classification results of testing samples are:

This is my best work -> pos
I do not like the taste of this juice -> neg
I love this sandwich -> pos
I feel very good about these beers -> neg
What a great holiday -> pos

Accuracy Metrics:

Accuracy: 0.8
Recall: 1.0
Precision: 0.6666666666666666
Confusion Matrix:
[[2 1]
[0 2]]


Experiment 9:
Apply the EM algorithm to cluster a set of data stored in a .csv file. Use the same data set
for clustering using the k-Means algorithm. Compare the results of these two algorithms and
comment on the quality of clustering using Python programming.

Code: -
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
import sklearn.metrics as metrics
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
names = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width', 'Class']
dataset = pd.read_csv("irisdataset.csv", names=names)
x = dataset.iloc[:, :-1]
label = {'Iris-setosa': 0,'Iris-versicolor': 1, 'Iris-virginica': 2}
y = [label[c] for c in dataset.iloc[:, -1]]
plt.figure(figsize=(14,7))
colormap=np.array(['red','lime','black'])
plt.subplot(1,3,1)
plt.title('Real')
plt.scatter(x.Petal_Length,x.Petal_Width,c=colormap[y])
model=KMeans(n_clusters=3, random_state=0).fit(x)
plt.subplot(1,3,2)
plt.title('K-Means')
plt.scatter(x.Petal_Length,x.Petal_Width,c=colormap[model.labels_])
print('The accuracy score of K-Mean: ',metrics.accuracy_score(y, model.labels_))
print('The Confusion matrix of K-Mean:\n',metrics.confusion_matrix(y, model.labels_))
gmm=GaussianMixture(n_components=3, random_state=0).fit(x)
y_cluster_gmm=gmm.predict(x)

plt.subplot(1,3,3)
plt.title('GMM Classification')
plt.scatter(x.Petal_Length,x.Petal_Width,c=colormap[y_cluster_gmm])
print('The accuracy score of EM: ',metrics.accuracy_score(y, y_cluster_gmm))
print('The Confusion matrix of EM:\n ',metrics.confusion_matrix(y, y_cluster_gmm))
plt.show()
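
A caveat when reading the scores above: accuracy_score compares raw cluster numbers against class labels, so a good clustering can score badly simply because the cluster IDs come out permuted. A permutation-invariant measure such as the adjusted Rand index gives a fairer comparison; a minimal sketch:

from sklearn.metrics import adjusted_rand_score
print('Adjusted Rand index of K-Means:', adjusted_rand_score(y, model.labels_))
print('Adjusted Rand index of EM:', adjusted_rand_score(y, y_cluster_gmm))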

Dataset: -
irisdataset.csv: - (the header row below is shown for reference only; the .csv file itself should not contain a heading row, since the program supplies column names via names=)
sepal-length sepal-width petal-length petal-width class
5.1 3.5 1.4 0.2 Iris-setosa
4.9 3 1.4 0.2 Iris-setosa
4.7 3.2 1.3 0.2 Iris-setosa
4.6 3.1 1.5 0.2 Iris-setosa
5 3.6 1.4 0.2 Iris-setosa
5.4 3.9 1.7 0.4 Iris-setosa
4.6 3.4 1.4 0.3 Iris-setosa
5 3.4 1.5 0.2 Iris-setosa
4.4 2.9 1.4 0.2 Iris-setosa
4.9 3.1 1.5 0.1 Iris-setosa
7 3.2 4.7 1.4 Iris-versicolor
6.4 3.2 4.5 1.5 Iris-versicolor
6.9 3.1 4.9 1.5 Iris-versicolor
5.5 2.3 4 1.3 Iris-versicolor
6.5 2.8 4.6 1.5 Iris-versicolor
5.7 2.8 4.5 1.3 Iris-versicolor
6.3 3.3 4.7 1.6 Iris-versicolor
4.9 2.4 3.3 1 Iris-versicolor
6.6 2.9 4.6 1.3 Iris-versicolor
5.2 2.7 3.9 1.4 Iris-versicolor
6.3 3.3 6 2.5 Iris-virginica
5.8 2.7 5.1 1.9 Iris-virginica
7.1 3 5.9 2.1 Iris-virginica
6.3 2.9 5.6 1.8 Iris-virginica
6.5 3 5.8 2.2 Iris-virginica
7.6 3 6.6 2.1 Iris-virginica
4.9 2.5 4.5 1.7 Iris-virginica
7.3 2.9 6.3 1.8 Iris-virginica
6.7 2.5 5.8 1.8 Iris-virginica
7.2 3.6 6.1 2.5 Iris-virginica

Output: -
The accuracy score of K-Mean: 0.4
The Confusion matrix of K-Mean:
[[10 0 0]
[ 0 0 10]
[ 0 8 2]]
The accuracy score of EM: 0.03333333333333333
The Confusion matrix of EM:
[[ 0 10 0]
[ 6 0 4]
[ 9 0 1]]


Experiment 10:
Implement the non-parametric Locally Weighted Regression algorithm in order to fit
data points. Select an appropriate data set for your experiment and draw graphs.
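
At each query point $x$, Locally Weighted Regression solves a separate weighted least-squares problem, with weights that decay with distance from $x$ ($k$ is the bandwidth). These are the standard LWR equations that the code below follows:

$w_j(x) = \exp\!\left(-\frac{(x - x_j)^2}{2k^2}\right)$
$\hat{\beta}(x) = (X^{T} W X)^{-1} X^{T} W y$
$\hat{y}(x) = x \, \hat{\beta}(x)$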

Code: -
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
def kernel(point, xmat, k):
    # build the diagonal weight matrix for one query point
    m, n = np.shape(xmat)
    weights = np.mat(np.eye(m))
    for j in range(m):
        diff = point - xmat[j]
        weights[j, j] = np.exp(diff * diff.T / (-2.0 * k**2))
    return weights

def localWeight(point, xmat, ymat, k):
    # weighted least-squares solution at one query point
    wei = kernel(point, xmat, k)
    W = (xmat.T * (wei * xmat)).I * (xmat.T * (wei * ymat.T))
    return W

def localWeightRegression(xmat, ymat, k):
    m, n = np.shape(xmat)
    ypred = np.zeros(m)
    for i in range(m):
        ypred[i] = xmat[i] * localWeight(xmat[i], xmat, ymat, k)
    return ypred

data = pd.read_csv('lowweightregressiondataset.csv')
bill = np.array(data.total_bill)
tip = np.array(data.tip)
mbill = np.mat(bill)
mtip = np.mat(tip)
m = np.shape(mbill)[1]
one = np.mat(np.ones(m))
x = np.hstack((one.T, mbill.T))   # design matrix with a bias column

ypred = localWeightRegression(x, mtip, 3)
SortIndex = x[:, 1].argsort(0)
xsort = x[SortIndex][:, 0]

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.scatter(bill, tip, color='green')
ax.plot(xsort[:, 1], ypred[SortIndex], color='red', linewidth=3)
plt.xlabel('Total Bill')
plt.ylabel('Tip')
plt.show()


Dataset: -
lowweightregressiondataset.csv:

total_bill tip sex smoker day time size


28.17 6.5 Female Yes Sat Dinner 3
12.9 1.1 Female Yes Sat Dinner 2
28.15 3 Male Yes Sat Dinner 5
11.59 1.5 Male Yes Sat Dinner 2
7.74 1.44 Male Yes Sat Dinner 2
30.14 3.09 Female Yes Sat Dinner 4
12.16 2.2 Male Yes Fri Lunch 2
13.42 3.48 Female Yes Fri Lunch 2
8.58 1.92 Male Yes Fri Lunch 1
15.98 3 Female No Fri Lunch 3
13.42 1.58 Male Yes Fri Lunch 2
16.27 2.5 Female Yes Fri Lunch 2
10.09 2 Female Yes Fri Lunch 2
20.45 3 Male No Sat Dinner 4
13.28 2.72 Male No Sat Dinner 2
13.51 2 Male Yes Thur Lunch 2
18.71 4 Male Yes Thur Lunch 3
12.74 2.01 Female Yes Thur Lunch 2
13 2 Female Yes Thur Lunch 2
16.4 2.5 Female Yes Thur Lunch 2
20.53 4 Male Yes Thur Lunch 4
16.47 3.23 Female Yes Thur Lunch 3
26.59 3.41 Male Yes Sat Dinner 3
38.73 3 Male Yes Sat Dinner 4
24.27 2.03 Male Yes Sat Dinner 2
12.76 2.23 Female Yes Sat Dinner 2
30.06 2 Male Yes Sat Dinner 3
25.89 5.16 Male Yes Sat Dinner 4
48.33 9 Male No Sat Dinner 4
13.27 2.5 Female Yes Sat Dinner 2


Output: -

[Figure: scatter plot of Total Bill vs Tip, with the data points in green and the fitted Locally Weighted Regression curve drawn in red]
