LIST OF EXPERIMENTS
01 IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS (BFS, DFS)
02 IMPLEMENTATION OF INFORMED SEARCH ALGORITHMS (A*, MEMORY-BOUNDED A*)
03 IMPLEMENTATION OF NAÏVE BAYES MODELS
04 IMPLEMENTATION OF BAYESIAN NETWORKS
05 PROGRAM TO BUILD REGRESSION MODELS
06 PROGRAM TO BUILD DECISION TREES AND RANDOM FORESTS
07 PROGRAM TO BUILD SVM MODELS
08 IMPLEMENTATION OF ENSEMBLING TECHNIQUES
09 IMPLEMENTATION OF CLUSTERING ALGORITHMS
10 IMPLEMENTATION OF EM FOR BAYESIAN NETWORKS
11 PROGRAM TO BUILD SIMPLE NEURAL NETWORK MODELS
12 PROGRAM TO BUILD DEEP LEARNING NN MODELS
EX.NO:1A IMPLEMENTATION OF UNINFORMED SEARCH
DATE: ALGORITHM (BFS)
Aim:
To implement the uninformed search algorithm using BFS (Breadth-First Search).
Algorithm:
The steps of the algorithm are as follows:
1. Start by putting any one of the graph's vertices at the back of the queue.
2. Take the front item of the queue and add it to the visited list.
3. Create a list of that vertex's adjacent nodes. Add those which are not in the
visited list to the rear of the queue.
4. Repeat steps 2 and 3 until the queue is empty.
BFS pseudocode
The pseudocode for BFS is as follows:
create a queue Q
mark v as visited and put v into Q
while Q is non-empty
    remove the head u of Q
    mark and enqueue all (unvisited) neighbors of u
Program:
graph = {
    '5' : ['3','7'], '3' : ['2', '4'], '7' : ['8'], '2' : [], '4' : ['8'], '8' : []
}
visited = []   # list of visited nodes
queue = []     # initialize the queue
def bfs(visited, graph, node):
    visited.append(node)
    queue.append(node)
    while queue:
        m = queue.pop(0)            # dequeue the node at the front of the queue
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')   # function call
Output:
Result:
Thus the program to implement uninformed search algorithm using BFS is
executed and the output is verified successfully.
EX.NO:1B IMPLEMENTATION OF UNINFORMED SEARCH
DATE: ALGORITHM(DFS)
Aim:
To implement the uninformed search algorithm using DFS (Depth-First Search).
Algorithm:
The DFS algorithm is as follows:
1. Start by putting any one of the graph's vertices on top of the stack.
2. Take the top item of the stack and add it to the visited list.
3. Create a list of that vertex's adjacent nodes. Add the ones which are not in the
visited list to the top of the stack.
4. Keep repeating steps 2 and 3 until the stack is empty.
DFS pseudocode
DFS(G, u)
    u.visited = true
    for each v ∈ G.Adj[u]
        if v.visited == false
            DFS(G, v)

init() {
    for each u ∈ G
        u.visited = false
    for each u ∈ G
        DFS(G, u)
}
Program:
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}
visited = set()   # set to keep track of visited nodes
def dfs(visited, graph, node):
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)
print("Following is the Depth-First Search")
dfs(visited, graph, '5')
Output:
Result:
Thus the program to implement uninformed search algorithm using DFS is
executed and the output is verified successfully.
EX.NO:2A IMPLEMENTATION OF INFORMED SEARCH
DATE: ALGORITHMS (A*)
Aim:
To implement the informed search algorithm using A*.
Algorithm:
1. Place the starting node in the OPEN list.
2. Check if the OPEN list is empty or not, if the list is empty then return
failure and stops.
3. Select the node from the OPEN list which has the smallest value of the
evaluation function (g + h). If node n is the goal node, then return success and
stop; otherwise proceed.
4. Expand node n and generate all of its successors, and put n into the CLOSED
list. For each successor n', check whether n' is already in the OPEN or
CLOSED list; if not, then compute the evaluation function for n' and place it into
the OPEN list.
5. Else, if node n' is already in OPEN or CLOSED, then attach it to the back
pointer which reflects the lowest g(n') value.
6. Return to Step 2.
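As a worked illustration (using the graph and heuristic values defined in the program below): from the start node 'a', the successor 'b' has g(b) = 6 and h(b) = 8, so f(b) = g(b) + h(b) = 14, while 'f' has g(f) = 3 and h(f) = 6, so f(f) = 9. Because f(f) < f(b), the algorithm removes 'f' from the OPEN list and expands it first.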
Program:
graph_nodes = {'a': [('b', 6), ('f', 3)], 'b': [('c', 3), ('d', 2)], 'c': [('d', 1), ('e', 5)],
               'd': [('c', 1), ('e', 8)], 'e': [('i', 5), ('j', 5)], 'f': [('g', 1), ('h', 7)],
               'g': [('i', 3)], 'h': [('i', 2)], 'i': [('e', 5), ('j', 3)]}

def get_neighbours(v):
    if v in graph_nodes:
        return graph_nodes[v]
    else:
        return None

def h(n):
    # heuristic estimates of the distance from each node to the goal 'j'
    h_dist = {'a': 10, 'b': 8, 'c': 5, 'd': 7, 'e': 3, 'f': 6, 'g': 5, 'h': 3, 'i': 1, 'j': 0}
    return h_dist[n]

def astaralgo(start_node, stop_node):
    open_set = set(start_node)
    closed_set = set()
    g = {}            # cost of the cheapest known path from the start to each node
    parents = {}      # parent map used to reconstruct the final path
    g[start_node] = 0
    parents[start_node] = start_node
    while len(open_set) > 0:
        n = None
        # choose the node in OPEN with the lowest f = g + h
        for v in open_set:
            if n is None or g[v] + h(v) < g[n] + h(n):
                n = v
        if n == stop_node or get_neighbours(n) is None:
            pass
        else:
            for (m, weight) in get_neighbours(n):
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight
                else:
                    if g[m] > g[n] + weight:
                        g[m] = g[n] + weight
                        parents[m] = n
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)
        if n is None:
            print("path does not exist")
            return None
        if n == stop_node:
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('path found: {}'.format(path))
            return path
        open_set.remove(n)
        closed_set.add(n)
    print("path does not exist")
    return None

astaralgo('a', 'j')
Output:
Result:
Thus the program to implement informed search algorithm using A* is executed
and the output is verified successfully.
EX.NO:2B IMPLEMENTATION OF INFORMED SEARCH
DATE: ALGORITHMS(MEMORY-BOUNDED A*)
Aim:
To implement the informed search algorithm using memory-bounded A*.
Algorithm:
1. Initialize the open list with the start node and set its g-value to 0 and its f-
value to the heuristic estimate of the distance to the goal node.
2. Initialize the closed list as an empty set.
3. While the open list is not empty:
*Select the node with the lowest f-value from the open list and remove it from
the list.
*If the selected node is the goal node, return the path from the start node to the
goal node.
*Add the selected node to the closed list.
*Expand the selected node by generating its neighboring nodes and computing
their g- and f-values
*For each generated node:
->If the node is already in the closed list, discard it.
->If the node is not in the open list, add it to the open list.
->If the node is in the open list and its g-value is greater than the newly
computed g-value, update its g-value and parent node.
*If the size of the open list exceeds a certain memory limit, remove the node
with the highest f-value from the list.
4.If the open list is empty and the goal node has not been reached, then there is no
path from the start node to the goal node.
Program:
import heapq
import psutil

def heuristic(node, goal):
    # Manhattan distance between grid cells
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

def mba_star(start, goal, neighbors, max_ram):
    pq = []                                  # priority queue ordered by f = g + h
    g_values = {start: 0}
    parents = {start: None}
    max_ram_usage = 0
    heapq.heappush(pq, (heuristic(start, goal), start))
    while pq:
        _, current = heapq.heappop(pq)
        if current == goal:
            path = []
            while current:
                path.append(current)
                current = parents[current]
            path.reverse()
            return path, max_ram_usage
        ram_usage = psutil.Process().memory_info().rss
        if ram_usage > max_ram_usage:
            max_ram_usage = ram_usage
        if max_ram_usage > max_ram:
            return None, max_ram_usage       # memory bound exceeded
        for neighbor in neighbors(current):
            g_value = g_values[current] + 1
            if neighbor not in g_values or g_value < g_values[neighbor]:
                g_values[neighbor] = g_value
                parents[neighbor] = current
                heapq.heappush(pq, (g_value + heuristic(neighbor, goal), neighbor))
    return None, max_ram_usage

# Define the neighbors function for a square grid
def square_neighbors(node):
    i, j = node
    return [(i+1, j), (i-1, j), (i, j+1), (i, j-1)]

start = (0, 0)
goal = (4, 4)
max_ram = 1000000
path, max_ram_usage = mba_star(start, goal, square_neighbors, max_ram)
if path:
    print("Path:", path)
    print("Max RAM usage:", max_ram_usage, "bytes")
else:
    print("Goal not found within maximum RAM usage limit.")
    print("Max RAM usage:", max_ram_usage, "bytes")
Output:
Result:
Thus the program to implement informed search algorithm using memory-
bounded A* is executed and the output is verified successfully.
EX.NO:3 IMPLEMENTATION OF NAÏVE BAYES MODELS
DATE:
Aim:
To write a program to implement Naïve Bayes models.
Algorithm:
1.Prepare labeled dataset with features and labels.
2.Split dataset into training and testing sets.
3.Compute class probabilities using prior probabilities from training set.
4.Compute feature probabilities using conditional probabilities from training set.
5.Predict class labels for testing set by calculating joint probabilities and selecting
class with highest probability.
6.Evaluate model using accuracy, precision, recall, and F1-score.
7.Fine-tune model with adjustments or techniques such as Laplace smoothing.
8.Predict class labels for new samples using trained model by computing joint
probabilities and selecting class with highest probability.
Program:
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1)
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
from sklearn import metrics
print("Gaussian Naive Bayes model accuracy(in %):",
metrics.accuracy_score(y_test, y_pred)*100)
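Step 6 of the algorithm above also mentions precision, recall, and F1-score, whereas the program reports only accuracy. A minimal, hedged extension (reusing the y_test and y_pred variables from the program above) could print a fuller report with scikit-learn:

# Hedged extension (not part of the original record): per-class precision, recall
# and F1-score for the predictions produced by the Gaussian Naive Bayes model above.
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=iris.target_names))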
Output:
Result:
Thus the program to implement the Naïve Bayes model is executed and the output is
verified successfully.
EX.NO:4 IMPLEMENTATION OF BAYESIAN NETWORKS
DATE:
Aim:
To write a program to implement Bayesian networks.
Algorithm:
1.Define the Bayesian network structure by specifying nodes as variables and
directed edges as dependencies.
2.Learn the network structure from data using algorithms like constraint-based
(e.g., PC, FCI) or score-based (e.g., BIC, BDeu) methods.
3.Assign probability distributions to each node based on data or expert
knowledge.
4.Perform probabilistic inference, such as Bayesian updating or belief
propagation, to compute probabilities of unobserved variables given observed
variables.
5.Update probabilities based on new evidence using Bayes' theorem.
6.Perform model validation using techniques like cross-validation or holdout
validation.
7.Fine-tune the network structure or probabilities based on validation results.
8.Use the trained Bayesian network for prediction, decision-making, or
knowledge discovery in various domains, such as healthcare, finance, or
natural language processing.
Datasets:heart.csv
Program:
import numpy as np
import pandas as pd
import csv
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianModel
from pgmpy.inference import VariableElimination
heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?',np.nan)
print('Sample instances from the dataset are given below')
print(heartDisease.head())
print('\n Attributes and datatypes')
print(heartDisease.dtypes)
model=BayesianModel([('age','heartdisease'),('gender','heartdisease'),('exang','heartdisease'),
                     ('cp','heartdisease'),('heartdisease','restecg'),('heartdisease','chol')])
print('\nLearning CPD using Maximum likelihood estimators')
model.fit(heartDisease,estimator=MaximumLikelihoodEstimator)
print('\n Inferencing with Bayesian Network:')
HeartDiseasetest_infer = VariableElimination(model)
print('\n 1. Probability of HeartDisease given evidence= restecg')
q1=HeartDiseasetest_infer.query(variables=['heartdisease'],evidence={'restecg':1})
print(q1)
print('\n 2. Probability of HeartDisease given evidence= cp ')
q2=HeartDiseasetest_infer.query(variables=['heartdisease'],evidence={'cp':2})
print(q2)
Output:
Sample instances from the dataset are given below
age gender cp trestbps chol fbs restecg thalach exang oldpeak \
0 63 1 1 145 233 1 2 150 0 2.3
1 67 1 4 160 286 0 2 108 1 1.5
2 67 1 4 120 229 0 2 129 1 2.6
3 37 1 3 130 250 0 0 187 0 3.5
4 41 0 2 130 204 0 2 172 0 1.4
Result:
Thus the program to implement Bayesian networks is executed and the output is
verified successfully.
EX.NO:5A BUILD REGRESSION MODELS
DATE: (LINEAR REGRESSION)
Aim:
To write a program to build a linear regression model.
Algorithm:
1.Load and preprocess the data: Load training data, preprocess it by handling
missing values, scaling features, and splitting it into training and testing sets.
2.Initialize model parameters: Initialize weight vector w and bias term b.
3.Define the hypothesis function: Define the mathematical function that relates
input features to target predictions.
4.Calculate loss: Compute the loss or error between predicted and actual target
values using a loss function.
5.Update parameters: Update model parameters using an optimization
algorithm, such as gradient descent, to minimize the loss.
6.Repeat: Iterate the process for a certain number of epochs or until
convergence is reached.
7.Predict: Use the learned parameters to make predictions on new data.
8.Evaluate: Evaluate the performance of the model on the test set using
evaluation metrics.
9.Fine-tune: Adjust hyperparameters, such as learning rate and regularization
strength, for better performance.
10.Deploy: Deploy the trained linear regression model for making predictions in
real-world applications.
Program:
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    n = np.size(x)
    m_x = np.mean(x)
    m_y = np.mean(y)
    # cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    plt.scatter(x, y, color="m", marker="o", s=30)
    y_pred = b[0] + b[1]*x
    plt.plot(x, y_pred, color="g")
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

def main():
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
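The estimate_coef function above uses the closed-form least-squares solution: b_1 = SS_xy / SS_xx and b_0 = mean(y) - b_1 * mean(x). As a hedged cross-check (not part of the original record), numpy.polyfit fits the same least-squares line and should return matching coefficients for the sample data:

# Hedged cross-check sketch: np.polyfit with degree 1 returns the slope first
# and the intercept second, which should agree with estimate_coef(x, y) above.
import numpy as np
x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
b_1, b_0 = np.polyfit(x, y, 1)
print("b_0 = {:.4f}, b_1 = {:.4f}".format(b_0, b_1))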
Output:
Result:
Thus the program to build linear regression model is executed and the
output is verified successfully.
EX.NO:5B BUILD REGRESSION MODELS
DATE: (BAYESIAN REGRESSION)
Aim:
To write a program to build a Bayesian regression model.
Algorithm:
1.Specify the dependent variable and independent variables and their relationship.
2.Choose prior distributions for the model parameters.
3.Calculate the likelihood of the data given the model parameters.
4.Calculate the posterior distribution of the model parameters using Bayes' theorem.
5.Use the posterior distribution to predict outcomes for new data.
6.Check the fit of the model to the data using appropriate goodness-of-fit statistics
and diagnostic plots.
7.If the model does not fit the data well, refine the model by changing the model
structure or the prior distributions and repeat the process.
8.Interpret the results in the context of the problem and draw conclusions.
Program:
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.linear_model import BayesianRidge
dataset = load_boston()
X, y = dataset.data, dataset.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
model = BayesianRidge()
model.fit(X_train, y_train)
prediction = model.predict(X_test)
print(f"Test Set r2 score : {r2_score(y_test, prediction)}")
Output:
Result:
Thus the program to implement bayesian regression model is executed and the
output is verified successfully.
EX.NO:6A BUILD DECISION TREES
DATE:
Aim:
To write a program to build decision trees.
Algorithm:
1.Load and preprocess data: Load the training data, preprocess it by handling
missing values, and encoding categorical features.
2.Define splitting criterion: Choose a splitting criterion, such as Gini impurity or
entropy, to determine the best feature for splitting the data at each node.
3.Build the tree: Recursively split the data into subsets based on the chosen
criterion until a stopping condition is met, such as reaching a maximum depth
or minimum number of samples per leaf.
4.Assign labels: Assign the majority class or average value of target variable in
the leaf nodes.
5.Prune the tree: Optionally, prune the tree to prevent overfitting.
6.Predict: Use the trained decision tree to make predictions on new data.
7.Evaluate: Evaluate the performance of the tree using evaluation metrics.
8.Fine-tune: Adjust hyperparameters, such as maximum depth or minimum
samples per leaf, for optimal performance.
9.Visualize: Optionally, visualize the decision tree for interpretability.
10.Deploy: Deploy the trained decision tree for making predictions in real-world
applications.
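For intuition, the entropy used in step 2 can be worked out by hand. For a split containing 9 'yes' and 5 'no' examples (the usual play-tennis counts, used here purely as an illustration, not as the contents of traintennis.csv):
entropy = -(9/14)*log2(9/14) - (5/14)*log2(5/14) ≈ 0.940 bits.
A pure split (all one class) has entropy 0, and information gain is the drop in entropy achieved by splitting on an attribute, which is exactly what compute_gain in the program below calculates.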
Datasets:traintennis.csv,testtennis.csv
Program:
import math
import csv

def load_csv(filename):
    lines = csv.reader(open(filename, "r"))
    dataset = list(lines)
    headers = dataset.pop(0)
    return dataset, headers

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

def subtables(data, col, delete):
    dic = {}
    coldata = [row[col] for row in data]
    attr = list(set(coldata))
    counts = [0] * len(attr)
    r = len(data)
    c = len(data[0])
    for x in range(len(attr)):
        for y in range(r):
            if data[y][col] == attr[x]:
                counts[x] += 1
    for x in range(len(attr)):
        dic[attr[x]] = [[0 for i in range(c)] for j in range(counts[x])]
        pos = 0
        for y in range(r):
            if data[y][col] == attr[x]:
                if delete:
                    del data[y][col]
                dic[attr[x]][pos] = data[y]
                pos += 1
    return attr, dic

def entropy(S):
    attr = list(set(S))
    if len(attr) == 1:
        return 0
    counts = [0, 0]
    for i in range(2):
        counts[i] = sum([1 for x in S if attr[i] == x]) / (len(S) * 1.0)
    sums = 0
    for cnt in counts:
        sums += -1 * cnt * math.log(cnt, 2)
    return sums

def compute_gain(data, col):
    attr, dic = subtables(data, col, delete=False)
    total_size = len(data)
    entropies = [0] * len(attr)
    ratio = [0] * len(attr)
    total_entropy = entropy([row[-1] for row in data])
    for x in range(len(attr)):
        ratio[x] = len(dic[attr[x]]) / (total_size * 1.0)
        entropies[x] = entropy([row[-1] for row in dic[attr[x]]])
        total_entropy -= ratio[x] * entropies[x]
    return total_entropy

def build_tree(data, features):
    lastcol = [row[-1] for row in data]
    if (len(set(lastcol))) == 1:
        node = Node("")
        node.answer = lastcol[0]
        return node
    n = len(data[0]) - 1
    gains = [0] * n
    for col in range(n):
        gains[col] = compute_gain(data, col)
    split = gains.index(max(gains))
    node = Node(features[split])
    fea = features[:split] + features[split+1:]
    attr, dic = subtables(data, split, delete=True)   # attributes (key: values)
    for x in range(len(attr)):
        child = build_tree(dic[attr[x]], fea)
        node.children.append((attr[x], child))
    return node

def print_tree(node, level):
    if node.answer != "":
        print(" " * level, node.answer)
        return
    print(" " * level, node.attribute)
    for value, n in node.children:
        print(" " * (level + 1), value)
        print_tree(n, level + 2)

def classify(node, x_test, features):
    if node.answer != "":
        print(node.answer)
        return
    pos = features.index(node.attribute)
    for value, n in node.children:
        if x_test[pos] == value:
            classify(n, x_test, features)

dataset, features = load_csv("traintennis.csv")
#lastcol=[row[-1] for row in dataset]
node1 = build_tree(dataset, features)
print("The decision tree for the dataset using ID3 algorithm is")
print_tree(node1, 0)
testdata, features = load_csv("testtennis.csv")
for xtest in testdata:
    print("The test instance:", xtest)
    print("The label for test instance:", end=" ")
    classify(node1, xtest, features)
Output:
Result:
Thus the program to build decision trees is executed and the output is verified
successfully.
EX.NO:6B BUILD RANDOM FOREST
DATE:
Aim:
To write a program to build a random forest.
Algorithm:
1.Randomly sample a bootstrap training set from the original training data.
2.Randomly select a subset of features.
3.Train a decision tree on the bootstrap sample using the selected features,
with a maximum depth.
4.Repeat steps 1-3 for a specified number of decision trees.
5.During prediction, each tree independently predicts the class.
6.The majority class among all trees is taken as the final prediction. This
ensemble approach improves accuracy and reduces overfitting.
7.Repeat steps 1-6 for each prediction
Dataset:iris.csv
Program:
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from mlxtend.plotting import plot_decision_regions
from sklearn.metrics import accuracy_score
from sklearn.ensemble import RandomForestClassifier
import numpy as np
iris = datasets.load_iris()
X = iris.data[:, 2:]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
random_state=1, stratify=y)
forest = RandomForestClassifier(criterion='gini',
n_estimators=5,
random_state=1,
n_jobs=2)
forest.fit(X_train, y_train)
y_pred = forest.predict(X_test)
print('Accuracy: %.3f' % accuracy_score(y_test, y_pred))
from mlxtend.plotting import plot_decision_regions
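The program imports plot_decision_regions from mlxtend, but the plotting call itself does not appear in the record. A hedged sketch of how the decision regions might be visualised (assuming the forest, X_train, and y_train variables defined above) is:

# Hedged sketch (not part of the original record): visualise the random forest's
# decision regions over the two petal features used for training.
from mlxtend.plotting import plot_decision_regions
import matplotlib.pyplot as plt
plot_decision_regions(X_train, y_train, clf=forest, legend=2)
plt.xlabel('Petal length (cm)')
plt.ylabel('Petal width (cm)')
plt.title('Random forest decision regions')
plt.show()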
Output:
Result:
Thus the program to build a random forest is executed and the output is verified
successfully.
EX.NO:7 BUILD SVM MODELS
DATE:
Aim:
To write a program to build SVM models.
Algorithm:
1.Data Preprocessing: Collect and preprocess the input data, including feature
scaling and data splitting for training and testing.
2.Model Training: Select a kernel function (e.g., linear, polynomial, or radial basis
function) and fit the SVM model to the training data. The model aims to find the
optimal hyperplane that maximizes the margin between the two classes while
minimizing misclassifications.
3.Model Evaluation: Evaluate the trained SVM model on the testing data using
performance metrics such as accuracy, precision, recall, and F1-score.
4.Model Tuning: Fine-tune the SVM model by adjusting hyperparameters such as
the regularization parameter (C), kernel parameters, and class weights, using
techniques like cross-validation.
5.Model Deployment: Deploy the trained SVM model in a production environment
for making predictions on new, unseen data.
6.Model Interpretation: Interpret the SVM model by analyzing support vectors,
margins, and decision boundaries to gain insights into the classification process
and understand the model's behavior.
7.Model Maintenance: Periodically retrain and update the SVM model with new
data to ensure its accuracy and reliability over time.
Program:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target
C = 1.0
svc = svm.SVC(kernel ='linear', C = 1).fit(X, y)
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
h = (x_max / x_min)/100
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
plt.subplot(1, 1, 1)
Z = svc.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap = plt.cm.Paired, alpha = 0.8)
plt.scatter(X[:, 0], X[:, 1], c = y, cmap = plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.xlim(xx.min(), xx.max())
plt.title('SVC with linear kernel')
plt.show()
Output:
Result:
Thus the program to build SVM model is executed and the output is
verified successfully.
EX.NO:8A IMPLEMENTATION OF ENSEMBLING TECHNIQUES
DATE: (BAGGING)
Aim:
To write a program to implement ensembling techniques using bagging.
Algorithm:
1.Input: Training dataset (X, y), base model, n_estimators, subset_size.
Initialize an empty list for trained base models.
2.Repeat n_estimators times:
a. Randomly sample subset_size instances from X with replacement.
b. Train base model on the bootstrap sample.
c. Store trained base model in the list.
3.Output: List of trained base models.
4.For prediction: Input: Test dataset (X_test).
a. For each trained base model, make predictions on X_test.
b. Aggregate predictions using voting or averaging.
c. Output: Ensemble predictions for X_test.
Program:
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
n_redundant=5, random_state=5)
model = BaggingClassifier()
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1,
error_score='raise')
print('Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
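The algorithm above refers to an explicit base model and a subset_size, whereas the program relies on BaggingClassifier's defaults. A hedged sketch showing these knobs explicitly (note: older scikit-learn versions name the first argument base_estimator instead of estimator) is:

# Hedged sketch (not part of the original record): bagging with an explicit base model,
# number of estimators, and bootstrap subset size; reuses X, y from make_classification above.
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

bag = BaggingClassifier(estimator=DecisionTreeClassifier(),
                        n_estimators=50,       # number of bootstrap models
                        max_samples=0.8,       # fraction of X drawn for each bootstrap sample
                        bootstrap=True,
                        random_state=1)
bag.fit(X, y)
print('Bagging accuracy on training data: %.3f' % bag.score(X, y))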
Output:
Result:
Thus the program to implement ensembling techniques using bagging is
executed and the output is verified successfully.
EX.NO:8B IMPLEMENTATION OF ENSEMBLING TECHNIQUES
DATE: (BOOSTING)
Aim:
To write a program to implement ensembling techniques using boosting.
Algorithm:
1.Input: Training dataset (X, y), base model, n_estimators.
2.Initialize model weights and empty list for trained base models.
3.Repeat n_estimators times:
a. Train base model on (X, y) with weights.
b. Calculate errors/residuals on the training dataset.
c. Update model weights based on errors.
d. Store trained base model in list.
4.For prediction:
Input: Test dataset X_test.
For each trained base model, make predictions on X_test.
Aggregate predictions using weighted averaging.
Output: Ensemble predictions for X_test.
Program:
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import GradientBoostingClassifier
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
n_redundant=5, random_state=7)
model = GradientBoostingClassifier()
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate the model on the dataset
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
Output:
Result:
Thus the program to implement ensembling technique using boosting is executed
and the output is verified successfully.
EX.NO:8C IMPLEMENTATION OF ENSEMBLING TECHNIQUES
DATE: (STACKING)
Aim:
To write a program to implement ensembling techniques using stacking.
Algorithm:
1.Split the training dataset into n folds using RepeatedStratifiedKFold, as this is
the most common approach to preparing training datasets for meta-models.
2.The base model is fitted on the first n-1 folds and makes predictions for the nth
fold.
3.The predictions made in the above step are added to the x1_train list.
4.Repeat steps 2 & 3 for the remaining folds, which gives an x1_train array of size n.
5.Now the model is trained on all n parts, and it makes predictions for the sample
data.
6.Add these predictions to the y1_test list.
7.In the same way, find x2_train, y2_test, x3_train, and y3_test by using Models 2
and 3 for training, respectively, to get level-2 predictions.
8.Now train the meta-model on the level-1 predictions, where these predictions are
used as features for the model.
9.Finally, the meta-learner can be used to make predictions on test data in the
stacking model.
Program:
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from matplotlib import pyplot
def get_dataset():
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
n_redundant=5, random_state=1)
return X, y
def get_models():
models = dict()
models['lr'] = LogisticRegression()
models['knn'] = KNeighborsClassifier()
models['cart'] = DecisionTreeClassifier()
models['svm'] = SVC()
models['bayes'] = GaussianNB()
return models
def evaluate_model(model, X, y):
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1,
error_score='raise')
return scores
X, y = get_dataset()
models = get_models()
results, names = list(), list()
for name, model in models.items():
scores = evaluate_model(model, X, y)
results.append(scores)
names.append(name)
print('>%s %.3f (%.3f)' % (name, mean(scores), std(scores)))
pyplot.boxplot(results, labels=names, showmeans=True)
pyplot.show()
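The program above evaluates the five base learners individually; the meta-model step described in steps 8-9 of the algorithm is not shown in the record. A hedged sketch of that final stacking step using scikit-learn's StackingClassifier (one reasonable way to combine the record's base models, not the record's own code) is:

# Hedged sketch (not part of the original record): stack the base learners under a
# logistic-regression meta-model and evaluate with the same cross-validation scheme.
from sklearn.ensemble import StackingClassifier

level0 = [('lr', LogisticRegression()),
          ('knn', KNeighborsClassifier()),
          ('cart', DecisionTreeClassifier()),
          ('svm', SVC()),
          ('bayes', GaussianNB())]
stack = StackingClassifier(estimators=level0, final_estimator=LogisticRegression(), cv=5)
scores = evaluate_model(stack, X, y)           # reuses evaluate_model and X, y from above
print('>stacking %.3f (%.3f)' % (mean(scores), std(scores)))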
Output:
Result:
Thus the program to implement ensembling techniques using stacking is executed
and the output is verified successfully.
EX.NO:9 IMPLEMENTATION OF CLUSTERING ALGORITHMS
DATE: (K-MEANS)
Aim:
To write a program to implement clustering algorithm using k-means algorithm.
Algorithm:
1.Initialize centroids randomly or using other techniques.
2.Assign data points to the nearest centroid based on distance.
3.Update centroids by calculating the mean of assigned data points.
4.Repeat steps 2 and 3 until convergence.
5.Terminate when convergence criteria are met.
6.Return the final cluster centroids.
7.Optionally, predict cluster membership for new data points by assigning them to
the nearest centroid based on distance.
Program:
import matplotlib.pyplot as plt
x = [4, 5, 10, 4, 3, 11, 14 , 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
plt.scatter(x, y)
plt.show()
from sklearn.cluster import KMeans
data = list(zip(x, y))
inertias = []
for i in range(1,11):
kmeans = KMeans(n_clusters=i)
kmeans.fit(data)
inertias.append(kmeans.inertia_)
plt.plot(range(1,11), inertias, marker='o')
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.show()
kmeans = KMeans(n_clusters=2)
kmeans.fit(data)
plt.scatter(x, y, c=kmeans.labels_)
plt.show()
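Step 6 of the algorithm returns the final cluster centroids, which the program above never displays. A hedged sketch (assuming the kmeans object fitted with n_clusters=2 above) that prints them and the assigned labels is:

# Hedged sketch (not part of the original record): inspect the fitted k-means model.
print("Cluster centroids:\n", kmeans.cluster_centers_)   # one (x, y) centre per cluster
print("Cluster labels:", kmeans.labels_)                  # cluster index assigned to each point
print("Final inertia:", kmeans.inertia_)                  # sum of squared distances to centroids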
Output:
Result:
Thus the program to implement the clustering algorithm using k-means is executed
and the output is verified successfully.
EX.NO:10 IMPLEMENTATION OF EM FOR BAYESIAN NETWORKS
DATE:
Aim:
To write a program to implement EM for Bayesian Networks.
Algorithm:
1.Initialize the parameters of the Bayesian network, such as the conditional
probability tables (CPTs) for each node, either randomly or based on prior
knowledge.
2.Iterate the following steps until convergence:
a. Expectation (E) step:
i. Given the observed data, use the current parameter estimates to compute the
posterior probabilities of the hidden variables using Bayes' rule.
ii. Compute the expected values of the hidden variables using the posterior
probabilities.
b. Maximization (M) step:
i. Update the parameter estimates by maximizing the likelihood of the observed
data, taking into account the expected values from the E step.
ii. Update the CPTs for each node based on the computed expected values and
observed data.
3.Check for convergence, which can be based on various criteria such as the change
in parameter estimates or likelihood.
Program:
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
import networkx as nx
import pylab as plt
model = BayesianNetwork([('Guest', 'Host'), ('Price', 'Host')])
cpd_guest = TabularCPD('Guest', 3, [[0.33], [0.33], [0.33]])
cpd_price = TabularCPD('Price', 3, [[0.33], [0.33], [0.33]])
cpd_host = TabularCPD('Host', 3, [[0, 0, 0, 0, 0.5, 1, 0, 1, 0.5],
[0.5, 0, 1, 0, 0, 0, 1, 0, 0.5],
[0.5, 1, 0, 1, 0.5, 0, 0, 0, 0]],
evidence=['Guest', 'Price'], evidence_card=[3, 3])
model.add_cpds(cpd_guest, cpd_price, cpd_host)
model.check_model()
from pgmpy.inference import VariableElimination
infer = VariableElimination(model)
posterior_p = infer.query(['Host'], evidence={'Guest': 2, 'Price': 2})
print(posterior_p)
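The program above builds the Monty Hall network by hand and performs exact inference; the EM parameter-learning loop described in the algorithm is not shown in the record. A hedged sketch of how EM could be run with pgmpy is given below; it assumes pgmpy's ExpectationMaximization estimator and the latents argument of BayesianNetwork are available in the installed version, and the toy data frame is made up purely for illustration:

# Hedged sketch (not part of the original record): EM parameter learning in pgmpy with
# 'Host' treated as a latent (unobserved) variable. API names are assumptions about pgmpy.
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import ExpectationMaximization

# Toy observations of Guest and Price only (illustrative, not real data).
data = pd.DataFrame({'Guest': [0, 1, 2, 0, 1, 2, 0, 1],
                     'Price': [1, 2, 0, 2, 0, 1, 0, 2]})

em_model = BayesianNetwork([('Guest', 'Host'), ('Price', 'Host')], latents={'Host'})
estimator = ExpectationMaximization(em_model, data)
cpds = estimator.get_parameters(latent_card={'Host': 3})   # E-step and M-step run internally
for cpd in cpds:
    print(cpd)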
Output:
Result:
Thus the program to implement EM for Bayesian networks is executed and the
output is verified successfully.
EX.NO:11 TO BUILD SIMPLE NEURAL NETWORK MODEL
DATE:
Aim:
To write a program to build simple neural network models.
Algorithm:
1.Define the problem and the dataset you want to work with.
2.Preprocess the data by normalizing or standardizing it.
3.Split the data into training and testing sets.
4.Define the architecture of your neural network by selecting the
number of layers, the number of nodes in each layer, and the
activation function for each layer.
5.Initialize the weights and biases randomly.
6.Train the neural network by feeding the training data through the
network and updating the weights and biases using an optimizer and a
loss function.
7.Evaluate the performance of the neural network on the testing data.
8.Tweak the hyperparameters of the neural network (such as learning
rate, batch size, and number of epochs) to improve performance.
9.Repeat steps 6-8 until the performance is satisfactory.
10.Use the trained neural network to make predictions on new data
Program:
import numpy as np

class NeuralNetwork():
    def __init__(self):
        np.random.seed(1)
        # single neuron with 3 input connections and 1 output; weights start in [-1, 1]
        self.synaptic_weights = 2 * np.random.random((3, 1)) - 1

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        return x * (1 - x)

    def train(self, training_inputs, training_outputs, training_iterations):
        for iteration in range(training_iterations):
            # siphon the training data via the neuron
            output = self.think(training_inputs)
            error = training_outputs - output
            adjustments = np.dot(training_inputs.T, error * self.sigmoid_derivative(output))
            self.synaptic_weights += adjustments

    def think(self, inputs):
        inputs = inputs.astype(float)
        output = self.sigmoid(np.dot(inputs, self.synaptic_weights))
        return output

if __name__ == "__main__":
    neural_network = NeuralNetwork()
    print("Beginning Randomly Generated Weights: ")
    print(neural_network.synaptic_weights)
    training_inputs = np.array([[0,0,1], [1,1,1], [1,0,1], [0,1,1]])
    training_outputs = np.array([[0,1,1,0]]).T
    neural_network.train(training_inputs, training_outputs, 15000)
    print("Ending Weights After Training: ")
    print(neural_network.synaptic_weights)
    user_input_one = str(input("User Input One: "))
    user_input_two = str(input("User Input Two: "))
    user_input_three = str(input("User Input Three: "))
    print("Considering New Situation: ", user_input_one, user_input_two, user_input_three)
    print("New Output data: ")
    print(neural_network.think(np.array([user_input_one, user_input_two, user_input_three])))
    print("Wow, we did it!")
Output:
Result:
Thus the program to build simple neural network has been executed and verified
successfully.
EX.NO:12 TO BUILD DEEP LEARNING NN MODELS
DATE:
Aim:
To write a program to build deep learning NN models.
Algorithm:
1.Define the problem and collect/prepare the data.
2.Choose a neural network architecture and define the model.
3.Compile the model by specifying the loss function, optimizer, and
performance metrics.
4.Train the model by feeding the data and updating the weights using
backpropagation.
5.Evaluate the model's performance on a validation set to ensure it is
not overfitting or underfitting.
6.Fine-tune the model by adjusting hyperparameters like learning rate
and regularization.
7.Test the model on a testing set to evaluate its generalization ability.
8.Deploy the model for use in real-world applications, like integrating it
with an API or creating a mobile app.
Program:
import pandas as pd
import numpy as np

df = pd.read_csv("HR_comma_sep1.csv")
print(df.head())
feats = ['department', 'salary']
df_final = pd.get_dummies(df, columns=feats, drop_first=True)

from sklearn.model_selection import train_test_split
X = df_final.drop(['left'], axis=1).values
y = df_final['left'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

import keras
from keras.models import Sequential
from keras.layers import Dense
classifier = Sequential()
classifier.add(Dense(9, kernel_initializer="uniform", activation="relu", input_dim=18))
classifier.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))
classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
classifier.fit(X_train, y_train, batch_size=10, epochs=1)

y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)

new_pred = classifier.predict(sc.transform(np.array([[0.26, 0.7, 3., 238., 6.,
                                                      0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1.]])))
new_pred = (new_pred > 0.5)
print(new_pred)

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score

def make_classifier():
    classifier = Sequential()
    classifier.add(Dense(9, kernel_initializer="uniform", activation="relu", input_dim=18))
    classifier.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))
    classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return classifier

classifier = KerasClassifier(build_fn=make_classifier, batch_size=10, nb_epoch=1)
accuracies = cross_val_score(estimator=classifier, X=X_train, y=y_train, cv=10, n_jobs=-1)
mean = accuracies.mean()
print(mean)
variance = accuracies.var()
print(variance)

from keras.layers import Dropout
classifier = Sequential()
classifier.add(Dense(9, kernel_initializer="uniform", activation="relu", input_dim=18))
classifier.add(Dropout(rate=0.1))
classifier.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))
classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

from sklearn.model_selection import GridSearchCV

def make_classifier(optimizer):
    classifier = Sequential()
    classifier.add(Dense(9, kernel_initializer="uniform", activation="relu", input_dim=18))
    classifier.add(Dense(1, kernel_initializer="uniform", activation="sigmoid"))
    classifier.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])
    return classifier

classifier = KerasClassifier(build_fn=make_classifier)
params = {
    'batch_size': [20, 35],
    'epochs': [2, 3],
    'optimizer': ['adam', 'rmsprop']
}
grid_search = GridSearchCV(estimator=classifier,
                           param_grid=params,
                           scoring="accuracy",
                           cv=2)
grid_search = grid_search.fit(X_train, y_train)
best_param = grid_search.best_params_
best_accuracy = grid_search.best_score_
print(best_param)
print(best_accuracy)
Output:
satisfaction_level last_evaluation number_project average_montly_hours
0 0.38 0.53 2 157
1 0.80 0.86 5 262
2 0.11 0.88 7 272
3 0.72 0.87 5 223
4 0.37 0.52 2 159
time_company Work_accident left promotion_last_5years department
0 3 0 1 0 sales
1 6 0 1 0 sales
2 4 0 1 0 sales
3 5 0 1 0 sales
4 3 0 1 0 sales
salary
0 low
1 medium
2 medium
3 low
4 low
1050/1050 [==============================] - 7s 5ms/step - loss: 0.4274
- accuracy: 0.8044
141/141 [==============================] - 1s 3ms/step
[[3332 82]
[ 661 425]]
1/1 [==============================] - 0s 44ms/step
[[False]]
0.8291254222393036
0.001282633636306265
Epoch 1/2
263/263 [==============================] - 4s 7ms/step - loss: 0.5999 -
accuracy: 0.7662
Epoch 2/2
263/263 [==============================] - 1s 5ms/step - loss: 0.4015 -
accuracy: 0.8209
165/165 [==============================] - 1s 4ms/step
Epoch 1/2
263/263 [==============================] - 4s 6ms/step - loss: 0.5760 -
accuracy: 0.7743
Epoch 2/2
263/263 [==============================] - 2s 8ms/step - loss: 0.3886 -
accuracy: 0.8053
165/165 [==============================] - 2s 7ms/step
Epoch 1/2
263/263 [==============================] - 3s 5ms/step - loss: 0.5866 -
accuracy: 0.7590
Epoch 2/2
263/263 [==============================] - 1s 4ms/step - loss: 0.4430 -
accuracy: 0.7600
165/165 [==============================] - 1s 4ms/step
Epoch 1/2
263/263 [==============================] - 3s 7ms/step - loss: 0.5884 -
accuracy: 0.7663
Epoch 2/2
263/263 [==============================] - 1s 6ms/step - loss: 0.4290 -
accuracy: 0.7857
165/165 [==============================] - 2s 8ms/step
Epoch 1/3
263/263 [==============================] - 3s 6ms/step - loss: 0.5794 -
accuracy: 0.7596
Epoch 2/3
263/263 [==============================] - 2s 6ms/step - loss: 0.3860 -
accuracy: 0.8295
Epoch 3/3
263/263 [==============================] - 3s 10ms/step - loss: 0.3025
- accuracy: 0.8847
165/165 [==============================] - 2s 7ms/step
Epoch 1/3
263/263 [==============================] - 3s 4ms/step - loss: 0.5959 -
accuracy: 0.7606
Epoch 2/3
263/263 [==============================] - 2s 7ms/step - loss: 0.3967 -
accuracy: 0.8282
Epoch 3/3
263/263 [==============================] - 2s 9ms/step - loss: 0.3075 -
accuracy: 0.8794
165/165 [==============================] - 1s 4ms/step
Epoch 1/3
263/263 [==============================] - 3s 5ms/step - loss: 0.5838 -
accuracy: 0.7584
Epoch 2/3
263/263 [==============================] - 1s 5ms/step - loss: 0.4378 -
accuracy: 0.7600
Epoch 3/3
263/263 [==============================] - 1s 4ms/step - loss: 0.3894 -
accuracy: 0.7600
165/165 [==============================] - 1s 4ms/step
Epoch 1/3
263/263 [==============================] - 3s 5ms/step - loss: 0.5860 -
accuracy: 0.7665
Epoch 2/3
263/263 [==============================] - 2s 7ms/step - loss: 0.4261 -
accuracy: 0.7838
Epoch 3/3
263/263 [==============================] - 3s 10ms/step - loss: 0.3654
- accuracy: 0.8091
165/165 [==============================] - 1s 6ms/step
Epoch 1/2
150/150 [==============================] - 3s 6ms/step - loss: 0.6356 -
accuracy: 0.7550
Epoch 2/2
150/150 [==============================] - 1s 5ms/step - loss: 0.4817 -
accuracy: 0.7600
165/165 [==============================] - 1s 4ms/step
Epoch 1/2
150/150 [==============================] - 2s 5ms/step - loss: 0.6499 -
accuracy: 0.7495
Epoch 2/2
150/150 [==============================] - 1s 5ms/step - loss: 0.4908 -
accuracy: 0.7977
165/165 [==============================] - 1s 3ms/step
Epoch 1/2
150/150 [==============================] - 2s 5ms/step - loss: 0.6428 -
accuracy: 0.7546
Epoch 2/2
150/150 [==============================] - 1s 4ms/step - loss: 0.5160 -
accuracy: 0.7756
165/165 [==============================] - 1s 3ms/step
Epoch 1/2
150/150 [==============================] - 2s 4ms/step - loss: 0.6381 -
accuracy: 0.7617
Epoch 2/2
150/150 [==============================] - 1s 4ms/step - loss: 0.5021 -
accuracy: 0.7667
165/165 [==============================] - 1s 3ms/step
Epoch 1/3
150/150 [==============================] - 2s 5ms/step - loss: 0.6394 -
accuracy: 0.7521
Epoch 2/3
150/150 [==============================] - 1s 5ms/step - loss: 0.4872 -
accuracy: 0.8000
Epoch 3/3
150/150 [==============================] - 1s 7ms/step - loss: 0.3800 -
accuracy: 0.8455
165/165 [==============================] - 1s 4ms/step
Epoch 1/3
150/150 [==============================] - 2s 5ms/step - loss: 0.6206 -
accuracy: 0.7665
Epoch 2/3
150/150 [==============================] - 1s 5ms/step - loss: 0.4609 -
accuracy: 0.7667
Epoch 3/3
150/150 [==============================] - 1s 4ms/step - loss: 0.4053 -
accuracy: 0.7667
165/165 [==============================] - 1s 4ms/step
Epoch 1/3
150/150 [==============================] - 3s 4ms/step - loss: 0.6271 -
accuracy: 0.7594
Epoch 2/3
150/150 [==============================] - 1s 5ms/step - loss: 0.4909 -
accuracy: 0.7607
Epoch 3/3
150/150 [==============================] - 1s 6ms/step - loss: 0.4105 -
accuracy: 0.8087
165/165 [==============================] - 1s 4ms/step
Epoch 1/3
150/150 [==============================] - 2s 4ms/step - loss: 0.6306 -
accuracy: 0.7659
Epoch 2/3
150/150 [==============================] - 1s 4ms/step - loss: 0.4909 -
accuracy: 0.7667
Epoch 3/3
150/150 [==============================] - 1s 4ms/step - loss: 0.4138 -
accuracy: 0.7939
165/165 [==============================] - 1s 3ms/step
Epoch 1/3
525/525 [==============================] - 4s 5ms/step - loss: 0.4863 -
accuracy: 0.7828
Epoch 2/3
525/525 [==============================] - 2s 4ms/step - loss: 0.3170 -
accuracy: 0.8344
Epoch 3/3
525/525 [==============================] - 2s 5ms/step - loss: 0.2719 -
accuracy: 0.8415
{'batch_size': 20, 'epochs': 3, 'optimizer': 'adam'}
0.8952272995309765
Result:
Thus the program to build deep learning NN models has been executed and verified
successfully.