DAA Lab Manual
Laboratory Practice –I
Laboratory Manual
Laboratory Objectives:
1. Learn the effect of data preprocessing on the performance of machine learning
algorithms.
4. Learn how to implement algorithms that follow algorithm design strategies, namely
divide and conquer, greedy, dynamic programming, backtracking, branch and bound.
5. Understand and explore the working of Blockchain technology and its applications.
● Machine Learning
CO1: Apply preprocessing techniques on datasets.
CO2: Implement and evaluate linear regression and random forest regression models.
CO5: Implement an algorithm that follows one of the following algorithm design
strategies: divide and conquer, greedy, dynamic programming, backtracking, branch
and bound.
● Blockchain
CO6: Interpret the basic concepts in Blockchain technology and its applications.
Course Contents
GROUP A
Experiment: 1
Write non-recursive and recursive programs to calculate Fibonacci
numbers and analyze their time and space complexity.
First of all, let’s see the differences between a recursive function and a
non-recursive one. A naively written recursive function can have a much higher
time complexity than its non-recursive counterpart, as the Fibonacci example
below shows, although this is not true of every recursive function. A recursive
function generally has smaller code size, whereas a non-recursive one is larger.
In some situations a recursive function is by far the most natural way to
perform a task, while in others a recursive function and a non-recursive one do
it equally well. Here is a recursive version of calculating the Fibonacci number:
/* compute n’th Fibonacci number by using recursion */
int fibonacci(int n){
    if(n<=2)
        return 1;
    else
        return fibonacci(n-1) + fibonacci(n-2);
}
An experienced programmer should have no problem understanding the logic
behind the code. As we can see, in order to compute a Fibonacci number, Fn,
the function needs to compute Fn-1 and Fn-2. The call for Fn-1 recursively
computes Fn-2 and Fn-3, and the call for Fn-2 computes Fn-3 and Fn-4. In a
nutshell, each call recursively computes two values needed to get the result
until control hits the base case, which happens when n<=2. Because the same
values are recomputed many times, the number of calls grows exponentially: the
running time is O(2^n) (more precisely, proportional to Fn itself), while the
recursion depth, and hence the stack space, is O(n). You can write a simple
main() that accepts an integer n as input and outputs the n’th Fibonacci number
by calling this recursive function and see for yourself how slowly it computes
as n gets bigger. It gets horrendously slow once n gets past 40 on my machine.
Here is a non-recursive version that calculates the Fibonacci number:
/* compute n’th Fibonacci number by using a loop */
int fibonacci(int n){
    if(n<=2)
        return 1;
    int i, last, nextToLast, result;
    last = 1;
    nextToLast = 1;
    result = 1;
    for(i=3; i<=n; i++){
        result = last + nextToLast;
        nextToLast = last;
        last = result;
    }
    return result;
}
The logic here is to keep the values already computed in the variables last and
nextToLast in every iteration of the for loop, so that every Fibonacci number
is computed exactly once no matter how big n is. The loop therefore runs in
O(n) time and uses O(1) extra space. Try to replace the recursive version with
this version and see how fast you get the result when n is very big (keep in
mind that a 32-bit int overflows beyond n = 46). By analyzing these examples,
we should have no problem seeing that recursion usually has small code size,
but sometimes the price it pays in the execution time is far too dear.
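To make the difference in running time measurable, the two versions can be
instrumented with counters. The following is a minimal sketch, not part of the
original programs: the function names fibRecursiveCounted and
fibIterativeCounted and the counter parameters are hypothetical, and a 64-bit
unsigned type is assumed so that larger n can be tried without overflow.

```cpp
#include <cstdint>

// Hypothetical helper: the naive recursion, also counting how many calls are made.
std::uint64_t fibRecursiveCounted(int n, std::uint64_t &calls) {
    ++calls;                       // one more invocation of the function
    if (n <= 2) return 1;          // same base case as the recursive version above
    return fibRecursiveCounted(n - 1, calls) + fibRecursiveCounted(n - 2, calls);
}

// Hypothetical helper: the loop version, counting loop iterations in 'steps'.
std::uint64_t fibIterativeCounted(int n, std::uint64_t &steps) {
    if (n <= 2) return 1;
    std::uint64_t last = 1, nextToLast = 1, result = 1;
    for (int i = 3; i <= n; ++i) {
        ++steps;                   // one iteration per Fibonacci number computed
        result = last + nextToLast;
        nextToLast = last;
        last = result;
    }
    return result;
}
```

For n = 20 both functions return 6765, but the recursive one makes 13,529 calls
while the loop runs only 18 iterations. The counters must be set to zero by the
caller before each run.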
Now let’s shift our attention to situations where recursion is the natural
choice. One of the most well-known examples is the clone function for a
binary search tree. Say at some point in the program you want to make two
separate copies of the same tree, called theTree. How do you do that? Many
inexperienced programmers simply declare a new pointer to the tree and make it
point to theTree, as follows:
BinarySearchTree *cloneTree = theTree;
Then they happily think that they have made an identical copy successfully and
proceed to perform operations on cloneTree. The truth is that cloneTree simply
points to what theTree points to; changing either cloneTree or theTree changes
the only tree that exists. Therefore, to have two completely identical and
independent trees, you need to use a function that recursively copies the left
and right subtrees of the original tree to the new tree. The function may look
something like this:
BinaryNode* BinarySearchTree::clone(BinaryNode *t){
    if(t==NULL)
        return NULL;
    else
        /* allocate a new node whose subtrees are themselves fresh copies */
        return new BinaryNode(t->element, clone(t->left), clone(t->right));
}
The first argument to BinaryNode is the data the node contains; the second
argument is a pointer to the BinaryNode that is the root of the left subtree;
the third argument is a pointer to the BinaryNode that is the root of the
right subtree. Basically, this function returns a new node which is identical
to the root of the original tree, recursively constructing the left and the
right subtrees until it runs past the leaf nodes (t==NULL). This operation is
awkward to write non-recursively because you do not know what the tree looks
like in advance; an iterative version is possible, but it needs an explicit
stack or queue to remember the nodes still to be copied. In this type of
situation, recursion is the natural tool.
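The deep-copy idea can be tried in isolation with the sketch below. It is only
an illustration under stated assumptions: the BinaryNode layout (an int element
plus left and right pointers) is hypothetical, and cloneTree and destroyTree
are free functions introduced here rather than members of the
BinarySearchTree class discussed above.

```cpp
// Hypothetical minimal node type; the BinaryNode used in the text is assumed to be similar.
struct BinaryNode {
    int element;
    BinaryNode *left;
    BinaryNode *right;
    BinaryNode(int e, BinaryNode *l, BinaryNode *r) : element(e), left(l), right(r) {}
};

// Deep copy: every node of the original tree gets a freshly allocated twin.
BinaryNode *cloneTree(const BinaryNode *t) {
    if (t == nullptr) return nullptr;              // empty subtree: nothing to copy
    return new BinaryNode(t->element, cloneTree(t->left), cloneTree(t->right));
}

// Release every node of a tree that was built with new.
void destroyTree(BinaryNode *t) {
    if (t == nullptr) return;
    destroyTree(t->left);
    destroyTree(t->right);
    delete t;
}
```

After copying, a change made through the copy (for example, overwriting
copy->left->element) leaves the original tree untouched, which is exactly what
the pointer assignment cloneTree = theTree fails to achieve.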
Experiment: 2
PROBLEM STATEMENT:
● The code length of a character depends on how frequently it occurs in the given text.
● The character which occurs most frequently gets the shortest code.
● The character which occurs least frequently gets the longest code.
Prefix Rule-
● It ensures that the code assigned to any character is not a prefix of the code assigned to any other
character.
Huffman Tree-
The steps involved in the construction of Huffman Tree are as follows-
Step-01:
● Create a leaf node for each character of the text.
Step-02:
● Arrange all the nodes in increasing order of their frequency value.
Step-03:
Considering the first two nodes having minimum frequency,
● Create a new internal node.
● The frequency of this new node is the sum of the frequencies of those two nodes.
● Make the first node the left child and the other node the right child of the newly created node.
Step-04:
● Keep repeating Step-02 and Step-03 until all the nodes form a single tree.
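Time Complexity-
The time complexity analysis of Huffman Coding is as follows-
● With the nodes kept in a min-heap (priority queue), extracting the minimum-frequency node takes O(log n) time.
● For n distinct characters, the minimum is extracted 2 x (n-1) times.
● Thus, the overall time complexity of Huffman Coding is O(n log n), where n is the number of distinct characters in the text.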
Important Formulas-
The following 2 formulas are important to solve the problems based on Huffman Coding-
Formula-01:
Average code length per character
= ∑ ( frequency_i x code length_i ) / ∑ ( frequency_i )
Formula-02:
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
Problem-
A file contains the following characters with the frequencies as shown. If Huffman Coding is
used for data compression, determine-
1. Huffman Code for each character
2. Average code length
3. Length of Huffman encoded message (in bits)
Character   Frequency
a           10
e           15
i           12
o           3
u           4
s           13
t           1
Solution-
Step-01 to Step-07: [figures showing the step-by-step construction of the Huffman Tree]
Now,
● Let us assign weight ‘0’ to the left edges and weight ‘1’ to the right edges.
Rule
● If you assign weight ‘0’ to the left edges, then assign weight ‘1’ to the right edges.
● If you assign weight ‘1’ to the left edges, then assign weight ‘0’ to the right edges.
● But follow the same convention at the time of decoding that is adopted at the time of encoding.
After assigning weight to all the edges, the modified Huffman Tree is-
Now, let us answer each part of the given problem one by one-
● a = 111
● e = 10
● i = 00
● o = 11001
● u = 1101
● s = 01
● t = 11000
From here, we can observe-
● Characters occurring less frequently in the text are assigned the longer codes.
● Characters occurring more frequently in the text are assigned the shorter codes.
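The construction steps above can be sketched with a min-heap as follows. This
is an illustrative sketch under stated assumptions, not a reference solution:
the names HuffNode, assignCodes, huffmanCodes and encodedBits are hypothetical,
left edges are labelled 0 and right edges 1, and ties between equal frequencies
are broken by node index, so individual bit patterns may differ from a
hand-drawn tree even though the code lengths agree.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct HuffNode {
    long freq;
    char symbol;   // meaningful only for leaves
    int left;      // index of the left child in the node vector, -1 for a leaf
    int right;     // index of the right child, -1 for a leaf
};

// Walk the tree and record the bit string of each leaf (left = 0, right = 1).
void assignCodes(const std::vector<HuffNode> &nodes, int idx, const std::string &prefix,
                 std::map<char, std::string> &codes) {
    const HuffNode &nd = nodes[idx];
    if (nd.left < 0) {                                   // leaf node
        codes[nd.symbol] = prefix.empty() ? "0" : prefix; // single-symbol input gets "0"
        return;
    }
    assignCodes(nodes, nd.left, prefix + "0", codes);
    assignCodes(nodes, nd.right, prefix + "1", codes);
}

std::map<char, std::string> huffmanCodes(const std::map<char, long> &freq) {
    std::vector<HuffNode> nodes;
    using Entry = std::pair<long, int>;                  // (frequency, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    for (const auto &kv : freq) {                        // Step-01: one leaf per character
        nodes.push_back({kv.second, kv.first, -1, -1});
        heap.push({kv.second, static_cast<int>(nodes.size()) - 1});
    }
    std::map<char, std::string> codes;
    if (nodes.empty()) return codes;
    while (heap.size() > 1) {                            // Step-03 repeated until one tree remains
        Entry a = heap.top(); heap.pop();                // smallest frequency -> left child
        Entry b = heap.top(); heap.pop();                // next smallest -> right child
        nodes.push_back({a.first + b.first, '\0', a.second, b.second});
        heap.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    assignCodes(nodes, heap.top().second, "", codes);
    return codes;
}

// Total number of bits in the encoded message: sum of frequency x code length.
long encodedBits(const std::map<char, long> &freq, const std::map<char, std::string> &codes) {
    long bits = 0;
    for (const auto &kv : freq)
        bits += kv.second * static_cast<long>(codes.at(kv.first).size());
    return bits;
}
```

For the frequencies of the problem above (a 10, e 15, i 12, o 3, u 4, s 13,
t 1) the code lengths come out as 3, 2, 2, 5, 4, 2 and 5 bits, matching the
codes listed. The encoded message is then 146 bits long for 58 characters, an
average of 146 / 58 ≈ 2.52 bits per character.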
Experiment: 3
PROBLEM STATEMENT:
Write a program to solve a fractional Knapsack problem using a
greedy method.
Operating System recommended :- 64-bit Open source Linux or its derivative
Programming tools recommended: - C++, Java, Python, Solidity, etc
THEORY:
Knapsack Problem-
You are given the following-
● A knapsack (kind of shoulder bag) with limited weight capacity.
● A few items, each having some weight and value.
The problem states- Which items should be placed into the knapsack such that-
● The value or profit obtained by putting the items into the knapsack is maximum.
● And the weight limit of the knapsack is not exceeded.
Knapsack Problem Variants-
Knapsack problem has the following two variants-
1. Fractional Knapsack Problem
2. 0/1 Knapsack Problem
Time Complexity-
● The main time-consuming step is sorting all the items in decreasing order of their
value / weight ratio.
● If the items are already arranged in the required order, then the loop that fills the
knapsack takes O(n) time.
● The average time complexity of Quick Sort is O(nlogn).
● Therefore, the total time taken including the sort is O(nlogn).
PRACTICE PROBLEM BASED ON FRACTIONAL KNAPSACK PROBLEM-
Problem-
For the given set of items and knapsack capacity = 60 kg, find the optimal solution for
the fractional knapsack problem making use of greedy approach.
Step-02: Sort all the items in decreasing order of their value / weight ratio-
I1 I2 I5 I4 I3
(6) (4) (3.6) (3.5) (3)
Step-03: Start filling the knapsack by putting the items into it one by one.
Now,
● Knapsack weight left to be filled is 20 kg, but item-4 has a weight of 22 kg.
● Since in the fractional knapsack problem even a fraction of an item can be taken,
only 20 kg of item-4 (20/22 of it) is placed into the knapsack.
So, the knapsack will contain the following items-
< I1 , I2 , I5 , (20/22) I4 >
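Important Note-
Had the problem been a 0/1 knapsack problem, the same greedy procedure could not take
a fraction of I4, and the knapsack would contain the following items-
< I1 , I2 , I5 >
The knapsack’s total value would be 160 units. Note that the greedy method is not
guaranteed to give the optimal answer for the 0/1 variant, which is why that variant is
solved with dynamic programming or branch and bound.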
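The greedy method for the fractional problem can be sketched as below. The item
table of the practice problem is not reproduced in the text, so the data in
sampleItems() is an ASSUMPTION: weights and values chosen to be consistent with
the ratios 6, 4, 3, 3.5 and 3.6 listed for I1 to I5, the 22 kg weight of item-4
and the 160 units obtained from I1, I2 and I5. The names Item,
fractionalKnapsack and sampleItems are likewise hypothetical.

```cpp
#include <algorithm>
#include <vector>

struct Item {
    double weight;  // in kg, assumed positive
    double value;   // in units of profit
};

// Greedy fractional knapsack: returns the maximum total value for the given capacity.
double fractionalKnapsack(std::vector<Item> items, double capacity) {
    // Sort by value/weight ratio, highest first (cross-multiplied to avoid division).
    std::sort(items.begin(), items.end(), [](const Item &a, const Item &b) {
        return a.value * b.weight > b.value * a.weight;
    });
    double total = 0.0;
    for (const Item &it : items) {
        if (capacity <= 0) break;              // knapsack is full
        if (it.weight <= capacity) {           // the whole item fits
            total += it.value;
            capacity -= it.weight;
        } else {                               // take only the fraction that fits
            total += it.value * (capacity / it.weight);
            capacity = 0;
        }
    }
    return total;
}

// ASSUMED data for I1..I5 as {weight, value}; not taken from the text.
std::vector<Item> sampleItems() {
    return {{5, 30}, {10, 40}, {15, 45}, {22, 77}, {25, 90}};
}
```

With these assumed items and a capacity of 60 kg, the function takes I1, I2 and
I5 whole (160 units) and 20/22 of I4 (70 units), for a total of about 230 units.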
Experiment: 4
PROBLEM STATEMENT:
Write a program to solve a 0-1 Knapsack problem using dynamic
programming or branch and bound strategy.
THEORY:
Consider-
● Knapsack weight capacity = w
● Number of items each having some weight and value = n
0/1 knapsack problem is solved using dynamic programming in the following steps-
Step-01:
● Draw a table say ‘T’ with (n+1) number of rows and (w+1) number of columns.
● Fill all the boxes of the 0th row and 0th column with zeroes.
Step-02:
Start filling the table row wise top to bottom from left to right.
Use the following formula-
T(i , j) = max { T(i-1 , j) , value_i + T(i-1 , j - weight_i) }
If weight_i > j, item i cannot fit and T(i , j) = T(i-1 , j).
Here, T(i , j) = maximum value of the selected items if we can take items 1 to i and
have a weight restriction of j.
● This step leads to completely filling the table.
● Then, value of the last box represents the maximum possible value that can be put into
the knapsack.
Step-03:
To identify the items that must be put into the knapsack to obtain that maximum
profit,
● Start at the last entry of the table, i.e. row n of the last column (j = w).
● Scan the entries from bottom to top.
● On encountering an entry whose value is not the same as the value stored in the entry
immediately above it, mark the row label i of that entry and move to column
j - weight_i before continuing upward.
● After all the rows are scanned, the marked labels represent the items that must be put
into the knapsack.
Time Complexity-
● Each entry of the table requires constant time θ(1) for its computation.
● It takes θ(nw) time to fill (n+1)(w+1) table entries.
● It takes θ(n) time for tracing the solution since tracing process traces the n rows.
● Thus, overall θ(nw) time is taken to solve 0/1 knapsack problem using dynamic
programming.
Problem-
For the given set of items and knapsack capacity = 5 kg, find the optimal solution for
the 0/1 knapsack problem making use of dynamic programming approach.
Item   Weight (kg)   Value
1      2             3
2      3             4
3      4             5
4      5             6
Solution-
Given-
● Knapsack capacity (w) = 5 kg
● Number of items (n) = 4
Step-01:
● Draw a table say ‘T’ with (n+1) = 4 + 1 = 5 number of rows and (w+1) = 5 + 1 = 6
number of columns.
● Fill all the boxes of the 0th row and 0th column with 0.
Step-02:
Start filling the table row wise top to bottom from left to right using the formula-
T(i , j) = max { T(i-1 , j) , value_i + T(i-1 , j - weight_i) }
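Finding T(1,1)-
We have,
● i = 1
● j = 1
● value_i = value_1 = 3
● weight_i = weight_1 = 2
Since weight_1 = 2 exceeds j = 1, item 1 cannot be taken, so-
T(1,1) = T(0,1) = 0
Finding T(1,2)-
We have,
● i = 1
● j = 2
● value_i = value_1 = 3
● weight_i = weight_1 = 2
Substituting the values, we get-
T(1,2) = max { T(1-1 , 2) , 3 + T(1-1 , 2-2) }
T(1,2) = max { T(0,2) , 3 + T(0,0) }
T(1,2) = max {0 , 3+0}
T(1,2) = 3
Finding T(1,3)-
We have,
● i = 1
● j = 3
● value_i = value_1 = 3
● weight_i = weight_1 = 2
Substituting the values, we get-
T(1,3) = max { T(1-1 , 3) , 3 + T(1-1 , 3-2) }
T(1,3) = max { T(0,3) , 3 + T(0,1) }
T(1,3) = max {0 , 3+0}
T(1,3) = 3
Finding T(1,4)-
We have,
● i = 1
● j = 4
● value_i = value_1 = 3
● weight_i = weight_1 = 2
Substituting the values, we get-
T(1,4) = max { T(1-1 , 4) , 3 + T(1-1 , 4-2) }
T(1,4) = max { T(0,4) , 3 + T(0,2) }
T(1,4) = max {0 , 3+0}
T(1,4) = 3
Finding T(1,5)-
We have,
● i = 1
● j = 5
● value_i = value_1 = 3
● weight_i = weight_1 = 2
Substituting the values, we get-
T(1,5) = max { T(1-1 , 5) , 3 + T(1-1 , 5-2) }
T(1,5) = max { T(0,5) , 3 + T(0,3) }
T(1,5) = max {0 , 3+0}
T(1,5) = 3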
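Finding T(2,1)-
We have,
● i = 2
● j = 1
● value_i = value_2 = 4
● weight_i = weight_2 = 3
Since weight_2 = 3 exceeds j = 1, item 2 cannot be taken, so-
T(2,1) = T(1,1) = 0
Finding T(2,2)-
We have,
● i = 2
● j = 2
● value_i = value_2 = 4
● weight_i = weight_2 = 3
Since weight_2 = 3 exceeds j = 2, item 2 cannot be taken, so-
T(2,2) = T(1,2) = 3
Finding T(2,3)-
We have,
● i = 2
● j = 3
● value_i = value_2 = 4
● weight_i = weight_2 = 3
Substituting the values, we get-
T(2,3) = max { T(2-1 , 3) , 4 + T(2-1 , 3-3) }
T(2,3) = max { T(1,3) , 4 + T(1,0) }
T(2,3) = max { 3 , 4+0 }
T(2,3) = 4
Finding T(2,4)-
We have,
● i = 2
● j = 4
● value_i = value_2 = 4
● weight_i = weight_2 = 3
Substituting the values, we get-
T(2,4) = max { T(2-1 , 4) , 4 + T(2-1 , 4-3) }
T(2,4) = max { T(1,4) , 4 + T(1,1) }
T(2,4) = max { 3 , 4+0 }
T(2,4) = 4
Finding T(2,5)-
We have,
● i = 2
● j = 5
● value_i = value_2 = 4
● weight_i = weight_2 = 3
Substituting the values, we get-
T(2,5) = max { T(2-1 , 5) , 4 + T(2-1 , 5-3) }
T(2,5) = max { T(1,5) , 4 + T(1,2) }
T(2,5) = max { 3 , 4+3 }
T(2,5) = 7
Similarly, the entries of rows 3 and 4 are computed with the same formula.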
● The last entry represents the maximum possible value that can be put into the
knapsack.
● So, maximum possible value that can be put into the knapsack = 7.
Following Step-03,
● We mark the rows labelled “1” and “2”.
● Thus, items that must be put into the knapsack to obtain the maximum value 7 are-
Item-1 and Item-2
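The table-filling and tracing steps can be sketched as follows. This is a
minimal illustration rather than a definitive implementation: the names
KnapsackResult and knapsack01 are hypothetical, weights and the capacity are
assumed to be non-negative integers, and the trace moves to column
j - weight_i after marking an item so that it also works when the chosen items
do not all show up as changes in the last column.

```cpp
#include <algorithm>
#include <vector>

struct KnapsackResult {
    int maxValue;
    std::vector<int> chosenItems;  // 1-based item labels in ascending order
};

KnapsackResult knapsack01(const std::vector<int> &weight, const std::vector<int> &value, int w) {
    const int n = static_cast<int>(weight.size());
    // T has (n+1) rows and (w+1) columns; row 0 (no items) stays all zeroes.
    std::vector<std::vector<int>> T(n + 1, std::vector<int>(w + 1, 0));
    for (int i = 1; i <= n; ++i) {
        for (int j = 0; j <= w; ++j) {
            T[i][j] = T[i - 1][j];                       // option 1: leave item i out
            if (weight[i - 1] <= j)                      // option 2: item i fits
                T[i][j] = std::max(T[i][j], value[i - 1] + T[i - 1][j - weight[i - 1]]);
        }
    }
    // Trace back from the last entry to find which items were taken.
    KnapsackResult r{T[n][w], {}};
    int j = w;
    for (int i = n; i >= 1; --i) {
        if (T[i][j] != T[i - 1][j]) {                    // value changed: item i was taken
            r.chosenItems.push_back(i);
            j -= weight[i - 1];                          // remaining capacity
        }
    }
    std::reverse(r.chosenItems.begin(), r.chosenItems.end());
    return r;
}
```

Calling knapsack01({2, 3, 4, 5}, {3, 4, 5, 6}, 5) with the items of the problem
above yields a maximum value of 7 with items 1 and 2 chosen, in agreement with
the worked solution.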
Experiment: 5
PROBLEM STATEMENT:
Design an n-Queens matrix with the first Queen already placed. Use backtracking to place
the remaining Queens and generate the final n-Queens matrix.
Theory
The N-Queen problem is a classical example of backtracking. It is defined as: “given
an N x N chess board, arrange N queens in such a way that no two queens attack
each other by being in the same row, column or diagonal”.
Fig. (d) describes the backtracking sequence for the 4-queen problem.
Fig. (d): Backtracking sequence for 4-queen
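The backtracking search with the first queen fixed, as the problem statement
asks, might be sketched like this. The function names isSafe, placeFrom and
solveNQueens are hypothetical, rows and columns are 0-based, and the first
queen is assumed to sit in row 0 at a column chosen by the caller.

```cpp
#include <cstdlib>
#include <vector>

// cols[r] is the column of the queen in row r; rows 0..row-1 are already filled.
bool isSafe(const std::vector<int> &cols, int row, int col) {
    for (int r = 0; r < row; ++r) {
        if (cols[r] == col) return false;                      // same column
        if (std::abs(cols[r] - col) == row - r) return false;  // same diagonal
    }
    return true;
}

// Try to fill rows row..n-1; returns true once every row holds a queen.
bool placeFrom(std::vector<int> &cols, int row, int n) {
    if (row == n) return true;
    for (int col = 0; col < n; ++col) {
        if (isSafe(cols, row, col)) {
            cols[row] = col;                                   // tentative placement
            if (placeFrom(cols, row + 1, n)) return true;
            // dead end further down: backtrack and try the next column
        }
    }
    return false;
}

// Fix the first queen in row 0 at firstCol; an empty vector means no solution exists.
std::vector<int> solveNQueens(int n, int firstCol) {
    if (n <= 0 || firstCol < 0 || firstCol >= n) return {};
    std::vector<int> cols(n, -1);
    cols[0] = firstCol;
    if (placeFrom(cols, 1, n)) return cols;
    return {};
}
```

For n = 4 with the first queen in column 1, the search returns the columns
1, 3, 0, 2 (that is, 2, 4, 1, 3 when counted from 1). With the first queen in
column 0 it returns an empty vector, because no 4-queen solution starts in a
corner.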
The solution of the 4-queen problem can be seen as a 4-tuple (x_1, x_2, x_3, x_4), where
x_i represents the column number of queen Q_i. Two possible solutions for the 4-