Data Structures. MOD_4
Binary Tree
● A tree where each node has at most two children (left and right).
● Used for various applications, including expression trees and binary heaps.
3. AVL Tree
4. Red-Black Tree
● A balanced binary search tree with an additional property of color (red or black) for each
node.
● Ensures that the tree remains approximately balanced, providing O(log n) time
complexity for operations.
5. B-Tree
● A self-balancing tree data structure that maintains sorted data and allows searches,
sequential access, insertions, and deletions in logarithmic time.
● Commonly used in databases and file systems.
● Height: The height of a tree is defined as the number of edges on the longest path from
the root to a leaf. For a tree with only one node (the root), the height is 0.
● Depth: The depth of a node is the number of edges from the root to that node. The root
node has a depth of 0.
● Structure: Each node has at most two children, referred to as the left child and the right
child.
● Recursive Definition: A binary tree is either empty or consists of a root node and two
subtrees (left and right).
● Traversal: Common traversal methods include in-order, pre-order, and post-order.
● Threaded Binary Tree: In a threaded binary tree, null pointers are replaced with
pointers to the in-order predecessor or successor, making in-order traversal faster
without using a stack or recursion.
● Standard Binary Tree: In a standard binary tree, missing children are stored as null
pointers, and traversal typically requires recursion or an explicit stack.
1. Start at the root: Compare the value to be inserted with the root node.
2. Go left or right:
○ If the value is less than the current node's value, move to the left child.
○ If the value is greater, move to the right child.
3. Repeat: Continue this process until you find a null position where the new node can be
inserted.
4. Insert the node: Create a new node and attach it to the appropriate null position.
1. Perform standard BST deletion: Locate the node to be deleted and remove it using the
standard BST deletion process.
2. Rebalance the tree: After deletion, check the balance factor of each node from the
deleted node's parent up to the root.
3. Perform rotations:
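○ If the balance factor (left height minus right height) is greater than 1, the node is
left-heavy: perform a right rotation, or a left-right rotation if its left child is right-heavy.
○ If the balance factor is less than -1, the node is right-heavy: perform a left rotation,
or a right-left rotation if its right child is left-heavy.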
4. Update heights: After rotations, update the heights of the affected nodes.
○ In a balanced BST, the height of the tree is logarithmic relative to the number of
nodes, allowing for efficient searching.
● Worst Case: O(n)
○ In an unbalanced BST (e.g., a tree that resembles a linked list), the height can be
equal to the number of nodes, leading to linear time complexity.
○ AVL trees provide guaranteed logarithmic time complexity for search, insertion,
and deletion operations due to their self-balancing nature.
○ Binary trees are used to implement binary heaps, which are essential for priority
queues.
○ Heaps are used in algorithms like heapsort.
3. Expression Trees:
● Expression Trees:
○ Binary trees can represent mathematical expressions where each internal node
is an operator (e.g., +, -, *, /) and each leaf node is an operand (e.g., numbers).
● Parsing Process:
     / \
    +   2
   / \
  3   5
● Evaluation:
1. Balanced: B-trees maintain balance by ensuring that all leaf nodes are at the same
level.
2. Multi-way Tree: Each node can have multiple children (more than two), which allows for
a higher branching factor.
3. Sorted Order: Keys within each node are stored in sorted order, allowing for efficient
searching.
4. Node Capacity: Each node can contain a predefined number of keys (between a
minimum and maximum), which helps in maintaining balance.
5. Dynamic Growth: B-trees can grow and shrink dynamically as keys are inserted or
deleted.
○ B-Trees: Commonly used in databases and file systems for efficient disk access.
○ BST: Used in memory-based applications where data is frequently accessed.
Insertion Algorithm:
1. Find the appropriate leaf node: Traverse the tree to find the correct leaf node where
the new key should be inserted.
2. Insert the key: If the node has space, insert the key in sorted order.
3. Split the node: If the node is full, split it into two nodes and promote the middle key to
the parent node.
4. Repeat: If the parent node is also full, repeat the split process up to the root.
Deletion Algorithm:
1. Find the key: Traverse the tree to locate the key to be deleted.
2. Delete the key:
○ If the key is in a leaf node, simply remove it.
○ If the key is in an internal node, replace it with its predecessor or successor and
delete that key.
3. Rebalance: If a node has fewer than the minimum number of keys after deletion, borrow
a key from a sibling or merge with a sibling.
○ B-trees are optimized for systems that read and write large blocks of data,
making them suitable for databases and file systems.
1. Graph: A collection of vertices (or nodes) and edges (connections between the vertices).
2. Vertex (Node): A fundamental unit of a graph, representing an entity.
3. Edge: A connection between two vertices, which can be directed or undirected.
4. Directed Graph (Digraph): A graph where edges have a direction, indicating a one-way
relationship.
5. Undirected Graph: A graph where edges have no direction, indicating a two-way
relationship.
6. Degree: The number of edges connected to a vertex. In directed graphs, it can be split
into in-degree (incoming edges) and out-degree (outgoing edges).
7. Path: A sequence of edges that connects a sequence of vertices.
8. Cycle: A path that starts and ends at the same vertex without repeating any edges.
9. Connected Graph: A graph where there is a path between every pair of vertices.
10. Subgraph: A graph formed from a subset of the vertices and edges of another graph.
● Directed Graph:
○ A list of edges, where each edge is represented as a pair (or tuple) of vertices.
○ Useful for sparse graphs and simple edge-based operations.
4. Incidence Matrix:
○ Adjacency List: O(V + E), where V is the number of vertices and E is the
number of edges. More space-efficient for sparse graphs.
○ Adjacency Matrix: O(V^2). Requires space for all possible edges, regardless of
whether they exist.
2. Time Complexity for Operations:
○ Adjacency List:
■ Checking for the existence of an edge: O(deg(v)), which is O(V) in the worst
case (need to search through the vertex's neighbor list).
■ Adding an edge: O(1) (just append to the list).
○ Adjacency Matrix:
■ Checking for the existence of an edge: O(1) (direct access).
■ Adding an edge: O(1) (direct access).
3. Use Cases:
○ Adjacency List: Preferred for sparse graphs where the number of edges is much
less than the maximum possible (V^2).
○ Adjacency Matrix: Useful for dense graphs where the number of edges is close
to the maximum possible.
○ Definition: A graph in which each edge has an associated weight (or cost),
representing a value such as distance, time, or capacity.
○ Characteristics:
■ Edges can have positive, negative, or zero weights.
■ Useful for applications like shortest path algorithms (e.g., Dijkstra's for
non-negative weights, or Bellman-Ford when negative weights are present).
■ The weight of an edge influences the overall cost of traversing the graph.
2. Unweighted Graph:
○ Definition: A graph in which edges do not have weights; all edges are
considered equal.
○ Characteristics:
■ Typically used in scenarios where the presence of an edge is more
important than the cost of traversing it.
■ Algorithms like Breadth-First Search (BFS) can be used to find the
shortest path in terms of the number of edges.
● Complete Graph:
○ Definition: A graph in which every pair of distinct vertices is connected by a
unique edge.
○ Characteristics:
■ If a complete graph has n vertices, it contains n(n-1)/2 edges.
■ Denoted as K_n, where n is the number of vertices.
■ Every vertex is directly connected to every other vertex, making it highly
interconnected.
#include <stdio.h>
#include <stdlib.h>

#define MAX_VERTICES 100

void bfs(int graph[MAX_VERTICES][MAX_VERTICES], int num_vertices, int start) {
    int visited[MAX_VERTICES] = {0};
    int queue[MAX_VERTICES], front = 0, rear = 0;

    queue[rear++] = start; // Enqueue the start vertex
    visited[start] = 1;    // Mark as visited

    while (front < rear) {
        int vertex = queue[front++]; // Dequeue the front vertex
        printf("%d ", vertex);       // Process the vertex

        for (int i = 0; i < num_vertices; i++) {
            if (graph[vertex][i] == 1 && !visited[i]) { // Check for an edge and if not visited
                queue[rear++] = i; // Enqueue the neighbor
                visited[i] = 1;    // Mark as visited
            }
        }
    }
}

int main() {
    int graph[MAX_VERTICES][MAX_VERTICES] = {0};
    int num_vertices = 6;

    // Example graph (adjacency matrix)
    graph[0][1] = 1; graph[0][2] = 1; // A -> B, A -> C
    graph[1][3] = 1; graph[1][4] = 1; // B -> D, B -> E
    graph[2][5] = 1;                  // C -> F

    printf("BFS: ");
    bfs(graph, num_vertices, 0); // Start from vertex 0 (A)
    return 0;
}
Dijkstra's Algorithm is a popular algorithm used to find the shortest paths from a starting
node (or vertex) to all other nodes in a weighted graph with non-negative edge weights.
How It Works:
1. Initialization:
○ Set the distance to the starting node to 0 and all other nodes to infinity.
○ Create a priority queue (or min-heap) to store nodes based on their current
shortest distance.
2. Processing Nodes:
○ The algorithm continues until all nodes have been processed, resulting in the
shortest path from the starting node to all other nodes.
The time complexity of Dijkstra's algorithm depends on the data structure used for the priority
queue:
● Using a simple array: O(V^2), where V is the number of vertices. This is because for
each vertex, you may need to scan through all vertices to find the minimum distance.
● Using a binary heap (priority queue): O((V + E) log V), where E is the number of
edges. This is more efficient because:
Warshall's Algorithm is used to find the transitive closure of a directed graph. For every
pair of vertices, it determines whether a path exists from the first vertex to the second.
Purpose:
● Transitive Closure: It creates a matrix that indicates whether a path exists between
pairs of vertices. If there is a path from vertex A to vertex B, the matrix entry will be true
(or 1); otherwise, it will be false (or 0).
● Applications: Useful in various applications, such as:
○ Analyzing reachability in networks.
○ Finding paths in databases.
○ Solving problems related to connectivity in graphs.
1. Subgraph: It is a subgraph that includes all the vertices of the original graph.
2. Connected: The spanning tree is connected, meaning there is a path between any two
vertices.
3. Acyclic: It contains no cycles, which means there are no closed loops.
4. Edges: For a graph with V vertices, a spanning tree will have exactly V - 1 edges.
5. Minimum Weight: In the case of a weighted graph, a minimum spanning tree (MST) is a
spanning tree with the minimum possible total edge weight.
Kruskal's Algorithm is a greedy algorithm used to find the minimum spanning tree of a
connected, weighted graph. Here’s how it works:
1. Sort Edges: Start by sorting all the edges in non-decreasing order of their weights.
2. Initialize: Create a forest (a set of trees), where each vertex is a separate tree. Also,
create a union-find data structure to keep track of connected components.
3. Process Edges:
○ Iterate through the sorted edges and for each edge:
■ Check if the edge connects two different trees (using the union-find
structure).
■ If it does, add the edge to the minimum spanning tree and unite the two
trees.
4. Termination: The algorithm stops when there are V - 1 edges in the spanning tree,
where V is the number of vertices.
Prim's Algorithm is another greedy algorithm used to find the minimum spanning tree of a
connected, weighted graph. Here’s how it works:
1. Initialization: Start with a single vertex (arbitrarily chosen) and mark it as part of the
minimum spanning tree.
2. Expand Tree:
○ While there are vertices not yet included in the tree:
■ Find the edge with the minimum weight that connects a vertex in the tree
to a vertex outside the tree.
■ Add this edge and the new vertex to the tree.
3. Termination: The algorithm continues until all vertices are included in the minimum
spanning tree.
● Approach:
○ Kruskal's Algorithm: Works by sorting edges and adding them one by one,
ensuring no cycles are formed.
○ Prim's Algorithm: Grows the spanning tree from a starting vertex by adding the
minimum edge that connects the tree to a new vertex.
● Data Structure: