ESC190 Data Structures Algorithms

Uploaded by

hanheelee26
Copyright
© © All Rights Reserved

ESC190: Data Structures & Algorithms

C Tutorial

Tips for the Tests


● Write down test cases with small values.
● Don’t forget to #include the necessary libraries.
● Write pseudocode to think through the problem.
● Look through all the Qs.

Complexity Analysis, Problem-Solving with Recursion

L1: Deep Copy, Problem-solving with Recursion, Review of Complexity Analysis, Shallow Copy vs. Deep Copy

1.0 Shallow Copy of Lists


A shallow copy constructs a new list and then populates it with references to the elements
found in the original. In other words, only the references are copied, not the objects they
refer to.
● Note: The copying process does not recurse and therefore won’t create copies of the
elements themselves
● Key: Changes made to the contents of an element through the copy are reflected in the
original, because both lists refer to the same element objects.

Example of Shallow Copy:
L = [[[[1,2]],5], 2]

# Shallow copy of L:
L2 = L[:] # L2 = L.copy() is the same

# L and L2 are separate lists, but the elements of L and L2 are aliases (point to the same object)
# 1. NOT SAME: Reassigning L[0] (eg. L[0] = 7) does not change L2[0]
# 2. SAME: Changing the contents (ie. inside of the list) of L[0] is the same as changing the
#    contents of L2[0] since there is only one copy of the inner list.
# eg. L[0][0] = 3 changes L2[0][0] to 3

1.1 Deep Copy of List


A deep copy constructs a new list and then recursively populates it with copies of the
elements found in the original.
● Note: A copy of the object is copied into another object.
● Key: It means that any changes made to a copy of the object do not reflect in the
original object.

To create a deep copy of the list L, without knowing its structure ahead of time, make a new
list: [deepcopy(L[0]), deepcopy(L[1]), ...]
def deepcopy(L):
    '''Return a deep copy of L, which is either a list (of lists of lists... of ints) or an int
    '''
    # Explanation for the function: deepcopy(L) is a list of deep copies of the elements of L,
    # where the elements of L are either ints or lists.

    # Why does recursion work? The recursion works by going through each element of L and making
    # a deep copy of it by calling the function again.

    # Why is it a deepcopy? Because if the element is a list, we make a deep copy of it by
    # calling deepcopy on it. If the element is an int, we just append it to the list.

    # 1. Base case: L is not a list (ie. it is just an int), so return it as is.
    if type(L) != list:
        return L

    # 2. Recursive case: Using a for loop with two different conditions.
    res = []
    for e in L:
        # 1st Condition: if e is a list, append a deep copy of e to res (ie. recursive call)
        if type(e) == list:
            res.append(deepcopy(e))
        # 2nd Condition: if e is an int, just append it to the list
        else:
            res.append(e)
    return res

1.2 All Strings of Length 4 Over the Alphabet “abcdefgh”


alphabet = "abcdefgh"
for letter1 in alphabet:
    for letter2 in alphabet:
        for letter3 in alphabet:
            for letter4 in alphabet:
                print(letter1, letter2, letter3, letter4)

● Note: Complexity: O(n^4), where n is the length of the alphabet, because the print repeats as
many times as there are 4-letter strings over the alphabet (ie. it repeats n^4 times).

1.3 Function that Returns All the Strings of Length k Over the Alphabet “abcdefgh”
Objective: Want to list all the strings of length k over the alphabet
● Note: Use a helper function to get all completions
● Note: letter1 + (string of length k - 1 over the alphabet)
def all_strings(k, alphabet):
    '''Return a list of all strings of length k over the alphabet'''
    # 1. Base Case: When k == 0, the only string of length 0 is the empty string.
    if k == 0:
        return [""]

    # 2. Recursive call to the function.
    # This is to get all the strings of length k-1 over the alphabet by calling the function
    # recursively.
    res = []
    all_k1_strings = all_strings(k-1, alphabet)

    # 3. Double for-loop
    # The outer loop picks the first letter, and the inner loop goes through every string
    # of length k-1, so letter + string has length k.
    for letter in alphabet:
        for string in all_k1_strings:
            res.append(letter + string)
    return res

Explanation of the function


Get all the words that start with a, then with b, then with c, etc:
● letter + (string of length k-1 over the alphabet)#1
● letter + (string of length k-1 over the alphabet)#2
● letter + (string of length k-1 over the alphabet)#3

Examples of Using the Function


# eg. all_strings(0, "abcedf") => [""]

# eg. all_strings(1, "abcedf") => ["a", "b", "c", "e", "d", "f"]
# because k = 1, therefore, k-1 = 0, and the base case (ie. empty string) is returned, after
# which you append each letter to the empty string in the loop.

# eg. all_strings(2, "abcedf") => ["aa", "ab", "ac", "ae", "ad", "af", "ba", "bb", "bc", "be",
# "bd", "bf", "ca", "cb", "cc", "ce", "cd", "cf", "ea", "eb", "ec", "ee", "ed", "ef", "da", "db",
# "dc", "de", "dd", "df", "fa", "fb", "fc", "fe", "fd", "ff"]
# because k = 2, therefore, k-1 = 1, and the function calls itself again to get all the
# strings of length 1 over the alphabet, and then appends to get all these different combinations.

1.6 Complexity Analysis of all_strings(k, alphabet)


# 1. Create a call tree to see how the function calls itself: all_strings(k, ...) makes one
#    recursive call, to all_strings(k-1, ...), and so on down to all_strings(0, ...).
# 2. Count the number of times res.append(letter + string) runs at each level of the call tree,
#    where n = len(alphabet).

# k-2 : n*(num of strings of length k-3) = n*n^(k-3) = n^(k-2)
#  |
# k-1 : n*(num of strings of length k-2) = n*n^(k-2) = n^(k-1)
#  |
# k   : n*(num of strings of length k-1) = n*n^(k-1) = n^k

# Total number of times res.append(letter + string) repeats (ie. the sum over the call tree,
# counting 1 for the base case):
# n^k + n^(k-1) + n^(k-2) + ... + n^2 + n + 1 = (n^(k+1)-1)/(n-1)

# 3. Recognize that this is a geometric series, and that (n^(k+1)-1)/(n-1) is O(n^k) for n >= 2,
#    because the sum is at most 2*n^k (ie. the top level of the call tree dominates).

1.7 Process of Recursion


1. Draw a diagram of a small test case and what we need to do.
2. Find the process (i.e. how to converge to the base case using normal techniques with
function headers, etc).
3. Figure out how to make it recursive using a base case and recursive case.

Introduction to C

L2: Introduction to C Part 1

2.1 Types in C
Every C variable needs to be declared (ie. you need to pre-specify what type of data is stored
in the variable).
1. int: integer
2. int VariableName[]: array of integers
3. long int: integer with at least 32 bits (64 bits on most 64-bit Linux and macOS systems)
4. double: double-precision floating point (i.e. decimal value)
5. char: single character (e.g. ‘@’)
6. char VariableName[]: array of characters.
7. int *: pointer to an integer (i.e. address of an integer)
8. char *: pointer to a character (i.e. address of a character), which can refer to a string by storing the address of its first character.
9. double *: pointer to a double.

L3: Introduction to C Part 2

3.0 Declaring/Initializing Variables


1. Declaring (Creating) a Variable and Assigning the Value at Once
type variableName = value;

2. Declaring Without Assigning the Value, and Assign the Value Later
type variableName;
variableName = value;

3. Declare Multiple Variables


int variableName1 = value1, variableName2 = value2, ...;

4. Declare and Assign Multiple Variables


int variableName1, variableName2, variableName3,...;
variableName1 = variableName2 = variableName3 =...= value;

3.1 Format Specifiers and the Printf Function
A placeholder for the variable value.
● Note: Used together with the printf() function to tell the compiler what type of data to
print.
○ Note: Add a comma for every variable (ie. , variableName1, variableName2) after
the “”.
● Key: Must use the correct format specifier for the variable.
● Note 3: Depending on the format specifier used, it can interpret the variables
differently.

1. Example of Note 3:
#include <stdio.h>

int main()
{
    char *s1 = "abcdef";
    // eg. Printing the value and address of the string
    printf("The string is : %s, the address is %ld\n", s1, s1);
    // %s interprets s1 as a string, and prints the string.
    // %ld interprets s1 as an integer, and prints the address of the string.
    // Note: most compilers warn about passing a char * for %ld; the cast in 4.2 removes the warning.
    return 0;
}

3.2 Pointers/Dereferencing
1. Pointers: Addresses of values are referred to as pointers
2. Dereferencing: Getting the value at the address stored in the pointer
● Key: C uses the * in two different ways
○ To declare a pointer variable.
○ To dereference a pointer variable.
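
A minimal sketch of the two uses of * listed above. The helper names add_one and value_at are made up for illustration and are not from the course.

```c
/* Hypothetical helper: adds 1 to the int stored at address p. */
void add_one(int *p)        /* here * declares p as a pointer to int */
{
    *p = *p + 1;            /* here * dereferences p: read, then write, the value at that address */
}

/* Hypothetical helper: returns the value stored at address p. */
int value_at(const int *p)
{
    return *p;              /* dereference only, nothing is modified */
}
```

For example, after int a = 41; add_one(&a); the variable a holds 42, and value_at(&a) gives 42 as well.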

3.3 Memory Table for Strings
How to store strings in memory in C?
● 1 byte per character
● 1 byte for null character
/* Now here's an example in the memory table for char *s1 = "HI!"
Address Value
1032 H
1033 I
1034 !
1035 \0
...
...
3066 1032 // stores the address of H in s1
3067
...
*/
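
The "1 byte per character plus 1 byte for the null character" rule can be checked with a short sketch. The helper name cells_used is an assumption made for this example; strlen is the standard library function from <string.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of memory cells (bytes) a string occupies,
   counting the terminating '\0'. */
size_t cells_used(const char *s)
{
    return strlen(s) + 1;   /* strlen does not count the '\0', so add 1 for it */
}
```

For "HI!" this gives 4, matching the four rows 1032 to 1035 in the table above.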

3.4 C Operators
1. Arithmetic Operators

2. Assignment Operators

3. Comparison Operators

4. Logical Operators

Pointers, C Memory Model, Strings in C, Structures

L4 Casting, Pointer Arithmetic, Pointer Exercises

4.1 Arrays and Pointers


When used, arrays are converted to pointers to the first element of the array.
● Key: a[0] is the same as *(a + 0), the first element of a (ie. getting the value at the
address of a + 0 which is also a[0])

4.2 Casting
Casting converts a value from one type to another by writing the target type in parentheses in
front of the value. For addresses, it does the same thing as printf with the "wrong" type (ie.
interpret the data as if it were in the type I say), but explicitly.
// Example: (int) 1.2 -> 1 (truncates to 1)
// Example: (long int)str: just the address of the first element of str, as an integer.

1. Example using the char * type in the printf function, but using the casting concept to
get rid of warnings (same as example 2 in Examples but with casting)
#include <stdio.h>
int main()
{
    char *str1 = "hello";
    // prints hello and the address of hello
    printf("%s\n is stored at address %ld", str1, (long int)str1);
    return 0;
}

4.3 Pointer Arithmetic and Arrays (Equivalent Relationships)


*arr == arr[0] == *(arr + 0) and *str == str[0] == *(str + 0)
● Note: p_a[0] == *(p_a + 0) == *p_a, where p_a is a pointer (ie. address value of a).

1. Example: Using arrays and finding their EQUIVALENT RELATIONSHIPS


#include <stdio.h>
int main()
{
    int arr[] = {5,6,7};
    printf("%ld\n", (long int)arr); // the address of the first element of arr (ie. location of 5)
    printf("%d\n", arr[0]); // prints 5
    printf("%d\n", *arr); // prints 5
    printf("%d\n", *(arr + 0)); // prints 5
    return 0;
}

2. Example: Using strings and finding their EQUIVALENT RELATIONSHIPS


#include <stdio.h>
int main()
{
char *str2 = "hello";
printf("%c\n", str2[0]); // prints h
printf("%c\n", *str2); // prints h
printf("%c\n", *(str2 + 0)); // prints h
return 0;
}

L5: Function Effect, Blocks of Values, Pointer Arithmetic

5.0 Review of L2-4

Strings

char *s = "xyz"; // s stores the address of the first character of the string ('x')


● char * means a value of type "address of char"
● s is the address where 'x' is stored.

Arrays

int arr[] = {5,6,7}; // when used, arr is converted to the address of the first element of the array (5)
● Array elements and string elements are stored in consecutive cells in memory.
● arr gets converted into the address where the first element is stored when used.
● arr[0] == *arr == *(arr + 0)

Integers

int a = 42; // a stores the value 42


int *p_a = &a; // p_a stores the address of a
● p_a is the address where a is stored.
● int * means a value of type "stores the address of an integer"
● *p_a means the value at the address stored in p_a
● &a means the address of a
● Note: int *p_a and *p_a are conceptually related but different: in the declaration int *p_a,
the * marks p_a as a pointer; in the expression *p_a, the * dereferences it.

5.1 Functions with Effect/No Effect on Integers


1. Example 1: No Effect
void dont_change_a(int a)
{
    a = 42; // changes the value of the local variable a.
    // Note: this local variable doesn't have to be named a, but this is just to emphasize that
    // it is a local variable and not associated with the variable a in the main function.
}
int main()
{
    int a = 43;
    dont_change_a(a); // a is unchanged because you are sending a COPY OF A, NOT the ADDRESS.
}

● Note: dont_change_a(a) copies 43 to the local variable a, so the function has no effect on the a in main.
● Note: You copy the value to the local variable from the function call
○ Functions only have access to the copy of the variable.

2. Example 2: No Effect
void dont_change_a(int *p_a)
{
    p_a = 0; // so the local variable p_a is now 0, but this doesn't affect the value of a!!
}
int main()
{
    int a = 45;
    dont_change_a(&a); // a is still 45
}

● Note: dont_change_a(&a) copies &a to LOCAL variable p_a. The variable p_a is local, so the
function once again has no effect on the value of a.

3. Example 3: Effect
void change_a(int *p_a) // the type of p_a is int * because p_a is the address of a.
{
    *p_a = 42; // go to the address and change the value there to 42
}
int main()
{
    int a = 45;
    change_a(&a); // a is now 42 because the address of a was sent to change_a.
}

● Scenario 1: f(&a) copies &a to LOCAL variable p_a. The variable p_a is local, but *p_a
is the same as a, so f does have an effect on the value of a.
● Scenario 2: f(a) would produce a compiler warning or error, because 45 is not an address. If it
ran anyway, it would treat 45 as an address and try to change the value there to 42 (ie.
undefined behavior, likely a crash).
● How to remember if they get modified or not?
○ Simple rule for how values are passed to functions: They are always copied to
local variables.
■ Key Insight: See if it’s an address being sent or a value
● Key: Must always send the address of a variable to change it in a function.

5.2 Storing Blocks of Values


Blocks of values are stored in consecutive cells in memory.
● Examples: Strings and arrays are blocks of values.

Examples:
/* Eg. "hi"
Memory Value
0 'h'
1 'i'
2 '\0'

Eg. {5,6,7}
Memory Value
0 5
4 6
8 7
*/

● Note: Different types of values take up different amounts of memory.

5.3 Pointer Arithmetic


● Pointer arithmetic is a way to move through a block of values
● Pointer arithmetic is done in units of the size of the value being pointed to

● Pointer arithmetic is done using + and - operators, ++ and -- operators

1. Example 1: Using pointer arithmetic to move through a block of values


char *s = "hi";
s+1; // returns the address of 'i'
*(s+1); // returns the value of 'i'

/* Memory Value
0 'h'
1 'i'
2 '\0' */

2. Example 2: Using arrays with pointer arithmetic


int arr[] = {3,4};
arr+1; // returns the address of arr[1] (ie. 2068 in the table below)
*(arr+1); // returns arr[1] which is 4

/* Memory Value
2064 3
2068 4
*/

● Key: arr+1 adds 4 to the address of arr[0] because the elements are of type int (ie. 4 bytes each), but if
the type only takes 1 byte, then it would only add 1 (ie. the characters of a string).
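
The unit rule above can be made visible by measuring the distance between two addresses in bytes. This is a sketch only; the helper name bytes_between is an assumption for this example.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes between two addresses.
   A char occupies exactly 1 byte, so subtracting char pointers counts bytes. */
ptrdiff_t bytes_between(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}
```

With int arr[] = {3,4}, bytes_between(arr, arr + 1) equals sizeof(int) (usually 4), while for char s[] = "hi", bytes_between(s, s + 1) is 1.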

5.4 Examples
1. Example 1: int arr[] and int *arr are not the same
A situation in which an array and a pointer do not behave the same way: the sizeof operator.
int main()
{
int arr1[] = {4,5,6};
// 1st:
sizeof(arr1); // returns 12
// the total number of memory cells occupied by the array arr
// gives you an idea of the number of elements in the array
sizeof(arr1)/sizeof(arr1[0]); // 12/4 = 3 (ie. the number of elements in the array)

// 2nd:
int *p_a0 = arr1; // p_a0 stores the address of the first element of arr1
sizeof(p_a0); // KEY: NOT the number of elements in the array * 4.
// the number of memory cells occupied by an address
}

5.5 How to get the number of elements in an array
sizeof(arr)/sizeof(arr[0]);

// EXAMPLE:
int arr[] = {1,2,3,4,5};
int len = sizeof(arr) / sizeof(int); // 20 / 4 = 5

L6 Swapping, Changing Values in Blocks of Memory, Strings, Const

6.0 Template for Swapping


#include <stdio.h>
void swap(int *p_x, int *p_y) // p_ is notation for the pointer of the variable.
{
int temp = *p_x; // temp is a local variable.
*p_x = *p_y; // the value of x is now the value of y.
*p_y = temp; // the value of y is now the value of x.
// Note: it has to be in this order.
}

int main()
{
int x = 43;
int y = 44;
// swap(x,y) doesn’t work because the function only has access to the copy of the variables.
swap(&x,&y); // works because the function has access to the address of the variables.
printf("x = %d, y = %d\n", x, y); // prints "x = 44, y = 43"
}

6.2 Template for Changing Values in Blocks of Memory


1. Example 1: Effect of changing arr
void set_arr0(int *arr)
{
arr[0] = 44; // arr[0] is now 44
}

int main()
{
int arr[] = {5,6,7};
set_arr0(arr); // arr[0] is now 44 because arr is the address of the first element (ie. 5)
return 0;
}

2. Example 2: No Effect
void set_arr1(int *arr)
{
    arr = 0; // changes only the local pointer arr to 0; no effect outside the function because
             // we are not going to the address.
    // This is just a local variable.
}

6.3 Strings in C
C does not have a string type.

Difference between a string and a character array:


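● A character array declared inside a function is generally a local variable: it only exists while
the function is running.
● A char * stores only the address of the first character of the string; the characters themselves
are stored elsewhere in memory (for a string literal, for the whole run of the program).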

Two Ways to Store Strings


1. 1st Way: Array Method
char s1[] = "abc"; // shorthand, an array of type char, with the characters 'a','b','c','\0'
// char s1[] = {'a','b','c','\0'}; (ie. the same thing as the line above)

// Function (from L16):

char *f()
{
    char s2[] = "def";
    return s2; // This is a pointer to a local variable, so using it after the function
               // returns is undefined behavior.
}

2. 2nd Way: Char * Method


char *s2 = "abc"; // put the block 'a','b','c','\0' in memory, and store the address of the
                  // first element (ie. a) in s2

// Function (from L16):

char *g()
{
    char *s2 = "def";
    return s2; // This is a pointer to a string literal, so it's fine (ie. can be used outside
               // of the function)
}

6.4 const
Declare the variable as "constant", which means unchangeable and read-only.
● Note: Constants have to be declared and assigned a value at once.
● Note: Declare with UPPERCASE name.
● Motivation: If you know something is never supposed to be modified, you can use
const to make sure it is never modified.

1. Example 1 Using const char*


● Note: Treat the literal "abc" as being of type const char *. (Strictly, C gives string literals an array-of-char type, but modifying one is undefined behavior.)
● The compiler will not let you modify values at addresses of type const char *
○ But it will let you convert const char * to char * and then try to modify the values
at the memory address
int main()
{
    char *s1 = "abc"; // some compilers warn: conversion from const char * to char *
    s1[0] = 'x'; // will compile, but might crash at runtime bc modifying a string literal is
                 // undefined behavior in C.

    const char *s2 = "abc"; // no warning, where s2 is a pointer to const char (here, the first
                            // character of a string literal)
    s2[0] = 'y'; // will not compile
    // Note: const char * is a pointer to a constant character, so you CANNOT modify what it points to.
    return 0;
}

● Note: const char * s: declare s as pointer to const char.


● Note: char *const s: declare s as const pointer to char.

2. Example using const int, and const char.


int main()
{
const int g = 42;
g = 43; // ERROR because g is a constant.

const char h = 'a';


h = 'b'; // ERROR because h is a constant.
}

3. Example using char * const.

int main()
{
    char buf[] = "hello"; // a modifiable array, so changing its characters is allowed
    char * const str5 = buf; // not allowed to change value of str5, but allowed to change
                             // value of str5[0]
    str5 = "world"; // ERROR because str5 is a const pointer, and you are trying to
                    // change what it points to.
    str5[0] = 'H'; // this is okay b/c you are modifying the content pointed to by str5, not the
                   // pointer itself.
}

4. Example using const char * const (both)


int main()
{
    // Eg.
    const char * const str6 = "hello"; // not allowed to change value of str6, and not allowed to
                                       // change value of str6[0]
    str6 = "world"; // ERROR because str6 is a constant pointer.
    str6[0] = 'H'; // ERROR because str6 points to constant characters.
}

6.5 String Literal


A string literal (eg. "hello") is a character array that must be treated as constant. Therefore,
use const char * to point to string literals in C; if you try to modify the contents it is
undefined behavior.

L7: Sending Integers, Strings, and Lists in Python vs. C, Finding the
Length of a String, Malloc, and Sizeof

7.0 Sending integers, strings, and lists in Python vs. C


1. Integers
a. Example 1
void dont_change_a(int a)
{
a = 5;
}
// Python Equivalent: No equivalent in Python, we always pass the address of the variable, never just the value

b. Example 2
void change_int(int *p_a)
{

*p_a = 43;
}

// Python Equivalent:
def change_int(a):
    pass # Note: There is no universal syntax for "go to the address a and change a value there"

c. Example 3
void dont_change_pa(int *p_a)
{
p_a = 0;
}

// Python Equivalent:
def dont_change_int(a):
    a = 42

2. Arrays
a. Example 1
void change_arr(int *arr)
{
arr[0] = 5;
}

// Python Equivalent:
def change_L(L):
    L[0] = 5

b. Example 2
void dont_change_arr(int *arr)
{
arr = 0; // this doesn't change the original arr because this is the local variable arr.
// Note: works the same with ints
}

// Python Equivalent:
def dont_change_L(L):
    L = [1,2,3] # this doesn't change the original L because this is the local variable L.
    # Note: works the same with ints

3. Strings
● Reminder for C: Not supposed to change contents of a string because they are of type
const char *

Important Key Point to Remember:
○ s = "abc";
○ s = s + "d";
■ Note: This is making a new string whose value is s + "d" and reassigning it
to s, which is NOT EXPLICITLY CHANGING THE CONTENTS OF S

a. Example 1
void change_s(const char *s)
{
s[0] = 'x'; // compilation error because the contents of s are const
}

// Python Equivalent:
def change_str(s):
    pass # no universal way to change the contents of a string in Python

b. Example 2 (works the same with ints)


void dont_change_arr(const char *arr)
{
    arr = 0; // Note: This isn't changing the contents that arr points to, but only changes
             // the local variable arr. So no effect and no undefined behavior.
}

// Python Equivalent:
def dont_change_s(s):
    s = "abc"

c. Example 3
void change_str(char *s)
{
    s[0] = 'x'; // no compilation error, but may crash if s points to a character that is
                // actually constant.
}

int main_change_str()
{
char *s1 = "abc"; // LHS is a char *, but RHS is a const char *
change_str(s1); // undefined behavior (ie. changing contents of a const char *), may crash

char s2[] = "abc";


change_str(s2); //OK, because you can change the contents of an array
return 0;
}

● Key Difference: You can change the contents of a character array, but not of a string
that uses the char * method.

7.1 Finding the length of a string


Problem: Starting at 1032 in memory, how many steps to get to '\0'?
1. Creating my_strlen using str[len]
int my_strlen(const char *str)
{
int len = 0;
while(str[len] != '\0')
{
len++; // syntactic sugar for len = len + 1
}
return len;
}

● Note: If str[0] = '\0', then len = 0, and the while loop will never run, but if str[0] != '\0',
then the while loop will run.

2. Creating my_strlen using *str


int my_strlen2(const char *str)
{
    int len = 0;
    while(*str != '\0')
    {
        len++; // syntactic sugar for len = len + 1
        str++; // syntactic sugar for str = str + 1. This steps through the string by
               // incrementing the address stored in str!
    }
    return len;
}

● Note: Keep advancing str and keep checking if the value at address str is the null character.
● Note: If *str is not the null character, then keep incrementing.

3. Creating another version of my_strlen


int strlen_short(const char *str)
{
    int len = 0;
    while(*(str++)) // while *str != '\0'; str is also incremented by 1 after its value has
                    // been compared to '\0'.
    {
        len++; // increment len to get the total length of the string.
    }
    return len;
}

4. Creating a recursive version of my_strlen


int strlen_rec(const char *str)
{
// Base Case: *str is '\0' => return 0
// Recursive Step: return 1 + strlen_rec(str + 1), sending the address of the next element of the string
// Explanation: If you step once more, then the length of the string is 1 + the length of
// the rest of the string.

if(*str == '\0'){
return 0; // if the element is the null, then give 0.
}

else {
return 1 + strlen_rec(str + 1);
}
}

Example "abcdefg" with elements #: 1,2,...,7:


Let's break down the recursive steps for the string "abcdefg":
● First Call: strlen_rec("abcdefg")
○ *str is 'a', not '\0', so it goes to the recursive step.
○ Returns 1 + strlen_rec("bcdefg").
● Second Call: strlen_rec("bcdefg")
○ Now *str is 'b'.
○ Returns 1 + strlen_rec("cdefg").
● Third Call: strlen_rec("cdefg")
○ Now *str is 'c'.
○ Returns 1 + strlen_rec("defg").
● …Final Call: strlen_rec("")
○ Now *str is '\0' (the null terminator for the string).
○ Returns 0 (the base case).
● Unwinding the Recursion Calls:
○ strlen_rec("") returned 0.
○ strlen_rec("g") returns 1 + 0 = 1.

○ strlen_rec("fg") returns 1 + 1 = 2.
○ strlen_rec("efg") returns 1 + 2 = 3.
○ strlen_rec("defg") returns 1 + 3 = 4.
○ strlen_rec("cdefg") returns 1 + 4 = 5.
○ strlen_rec("bcdefg") returns 1 + 5 = 6.
○ strlen_rec("abcdefg") returns 1 + 6 = 7.

7.2 Malloc
Malloc allocates space in the memory table to store a block of values.
int *block_int = (int *)malloc(sizeof(int) * 150); // allocate space for 150 integers

● In English: Give me an address where I can store 150 integers, where block_int will be
the address of the first integer
● RHS: Cast the address to (int *) because malloc returns a generic address (void *): it only knows the amount of memory, not what will be stored there.
● RHS: malloc is the function
● RHS: sizeof(int) * 150 is the amount of memory cells you want to allocate.
● LHS: block_int is the address of the first integer in the block of memory cells.
● Motivation:
○ Local arrays disappear once a function has finished running, but malloc allows
you to keep the arrays even after the function.
○ Arrays in C are not resizeable, but a malloc-ed block can be resized (with realloc).
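
Both motivations can be sketched in a few lines. The helper names make_range and grow_block are assumptions made for this example; malloc, realloc and free are the standard functions from <stdlib.h>.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a block of n ints holding 0, 1, ..., n-1,
   or NULL if out of memory. The caller must free() the result. */
int *make_range(size_t n)
{
    int *block = (int *)malloc(sizeof(int) * n);
    if (block == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        block[i] = (int)i;
    }
    return block;   /* unlike a local array, the block is still valid after returning */
}

/* Hypothetical helper: grows a malloc-ed block from old_n to new_n ints.
   Old values are kept and the new slots are set to 0. Returns the (possibly
   moved) address, or NULL on failure, in which case the old block is untouched. */
int *grow_block(int *block, size_t old_n, size_t new_n)
{
    int *bigger = (int *)realloc(block, sizeof(int) * new_n);
    if (bigger == NULL) {
        return NULL;
    }
    for (size_t i = old_n; i < new_n; i++) {
        bigger[i] = 0;
    }
    return bigger;
}
```

After a successful grow_block, only the returned address should be used, since realloc may have moved the block.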

Examples
1. Example of storing values in this memory:
block_int[7] = 42;
*(block_int+7) = 42; // same thing as above, block_int + 7 gets to the right location in the
                     // memory table because C knows how many memory cells ints take up

2. Example of using malloc


Assume ints take up 2 memory cells each.
● Note: Adding to block_int adjusts by the integer memory space.
○ Adding 7 to block_int adds sizeof(int)*7.
/*
Address Value
1032 5

1033
1034 7
*/
int *block_int = (int *)malloc(sizeof(int) * 2); // block_int is 1032.

block_int[0] = 5; // storing 5 in 1st memory cell


block_int[1] = 7; // storing 7 in the 2nd memory cell.

block_int + 1; // 1034 because block_int is 1032, and each int takes up 2 memory cells.
block_int[1]; // same as *(block_int + 1)

7.3 Sizeof
sizeof(int); // usually 4 bytes
sizeof(char); // always 1 byte
sizeof(char *); // usually addresses take up 8 bytes, NOT the length of a string/array
sizeof(int *); // usually addresses take up 8 bytes, NOT the length of a string/array

Common way to mess up sizeof array:


If arr is passed to a function, it is converted to a pointer, so be careful when taking the
sizeof().
● Key: Arrays get converted to pointers when passed to functions (ie. the address of the
first element of the array)

Example:
void sz(int *a) // int a[] is just syntactic sugar and won't help.
{
sizeof(a); // 8 bytes because a is an address pointing to an array, NOT an array.
}

int main()
{
int a[] = {1,2,3};
sz(a);
}
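
Because sizeof cannot recover the length inside the function, a common workaround is to compute the length where the array is declared and pass it along. This is a sketch; the function name sum_ints is an assumption for this example.

```c
#include <stddef.h>

/* Hypothetical helper: sums the first n ints starting at address a.
   The length n must be passed in, because inside the function a is only an address. */
long sum_ints(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

In main, where a is still an array, the call would be sum_ints(a, sizeof(a) / sizeof(a[0])).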

L8 Arrays vs. memory blocks, Free, Structures, Copies of Strings, Pointers
of Pointers

8.0 Arrays vs. Memory Blocks


● Arrays: Local arrays only exist until the function returns:
○ Can try returning the address of an element of an array (or the address of
another local variable), but the behavior is undefined.
● Malloc: Can create a memory block in a function with malloc, return its address, and
use it outside of the function.

When to use malloc vs when to use an array?


Arrays: Use when the block is only needed as a local variable in a function, or as a fixed-size member inside a struct.
Malloc: Use when the block needs to be returned and used outside the function!

Examples
1. Example 1: Returning a local array from a function (BAD)
int *f() // f is of type int * (ie. f returns the address of an integer)
{
    int arr[20];
    arr[0] = 42;
    return arr; // f would return the address of the first element of arr, which doesn't make
                // sense
}
int main()
{
    int *p = f();
    p[0]; // Note: This will compile, but it is undefined behavior because arr is a local variable,
          // so there's an address of arr[0], but after f returns, it's no longer valid.
    return 0;
}

2. Example 2: Using malloc (ie. creating a memory block in a function)


#include <stdio.h>
#include <stdlib.h>

int *make_block_int(int sz)
{
    int *p = (int *)malloc(sz * sizeof(int));
    // Note: If malloc doesn't have space to allocate this much space, it returns NULL
    if (p == NULL){
        printf("out of memory\n");
        exit(1); // this means exit the program in an orderly manner, with an error code of 1.
    }
    return p; // make_block_int gives back an address that points to a memory block of size sz
              // that we can use outside of the function
}

int main()
{
    int *q = make_block_int(20);
    q[7] = 50; // this is fine, because the memory block is created using malloc, so it's still
               // there after the function returns
    return 0;
}

● Note: The program will crash if you try to use malloc's space when it has returned
NULL.

8.1 Free
C cannot use a malloc-ed block of memory for something new until it's freed
● Good practice to free() the memory blocks that you allocated
● Memory leak: A situation where memory is allocated but never freed (ie. the memory stays
reserved for as long as the program runs, so a long-running program can run out of memory).
● Note: Be careful about the order in which you free blocks.

Example: Using a malloc-ed block after freeing it (BAD)


#include <stdlib.h>
int main()
{
    int *block = (int *)malloc(sizeof(int) * 100);
    // use the block ...
    free(block);

    block[0]; // undefined behavior, might crash (ie. there might be something else at this
              // address) bc we freed it.
    return 0;
}

8.2 Blocks of Structs
Structs in C are used to group together different data types under a single name in a block of
memory.

General Structure
● Tag: Identifier that gives a name to the structure type.
○ Note: After defining a structure, you can use this tag to declare variables of that
structure type (eg. struct tag varName;).
● TypeName: Declare variables of this structure type more simply.
○ Instead of struct tag, we can use TypeName VarName.
● Members: Variables or data fields that make up the structure.

typedef struct tag {


// members
} TypeName;

Example: Using structs to make a new type called student


#include <stdlib.h>
typedef struct student{
char name[200];
int id;
int age;
int gpa;
} student;

// 1. Making an array:
int main()
{
student students[500]; // Made a new type called student, and we made an array of 500
// students (ie. variable of type student)
student *students_block = (student *)malloc(sizeof(student)*500);
// Note: This is a memory block where you can store 500 students and students_block is the
// address of the first student in the block
return 0;
}
/* Structural Setup of Memory Block for each Student
+----------------+--------+------+--------+
| name (200 bytes)| id (4) | age (4) | gpa (4) |
+----------------+--------+------+--------+
*/

8.3 Making Copies of Strings
1. Example 1: Making string aliases (BAD)
#include <string.h>
int main()
{
    char s1[] = "hi"; // same as char s1[] = {'h', 'i', '\0'};
    char *s2 = 0; // s2 is an address that starts out as 0 (address 0 can't be used, so by
                  // convention it means "no valid address yet").

    s2 = s1; // Strings are now aliases (ie. they store the same address), so this is not a copy.
    return 0;
}

● Note on s1: Since "hi" is stored at addresses 1032, 1033, 1034, the address of the 'h' is
1032, and s1 gets converted to 1032 when used.
○ Key: Character arrays get converted to addresses when used in operations.
● Note on s2: When s2 stores address 1032, any operation through s2 has an effect on s1.

2. Example 2: Using strcpy (BAD)


#include <string.h>
int main()
{
    char s1[] = "hi";
    char *s2 = 0;

    strcpy(s2, s1); // NOT OKAY, since cannot copy to address s2 (ie. s2 is not a valid address
                    // yet, we only did initialization with 0)
    return 0;
}

● What we're doing: Go to s2 and copy s2[0] = s1[0], s2[1] = s1[1], .... but you are not
allowed to go to the address of some random memory and write there
○ Note: strcpy copies the contents of s1 to the address stored in s2, which has not yet been
set to a valid block of memory.

3. Example 3: Using malloc and strcpy (GOOD)


#include <stdlib.h>
#include <string.h>
int main()
{
char s1[] = "hi";
char *s2 = 0;

s2 = (char *)malloc(sizeof(char)*(strlen(s1)+1)); // allocate memory for s2; the +1 is
// for the null character.
strcpy(s2, s1); // copy the contents of s1 into s2 once memory for s2 has been allocated.
free(s2); // release the block once we are done with it.
return 0;
}
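The malloc-then-strcpy pattern from Example 3 can be wrapped in a function. This is a sketch only: copy_string is a hypothetical name, and returning NULL on failure (instead of exiting) is an assumption about how the caller wants to handle errors.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of src, or NULL if malloc fails.
   The caller owns the copy and must free it. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    char *copy = (char *)malloc(strlen(src) + 1); /* +1 for the null character */
    if (copy == NULL) {
        return NULL;
    }
    strcpy(copy, src); /* safe: copy has room for every character plus '\0' */
    return copy;
}
```

Unlike the alias in Example 1, the result lives at a different address, so changing the copy does not change the original string.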

8.4 Pointers to Pointers


Purpose: A pointer to a pointer lets a function change the value of a pointer that belongs to the caller.

Examples:
1. Example 1: Pointer to Pointer
void set_to_0(int **p_p_a){ // p_p_a is the address of p_a (ie. the address of a pointer)
*p_p_a = 0; // set the value at the address p_p_a to 0
// p_p_a is of type int **
// *p_p_a is of type int *
}

int main()
{
int a = 42;
int *p_a = &a; // p_a is a pointer to a
set_to_0(&p_a); // p_a is now 0. a is not affected!
// &p_a is the address of p_a, which is of type int **
}

2. Example 2: Pointer vs. Pointer to Pointer Function


void set0(int *p)
{
*p = 0; // set the value at the address p of type int to 0
}

void set0_pointer(int **pp)


{
*pp = 0; // set the value at the address pp of type int * to 0
}
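Another place where the same idea shows up is swapping two pointers. The sketch below is an illustration only (swap_int_pointers is a hypothetical name): the function receives the addresses of the two pointers, so it can change where each one points without touching the integers themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two int pointers that belong to the caller.
   pp_a and pp_b are of type int **, so *pp_a and *pp_b are of type int *. */
void swap_int_pointers(int **pp_a, int **pp_b)
{
    assert(pp_a != NULL && pp_b != NULL);
    int *temp = *pp_a;
    *pp_a = *pp_b;
    *pp_b = temp;
}
```

With int *px = &x and int *py = &y, calling swap_int_pointers(&px, &py) leaves x and y unchanged, but px now holds the address of y and py holds the address of x.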

L10: Different Ways to Store Blocks of Structures, Rules for Structs

10.0 Rule for Structs


In general: If it's a pointer, then it must be an arrow. If it's not a pointer, then it must be a dot.
// Note: Using student2 as defined above.
student2 s2;
s2.gpa = 4.0;

student2 *p_s2 = &s2;


p_s2->gpa = 4.0; // Analogous to (*p_s2).gpa = 4.0; (but the arrow is a shorthand)

10.1 Different Ways of Storing Blocks of Structures


1. Array of student1’s: (Note: Using student1 above)
int main()
{
student1 s1[150]; // 150 cells of student1's.
return 0;
}

2. Malloc-ed block of student1’s (Note: Using student1 above)


#include <stdlib.h>
int main()
{
student1 *s1_block = (student1 *)malloc(150 * sizeof(student1)); // This is allocating memory
// for 150 student1's.
return 0;
}

Difference between option 1 and 2:


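● s1_block: Can return s1_block from a function, because the address s1_block stays valid
until I free it.
● s1: A local array cannot be returned from a function. Its memory is released when the
function returns, so using its address afterwards is undefined behavior (ie. crashes most likely).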

3. Array of student2’s (Note: Using student2 above)


int main()
{
student2 s2[150]; // 150 cells of student2's.
return 0;

}

● Ways to access and not access an array of student2's:


○ Invalid: CANNOT printf("%s", s2[0].name);
■ Inside the struct, name is of type char * and has not been set, so s2[0].name is
not a valid address (ie. garbage) and it's not pointing to any string.
○ Valid: CAN printf("%f\n", s2[0].gpa);
■ s2[0].gpa has some arbitrary value, and it will be printed.
○ Valid: CAN printf("%p\n", (void *)s2[0].name);
■ This prints the address itself (%p is the format for addresses), without going to it.
● Key Point in Difference: You can print a number no problem, but not go to an address
and try to print the string from that address.

10.2 Using Structure to Store Names:


1. Putting a name to the pointer in the struct (Note: Using s2 above)
int main()
{
s2[0].name = "John"; // fine, but not allowed to modify s2[0].name[0] to 'j' because "John" is
// a string literal (ie. treat it as a const char *).
return 0;
}

2. Using Malloc and then Copying the Name into Allocated Memory (Note: Using s2
above)
#include <stdlib.h>
#include <string.h>
int main()
{
s2[0].name = (char *)malloc(50 * sizeof(char)); // This is allocating memory for 50
// characters.
strcpy(s2[0].name, "John"); // This is copying the string "John" into the memory that
// s2[0].name is pointing to.
// strcpy is same as doing this:
// s2[0].name[0] = 'J';
// s2[0].name[1] = 'o';
// s2[0].name[2] = 'h';
// s2[0].name[3] = 'n';
// s2[0].name[4] = '\0';
return 0;

}

3. Storing the address of each student in the block (ie. Option 3 from 10.0)
● Summary: You are creating an array of pointers, where each pointer points to a
student2 object.
○ For each student2 * in the block, you allocate memory for the student2 itself (which
holds the name pointer and the GPA) and additional memory for the characters of the name.
○ GPA: Directly assigned since it's a simple numeric value.
○ Name: Requires copying the string into allocated memory because it's a char *
type, so we need more memory for the address to point to.
● Advantage: Only allocating memory for what you need.
#include <stdlib.h>
#include <string.h>
int main()
{
student2 **p_s2_block = (student2 **)malloc(150 * sizeof(student2 *));
// 150 addresses pointing to a student2 (ie. a block of addresses).
// Note: p_s2_block is the address of the first element of a block of objects of type student2 *.

// To use the block of pointers, you have to allocate memory for each student2.
int i;
for(i = 0; i < 150; i++){
p_s2_block[i] = (student2 *)malloc(sizeof(student2)); // Allocating space where I can
// store a student2 (ie. name and GPA) for each ith address.
p_s2_block[i]->gpa = 4.0; // Setting the GPA of the ith student2.

p_s2_block[i]->name = (char *)malloc(50 * sizeof(char)); // Allocating space where I can
// store a name for each ith student2.
strcpy(p_s2_block[i]->name, "John"); // copying name into memory that the address is
// pointing to.
}
return 0;
}
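Every malloc in this layout needs a matching free, in the reverse order of allocation (names, then each student2, then the block of pointers). The following is a sketch under assumptions: the student2 definition is assumed to be a char *name plus a double gpa (the actual definition appears earlier in the notes), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed definition; the notes define student2 earlier. */
typedef struct student2 {
    char *name;
    double gpa;
} student2;

/* Free each name, then each student2, then the block of pointers. NULL entries are skipped. */
void destroy_student_block(student2 **block, int n)
{
    if (block == NULL) return;
    for (int i = 0; i < n; i++) {
        if (block[i] != NULL) {
            free(block[i]->name); /* free(NULL) is allowed and does nothing */
            free(block[i]);
        }
    }
    free(block);
}

/* Build n students that all share the same name text and GPA. Returns NULL on failure. */
student2 **create_student_block(int n, const char *name, double gpa)
{
    assert(n > 0 && name != NULL);
    student2 **block = (student2 **)malloc((size_t)n * sizeof(student2 *));
    if (block == NULL) return NULL;
    for (int i = 0; i < n; i++) block[i] = NULL; /* so cleanup is safe at any point */
    for (int i = 0; i < n; i++) {
        block[i] = (student2 *)malloc(sizeof(student2));
        if (block[i] == NULL) { destroy_student_block(block, n); return NULL; }
        block[i]->gpa = gpa;
        block[i]->name = (char *)malloc(strlen(name) + 1);
        if (block[i]->name == NULL) { destroy_student_block(block, n); return NULL; }
        strcpy(block[i]->name, name);
    }
    return block;
}
```

Allocating strlen(name) + 1 bytes instead of a fixed 50 is a design choice that matches the stated advantage: only as much memory as each name needs.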

L11: Realloc & Calloc, Error Checking, Analogy to Units in Physics for
Pointers, Strcat

11.0 Realloc & Calloc


Realloc: Can resize blocks of memory from malloc using realloc.
● Note: You must have something already malloc'd.
Example:
#include <stdlib.h>
int main()
{
char *str = (char *)malloc(100 * sizeof(char));

// If we want to make more space:


str = (char *)realloc(str, 200 * sizeof(char)); // making more space by reallocating.
return 0;
}

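● 1st Argument: Pointer to the block that was previously allocated (eg. with malloc).
● 2nd Argument: New size of the memory block, in bytes.
● Return value: Address of the resized block, which may be different from the old address.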

Calloc: Allocates the requested memory and returns a pointer to it, where the memory
allocated is set to 0.

Example:
int *a = (int *)calloc(n, sizeof(int)); // block of n ints, all set to 0

● 1st Argument: Number of elements that need to be allocated.


● 2nd Argument: Size of the elements.
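One detail worth showing: if realloc fails it returns NULL but leaves the old block allocated, so writing str = realloc(str, ...) directly would lose the only copy of the old address. A common pattern is to store the result in a temporary pointer first. The sketch below assumes a hypothetical helper name, resize_int_block, and a 1/0 return convention chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Resize *p_arr to hold new_count ints. Returns 1 on success, 0 on failure.
   On failure the old block is untouched and still valid. */
int resize_int_block(int **p_arr, size_t new_count)
{
    assert(p_arr != NULL && new_count > 0);
    int *temp = (int *)realloc(*p_arr, new_count * sizeof(int));
    if (temp == NULL) {
        return 0; /* *p_arr still points to the old block */
    }
    *p_arr = temp; /* the block may have moved, so update the caller's pointer */
    return 1;
}
```

For example, after int *a = (int *)calloc(2, sizeof(int)); a call to resize_int_block(&a, 100) keeps a[0] and a[1] and makes room for 98 more ints (the new cells are not set to 0).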

11.1 Error Checking


Malloc and realloc might not be able to find the amount of space you need.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *block = malloc(10000000);
if (block == NULL) // malloc returns NULL when there is not enough space for this memory.
{
printf("Out of memory\n");
exit(1); // exit terminates the program. The 1 is sent to the operating system.
}
free(block);
return 0;
}

● Why exit()? Trying to access a NULL pointer (ie. no memory block) will lead to a crash
without an error message!
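Since the same check is needed after every allocation, it can be written once. This is a sketch of one possible wrapper; checked_malloc is a hypothetical name, not a standard library function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: like malloc, but prints a message and exits instead of returning NULL. */
void *checked_malloc(size_t size)
{
    assert(size > 0);
    void *block = malloc(size);
    if (block == NULL) {
        fprintf(stderr, "Out of memory (requested %zu bytes)\n", size);
        exit(1);
    }
    return block;
}
```

A caller can then write char *block = checked_malloc(10000000); and use block without a separate NULL check.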
11.2 Pointers: Units in Physics
Pointers are like matching the units of a problem (ie. LS must equal RS). If the units don't
match, the answer is wrong.
● Operator: & adds a * to the type of the variable
● Operator: * removes a * from the type of the variable
● Two Ways of Thinking:
○ Units matching for either LS = RS, or what you are passing to the function must
be the same type.
○ If we want to change the value of a variable, we must send the address.

Example: Using Unit Matching


#include <stdio.h>
#include <stdlib.h>
void create_str(char **p_str, int sz)
// Since we want to change str, 1st argument must be char ** to send the address of str.
{
*p_str = (char *)malloc(sz * sizeof(char));
// The type on RS is char *, so the type on LS must be char *.
// p_str is a char **, so we use a * on the LS to cancel one * and get a char *.

// Now the pointer points to an address of a memory block.


if (*p_str == NULL){ // Check if malloc was successful
printf("Could not create string\n");
exit(1);
}
(*p_str)[0] = '\0'; // Set the first character to null (ie. empty string)
}

int main()
{
char *str = 0;

// Want to change str, so you must send the address of str.


create_str(&str, 100);
// Note: Since str is type char *, therefore, &str MUST BE char **.
// Note: We are sending the address of str, so we can change str, so it must be char**

// Explanation:
// str: of type char * (in main)
// p_str: of type char ** (in create_str)
// &str (== p_str): of type char ** (in main)
return 0;
}

11.3 Strcat
Normal strcat: strcat(str1, str2) appends str2 to the end of str1, assuming that str1 has enough
space to accommodate the extra characters from str2.
● Note: It does not check if there is enough space. Writing past the end of str1 is undefined
behavior and will likely crash.

Safer Strcat:
● Assume: *p_str1 was allocated with malloc.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void safer_strcat(char **p_str1, const char * const str2)
// Notation: p_str1 is a pointer to a pointer: the string is a char *, and we pass the address
// of that char * so that the function can change it.
{
// 1. To find space for length of str1 + str2 + 1, we realloc.
*p_str1 = (char *)realloc(*p_str1, strlen(*p_str1) + strlen(str2) + 1);

// 2. Check if there was space to hold everything


if(*p_str1 == NULL){ // realloc returns NULL if it could not find the space.
printf("Safer strcat failed to allocate memory\n");
exit(1);
}

// 3. Use the normal strcat from <string.h>


strcat(*p_str1, str2); // This we can do after we check.
}

● Note: We need p_str1 because there may not be enough space in str1 to hold str2, so to
change str1, the address of the variable is needed.
● Note: str2 is const for both the pointer and the contents of str2 to make sure we don't
change it.
● Key: Need enough space at *p_str1 to hold both str1 and str2.
○ Need strlen(*p_str1) + strlen(str2) + 1 bytes (for the null character).

Multi File Programs in C, C Preprocessor

L9: Using GCC, Header Files, Running Multi File Programs, Strings

9.1 How to find the contents of a folder, change directories, and run files?

General

Use up and down arrows to go back to something you wrote previously.

ls

Find the contents of the folder using "ls" in the terminal, which means list.

cd

Change directories using "cd" in the terminal, which means change directories.
● Ie. "cd foldername" will go into the folder called foldername
● Press "Tab" to have VSCode autocomplete the folder name that you want.
● Useful Commands:
○ "cd .." moves up a directory.
○ "cd ." means this directory (ie. does nothing)
○ Use quotes when there is a space in the folder name (ie. cd "folder name")

./

Use "./" to run a file in the terminal.


● Eg. "./myprogram.exe" will run the file called myprogram.exe

9.2 Running GCC Manually (ie. Running the Compiler/Then Running the File)

Option 1

● Step 1: This will compile myprogram1.c and myprogram2.c into an executable called
myexec.exe, where the compiler is gcc.
○ The executable can be named anything.

● Step 2: Run the executable using ./myexec.exe
// Step 1
gcc myprogram1.c myprogram2.c -o myexec.exe

// Step 2
./myexec.exe

● Key: Must recompile every time we change a C file. Any txt files that the program uses
must be in the directory that the executable is run from.

9.3 Header Files

Overview:

● Can give the compiler instructions for tasks that are performed before compilation.
● #include copy-and-pastes the file into the program.
● By convention, the file name ends in ".h".
● Note: Your own header files should be in the same directory as the file that includes them.

Difference between <> and "":

● #include <stdio.h> will look for the file in the compiler's standard include directories
● #include "L9_example.h" will look for the file in the current directory first

Example: Header file to run a program that prints "Hello World!" using L9_example.h

1. Write the program in a file called L9_example.h


a. Note: You don't need to include #include <stdio.h> in the header file if the main file
already includes it before the header, but it doesn't hurt to include it.
2. Now put #include "L9_example.h" in the main file
3. Now you can use whatever is in the header file in the main file.

L9_example.h file:
#include <stdio.h>

void say_hi()
{

printf("Hi!\n");
}

Main file:
#include <stdio.h>
#include "L9_example.h"
int main() { say_hi(); return 0; } // say_hi is defined in L9_example.h

Example: Using a header file L9_example_1.h to define a struct (ie. as if it had been copy
and pasted into the main file)

L9_example_1.h file:
typedef struct student{
char name[20];
int age;
} student;

Main file:
#include "L9_example_1.h"
int main()
{
student s;
s.age = 20;
return 0;
}

General Structure of H File Guards

#if !defined(MYFILE_H)
#define MYFILE_H
...
#endif

9.4 Preprocessor and Header Guards

Overview

In general, anything that starts with a # is a preprocessor directive.


#define PI 3.14
● Note: This substitutes 3.14 any time PI is in the program (ie. search and replace)
● Note: Different from defining PI as a variable -- this is simple search-and-replace.

● Note: Can cause difficult-to-fix compile errors (e.g., if there is a typo and we write
"3.14").
○ Usually it is better to define PI as a variable.
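The difference between search-and-replace and a real variable can be seen in a small sketch. All names here (PI_MACRO, PI_CONST, BAD_DOUBLE, GOOD_DOUBLE, circle_area) are made up for illustration.

```c
#include <assert.h>

#define PI_MACRO 3.14                /* textual search-and-replace, no type */
static const double PI_CONST = 3.14; /* a real variable with a type */

/* Macros with arguments are also pasted in as text, so precedence can surprise: */
#define BAD_DOUBLE(x) x + x          /* 3 * BAD_DOUBLE(2) becomes 3 * 2 + 2 */
#define GOOD_DOUBLE(x) ((x) + (x))   /* 3 * GOOD_DOUBLE(2) becomes 3 * ((2) + (2)) */

/* Area of a circle of radius r, using the typed constant. */
double circle_area(double r)
{
    assert(r >= 0.0);
    return PI_CONST * r * r;
}
```

Because the preprocessor only pastes text, mistakes inside a macro show up where the macro is used rather than where it is defined, which is one reason the typed variable is usually easier to debug.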

Useful for combination of header files and pre-processors:

Useful to only define structs once


● CAREFUL: Compilation error if you define something twice (ie. defining student in
two different header files) and then try to compile a program that includes both.
○ Can use an "include guard" in an h file to avoid defining things twice if the
header file is included multiple times.

Example: Causing a compilation error by defining a struct twice (motivation for header
guards)

L9_example_1.h file:
typedef struct student{
char name[20];
int age;
} student;

L9_example_2.h file:
typedef struct student{
char name[20];
int age;
} student;

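Terminal (assuming a C file, here given the placeholder name main.c, that includes both
header files):
gcc main.c -o L9_example.exe // --> causes a compilation error because student is defined
twice, so define it in one header file and use guards.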

Example: Using guards in a header file L9_example_3.h to avoid defining a struct twice.

#if !defined(L9_EXAMPLE_3_H)
#define L9_EXAMPLE_3_H

typedef struct student{


char name[20];
int age;
} student;

#endif // L9_EXAMPLE_3_H

● Logic: If L9_EXAMPLE_3_H is not defined (ie. if it's the first time we're including this
file), then define it and copy and paste lines 2-7. Otherwise, don't copy and paste lines
2-7.
● Note: This happens before compilation (ie. pre-processor).
● Note: Always do this for header files to avoid compilation errors.
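The effect of the guard can be reproduced in a single file by pasting the guarded text twice, which is what two #include lines for the same header would do. This is a sketch only; DEMO_STUDENT_H and student_age_next_year are names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* First "inclusion": DEMO_STUDENT_H is not defined yet, so the struct is defined. */
#if !defined(DEMO_STUDENT_H)
#define DEMO_STUDENT_H
typedef struct student {
    char name[20];
    int age;
} student;
#endif

/* Second "inclusion" of the same text: DEMO_STUDENT_H is now defined, so the
   preprocessor skips everything up to #endif and nothing is defined twice. */
#if !defined(DEMO_STUDENT_H)
#define DEMO_STUDENT_H
typedef struct student {
    char name[20];
    int age;
} student;
#endif

/* Uses the single definition of student. */
int student_age_next_year(const student *s)
{
    assert(s != NULL);
    return s->age + 1;
}
```

Deleting the #if and #endif lines from this sketch would make the compiler see the struct definition twice and reject the file.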

9.5 Reminder on Strings


1. Array: Happens to be an array of characters.
a. Cannot reassign arrays, but can modify the content of arrays
2. Pointer: Address of the first character of a string of some length
a. If the pointer has only been declared (ie. it does not point to allocated memory yet),
then cannot use strcpy.
b. But can use strcpy after allocating memory with malloc.

Examples:
1. Eg. Cannot do this to copy a string into a variable without allocating PROPER
memory for the string
#include <string.h>
int main()
{
char *name; // name is an address of the first character where a string can be stored
strcpy(name, "Alice"); // BAD (ie. CANNOT): name is not a valid address (ie. no memory was
// allocated), so you cannot copy "Alice" there
return 0;
}

2. Eg. Proper way to use malloc to allocate memory for a string.


#include <stdlib.h>
#include <string.h>
int main()
{
char *name = (char *)malloc(100*sizeof(char)); // name is the address of the first character
// of a block where a string can be stored; malloc gives a valid address.
strcpy(name, "Alice"); // GOOD (this is fine now)
name = "Alice"; // OK (ie. compiler is okay with this), where you stored Alice as a literal
// (with quotes), and name is the address of A.
return 0;
}

● Problems with name = "Alice";
○ "Alice" is a string literal (ie. treat it as a const char *), so you cannot modify the
content of name (ie. name[0] = 'a' is not allowed). It may crash, but the compiler doesn't catch it.
○ Since we overwrote name with the address of the A from "Alice", we lost the
address of the memory we allocated with malloc, so we cannot free it.
■ Memory that is never freed is a memory leak.

9.6 Strings in Structs


#include <string.h>
#include <stdlib.h>

// 1st option
typedef struct student_wrtp{
char *name; // we store the address where the name is stored
} student_wrtp;
// need to allocate name for each student

// 2nd option
typedef struct student_arr{
char name[200]; // we store 200 characters
} student_arr;
// don't need to allocate name for each student

Examples:
1. Eg. Using student_arr (GOOD) with valid approach
#include <string.h>
int main()
{
student_arr s1;
strcpy(s1.name, "John"); // This is fine because s1.name is an array of 200 characters, so
// there is space to copy into.
return 0;
}

2. Eg. Using student_wrtp with one bad example and one good example
#include <string.h>
int main()
{

student_wrtp s2;
strcpy(s2.name, "John"); // NOT fine because s2.name is a pointer to a random address, so we
// cannot use strcpy.
s2.name = "John"; // This is fine because we are storing the address of the first character
// (ie. 'J') of the literal in s2.name.
return 0;
}

3. Eg. Using student_wrtp with malloc


#include <stdlib.h>
#include <string.h>
int main()
{
student_wrtp s2;
s2.name = (char *)malloc(200*sizeof(char)); // This is fine because we are allocating memory
// for s2.name, so we can use strcpy.
strcpy(s2.name, "John"); // This is fine because s2.name is now a valid address, so the
// characters of "John" are copied into the allocated memory.
// when freeing, need to say
free(s2.name);
return 0;
}

L12: Valgrind

12.0 Valgrind
Detects incorrect accesses to memory and memory leaks.

12.1 How to run Valgrind


1. In the terminal: gcc -g filename1.c … filenameN.c -o executable_name
a. Parameters:
i. Note: -g is to get line numbers.
ii. Note: -o allows us to specify an executable_name.
2. valgrind ./executable_name

12.2 What to notice in the output of Valgrind


1. Detection of memory leaks: It shows the line numbers and different areas in which
there are memory leaks.

a. Tip: In the file, you can put printf statements to see how the output of Valgrind
corresponds to your code sequentially.
2. LEAK SUMMARY: Look for the leak summary near the end of the output.
a. Note: If you free properly, then it will say that there are no leaks

12.3 Segmentation Fault:


Trying to access memory that you are not allowed to access.
● Tip: Always free memory after using it.
● Careful: The program may run fine, but it may not be correct, which is worse as you
will find out later.

L13: Strcat using Multiple Files

13.0 Python String Overview

Normal Strcat

● strcat(str1, str2) concatenates str1 and str2, assuming that str1 has enough space to
accommodate extra characters from str2
● Will crash if not enough space: it does not check

Goal: implement Python strings

● Want to be able to concatenate strings and compute substrings without worrying about
memory allocation
● Want to be able to obtain string length (quickly)

Approach:

● Store the pointer to the memory where the string characters are stored
● Store length
● Reallocate pointer as necessary

Implementation Approach:

1. Part 1: Define a struct that stores the necessary data in a .h file
2. Part 2: Define the necessary functions in a C file
3. Part 3: Compile the C file with the functions together with the C file that uses the
functions.

13.1 Part 1: H-File (mystr.h)


#if !defined(MYSTR_H)
#define MYSTR_H
typedef struct mystr{
char *str;
int len;
} mystr;

void mystr_create(mystr *p_s, const char *str, int len);


void mystr_cat(mystr *p_dest, const mystr *p_src);
void mystr_destroy(mystr *p_s);
// Note: To be able to use the functions in mystr.c in other files, we need to declare the
// functions in a header file.

#endif
// Note: typedef is so that we can use mystr (ie. the name after the closing brace of the
// structure) as a type name instead of struct mystr.

13.2 Part 2: C File (mystr.c)


#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "mystr.h"

// CREATE A NEW STRING


void mystr_create(mystr *p_s, const char *str, int len)
{
// 1. Allocate space for the string and put len into p_s->len.
p_s->len = len; // same as (*p_s).len = len;
p_s->str = (char *)malloc((len + 1) * sizeof(char)); // Purpose: Make enough space for the
// str. +1 for the null character.

// 2. Check if p_s->str is NULL (ie. malloc failed)


if (p_s->str == NULL){
printf("Error: malloc failed\n");
exit(1); // need to include stdlib.h
}

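// 3. Copy the string into the new string.
// Note: This assumes len is the length of str (ie. strlen(str)); otherwise strcpy may write
// past the end of the block.
strcpy(p_s->str, str);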
}

// CONCATENATING TWO STRINGS


void mystr_cat(mystr *p_dest, const mystr *p_src)
{
// 1. Allocate enough space for the destination, source, and null character.
p_dest->str = (char *)realloc(p_dest->str, (p_dest->len + p_src->len + 1) * sizeof(char));
// +1 for the null character.
// Realloc: arguments consist of the pointer to the memory block, and the new size of
// the memory block.

// 2. Check if p_dest->str is NULL (ie. realloc failed)


if (p_dest->str == NULL){
printf("Error: realloc failed\n");
exit(1); // need to include stdlib.h
}
// 3. Concatenate the source string to the destination string.
strcat(p_dest->str, p_src->str);
// Note: const mystr *p_src: We don't want to modify the source string (ie. just to remind
// ourselves that we only read from the source and write to the destination).

// 4. Update the length of the destination string.


p_dest->len += p_src->len;
// Note: It's more efficient to send a pointer instead of the structure to the function
// because we don't have to copy the entire structure.
}

// DESTROYING A STRING
void mystr_destroy(mystr *p_s)
{
// 1. Free the string stored in the pointer.
free(p_s->str);
// 2. Optional: Set the pointer to NULL to avoid dangling pointers.
p_s->str = NULL;
// 3. Optional: Set the length to 0 to avoid using the string after it's been freed.
p_s->len = 0;
}
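The goals in 13.0 also mention substrings, which the functions above do not cover. One possible way to add them is sketched below. mystr_substr is a hypothetical function that is not part of the lecture code, and the half-open range (start included, end excluded, as in Python's s[start:end]) is an assumption. The struct is repeated only to keep the sketch self-contained; in the real project you would include mystr.h instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct mystr {
    char *str;
    int len;
} mystr;

/* Hypothetical function: store the characters p_src->str[start] .. p_src->str[end - 1]
   in *p_out as a new, separately allocated string. */
void mystr_substr(const mystr *p_src, int start, int end, mystr *p_out)
{
    assert(p_src != NULL && p_out != NULL);
    assert(0 <= start && start <= end && end <= p_src->len);

    int len = end - start;
    p_out->str = (char *)malloc((len + 1) * sizeof(char)); /* +1 for the null character */
    if (p_out->str == NULL) {
        printf("Error: malloc failed\n");
        exit(1);
    }
    memcpy(p_out->str, p_src->str + start, len); /* copy exactly len characters */
    p_out->str[len] = '\0';                      /* memcpy does not add the terminator */
    p_out->len = len;
}
```

Storing len in the struct is what makes the bounds check cheap: there is no need to call strlen on the source.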

13.3 Part 3: C File (use_mystr.c)


#include <stdio.h>
#include "mystr.h"

int main()
{

// Example 1: Creating a string using mystr_create
mystr s;
mystr_create(&s, "EngSci", 6);
// Result of the function: s.str points to a string "EngSci", s.len is 6.
// This is only possible since we passed the address of s, so the function can go to the
// address where s is stored and modify the contents of s.

// Example 2: Concatenating two strings using mystr_cat


mystr praxis;
mystr_create(&praxis, "PRAXIS!!!", 9);
mystr_cat(&s, &praxis);
printf("%s\n", s.str); // Output: EngSciPRAXIS!!! (ie. the concatenation of the two
// strings).
// Note: (&s)->str is equivalent to s.str is equivalent to (*(&s)).str.

// Note: a->b is equivalent to (*a).b

// Example 3: Destroying a string using mystr_destroy


mystr_destroy(&s);
mystr_destroy(&praxis);
// Result of the function: s.str points to NULL, s.len is 0.
return 0;
}

Purpose of having use_mystr.c:


● We want to have the functions for creating, concatenating, and destroying strings in a
separate file (ie. mystr.c), and then we want to use these functions in another file (ie.
use_mystr.c).
● We do this by compiling both files together, and using mystr.h, which has a header
guard to prevent multiple inclusions of the same file.
○ Ie. the structure is used in both files, but the structure will only be defined once.
○ Convention: Structures in .h files, and functions in .c files.
● You could use C strcat, but it doesn't check if there is enough space. Our function
reallocates first, so it is safer.

13.4 Multiple Files

Overview

● General structure:

○ Header file with structure definitions and function signatures with header
guard
○ C files contain the implementation of the functions and function definitions.
○ Main C file to test the functions.
● Note: Before using mystr_create, mystr_cat, and mystr_destroy, you need to include mystr.h,
which has the function signatures inside it.

How to compile multiple files using tasks.json?

1. Go to configure tasks and click on the compiler.


2. In tasks.json, you can add a new task to compile multiple files under
"args": [
"-fcolor-diagnostics",
"-fansi-escape-codes",
"-g",
"${fileDirname}/filenamei.c",

"${fileDirname}/filenamen.c",
"-o",
"${fileDirname}/${fileBasenameNoExtension}"
],
● Note: This is useful because VSCode run can then compile multiple files, so we can use
breakpoints and debug. Running gcc in the terminal only compiles the files.
● Useful: Use different implementations of a function in different files with the main
function and the .h file as the same.
○ Result: Swap implementations for different algorithms and see which one is
faster.

Using Watch in the Debugger in VSCode

Watch the value of a variable in the debugger.


1. Run the debug
2. Press the plus icon and add the variable name in the watch window.

3. Go through the iterations to see how the variable changes value.

L14-16: Python Integers

14.0 Python 3 Integer Overview:

Motivation

At some point n will overflow and stop giving the correct value (in practice it wraps around and
eventually prints 0). This is because the integer is stored in a fixed number of bits (32 bits in
this case).
#include <stdio.h>
int main()
{
int n = 10;
while(1){
printf("n = %d\n", n);
n = n * 2;
int temp;
scanf("%d", &temp);
}
return 0;
}
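The overflow can also be detected before it happens by comparing against INT_MAX from <limits.h>. The two functions below are a sketch with made-up names; they only illustrate where the fixed-size int runs out, which is the limitation the pyint design is meant to remove.

```c
#include <assert.h>
#include <limits.h> /* INT_MAX */

/* Returns 1 if n * 2 would not fit in an int, 0 otherwise (for n >= 0). */
int doubling_overflows(int n)
{
    assert(n >= 0);
    return n > INT_MAX / 2;
}

/* Double n as many times as possible without overflowing.
   Returns how many doublings were performed. */
int count_safe_doublings(int n)
{
    assert(n > 0);
    int count = 0;
    while (!doubling_overflows(n)) {
        n = n * 2;
        count++;
    }
    return count;
}
```

Checking n > INT_MAX / 2 before multiplying avoids ever computing a value that does not fit, which matters because signed overflow is undefined behavior in C.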

Wants

● Want to be able to store unlimited numbers of digits


● Want to be able to perform arithmetic operations

Design

● Store the digits as a (safe, resizable) string


● Store the digits from least significant to most significant (ie. reverse order) so that
addition is easier

Algorithm:

Add digits in reverse order (ie. this is convenient because we are adding digits to the end,
not the beginning), and then reverse the string.

14.1 Part 1: H-File (pyint.h)


#if !defined(PYINT_H)

#define PYINT_H
typedef struct pyint{
int *buffer; // 190 is stored as 0 9 1 (ie. reverse order)
int length;
} pyint;

void create_pyint(pyint **p, int length);


void destroy_pyint(pyint *p);
void set_pyint(pyint *p, int value);
void print_pyint(pyint *p);
void add_pyint(pyint *p, int value);
void plusplus(pyint *p);
// plusplus changes the length and the buffer, such that the new value is the old value + 1

#endif

14.2 Part 2: C File (pyint.c)


#include "pyint.h"
#include <stdlib.h>
#include <stdio.h>

void create_pyint(pyint **p, int length)


// Note: We are passing the address of the pointer to the struct, so we can modify the pointer itself.
{
// 0. Allocate space for the structure.
*p = (pyint *)malloc(sizeof(pyint)); // Creates an address that points to struct.

// 1. Creating a block of memory for the buffer


(*p)->buffer = (int *)malloc(length * sizeof(int));

// 2. Storing the length


(*p)->length = length;

// 3. Set all the digits to 0 so that the digits aren't random.


for(int i = 0; i < length; i++){
(*p)->buffer[i] = 0;
}
// Note: Another way to do this is to use calloc instead of malloc in step 1:
// (*p)->buffer = (int *)calloc(length, sizeof(int));
// Note: Another way to do this is to use memset (from <string.h>), which is a function
// that sets every byte of a block of memory to a certain value:
// memset((*p)->buffer, 0, length * sizeof(int));
// Both set everything in (*p)->buffer to 0 for length * sizeof(int) bytes.
}

void set_pyint(pyint *p, int value)


{
// Example: value: 190
// buffer: 0 9 1 (ie. stored in reverse order)
int i = 0;

// 1. Set the digits in reverse order


while(value > 0 && i < p->length){ // stop if the buffer has no more room for digits
p->buffer[i] = value % 10; // ie. 190 % 10 = 0, 19 % 10 = 9, 1 % 10 = 1, which is stored
// in reverse order.
value = value / 10; // ie. 190 / 10 = 19, 19 / 10 = 1, 1 / 10 = 0
i++;
}
}

void plusplus(pyint *p)


{
/*
Example 0: Pictorially of what we have to implement
0 9 1
1
+-----
1 9 1

Example 1: Want: 0 0 2 (example of a carry-over)


1 1
9 9 1
1
+------
0 0 2

Example 2: Want: 0 0 0 1 (the carry-over adds a new digit)
1 1 1
9 9 9
1
+------
0 0 0 1
*/

// 1. Initialize carry and index i


int carry = 1;
int i = 0;
// 2. Set while-loop that keeps going while carry is not 0 and i is less than the length of
// the buffer (ie. we stop when we run out of digits in the buffer or the carry is 0)
while(carry != 0 && i < p->length){
int sum = p->buffer[i] + carry; // Note this can be between 1 and 10 because carry is 1
// inside the loop and p->buffer[i] can be 9 at most.
p->buffer[i] = sum % 10; // Buffer gets the last digit of the sum.
carry = sum / 10; // Carry is going to be 0 if sum is less than 10, and 1 if sum is
// 10 (ie. integer division).
i++;
}

// 3. If carry is not 0, then we have to increase the size of the buffer by 1 and set the
// new last digit to 1.
// Note: This is used when the buffer is full, but the carry is still 1 indicating we
// have to add another digit (ie. 999 + 1 = 1000, 99 + 1 = 100, etc.)
if(carry != 0){
p->buffer = (int *)realloc(p->buffer, (p->length + 1) * sizeof(int)); // We are
// increasing the size of the buffer by 1 to include another digit (ie. hundreds, thousands, etc.)
p->buffer[p->length] = 1; // The new most significant digit must be 1 because this only
// occurs for 99 + 1 or 999 + 1, etc.
p->length++; // Increases the length by 1 to indicate that we have added another digit.
}
}

void add_pyint(pyint *p, int value)


{
// 1. Use a for loop with plusplus to keep adding for value times.
for(int i = 0; i < value; i++){
plusplus(p);
}
}

void print_pyint(pyint *p)


{
int i;
// 1. Print the digits in reverse order using i-- (ie. starts from the last digit and goes
// to the first digit)
for(i = p->length - 1; i >= 0; i--){
printf("%d", p->buffer[i]); // Print the digit inside the buffer.
}

// 2. Print a new line at the end.


printf("\n");
}
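pyint.h declares destroy_pyint, but no definition appears in pyint.c above. One possible definition is sketched below, under the assumption that the pyint was built the way create_pyint builds it (one malloc for the struct, one for the buffer). The create_pyint shown here is a shortened variant using calloc, included only so the sketch is self-contained.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct pyint {
    int *buffer; /* digits in reverse order */
    int length;
} pyint;

/* Same allocation steps as create_pyint above, with calloc zeroing the digits. */
void create_pyint(pyint **p, int length)
{
    assert(p != NULL && length > 0);
    *p = (pyint *)malloc(sizeof(pyint));
    if (*p == NULL) exit(1);
    (*p)->buffer = (int *)calloc((size_t)length, sizeof(int));
    if ((*p)->buffer == NULL) exit(1);
    (*p)->length = length;
}

/* One possible destroy_pyint: free the digits first, then the struct itself. */
void destroy_pyint(pyint *p)
{
    if (p == NULL) return; /* nothing to free */
    free(p->buffer);       /* must happen before free(p), which invalidates p->buffer */
    free(p);
}
```

The order matters: after free(p) the field p->buffer can no longer be read, so the buffer has to be released first.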

14.3 Part 3: C File (use_pyint.c)


#include "pyint.h"

#include <stdio.h>

int main()
{
// Example 1: Using create_pyint, and set_pyint.
pyint *p; // p stores an address of a pyint struct (initially it's garbage).
create_pyint(&p, 3); // Want to modify the value of p, so we send the address of p (THERE IS
// NO OTHER WAY)
// Note: Also create_pyint requires a pointer to a pointer, so we need to send the
// address of p.
set_pyint(p, 999);

// Example 2: Using print_pyint and plusplus.


print_pyint(p);
plusplus(p);
print_pyint(p);

// Example 3: Using add_pyint 1000 times to add 1000 * 10003 = 10003000 to p.


print_pyint(p);
for (int i = 0; i < 1000; i++){
add_pyint(p, 10003);
}
print_pyint(p);
return 0;
}

14.4 Example of Using PlusPlus


● buffer = [0,9,1] representing 190
● carry = 1
● i=0
● Iteration 1:
○ sum = p→buffer[0] + carry = 0 + 1 = 1
○ p→buffer[0] = sum % 10 = 1 % 10 = 1
○ carry = sum / 10 = 1 / 10 = 0
● Iteration 2:
○ carry = 0, so loop ends.
○ buffer = [1,9,1] representing 191
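The same trace can be run on a plain array to see the carry-over cases. The function below is a sketch of the loop from plusplus without the struct or realloc; increment_digits is a hypothetical name, and returning the final carry is a choice made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Add 1 to a number stored as reversed digits in buf[0..len-1].
   Returns the final carry: 1 means the number needs one more digit. */
int increment_digits(int *buf, int len)
{
    assert(buf != NULL && len > 0);
    int carry = 1;
    for (int i = 0; i < len && carry != 0; i++) {
        int sum = buf[i] + carry; /* between 1 and 10 */
        buf[i] = sum % 10;
        carry = sum / 10;
    }
    return carry;
}
```

For 199 stored as [9,9,1], the first two iterations each produce sum = 10 (digit 0, carry 1) and the third produces sum = 2, giving [0,0,2] with a final carry of 0. For [9,9,9] every digit becomes 0 and the returned carry is 1, which is the case where plusplus grows the buffer.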

Linked Lists

L17: Linked Lists

17.0 Linked List Overview


A way to store lists without pre-allocating blocks of memory.
● Purpose: Some operations on a linked list are O(1) (eg. adding an element at the front
of the list is constant time), while with the previous ways they are O(n) (ie. linear
time).
○ Note: This is because you don't have to copy the entire list to a new block of
memory like you may have to with arrays, strings, and Python 3 integers.
● Note: Inside a function, use a separate node *cur to walk the list instead of changing the
head; otherwise we lose track of where the list starts.

What we've seen so far with arrays, strings, and Python 3 integers:
You have to pre-allocate memory, and reallocate a new block of memory and copy the old
block of memory to the new block of memory.

What we want: Nodes (ie. chain of data), where at every node, there is some data, and a
pointer to the next node in the linked list.

Analogy: This is like a scavenger hunt, where you go in each room, and you find something,
but also a note on the next room to go into.
● Note: The nodes do not have to be at consecutive memory addresses.

17.1 Node Structure


typedef struct node
{
int data; // Data stored in the node
struct node *next; // Pointer to the next node
// Note: This needs to be called struct node because node is not defined yet, but when
// in use, you can use node *.
} node;

17.2 Creating Simple Linked Lists Manually
1. Example 1: Create a Simple Linked List
#include <stdlib.h>
int main()
{
// 1. Create a head node.
node *head = (node *)malloc(sizeof(node)); // Allocate memory for the head node (ie. the
// first node in the linked list).

// 2. Set the data of the head node to 1.


head->data = 1;
// Note: head->data is only possible because we allocated memory for the head node.

// 3. Create a second node by allocating memory in the linked list.


node *n2 = (node *)malloc(sizeof(node));

// 4. Set the data of the second node to 5.


n2->data = 5;

// 5. Set the next pointer of the head node to the second node, and mark n2 as the last node.
head->next = n2;
n2->next = NULL;

// Pictorial Representation so far:


// 1 5
// head -> n2
return 0;
}

2. Example 2: Insert a node between two nodes using the previous example
#include <stdlib.h>
int main()
{
// 1. Want to insert the value 10 between 1 and 5
node *n10 = (node *)malloc(sizeof(node)); // Allocate memory for the third node in the
// linked list.
n10->data = 10; // Set the data of the third node to 10.

// 2. Set the next pointer of the head node to n10


head->next = n10;

// 3. Set the next pointer of n10 to n2.


n10->next = n2;

/* Pictorial Representation so far:


10
n10

Lee 56
/ \
1 5
head n2
*/

free(head); // Free the memory of the head node.


free(n2); // Free the memory of the second node.

return 0;
}

17.3 Creating Linked List Function


#include <stdlib.h>
void create_LL(node **p_head, int *data, int size)
{
// Pictorial representation
/*
head->A->B->...
*/

// 0. Check if the size is 0.


if(size == 0){
*p_head = NULL;
return;
}

// 1. Create the current node


node *cur = (node *)malloc(sizeof(node));
// Allocate memory for the current node (ie. the first node in the linked list).

// 2. Set the data of current node to data[0]


cur->data = data[0]; // Set the data of the current node to data[0].

// 3. Set the *p_head to the first node in the linked list.


*p_head = cur; // Set the *p_head to the first node in the linked list.

// 4. Create the rest of the nodes using a for-loop to store the data in the nodes.
for(int i = 1; i < size; i++){
cur->next = (node *)malloc(sizeof(node));
// Allocate memory for the next node in the linked list and store the address of the
// new node in cur->next.

cur->next->data = data[i]; // Set the data of the next node to data[i].

cur = cur->next; // Set the current node to the next node.

}

// 5. Need to know when the node is the last node.


cur->next = NULL; // Set the next pointer of the last node to NULL.
}

17.4 Printing Linked List Function


#include <stdio.h>
void print_LL(node *head)
{
// 1. Use a while loop to print the linked list.
// Note: Head is always the (address of) node we're currently printing.
while(head != NULL){
printf("%d -> ", head->data); // Print the data of the current node.
head = head->next; // Set the current node to the next node.
}
// Reason for the condition: once head is NULL, we have moved past the final node of the linked list.
}

17.5 Inserting a Node Into Linked List


#include <stdlib.h>
void insert_LL(node **p_head, int ind, int num)
{
    // Purpose: Insert num at index ind (assumes 0 <= ind <= length of the list).

    // 0. Create the new node that will be inserted.
    node *new = (node *)malloc(sizeof(node)); // Allocate memory for the new node.
    new->data = num;                          // Set the data of the new node to num.

    // 1. Special case: ind == 0.
    // We need to insert before the current head, so we must be able to change the
    // head itself (this is why the function takes a node **).
    if(ind == 0){
        new->next = *p_head; // The new node points to the current head node.
        *p_head = new;       // The head is now the new node.

        // Pictorial Representation
        /*
        Before: A->B->C->D->E
        After:  X->A->B->C->D->E, where *p_head points to X now, not A.
        */
    }

    // 2. General case: go through the linked list until node ind-1 and insert the
    // new node right after it.
    else{
        // 2.1 Create a current node and set it to the head node.
        node *cur = *p_head;

        // 2.2 Move cur forward ind-1 times so that it ends at node ind-1.
        for(int i = 0; i < ind - 1; i++){
            cur = cur->next;
        }

        // 2.3 The new node (i.e. X) points to the node currently after cur (i.e. D).
        new->next = cur->next;

        // 2.4 cur (i.e. C) now points to the new node (i.e. X) instead of D.
        cur->next = new;

        // Pictorial Representation (ind = 3):
        /*
        Before: A->B->C->D->E
        After:  A->B->C->X->D->E
        */
    }
}

19.0 Linked List


Overview

Motivation for Linked Lists:


● To remove an element from an array/block, need to potentially shift almost the
entire block to the left in memory
● Use a “linked-list” structure to store the data instead

Node:
● Each item is stored in a node that contains:
○ The value of the item (called the node’s data)
○ A pointer to the next node
List:
● A list consists of two pieces of information:
○ A pointer to the first node
○ The number of elements in the list
Advantage of Linked Lists:
● Their size is not fixed and can grow and shrink to accommodate exactly the number
of values actually stored

Linked List Insert

1. Suppose we want to insert value 34 at index 2 in the linked list below (the index of
each node is NOT stored in the linked list—it is indicated in the picture for
convenience)

2. First, we create a new node to store the new value

3. Next, we set the next pointer of the new node to the next pointer of the node at
index 1.

4. Next, we set the next pointer of the node currently at index 1 to point to the new
node

5. Finally, we update the value of n (it’s not necessary to store the number of elements
for a linked list, but it is often done for convenience)
a. The complexity is O(1)—assuming we already have a pointer to the element
at index 1
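
The insertion steps above can be sketched in Python. This is only an illustration: the names Node, LinkedList, and insert_after, and the values 10, 20, 30, are assumptions made for the example and are not taken from the lecture code.

```python
class Node:
    def __init__(self, data):
        self.data = data   # the value stored in the node
        self.next = None   # reference to the next node (None at the end)

class LinkedList:
    def __init__(self):
        self.head = None   # pointer to the first node
        self.n = 0         # number of elements in the list

def insert_after(lst, prev, value):
    """Insert value right after the node prev (assumed to be in lst)."""
    new_node = Node(value)       # create a new node for the new value
    new_node.next = prev.next    # new node points to prev's old successor
    prev.next = new_node         # prev now points to the new node
    lst.n += 1                   # update the element count
    return new_node

# Build 10 -> 20 -> 30 by hand, then insert 34 at index 2 (after index 1).
lst = LinkedList()
lst.head = Node(10)
lst.head.next = Node(20)
lst.head.next.next = Node(30)
lst.n = 3
insert_after(lst, lst.head.next, 34)

# Walk the list to collect its values in order.
values = []
cur = lst.head
while cur is not None:
    values.append(cur.data)
    cur = cur.next
print(values)  # [10, 20, 34, 30]
```

Note that insert_after never walks the list, which is why the insertion itself is O(1) once the node at index 1 is known.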

Linked List Remove

1. Now, suppose we want to remove the value at index 1 from the linked list below

2. First, we set the next pointer of the node at index 0 to the value of the next pointer
of the node at index 1

3. Next, we “delete” the old node at index 1—meaning we simply release the memory
that was allocated for the node

4. Finally, we update the value of n.
a. The complexity is O(1)—assuming we already have a pointer to the element
at index 0.
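
A minimal Python sketch of the removal steps above, assuming a hypothetical helper remove_after and made-up values 10, 20, 30. In Python there is no explicit free; the removed node is reclaimed once nothing refers to it.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def remove_after(prev):
    """Unlink the node that follows prev and return its data."""
    removed = prev.next          # the node being removed
    prev.next = removed.next     # bypass the removed node
    removed.next = None          # drop the removed node's link to the list
    return removed.data

head = Node(10, Node(20, Node(30)))  # 10 -> 20 -> 30
n = 3
removed_value = remove_after(head)   # remove the node at index 1
n -= 1                               # update the element count
print(removed_value, head.next.data, n)  # 20 30 2
```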

Linked List Get

1. Finally, suppose we want to get the value at index 2 from the linked list below.

2. This requires setting a pointer to point to each node in turn, keeping count, until we
reach index 2

3. The complexity is O(n) in the worst-case (when retrieving the item at the last index
in the list)
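
The get operation can be sketched as follows. The function name get and the example values are assumptions for illustration, and the sketch assumes the index is valid.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def get(head, index):
    """Return the value at position index (assumes 0 <= index < length)."""
    cur = head
    for _ in range(index):   # follow one next pointer per step
        cur = cur.next
    return cur.data

head = Node(10, Node(20, Node(30)))
third = get(head, 2)
print(third)  # 30
```

The loop body runs index times, so reaching the last element of an n-element list takes n-1 steps.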

Summary of Worst-Case Complexity of Array and Linked List
● n is the number of items in the list
● Key: The complexity listed for insert and remove for linked lists is only the time
taken for the actual insertion or removal.
○ Note: This does not count the time required to find the insertion/removal point,
which is O(n) in the worst case.
● Key: The worst-case complexity of array insertion and removal is O(n) because you
may have to shift almost the entire block of memory (or copy it somewhere else).

Operation    Array    Linked List
Insert       O(n)     O(1)
Remove       O(n)     O(1)
Get          O(1)     O(n) (or O(1) if a pointer to the node is already known)

Stacks, and Queues

L19: Abstract Data Types, Stack

19.1 Abstract Data Types


An “Abstract Data Type” (ADT) is any collection of values, together with operations on those
values.
● Eg. Int with operations + - * / %.
● Eg. Lists with operations insert, remove, get
● Note: An ADT specifies what values are represented and what operations can be
performed (i.e. values and operations), but not how to store the values or how to carry
out the operations (i.e. implementation).
● Note: A data structure is an implementation of an ADT—a way to represent the
values, and algorithms for each operation

19.2 Stack (Abstract Data Type)


● LIFO (Last In First Out): Means that the last element that is added to the stack is the
first one to be removed.
● Operation:
○ Push vs. Pop:
■ Push(elem): Add elem to the top of the stack.
■ Pop(): Return the elem from the top of the stack and remove it from the
stack.
○ isEmpty(): Return True if the stack is empty, and element cannot be popped,
false otherwise.
● Useful: Depth-first search algorithms.

19.3 Queue (Abstract Data Type)


● FIFO (First In First Out): This means that the first element added to the queue will be
the first one to be removed.
● Operation:
○ enqueue (to add an element to the rear of the queue)
○ dequeue (to remove and return the element from the front of the queue).
● Useful: Breadth-first search algorithms.
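
A minimal sketch of the queue ADT in Python. The class name Queue and the use of collections.deque as the underlying data structure are choices made for this illustration; the notes do not prescribe an implementation.

```python
from collections import deque

class Queue:
    def __init__(self):
        self._items = deque()        # empty queue

    def enqueue(self, elem):
        self._items.append(elem)     # add to the rear of the queue

    def dequeue(self):
        return self._items.popleft() # remove and return the front element

    def is_empty(self):
        return len(self._items) == 0

q = Queue()
for x in [1, 2, 3]:
    q.enqueue(x)
order = [q.dequeue(), q.dequeue(), q.dequeue()]
print(order)  # [1, 2, 3]
```

The elements come out in the order they went in (FIFO), whereas a stack would return them in the reverse order (LIFO).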

L20: Classes, Stack, Linked List

20.0 General Classes

Motivation

To remove the redundancy of having to pass the stack to every function.


● A way to combine C-like structs with dedicated functions.

General Structure of Classes


class ClassName:
    def __init__(self, arg1, arg2, ...):
        self.var1 = arg1
        self.var2 = arg2
        ...
    def function1(self, ...):
        ...
    def function2(self, ...):
        ...

Terminology

1. Argument of __init__:
a. In var = ClassName(arg1, arg2, ...), Python creates a new object and passes it to
__init__ as the self argument; arg1, arg2, ... are passed as the remaining arguments.
2. __init__ function (initialize):
a. All classes have a function called __init__(), which is always executed when an
object of the class is being created.
b. Use the __init__() function to assign values to object properties, or other
operations that are necessary to do when the object is being created.
c. The self parameter is a reference to the current instance of the class, and is used
to access variables that belong to that instance.
3. Attribute:
a. An attribute is a variable that is part of a class.
b. Attributes are accessed using the dot notation, i.e. object.attribute.
c. Self is a reference to the current instance of the class, and is used to access
variables that belong to the class.
d. Inside functions in the class, we send the self argument to the function, so that
the function knows which object it is working with.

20.1 Example 1 of classes using Student
class Student:
    # 1. Create a student using the __init__ method, which initializes the student's
    # name and esc190mark.
    def __init__(self, name, esc190mark):
        self.name = name              # C: s.name = "Jack"
        self.esc190mark = esc190mark  # C: s.esc190mark = 98

    # 2. Print the student's name and esc190mark using the print method.
    def print(self):
        print(self.name, self.esc190mark)

s = Student("Jack", 98)
s.print() # Same as Student.print(s); prints Jack 98. Classes are convenient because
          # the dot notation is shorter and more readable.
# In the call s.print(), the object s is passed as the self argument.

20.2 Example 2 of classes using Stacks


class Stack:
    # 1. Create a stack using the __init__ method, which initializes the stack to be empty.
    def __init__(self):
        self.stack = [] # When you create a stack, it is empty.

    # 2. Check if the stack is empty using the is_empty method.
    def is_empty(self):
        return len(self.stack) == 0 # O(1) because it only checks the length of the list.

    # 3. Push an element onto the stack using the push method.
    def push(self, elem):
        # Worst case O(n), where n is len(self.stack): the list may not have space for
        # the new element, so it may have to be copied to a new location in memory.
        # Averaged over many pushes, append is O(1).
        self.stack.append(elem)

    # 4. Pop an element from the stack using the pop method.
    def pop(self):
        if self.is_empty():
            print("Cannot pop from an empty stack.")
        else:
            # O(1) because removing the last element does not require copying the list.
            return self.stack.pop()

# Using the Stack class:
s = Stack()
s.stack          # []
s.push(52)
s.push(125)
print(s.stack)   # [52, 125]
print(s.pop())   # 125
print(s.stack)   # [52]
# In a call such as s.push(52), the object s is passed as the self argument.

20.3 Example 3 of classes using Linked Lists


● Purpose: Use it to implement a stack with O(1) push and pop.
● Note: Everything in Python is an address, so you don't have to worry about pointers
like you do in C.
1. Linked List Class:
class LL:
    # 1.1 Create a linked list using the __init__ method, which initializes the head of
    # the linked list to None.
    def __init__(self):
        self.head = None # Note: An attribute can refer to any type of object (here, None or a Node).

    # 1.2 Insert an element into the linked list at a specific location using the insert method.
    def insert(self, loc, element):
        new_node = Node(element) # data: element, next: None

        # 1.2.1 If loc is 0, insert the new node at the beginning of the linked list.
        if loc == 0:
            # Before: head -> n0 -> n1 -> ... -> nk -> None
            # After:  head -> new_node -> n0 -> n1 -> ... -> nk -> None
            # Order is important: if you do self.head = new_node first, you lose the
            # reference to the rest of the list.
            new_node.next = self.head
            self.head = new_node

        # 1.2.2 If loc is not 0, insert the new node at index loc.
        else:
            # Before: head -> n0 -> n1 -> ... -> nk -> None
            cur = self.head
            # Move forward loc - 1 times: to insert at loc = 1 we insert right after
            # the first node, so cur must stop at the first node.
            for i in range(loc - 1):
                cur = cur.next
            # After (loc = 1): head -> n0 -> new_node -> n1 -> ... -> nk -> None,
            # where cur is at n0 and cur.next was n1.
            new_node.next = cur.next # Connect new_node to the rest of the list (i.e. n1).
            cur.next = new_node      # Connect the previous node (i.e. n0) to new_node.

    # 1.3 Remove the element at a specific location using the delete method.
    def delete(self, loc):
        # 1.3.1 If loc is 0, remove the first node from the linked list.
        if loc == 0:
            # Before: head -> n0 -> n1 -> ... -> nk -> None (Note: You don't need to free in Python)
            # After:  head -> n1 -> ... -> nk -> None
            self.head = self.head.next

        # 1.3.2 If loc is not 0, remove the node at index loc.
        else:
            # Before: head -> n0 -> n1 -> ... -> nk -> None
            cur = self.head
            # Move forward loc - 1 times so that cur stops at the node just before the
            # one being removed.
            for i in range(loc - 1):
                cur = cur.next
            # After (loc = 1): head -> n0 -> n2 -> ... -> nk -> None,
            # where cur is at n0 and cur.next was n1.
            cur.next = cur.next.next # Connect the previous node to the node after the removed one.

2. Node Class:
class Node:
    def __init__(self, data):
        self.data = data # self.data exists after the __init__ method is called, so it is
                         # an object attribute.
        self.next = None

3. Stack Class:
class Stack:
    # 3.1 Create a stack using the __init__ method, which initializes the stack to be empty.
    def __init__(self):
        self.stack = LL() # self.stack is a linked list whose head is None.

    # 3.2 Push: insert at index 0 (i.e. O(1)).
    def push(self, elem):
        self.stack.insert(0, elem) # Use the insert function from LL to insert at 0.

    # 3.3 Pop: get the 0th element, and then delete the 0th element (i.e. O(1)).
    # Note: This assumes the stack is not empty.
    def pop(self):
        cur_node = self.stack.head
        self.stack.delete(0) # Use the delete function from LL to delete at 0.
        return cur_node.data

● Why do push and pop have to be at index 0?
○ The head of the linked list is the top of the stack. Inserting and deleting at the
head takes O(1) time, whereas reaching any other index requires walking through the list.

4. Using the function:


n = Node(52) # Think of it like a struct, but in reality __init__ gets called, so
             # n.data = 52 and n.next = None.
n.a1432141 = 12 # n now has an attribute a1432141 with value 12, but this is not good
                # practice because it is not clear what the attribute is for.

L21: Classes, Function and Operator Overloading

21.0 Example Class Attribute vs. Instance Attribute


Takeaway: __init__ sets up attributes, but an assignment of the form object.name = value
can also create a new attribute outside of __init__.
1. Example 1: Using AcornEntry
# Note: The following is an unconventional method of using __init__ and attributes.
class AcornEntry:
    def __init__(self):
        self.mymark = 100

acorn = AcornEntry()
acorn.yourmark = 90
print(acorn.mymark)   # 100
print(acorn.yourmark) # 90. Note: yourmark is also an instance attribute, even though it
                      # was created outside of __init__.

2. Example of function attribute as a global variable:


def f(x):
    print("Current argument:", x)
    print("Previous argument:", f.mem)
    f.mem = x # Note: f.mem is an attribute of the function f. It acts like a global
              # variable because f is a global function.

# Calling the function f to show that f.mem persists between calls.
f.mem = None
f(10) # Previous argument: None
f(20) # Previous argument: 10
f(30) # Previous argument: 20

21.1 Function and Operator Overloading


We can define special methods to control how addition, conversion to string, comparison,
etc. work for a custom class.
● Note: Python provides default __init__ and __repr__ methods for every class, but there
is no default ordering: without our own __lt__, using < on two objects of the class
raises a TypeError.

1. Example 1: Using AcornEntry1


class AcornEntry1:
    # 1. Initialization of the class
    def __init__(self, course, mark):
        self.course = course
        self.mark = mark # Note: .mark and mark do not have to have the same name.

    # 2. Representation of the object
    # Motivation: We want to print the object in a certain way, instead of using the
    # default __repr__, which isn't readable.
    # Note: An f-string inserts the value of each expression written in curly braces.
    def __repr__(self):
        return f"The mark in {self.course} is {self.mark}"

    # 3. Less than (__lt__) to compare by course, and then by mark.
    # Note: The Python operator < calls this function to compare the objects.
    def __lt__(self, other):
        # self is the object on the left of <, and other is the object on the right.
        if self.course < other.course:
            return True
        elif self.course == other.course:
            return self.mark < other.mark
        else:
            return False

21.2 Examples of Using AcornEntry1
The examples correlate to each other.
1. Example 1: Initialization of the class using __init__.
acorn_artsie = AcornEntry1("CSC108", 100)
acorn_engsci = AcornEntry1("CSC180", 8)

2. Example 2: Printing the object using our own __repr__ method.


print(acorn_artsie) # The mark in CSC108 is 100
# Note: If our __repr__ wasn't defined, this would print the class name and the address
# of the object, which is not readable.
# The print function uses our __repr__ method to print the object in a readable way.

3. Example 3: Comparing the objects using our own __lt__ method.


print(acorn_artsie < acorn_engsci) # True, because "CSC108" < "CSC180" as strings
# Note: Since we used a less-than comparison, Python calls our __lt__ method to compare
# the objects.
# Note: In __lt__, self is the object on the left of < and other is the object on the right.

4. Example 4: Using the other implementation of __lt__ to compare by course, and then
by mark.
entries = [AcornEntry1("ESC180", 90),
           AcornEntry1("ESC190", 87),
           AcornEntry1("ESC180", 100),
           AcornEntry1("ESC190", 89)]
entries.sort() # Note: This uses our __lt__ method to sort the objects. Analogous to
               # giving qsort a comparison function (i.e. the __lt__ method).
print(entries) # [The mark in ESC180 is 90, The mark in ESC180 is 100,
               #  The mark in ESC190 is 87, The mark in ESC190 is 89]

Summary:
● print uses __repr__ (because no __str__ is defined) to show the object in a readable way.
● < calls __lt__ to compare the objects.
● Sort calls __lt__ to sort the objects.

21.3 How to use Python in the terminal?
● python3 starts python in the terminal.
● Use exit() to exit python in the terminal.

Dynamic Programming

L22: Memoization and Dynamic Programming

Memoization
Maintain a table of values that were already computed.
● Ie. Store the result of a function call every time it happens.
● Idea: One way to improve is by not computing fib(5) the second time we need to
compute, but just remember it.

Fibonacci Sequence Using Normal, Memoization, and Dynamic Programming


1. Fibonacci sequence w/o memoization
def fib(n):
    # 1. Base case
    if n in [0, 1]:
        return 1

    # 2. Recursive call
    return fib(n-1) + fib(n-2) # By definition of the Fibonacci sequence

# Example of the call tree for fib(3) of example 1:
#   fib(1)  fib(0)
#       \   /
#       fib(2)  fib(1)
#           \   /
#           fib(3)

2. Fibonacci sequence w/ memoization


def fib_memo(n, fib_dict = {}):
    # 1. Base case
    if n in [0, 1]:
        return 1

    # 2. Recursive call using memoization.
    # If n is not in the dictionary yet, compute the value for the key n using the
    # definition of the Fibonacci sequence and store it in the dictionary.
    # If n is already in the dictionary, skip the computation.
    elif n not in fib_dict:
        fib_dict[n] = fib_memo(n-1, fib_dict) + fib_memo(n-2, fib_dict)

    # Return the value stored for the key n.
    return fib_dict[n]

3. Fibonacci sequence with memoization by inputting 0, 1 into dict right away.


def fib_memo2(n, fib_dict = {0: 1, 1: 1}):
    # 1. Recursive call using memoization
    if n not in fib_dict:
        fib_dict[n] = fib_memo2(n-1, fib_dict) + fib_memo2(n-2, fib_dict)

    return fib_dict[n]

# Call tree for fib_memo2(5):
#   fib(1)  fib(0)
#       \   /
#       fib(2)  fib(1)
#           \   /
#           fib(3)  fib(2)
#               \   /
#               fib(4)  fib(3)
#                   \   /
#                   fib(5)

1. Process: The second fib(2) will not branch out since it already knows the answer (i.e.
the right 2).
a. Therefore, the calling of fib(4) will go down the fib(3) branch, and then the fib(2)
branch.
b. So after the fib(3) branch is done, it will go to the fib(2) branch, which already
knows the answer in the dictionary.
2. Process: The second fib(3) will not branch out since it already knows the answer (i.e.
the right 3).
3. Shape of the tree:
a. Full branch always going left, of size n.
b. Branching out to branches to the right, 1 step every time.
c. Total: 2n calls.
d. O(n) calls.
4. Time complexity of fib_memo2: If counting the number of additions: O(n) is the time
complexity because every number is only computed once, and there's n of them.

a. If we're counting operations, we need to account for the fact that addition takes
O(k) time, where k is the number of digits in the larger number.
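
To make the call counts concrete, the sketch below counts how many times the recursive function is entered with and without memoization. The helper names count_calls_plain and count_calls_memo are made up for this illustration; the base cases follow the convention fib(0) = fib(1) = 1 used above.

```python
def count_calls_plain(n):
    """Number of calls made by the plain recursive fib for input n."""
    calls = 0
    def fib(k):
        nonlocal calls
        calls += 1
        if k in (0, 1):
            return 1
        return fib(k - 1) + fib(k - 2)
    fib(n)
    return calls

def count_calls_memo(n):
    """Number of calls made by the memoized fib for input n."""
    calls = 0
    memo = {0: 1, 1: 1}              # base cases stored up front
    def fib(k):
        nonlocal calls
        calls += 1
        if k not in memo:            # compute each value only once
            memo[k] = fib(k - 1) + fib(k - 2)
        return memo[k]
    fib(n)
    return calls

plain = count_calls_plain(20)
memoized = count_calls_memo(20)
print(plain, memoized)  # 21891 39
```

For n = 20 the memoized version makes 2n - 1 = 39 calls, in line with the roughly 2n calls estimated above, while the plain version makes tens of thousands.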

Dynamic Programming

Overview

● Solve subproblems, and store the solutions to those subproblems


● Use solutions to small subproblems to compute solutions to larger problems

Process

1. Divide a complex problem into a number of simpler overlapping problems


2. Define a relationship between solutions to more complex problems and solutions to
simpler problems.
3. Store solutions to each subproblem, solving each subproblem once.
4. Use stored solutions to solve the original problem.

Fibonacci Example
1. n+1 problems, where the i-th problem is computing the i-th Fibonacci number.
2. Can compute F(i) using F(i-1) and F(i-2).
3. Use fib_list to store solutions.
4. Solve for the n-th Fibonacci number.

● Small problems are fib(0), fib(1), and fib(2) and building up to fib(3), fib(4), and fib(5),
....
● Note: fib_list[i] (square brackets) looks up a stored value in an array, whereas fib(i)
(round brackets) calls a recursive function.
○ Conceptually: They are the same whether you use an array or a recursive
function.
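
A minimal bottom-up sketch of the four steps above. The function name fib_dp is an assumption, while fib_list is the name used in the notes; as elsewhere in these notes, fib(0) = fib(1) = 1.

```python
def fib_dp(n):
    """Return the n-th Fibonacci number, building up from the small subproblems."""
    fib_list = [1] * (n + 1)          # fib_list[0] = fib_list[1] = 1 (base cases)
    for i in range(2, n + 1):
        # F(i) is computed from the two stored, smaller solutions.
        fib_list[i] = fib_list[i - 1] + fib_list[i - 2]
    return fib_list[n]

print(fib_dp(5))  # 8
```

Each subproblem is solved exactly once, so the loop performs n - 1 additions.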

Painting Houses

Overview
Goal: paint a row of n houses red, green, or blue s.t.
● Total cost is minimized. cost(i, col) is the cost to paint the i-th house in color col
● No two adjacent houses have the same color

Step 1: Subproblems

1. R(i): min cost to paint the first i houses, with the i-th house painted red
2. G(i): min cost to paint the first i houses, with the i-th house painted green
3. B(i): min cost to paint the first i houses, with the i-th house painted blue
● Note: The color of the i-th house is fixed, and the houses to its left are painted so
that the total cost is minimized.

Step 2: Relationship Between Problems

● Note: Each house has its own (arbitrary) cost for each of the three colors R, G, and B.
● Note: R(i) = cost(i, red) + min(G(i-1), B(i-1)), and similarly for G(i) and B(i).
● Note: The overall minimum cost is min(R(n), G(n), B(n)), i.e. the minimum over the
three possible colors of the last house.

Approaches

1. Brute force: Try all possible combinations of colors and find the one with the
minimum cost.
a. Time complexity: O(3^n) where n is the number of houses.

2. Dynamic programming.
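
For comparison with the dynamic programming solution, here is a sketch of the brute-force approach. The function name brute_force_min_cost and the small 3-house cost table are assumptions for illustration; houses[col][i] is taken to be the cost of painting house i in color col.

```python
from itertools import product

def brute_force_min_cost(houses):
    """Try every coloring (3^n of them) and return the cheapest valid total cost."""
    n = len(houses[0])
    best = None
    for colors in product(range(3), repeat=n):
        # Skip colorings where two adjacent houses share a color.
        if any(colors[i] == colors[i + 1] for i in range(n - 1)):
            continue
        total = sum(houses[colors[i]][i] for i in range(n))
        if best is None or total < best:
            best = total
    return best

example = [[7, 6, 7],     # cost of red for houses 0, 1, 2
           [3, 8, 9],     # cost of green
           [16, 10, 4]]   # cost of blue
result = brute_force_min_cost(example)
print(result)  # 13 (green, red, blue)
```

This is only practical for small n, but it is a useful check on a dynamic programming implementation.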

L23: Painting Houses Implementation

Implementation With Memoization


1. Cost of painting the houses (rows: red, green, blue respectively; columns: houses)
N = 6
houses = [[7, 6, 7, 8, 9, 20],
          [3, 8, 9, 22, 12, 8],
          [16, 10, 4, 2, 5, 7]]

2. Creating a matrix to store the min. cost of painting the house red, green, and blue
respectively.
a. Note: The cost array is used to generalize the problem to k colors.
cost = [[0] * N,
[0] * N,
[0] * N]

● Note: cost[0][i] is the min. cost of painting houses 0..i with house i red (i.e. R(i+1))
● Note: cost[1][i] is the min. cost of painting houses 0..i with house i green (i.e. G(i+1))
● Note: cost[2][i] is the min. cost of painting houses 0..i with house i blue (i.e. B(i+1))
● Note: It is + 1 because in the picture the houses are indexed starting from 1, whereas
the code indexes them starting from 0.
● Key: The first index is the color, and the second index is the house.

3. Cost of the first house being red, green, blue, respectively.


cost[0][0] = houses[0][0] # Note: This is the cost of painting the first house red.
cost[1][0] = houses[1][0] # Note: This is the cost of painting the first house green.
cost[2][0] = houses[2][0] # Note: This is the cost of painting the first house blue.

● Note: We don't have to calculate the min of the left because there is no left.

4. Computing the sub-problems


● R(k) = cost(k, "red") + // the cost of painting the house k red.
○ min(G(k-1), B(k-1)) // the min. cost of painting the previous house green or blue.

● Note: It has to be green or blue because the houses beside each other cannot be the
same color.
● Note: Either paint k-1 green or blue, whichever is cheaper.
for i in range(1, N):
    # The min. cost to paint houses 0..i, with house i being painted red, green, or
    # blue respectively.
    cost[0][i] = houses[0][i] + min(cost[1][i-1], cost[2][i-1])
    cost[1][i] = houses[1][i] + min(cost[0][i-1], cost[2][i-1])
    cost[2][i] = houses[2][i] + min(cost[0][i-1], cost[1][i-1])

5. Using the relationship of the min. cost to solve the larger problem
print(min(cost[0][5], cost[1][5], cost[2][5]))

● Note: This is min. cost of painting the first 6 houses.


● Why? These are the first 6 houses because the last index is 5 and the indices start from 0.
● Note: We still need an efficient way to recover the coloring itself (i.e. which color
each house gets in order to reach the min. cost).

6. Finding the ordering of the houses to get the min. cost


cols = [0] * N # N is the number of houses; cols[i] will be the color of house i.

# Note: Paint the last house such that the total cost is minimized.
if cost[0][N-1] <= min(cost[1][N-1], cost[2][N-1]):
    cols[N-1] = 0 # Ending in red is at least as cheap as ending in green or blue.
elif cost[1][N-1] <= min(cost[0][N-1], cost[2][N-1]):
    cols[N-1] = 1 # Ending in green is at least as cheap as ending in red or blue.
else:
    cols[N-1] = 2 # Otherwise, ending in blue is the cheapest.

# Note: Choose the colors of the rest of the houses, going from right to left.
for i in range(N-2, -1, -1):  # Note: This is from N-2 down to 0.
    cur_min = float("inf")    # Note: Larger than any possible cost.
    cur_min_col = -1
    for col in [0, 1, 2]:     # Note: These are the colors.
        if col == cols[i+1]:  # Adjacent houses cannot have the same color, so skip
            continue          # the color already chosen for house i+1.
        if cost[col][i] < cur_min: # Found a cheaper way to paint houses 0..i.
            cur_min = cost[col][i]
            cur_min_col = col
    cols[i] = cur_min_col # Note: This is the color of house i in an optimal solution.

Implementation Using Recursion Version


def paint_house_cost(houses, col, i):
    '''Return the minimum total cost of painting houses
    0, 1, 2, ..., i, with the i-th house painted col.'''
    # Same as cost[col][i]

    # 1. Base Case: If i is 0, then return the cost of painting house 0 in color col.
    # Same as step 3 above.
    if i == 0:
        return houses[col][i]

    # 2. Upper bound on any possible cost: the sum of every element in the list of
    # lists houses (the same as adding up all entries with two nested loops).
    # It is an upper bound because it pays for painting every house in all of R, G, and
    # B, whereas a real solution pays for only one color per house.
    cur_min = sum(sum(costs) for costs in houses)

    # 3. Find the cheapest way to paint houses 0..i-1 such that house i-1 is not painted col.
    for color in [0, 1, 2]:
        if color == col: # Adjacent houses cannot have the same color.
            continue
        # Min. cost of painting houses 0..i-1 with house i-1 painted color.
        cost_color_i = paint_house_cost(houses, color, i-1)
        if cost_color_i < cur_min: # Keep the cheapest option seen so far.
            cur_min = cost_color_i

    # 4. Cost of painting house i in col plus the min. cost of painting the previous houses.
    return houses[col][i] + cur_min

L24: Coin Change Problem

Overview


Given a set of coin denominations (e.g., [1, 5, 10, 25, 100, 200] for Canadian currency*), and
an amount of money, find the way to represent the amount using the least number of coins

Problem

Given 𝑛 coin denominations {𝑑1, 𝑑2, … , 𝑑𝑛} and a target value 𝑉, find the least number of
coins needed to make change for 𝑉.

Subproblems

𝑂𝑃𝑇(𝑣): the least number of coins needed to make change for 𝑣.


● Build from the bottom up: OPT(0), OPT(1), ..., OPT(V).
● To make v using denomination 𝑑𝑖, use 𝑂𝑃𝑇(𝑣 − 𝑑𝑖) + 1 coins.
○ Try every possible 𝑑𝑖 ≤ 𝑣.

Dynamic Programming Recurrence
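
The recurrence itself did not survive in these notes. Based on the subproblem definition above, it is presumably OPT(0) = 0 and OPT(v) = 1 + min over all 𝑑𝑖 ≤ 𝑣 of OPT(v − 𝑑𝑖). A minimal bottom-up sketch under that assumption (the function name min_coins is made up for this example):

```python
def min_coins(denoms, V):
    """Least number of coins from denoms needed to make V (inf if impossible)."""
    INF = float("inf")
    opt = [0] + [INF] * V            # opt[0] = 0: no coins are needed to make 0
    for v in range(1, V + 1):
        for d in denoms:
            # Using one coin of value d leaves v - d to be made optimally.
            if d <= v and opt[v - d] + 1 < opt[v]:
                opt[v] = opt[v - d] + 1
    return opt[V]

canadian = [1, 5, 10, 25, 100, 200]
print(min_coins(canadian, 30))   # 2 (25 + 5)
print(min_coins([1, 3, 4], 6))   # 2 (3 + 3)
```

The second call shows why every denomination is tried: always taking the largest coin first would use 4 + 1 + 1, i.e. three coins.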

Introduction to Graphs

L26: Graphs

26.0 Introduction to Graphs

Overview

A graph 𝐺 = (𝑉, 𝐸) consists of a set of vertices (nodes) V and a set of edges E.


● Motivation: There are already standard algorithms for dealing with these types of
problems.
○ So if we can take our problem and reformulate it as a graph problem, we
automatically get the solution.

Types of Graphs

1. Traditional Graph

2. Directed Graphs (“digraphs”)


a. Edges have directions associated with them.
b. 𝑒1 = (𝑣2, 𝑣1)
i. Predecessor (i.e. source): First element.
ii. Successor (i.e. target): Second element.

3. Weighted graphs
a. There is a weight associated with each edge.

Terminology

● Adjacent: Vertex 𝑣1 is adjacent to vertex 𝑣2 if an edge connects 𝑣1 and 𝑣2
○ There exists an edge 𝑒 = (𝑣1, 𝑣2) ∈ 𝐸.
● Path: A path is a sequence of vertices in which each vertex is adjacent to the next one
○ 𝑝 = (𝑣1, ..., 𝑣𝑛) s.t. (𝑣𝑖, 𝑣𝑖+1) ∈ 𝐸
○ The length of the path is the number of edges in it, NOT the number of
vertices.
● Cycle: A cycle is a path (𝑣1, ..., 𝑣𝑛) s.t. (𝑣𝑖, 𝑣𝑖+1) ∈ 𝐸 and (𝑣𝑛, 𝑣1) ∈ 𝐸.
● Acyclic Graph: A graph with no cycles is an acyclic graph
● Directed Acyclic Graph: A DAG is a directed acyclic graph
● Simple Path: A simple path is a path with no repetition of vertices
● Simple Cycle: A simple cycle is a cycle with no repetition of vertices

● Connected: Two vertices are connected if there is a path between them
● Connected Component: A subset of vertices is a connected component of G if each
pair of vertices in the subset are connected.
● Degree: The degree of vertex v is the number of edges associated with v
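
As an illustration of the terms "connected" and "degree", the sketch below checks whether two vertices are connected using a breadth-first search. The function name connected, the dictionary-of-lists representation, and the example graph are assumptions made for this example.

```python
from collections import deque

def connected(adj, start, goal):
    """Return True if there is a path from start to goal in the graph adj."""
    visited = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()          # visit vertices in FIFO order
        if v == goal:
            return True
        for w in adj[v]:             # every vertex adjacent to v
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return False

# Undirected graph with two connected components: {0, 1, 2} and {3, 4}.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(connected(adj, 0, 2))  # True
print(connected(adj, 0, 4))  # False
print(len(adj[1]))           # 2, the degree of vertex 1
```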

L27-28 Implementation of Graphs

General Ways of Implementation of Graphs

How to implement the graph ADT?

1. Adjacency Matrix
a. An 𝑛 × 𝑛 matrix where 𝑀[𝑖][𝑗] = 1 if there is an edge between 𝑣𝑖 and 𝑣𝑗, and 0
otherwise.

● Note: The matrix is symmetric for undirected graphs

2. Adjacency List

a. For 𝑛 = |𝑉| vertices, n linked lists. The i-th linked list, 𝐿[𝑖] is a list of all the
vertices that are adjacent to vertex i.

● First Node: 0 is connected to both 2 and 5, not that 2 is connected to 5.

Complexities and Space Requirements

Complexity of Operations

1. Is there an edge between 𝑣𝑖 and 𝑣𝑗?


a. Adjacency Matrix: 𝑂(1)
i. Note: Looking up 𝑀[𝑖][𝑗] only requires computing the entry's location in memory from i and j.
b. Adjacency List: 𝑂(𝑑)
i. d (the maximum degree in the graph)
ii. Note: We have to scan the list of 𝑣𝑖, which holds at most d entries (i.e. the maximum number of
connections).
2. Find all vertices adjacent to 𝑣𝑖:
a. Adjacency Matrix: 𝑂(|𝑉|)
i. |V|: the number of vertices in the graph
ii. Note: Need to scan the whole row of 𝑣𝑖 (e.g. for vertex 0: is 0 connected to 0, is 0 connected
to 1, ..., is 0 connected to n − 1).
b. Adjacency List: 𝑂(𝑑)
i. d is the largest number of connections of any vertex (the maximum degree).
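The costs above can be read directly off a small sketch of the two queries. The function names and the example graph are assumptions made for illustration only.

```python
def has_edge_matrix(matrix, i, j):
    return matrix[i][j] == 1            # O(1): a single lookup


def has_edge_list(adj_list, i, j):
    return j in adj_list[i]             # O(d): scans the neighbours of i


def neighbours_matrix(matrix, i):
    # O(|V|): every entry in row i has to be inspected
    return [j for j, bit in enumerate(matrix[i]) if bit == 1]


def neighbours_list(adj_list, i):
    return list(adj_list[i])            # O(d): the list already holds the answer


# Hypothetical graph: a triangle 0-1-2 plus an isolated vertex 3
matrix = [[0, 1, 1, 0],
          [1, 0, 1, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0]]
adj_list = [[1, 2], [0, 2], [0, 1], []]

print(has_edge_matrix(matrix, 0, 2), has_edge_list(adj_list, 0, 3))
print(neighbours_matrix(matrix, 1), neighbours_list(adj_list, 1))
```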

Space Requirements (How much space in memory)


1. Adjacency Matrix: 𝑂(|𝑉|²)
a. Need to store |𝑉|² matrix entries.
2. Adjacency list: 𝑂(|𝑉| + |𝐸|)
a. Need to store |𝑉| linked lists. Collectively, the linked lists contain |𝐸| entries (2|𝐸| if each undirected edge is stored in both lists), so
the space requirement is 𝑎1|𝑉| + 𝑎2|𝐸|, which is 𝑂(|𝑉| + |𝐸|).
i. Note: 𝑎1, 𝑎2 are constants.

Example of Graph

Adjacency List

class Node:
    def __init__(self, data):
        self.data = data
        self.adjacent = []

node1 = Node('TO')
node2 = Node('Ottawa')
node3 = Node('Orlando')

node1.adjacent.append(node2)
node1.adjacent.append(node3)
node2.adjacent.append(node1)
node3.adjacent.append(node1)

● Note: If there are N edges, then the sum of the lengths of all the adjacency lists is 2N, since each edge appears in the lists of both of its endpoints.

Adjacency Matrix

For an undirected graph, the adjacency matrix is symmetric.

● Note: The entry 𝐴𝑖𝑗 = 1 if there is an edge starting at node i and ending at node j.
● Convention: ith row and jth column of the matrix A

Graph Implementation using Adjacency Matrix


class Graph:
    def __init__(self, capacity):
        self.capacity = capacity     # The maximum number of nodes
        self.cur_num_nodes = 0       # The current number of nodes
        self.nodes = []              # The list of nodes
        self.indices = {}            # A dictionary mapping node names to indices
        self.adj_array = []          # The adjacency array

        # Initialize the adjacency array
        for i in range(self.capacity):
            self.adj_array.append([None] * self.capacity)

    def expand(self):
        '''
        Expand the graph by doubling the capacity to accommodate new nodes.
        '''
        adj_array_new = []
        # max(1, ...) so that a graph created with capacity 0 can still grow
        self.capacity = max(1, 2 * self.capacity)

        # Create a new adjacency array by initializing with None.
        for i in range(self.capacity):
            adj_array_new.append([None] * self.capacity)

        # Copy the old adjacency array into the new one, entry by entry.
        for i in range(self.cur_num_nodes):
            for j in range(self.cur_num_nodes):
                adj_array_new[i][j] = self.adj_array[i][j]

        self.adj_array = adj_array_new

    def register_node(self, name):
        '''
        Register a new node in the graph.
        '''
        # If the graph is full, expand it by calling the helper function.
        if self.capacity == self.cur_num_nodes:
            self.expand()

        # Add the node to the graph by adding the element to the end of the list.
        self.nodes.append(name)

        # Record the node's index in the dictionary, where the key is the node name and
        # the value is self.cur_num_nodes (the next free row/column).
        self.indices[name] = self.cur_num_nodes

        # Increment the number of nodes
        self.cur_num_nodes += 1

        # Initialize the new row and column with 0, indicating no connections yet.
        for i in range(self.cur_num_nodes):
            self.adj_array[i][self.cur_num_nodes-1] = 0
            self.adj_array[self.cur_num_nodes-1][i] = 0

    def connect_by_name(self, name1, name2):
        '''
        Connect two nodes in the graph by name.
        '''
        # If either node is not in the graph, add it by calling the helper function above.
        if name1 not in self.indices:
            self.register_node(name1)
        if name2 not in self.indices:
            self.register_node(name2)

        # Connect the nodes by updating the adjacency matrix.
        self.connect_by_index(self.indices[name1], self.indices[name2])

    def connect_by_index(self, index1, index2):
        '''
        Connect two nodes in the graph by index.
        '''
        # Set the entry to 1 (i.e. indicating a connection from index1 to index2).
        # Note: only this one direction is recorded; for an undirected graph,
        # adj_array[index2][index1] would also have to be set.
        self.adj_array[index1][index2] = 1

L29: Graph Traversal

Difference between search and traversal


● Search is a process of finding a particular node in a graph.
● Traversal is a process of visiting all the nodes in a graph.

Graph Traversal

Overview

Want to visit (e.g. in order to print) each vertex exactly once.


● Note: We don’t care about the order, as long as we print everything.

Breadth-First Traversal

Depth First Traversal

Recursive Depth-First Traversal

Breadth First Search (First in First out)
● Queue: First in, first out (FIFO).
● Intuition:
1. Put the starting node in the queue.
2. Remove the node at the front of the queue. If it has not been visited yet, print it and
store it in the visited set.
3. Since this is breadth first, append that node's immediate neighbors to the back of the queue.
4. Since a queue is a FIFO structure, every neighbor of the starting node is processed
before any neighbor's neighbor is.
5. Repeat steps 2-4 until the queue is empty.
● Analogy to Chess:
○ One way I can approach the game is to look at all my options and for each
option, look at all my opponent's options, and so forth.
○ In a sense, my "looking-ahead" depth slowly increases.
This is one implementation of BFS using lists.
class Node:
    def __init__(self, data):
        self.data = data
        self.neighbours = []

class Graph:
    def __init__(self):
        self.nodes = []  # One representation

def bfs(graph):
    visited = set()  # Average insertion/lookup time is O(1)

    # Restart from every node so that disconnected parts of the graph are covered too
    for starting_node in graph.nodes:
        queue = [starting_node]

        while len(queue) > 0:
            node = queue.pop(0)  # Remove from the front (FIFO)
            if node not in visited:
                print(node.data)
                visited.add(node)
                queue.extend(node.neighbours)  # Add to the back

    return visited

Here is another implementation with linked lists


def bfs(graph):
    # 1. Set up a set where you store nodes that were already visited (and printed)
    visited = set()

    # 2. Iterate through all the nodes in the graph
    for starting_node in graph.nodes.keys():
        # Set up a queue (i.e., a list of nodes where you append at the end and
        # remove nodes from the front)
        queue = [starting_node]  # Add the starting node to the queue

        # 3. Implement the algorithm: remove a node from the queue, add its unvisited
        # neighbours to the queue, and print the node
        while len(queue) > 0:    # While the queue is not empty
            node = queue.pop(0)  # Remove the first element from the queue
            # (remove from the front, and add to the back)

            # 4. If the node is not visited, then print the node, add it to the
            # visited set, and add its neighbours to the queue
            if node not in visited:
                print(node)
                visited.add(node)

                # Walk the linked list of neighbours
                cur = graph.nodes[node].neighbours.head
                while cur is not None:
                    if cur.data.value not in visited:
                        queue.append(cur.data.value)
                    cur = cur.next
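
One practical note on both versions above: `list.pop(0)` shifts every remaining element, so each dequeue costs O(n). A minimal sketch using Python's `collections.deque`, whose `popleft()` is O(1), is shown below. The dictionary-based graph and the name `bfs_order` are assumptions made for this illustration, not part of the course code.

```python
from collections import deque


def bfs_order(adj, start):
    """Return the nodes reachable from start in breadth-first order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()          # O(1), unlike list.pop(0)
        order.append(node)
        for nb in adj[node]:
            if nb not in visited:       # mark on enqueue so no node is queued twice
                visited.add(nb)
                queue.append(nb)
    return order


# Hypothetical graph stored as a dict of neighbour lists
adj = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
print(bfs_order(adj, 'A'))
```

Marking a node as visited when it is enqueued, rather than when it is dequeued, keeps duplicates out of the queue.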

Depth First Search (First in Last out)
● Stack: First in, last out (FILO)
● Intuition
1. I will visit a starting_node, print out its value, and store it in the visited set.
2. Afterwards, I will check its neighbors and push them onto the stack.
3. Since this is FILO, I will pop the last neighbor that I pushed, print it, store it in the
visited set, and push its neighbors onto the stack.
4. Key: Since this is a stack, NOT A QUEUE, we will not look at the starting node's other
neighbors yet, but proceed down the newest neighbors that we pushed in step 3.
5. Then we will go down that line of neighbors, which is why it's called depth first: we
are going in deep before backing up.
● Analogy to Chess:
○ Look at one particular option, and then look at one specific response from my
opponent to that option, and so forth.
■ Each computation I make in my head makes me look one step further (I
won't consider the other possible moves I can make, yet!)
■ Eventually, I'll get to the end (perhaps checkmate!) and I can start working
backwards slowly. If my opponent moved somewhere else, is it still
checkmate?
○ Depth first allows me to explore a particular line in extreme depth before
moving on to consider another line while breadth first allows me to explore all
lines simultaneously but slowly.
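
The stack-based intuition above can be sketched as follows. This is an illustrative sketch only: the dictionary-based graph and the name `dfs_order` are assumptions, not the course's code.

```python
def dfs_order(adj, start):
    """Return the nodes reachable from start in depth-first order."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()              # last in, first out
        if node in visited:
            continue                    # a node may have been pushed more than once
        visited.add(node)
        order.append(node)
        for nb in adj[node]:
            if nb not in visited:
                stack.append(nb)
    return order


# Hypothetical graph stored as a dict of neighbour lists
adj = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(dfs_order(adj, 'A'))
```

The only structural difference from a list-based BFS is `pop()` from the end of the list instead of removing from the front.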

L34: Recursive DFS & Race to 21 (Example of Graph Search)

34.0 Recursive DFS

● Process:
1. For the recursive process, we call dfs() on the starting node, and add it to the visited set.
2. Call dfs() on the first unvisited neighbor (which neighbor is called first is arbitrary).
3. Repeat step 2 on that neighbor's neighbors, and so on. As a result, we will get to a
point at which all of the neighbors along that one strand have been visited, and the recursion will
unwind to try the remaining neighbors of earlier nodes.
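
A minimal sketch of the recursive process, assuming a dictionary-based graph (the name `dfs_recursive` and its parameters are illustrative, not taken from the lecture):

```python
def dfs_recursive(adj, node, visited=None, order=None):
    """Return the nodes reachable from node in depth-first order."""
    if visited is None:                 # first call: create the shared bookkeeping
        visited, order = set(), []
    visited.add(node)
    order.append(node)
    for nb in adj[node]:
        if nb not in visited:
            dfs_recursive(adj, nb, visited, order)   # go deep before trying siblings
    return order


# Hypothetical graph stored as a dict of neighbour lists
adj = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(dfs_recursive(adj, 'A'))
```

Here the call stack plays the role that an explicit stack plays in an iterative version.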

34.1 Race to 21

Overview

Consider a game where two players race to 21. The game state starts at 0 and, on each turn, a player can
either add +1 or +2 to the game state. The player who reaches 21 first wins.
● Challenge: Print all possible games of this game.
● Nodes: Game states for Race to 21.
○ Example: A possible state is 𝑣 = [0, 1, 3, 4] which corresponds to an incomplete
game where the players chose + 1, + 2, + 1, in that order. The first node is [0].
● Edges: Valid moves.

○ For example, game states 𝑣1 and 𝑣2 are connected iff there is a valid move from
𝑣1 to 𝑣2.

Process

Since the neighbors have some structure to it (i.e. we can easily predict the neighbors since
there's a maximum of 2 possible states stemming from a given state), we can just modify
our code directly.

● Note: A tree is a special kind of graph, and we can represent Race to 21 as a tree.
● Key: By using DFS, we can enumerate all of the possible games (i.e. the challenge).
Each time we reach the bottom of the tree (i.e. 21), we have one complete
game; we then back up and go in-depth on the other possibilities.
● Note: We don’t have to use a graph class since we already know the structure ahead
of time.

Implementation
def enum_all_stack(prefix, score, path):
    stack = [(prefix, score)]
    while stack != []:
        prefix, score = stack.pop()
        if score >= 21:
            # The game is over: record it and do not extend it further
            path.append(prefix)
            continue
        # Otherwise, branch on the two valid moves
        stack.append((prefix + '1', score + 1))
        stack.append((prefix + '2', score + 2))
    return path

Explanation of Implementation

● Note: Stack is implemented as we are doing a DFS.


● Terms:
○ Prefix: The game played up to now
○ Score: The current score
○ Path: All the possible games/"paths" that were played
1. If the score is greater than or equal to 21, then add the prefix to the path since the
game is over and cannot be extended.
2. Else, we can either add a 1 or a 2 to the prefix, and push both new (prefix, score) pairs
onto the stack so that the loop continues the race to 21 game from each of them.
3. We can print after each iteration when a path is fully created.
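
The same enumeration can also be written recursively. The sketch below is an assumption-laden illustration: the name `enum_all_recursive` and the `target` parameter (added so the function can be tried on a small example) are not from the lecture. Like the stack version, it treats any score of at least `target` as the end of a game.

```python
def enum_all_recursive(prefix='', score=0, target=21):
    """Return every game as a string of moves, e.g. '1121...'."""
    if score >= target:                 # game over: this prefix is one complete game
        return [prefix]
    return (enum_all_recursive(prefix + '1', score + 1, target)
            + enum_all_recursive(prefix + '2', score + 2, target))


# Small hypothetical race to 4, so the output is short enough to read
games = enum_all_recursive(target=4)
print(games)
print(len(enum_all_recursive()))        # number of games in the race to 21
```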

Priority Queue

L35: Priority Queue and Heaps

Priority Queue

Overview

A priority queue is a queue where the first element dequeued is the one with the highest
priority.
● Convention: Lowest value ⇒ highest priority.

Operations

● Insert(S,x): Adds a new element with priority x to priority queue S


● min(S): Returns the element with the smallest value from the priority queue
● extract_min(S): Removes and returns the element with the smallest value from the
priority queue

Types of Implementation
● Note: Check Qilin’s notes for implementation.

1. Array, linked list (BAD)


a. Insert: O(1)
b. Min: O(n)
c. Extract_min: O(n)
2. Sorted array/linked list (BAD)
a. Insert: O(n)
b. Min: O(1)

c. Extract_min: O(1)
3. Heaps (GOOD)
a. Insert: O(log(n))
b. Min: O(1)
c. Extract min: O(log(n))
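
For experimenting with these operations, Python's standard `heapq` module provides a binary min-heap on top of a plain list. A short sketch follows; the priorities used are arbitrary example values.

```python
import heapq

pq = []
for priority in [5, 1, 4, 2]:
    heapq.heappush(pq, priority)   # Insert(S, x): O(log n)

smallest = pq[0]                   # min(S): O(1), peek without removing
first = heapq.heappop(pq)          # extract_min(S): O(log n)
second = heapq.heappop(pq)
print(smallest, first, second)
```

Note that `heapq` stores the root at index 0, whereas the array layout in these notes leaves index 0 unused.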

Heaps (Efficient Implementation of Priority Queue)

Overview

A heap is stored as a complete binary tree: every level is full except possibly the last,
and every leaf on the last level is as far left as possible.

How to store and access the heap as an array?

Since the tree is complete, we will store the heap as an array as


[−, 𝑎, 𝑏, 𝑐, 𝑑, 𝑒, 𝑓, 𝑔, ℎ, 𝑖, 𝑗, 𝑘, 𝑙, 𝑚, ??, ??]
● Note: First index is left blank for convenience of the following formulas.

Given a node at index i we can get the parent and the children via the following formulas:
𝑃𝑎𝑟𝑒𝑛𝑡(𝑖) = ⌊𝑖/2⌋ | 𝐿𝑒𝑓𝑡(𝑖) = 2𝑖 | 𝑅𝑖𝑔ℎ𝑡(𝑖) = 2𝑖 + 1
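
As a quick check of the index arithmetic, here is a minimal sketch with 1-based indexing. The helper names and the seven-element example array are assumptions made for illustration.

```python
def parent(i):
    return i // 2          # integer division implements the floor


def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


# Index 0 is a placeholder so that the root sits at index 1
heap = [None, 'a', 'b', 'c', 'd', 'e', 'f', 'g']
i = 2                      # the node holding 'b'
print(heap[parent(i)], heap[left(i)], heap[right(i)])
```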

Two Properties

1. Complete Tree
a. A tree where every level is full except possibly the last, and each leaf on the last level is as far
left as possible.

2. Heap Order Property (Min-Heap)
a. Each node is smaller than (or equal to) its children, so the minimum element is at the
root.

Insert

● When we add a new element, we first add it to the leftmost open position in the bottom
row (i.e. the end of the array).

Figure: This might break the heap order property as shown above, but to fix this we
percolate.

Percolate Up: Move the node upwards, exchanging it with the parent if it is smaller than the
parent, until it reaches the root or its parent is no larger.
● Difference (from percolating down): Only one comparison needs to be made per level, with
the parent. When percolating down, we instead need to compare the node with the left child and
the right child.

Algorithm:

● n (size of the heap)


● pq (array storing the heap)
● k (current node)
● k/2 (parent node)
● loop (applying percolation)
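
Using the names listed above (`pq` for the array, `k` for the current node, `k // 2` for its parent), insertion with percolate up might be sketched as follows. This is a sketch under the assumption that `pq[0]` is an unused placeholder; the function name `heap_insert` is illustrative.

```python
def heap_insert(pq, x):
    """Insert x into the min-heap stored in pq[1:] (pq[0] is unused)."""
    pq.append(x)                        # leftmost open position in the bottom row
    k = len(pq) - 1                     # index of the new element
    # Percolate up: swap with the parent while the parent is larger
    while k > 1 and pq[k] < pq[k // 2]:
        pq[k], pq[k // 2] = pq[k // 2], pq[k]
        k //= 2


pq = [None]                             # empty heap with the placeholder slot
for x in [5, 3, 8, 1]:
    heap_insert(pq, x)
print(pq)
```

The loop runs at most once per level of the tree, which is where the O(log(n)) insert cost comes from.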

Extract Min

● The minimum element is the first element (i.e. at index 1), however, we want to
remove it while maintaining heap-order property.

Trick:
1. Replace the minimum element (save it first) with the element at the end of the array
2. Percolate down the element now at index 1 down the array until the heap-order
property is satisfied.
a. Difference: It is possible for the current node we're percolating down to be
bigger than both children. In that case, we should swap it with the smaller
child, since the smaller child can be a parent of the larger one.

● Mistake: k = 1
● 𝑗 < 𝑛 𝑎𝑛𝑑 𝑝𝑞[𝑗] > 𝑝𝑞[𝑗 + 1] (ensures that the current node has two children and if it
does, selects the index of the smaller one)
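
The trick and the child-selection condition above can be sketched as follows, again assuming a 1-based array with an unused `pq[0]`. The function name `extract_min` and the example heap are illustrative assumptions.

```python
def extract_min(pq):
    """Remove and return the smallest element of the min-heap in pq[1:]."""
    n = len(pq) - 1                     # number of elements in the heap
    if n == 0:
        raise IndexError("extract_min from an empty heap")
    smallest = pq[1]                    # save the minimum first
    pq[1] = pq[n]                       # move the last element to the root
    pq.pop()
    n -= 1
    k = 1
    while 2 * k <= n:                   # while k has at least one child
        j = 2 * k
        if j < n and pq[j] > pq[j + 1]:
            j += 1                      # two children: pick the smaller one
        if pq[k] <= pq[j]:
            break                       # heap order restored
        pq[k], pq[j] = pq[j], pq[k]
        k = j
    return smallest


pq = [None, 1, 3, 8, 5]                 # a valid min-heap, index 0 unused
print(extract_min(pq), pq)
```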

Shortest Path in Graphs

L36: Shortest Path (Dijkstra’s Algorithm)

Overview of Shortest Paths

Problem: What is the shortest path between 𝑣𝑠 and 𝑣𝑑?

Given a weighted connected graph 𝐺 = (𝑉, 𝐸) (with positive weights), and a pair of vertices
𝑣𝑠, 𝑣𝑑 ∈ 𝑉.

Find the path with the smallest sum of edge weights (i.e. optimal path)

● Note: The numbers in the nodes do not matter for the weightings.

Approach

Break the problem into shorter sub-paths through an intermediate node:

E.g.:

Shortest Path (SP) from A to H: SP from A to E + SP from E to H.

General (minimum taken over all intermediate nodes 𝑣𝑖):
𝑆𝑃 𝑓𝑟𝑜𝑚 𝐴 𝑡𝑜 𝐻 = 𝑀𝐼𝑁(𝑆𝑃 𝑓𝑟𝑜𝑚 𝐴 𝑡𝑜 𝑣𝑖 + 𝑆𝑃 𝑓𝑟𝑜𝑚 𝑣𝑖 𝑡𝑜 𝐻)

Simplest Dijkstra’s Algorithm (Approach to solving the shortest path)

Intuition

Idea: Repeatedly explore the closest unexplored node. When a node is explored, the shortest
known distance to each of its unexplored neighbors is updated.
● Note: Initially, the distance from the starting node to every other node
is infinite:

Dijkstra’s Algorithm Steps Visualization (Read left to right, top to bottom)

● Blue node (node we are currently exploring).


● Green nodes (we have already explored).

Pseudo Code

● V\S (set of all nodes in V that are not in S)


● |(𝑢, 𝑣)| (weight of the edge u,v)
● d(u) (shortest known distance from the source to the vertex u)
● Dijkstra Algorithm Example
○ Note: This shows the algorithm.

● Algorithm: S is the visited set: for every node in S, the
shortest path from the source to that node is known.
○ At each step, we pick an edge (u,v) from a node u in the visited set (in S) to a new node v (outside of S)
such that d(u) + |(u,v)| is as small as possible, and add v to S.
■ Note: We are expanding the visited set by one node at a time.

Proof of Dijkstra’s Algorithm

Prove:
(1) Assume that for every node in the set of visited nodes S, the recorded distance is the actual shortest
length from the source.
(2) Then when we add v, we assign the distance 𝑑(𝑣) = 𝑚𝑖𝑛{𝑑(𝑤) + |(𝑤, 𝑣)|: 𝑤 ∈ 𝑆}, and
this gives the length of the shortest path from the source to v.

● Plain English: When a node has been marked as visited (i.e. added to the set S), we
do not revisit it. This means that we will never find a shorter path to a node that has
already been visited.

Intuitive Proof:
The area shaded in gray are the nodes that have been visited, and suppose the next node
we are adding is v.

Now there are two possible paths to get to v:


● Case 1: The first path goes through S for ALL edges except the last edge:
𝑠→𝑢→𝑣
● Case 2: The second path goes through S for some edges, and then goes through at
least two edges outside of S:
𝑠→𝑥→𝑦→𝑣

● If the shortest path is case 1, then we’re done.
● If the shortest path is case 2, then, because all weights are positive, y is strictly closer to the
source than v. The algorithm wouldn’t choose to add v to S as the
next step, but would add y to S instead (contradicts the intro of the proof).

Therefore, choosing the next node to add to S by computing the minimum of


𝑑(𝑣) = 𝑚𝑖𝑛{𝑑(𝑤) + |(𝑤, 𝑣)|: 𝑤 ∈ 𝑆}
will always give us the optimal path.

Implementation
import numpy as np

class Node:
    def __init__(self, value):
        self.value = value
        self.connections = []
        self.distance_from_start = np.inf

class Con:
    def __init__(self, node, weight):
        self.node = node
        self.weight = weight

def dijkstra(start, end):
    '''Return the distance from node start to node end'''

    # 1. Initialize the distance from start to start as 0
    start.distance_from_start = 0

    # 2. Initialize the set of visited nodes with start
    visited = set([start])

    # 3. Initialize the current node as start
    current = start

    # 4. While the current node is not the end node
    while current != end:

        # 5. Find the node v not in visited such that d(source, u) + |u, v| is minimized
        cur_dist = np.inf
        cur_v = None

        # Try stepping one step away from every visited node.
        # Keep track of d(source, u) + |u, v| for each u in visited and v not in visited.
        # Visit the cheapest v.

        # 6. For each node in visited
        for node in visited:

            # 7. For each connection from the node
            for con in node.connections:

                # 8. If the connected node is already in visited, then skip it
                if con.node in visited:
                    continue

                # 9. If reaching con.node through node is cheaper than the best candidate
                # found so far, then update the current distance and the current node
                if cur_dist > node.distance_from_start + con.weight:
                    cur_dist = node.distance_from_start + con.weight
                    cur_v = con.node

        # If no unvisited node is reachable, then end cannot be reached
        if cur_v is None:
            return np.inf

        # 10. After the loops are done, add cur_v to visited; it is the unvisited node
        # with the smallest distance from start
        current = cur_v
        current.distance_from_start = cur_dist
        visited.add(current)

    return current.distance_from_start

Complexity

Add one vertex to S and then search through all possible additional vertices. For each
vertex, we're looking at all the other vertices, so this time complexity is
𝑂(|𝑉|²)

Priority Queue Dijkstra’s Algorithm

Intuition

● When visiting u, add all of u's neighbors vi to the priority queue, with priority
d(source, u) + |u, vi|.
● At each iteration, pop the node with the smallest priority from the priority queue to
choose the next node to visit.
● We are choosing the node such that dist(source, u) + |u, v| is minimized, with u in
visited and v not in visited.

Pseudo Code

● Steps:
○ Stopping Condition: The code runs until the priority queue becomes empty.
○ Iteration Step: At each iteration, we pop the node with the smallest distance
from the priority queue.
■ If this node has already been explored, we continue.
■ Otherwise, we add it to the set of explored nodes and add all of its
neighbors to the priority queue, alongside their respective distances from
the source

Complexity

Traverse the nodes in order of their distance from the origin, using the priority queue to keep
them sorted, which will give a time complexity of
𝑂(|𝐸|𝑙𝑜𝑔|𝑉|)
● Why? Inserting and deleting elements in a priority queue has a complexity of 𝑂(𝑙𝑜𝑔|𝑉|).
○ Since we perform these operations for each edge, the total is 𝑂(|𝐸|𝑙𝑜𝑔|𝑉|).
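
The steps above can be sketched with Python's `heapq` module. This is a minimal sketch, not the lecture's pseudocode: the function name `dijkstra_pq`, the dictionary-based graph format and the example weights are assumptions made for illustration.

```python
import heapq


def dijkstra_pq(adj, source):
    """Return a dict of shortest distances from source to each reachable node.

    adj maps each node to a list of (neighbour, weight) pairs with positive weights.
    """
    dist = {}                           # explored nodes and their final distances
    pq = [(0, source)]                  # entries are (distance from source, node)
    while pq:                           # stop when the priority queue is empty
        d, u = heapq.heappop(pq)        # node with the smallest distance
        if u in dist:
            continue                    # already explored via a shorter path
        dist[u] = d
        for v, w in adj[u]:
            if v not in dist:
                heapq.heappush(pq, (d + w, v))
    return dist


# Hypothetical weighted, undirected graph
adj = {
    'A': [('B', 1), ('C', 4)],
    'B': [('A', 1), ('C', 2), ('D', 5)],
    'C': [('A', 4), ('B', 2), ('D', 1)],
    'D': [('B', 5), ('C', 1)],
}
print(dijkstra_pq(adj, 'A'))
```

A node can sit in the queue several times with different distances; only the first (smallest) pop counts, which is why the explored check is needed.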

L37: Shortest Path II (Greedy Best-First Search & A* Algorithm)

Greedy Best-First Search

Overview

Goal: Find the shortest path from a source node to a destination node.

Intuition

● The idea is to introduce a heuristic function that is an estimate for how far the node
is from the destination.
● This is not guaranteed to find the shortest path but will work well if the heuristic is
good.

Pseudo Code

● h(node) (estimate for how far the node is from the destination)
● Note: Not guaranteed to find the shortest path.
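
A minimal sketch of the idea, assuming the graph is a dictionary of (neighbour, weight) lists and the heuristic `h` is a dictionary of estimates. All names and numbers here are made up for illustration. The example is deliberately chosen so that a poor heuristic leads to a path that is not the shortest.

```python
import heapq


def greedy_best_first(adj, h, source, dest):
    """Return a path from source to dest, always expanding the node with smallest h."""
    pq = [(h[source], source)]          # priority is the heuristic only
    came_from = {source: None}          # also serves as the set of seen nodes
    while pq:
        _, node = heapq.heappop(pq)
        if node == dest:
            path = []
            while node is not None:     # walk back to the source
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for nb, _weight in adj[node]:   # edge weights are ignored by this algorithm
            if nb not in came_from:
                came_from[nb] = node
                heapq.heappush(pq, (h[nb], nb))
    return None                         # dest is unreachable


# Hypothetical graph: S-B-G costs 2, S-A-G costs 20
adj = {'S': [('A', 10), ('B', 1)], 'A': [('G', 10)], 'B': [('G', 1)], 'G': []}
h = {'S': 2, 'A': 1, 'B': 1.5, 'G': 0}  # misleading estimate: A looks closer than B
print(greedy_best_first(adj, h, 'S', 'G'))
```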

Examples

1. When Greedy Best-First Search will be efficient:

2. When Greedy Best-First Search will be inefficient:
If we introduce a wall, then the heuristic function will be misleading and we'll get the following
path:

● Process: It will find out that it's a dead end and it can't backtrack (because you can't
revisit nodes), so it starts going to the left.
○ Eventually, the cell to explore that minimizes the heuristic will be the cell to the
left of where we first hit the bottom wall.

A* Algorithm

Overview

The A* Algorithm takes the best of both worlds. It combines the idea of Greedy Best-First
Search with Dijkstra's algorithm.

Pseudo Code

● Note: It is extremely similar to the optimized version of Dijkstra's, except the priority
queue is sorted by the sum of the heuristic function and the actual distance.
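
A minimal sketch of that idea, assuming a dictionary-based graph and a heuristic dictionary `h` (names and numbers are illustrative). For the returned distance to be the true shortest one in this simple version, the sketch assumes that `h` never overestimates and is consistent along every edge; setting every `h` value to 0 turns it back into the priority-queue version of Dijkstra's algorithm.

```python
import heapq


def a_star(adj, h, source, dest):
    """Return the length of the path found from source to dest, or None."""
    pq = [(h[source], 0, source)]       # (distance so far + heuristic, distance so far, node)
    explored = set()
    while pq:
        _, d, u = heapq.heappop(pq)
        if u == dest:
            return d
        if u in explored:
            continue
        explored.add(u)
        for v, w in adj[u]:
            if v not in explored:
                heapq.heappush(pq, (d + w + h[v], d + w, v))
    return None                         # dest is unreachable


# Hypothetical graph: S-B-G costs 2, S-A-G costs 20
adj = {'S': [('A', 10), ('B', 1)], 'A': [('G', 10)], 'B': [('G', 1)], 'G': []}
h = {'S': 2, 'A': 10, 'B': 1, 'G': 0}   # assumed estimates of the distance to G
print(a_star(adj, h, 'S', 'G'))
```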

Example

Binary Search Trees & Hash Tables

L38: Implementing Sets Using Binary Search Trees & Hash Tables

Set ADT using Binary Search Trees

Overview

● Insert into the set


● Delete from the set
● Look up (is the element in the set?)

Binary Search Trees

The left descendants are smaller than the parent, the right descendants are larger than the
parent.

Intuition for lookup (is the element in the set?):

Looking for 4:
○ 4 < 8, so look in the left subtree of 8
○ 4 > 3, so look in the right subtree of 3
○ 4 < 6, so look in the left subtree of 6
○ 4 == 4, done

Complexity:

Is the element in the set:


1. Tree is balanced
𝑂(ℎ) = 𝑂(𝑙𝑜𝑔(𝑛)) steps if the tree is complete.
2. Tree is unbalanced
𝑂(𝑛), where the BST is just a linked list, and the height is the same as the number of nodes.

Implementation of Binary Search Tree
class Node:
    def __init__(self, key):
        self.left = None   # left child
        self.right = None  # right child
        self.val = key     # self.val is the value of the node, set to key.

def make_tree():
    root = Node(3)
    root.left = Node(1)    # must be 1 (not 2) to match the tree drawn below
    root.right = Node(5)
    root.left.left = Node(0)
    root.left.right = Node(2)
    #       3
    #      / \
    #     1   5
    #    / \
    #   0   2
    return root

Two Implementations of Search

1. Recursive (compact)
def in_tree(root, elem):
    if root is None:
        return False
    if root.val == elem:
        return True
    if root.val < elem:
        return in_tree(root.right, elem)
    else:
        return in_tree(root.left, elem)
2. Recursive (with step-by-step comments)
def search(root, key):
    # Return True iff key is in the tree rooted at root.
    # Complexity: O(logn) if the tree is balanced, O(n) if the tree is unbalanced.
    # Key is the value we are searching for in the tree.

    # 1. Base case: If the tree is empty, then the key is not in the tree
    if root is None:
        return False

    # 2. Base case: If the key is the root value, then we know the key is in the tree.
    if root.val == key:
        return True

    # 3. Recursive case: If the key is less than the root value, then it can only be
    # in the left subtree
    if key < root.val:
        return search(root.left, key)

    # 4. Recursive case: If the key is greater than the root value, then it can only be
    # in the right subtree
    else:
        return search(root.right, key)
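
For comparison, a loop-based search and the matching insert operation might look like the sketch below. The names `insert` and `search_iterative` are assumptions made for this illustration; the `Node` class is repeated so that the block runs on its own.

```python
class Node:
    def __init__(self, key):
        self.left = None
        self.right = None
        self.val = key


def insert(root, key):
    """Insert key into the BST and return the (possibly new) root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key == cur.val:
            return root                 # a set holds no duplicates
        if key < cur.val:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def search_iterative(root, key):
    """Return True iff key is in the BST rooted at root, without recursion."""
    cur = root
    while cur is not None:
        if key == cur.val:
            return True
        cur = cur.left if key < cur.val else cur.right
    return False


# Builds a tree in which looking for 4 follows the path 8 -> 3 -> 6 -> 4
root = None
for key in [8, 3, 10, 1, 6, 4]:
    root = insert(root, key)
print(search_iterative(root, 4), search_iterative(root, 7))
```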

Map ADT using Hashing

Overview

Different Styles of Implementation

1. Dictionary
2. Array

3. Hash tables

Hash Tables

1. Hash Function Intuition (Convert string to an integer corresponding to the table cell number)
a. Convert the string into a list of ASCII values
b. Multiply each value by 10 raised to the power of its position in the list
c. Add all the values together.
d. Take the result mod n to get the index in the hash table.
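
Steps a to d can be sketched as follows. The function name `simple_hash` is illustrative, and the sketch assumes that positions are counted from 0, which the notes do not specify.

```python
def simple_hash(s, n):
    """Map the string s to a table index in range(n)."""
    codes = [ord(ch) for ch in s]                       # a. ASCII values
    total = sum(code * 10 ** pos                        # b. multiply by 10^position
                for pos, code in enumerate(codes))      # c. add everything together
    return total % n                                    # d. index into a table of size n


# 'a' is 97 and 'b' is 98, so the sum is 97 * 1 + 98 * 10 = 1077
print(simple_hash("ab", 13))
```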
