(MIT 6.1800) Spring 2025 Notes
A client/server model helps us enforce modularity where the two modules reside
on different machines and communicate with RPCs.
A remote procedure call is different from a procedure call because the procedure caller
is on a different machine. This introduces the problem of network/server failures.
When designing a system, we also care about scalability, security, performance, and
fault-tolerance/reliability.
§2 Lecture 2: Naming
Names are used to allow modules to interact. They let us achieve modularity by providing
communication and organization.
§2.1 DNS
In DNS, the names are hostnames (e.g. eecs.mit.edu) and the values are IP addresses
(e.g. 18.25.0.23).
DNS is organized as a tree hierarchy. The root nameserver gives the IP of the next nameserver down, and the lookup keeps propagating down the tree until we reach the nameserver that knows the correct answer.
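As an illustration, here is a minimal Python sketch of that iterative walk over a made-up hierarchy (the nameserver names and records below are invented; a real resolver speaks the DNS wire protocol and caches results):

    # Each "nameserver" maps a name either to a final IP or to the next
    # nameserver responsible for that part of the hierarchy.
    ROOT = {"edu": "ns.edu-servers.example"}
    ZONES = {
        "ns.edu-servers.example": {"mit.edu": "ns.mit.example"},
        "ns.mit.example": {"eecs.mit.edu": "18.25.0.23"},
    }

    def resolve(hostname):
        # Start at the root and follow referrals down the hierarchy.
        server = ROOT
        while True:
            # Find the entry this server has for a suffix of the hostname.
            match = next(v for k, v in server.items() if hostname.endswith(k))
            if match in ZONES:           # referral to a lower nameserver
                server = ZONES[match]
            else:                        # reached an IP address
                return match

    print(resolve("eecs.mit.edu"))   # -> 18.25.0.23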
The memory management unit (MMU) needs to translate virtual memory ad-
dresses into physical ones. Memory (RAM) is temporary and used for quick access,
while storage is permanent and used for long-term data retention.
An OS uses page tables to virtualize memory. Mapping at page granularity saves space; it would be inefficient to keep an entry for every full virtual address.
Translation occurs by getting the virtual page number from the top 20 bits, looking
that up in a page table to get the physical page number, and then adding the offset
(bottom 12 bits).
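A small worked example of this translation, assuming 32-bit virtual addresses with 4 KB pages and a hypothetical single-entry page table:

    # Hypothetical page table: virtual page number -> physical page number.
    PAGE_TABLE = {0x12345: 0x00abc}

    def translate(vaddr):
        vpn = vaddr >> 12              # top 20 bits: virtual page number
        offset = vaddr & 0xFFF         # bottom 12 bits: offset within the page
        ppn = PAGE_TABLE[vpn]          # page-table lookup (fault if missing)
        return (ppn << 12) | offset    # physical page number + offset

    print(hex(translate(0x12345678)))  # -> 0xabc678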
If there is not enough physical memory to store all programs' pages, page table entries contain additional bits (e.g. a present/resident bit) that let the OS keep some pages on disk and bring them back in on demand.
A multilevel page table saves space compared to a flat page table because entries only need to exist for the parts of the address space that are in use, at the cost of more table lookups (and more potential exceptions) per translation.
A race condition is when two parties access shared state at the same time and one party's update ends up overwriting the other's. This happens in the basic version of send and receive below.
send(bb, message):
    while True:
        if bb.in - bb.out < N:            # there is space in the buffer
            bb.buf[bb.in mod N] <- message
            bb.in <- bb.in + 1
            return
receive(bb):
    while True:
        if bb.out < bb.in:                # there is at least one unread message
            message <- bb.buf[bb.out mod N]
            bb.out <- bb.out + 1
            return message
A lock allows only one CPU to be inside a piece of code at a time. Programs can acquire
and release a lock.
Deadlock refers to when two programs are each waiting on the other, so neither can make progress until the other one does. In the bounded-buffer example this is fixed with an additional release and re-acquire inside the waiting loop, so the other program has a chance to run.
We have to assume acquire and release are atomic actions, which means they can’t be
interrupted. Atomic actions and performance are a tradeoff.
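As a sketch of how the lock fits in (in Python, not the lecture's exact code), here is send with the release and re-acquire inside the waiting loop so a concurrent receive can free up space; receive would be symmetric:

    import threading

    N = 16

    class BoundedBuffer:
        def __init__(self):
            self.buf = [None] * N
            self.inp = 0                  # "in" is a Python keyword, so renamed
            self.out = 0
            self.lock = threading.Lock()

    def send(bb, message):
        bb.lock.acquire()
        while not (bb.inp - bb.out < N):  # buffer full: wait
            bb.lock.release()             # let receive acquire the lock
            bb.lock.acquire()             # then try again
        bb.buf[bb.inp % N] = message
        bb.inp += 1
        bb.lock.release()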
§5 Lecture 5: Threads
A thread is a virtual processor that can suspend and resume. Threads allow multiple
programs to share a CPU.
Suspending a thread means pausing it and allowing another thread to run. This is
done with the keyword yield. Resuming a thread means the thread unpauses.
Condition variables let threads wait for events ("conditions") and be notified when the condition occurs.
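A hedged sketch of the bounded buffer using Python's threading.Condition (the lecture builds its own wait/notify primitives; this just shows the shape of waiting on a condition and being notified):

    import threading

    N = 16
    buf, inp, out = [None] * N, 0, 0
    cond = threading.Condition()          # a lock plus a condition to wait on

    def send(message):
        global inp
        with cond:
            while inp - out >= N:         # buffer full
                cond.wait()               # releases the lock and sleeps until notified
            buf[inp % N] = message
            inp += 1
            cond.notify()                 # wake a thread waiting in receive

    def receive():
        global out
        with cond:
            while out >= inp:             # buffer empty
                cond.wait()
            message = buf[out % N]
            out += 1
            cond.notify()                 # wake a thread waiting in send
            return message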
The difference between yield and yield_wait is that yield_wait doesn’t acquire
and release t_lock since wait does this already.
Preemption means forcibly interrupting threads (e.g. via a timer interrupt); it ensures that yield actually gets called even if a thread never calls wait or yield itself.
The VMM will intercept ("trap") when a guest OS executes a privileged instruction, and then the VMM will emulate the instruction.
The VMM handles virtual memory for guest OSes by maintaining two mappings: the guest's page table (guest virtual to guest physical) and the VMM's own table (guest physical to host physical), which can be combined to form a host page table mapping guest virtual to host physical addresses.
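A minimal sketch of combining the two mappings into a single guest-virtual-to-host-physical table; the page numbers are made up:

    guest_page_table = {0x1: 0xA, 0x2: 0xB}   # guest virtual page -> guest physical page
    vmm_page_table   = {0xA: 0x7, 0xB: 0x9}   # guest physical page -> host physical page

    host_page_table = {
        gva: vmm_page_table[gpa]
        for gva, gpa in guest_page_table.items()
        if gpa in vmm_page_table
    }
    print(host_page_table)   # {1: 7, 2: 9}  (guest virtual page -> host physical page)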
The VMM deals with the U/K bit for guest OSes by making guest OSes run in user mode. The VMM will replace problematic instructions with ones it can trap and emulate. Modern architectures also provide a special operating mode for VMMs, in addition to user mode and kernel mode.
§7 Lecture 7: OS Performance
A performance bottleneck is the part of the system that constrains overall performance.
It's helpful to have a model of the system when thinking about performance. Common performance metrics are throughput and latency.
For HDDs (common in datacenters), reads/writes are slow but can be improved by avoiding random access; seeking takes the longest time.
For SSDs, read/writes are faster because SSDs don’t involve moving parts. However, the
SSD controller is careful about how it writes new data and makes changes to existing data.
A database management system (DBMS) is good at predicting what the next query will
be, compared to a filesystem. It’s in a good position to exploit block-level control over
loading or evicting data to memory.
A layered model is useful because we can swap out protocols at one layer without much
change to protocols at other layers.
We use distributed routing protocols because their steps (learning about neighbors, advertising routes, integrating advertisements) can happen periodically, which allows the routing protocol to detect and respond to failures, and adapt to other changes in the network.
In link-state routing, nodes keep track of which advertisements they've forwarded so they don't re-forward them. Nodes use Dijkstra's algorithm to integrate advertisements. Each node keeps a table with three columns: dst, route (the next hop to use), and cost.
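As an illustration, a compact Python sketch of integrating link-state information: run Dijkstra over the learned graph and produce a table of dst -> (route, cost), where route is the next hop; the graph below is invented:

    import heapq

    def shortest_paths(graph, src):
        # graph: {node: {neighbor: link_cost}}
        table = {}                                   # dst -> (next_hop, cost)
        pq = [(0, src, None)]                        # (cost so far, node, first hop)
        while pq:
            cost, node, first_hop = heapq.heappop(pq)
            if node in table or node == src and cost > 0:
                continue                             # already finalized
            if node != src:
                table[node] = (first_hop, cost)
            for nbr, link_cost in graph[node].items():
                hop = nbr if node == src else first_hop
                heapq.heappush(pq, (cost + link_cost, nbr, hop))
        return table

    graph = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 1}, "C": {"A": 4, "B": 1}}
    print(shortest_paths(graph, "A"))   # {'B': ('B', 1), 'C': ('B', 2)}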
Nodes integrate advertisements with a Bellman-Ford-style update: when a node hears that a neighbor can reach a destination at some cost, it adds the cost of the link to that neighbor and switches to the new route if it is cheaper than its current one. Distance vector routing has its pros and cons:
• Cons: Failures can be complicated because of timing, and are generally hard to handle.
The order in which advertisements are received by nodes matters. When there is a failure, the cost can "count to infinity." The workaround for this is the split horizon strategy, which doesn't send advertisements about a route back to the node providing that route, but this doesn't always work.
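A minimal sketch of the distance-vector update a node might apply when an advertisement arrives (variable names and costs are illustrative):

    # table maps dst -> (next_hop, cost); link_cost is the cost of the link to nbr.
    def handle_advertisement(table, nbr, link_cost, dst, advertised_cost):
        new_cost = link_cost + advertised_cost
        if dst not in table or new_cost < table[dst][1]:
            table[dst] = (nbr, new_cost)      # cheaper route found via nbr

    table = {}
    handle_advertisement(table, nbr="B", link_cost=1, dst="S", advertised_cost=3)
    handle_advertisement(table, nbr="C", link_cost=2, dst="S", advertised_cost=1)
    print(table)   # {'S': ('C', 3)}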
Definition 10.2. Policy routing is where packets are forwarded based on specific policies set by the network administrator, not just the shortest path.
Different AS relationships include
• Peering: allows free mutual access to each other's customers (as long as the amount of traffic is approximately equal in each direction)
All the top tier ISPs (Internet stakeholders) peer to allow for global connectivity.
Providers tell all neighbors about their customers and tell their customers about all
neighbors.
ASes will set their own import policies. If an AS hears about multiple routes to
a destination, it will prefer to use its customers first, then peers, then providers.
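A hedged sketch of that import preference, with an AS-path-length tie-break added purely as an illustrative secondary criterion (the relationships and routes below are made up):

    PREFERENCE = {"customer": 0, "peer": 1, "provider": 2}

    def choose_route(routes):
        # routes: list of (relationship_of_neighbor, as_path)
        return min(routes, key=lambda r: (PREFERENCE[r[0]], len(r[1])))

    routes = [
        ("provider", ["AS7", "AS3", "AS9"]),
        ("peer",     ["AS5", "AS9"]),
        ("customer", ["AS2", "AS8", "AS9"]),
    ]
    print(choose_route(routes))   # ('customer', ['AS2', 'AS8', 'AS9'])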
§10.2 BGP
BGP (border gateway protocol) as a distributed routing protocol:
1. Nodes learn about their neighbors via the HELLO protocol. Nodes send "KEEPALIVE" messages to their neighbors once every sixty seconds.
2. Nodes learn about other reachable nodes via advertisements. Advertisements differ
based on AS relationships (customer/provider, peer).
3. Nodes determine which routes to use. Rather than picking min-cost routes, nodes choose based on AS relationships and other properties.
BGP is an application-layer protocol, even though it deals with routing. It runs on top of
TCP which provides reliable transport.
This lets BGP handle failures differently than link-state and distance-vector routing.
BGP scales to the Internet, but the size of routing tables and route instability both cause scaling issues. BGP is not secure.
A TCP sender uses a timeout to infer that a packet has been lost: it resends the packet if no ACK arrives before the timeout. A spurious retransmission is when the sender retransmits a packet that had actually been delivered (e.g. its ACK was delayed or lost).
• Fairness: under infinite offered load, split bandwidth evenly among all sources
sharing a bottleneck.
The window W refers to the maximum number of outstanding packets that the sender can have. We adjust W using AIMD (additive increase, multiplicative decrease).
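A minimal sketch of AIMD under the usual description (the constants are illustrative, not TCP's exact algorithm):

    def aimd_step(window, loss_detected, increase=1, decrease=0.5):
        if loss_detected:
            return max(1, window * decrease)   # multiplicative decrease
        return window + increase               # additive increase

    w = 1.0
    for loss in [False, False, False, True, False, False]:
        w = aimd_step(w, loss)
        print(w)    # 2.0, 3.0, 4.0, 2.0, 3.0, 4.0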
An issue about TCP that will be addressed next lecture is that it doesn’t react to
congestion until after it’s a problem, so we want to get senders to react before queues
are full.
§12.2 Scheduling
Delay-based scheduling puts latency-sensitive traffic in its own queue and serves that queue first. This doesn't prevent latency-sensitive traffic from "starving out" the other traffic.
Round-robin scheduling can't handle variable packet sizes and doesn't allow us to weight traffic differently. Deficit round-robin scheduling handles variable packet sizes (even within the same queue) while giving near-perfect fairness and low packet-processing overhead.
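A simplified sketch of deficit round-robin: each queue banks a quantum of bytes per round and may only send packets its accumulated deficit can cover (the queue contents and quantum below are made up):

    from collections import deque

    def drr(queues, quantum, rounds):
        # queues: list of deques of packet sizes (in bytes)
        deficits = [0] * len(queues)
        sent = []
        for _ in range(rounds):
            for i, q in enumerate(queues):
                if not q:
                    deficits[i] = 0            # empty queues don't bank credit
                    continue
                deficits[i] += quantum
                while q and q[0] <= deficits[i]:
                    pkt = q.popleft()
                    deficits[i] -= pkt
                    sent.append((i, pkt))
        return sent

    queues = [deque([1500, 1500]), deque([300, 300, 300])]
    print(drr(queues, quantum=600, rounds=3))
    # [(1, 300), (1, 300), (1, 300), (0, 1500)]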
The BitTorrent P2P network distributes large files efficiently by breaking them into
smaller pieces and sharing them across multiple peers.
The .torrent file contains the file name, the file size, information about the "blocks" of the file, and the tracker URL.
Peers are incentivized to upload data because peers prioritize uploading to those who
also upload back.
§13.2 CDNs
CDNs work by
1. Placing servers in many geographic locations.
2. Replicating (caching) content p across those servers.
3. When a client requests p, directing them to the "best" server that has a copy of p.
A CDN owner (like Akamai) might take geographical proximity, RTT, bandwidth, and throughput into account when deciding which server is "best" for a particular client.
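A purely hypothetical sketch of such a selection: score candidate servers by measured RTT and bandwidth with made-up weights; real CDNs use richer, proprietary logic:

    def best_server(servers):
        # servers: list of dicts with hypothetical 'name', 'rtt_ms', 'mbps' fields
        return min(servers, key=lambda s: s["rtt_ms"] - 0.1 * s["mbps"])

    servers = [
        {"name": "bos-1", "rtt_ms": 12, "mbps": 900},
        {"name": "sfo-2", "rtt_ms": 80, "mbps": 950},
    ]
    print(best_server(servers)["name"])   # bos-1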
The network topology provides communication between racks. One example is the Clos topology, which looks like many copies of trees (to provide redundancy).
We route using multi-path routing, which can load-balance across paths, but we
need to be careful about how we divide traffic across the paths (since this makes conges-
tion control more difficult). The centralized controller in a datacenter is responsible
for managing and optimizing compute, storage, and network resources to ensure efficient
operations.
Compared to the Internet, datacenter networks are under the control of a single admin
entity, so we have a higher level of control.
§15.1 RAID
RAID stands for redundant array of independent disks. There are several types (levels) of RAID:
• RAID 1: mirroring. Pro: can recover from a single-disk failure. Con: requires 2N disks.
• RAID 4: a dedicated parity disk (which is the XOR of all the data disks). Pros: can recover from a single-disk failure and requires only N + 1 disks (instead of 2N), with performance benefits if you stripe a single file across multiple data disks. Con: all writes go to the parity disk (so that disk becomes a bottleneck and wears out faster).
• RAID 5: instead of writing all parity to one disk, distribute the parities (for example, sector i's parity can be on disk i mod (N + 1)).
Our main tool for improving reliability is redundancy. RAID 5 protects against
single-disk failures while maintaining good performance.
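A minimal sketch of the XOR parity idea behind RAID 4/5: the parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors (the data below is made up):

    from functools import reduce

    def xor_blocks(blocks):
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    data = [b"\x01\x02", b"\x0f\x00", b"\x10\x20"]   # three data "disks"
    parity = xor_blocks(data)                        # parity "disk"

    # Disk 1 fails: rebuild it from the other data disks plus parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])   # True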
To update a file atomically, we make a shadow copy of the file called temp; the writes to the copy don't need to be atomic, and we atomically rename the copy over the original when done.
Isolation refers to how and when the effects of one action are visible to another.
We keep a log of updates and commits, so that when we call read method, we look
through the logs to get the result. We will only read if a commit exists. If program
crashes halfway through, no commit, so value will not be read.
• Writes contain the old and new value of a variable. Each write is a small append to the end of the log.
• To read a variable x, the system scans backwards through the log to find x's last committed value.
• The commit point for a transaction is writing the COMMIT record.
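A minimal sketch of reads against such a log; the record formats are simplified (tid, variable, old, new) and the set of committed transactions is precomputed for brevity:

    log = [
        ("WRITE", "t1", "x", 0, 10),
        ("WRITE", "t2", "x", 10, 20),
        ("COMMIT", "t1"),
        # t2 never commits (e.g. the program crashed), so its write is invisible.
    ]

    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}

    def read(var):
        # Scan backwards for the last value written by a committed transaction.
        for rec in reversed(log):
            if rec[0] == "WRITE" and rec[2] == var and rec[1] in committed:
                return rec[4]
        return None

    print(read("x"))   # 10  (t1's value; t2's uncommitted write is ignored)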
The problem is that reads can be very slow. To fix this, we add cell storage on disk, which stores the current values of A and B. We also add a recover operation, which undoes any writes that don't have an associated COMMIT. The issue now is that writes become slow, because each write has to go to the log first and then to cell storage.
Instead of writing through to disk, we add a cache. When we write, we set the cache value. When we read, we attempt to read from the cache, otherwise we read from disk. We also have a flush operation, called occasionally, that updates the disk values to reflect the cache values.
To improve performance for recovery, we can write checkpoints and truncate the log.
Serializability implies that some interleavings are not allowed, since some interleavings may produce results that aren't achievable by any sequential run.
However, we may want to enforce even stricter conditions. For example, even if we end up with the same final result, what if a set of reads or writes occurs in an order that would not be possible in any sequential run? Whether that is acceptable depends on which notion of serializability we use.
When we interleave, the order of conflicting operations may or may not match the order in one of the sequential runs. A schedule is conflict serializable if the order of all of its conflicts is the same as the order of the conflicts in some sequential schedule.
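One standard way to check this is the precedence-graph test: add an edge from T1 to T2 whenever an operation of T1 conflicts with a later operation of T2; the schedule is conflict serializable iff the graph has no cycle. A sketch, with a made-up schedule format:

    # Schedule format: (transaction, op, variable), ops "r" (read) and "w" (write).
    def conflict_serializable(schedule):
        edges = set()
        for i, (t1, op1, var1) in enumerate(schedule):
            for t2, op2, var2 in schedule[i + 1:]:
                if t1 != t2 and var1 == var2 and "w" in (op1, op2):
                    edges.add((t1, t2))      # t1's op conflicts with t2's later op
        nodes = {t for t, _, _ in schedule}

        def has_cycle(node, path, seen):
            if node in path:
                return True
            if node in seen:
                return False
            seen.add(node)
            return any(has_cycle(b, path | {node}, seen) for a, b in edges if a == node)

        return not any(has_cycle(n, set(), set()) for n in nodes)

    ok  = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
    bad = [("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x"), ("T2", "r", "x")]
    print(conflict_serializable(ok), conflict_serializable(bad))   # True False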
One issue here is that 2PL (two-phase locking) can result in deadlock. One solution is a global ordering on locks. A better solution is to take advantage of atomicity and abort one of the transactions: undo its writes using the log and release its locks, so the other transaction can proceed, and retry the aborted transaction later.
One possible issue is if one server commits but the other doesn’t. Our goal is to
develop a protocol that can provide multi-site atomicity in the face of all sorts of
failures.
If a worker failure happens during the commit phase, we cannot abort the transaction. Workers must be able to recover into a prepared state and then commit. Workers write PREPARE records once prepared; the recovery process (reading through the log) will indicate which transactions are prepared but not committed.
If the coordinator fails during the commit phase, it must commit the transaction during recovery. One performance issue is that if the coordinator fails during the prepare phase, it will block the transaction from progressing (prepared workers must wait for the coordinator to recover).
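A minimal sketch of the worker-side recovery scan described above; record formats are simplified, and a real worker would also ask the coordinator for the outcome of the transactions this returns:

    def recover(log):
        prepared, done = set(), set()
        for rec, tid in log:
            if rec == "PREPARE":
                prepared.add(tid)
            elif rec in ("COMMIT", "ABORT"):
                done.add(tid)
        # These transactions must block until the coordinator says commit or
        # abort -- the worker can no longer abort them unilaterally.
        return prepared - done

    log = [("PREPARE", "t1"), ("COMMIT", "t1"), ("PREPARE", "t2")]
    print(recover(log))   # {'t2'}  (prepared but outcome unknown)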
Instead, we can make one replica the primary replica and have a coordinator in place to help manage failures. Clients communicate only with the coordinator (not the replicas). The coordinator sends requests to the primary server. The primary ACKs the coordinator only after it's sure that the backup has all updates. If the primary fails, C (the coordinator) switches to the backup.
Let's introduce the concept of a network partition: the network splits such that machines on the same side of the partition can communicate with each other, but not with machines on the other side. Because two different replicas may both think that they are the primary replica, data can become inconsistent.
To fix this, we introduce the view server to determine which replica is primary, in hopes
that we can deal with network partitions. The view server keeps a table that maintains
a sequence of views. The view server alerts primary/backups about their roles.
§21 Authentication
We're starting our last section, on security: how our system copes in the face of targeted attacks.
• H is one-way: given x, it's easy to compute H(x), but given H(x) it's hard to recover x.
If we store the hash of the password instead of the password itself, this is safer.
However, we have another problem: the adversary can easily pre-compute hashes for a
lot of passwords. So, we introduce a special type of hash function.
In this case, the attacker is incentivized to concentrate on the most common passwords.
One idea to remedy this is to add randomness.
§21.3 Randomness
We will associate a random value (a salt) with each user. These salts are stored in
plaintext and are not a secret.
Instead of storing a hash of the password, we will concatenate the password and the salt
and store the hash of that string.
Precomputation no longer helps the adversary much, because they would have to precompute hashes for every possible salt.
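A minimal sketch of the salted scheme: store (salt, H(password + salt)) per user. A real system would use a deliberately slow hash (bcrypt/scrypt/argon2) rather than plain SHA-256; this only shows the salting:

    import hashlib, os

    users = {}   # username -> (salt, hash)

    def register(username, password):
        salt = os.urandom(16)                       # random, stored in plaintext
        digest = hashlib.sha256(password.encode() + salt).hexdigest()
        users[username] = (salt, digest)

    def check(username, password):
        salt, digest = users[username]
        return hashlib.sha256(password.encode() + salt).hexdigest() == digest

    register("alice", "correct horse battery staple")
    print(check("alice", "correct horse battery staple"))   # True
    print(check("alice", "hunter2"))                        # False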
To return to main() after function() ends, we use BP (base pointer) to locate the start
of the current stack frame. The previous values of BP and IP (instruction pointer) are
located at a fixed offset from that so we can reset BP and IP and continue on.
The adversary's goal is to input a string that overflows a buffer on the stack and overwrites adjacent memory (such as the modified variable in the lecture example, or the saved BP/IP), so that the program's behavior or control flow changes when the function returns.
§22.1 Compilers
Compilers take source code as an input and output machine code.
Our policy is to provide confidentiality and integrity. We will encrypt and decrypt
a message using a key. The issue is that the adversary still can tamper with packets.
Instead, we send over the ciphertext and a token, where MAC(key, message) = token. An adversary can't produce a valid token without knowing the key, and can't recover the message from the token. This solves the issue of integrity.
The problem is that an adversary can intercept a message and resend it at a later time. The solution is to use sequence numbers.
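A minimal sketch of the token plus sequence-number idea using HMAC; key distribution and encryption are omitted, and the names below are illustrative:

    import hmac, hashlib

    KEY = b"shared secret key"     # assumed to be shared by both endpoints

    def make_packet(seq, message):
        data = seq.to_bytes(8, "big") + message
        token = hmac.new(KEY, data, hashlib.sha256).digest()
        return data, token

    def verify(data, token, last_seq_seen):
        expected = hmac.new(KEY, data, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, token):
            return False                               # tampered with
        seq = int.from_bytes(data[:8], "big")
        return seq > last_seq_seen                     # reject replays

    data, token = make_packet(1, b"transfer $100")
    print(verify(data, token, last_seq_seen=0))   # True
    print(verify(data, token, last_seq_seen=1))   # False (replay)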
§24 Tor
Today we’re still looking at adversaries observing data on the network. Symmetric-key
cryptography is when the same key is used to encrypt and decrypt. This means that
Alice and Bob share the same key, which is difficult because how do we even share the
key in the first place?
However, we want to provide anonymity, i.e. it’s a problem if the packet header
exposes to the adversary that A is communicating with S.
One solution to this is to have a proxy P: the packet header first says "from A to P", and then P rewrites it to "from P to S".
However, P itself now links A to S, so we have stronger requirements: no entity in the network should receive a packet from A and send it directly to S, and no entity should keep state that links A to S.
However, what if the adversary has multiple vantage points and can observe the same
data traveling from A to S? This means that data cannot appear the same across packets.
The solution to this is onion routing, which adds layers of encryption that proxies strip off one by one. The setup chains each skip one node in the middle, i.e. A → P2, followed by P1 → P3, followed by P2 → S.
During circuit setup, symmetric keys are established between A and each node in the circuit, and the layers of encryption use those symmetric keys, which is what allows traffic to travel in both directions.
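A conceptual sketch of the layered encryption using symmetric keys shared between A and each node (this uses the third-party cryptography package's Fernet; real Tor has its own circuit and cell formats):

    from cryptography.fernet import Fernet

    # One symmetric key shared between A and each node in the circuit.
    keys = {name: Fernet(Fernet.generate_key()) for name in ["P1", "P2", "P3"]}

    def wrap(message, circuit):
        for name in reversed(circuit):          # innermost layer is for the exit node
            message = keys[name].encrypt(message)
        return message

    def unwrap_at(node, message):
        return keys[node].decrypt(message)      # each node strips only its own layer

    onion = wrap(b"GET / from A to S", ["P1", "P2", "P3"])
    for node in ["P1", "P2", "P3"]:             # packet travels through the circuit
        onion = unwrap_at(node, onion)
    print(onion)   # b'GET / from A to S'

Each proxy can strip exactly one layer, so no single node sees both A and S in the clear.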