06 Application Architecture
Lecture 6
Srinivas Narayana
http://www.cs.rutgers.edu/~sn624/553-S25
Review: Offline and Online components
[Figure: online and offline components. Labels: user request, GFE, word → doc index (pick one), relevant document IDs, snippet results; services, databases, and ML inference, each with an update path.]
Many apps can use partition-aggregate
• Need low latency, but single-threaded low latency is hard
• Data parallelism
• Little coordination across shards
• Inexpensive merges across partial results from shards
• Query parallelism
• More replicas/machines for more requests
• Use commodity (not fancy) hardware
• Turn high throughput into a latency advantage
• Focus on price per unit performance
• Significant problems: cooling for many compute servers
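The partition-aggregate pattern above can be sketched in a few lines. This is a minimal illustration, not a production design: the in-memory `shards` dictionaries, the term-count scoring, and the names `search_shard` and `partition_aggregate` are all assumptions made for the example. Each shard computes its own top-k independently (little coordination), and the aggregator only merges small partial lists (an inexpensive merge).

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def search_shard(shard, query, k):
    """Return this shard's local top-k (doc_id, score) pairs for the query."""
    # Hypothetical scoring: how many times the query term appears in the document.
    scored = ((doc_id, text.lower().split().count(query)) for doc_id, text in shard.items())
    hits = [(doc_id, score) for doc_id, score in scored if score > 0]
    return heapq.nlargest(k, hits, key=lambda hit: (hit[1], hit[0]))

def partition_aggregate(shards, query, k=2):
    """Query every shard in parallel, then merge the partial top-k lists."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, query, k), shards))
    merged = [hit for partial in partials for hit in partial]
    # The merge touches at most k results per shard, not the full data set.
    return heapq.nlargest(k, merged, key=lambda hit: (hit[1], hit[0]))

# Two toy shards, each holding a disjoint slice of the documents.
shards = [
    {"d1": "map reduce map", "d2": "tail latency"},
    {"d3": "map", "d4": "reduce map map map"},
]
result = partition_aggregate(shards, "map", k=2)
print(result)
```

Adding more shards shrinks the work per machine (data parallelism); running several copies of the whole arrangement would serve more requests at once (query parallelism).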
Tail performance becomes important
• With partition-aggregate, each machine may serve many requests within a single user-level request
• A single user sends multiple requests over a session
• Many shards are queried for a single user-level query
• Delays in any one of them can delay the entire response
• Leaving the shard out degrades the result
• Example: 1000 shards × 10 user requests per session
• 1 delay in 10,000 machine-level responses can be visible to a user
• 99.99th percentile delay matters
• Lots of work on cutting the tail: hedging, duplication, …
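The arithmetic behind the example can be checked with a short calculation. It assumes, purely for illustration, that every machine-level response is slow independently with the same probability; real delays are often correlated, so treat the numbers as a rough guide. The function name `prob_user_sees_delay` is made up for this sketch.

```python
def prob_user_sees_delay(p_slow, shards, requests_per_session):
    """Probability that at least one machine-level response in a session is slow.

    Assumes independent, identically distributed delays (a simplification).
    """
    responses = shards * requests_per_session
    return 1.0 - (1.0 - p_slow) ** responses

# 1000 shards x 10 user requests = 10,000 machine-level responses per session.
at_p9999 = prob_user_sees_delay(1e-4, shards=1000, requests_per_session=10)
at_p99 = prob_user_sees_delay(1e-2, shards=1000, requests_per_session=10)
print(f"slow 1 time in 10,000: {at_p9999:.3f}")
print(f"slow 1 time in 100:    {at_p99:.3f}")
```

Under these assumptions, even a delay that hits only one response in 10,000 reaches most sessions, which is why the 99.99th percentile, rather than the median, is the number to watch.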
MapReduce
Batch processing with simple abstractions
Example: Batch data processing
• Server access log: want to get top-5 URLs visited
192.0.2.1 - - [07/Dec/2021:11:45:26 -0700] "GET /index.html HTTP/1.1" 200 4310
• This is analytics, not a real-time user query. How would you go about it?
• One way: shell script
cat /var/log/nginx/access.log \
  | awk '{print $7}' \
  | sort \
  | uniq -c \
  | sort -r -n \
  | head -n 5
Example: Batch data processing
Another way: Python script
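from collections import Counter

counts = Counter()  # missing URLs start at 0, so += 1 is safe
with open("/var/log/access.log") as log:
    for line in log:
        url = line.split()[6]  # 7th whitespace-separated field is the request path
        counts[url] += 1
# Sort (url, count) pairs by count, highest first
sorted_counts = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
print(sorted_counts[0:5])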
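The same top-5 job can also be phrased as map, shuffle, and reduce steps. The sketch below runs in a single process and only illustrates the abstraction; the names `map_fn`, `shuffle`, `reduce_fn`, and `top_urls` are assumptions for this example, not the API of any MapReduce framework. In a real deployment the map and reduce calls would run on many machines and the shuffle would move data over the network.

```python
from collections import defaultdict

def map_fn(line):
    """Map: emit (url, 1) for one access-log line; skip malformed lines."""
    fields = line.split()
    if len(fields) > 6:
        yield fields[6], 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_fn(url, ones):
    """Reduce: sum the values for one key."""
    return url, sum(ones)

def top_urls(lines, k=5):
    pairs = (pair for line in lines for pair in map_fn(line))
    counts = [reduce_fn(url, ones) for url, ones in shuffle(pairs).items()]
    # Highest count first; ties broken by URL so the output is deterministic.
    return sorted(counts, key=lambda kv: (-kv[1], kv[0]))[:k]

template = '192.0.2.1 - - [07/Dec/2021:11:45:26 -0700] "GET {} HTTP/1.1" 200 4310'
log_lines = [template.format(u) for u in ["/index.html", "/about", "/index.html"]]
print(top_urls(log_lines, k=2))
```

Because each `map_fn` call sees one line and each `reduce_fn` call sees one key, both steps can be spread across machines without coordinating with each other.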
MapReduce
• How to handle failures?
• Machine failures?
• What happens to partial computations?
• Should we replicate compute?
• What happens to intermediate results?
• Should you persist them? Replicate them?
• Stream Processing
• Incremental execution of batch jobs when new data arrives
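Incremental execution can be sketched for the URL-count job: rather than re-reading the whole log when new lines arrive, keep the running counts and fold in only the new batch. The class name `IncrementalTopUrls` and its methods are hypothetical, and keeping state in memory is an assumption that sidesteps the persistence and failure questions raised above.

```python
from collections import Counter

class IncrementalTopUrls:
    """Keeps running URL counts so each new batch costs work proportional to its size."""

    def __init__(self):
        self.counts = Counter()

    def ingest(self, lines):
        """Fold a batch of new access-log lines into the running counts."""
        for line in lines:
            fields = line.split()
            if len(fields) > 6:  # skip malformed lines
                self.counts[fields[6]] += 1

    def top(self, k=5):
        """Return the k most visited URLs seen so far as (url, count) pairs."""
        return self.counts.most_common(k)

template = '192.0.2.1 - - [07/Dec/2021:11:45:26 -0700] "GET {} HTTP/1.1" 200 4310'
tracker = IncrementalTopUrls()
tracker.ingest([template.format("/a"), template.format("/b"), template.format("/a")])
first = tracker.top(1)
tracker.ingest([template.format("/b"), template.format("/b")])  # new data arrives
second = tracker.top(1)
print(first, second)
```

The answer changes as data arrives, and only the new lines are processed on each update.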