Distributed File Systems: Arvind Krishnamurthy Spring 2001

Distributed file systems provide transparent access to files stored remotely. They aim to handle failures, improve performance through caching, and maintain cache coherence across clients. The Network File System is a widely used distributed file system that uses a stateless protocol and write-through caching to maintain consistency despite failures.

Distributed File Systems

- Distributed File Systems provide transparent access to files stored on a remote disk
- Themes:
  - Failures: what happens when server crashes, but client doesn’t? Or vice versa?
  - Performance -> caching; use caching at both the clients and the server to improve performance
  - Cache coherence: how do we make sure each client sees most up to date copy?
  - Other issues: naming, scalability, security
- Examples
  - NFS: Sun’s Network File System
  - AFS: Andrew File System (CMU)
  - Sprite: Berkeley research project; stressed caching
  - Coda: CMU research project; stressed availability

DFS properties

- Network transparency
  - Looks like local file system
- Location transparency
  - No idea where the file is located
- Location independence
  - Name of file shouldn’t change
- User mobility
  - Access anywhere
- Fault tolerance
  - Access all the time
- Scalability
  - Access from many machines
- File mobility
  - Can move files from one location to another in a running system

Simple Approach

- No caching:
  - Use remote procedure calls to forward every file system request to remote server
  - Example operations: open, seek, read, write, close
  - Server implements each operation as it would for a local request and sends back result to client
- Advantage: server provides consistent view of file system to all clients
- Problems: performance
  - Going over network is much slower than going to local memory
  - Lots of network traffic
  - Server can be a bottleneck: what if lots of clients?

[Figure: Client, server S, Cache]
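The no-caching approach can be sketched in a few lines of Python. This is only an illustration: `FileServer`, `RemoteFileClient`, and the `round_trips` counter are hypothetical names, and a direct method call stands in for a real RPC over the network. The point is that every operation costs one message, and that both clients see the same data because the only copy lives on the server.

```python
class FileServer:
    """Toy server: holds files and per-handle offsets, as a local FS would."""

    def __init__(self, files):
        self.files = {name: bytearray(data) for name, data in files.items()}
        self.handles = {}      # handle -> [name, offset]
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = [name, 0]
        return handle

    def seek(self, handle, offset):
        self.handles[handle][1] = offset

    def read(self, handle, count):
        name, offset = self.handles[handle]
        data = bytes(self.files[name][offset:offset + count])
        self.handles[handle][1] = offset + len(data)
        return data

    def write(self, handle, data):
        name, offset = self.handles[handle]
        self.files[name][offset:offset + len(data)] = data
        self.handles[handle][1] = offset + len(data)
        return len(data)

    def close(self, handle):
        del self.handles[handle]


class RemoteFileClient:
    """Client stub: every operation is forwarded; nothing is cached."""

    def __init__(self, server):
        self.server = server
        self.round_trips = 0   # stands in for network messages

    def call(self, op, *args):
        self.round_trips += 1
        return getattr(self.server, op)(*args)


server = FileServer({"notes.txt": b"hello world"})
alice = RemoteFileClient(server)
bob = RemoteFileClient(server)

h = alice.call("open", "notes.txt")
alice.call("write", h, b"HELLO")
alice.call("close", h)

h = bob.call("open", "notes.txt")
first = bob.call("read", h, 5)     # sees alice's write immediately
bob.call("seek", h, 0)
again = bob.call("read", h, 5)     # re-reading costs another round trip
bob.call("close", h)
```

Reading the same five bytes twice costs bob two separate round trips, which is the performance problem that motivates caching.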

Network File System (NFS)

- History
  - Introduced by Sun Microsystems in 1985
  - Sun published the protocol and licensed reference implementation
  - Since then, NFS has been supported by every Unix variant and PC-NFS
- Design goals
  - Machine and OS independence, no recompilation of applications
  - Crash recovery
  - Transparent access
  - Unix semantics maintained on client
  - Reasonable performance (comparable to local FS)

NFS structure (cont’d)

- An NFS file system can be mounted into the namespace of a Unix machine
  - Server exports file system: /usr/local (in /etc/exports)
  - Clients: mount -t nfs nfsserver:/usr/local /usr/local
- Daemon processes
  - On the server: a set of nfsd daemons listen/respond to client requests; mountd handles mount requests
  - On the client: a set of biod daemons handles asynchronous I/O for blocks of NFS files
- Network lock manager & status monitor
  - Providing facilities for locking files over a network

Caching

- Idea: use caching to reduce network load
  - Cache file blocks, file headers, etc., at both clients and servers:
    - Client memory
    - Server memory
- Advantage: if open/read/write/close can be done locally, no network traffic
- Issues: failures and cache consistency

[Figure: two Clients, each with a Cache; server S with a Cache]

Motivation, part 1: Failures

- What if server crashes? Can client wait until server comes back up, and continue as before?
  - Any data in server memory, but not yet on disk can be lost
  - Shared state across RPCs:
    - Example: open, seek, read
    - What if server crashes after seek?
    - Client does “read”, it will fail
  - Message retries:
    - Suppose server crashes after it does “rm foo”, but before acknowledging
    - Message system will retry and send it again
- What if client crashes?
  - Might lose modified data in client cache
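The open, seek, read failure can be made concrete with a small Python sketch. `StatefulServer` and `crash_and_restart` are hypothetical names used only for illustration; the assumption is that the server keeps its open-file table in memory, so a restart wipes it while the data on disk survives.

```python
class StatefulServer:
    """Hypothetical server that keeps open-file state in memory only."""

    def __init__(self, disk):
        self.disk = disk          # survives a crash
        self.open_files = {}      # lost on a crash: handle -> [name, offset]

    def open(self, name):
        handle = len(self.open_files)
        self.open_files[handle] = [name, 0]
        return handle

    def seek(self, handle, offset):
        self.open_files[handle][1] = offset

    def read(self, handle, count):
        if handle not in self.open_files:
            raise KeyError("unknown file handle")
        name, offset = self.open_files[handle]
        self.open_files[handle][1] = offset + count
        return self.disk[name][offset:offset + count]

    def crash_and_restart(self):
        self.open_files = {}      # in-memory state is gone; disk is intact


server = StatefulServer({"foo": b"0123456789"})
handle = server.open("foo")
server.seek(handle, 4)
server.crash_and_restart()        # crash happens after the seek

try:
    server.read(handle, 3)        # the client's handle now means nothing
    read_failed = False
except KeyError:
    read_failed = True
```

The file itself is undamaged, but the client cannot continue as before because the server has forgotten both the open file and the seek position.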

NFS Protocol (part 1): stateless

- Write-through caching:
  - When a file is closed, all modified blocks are sent immediately to the server disk. To the client, “close” does not return until all bytes are stored on disk
- Stateless protocol:
  - Server keeps no state about client, except as hints to help improve performance
  - Each read request gives enough information to do entire operation:
    - ReadAt(inumber, position), not Read(openfile)
  - When server crashes and restarts, can start again processing requests immediately, as if nothing happened

NFS Protocol (contd.)

- Operations are “idempotent”:
  - All requests are ok to repeat
  - If server crashes between disk I/O and message send, client can resend message; server just does operation all over again
  - Read and write file blocks are easy
  - “remove file” is difficult: NFS just ignores this and returns a file not found
- Failures are transparent to client system
  - Suppose you are an application, in the middle of reading a file, server crashes. Options:
    - Hang until server comes back up
    - Return an error? Problem is most applications don’t know they are talking over the network
  - NFS uses both options: can select which one
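A minimal Python sketch of the stateless, idempotent style follows. It is not the real NFS wire protocol: `StatelessServer`, `read_at`, `write_at`, and `remove` are hypothetical names modeled on the ReadAt(inumber, position) idea, and a dictionary stands in for the server disk. Each request carries everything needed to perform it, so a restart loses nothing and a retried write gives the same result.

```python
class StatelessServer:
    """Sketch of a server that keeps no per-client state between requests."""

    def __init__(self, disk):
        self.disk = disk  # inumber -> bytearray; stands in for the server disk

    def read_at(self, inumber, position, count):
        return bytes(self.disk[inumber][position:position + count])

    def write_at(self, inumber, position, data):
        block = self.disk[inumber]
        block[position:position + len(data)] = data
        return len(data)

    def remove(self, inumber):
        if inumber not in self.disk:
            return "file not found"
        del self.disk[inumber]
        return "ok"

    def crash_and_restart(self):
        pass  # nothing in memory to lose


server = StatelessServer({7: bytearray(b"0123456789")})

# A write whose reply was lost is simply sent again; the result is the same.
server.write_at(7, 2, b"AB")
server.crash_and_restart()
server.write_at(7, 2, b"AB")
after_retry = server.read_at(7, 0, 6)

# A retried remove is the awkward case: the second attempt finds nothing.
first_remove = server.remove(7)
retried_remove = server.remove(7)
```

The repeated `write_at` is harmless because it names an absolute position rather than "the current offset". The repeated `remove` shows why that operation is called difficult: the client that retries sees "file not found" even though its own first request succeeded.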

Cache Consistency

- What if multiple clients are sharing the same files?
  - Easy if they are reading: each agent gets a copy of the file
  - What if one is writing? How do updates happen?
- NFS has write-through policy:
  - If one client modifies file, writes through to server
  - How does other client find out about the change?
- NFS: client polls server periodically to check if file has changed
  - Polls server if data hasn’t been checked in last 3-30 seconds
  - When file is changed on one client, server is notified, but other clients use old version of file until timeout
- What if multiple clients write to same file at the same time?
  - Can get either version (or parts of both). Completely arbitrary.

Sequential Ordering Constraints

- Cache coherence: what should happen?
  - What if one CPU changes file, and before it’s done, another CPU reads file?
- Every operation takes time; actual read could occur anytime between when system call is started and when it returns
- Assume what we want is for the distributed system to behave exactly the same as if all processes are running on a single Unix system
  - If read finishes before write starts, then get old copy
  - If read starts after write finishes, then get new copy
  - Otherwise, get either new or old copy
  - Similarly, if write starts before another write finishes:
    - May get either old or new version
- NFS: if read starts more than 30 seconds after write finishes, get new copy. Otherwise, get partial update
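The polling behavior can be sketched as follows. This is a simplified model, not NFS client code: `PollingClient`, the per-file version counter, and the explicit `now` argument are all assumptions made so the example is deterministic, and writes go through to the server on every write rather than on close. The timeout of 30 is taken from the upper end of the 3-30 second range above.

```python
class Server:
    def __init__(self):
        self.data = {}      # name -> bytes
        self.version = {}   # name -> change counter (stands in for mtime)

    def write(self, name, data):
        self.data[name] = data
        self.version[name] = self.version.get(name, 0) + 1

    def get_version(self, name):
        return self.version[name]

    def read(self, name):
        return self.data[name], self.version[name]


class PollingClient:
    def __init__(self, server, timeout):
        self.server = server
        self.timeout = timeout
        self.cache = {}     # name -> (data, version, last_checked)

    def read(self, name, now):
        entry = self.cache.get(name)
        if entry is not None:
            data, version, last_checked = entry
            if now - last_checked < self.timeout:
                return data                      # trust cache; may be stale
            if self.server.get_version(name) == version:
                self.cache[name] = (data, version, now)
                return data                      # revalidated, still current
        data, version = self.server.read(name)   # miss or changed: refetch
        self.cache[name] = (data, version, now)
        return data

    def write(self, name, data, now):
        self.server.write(name, data)            # write through to server
        self.cache[name] = (data, self.server.get_version(name), now)


server = Server()
writer = PollingClient(server, timeout=30)
reader = PollingClient(server, timeout=30)

writer.write("f", b"old", now=0)
seen_at_1 = reader.read("f", now=1)
writer.write("f", b"new", now=2)
seen_at_10 = reader.read("f", now=10)    # within the timeout: stale copy
seen_at_40 = reader.read("f", now=40)    # after the timeout: polls, refetches
```

The reader keeps returning the old contents until its timeout expires, even though the server has had the new data since time 2. That window is the inconsistency the slides describe.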

NFS Summary

- Pros:
  - Simple
  - Highly portable
- Cons:
  - Sometimes inconsistent
  - Does not scale to large number of clients

Andrew File System

- AFS (CMU, late 80’s) -> DCE Distributed File Systems
- Callbacks: server records who has copy of file
- Write through on close
  - If file changes, server is updated on close
  - Server then immediately tells all those with the old copy
- Session semantics: updates visible only on close
  - Unix: updates visible immediately to other programs who have the file open
  - AFS: everyone who has file open sees old version; anyone who opens file again will see new version
- AFS:
  - On open and cache miss: get file from server, set up callback
  - On write close: send copy to server; tells all clients with copies to fetch new version from server on next open
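The callback scheme and session semantics can be sketched in Python. The class and method names (`CallbackServer`, `CachingClient`, `break_callback`) are hypothetical, whole files are treated as single byte strings, and an in-memory dictionary stands in for the client's local disk cache; real AFS is considerably more involved.

```python
class CallbackServer:
    def __init__(self):
        self.files = {}       # name -> bytes
        self.callbacks = {}   # name -> set of clients holding a copy

    def fetch(self, name, client):
        # Record who has a copy so they can be told when it changes.
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, name, data, writer):
        self.files[name] = data
        for client in self.callbacks.get(name, set()) - {writer}:
            client.break_callback(name)   # tell holders of the old copy
        self.callbacks[name] = {writer}


class CachingClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}       # name -> bytes, valid while callback is held
        self.open_files = {}  # name -> [working copy, dirty flag]

    def open(self, name):
        if name not in self.cache:        # cache miss: fetch, set up callback
            self.cache[name] = self.server.fetch(name, self)
        self.open_files[name] = [self.cache[name], False]

    def read(self, name):
        return self.open_files[name][0]

    def write(self, name, data):
        self.open_files[name] = [data, True]

    def close(self, name):
        data, dirty = self.open_files.pop(name)
        if dirty:                         # write through on close
            self.cache[name] = data
            self.server.store(name, data, self)

    def break_callback(self, name):
        self.cache.pop(name, None)        # next open must refetch


server = CallbackServer()
server.files["f"] = b"v1"
a = CachingClient(server)
b = CachingClient(server)

a.open("f")
b.open("f")
a.write("f", b"v2")
a.close("f")                    # server updated, b's callback is broken
during_session = b.read("f")    # b still has the file open: old version
b.close("f")
b.open("f")                     # reopen misses the cache and refetches
after_reopen = b.read("f")
```

While b keeps the file open it continues to see the old version, and only the next open picks up the update, which matches the session semantics listed above.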

AFS (contd.)

- Files cached on local disk:
  - NFS caches only in memory
- What if server crashes?
  - Lose all your callback state
  - Reconstruct callback information from clients
- Pros:
  - Relative to NFS, less server load
  - Disk as cache -> more files can be cached locally
  - Callbacks -> server not involved if file is read-only
- Cons:
  - Fast LANs, local disk is slower than remote memory
  - Central server is a bottleneck
  - Performance (all data + misses goes to server), availability

Coda and disconnected operation

- AFS users often go a long time without any communication between their desktop client and any AFS server
- Coda says: “why can’t we use AFS-like implementation when disconnected from the network?”
  - On an airplane
  - At home
  - During network failure
- Issues
  - Which files to get before disconnection
  - Consistency

Hoarding

- AFS keeps recently used files on local disk
  - Most of what you need will be around
- Users can specify “hoard lists” to tell Coda to cache a bunch of other things even if not already stored locally
- System can also learn over time which files a user tends to use

Consistency

- What if two disconnected users write the same file at the same time?
  - No way to use callback promises since server and client cannot communicate
- Coda’s solution: cross your fingers, hope it does not happen, and pick up pieces if it does
  - Log of changes kept while disconnected
  - Apply changes upon reconnect
  - If conflict detected, try to resolve automatically, else ask the user
- In practice, unfixable conflicts almost never happen
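The log-and-replay idea can be sketched as below. This is an illustrative model rather than Coda's actual mechanism: the per-file version number used to detect conflicts, the names `DisconnectedClient`, `hoard`, `write_offline`, and `reconnect`, and the choice to replay only the last offline write per file are all assumptions. Conflicts are only reported here; automatic resolution is left out.

```python
class Server:
    def __init__(self):
        self.files = {}   # name -> (version, data)

    def read(self, name):
        return self.files[name]

    def write_if_unchanged(self, name, base_version, data):
        version, _ = self.files[name]
        if version != base_version:
            return False                      # someone else updated the file
        self.files[name] = (version + 1, data)
        return True


class DisconnectedClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}   # name -> (version, data), hoarded before disconnect
        self.log = []     # (name, base_version, data) recorded while offline

    def hoard(self, name):
        self.cache[name] = self.server.read(name)

    def write_offline(self, name, data):
        base_version, _ = self.cache[name]
        self.log.append((name, base_version, data))
        self.cache[name] = (base_version, data)

    def reconnect(self):
        latest = {}                           # keep last write per file
        for name, base_version, data in self.log:
            latest[name] = (base_version, data)
        conflicts = []
        for name, (base_version, data) in latest.items():
            if not self.server.write_if_unchanged(name, base_version, data):
                conflicts.append(name)        # would go to resolution / user
        self.log = []
        return conflicts


server = Server()
server.files = {"a": (1, b"a1"), "b": (1, b"b1")}
laptop1 = DisconnectedClient(server)
laptop2 = DisconnectedClient(server)

laptop1.hoard("a")
laptop1.hoard("b")
laptop2.hoard("a")

laptop1.write_offline("a", b"from-1")
laptop1.write_offline("b", b"b2")
laptop2.write_offline("a", b"from-2")

conflicts1 = laptop1.reconnect()   # first to reconnect: changes apply
conflicts2 = laptop2.reconnect()   # same file changed meanwhile: conflict
```

Whichever client reconnects first has its changes applied cleanly; the second one finds that the file it modified has moved on and must fall back to conflict resolution.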
