An Introduction to NFS

Avishay Traeger

IBM Haifa Research Lab


Internal Storage Course

November 2010
v1.2
Outline

• The Basics
• NFSv2
• NFSv3
• NFSv4
• NFSv4.1

Typical Use

[Figure: three NFS clients mounting /home from one NFS server backed by RAID storage]

NFS clients: ws-avishay (10.0.2.56), ws-bob (10.0.2.103), ws-carl (10.0.2.81)
Each client runs: mount -t nfs nfsserv:/home /home
NFS server: nfsserv
/etc/exports on nfsserv: /home 10.0.2.*(rw)
Server directory tree: / contains home (among others); /home contains avishay, bob, carl

Some benefits of NFS:
1. All clients have the same view
2. Centralized storage management

NFS Evolution

• NFS is a standardized protocol

Version  Year  RFC #  Pages  Status
NFSv2    1989  1094   27     Obsolete
NFSv3    1995  1813   126    Most popular
NFSv4    2003  3530   275    Available on several OSs, slowly but surely replacing NFSv3
NFSv4.1  2010  5661   617    Early adopters only

Design Goals

• OS independence & interoperability
• Simple crash recovery for clients and servers
• Transparent access (client programs do not know files are remote)
• Maintain local file system semantics
• Reasonable performance

Remote Procedure Call (RPC)

• NFS is defined as a set of RPCs – their arguments, results, and effects
• RPCs are synchronous
• The use of RPCs makes the protocol easier to understand

[Figure: NFS Client/Server]

Stateless Protocol

• The server does not keep state for RPCs
• Each RPC contains the necessary information to complete the call
• This makes crash recovery easy
    • Server crash: server does no crash recovery, clients resubmit requests
    • Client crash: no crash recovery for client or server
• This is nice in theory, but
    • Adds complexity
    • Not really stateless...
        • File locking adds state, provided by a separate protocol & daemon
        • Server keeps an RPC reply cache to handle duplicate non-idempotent RPCs

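The reply cache mentioned above can be sketched roughly as follows. This is a toy model, not the behavior of any real NFS server: the class name, the (client, xid) cache key, and the string status codes are all assumptions made for illustration. It shows why a retransmitted REMOVE should get the saved reply instead of being executed a second time.

```python
class ToyServer:
    """Toy server with a reply cache keyed by (client id, RPC transaction id)."""

    def __init__(self):
        self.files = {"a.txt"}
        self.reply_cache = {}  # (client_id, xid) -> reply already sent

    def remove(self, client_id, xid, name):
        key = (client_id, xid)
        if key in self.reply_cache:
            # Retransmission: replay the saved reply instead of re-executing.
            return self.reply_cache[key]
        if name in self.files:
            self.files.remove(name)
            reply = "OK"
        else:
            reply = "ENOENT"
        self.reply_cache[key] = reply
        return reply


server = ToyServer()
first = server.remove("c1", xid=7, name="a.txt")
retry = server.remove("c1", xid=7, name="a.txt")  # same xid: a duplicate
fresh = server.remove("c1", xid=8, name="a.txt")  # new request, file is gone
```

Without the cache, the retry would return an error even though the client's original request succeeded.
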
File Handles

• The most common NFS procedure parameter is a structure called a file handle (fh, fhandle)
• Provided by the server and used by the client to reference a file
• The fhandle is opaque to the client
• New fhandles returned by LOOKUP, CREATE, MKDIR, ...
• The fhandle for the root of the file system is obtained by the client when it mounts the file system

Operations

• NULL() returns ()
    • Do-nothing procedure used to ping the server
• LOOKUP(dirfh, name) returns (fh, attr)
    • Returns a new fh and attributes for the named file in the directory specified by dirfh
• CREATE(dirfh, name, attr) returns (newfh, attr)
    • Creates a new file name in the directory dirfh and returns the new fh and attributes
• REMOVE(dirfh, name) returns (status)
    • Removes the file name from directory dirfh

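Because LOOKUP resolves one name within one directory, a client has to walk a path component by component, starting from the root fhandle it obtained at mount time. The following is a minimal sketch under simplifying assumptions: ToyNfsServer, its tuple-shaped fhandles, and the resolve helper are hypothetical, and attributes are left out entirely.

```python
import itertools


class ToyNfsServer:
    """In-memory stand-in for a server; fhandles are opaque tokens to clients."""

    def __init__(self):
        self._next = itertools.count(1)
        self.root_fh = self._new_fh()
        self.dirs = {self.root_fh: {}}  # directory fh -> {name: child fh}

    def _new_fh(self):
        return ("fh", next(self._next))

    def mkdir(self, dirfh, name):
        fh = self._new_fh()
        self.dirs[dirfh][name] = fh
        self.dirs[fh] = {}
        return fh

    def create(self, dirfh, name):
        fh = self._new_fh()
        self.dirs[dirfh][name] = fh
        return fh

    def lookup(self, dirfh, name):
        return self.dirs[dirfh][name]  # raises KeyError if the name is missing

    def remove(self, dirfh, name):
        del self.dirs[dirfh][name]


def resolve(server, root_fh, path):
    """Client side: one LOOKUP per path component."""
    fh = root_fh
    for component in filter(None, path.split("/")):
        fh = server.lookup(fh, component)
    return fh


server = ToyNfsServer()
home = server.mkdir(server.root_fh, "home")
bob = server.mkdir(home, "bob")
notes = server.create(bob, "notes.txt")
found = resolve(server, server.root_fh, "/home/bob/notes.txt")
```
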
Operations

• GETATTR(fh) returns (attr)
    • Returns file attributes (similar to stat syscall)
• SETATTR(fh, attr) returns (attr)
    • Sets the mode, uid, gid, size, access time, and modify time of a file. Setting the size to zero truncates the file.

Operations

• READ(fh, offset, count) returns (attr, data)
    • Returns up to count bytes of data from a file starting offset bytes into the file
    • Returns the attributes of the file
• WRITE(fh, offset, count, data) returns (attr)
    • Writes count bytes of data to a file beginning at offset bytes from the beginning of the file
    • Returns the new attributes of the file after the write

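Since every READ carries its own offset and count, a client can fetch a whole file with a simple loop and no server-side file position. A rough sketch, with assumptions stated plainly: read_all and fake_read are hypothetical names, the returned attributes are omitted, and a short reply is treated as end of file.

```python
def read_all(read_rpc, fh, max_count=8192):
    """Read a whole file by issuing READ(fh, offset, count) until a short reply."""
    chunks = []
    offset = 0
    while True:
        data = read_rpc(fh, offset, max_count)
        chunks.append(data)
        offset += len(data)
        if len(data) < max_count:  # short read: assume end of file
            break
    return b"".join(chunks)


# Stand-in for the server side: one file held in memory (20480 bytes).
FILES = {"fh-1": bytes(range(256)) * 80}


def fake_read(fh, offset, count):
    return FILES[fh][offset:offset + count]


contents = read_all(fake_read, "fh-1")
```
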
Operations

• RENAME(dirfh, name, tofh, toname) returns (status)
    • Renames name in directory dirfh, to toname in directory tofh
• LINK(dirfh, name, tofh, toname) returns (status)
    • Creates a hard link toname in directory tofh, that points to name in directory dirfh

Operations

• SYMLINK(dirfh, name, string) returns (status)
    • Creates a symlink name in the directory dirfh with value string. The server does not interpret the string argument in any way, just saves it and makes an association to the new symlink file.
• READLINK(fh) returns (string)
    • Returns the string which is associated with the symlink file

Operations

• MKDIR(dirfh, name, attr) returns (fh, newattr)
    • Creates a new directory name in the directory dirfh and returns the new fh and attributes
• RMDIR(dirfh, name) returns (status)
    • Removes the empty directory name from the parent directory dirfh
• STATFS(fh) returns (fsstats)
    • Returns file system information such as block size, number of free blocks, etc.

Operations

• READDIR(dirfh, cookie, count) returns (entries)
    • Returns up to count bytes of directory entries from the directory dirfh
    • Each entry contains a file name, file id, and an opaque pointer to the next directory entry called a cookie
    • The cookie is used in subsequent readdir calls to start reading at a specific entry in the directory
    • A readdir call with the cookie of zero returns entries starting with the first entry in the directory

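The cookie mechanism amounts to resumable pagination. The sketch below makes two simplifications that differ from the protocol: the limit is a number of entries rather than a byte count, and the cookie is a plain list index, whereas a real client must treat it as opaque. The function names and the sample directory are hypothetical.

```python
def readdir(entries, cookie, max_entries):
    """Return up to max_entries (name, fileid, cookie) tuples following `cookie`."""
    page = entries[cookie:cookie + max_entries]
    return [(name, fileid, cookie + i + 1)
            for i, (name, fileid) in enumerate(page)]


def list_directory(entries, max_entries=2):
    names = []
    cookie = 0  # zero means "start at the first entry"
    while True:
        page = readdir(entries, cookie, max_entries)
        if not page:
            break
        names.extend(name for name, _, _ in page)
        cookie = page[-1][2]  # resume after the last entry seen
    return names


DIRECTORY = [("avishay", 11), ("bob", 12), ("carl", 13)]
all_names = list_directory(DIRECTORY)
```
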
The MOUNT Protocol

• The MOUNT protocol takes a directory pathname and returns an fhandle if the client has permissions to mount the file system
• Separate protocol
    • Easier to plug in new permission check methods
    • Separates the OS-dependent aspects of the protocol
• Other OS implementations can change the MOUNT protocol without having to change the NFS protocol

The Linux File Handle

• Remember that information contained in the fhandle is only meaningful on the server
• If the local FS on the server reuses an inode number, an NFS client could mistakenly use an old file handle and access the new file. File systems include generation numbers in the inode to avoid this. The value is usually taken from a counter used across the file system.
• Important file handle fields:
    • Major/minor number of the exported device
    • Inode number
    • Generation number

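To make the role of the generation number concrete, here is a small sketch that packs the three fields listed above into an opaque byte string and detects a stale handle after an inode number is reused. The byte layout in FH_FORMAT and the helper names are invented for this example and do not reflect the actual Linux encoding.

```python
import struct

# Hypothetical layout: major, minor, inode number, generation (network byte order).
FH_FORMAT = "!IIQI"


def make_fh(major, minor, ino, gen):
    return struct.pack(FH_FORMAT, major, minor, ino, gen)


def is_stale(fh, inode_table):
    """inode_table maps inode number -> current generation number."""
    _major, _minor, ino, gen = struct.unpack(FH_FORMAT, fh)
    return inode_table.get(ino) != gen


inode_table = {42: 1}
old_fh = make_fh(8, 1, 42, 1)
stale_before_reuse = is_stale(old_fh, inode_table)

inode_table[42] = 2  # inode 42 was freed and reused for a new file
new_fh = make_fh(8, 1, 42, 2)
```

After the reuse, the server can reject old_fh instead of silently giving the client access to the new file.
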
Security

• NFSv2 uses UNIX-style permission checks
• The client passes uid/gid info in RPCs, and the server performs permission checks as if the user was performing the operation locally
• Problem – the mapping from uid/gid to user must be the same on the client and server
    • Can be solved via Network Information Service (NIS)
• Another problem – should root on the client have root access to files on the server?
    • Server specifies policy

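One common server policy for the root question is to map client root to an unprivileged account before checking permissions (Linux exposes this as the root_squash export option). A minimal sketch, assuming 65534 as the unprivileged id and using a deliberately simplified write check that ignores group bits; both functions are hypothetical.

```python
NOBODY = 65534  # assumed id of the unprivileged "nobody" account


def effective_ids(uid, gid, root_squash=True):
    """Credentials the server uses for permission checks on a request."""
    if root_squash and uid == 0:
        return NOBODY, NOBODY
    return uid, gid


def may_write(file_owner, file_mode, uid):
    """Simplified UNIX check: owner and other write bits only, no groups."""
    if uid == 0:
        return True
    if uid == file_owner:
        return bool(file_mode & 0o200)
    return bool(file_mode & 0o002)


squashed_uid, _ = effective_ids(0, 0)
root_uid, _ = effective_ids(0, 0, root_squash=False)
squashed_can_write = may_write(1000, 0o644, squashed_uid)
root_can_write = may_write(1000, 0o644, root_uid)
```
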
Cache Consistency Problems

• Clients use caching and write buffering to improve performance, but this causes issues
    • Problem: update visibility. If client C1 buffers writes in its cache, client C2 will see the old version.
        • NFSv2 solution: close-to-open consistency. Clients flush on close(), so other clients will see the latest version on open().
    • Problem: stale cache. If C1 has a file cached, it will see old data even if the file is updated by C2.
        • NFSv2 solution: send a GETATTR and check the file's modification time to see if it has been updated. Cache attributes for a few seconds to reduce the number of GETATTR calls.

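The stale-cache check can be sketched as a small client-side cache that revalidates with GETATTR only after an attribute timeout. Everything here is illustrative: CachedFile, the two callbacks standing in for RPCs, and the 3-second timeout are assumptions, and time is passed in explicitly so the behavior is easy to follow.

```python
class CachedFile:
    def __init__(self, getattr_rpc, read_rpc, attr_timeout=3.0):
        self.getattr_rpc = getattr_rpc  # returns the file's modification time
        self.read_rpc = read_rpc        # returns the file's data
        self.attr_timeout = attr_timeout
        self.data = None
        self.mtime = None
        self.checked_at = None

    def read(self, now):
        fresh = (self.checked_at is not None
                 and now - self.checked_at < self.attr_timeout)
        if not fresh:
            mtime = self.getattr_rpc()
            self.checked_at = now
            if mtime != self.mtime:  # file changed on the server: refetch
                self.data = self.read_rpc()
                self.mtime = mtime
        return self.data


server = {"mtime": 100, "data": b"v1"}
f = CachedFile(lambda: server["mtime"], lambda: server["data"])
r1 = f.read(now=0.0)                  # first access: fetch from server
server.update(mtime=101, data=b"v2")  # another client updates the file
r2 = f.read(now=1.0)                  # attributes still cached: old data
r3 = f.read(now=5.0)                  # timeout expired: change is noticed
```

The second read shows the window in which a client can still see old data, which is the price paid for fewer GETATTR calls.
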
Strong Semantics for Write

• Because the NFS server is stateless, when servicing an NFS request it must commit any modified data to stable storage before returning results
• The implication for UNIX based servers is that requests which modify the file system must flush all modified data & metadata to disk before returning from the call
• This can be a big performance bottleneck unless something is done to improve write performance (e.g., NetApp's WAFL file system)

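On a UNIX-like server, "commit before replying" roughly corresponds to calling fsync before the write handler returns. The sketch below is an assumption about how a user-space handler might look, not code from any NFS server; stable_write is a hypothetical name, and os.pwrite is only available on UNIX-like systems.

```python
import os
import tempfile


def stable_write(path, offset, data):
    """Write data at offset and flush it to stable storage before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        written = os.pwrite(fd, data, offset)
        os.fsync(fd)  # block until the kernel reports the data is on disk
    finally:
        os.close(fd)
    return written


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "file.bin")
    n = stable_write(target, 0, b"hello")
    stable_write(target, 5, b" world")
    with open(target, "rb") as fobj:
        content = fobj.read()
```

Every call pays for a synchronous flush, which is where the bottleneck described above comes from.
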
Major Changes from NFSv2 to v3

• Sizes and offsets are widened from 32 bits to 64 bits
• A new COMMIT RPC allows for reliable asynchronous writes
• A new ACCESS RPC improves support for ACLs and super-user
• All operations now return attributes to reduce the number of subsequent GETATTR procedure calls
• The 8KB data size limitation on the READ and WRITE procedures is relaxed
• A new READDIRPLUS RPC returns both file handle and attributes to eliminate LOOKUP calls when scanning a directory

Asynchronous Writes

• In NFSv3, the server can reply to WRITE RPCs immediately, without syncing to disk
• When the client wants to ensure that data is on stable storage, it sends a COMMIT RPC
• Asynchronous writes are optional, and negotiated at mount time

Asynchronous Writes: Crash Recovery

• The client must keep all uncommitted data in case of a server crash
• Replies for WRITE and COMMIT RPCs include a write verifier for server crash detection
• Write verifier: 8-byte value that the server must change if it crashes (many use boot time)
• The client must save verifiers returned by async WRITE RPCs, and compare them to the verifier returned by a later COMMIT RPC
• If the verifiers don't match, the client must rewrite all uncommitted data

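The verifier comparison can be illustrated with a toy client and server. The classes below are hypothetical and heavily simplified (an integer counter instead of an 8-byte verifier, dictionaries instead of files), but they follow the rule stated above: if the COMMIT verifier differs from the one returned by an earlier WRITE, the client resends its uncommitted data.

```python
class ToyServer:
    def __init__(self):
        self.verifier = 1   # changes on every simulated reboot
        self.unstable = {}  # offset -> data held only in memory
        self.stable = {}    # offset -> data on "disk"

    def write(self, offset, data):
        self.unstable[offset] = data
        return self.verifier

    def commit(self):
        self.stable.update(self.unstable)
        self.unstable.clear()
        return self.verifier

    def crash_and_reboot(self):
        self.unstable.clear()  # uncommitted data is lost
        self.verifier += 1


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.pending = {}  # offset -> (data, verifier from the WRITE reply)

    def write(self, offset, data):
        self.pending[offset] = (data, self.server.write(offset, data))

    def commit(self):
        while self.pending:
            verifier = self.server.commit()
            if all(v == verifier for _, v in self.pending.values()):
                self.pending.clear()
            else:
                # Server rebooted since some WRITE: resend all uncommitted data.
                for offset, (data, _) in list(self.pending.items()):
                    self.write(offset, data)


server = ToyServer()
client = ToyClient(server)
client.write(0, b"a")
client.write(4096, b"b")
server.crash_and_reboot()
client.commit()
```
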
Additional Goals for NFSv4

• Improved access and good performance on the Internet
    • Only TCP
    • Easy to transit firewalls: uses one port (mount & lock protocols merged into NFS)
    • COMPOUNDs, delegations, uid/gid issue resolved
• Strong security with negotiation built in
• Better cross-platform interoperability
• Better extensibility
    • New security types, new attributes, etc.
• Big design change – NFSv4 is stateful

Security

• For previous versions, only UNIX permissions were widely adopted
• NFSv4 mandates the use of strong RPC security flavors that depend on cryptography
• Security type negotiation is done securely and in-band
• Users and groups are identified with strings, not numbers
• Access control policies compatible with both UNIX and Windows
• The problematic MOUNT protocol is removed

RPCSEC_GSS

• A framework adopted by NFSv4 to provide authentication, integrity, and privacy at the RPC level
• The following mechanisms must be implemented: Kerberos v5, LIPKEY, SPKM3
• Security options are negotiated at mount time
• The SECINFO operation allows a client to determine the security policy (usually on mount, but can be on a per-filehandle basis)
• RPCSEC_GSS can be used with previous versions of NFS, but in NFSv4 support is mandatory

Identifying Users

• In v2 and v3, users and groups were represented as integers
• This required all clients and the server to agree on user and group assignments - not practical (especially over the Internet)
• NFSv4 uses strings ‘user@domain’ and ‘group@domain’, where domain represents a registered DNS domain or a sub-domain
• On Linux, idmapd translates NFSv4 IDs

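The translation between 'user@domain' strings and local numeric ids can be sketched as a pair of lookups. This is only an illustration of the idea and says nothing about how idmapd is actually implemented or configured; the domain, the account table, and the fallback id 65534 are all made up for the example.

```python
LOCAL_DOMAIN = "example.com"             # hypothetical NFSv4 domain
PASSWD = {"avishay": 1000, "bob": 1001}  # hypothetical local accounts
NOBODY = 65534                           # assumed fallback id


def name_to_uid(owner):
    """Map an NFSv4 'user@domain' string to a local numeric uid."""
    user, sep, domain = owner.rpartition("@")
    if not sep or domain != LOCAL_DOMAIN:
        return NOBODY
    return PASSWD.get(user, NOBODY)


def uid_to_name(uid):
    """Map a local numeric uid to the string sent on the wire."""
    for user, known_uid in PASSWD.items():
        if known_uid == uid:
            return f"{user}@{LOCAL_DOMAIN}"
    return f"nobody@{LOCAL_DOMAIN}"


bob_uid = name_to_uid("bob@example.com")
foreign_uid = name_to_uid("bob@other.org")
```

The client and server no longer need matching uid tables; each side only has to map names within the shared domain to its own local ids.
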
COMPOUND Procedure

• NFSv4 has 2 procedures: NULL and COMPOUND
• The COMPOUND procedure can contain several operations (similar to previous NFS procedures)
• Possible example: {LOOKUP, OPEN, READ}
• Operations are evaluated in order, and each can have a return value
• If an operation fails, the server stops evaluating the COMPOUND and returns

Filehandles

• Current filehandle: used by most operations
• Saved filehandle: used as an additional operand
• Example from Linux #1: WRITE request
    • PUTFH(fh): set CURFH to the target file
    • WRITE: write the data to the current file
    • GETATTR: get attributes for the current file
• Example from Linux #2: CREATE request
    • PUTFH(dirfh): set CURFH to the directory
    • SAVEFH: save CURFH (SAVEDFH=CURFH)
    • CREATE: create the file (CURFH=NEWFH)
    • GETFH: return CURFH to the client
    • GETATTR: get the attributes of the new file
    • RESTOREFH: (CURFH=SAVEDFH)
    • GETATTR: get the attributes of the directory

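The interplay of COMPOUND evaluation with the current and saved filehandles can be sketched as a tiny interpreter. This is a toy under stated assumptions: run_compound, the tuple encoding of operations, the string filehandles, and the attribute dictionaries are invented for illustration, and only a handful of operations are modeled.

```python
def run_compound(ops, files):
    """Evaluate ops in order and stop at the first failure.

    files maps filehandle -> attribute dict (a stand-in for server state).
    Returns a list of (op_name, status, result) tuples.
    """
    curfh = savedfh = None
    results = []
    for name, *args in ops:
        status, result = "OK", None
        if name == "PUTFH":
            curfh = args[0]
        elif name == "SAVEFH":
            savedfh = curfh
        elif name == "RESTOREFH":
            curfh = savedfh
        elif name == "GETFH":
            result = curfh
        elif name == "CREATE":
            newfh = f"{curfh}/{args[0]}"
            files[newfh] = {"name": args[0]}
            curfh = newfh  # CURFH now refers to the new object
        elif name == "GETATTR":
            if curfh in files:
                result = files[curfh]
            else:
                status = "NFS4ERR_STALE"
        else:
            status = "NFS4ERR_NOTSUPP"
        results.append((name, status, result))
        if status != "OK":
            break  # remaining operations are not evaluated
    return results


files = {"dirfh": {"name": "dir"}}
ok = run_compound(
    [("PUTFH", "dirfh"), ("SAVEFH",), ("CREATE", "new"), ("GETFH",),
     ("GETATTR",), ("RESTOREFH",), ("GETATTR",)],
    files,
)
bad = run_compound([("PUTFH", "bogus"), ("GETATTR",), ("GETFH",)], files)
```

The first call mirrors the CREATE example above; the second shows evaluation stopping at the failing GETATTR, so the trailing GETFH never runs.
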
Some Differences in Operations

• CREATE now creates directories, symbolic links, and special files; regular files are created with OPEN
• LOOKUPP was introduced to look up the parent directory – no special meaning for ‘.’ and ‘..’ as in previous NFS versions (better cross-platform interoperability)
• READDIRPLUS removed - READDIR now returns requested attributes

Filehandle Types

• In previous NFS versions, the fhandle was valid for the lifetime of the file system object
• Now these fhandles are called “persistent filehandles”
• “Volatile filehandles” may become invalid, but the client is prepared to deal with these semantics

File System Migration/Replication

• Migration
    • The file system locations attribute provides a method for the client to probe the server about the location of a file system
    • In the event of a file system migration, the client will receive an error when operating on the file system and it can then query as to the new location
• Replication
    • The client is able to query the server for the multiple available locations of a particular file system
    • From this information, the client can use its own policies to access the appropriate file system location

Attribute Types

• Mandatory: minimal set of file or file system attributes that must be provided by the server
    • type, filehandle expiration type, change indicator, size, fsid, lease duration, etc.
• Recommended: represent different file system types and operating environments
    • case insensitive, hidden, max file size, max read size, max write size, UNIX mode bits, owner string, group string, modify/create/access time, etc.
• Named: Similar to extended attributes, implemented as hidden directories
• ACLs: implemented as recommended attribute

Pseudo Filesystems

• In NFSv4, the server presents a single seamless view of all the exported file systems to a client
• The client can use the fsid to notice changes
• mount -t nfs4 servername:/ /mnt/dir

Client Caching

• File, attribute, and directory caching is similar to previous versions: clients determine what to cache and for how long, and when to see if an update occurred
• Close-to-open consistency
    • Client checks if cached data is valid on OPEN
    • Client writes modified data on CLOSE
    • Sufficient for most applications and users

Leases

• A lease is a time-bounded grant of control of the state of a file, from the server to the client (lock or delegation)
• During a lease interval a server may not grant conflicting control to another client
• The client may assume that a lock granted by the server will remain valid for a fixed (server-specified) interval and is subject to renewal by the client
• The client is responsible for refreshing the lease
• If the lease interval expires without a refresh from the client, the server assumes the client has failed and may allow other clients to acquire the same lock
• If the server fails, on reboot the server waits a duration equal to a lease interval for clients to reclaim the locks that they may still hold, before allowing any new lock requests

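The lease rules above can be sketched as a small server-side table. Several simplifications are assumed: locks cover whole files rather than byte ranges, the 90-second lease time is an arbitrary choice, time is passed in explicitly, and the post-reboot reclaim period is not modeled. LeaseTable and its methods are hypothetical names.

```python
class LeaseTable:
    def __init__(self, lease_time=90.0):
        self.lease_time = lease_time
        self.expires = {}  # client id -> time at which its lease runs out
        self.locks = {}    # file path -> client id holding the lock

    def renew(self, client, now):
        # One renewal covers every lock the client holds on this server.
        self.expires[client] = now + self.lease_time

    def lock(self, client, path, now):
        holder = self.locks.get(path)
        if holder is not None and holder != client:
            if self.expires.get(holder, 0.0) > now:
                return False           # holder's lease is still valid
            self._drop_client(holder)  # lease expired: assume client failed
        self.locks[path] = client
        self.renew(client, now)
        return True

    def _drop_client(self, client):
        self.locks = {p: c for p, c in self.locks.items() if c != client}
        self.expires.pop(client, None)


table = LeaseTable(lease_time=90.0)
a = table.lock("c1", "/f", now=0.0)
b = table.lock("c2", "/f", now=30.0)   # conflicts with c1's valid lease
table.renew("c1", now=60.0)            # c1's lease now runs until t=150
c = table.lock("c2", "/f", now=120.0)  # still within the renewed lease
d = table.lock("c2", "/f", now=200.0)  # c1 never renewed again: lease expired
```
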
File Locking

• Support for byte-range file locking is part of the protocol
• Lease-based model: lease state is stored on the server
• Clients must either explicitly renew leases (RENEW), or implicitly renew them (usually READ)
• A refresh of any lock by the client validates all locks held by the client to a particular server (reduces the number of refreshes)

Delegations

• The server may grant a read or write delegation for a file to a client
    • Read delegation: client is assured that no other client will write to the file for the duration of the delegation
    • Write delegation: like read delegation, but other clients may not read or write
• Delegations may be recalled using a callback
    • A callback is a server → client RPC
    • A client must support callbacks in order to get a delegation - tested with CB_NULL request
• Delegations allow clients to service operations like OPEN, CLOSE, LOCK, READ, WRITE without immediate interaction with the server

Parallel NFS (pNFS)

• Clients may now access storage devices directly and in parallel
• Eliminates the classic NFS bottleneck of having only one server
• The management protocol is NFSv4.1
• The data protocol can be NFSv4.1, OSD, or FC

Other NFSv4.1 Highlights

• Sessions
    • Session layer on top of the transport layer
    • Solves many issues with dropped connections, as well as client and server crashes
• Delegation support for directories

References

• http://pages.cs.wisc.edu/~cs537-1/notes/34_file-nfs.pdf
• RFC1094 - NFS version 2
• RFC1813 - NFS version 3
• RFC1831 - RPC: Remote Procedure Call Protocol Specification Version 2
• RFC1832 - XDR: External Data Representation Standard
• RFC1964 - The Kerberos Version 5 GSS-API Mechanism
• RFC2025 - The Simple Public-Key GSS-API Mechanism (SPKM)
• RFC2054 - WebNFS Client Specification
• RFC2055 - WebNFS Server Specification
• RFC2203 - RPCSEC_GSS Protocol Specification
• RFC2224 - NFS URL Scheme
• RFC2581 - TCP Congestion Control
• RFC2623 - NFS Version 2 and Version 3 Security Issues and the NFS Protocol's Use of RPCSEC_GSS and Kerberos V5
• RFC2624 - NFS Version 4 Design Considerations
• RFC2755 - Security Negotiation for WebNFS
• RFC2743 - Generic Security Service Application Program Interface, Version 2, Update 1
• RFC2847 - LIPKEY - A Low Infrastructure Public Key Mechanism Using SPKM
• RFC3010 - NFS version 4 Protocol (Obsoleted by RFC3530)
• RFC3530 - NFS version 4 Protocol
• RFC5661 - NFS version 4 Minor Version 1 Protocol