File Systems 2023, Part 1
• PART 1:
• Secondary storage devices
• Types of storage devices
• Disk structure
• Disk formatting
• Files
• File systems
• Definition of “file system”
• Address mapping
• Strategies for allocating disk space to files
• PART 2:
• Windows file systems
• FAT12, FAT16, FAT32
• NTFS
• Linux file systems
• Linux file structure on disk
• ext2
• Mounting a file system in Linux and Windows
• The boot sequence
Hard disk drives
• An HDD is a set of spinning platters of magnetically coated material under moving read-write heads
• Components:
• Platters
• Arms
• Read-write heads
• Tracks
• Cylinders
• Sectors
[Figure: timeline of a disk request. Labels: Queue (device driver, software), Controller (hardware), Media time (Seek + Rot + Xfer), Result]
Example of current HDDs
• Seagate Exos X18 (2020)
• 18 TB hard disk
• 9 platters, 18 heads
• Helium-filled to reduce friction and power consumption
• 4.16 ms average seek time
• 4096-byte physical sectors
• 7200 RPM
• 6 Gb/s SATA or 12 Gb/s SAS interface
• 270 MB/s maximum transfer rate
• Cache size: 256 MB
• Price: $562 (~$0.03/GB)
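The Seek + Rot + Xfer breakdown can be applied to the figures listed above. The following is a rough sketch only: it takes the listed 4.16 ms as the average seek time, assumes the average rotational delay is half a revolution, assumes 1 MB = 10^6 bytes, and ignores queueing and controller overhead. All function and constant names are illustrative.

```python
# Rough estimate of the media time for one random 4096-byte read,
# using the drive figures listed above (seek + rotation + transfer).

SEEK_MS = 4.16             # average seek time, as listed above
RPM = 7200                 # spindle speed
TRANSFER_MB_PER_S = 270    # maximum transfer rate (assuming 1 MB = 10**6 bytes)
SECTOR_BYTES = 4096        # one physical sector


def average_rotation_ms(rpm: float) -> float:
    """On average the target sector is half a revolution away."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def transfer_ms(num_bytes: int, mb_per_s: float) -> float:
    """Time to stream num_bytes at the given transfer rate."""
    return num_bytes / (mb_per_s * 1_000_000) * 1000


rotation = average_rotation_ms(RPM)
xfer = transfer_ms(SECTOR_BYTES, TRANSFER_MB_PER_S)
total = SEEK_MS + rotation + xfer

print(f"rotation: {rotation:.2f} ms")
print(f"transfer: {xfer:.3f} ms")
print(f"total:    {total:.2f} ms")
```

Under these assumptions the positioning delays (seek and rotation) dominate, and the transfer of a single sector is almost negligible by comparison.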
HDD                                     | SSD
Requires seek + rotation                | No seeks
Not parallel (one head)                 | Parallel
Brittle (moving parts)                  | No moving parts
Random reads take tens of milliseconds  | Random reads take tens of microseconds
Slow (mechanical)                       | Wears out
Cheap, large storage                    | Expensive, smaller storage
Storing information
• Applications can store information in a process address space
• This is a bad idea. Why?
• Storage size is limited to the size of the virtual address space
• This may not be sufficient for large applications such as banking
• The application's data is lost when the process exits or when the computer crashes
• Multiple processes might want to access the same data, but cannot, because each address space is private to its process
• Instead, we want to store very large amounts of data that survive the processes that created them and that multiple processes can access concurrently
• Solution:
• Store information on disks in units called files
• Files are persistent, and only their owner can explicitly delete them
• Files are managed by the OS. How? Through the file system, the part of the OS that manages files
Files
• Files are logical units of information created by processes.
• When a process creates a file, it gives the file a name.
• When the process terminates, the file continues to exist and
can be accessed by other processes using its name.
• So, files provide a way to store information on the disk and
read it back later.
File naming
• The exact rules for file naming vary among file systems,
but all current file systems allow at least strings of one to
eight characters as legal file names.
• andrea, bruce, and cathy are possible file names.
• Digits and special characters are also permitted, like 2,
urgent!, and Fig.2-14
• Many file systems support names as long as 255
characters.
• Some file systems distinguish between upper- and
lowercase letters (UNIX), whereas others do not (MS-DOS)
File extensions
• Many file systems support two-part file names, with the two
parts separated by a period, as in prog.c.
• The part following the period is called the file extension and
usually indicates something about the file.
• In MS-DOS, for example, file names are 1 to 8 characters, plus
an optional extension of 1 to 3 characters.
• In UNIX, the size of the extension, if any, is up to the user, and
a file may even have two or more extensions:
homepage.html.zip
• UNIX file extensions are just conventions and are not enforced
by the operating system.
• However, a C compiler might insist on its own extensions (for example, .c for source files); such conventions are useful to the compiler
• Windows is aware of the extensions and assigns meaning to
them. When a user double clicks on a file name, the program
assigned to its file extension is launched with the file as a
parameter.
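As a small illustration of two-part names and multiple extensions, the sketch below splits the example names from this slide using Python's standard pathlib module. The helper fits_ms_dos_8_3 is hypothetical and deliberately simplified: it checks only the 1 to 8 plus 1 to 3 length rule stated above, not which characters MS-DOS allows.

```python
from pathlib import PurePosixPath

# A UNIX-style name may carry several extensions; the OS does not enforce them.
name = PurePosixPath("homepage.html.zip")
last_extension = name.suffix       # the part after the final period
all_extensions = name.suffixes     # every extension, in order


def fits_ms_dos_8_3(filename: str) -> bool:
    """Hypothetical check of the MS-DOS length rule only:
    1-8 characters, plus an optional extension of 1-3 characters."""
    base, dot, ext = filename.partition(".")
    if "." in ext:                 # more than one period is not allowed
        return False
    if not 1 <= len(base) <= 8:
        return False
    if dot and not 1 <= len(ext) <= 3:
        return False
    return True


print(last_extension, all_extensions)
print(fits_ms_dos_8_3("prog.c"), fits_ms_dos_8_3("homepage.html.zip"))
```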
Typical file extensions
Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639
File access
• Sequential access:
• read all bytes/records in order, starting from the beginning
• cannot jump around, but can rewind or back up
• convenient when the medium was magnetic tape
• Random access:
• bytes/records read in any order
• read can be …
• move file marker (seek), then read or …
• read and then move file marker
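The two access styles can be sketched with an ordinary file. This is a minimal example, assuming fixed-size 4-byte records; the file name and record contents are made up for illustration.

```python
import os
import tempfile

RECORD_SIZE = 4  # assumed fixed record length, for illustration only

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.bin")
    with open(path, "wb") as f:
        f.write(b"AAAABBBBCCCCDDDD")          # four 4-byte records

    with open(path, "rb") as f:
        # Sequential access: read every record from the beginning, in order.
        sequential = [f.read(RECORD_SIZE) for _ in range(4)]

        # Rewinding is allowed even with purely sequential access.
        f.seek(0)
        first_again = f.read(RECORD_SIZE)

        # Random access: move the file marker (seek), then read.
        f.seek(2 * RECORD_SIZE)
        third = f.read(RECORD_SIZE)
        marker_after = f.tell()               # the read advanced the marker

print(sequential, first_again, third, marker_after)
```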
File attributes
• Every file has a name and its data.
• In addition, all file systems associate other information with each file, such as the date and time the file was last modified and the file's size.
• These extra items are the file's attributes, or metadata.
• The list of attributes varies considerably from system to system.
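As an illustration, the two attributes mentioned above (size and last-modified time) can be read through Python's standard os.stat call. This is only a sketch; the file name is made up, and which other attributes are available depends on the operating system and file system.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w") as f:
        f.write("hello")

    info = os.stat(path)                 # ask the OS for the file's metadata
    size_bytes = info.st_size            # file size in bytes
    modified = time.ctime(info.st_mtime) # last-modified time, human readable

print(size_bytes, modified)
```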
Basic file operations
• Create a file
• Write to a file
• Read from a file
• Seek to somewhere in a file
• Delete a file
• Truncate a file
• Rename a file
• Append to a file
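The operations listed above map fairly directly onto calls offered by most programming environments. The sketch below walks through them once using Python's standard library; the file names are hypothetical, and the mapping to system calls is simplified.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    old_name = os.path.join(tmp, "draft.txt")
    new_name = os.path.join(tmp, "final.txt")

    with open(old_name, "wb") as f:      # create + write
        f.write(b"hello world")

    with open(old_name, "ab") as f:      # append
        f.write(b"!!")

    with open(old_name, "rb") as f:      # seek + read
        f.seek(6)
        tail = f.read()

    os.truncate(old_name, 5)             # truncate to the first 5 bytes
    os.rename(old_name, new_name)        # rename

    with open(new_name, "rb") as f:
        after_truncate = f.read()

    os.remove(new_name)                  # delete
    still_exists = os.path.exists(new_name)

print(tail, after_truncate, still_exists)
```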
Directories
• To keep track of files, file systems normally have directories,
which are themselves files.
• A directory is a file with a special structure
• Two types of directories: Single-Level Directory and
Hierarchical directory
• Single-Level Directory Systems: one directory containing all
the files
• Was common in early personal computers
• Pros: simplicity, ability to quickly locate files
• Cons: inconvenient naming (uniqueness, all files must have
different names)
Hierarchical directory systems
• It is a tree where leaves are data-files and internal nodes are
directory-files
• Each directory-file contains entries that point to a mix of data-files and subdirectories
• The tree has a root node which is a directory, the root directory
• Solves name collisions: a file is identified by its whole path, so the same name can be reused in different directories
Path names
• When the file system is based on a directory tree, a file is
identified using its name and a path in the tree.
• Two different methods:
• Absolute path name: consists of the path from the root directory to the file, for example /usr/ast/mailbox
• Relative path name: used in conjunction with the working directory (also called the current directory). The path is interpreted starting from the working directory; for example, if the working directory is /usr/ast, the same file can be referenced simply as mailbox
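A relative path name is turned into an absolute one by joining it to the working directory. The sketch below does this with Python's standard posixpath module; the working directory /usr/ast and the relative name ../lib/dictionary are assumptions chosen for illustration, and no real files are accessed.

```python
import posixpath

working_directory = "/usr/ast"          # assumed current directory
absolute_name = "/usr/ast/mailbox"      # absolute: starts at the root
relative_name = "mailbox"               # relative: starts at the working directory

# Resolve the relative name against the working directory.
resolved = posixpath.normpath(posixpath.join(working_directory, relative_name))

# ".." refers to the parent directory, so relative names can also climb the tree.
sibling = posixpath.normpath(posixpath.join(working_directory, "../lib/dictionary"))

print(posixpath.isabs(absolute_name), posixpath.isabs(relative_name))
print(resolved, sibling)
```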
File system implementation
• The most important task of a file system is the
allocation of hard disk storage to files that are
created by processes.
• Lesser but still important tasks are to implement
the other operations on files such as Write, Read,
Delete, Rename a file, etc.
• 2- Use a bitmap. A disk with n blocks requires a bitmap with n bits.
• Free blocks are represented by 1s in the map, allocated blocks by 0s
• For a 1-TB disk with 1-KB blocks, we need about 1 billion bits for the map, which requires around 130,000 1-KB blocks to store
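The bitmap scheme and the arithmetic behind the 130,000-block figure can be sketched as follows. This is a toy in-memory model, not how any particular file system lays out its bitmap; the class and function names are hypothetical, and the size calculation assumes 1 TB = 2^40 bytes and 1 KB = 2^10 bytes.

```python
class FreeBitmap:
    """One bit per disk block: 1 = free, 0 = allocated."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        # Start with every block free (all bits set to 1).
        self.bits = bytearray(b"\xff" * ((num_blocks + 7) // 8))

    def is_free(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Return the lowest-numbered free block and mark it allocated."""
        for block in range(self.num_blocks):
            if self.is_free(block):
                self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
                return block
        raise RuntimeError("disk full")

    def release(self, block: int) -> None:
        """Mark a block as free again."""
        self.bits[block // 8] |= 1 << (block % 8)


def bitmap_blocks_needed(disk_bytes: int, block_bytes: int) -> int:
    """Number of disk blocks needed to store the bitmap itself."""
    num_blocks = disk_bytes // block_bytes     # one bit per block
    bits_per_block = block_bytes * 8
    return -(-num_blocks // bits_per_block)    # ceiling division


bitmap = FreeBitmap(10)
first = bitmap.allocate()
second = bitmap.allocate()
bitmap.release(first)
reused = bitmap.allocate()

blocks_for_map = bitmap_blocks_needed(2**40, 2**10)
print(first, second, reused, blocks_for_map)
```

With these assumptions the map has 2^30 bits and occupies 131,072 blocks, which matches the "around 130,000" figure above.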
Allocation of blocks to files
• After the superblock and the management of free blocks in the
file system, come strategies to allocate free blocks to files.
Examples:
• i-nodes (Unix), an array of data structures, one per file,
telling all about the file
• FAT (MS-DOS, early Windows versions), file allocation table
• NTFS (Windows)
• The root directory, which contains the top of the file-system tree.
• The remainder of the disk, which contains all the other directories and files.
File storage allocation methods
• Files are stored in blocks (sectors) of the disk, so there must be a method for allocating blocks to files (similar to allocating main memory to processes)
• Allocation methods:
• Contiguous blocks
• Linked list of blocks
• Linked list using table
• I-nodes
Contiguous allocation
• The simplest allocation scheme is to store each file as a contiguous
sequence of disk blocks.
• With 1-KB blocks, a 50-KB file would be allocated 50 consecutive
blocks
• The directory entry of each file only needs to keep the address of the first block and the number of blocks in the file
• Drawback: over time, the disk becomes fragmented.
• Contiguous allocation
• At file creation time, a sequence of free blocks is allocated; only the address of the first block needs to be remembered
• The file cannot grow beyond that size
• Fragmentation is a problem
• Free list
• Allocation may be by first fit or best fit
• Requires periodic compaction
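First-fit contiguous allocation, and the fragmentation it leads to, can be sketched on a toy disk. This is a minimal model under stated assumptions: the disk is a Python list of booleans (True = free), and the function names first_fit, allocate_contiguous, and release are hypothetical.

```python
from typing import List, Optional


def first_fit(free: List[bool], length: int) -> Optional[int]:
    """Return the start of the first run of `length` free blocks, or None."""
    if length <= 0:
        raise ValueError("length must be positive")
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0
    return None


def allocate_contiguous(free: List[bool], length: int) -> Optional[int]:
    """Allocate `length` consecutive blocks; return the first block address."""
    start = first_fit(free, length)
    if start is None:
        return None
    for i in range(start, start + length):
        free[i] = False
    return start


def release(free: List[bool], start: int, length: int) -> None:
    """Free a previously allocated run of blocks."""
    for i in range(start, start + length):
        free[i] = True


disk = [True] * 10                       # a 10-block disk, all free
a = allocate_contiguous(disk, 3)         # blocks 0-2
b = allocate_contiguous(disk, 3)         # blocks 3-5
c = allocate_contiguous(disk, 3)         # blocks 6-8
release(disk, a, 3)
release(disk, c, 3)

free_blocks = sum(disk)                  # total free space
d = allocate_contiguous(disk, 5)         # fails: no run of 5 free blocks

print(a, b, c, free_blocks, d)
```

In this run seven blocks are free in total, yet a 5-block file cannot be placed because the free space is split into runs of 3 and 4 blocks. That is the fragmentation problem that periodic compaction is meant to fix.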
Linked list allocation
[Figure: linked list allocation. Labels: Directory, Data blocks, 100]