
12 FileDirectory

Uploaded by

Karina Nathalie
Files and Directories

2024/25 COMP3230B
Contents

• Overview of storage disks

• Overview of file systems

• What is a file?

• What is a directory?

Related Learning Outcome

• ILO 2d - describe the principles and techniques used by the OS to support persistent data storage

Readings & References
• Required Reading
  • Chapter 39 – Interlude: Files and Directories
    • http://pages.cs.wisc.edu/~remzi/OSTEP/file-intro.pdf

• References
  • Chapter 36 – I/O Devices
    • http://pages.cs.wisc.edu/~remzi/OSTEP/file-devices.pdf
  • Chapter 37 – Hard Disk Drives
    • http://pages.cs.wisc.edu/~remzi/OSTEP/file-disks.pdf
  • Chapter 44 – Flash-based SSDs
    • http://pages.cs.wisc.edu/~remzi/OSTEP/file-ssd.pdf
  • How do SSDs work?
    • http://www.extremetech.com/extreme/210492-extremetech-explains-how-do-ssds-work

Secondary Storage
• Most secondary storage devices involve magnetic disks, which are random-access storage
  • Data can be accessed by the read-write head in any order

• Disk drives (magnetic or solid-state) belong to a class of storage devices called block devices
  • These devices present the storage space as a large one-dimensional array of logical disk blocks, where the logical block is the smallest unit of data transfer
  • A commonly used disk block size is 4 KiB
  • Logical blocks are addressed from 0 to N-1, where N is the number of logical blocks on the disk
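
The block addressing described above can be sketched in a few lines. This is only an illustration, assuming the 4 KiB block size mentioned on the slide; the function names `locate` and `blocks_touched` are made up for this example.

```python
BLOCK_SIZE = 4096  # bytes per logical block (4 KiB, as on the slide)


def locate(byte_offset, num_blocks):
    """Return (logical block number, offset inside that block) for a byte offset."""
    block, within = divmod(byte_offset, BLOCK_SIZE)
    if not 0 <= block < num_blocks:
        raise ValueError("offset lies outside the device")
    return block, within


def blocks_touched(byte_offset, length):
    """Return the range of logical blocks that a transfer of `length` bytes touches."""
    first = byte_offset // BLOCK_SIZE
    last = (byte_offset + length - 1) // BLOCK_SIZE
    return range(first, last + 1)
```

Even a 200-byte access that straddles a block boundary forces the device to transfer two whole blocks, because the block is the smallest unit of transfer.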

Physical layout of HDD
• A disk consists of a number of magnetic platters with recording surfaces on both sides
  • The platters rotate on a spindle
• Each surface is divided into a number of concentric tracks
• Each track is divided into a number of sectors
• Vertical sets of tracks form cylinders
Physical layout of HDD

• Each sector typically contains 512 bytes
• The unit of an I/O operation is a logical block, typically 4 KiB, which maps onto one or more sectors
  • A logical block number is converted into a cylinder #, a head #, and a sector #
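
A minimal sketch of this conversion, assuming an idealized geometry in which every track holds the same number of sectors. Real drives vary the sectors per track and hide the mapping inside the drive, so the parameters `heads` and `sectors_per_track` and both function names are illustrative assumptions.

```python
SECTOR_SIZE = 512                               # bytes per sector (from the slide)
BLOCK_SIZE = 4096                               # bytes per logical block (from the slide)
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE   # 8 sectors per logical block


def sector_to_chs(sector_index, heads, sectors_per_track):
    """Map a linear sector index to (cylinder, head, sector) in an idealized geometry."""
    cylinder, rest = divmod(sector_index, heads * sectors_per_track)
    head, sector = divmod(rest, sectors_per_track)
    return cylinder, head, sector + 1  # sectors are conventionally counted from 1


def block_to_chs(block_number, heads, sectors_per_track):
    """Return the (cylinder, head, sector) triple of every sector in one logical block."""
    first = block_number * SECTORS_PER_BLOCK
    return [sector_to_chs(first + i, heads, sectors_per_track)
            for i in range(SECTORS_PER_BLOCK)]
```

With 4 heads and 64 sectors per track, logical block 1 starts at cylinder 0, head 0, sector 9, directly after the eight sectors of block 0.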

Performance Characteristics of HDD
• The data in a particular disk sector can be read or written
• To access a data block
  • The disk arm must move the read-write head to the target track; the platter must then rotate until the target sector is under the head; finally, the data is read from or written to the disk
• Performance characteristics
  • Seek time
    • Time for the read-write head to move from its current location to the target track
    • Average seek time is typically a few milliseconds (roughly 4 to 10 ms on common drives)
  • Rotational latency
    • Time for the platter to rotate until the target sector is underneath the read-write head
    • Depends on the spinning rate; on average half a rotation, e.g. about 2 ms at 15,000 RPM
  • Transfer time
    • Time for the platter to rotate further so that the entire sector passes under the head and the data is transferred
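
The three components add up to the time of one disk access. The sketch below is a back-of-the-envelope model; the function names and the idea of passing a sustained transfer rate in MB/s are assumptions for illustration, not figures from the slides.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: half of one full rotation, in milliseconds."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_ms(num_bytes, transfer_rate_mb_per_s):
    """Time to transfer `num_bytes` at a sustained rate given in MB/s (10**6 bytes/s)."""
    return num_bytes / (transfer_rate_mb_per_s * 1_000_000) * 1000.0


def access_time_ms(seek_ms, rpm, num_bytes, transfer_rate_mb_per_s):
    """Total access time = seek time + rotational latency + transfer time."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, transfer_rate_mb_per_s))
```

For a single 4 KiB block the transfer term is tiny compared with the two mechanical terms, which is why reducing head movement matters so much for small random accesses.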

Performance Considerations of HDD
• Ways to improve disk I/O performance
  • Disk scheduling
    • In a multiprogramming environment, multiple processes can generate I/O requests at the same time, so several pending requests may be queued up at the disk queue
    • Which request should the system serve first?
    • Because of the high cost of I/O, the OS has historically played a role in deciding the order of I/Os issued to the disk
    • The goal is to complete the data transfers with the minimum mechanical motion, i.e. minimum seek time and rotational latency
  • Caching
    • A disk cache buffer (in main memory) is used to temporarily hold disk data
  • Defragmentation
    • Place related data in contiguous sectors
    • Decreases the number of seek operations required
  • Multiple disks
    • Disk I/O performance may be increased by spreading the operations over multiple disks
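
To illustrate why the order of requests matters for disk scheduling, the sketch below compares serving requests in arrival order (FCFS) with shortest-seek-time-first (SSTF). The slides do not name a particular policy; SSTF is brought in here only as one classic example, and the track numbers are a common textbook request queue rather than data from the slides.

```python
def fcfs_distance(start, requests):
    """Total head movement (in tracks) when requests are served in the given order."""
    total, pos = 0, start
    for track in requests:
        total += abs(track - pos)
        pos = track
    return total


def sstf_order(start, requests):
    """Reorder requests greedily: always serve the pending track closest to the head."""
    pending, pos, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda t: abs(t - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # pending track numbers (illustrative)
head = 53                                     # current head position (illustrative)
print(fcfs_distance(head, queue))                      # 640 tracks
print(fcfs_distance(head, sstf_order(head, queue)))    # 236 tracks
```

The greedy order cuts head movement by more than half in this example, although a pure SSTF policy can starve requests that are far from the head.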

Physical layout of SSD
• The SSD storage medium is called NAND flash memory
• Flash memory is non-volatile memory, organized as a grid of storage cells
  • Depending on the technology, each cell can store 1 bit, 2 bits, 3 bits, or even more
• A group of cells is organized into a “page”, which is the smallest structure that is readable / “writable” in an SSD
  • Today 4 KiB (or 8 KiB) pages are common on SSDs
• Pages are grouped together into blocks
  • It is common to have 128 pages in a block (512 KiB per block with 4 KiB pages)
(Source: The SSD Anthology)

Physical layout of SSD
• Blocks are then grouped into planes, and you’ll find multiple planes on a single NAND-flash chip
• A block is the smallest structure that can be erased in a NAND-flash device
  • You can read from and “write” to a page
  • But you cannot rewrite a page without first erasing the whole block (128 pages at a time!!)
  • This is where many of the SSD’s problems stem from
(Source: The SSD Anthology)
• Another issue is that frequently erasing and writing a page/block will cause it to wear out (after around 100,000 erase/write cycles)
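
The erase-before-rewrite rule can be made concrete with a toy model. This is a sketch under simplifying assumptions: the class `FlashBlock` and the helper `rewrite_page` are hypothetical, and a real SSD controller would normally redirect the update to a fresh page instead of erasing in place.

```python
PAGES_PER_BLOCK = 128  # pages per block, as on the slide


class FlashBlock:
    """Toy model: a page can be programmed once, then only a block erase resets it."""

    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK  # None marks an erased page
        self.erase_count = 0                   # wear indicator

    def program(self, page_no, data):
        if self.pages[page_no] is not None:
            raise RuntimeError("page already programmed; erase the block first")
        self.pages[page_no] = data

    def read(self, page_no):
        return self.pages[page_no]

    def erase(self):
        self.pages = [None] * PAGES_PER_BLOCK
        self.erase_count += 1


def rewrite_page(block, page_no, new_data):
    """Naive in-place update: save live pages, erase the block, program them back."""
    saved = list(block.pages)
    saved[page_no] = new_data
    block.erase()
    for i, data in enumerate(saved):
        if data is not None:
            block.program(i, data)
```

Changing a single page costs a whole-block erase plus re-programming every live page, and each such update adds one to the block's erase count.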
Performance Characteristics of SSD
• A flash-based SSD provides the standard block interface
• To access a logical data block (e.g. 4 KiB)
  • The built-in control logic turns the requests into low-level read, erase, and write commands on the underlying physical blocks and physical pages
• Performance characteristics
  • Read a physical page
    • Any location can be accessed with the same performance - random access device
    • Typically quite fast, around 10s of microseconds
  • Erase the whole block
    • Quite expensive, as it takes a few milliseconds
    • In addition, any data in the block that must be preserved has to be copied somewhere else before the erase
  • Write (program) a page
    • Usually takes around 100s of microseconds
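
A rough cost estimate shows how these latencies combine when one page is updated in place. The three constants are assumed values chosen inside the ranges quoted above, not measurements, and `naive_update_cost_us` is a hypothetical helper.

```python
READ_US = 25       # assumed page read latency ("10s of microseconds")
PROGRAM_US = 200   # assumed page program latency ("100s of microseconds")
ERASE_US = 2000    # assumed block erase latency ("a few milliseconds")


def naive_update_cost_us(live_pages):
    """Cost of updating one page in place when `live_pages` pages hold valid data.

    The other live pages are read out, the block is erased, and all live pages
    (including the updated one) are programmed again.
    """
    reads = (live_pages - 1) * READ_US
    programs = live_pages * PROGRAM_US
    return reads + ERASE_US + programs
```

Under these assumptions, updating one page of a full 128-page block costs about 30.8 ms, compared with 0.2 ms for programming a page that is already erased.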

Abstraction of Persistent Storage
• Two key abstractions – Files and Directories
• What is a file?
  • From a user’s perspective, it is a collection of related information recorded on persistent storage and given a human-readable name
  • From the system’s perspective, it is a linear array of bytes, grouped in (logical) blocks, stored somewhere, and identified by some kind of low-level id
    • In Unix systems, this low-level id is called the inode number
    • In Windows systems, it is called the file reference number
  • This low-level id leads us to a data structure where the attributes of the file are kept, e.g., the locations of the file’s data, ownership, etc.
• What is a directory?
  • Actually, it is a file, but its content is a mapping table that maps filenames (in that directory) to their low-level ids
    • One entry for each file in that directory; an entry can be a regular file or a directory file
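
On a Unix-like system the name-to-id mapping of a directory can be inspected directly. The sketch below uses Python's standard `os.scandir`; the helper name `directory_table` is made up, and the example assumes a file system that reports inode numbers.

```python
import os


def directory_table(path):
    """Return {name: low-level id} for every entry stored in the directory `path`."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


if __name__ == "__main__":
    # Print the mapping table of the current working directory.
    for name, inode in sorted(directory_table(".").items()):
        print(f"{inode:>12}  {name}")
```

The listing contains only names and ids; everything else about a file has to be fetched through its id.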

File Systems
• Files (including directories) are managed by the OS, and the part of the OS dealing with files is known as the file system
• File management
  • Providing services to users and applications in the use of files & directories
  • Users should be able to refer to their files by symbolic names rather than having to use physical device names and physical locations
• Storage management
  • Allocating space for files on storage devices
• File integrity
  • To guarantee, to the extent possible, that the data in the file are valid
• Security
  • Data stored in file systems should be subject to strict access controls

File Abstraction
• From a user’s standpoint, how to
  • locate the file, name the file, access the file, protect the file
• The file system represents a file as an abstract data type, which consists of a set of operations:
  • Open – associate the target file with the process, allowing the process to perform specific functions on the file
  • Close – the process can no longer perform functions on the file until it is reopened
  • Create – a new file is defined (space is allocated) and a new entry must be made in the directory
  • Delete – release all file space and erase the directory entry
  • Write – make updates to the file at the position given by the current file pointer
  • Read – copy data (starting from the location pointed to by the current file pointer) from a file to memory
  • ...
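
The operations listed above map closely onto the low-level calls exposed by Python's `os` module, which wrap the corresponding system calls. The function `demo_file_operations` and the file name `demo.txt` are illustrative only.

```python
import os


def demo_file_operations(directory):
    """Walk through create, open, write, read, close and delete on one file."""
    path = os.path.join(directory, "demo.txt")
    # Create + Open: make a new directory entry and get a file descriptor.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"hello world")      # Write: file pointer moves from 0 to 11
        os.lseek(fd, 6, os.SEEK_SET)      # reposition the file pointer
        tail = os.read(fd, 5)             # Read: 5 bytes starting at offset 6
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, b"HELLO")            # overwrite the first 5 bytes
        os.lseek(fd, 0, os.SEEK_SET)
        whole = os.read(fd, 100)
    finally:
        os.close(fd)                      # Close: the descriptor is no longer usable
    os.unlink(path)                       # Delete: remove the directory entry
    return tail, whole, os.path.exists(path)
```

Both reads and writes act at the current file pointer, which is why the sketch repositions it with `os.lseek` before each step.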

File Abstraction
• A file also has a set of attributes (metadata) associated with it:
  • Name – human-readable name
  • Low-level id – unique tag that identifies a file within the file system
  • Location – pointer to the storage locations of the file’s data on the device
  • Size – current file size
  • Accessibility – restrictions placed on access to file data
    • controls who can Read, Write, Execute
  • Time, date, and user identification – data for protection, security, and usage monitoring
  • ...
• Where is the metadata associated with each file stored?
  • Partly in the directory
  • Mostly in the file control block (FCB) of that file
    • In Unix, this is the inode; in Windows, this is the file record
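
Most of these attributes can be read back with a single `stat` call, which on Unix fetches them from the inode. The helper `describe` and its dictionary keys are names chosen for this sketch.

```python
import os
import stat


def describe(path):
    """Collect the attributes that the file control block keeps for `path`."""
    info = os.stat(path)
    return {
        "low_level_id": info.st_ino,                  # inode number on Unix
        "size": info.st_size,                         # current size in bytes
        "permissions": stat.filemode(info.st_mode),   # e.g. '-rw-r--r--'
        "owner_uid": info.st_uid,                     # user identification
        "modified": info.st_mtime,                    # last modification time
        "links": info.st_nlink,                       # directory entries referring to it
    }
```

The file name is not among the values returned by `os.stat`: it lives in the directory entry, which is the part of the metadata kept "partly in the directory".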

Directories
• As said, a directory is also a file
• The content of a directory can be seen as a symbol table, which associates the file and directory names within that directory with the corresponding directory entries
  • Each entry stores the low-level id and other information of the file or directory
• Operations on a directory
  • Search for a file
  • Create a directory
  • Delete a directory
  • List a directory
  • Rename a file
  • Traverse the file system

Name      Low-level ID
.         34
..        56
c0230a    123
c0234a    125

Directory Structure
• Hierarchically structured file system
  • By placing directories within other directories, we have a directory tree, where all files and directories are stored
• A file system starts at a root directory “/”
  • The root directory contains various directories in the directory hierarchy
• The full name of a file is usually formed as the pathname from the root directory to the file – absolute path name
  • e.g., /home/c3230a/src/ws5.cc
• Pros
  • File names need to be unique only within a given directory
  • Gives users more flexibility to name and group files
  • Efficient searching – simply traverse the path to locate the file

(Figure: example directory tree with levels / ; usr, home ; c3230a, c3230b ; bin, src, .bashrc ; ws5.cc)
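
Traversing a path amounts to one directory lookup per path component. The sketch below models each directory as a table from names to low-level ids; all the id numbers are hypothetical, and the tree layout is an assumption based on the example path /home/c3230a/src/ws5.cc.

```python
# Each directory (keyed by its own id) maps entry names to low-level ids.
# Ids that do not appear as keys (32, 40) stand for regular files.
DIRECTORIES = {
    2: {"usr": 10, "home": 11},                  # the root directory "/"
    10: {},                                      # /usr
    11: {"c3230a": 20, "c3230b": 21},            # /home
    20: {"bin": 30, "src": 31, ".bashrc": 32},   # /home/c3230a
    21: {},                                      # /home/c3230b
    30: {},                                      # /home/c3230a/bin
    31: {"ws5.cc": 40},                          # /home/c3230a/src
}
ROOT = 2


def resolve(path):
    """Walk an absolute path name one component at a time; return the final id."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path name")
    current = ROOT
    for name in path.split("/"):
        if not name:                  # skip the empty strings produced by "/"
            continue
        table = DIRECTORIES.get(current)
        if table is None:             # the current id is a regular file
            raise NotADirectoryError(path)
        if name not in table:
            raise FileNotFoundError(path)
        current = table[name]
    return current
```

Here `resolve("/home/c3230a/src/ws5.cc")` performs four lookups (home, c3230a, src, ws5.cc) and returns 40. The same file name could appear in another directory without conflict, because each directory has its own table.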

Directory Structure
• To avoid always navigating with absolute path names, the concept of a “working directory” (current directory) is used
  • Enables users to specify a pathname that does not begin at the root directory – relative path name
  • Absolute path (i.e., the path beginning at the root) = working directory + relative path name
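
The rule "absolute path = working directory + relative path name" can be tried out with Python's `posixpath` module. The helper name `to_absolute` is made up for this sketch.

```python
import posixpath


def to_absolute(working_directory, path):
    """Combine a working directory and a (possibly relative) path name."""
    if posixpath.isabs(path):
        return posixpath.normpath(path)   # already starts at the root
    return posixpath.normpath(posixpath.join(working_directory, path))


print(to_absolute("/home/c3230a", "src/ws5.cc"))   # /home/c3230a/src/ws5.cc
print(to_absolute("/home/c3230a", "../c3230b"))    # /home/c3230b
```

Note that `normpath` resolves ".." purely on the text of the path, so the result may differ from what the kernel would find if a component were a symbolic link.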

• Link: a mechanism to create another directory entry that refers to an existing file/directory in another location
  • Adv: Facilitates data sharing and can make it easier for users to access files located throughout a file system’s directory structure

Directory Structure
• Hard link: create another directory entry that maps to the same low-level id as the original file
  • Unix – ln target new
  • Windows – CreateHardLink
  • The file is not copied at all; the system just creates two directory entries at different locations that refer to the same inode (file control block)
• Removing one directory entry will not cause the file to be deleted
  • The system keeps track of how many different directory entries have been linked to the same low-level id – reference count
  • The system deletes the file only when the reference count reaches zero

(Figure: directory tree with levels / ; usr, home ; src, atctam, c3230a ; linux, bin, src, .bashrc, mandel.c, memscan.c ; linux, mandel.c, with one entry marked “Hard link”)
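
The reference-count behaviour can be observed with `os.link`, Python's wrapper around the hard-link system call. The sketch assumes a Unix-like file system that supports hard links; the function name and the file names are illustrative.

```python
import os


def hard_link_demo(directory):
    """Create a hard link, remove the original name, and report what happened."""
    original = os.path.join(directory, "mandel.c")
    alias = os.path.join(directory, "mandel_link.c")
    with open(original, "w") as f:
        f.write("int main(void) { return 0; }\n")

    os.link(original, alias)                     # second directory entry, same inode
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    count_before = os.stat(alias).st_nlink       # reference count with both names

    os.remove(original)                          # drops one directory entry only
    count_after = os.stat(alias).st_nlink
    with open(alias) as f:
        survived = f.read()                      # data is still reachable via the alias
    return same_inode, count_before, count_after, survived
```

The reference count goes from 2 to 1 when the original name is removed, and the data stays readable through the remaining name.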

Directory Structure
• Limitations of hard links
  • Can’t create a hard link to a directory
  • Can’t create a hard link to files in other disk partitions (i.e. another file system)
• Symbolic link: create another file that contains the pathname of the original file as its data
  • Unix – ln -s target new
  • Windows – mklink, Shortcut
  • A symbolic link is a special file type
  • Removing the original file causes the soft link to become invalid – dangling reference

(Figure: directory tree with levels / ; usr, home ; src, atctam, c3230a ; linux, bin, src, .bashrc, mandel.c, memscan.c ; linux, mandel.c, with one entry marked “Symbolic link”)

atctam@atctam-LinuxPC:~/src> ls -l linux
lrwxrwxrwx 1 atctam users 14 Nov 30 16:48 linux -> /usr/src/linux
atctam@atctam-LinuxPC:~/src> stat linux
File: `linux' -> `/usr/src/linux'
Size: 14 Blocks: 0 IO Block: 4096 symbolic link
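
A dangling reference is easy to reproduce with `os.symlink`. The sketch assumes a Unix-like system where unprivileged users may create symbolic links; the function name and the file names are illustrative.

```python
import os


def symlink_demo(directory):
    """Create a symbolic link, delete its target, and inspect the leftover link."""
    target = os.path.join(directory, "target.txt")
    link = os.path.join(directory, "link.txt")
    with open(target, "w") as f:
        f.write("data")

    os.symlink(target, link)                 # the link's data is the target's pathname
    stores_pathname = os.readlink(link) == target
    is_link = os.path.islink(link)

    os.remove(target)                        # the link itself is left untouched
    still_resolves = os.path.exists(link)    # follows the link, so this is now False
    link_remains = os.path.lexists(link)     # checks the link itself, so still True
    return stores_pathname, is_link, still_resolves, link_remains
```

Unlike a hard link, the symbolic link does not keep the file alive: once the target is gone, only the stored pathname remains.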

Summary

• Describe the basic structure of a storage disk
• Discuss a few factors that affect the performance of storage systems
• Understand the key concept of the file system – the FILE
  • What does a file consist of? How does a file provide persistent storage to the user?
• Describe the purpose of using a DIRECTORY
  • What is a directory? How is it structured? How is it related to files?

