Os Lesson 3 File Management
OKELLO
EMAIL: fredrick.ochieng@zetech.ac.ke
FOLLOWING COURSES:
DCS,DSE,DIT,DBIT
FILE MANAGEMENT
File system – concerned with managing secondary storage space, particularly disk storage. It consists of
A directory structure which organizes and provides information about all the files in the system.
File Naming
A file is named for convenience of its human users and a name is usually a string of characters e.g.
“good.c”. When a file is named it becomes independent of the process, the user and even the system that
created it i.e. another user may edit the same file and specify a different name.
File Types
An o/s should recognize and support different file types so that it can operate on each file in reasonable
ways.
File types can be implemented by including the type as part of the file name. The name is split into 2 parts – a
name and an extension, usually separated by a period character. The extension indicates the type of file and
Spreadsheet .xls
Archive .zip, .arc, .rar (compressed files)
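The name and extension split described above can be sketched in a few lines of Python. This is only an illustration: the table EXTENSION_TYPES and the function classify are hypothetical names, and the table holds just the extensions listed in these notes.

```python
import os.path

# Hypothetical lookup table; entries mirror the examples listed in these notes.
EXTENSION_TYPES = {
    ".xls": "Spreadsheet",
    ".zip": "Archive",
    ".arc": "Archive",
    ".rar": "Archive",
}


def classify(filename):
    """Split a file name at its last period and look up the extension."""
    name, extension = os.path.splitext(filename)
    # Extensions are compared in lower case so "A.ZIP" and "a.zip" match.
    file_type = EXTENSION_TYPES.get(extension.lower(), "Unknown")
    return name, extension, file_type
```

For example, classify("budget.xls") separates the name "budget" from the extension ".xls" and reports the type as "Spreadsheet", while an extension missing from the table is reported as "Unknown".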
File Attributes
Protection – Access control to information in file (who can read, write, execute and so on)
Time, Date and user identification – This information is kept for creation, last modification and
last use. These data can be useful for protection, security and usage monitoring.
Volatility – frequency with which additions and deletions are made to a file
Activity – Refers to the percentage of file’s records accessed during a given period of time.
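Some of these attributes can be read from a real file system. The sketch below uses Python's os.stat on a temporary file; the function name describe and the dictionary keys are illustrative choices, not part of the notes. Volatility and activity are not stored by typical file systems, so they are not shown.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Collect a few attributes the operating system keeps for a file."""
    info = os.stat(path)
    return {
        "protection": stat.filemode(info.st_mode),  # e.g. "-rw-------"
        "owner_id": info.st_uid,                    # user identification
        "size_bytes": info.st_size,
        "last_modified": time.ctime(info.st_mtime),
        "last_accessed": time.ctime(info.st_atime),
    }


# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    sample_path = handle.name

attributes = describe(sample_path)
os.remove(sample_path)
```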
File Operations
Individual data items within the file may be manipulated by operations like:
Truncating – delete some data items but file retains all other attributes
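As a small illustration of truncation, the sketch below writes three records to a temporary file and then truncates it so that only the first record remains. The record contents and variable names are made up for the example.

```python
import os
import tempfile

# Write three 8-byte records to a temporary file (binary mode keeps sizes exact).
with tempfile.NamedTemporaryFile(mode="wb", delete=False) as handle:
    handle.write(b"record1\nrecord2\nrecord3\n")
    path = handle.name

size_before = os.path.getsize(path)

# Truncate to 8 bytes: the file still exists under the same name,
# but everything after the first record is discarded.
with open(path, "r+b") as handle:
    handle.truncate(8)

size_after = os.path.getsize(path)
with open(path, "rb") as handle:
    remaining = handle.read()

os.remove(path)
```

The file keeps its name and other attributes; only its length (and therefore its contents beyond the cut point) changes.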
File Structure
Refers to the internal organization of the file. File types may indicate structure. Certain files must conform to
a required structure that is understood by the o/s. Some o/s have file systems that support multiple
structures while others impose (and support) a minimal number of file structures e.g. MS DOS and UNIX.
UNIX considers each file to be a sequence of 8-bit bytes. Macintosh o/s supports a minimal number of file
structures and it expects executables to contain 2 parts – a resource fork and a data fork. The resource fork
contains information of importance to the user e.g. labels of any buttons displayed by the program.
File Organization
Refers to the manner in which records of a file are arranged on secondary storage.
a) Sequential
Records placed in physical order. The next record is the one that physically follows the previous record. It
b) Direct
Records are directly (randomly) accessed by their physical addresses on a direct access storage device
(DASD). The application user places the records on the DASD in any order appropriate for a particular
application.
c) Indexed Sequential
Records are arranged in logical sequence according to a key contained in each record. The system
maintains an index containing the physical addresses of certain principal records. Indexed sequential
records may be accessed sequentially in key order or they may be accessed directly by a search through
d) Partitioned
It is a file of sequential sub files. Each sequential sub file is called a member. The starting address of each
member is stored in the file directory. Partitioned files are often used to store program libraries.
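The indexed sequential organization from (c) can be sketched as follows. This is a simplified in-memory model, not a real DASD layout: records are grouped into fixed-size blocks in key order, and the index holds the key of the first record of each block (standing in for the "principal records"). The names build, read_sequential and read_direct, the block size and the sample data are all assumptions made for illustration.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # records per block (assumed value)


def build(records, block_size=BLOCK_SIZE):
    """Arrange records in key order and index the first key of each block."""
    ordered = sorted(records)
    blocks = [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]
    index = [block[0][0] for block in blocks]
    return blocks, index


def read_sequential(blocks):
    """Visit every record in key order."""
    for block in blocks:
        for record in block:
            yield record


def read_direct(blocks, index, key):
    """Use the index to pick one block, then search only that block."""
    position = bisect_right(index, key) - 1
    if position < 0:
        return None  # key is smaller than every indexed key
    for record_key, value in blocks[position]:
        if record_key == key:
            return value
    return None


blocks, index = build([(21, "Dan"), (3, "Amy"), (52, "Gil"), (8, "Bob"),
                       (40, "Fay"), (15, "Cat"), (34, "Eve")])
```

Here the index contains the keys 3, 21 and 52, so a direct read of key 34 inspects only the second block instead of scanning the whole file.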
Deals with how to allocate disk space to different files so that the space is utilized effectively and files
1. Contiguous Allocation
Files are assigned to contiguous areas of secondary storage. A user specifies in advance the size of area
needed to hold a file to be created. If the desired amount of contiguous space is not available the file
cannot be created.
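A minimal sketch of contiguous allocation is given below. The disk is modelled as a list of booleans (True means the block is in use), and the search takes the first hole that is large enough. The first-fit choice, the function name allocate_contiguous and the sample disk are assumptions; the notes do not say which hole is chosen.

```python
def allocate_contiguous(disk, size):
    """Claim the first run of `size` free blocks; return its start or None."""
    run_start, run_length = 0, 0
    for block, in_use in enumerate(disk):
        if in_use:
            run_length = 0  # the current hole ended before it was big enough
            continue
        if run_length == 0:
            run_start = block
        run_length += 1
        if run_length == size:
            for i in range(run_start, run_start + size):
                disk[i] = True  # mark the blocks as allocated
            return run_start
    return None


# True = allocated, False = free. Holes: blocks 1-2, 4-6 and 8-9.
disk = [True, False, False, True, False, False, False, True, False, False]
first_request = allocate_contiguous(disk, 3)
second_request = allocate_contiguous(disk, 3)
```

The first request is placed in blocks 4 to 6. The second request fails even though four blocks are still free, because no three of them are adjacent.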
Advantages
i) Speeds up access since successive logical records are normally physically adjacent to one another
ii) File directories are relatively straightforward to implement; for each file, merely retain the address
Disadvantages
i) Difficult to find space for a new file, especially if the disk is fragmented into a number of separate
holes (blocks)
ii) Another difficulty is determining how much space is needed for a file: if too little space is
allocated initially and that amount later proves not large enough, another chunk of contiguous
It is used since files tend to grow or shrink over time and because users rarely know in advance how
a) Linked Allocation
It solves all problems of contiguous allocation. Each file is a linked list of disk blocks, the disk blocks
may be scattered anywhere on the disk. The directory contains a pointer to the first and last disk block of
Advantage
Disadvantages
It is very effective only for sequential access files i.e. where you have to access all blocks.
Space may be wasted for pointers since they take 4 bytes. The solution is to collect blocks into
Not very reliable, since loss or damage of a pointer breaks the chain and makes the rest of the file unreachable
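The linked scheme and its drawbacks can be sketched with a dictionary standing in for the disk. Each entry maps a block number to its data and the number of the next block. The block numbers, the file name notes.txt and the 512-byte block size are assumptions for illustration; only the 4-byte pointer size comes from the notes.

```python
END_OF_FILE = -1

# block number -> (data, pointer to next block); blocks are scattered on disk.
disk = {
    9: ("A", 16),
    16: ("B", 1),
    1: ("C", 10),
    10: ("D", 25),
    25: ("E", END_OF_FILE),
}

# The directory entry keeps pointers to the first and last block of the file.
directory = {"notes.txt": {"first": 9, "last": 25}}


def read_file(name):
    """Sequential access: follow the chain from the first block to the end."""
    block = directory[name]["first"]
    data = []
    while block != END_OF_FILE:
        payload, next_block = disk[block]
        data.append(payload)
        block = next_block
    return data


def read_nth_block(name, n):
    """Direct access is slow: reaching block n requires following n pointers."""
    block = directory[name]["first"]
    hops = 0
    for _ in range(n):
        block = disk[block][1]
        hops += 1
    return disk[block][0], hops


POINTER_BYTES = 4    # from the notes
BLOCK_BYTES = 512    # assumed block size
pointer_overhead = POINTER_BYTES / BLOCK_BYTES
```

With these assumed sizes the pointers take up about 0.78% of every block, and reading the fourth block of the file costs three pointer hops before any useful data is reached.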
b) Indexed Allocation
It tries to support efficient direct access which is not possible in linked allocation. Pointers to the blocks
are brought together into one location (index block). Each file has its own index block, which is an array
Advantage
Disadvantage
Suffers from wasted space due to pointer overhead of the index block
The biggest problem is determining how large the index block would be
i) Linked scheme – index block is normally one disk block; to allow large files we may
ii) Multilevel index – use a separate index block to point to the index blocks which point
iii) Combined scheme – keeps the first, say, 15 pointers of the index block in the file’s
index block (the inode in UNIX). The first 12 of these pointers point to direct blocks, i.e. they contain
addresses of blocks that contain data of the file. The next 3 pointers point to indirect
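The combined scheme can be illustrated by working out which kind of pointer covers a given logical block of a file. The sketch assumes that the 3 indirect pointers are a single, a double and a triple indirect pointer (as in UNIX), that a block is 4096 bytes and that a pointer is 4 bytes. None of these sizes come from the notes; only the 12 direct pointers do.

```python
DIRECT_POINTERS = 12                                # from the notes
BLOCK_BYTES = 4096                                  # assumed block size
POINTER_BYTES = 4                                   # assumed pointer size
POINTERS_PER_BLOCK = BLOCK_BYTES // POINTER_BYTES   # 1024 with these sizes

LEVELS = ("single indirect", "double indirect", "triple indirect")

# Largest number of data blocks one file can address under these assumptions.
MAX_BLOCKS = DIRECT_POINTERS + sum(
    POINTERS_PER_BLOCK ** level for level in range(1, len(LEVELS) + 1)
)


def locate(logical_block):
    """Return which kind of pointer leads to the given logical block."""
    if logical_block < 0:
        raise ValueError("block numbers start at 0")
    if logical_block < DIRECT_POINTERS:
        return "direct"
    remaining = logical_block - DIRECT_POINTERS
    for level, name in enumerate(LEVELS, start=1):
        capacity = POINTERS_PER_BLOCK ** level  # blocks reachable at this level
        if remaining < capacity:
            return name
        remaining -= capacity
    raise ValueError("block lies beyond the maximum file size")
```

Small files are served entirely by the direct pointers, so they pay no extra disk reads; only large files need the indirect levels.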
File Implementation
It is concerned with issues such as file storage and access on the most common secondary storage
medium, the hard disk. It explores ways to allocate disk space, to recover freed space, to track the
locations of data and to interface other parts of the o/s to the secondary storage.
Directory implementation
The selection of directory allocation and directory management algorithms has a large effect on the
Algorithms used:
a) Linear list – the simplest method of implementing a directory. It uses a linear list of file names with
pointers to the data blocks. It requires a linear search to find a particular entry.
b) Hash Table – A linear list stores the directory entries, but a hash data structure is also used. The hash
table takes a value computed from the file name and returns a pointer to the file name in the linear
list.
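The two directory implementations can be compared with a short sketch. The class names LinearDirectory and HashedDirectory, the file names and the block numbers are hypothetical, and a Python dict stands in for the hash table.

```python
class LinearDirectory:
    """Directory kept as a linear list of (file name, first data block)."""

    def __init__(self):
        self.entries = []

    def add(self, name, block):
        self.entries.append((name, block))

    def lookup(self, name):
        """Linear search; also report how many entries were compared."""
        comparisons = 0
        for entry_name, block in self.entries:
            comparisons += 1
            if entry_name == name:
                return block, comparisons
        return None, comparisons


class HashedDirectory:
    """Same linear list, plus a hash table from name to list position."""

    def __init__(self):
        self.entries = []
        self.table = {}  # a dict is Python's built-in hash table

    def add(self, name, block):
        self.table[name] = len(self.entries)
        self.entries.append((name, block))

    def lookup(self, name):
        position = self.table.get(name)
        if position is None:
            return None
        return self.entries[position][1]


linear, hashed = LinearDirectory(), HashedDirectory()
for file_name, first_block in [("a.txt", 10), ("b.txt", 20), ("c.txt", 30), ("d.txt", 40)]:
    linear.add(file_name, first_block)
    hashed.add(file_name, first_block)
```

Looking up the last file in the linear directory compares all four entries, while the hashed directory jumps straight to the matching position in the list.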
Free Space Management
To keep track of free disk space the system maintains a free space list which records all disk blocks that
are free (i.e. those not allocated to some file or directory). To create a file we search the free space list for
the required amount of space and allocate that space to the new file. This space is then removed from the
free space list.
a) Bit vector – Free space list is implemented as a bit map or bit vector. Each block is represented by
1 bit. If the block is free, the bit is 1; if the block is allocated, the bit is 0.
b) Linked List – links together all the free disk blocks, keeping a pointer to the first free block in a
special location on the disk and caching it in memory. This first block contains a pointer to the
c) Grouping – Is a modification of the free list approach and stores the addresses of n free blocks in the
first free block. The first n-1 of these blocks are actually free; the last block contains the addresses of
another n free blocks, and so on. The importance of
this implementation is that the addresses of a large number of free blocks can be found quickly
d) Counting – it takes advantage of the fact that generally several contiguous blocks may be
allocated or freed simultaneously, particularly when space is allocated with the contiguous allocation
algorithm or through clustering. Thus rather than keeping a list of n free disk addresses we can
keep the address of the first free block and the number n of free contiguous blocks that follow the
first block. Although each entry requires more space than would a simple disk address, the overall
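Two of the free space representations above, the bit vector and counting, can be sketched together. The bit vector uses 1 for a free block and 0 for an allocated block, as in (a). For counting, this sketch stores each run as (first free block, number of free blocks in the run, counting the first block itself), which is one possible reading of (d). The function names and the sample vector are assumptions.

```python
def first_free(bit_vector):
    """Return the number of the first free block (bit 1), or None."""
    for block, bit in enumerate(bit_vector):
        if bit == 1:
            return block
    return None


def to_counting(bit_vector):
    """Convert a bit vector into (start, length) runs of free blocks."""
    runs = []
    start = None
    for block, bit in enumerate(bit_vector):
        if bit == 1 and start is None:
            start = block                        # a new free run begins
        elif bit == 0 and start is not None:
            runs.append((start, block - start))  # the run just ended
            start = None
    if start is not None:
        runs.append((start, len(bit_vector) - start))
    return runs


# Blocks 2-5, 8-9 and 11 are free.
bit_vector = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1]
runs = to_counting(bit_vector)
```

Seven free blocks are described by three entries here, which shows why the counting list tends to stay short when free blocks are contiguous.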
This is essential when it comes to file and directory implementation since disks tend to be a major
bottleneck in system performance i.e. they are the slowest main computer component. To improve
performance, disk controllers include enough local memory to create an on-board cache that
In a multi-user system there is almost always a requirement for allowing files to be shared among a number of
i) Access rights – should provide a number of options so that the way in which a particular file is
accessed can be controlled. Access rights assigned to a particular file include: Read, execute,
ii) Simultaneous access – a discipline is required when access to append to or update a file is granted to
more than one user. One approach is to allow a user to lock the entire file when it is to be updated.
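The whole-file locking discipline can be sketched as follows. Threads inside one Python process stand in for the users, and a threading.Lock stands in for the file lock; a real multi-user system would rely on a lock provided by the operating system instead. The user names, the record format and the function names are assumptions for illustration.

```python
import os
import tempfile
import threading

# One lock for the whole file: whoever holds it is the only one updating.
file_lock = threading.Lock()


def append_record(path, text):
    """Lock the entire file, append one record, then release the lock."""
    with file_lock:
        with open(path, "a") as handle:
            handle.write(text + "\n")


def run_users(path, users, records_each):
    """Let several simulated users append to the same file at once."""
    def work(user):
        for number in range(records_each):
            append_record(path, f"{user}-{number}")

    threads = [threading.Thread(target=work, args=(user,)) for user in users]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


with tempfile.NamedTemporaryFile(delete=False) as handle:
    shared_path = handle.name

run_users(shared_path, ["alice", "bob", "carol"], 50)

with open(shared_path) as handle:
    lines = handle.read().splitlines()
os.remove(shared_path)
```

Because every update happens while the lock is held, each of the 150 records ends up intact on its own line, although the order in which the users' records interleave is not fixed.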