CSC 202 Note

A file is a structured collection of records, each made up of fields containing data values. Key components include fields, records, and files, with attributes such as name, type, location, size, and protection defining their characteristics. File organization methods, including pile/serial, sequential, indexed, and direct files, influence how data is stored and accessed, with various criteria impacting the choice of organization.

Uploaded by

Ikumapayi jude

Elements of a File

A file consists of a number of records. Each record is made up of a number of fields and
each field consists of a number of characters.

Logical Components of File

The logical components deal with the real-world objects the data represent. These are field,
record and file. However, in today’s information systems, files most often exist as parts of
databases, or organised collections of interrelated data.

Field

A field is the basic element of data. An individual field contains a single value, such as an
employee’s last name, a date, or the value of a sensor reading. It is characterised by its
length and data type (e.g., ASCII, string, decimal). Depending on the file design,
fields may be fixed length or variable length. In the latter case, the field often consists
of two or three subfields: the actual value to be stored, the name of the field, and, in
some cases, the length of the field. In other cases of variable-length fields, the length of the
field is indicated by the use of special demarcation symbols between fields.

Record

A record is a collection of related fields that can be treated as a unit by some application
program. For example, an employee record would contain such fields as name,
identification number, job designation, date of employment, and so on. Again, depending on
design, records may be of fixed length or variable length. A record will be of variable
length if some of its fields are of variable length or if the number of fields may vary. In the
latter case, each field is usually accompanied by a field name. In either case, the entire
record usually includes a length field.
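
The difference between fixed-length and variable-length records can be illustrated with a minimal Python sketch. The layout below (a 20-byte name, a 4-byte identification number and an 8-byte date) and all function names are assumptions made for illustration; they are not part of any particular file system.

```python
import struct

# Hypothetical fixed-length layout: 20-byte name, 4-byte unsigned ID, 8-byte date.
FIXED_LAYOUT = struct.Struct("<20sI8s")


def pack_fixed(name: str, emp_id: int, hired: str) -> bytes:
    """Pack one employee record into exactly FIXED_LAYOUT.size bytes."""
    # struct pads short byte strings with NUL bytes and truncates long ones.
    return FIXED_LAYOUT.pack(name.encode("ascii"), emp_id, hired.encode("ascii"))


def unpack_fixed(raw: bytes) -> tuple[str, int, str]:
    """Recover the three fields; positions are known, so no field names are stored."""
    name, emp_id, hired = FIXED_LAYOUT.unpack(raw)
    return name.rstrip(b"\0").decode("ascii"), emp_id, hired.decode("ascii")


def pack_variable(fields: dict[str, str]) -> bytes:
    """Variable-length record: a 2-byte record length, then self-describing
    'name=value' fields, each preceded by a 1-byte field length."""
    body = b""
    for name, value in fields.items():
        item = f"{name}={value}".encode("utf-8")
        body += struct.pack("<B", len(item)) + item
    return struct.pack("<H", len(body)) + body


def unpack_variable(raw: bytes) -> dict[str, str]:
    """Walk the length-prefixed fields until the record length is used up."""
    (total,) = struct.unpack_from("<H", raw, 0)
    pos, end, fields = 2, 2 + total, {}
    while pos < end:
        length = raw[pos]
        text = raw[pos + 1 : pos + 1 + length].decode("utf-8")
        name, _, value = text.partition("=")
        fields[name] = value
        pos += 1 + length
    return fields
```

Every fixed-length record occupies the same 32 bytes, while the variable-length form stores a field name and a length with each value and a length field for the whole record, as described above.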

File

A file is a collection of related records. The file is treated as a single entity by users and
applications and may be referenced by name. Files have names and may be created and
deleted. Access control restrictions usually apply at the file level. That is, in a shared
system, users and programs are granted or denied access to entire files. In some more
sophisticated systems, such controls are enforced at the record or even the field level.

File Naming

Files are abstraction mechanisms. They provide a way to store information and read it back
later. This must be done in a way as to shield the user from the details of how and where
the information is stored, and how the disks actually work. When a process creates a file, it
gives the file a name. When the process terminates, the file continues to exist, and can be
accessed by other processes using its name.

The exact rules for file naming vary somewhat from system to system, but all operating
systems allow strings of one to eight letters as legal file names. The file name is chosen by
the person creating it, usually to reflect its contents. There are few constraints on the format
of the file name: It can comprise the letters A-Z, numbers 0-9 and special characters $
# & + @ ! ( ) - { } ' ` _ ~ as well as space. The only symbols that cannot be used to identify
a file are * | < > \ ^ = ? / [ ] ' ; , plus control characters. The main caveat in choosing a file
name is that there are different rules for different operating systems that can present
problems when files are moved from one computer to another. For example, Microsoft Windows
is case insensitive, so files like MYEBOOKS, myebooks, MyEbooks are all the same
to Microsoft Windows. However, under the UNIX operating system, all three would be
different files as, in this instance, file names are case sensitive.
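
As a rough sketch of these rules, the Python functions below check a name against the symbols listed above and compare two names the way a case-sensitive or a case-insensitive system would. The function names are hypothetical, and the character set simply follows the list in this note; real operating systems apply their own rules.

```python
# Symbols this note lists as not usable in a file name (an assumption taken
# from the text above, not from any specific operating system).
FORBIDDEN = set('*|<>\\^=?/[]\';,')


def is_legal_name(name: str) -> bool:
    """True if the name is non-empty and has no listed symbol or control character."""
    return bool(name) and not any(ch in FORBIDDEN or ord(ch) < 32 for ch in name)


def same_file_name(a: str, b: str, case_sensitive: bool) -> bool:
    """Compare two names as a case-sensitive or case-insensitive system would."""
    return a == b if case_sensitive else a.casefold() == b.casefold()
```

With case_sensitive set to False, MYEBOOKS and myebooks compare as the same name; with it set to True they are different names.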

Naming Convention

Usually a file would have two parts with “.” separating them. The part on the left side of
the period character is called the main name while the part on the right side is called the
extension. A good example of a file name is “course.doc.” The main name is course while
the extension is doc. File extension differentiates between different types of files. We can
have files with same names but different extensions and therefore we generally refer to a
file with its name along with its extension and that forms a complete file name.
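
The split into main name and extension can be tried out with Python's standard os.path.splitext function. The wrapper name split_name is only an illustration.

```python
import os.path


def split_name(filename: str) -> tuple[str, str]:
    """Split a complete file name into (main name, extension without the period)."""
    main, ext = os.path.splitext(filename)
    return main, ext.lstrip(".")
```

For "course.doc" this gives the main name "course" and the extension "doc". Only the part after the last period is treated as the extension, and a name with no period has an empty extension.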

File Name Extension

A filename extension is a suffix to the name of a computer file applied to indicate the
encoding convention or file format of its contents. In some operating systems (for example
UNIX) it is optional, while in some others (such as DOS) it is a requirement. Some operating
systems limit the length of the extension (such as DOS and OS/2, to three characters)
while others (such as UNIX) do not. Some operating systems (for example RISC OS) do not
use file extensions.

The following tables, which are extracted from Microsoft® Encarta (2007), show examples of
some common filename extensions:

Table 1: Filename extension of Textual Files

TEXT
FILE TYPE | CONTENT | APPLICATION
.html | Hypertext Mark-Up Language, the code of simple Web pages. Usually plain text files with embedded formatting instructions. | Internet browsers such as Internet Explorer, Crazy Browser, Mozilla Firefox and Opera.
.pdf | Portable Document Format, a document presentation format, downloads as binary. | Adobe Acrobat
.rtf | Rich Text Format, a document format that can be shared between different word processors. | Any word processing application
.txt | A plain and simple text file | Any word processing application
.doc, .dot, .abw, .lwp | Word processing files created with popular packages. | Microsoft Word (.doc), the related .dot extension for Microsoft Word Template, Abiword (.abw), and Lotus WordPro (.lwp)

Table 2: Filename extension of Image Files

IMAGES
FILE TYPE | CONTENT | APPLICATION
.gif | Graphics Interchange Format; though not the most economical, the most common graphics format found on the Internet. | Lview and many others
.jpg, .jpeg | Joint Photographic Experts Group, a 24-bit graphic format | Lview and many others
.mpg, .mpeg | Moving Picture Experts Group, a standard Internet movie platform | Sparle, Windows Media Player, QuickTime, and many others
.mov | QuickTime Movie, the Apple Macintosh native movie platform | Sparle, Windows Media Player, QuickTime, and many others

Table 3: Filename extension of Sound Files

SOUND
FILE TYPE | CONTENT | APPLICATION
.mp3 | Audio files on both PC and Mac | Windows Media Player
.wav | Audio files on PC | Real Player
.ra | Real Audio, a proprietary system for delivering and playing streaming audio on the Web |
.aiff | Audio files on Mac |

Table 4: Filename extension of Utility type programme

UTILITIES
FILE TYPE | CONTENT | APPLICATION
.ppt | A presentation file (for slide shows) | Microsoft PowerPoint
.xls, .123 | Spreadsheet files | Microsoft Excel, Lotus 123
.mdb | A database file | Microsoft Access

Table 5: Filename extension of other types of files

OTHERS
FILE TYPE | CONTENT | APPLICATION
.dll | Dynamic Link Library. This is a compiled set of procedures and/or drivers called by another program. | This is a compiled system file, one that should not be moved or altered
.exe | A DOS/Windows program or a DOS/Windows Self Extracting Archive | Downloads and launches it in its own temporary directory
.zip, .sit, .tar | Various popular compression formats for the PC, Macintosh, and UNIX respectively | WinZip, ZipIt, PKzip, and others

CLASS AND HOME EXERCISE 1

1. What is a file?
2. What are the terms commonly used in discussing structure of a file?
3. How can you distinguish one file from another?

File Attributes

The particular information kept for each file varies from operating system to operating
system. No matter what operating system one might be using, files always have certain
attributes or characteristics. Different file attributes are discussed as follows.

File Name

The symbolic file name is the only information kept in human-readable form. Obviously,
a file name helps users to differentiate between various files.

File Type

A file type is required for the systems that support different types of files. As discussed
earlier, file type is a part of the complete file name. We might have two different files; say
“cit381.doc” and “cit381.txt”. Therefore the file type is an important attribute which
helps in differentiating between files based on their types. File types indicate which
application should be used to open a particular file.

Location
This is a pointer to the device and location on that device of the file. As it is clear from the
attribute name, it specifies where the file is stored.

Size
The size attribute keeps track of the current size of a file in bytes, words or blocks; it is
most commonly reported in bytes. For comparison, a floppy disk holds about 1.44 MB; a Zip
disk holds 100 MB or 250 MB; a CD holds about 700 MB; a DVD holds about 4.7 GB.

Protection
Protection attribute of a file keeps track of the access-control information that
controls who can do reading, writing, executing, and so on.

Usage Count
This value indicates the number of processes that are currently using
(have opened) a particular file.

Time, Date and Process Identification


This information may be kept for creation, last modification, and last use. Data provided
by this attribute is often helpful for protection and usage monitoring. Each process has its
own identification number which contains information about file hierarchy.
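
Several of these attributes can be inspected from a program. The following minimal sketch uses Python's standard os.stat call; the function name describe and the dictionary keys are chosen for illustration only. Usage count and process identification are not reported by os.stat, so they are left out.

```python
import os
import stat
import time


def describe(path: str) -> dict:
    """Collect a few of the attributes discussed above for one file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "type": os.path.splitext(path)[1],                  # extension, e.g. ".txt"
        "location": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        "protection": stat.filemode(info.st_mode),          # e.g. "-rw-r--r--"
        "modified": time.ctime(info.st_mtime),
        "accessed": time.ctime(info.st_atime),
    }
```

The exact attributes available, and how the protection string looks, depend on the operating system on which the sketch is run.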

Attribute Values

In addition, all operating systems associate other information with each file. The list of
attributes varies considerably from system to system. The table below shows some of
the possibilities, but other ones also exist. No existing system has all of these, but each is
present in some system.

Table 6: Fields and various attribute values


FIELD | MEANING
Protection | Who can access the file and in what way?
Password | Password needed to access the file
Creator | Identity of the person who created the file
Owner | Current owner
Read-only flag | 0 for read/write, 1 for read only
Hidden flag | 0 for normal, 1 for do not display in listing
System flag | 0 for normal file, 1 for system file
Archive flag | 0 for has been backed up, 1 for needs to be backed up
ASCII/binary file | 0 for ASCII file, 1 for binary file
Random access file | 0 for sequential access only, 1 for random access
Temporary flag | 0 for normal, 1 for delete on process exit
Lock flags | 0 for unlocked, nonzero for locked
Record length | Number of bytes in a record
Key position | Offset of the key within each record
Key length | Number of bytes in the key field
Creation time | Date and time file was created
Time of last access | Date and time file was last accessed
Time of last change | Date and time file was last changed
Current size | Number of bytes in the file
Maximum size | Maximum size file may grow

Source: Modern Operating Systems, 2nd ed. by Andrew S. Tanenbaum (2006).

The first four attributes relate to the file’s protection and tell who may access it and who
may not. All kinds of schemes are possible; in some systems the user must present a
password to access a file, in which case the password must be one of the attributes.

The flags are bits or short fields that control or enable some specific property. Hidden files,
for example, do not appear in listings of the files. The archive flag is a bit that keeps track of
whether the file has been backed up. The backup program clears it, and the operating
system sets it whenever a file is changed. In this way, the backup program can tell which
files need backing up. The temporary flag allows a file to be marked for automatic
deletion when the process that created it terminates.
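
The behaviour of the archive flag can be sketched with bit flags in Python. The flag names and bit values below are assumptions for illustration; each real system defines its own.

```python
from enum import IntFlag


class FileFlags(IntFlag):
    """Hypothetical one-bit flags, one per property."""
    READ_ONLY = 1
    HIDDEN = 2
    SYSTEM = 4
    ARCHIVE = 8      # set means "needs to be backed up"
    TEMPORARY = 16


def on_file_changed(flags: FileFlags) -> FileFlags:
    """The operating system sets the archive bit whenever the file changes."""
    return flags | FileFlags.ARCHIVE


def on_backup(flags: FileFlags) -> FileFlags:
    """The backup program clears the archive bit after copying the file."""
    return flags & ~FileFlags.ARCHIVE


def needs_backup(flags: FileFlags) -> bool:
    """A backup program only has to copy files whose archive bit is set."""
    return bool(flags & FileFlags.ARCHIVE)
```

Setting and clearing the archive bit leaves the other flags, such as HIDDEN, untouched.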

The record length, key position, and key length fields are only present in files whose records
can be looked up using a key. They provide the information required to find the keys.

The various times keep track of when the file was created, most recently accessed and most
recently modified. These are useful for a variety of purposes. For example, a source file that
has been modified after the creation of the corresponding object file needs to be recompiled.
These fields provide the necessary information.
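
The recompilation check described above might look like this in Python, using the standard os.path.getmtime call to read the time of last change. The function name needs_recompile is hypothetical.

```python
import os


def needs_recompile(source: str, obj: str) -> bool:
    """True if the object file is missing or older than its source file."""
    if not os.path.exists(obj):
        return True
    return os.path.getmtime(source) > os.path.getmtime(obj)
```

This is the same comparison of modification times that build tools rely on to decide which files to rebuild.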

The current size tells how big the file is at present. Some mainframe operating systems
require the maximum size to be specified when the file is created, to let the operating
system reserve the maximum amount of storage in advance. Minicomputers and personal
computer systems are clever enough to do without this item.

CLASS AND HOME EXERCISE

1. What do you understand by file attributes?
2. List out some attributes a file could possess.

SUMMARY

In this lecture, you have learnt that:

File is the basic unit of storage that enables a computer to distinguish one set of information
from another.
In naming a file, a file would have two parts with a period character separating them.
The part on the left side of the period character is called the main name while the part on
the right side is called the extension.
File extension shows the type of file and the application that the
Operating System will use in opening it.
Files have attributes which vary considerably from system to system. No existing operating
system has all of these, but each is present in some systems.
A file might or might not be stored in human-readable form, but it is invariably the “glue”
that binds a conglomeration of instructions, numbers, words, or images into a coherent unit
that a user can retrieve, delete, save, sometimes change, or send to an output device.

FILE ORGANISATION AND ACCESS METHODS

File Organization and Access Methods

In this unit, we use the term file organisation to refer to the structure of a file (especially a
data file) defined in terms of its components and how they are mapped onto backing
store. Any given file organisation supports one or more file access methods. Organisation
is thus closely related to but conceptually distinct from access methods. An access method is
any algorithm used for the storage and retrieval of records from a data file; it determines
the structural characteristics of the file on which it is used.

File Organisation Criteria

In choosing a file organisation, several criteria are important:

• Short access time
• Ease of update
• Economy of storage
• Simple maintenance
• Reliability

The relative priority of these criteria will depend on the applications that will use the file.
For example, if a file is only to be processed in batch mode, with all of the records accessed
every time, then rapid access for retrieval of a single record is of minimal concern. A file
stored on CD- ROM will never be updated, and so ease of update is not an issue. These
criteria may conflict. For example, for economy of storage, there should be minimum
redundancy in the data. On the other hand, redundancy is a primary means of increasing the
speed of access to data. An example of this is the use of indexes.

File Organisation Methods

The number of alternative file organisations that have been implemented or just proposed is
unmanageably large. In this brief survey, we will outline five fundamental organisations.
Most structures used in actual systems either fall into one of these categories or can be
implemented with a combination of these organisations. The five organisations, the first four
of which are depicted in Figure 01, are:

• The pile/serial
• The sequential file
• The indexed sequential file
• The indexed file
• The direct, or hashed, file

Table 7 summarises relative performance aspects of these five organisations.

Fig 3: Common File Organisation

Source: Operating Systems: Internals and Design Principles, 5th ed. by William Stallings (2004).

Table 7: Grade of Performances for Five Basic File Organisation


Source: Operating Systems: Internals and Design Principles, 5th ed. by William
Stallings (2004).

CLASS AND HOME EXERCISE

1. What is file organisation?
2. What is the relationship between file organisation and access method?
3. What are the important issues to consider when selecting a file organisation?

The Pile/Serial

The least-complicated form of file organisation may be termed the pile/serial. Data are
collected in the order in which they arrive. Each record consists of one burst of data. The
purpose of the pile/serial is simply to accumulate the mass of data and save it. Records may
have different fields, or similar fields in different orders. Thus, each field should be self-
describing, including a field name as well as a value. The length of each field must be
implicitly indicated by delimiters, explicitly included as a subfield, or known as default for
that field type. Because there is no structure to the pile/serial file, record access is by
exhaustive search. That is, if we wish to find a record that contains a particular field
with a particular value, it is necessary to examine each record in the pile until the desired
record is found or the entire file has been searched. If we wish to find all records that
contain a particular field or contain that field with a particular value, then the entire file must
be searched.
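
A pile can be sketched in Python as a list of self-describing records that is searched exhaustively. The function names are hypothetical, and an in-memory list stands in for the file on backing store.

```python
def append(pile: list, **fields) -> None:
    """Records are simply added in arrival order; each carries its own field names."""
    pile.append(dict(fields))


def find_first(pile: list, field: str, value):
    """Exhaustive search: return (record, number of records examined)."""
    for examined, record in enumerate(pile, start=1):
        if record.get(field) == value:
            return record, examined
    return None, len(pile)          # the whole pile was searched without a match


def find_all(pile: list, field: str) -> list:
    """Finding every record that contains a field always scans the entire pile."""
    return [record for record in pile if field in record]
```

Records with different fields can sit side by side, but every lookup costs a scan whose length grows with the size of the pile.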

Pile/serial files are encountered when data are collected and stored prior to processing or
when data are not easy to organise. This type of file uses space well when the stored data
vary in size and structure; is perfectly adequate for exhaustive searches, and is easy to
update. However, beyond these limited uses, this type of file is unsuitable for most
applications.

The Sequential File

The most common form of file structure is the sequential file. In this file organisation, a
fixed format is used for records. All records are of the same length, consisting of the same
number of fixed-length fields in a particular order. Because the length and position of
each field are known, only the values of fields need to be stored; the field name and length
for each field are attributes of the file structure. One particular field, usually the first field in
each record, is referred to as the key field. The key field uniquely identifies the record; thus
key values for different records are always different. Further, the records are stored in
key sequence: alphabetical order for a text key, and numerical order for a numerical key.

Sequential files are typically used in batch applications and are generally optimum for such
applications if they involve the processing of all the records (e.g., a billing or payroll
application). The sequential file organisation is the only one that is easily stored on tape as
well as disk. For interactive applications that involve queries and/or updates of individual
records, the sequential file provides poor performance. Access requires the sequential
search of the file for a key match. If the entire file, or a large portion of the file, can
be brought into main memory at one time, more efficient search techniques are possible.

Nevertheless, considerable processing and delay are encountered to access a record in a
large sequential file. Additions to the file also present problems. Typically, a sequential
file is stored in simple sequential ordering of the records within blocks. That is, the physical
organisation of the file on tape or disk directly matches the logical organisation of the file.
In this case, the usual procedure is to place new records in a separate pile file, called a log
file or transaction file. Periodically, a batch update is performed that merges the log file
with the master file to produce a new file in correct key sequence.
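
The batch update can be sketched in Python as a merge of two sorted runs. Records are represented here as (key, data) tuples, and the function name batch_update is an assumption for illustration; heapq.merge is the standard library routine for merging sorted inputs.

```python
import heapq


def batch_update(master: list, log: list) -> list:
    """Merge a key-ordered master file with an unordered log (transaction) file.

    Returns a new master file in correct key sequence."""
    sorted_log = sorted(log, key=lambda record: record[0])   # order the pile first
    return list(heapq.merge(master, sorted_log, key=lambda record: record[0]))
```

Because both inputs are in key order by the time they are merged, the new master file is produced in a single sequential pass over each, which is why the approach also works on tape.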

An alternative is to organize the sequential file physically as a linked list. One or more
records are stored in each physical block. Each block on disk contains a pointer to the
next block. The insertion of new records involves pointer manipulation but does not
require that the new records occupy a particular physical block position. Thus, some added
convenience is obtained at the cost of additional processing and overhead.

The Indexed Sequential File

A popular approach to overcoming the disadvantages of the sequential file is the indexed
sequential file. The indexed sequential file maintains the key characteristic of the sequential
file: records are organised in sequence based on a key field. Two features are added: an
index to the file to support random access, and an overflow file.

The index provides a lookup capability to quickly reach the vicinity of a desired record. The
overflow file is similar to the log file used with a sequential file but is integrated so that a
record in the overflow file is located by following a pointer from its predecessor record.

In the simplest indexed sequential structure, a single level of indexing is used. The index in
this case is a simple sequential file. Each record in the index file consists of two fields: a
key field, which is the same as the key field in the main file, and a pointer into the main file.
To find a specific record, the index is searched to find the highest key value that is equal to
or precedes the desired key value. The search continues in the main file at the location
indicated by the pointer.
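
The single-level search just described can be sketched as follows. The main file is modelled as a Python list of (key, data) tuples in key order, the pointer is a list position, and the function names are hypothetical.

```python
def build_index(main_file: list, step: int) -> list:
    """One index entry (key, position) for every step-th record of the main file."""
    return [(main_file[pos][0], pos) for pos in range(0, len(main_file), step)]


def lookup(main_file: list, index: list, key):
    """Search the index sequentially, then continue in the main file."""
    start = None
    for index_key, pos in index:
        if index_key > key:
            break
        start = pos                 # highest index key that equals or precedes key
    if start is None:
        return None                 # key precedes every record in the file
    for record in main_file[start:]:
        if record[0] == key:
            return record
        if record[0] > key:
            break                   # passed the place where the key would be
    return None
```

The index only brings the search to the vicinity of the record; the last few steps are still a sequential scan of the main file.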

To see the effectiveness of this approach, consider a sequential file with 1 million records.
To search for a particular key value will require on average one-half million record accesses.
Now suppose that an index containing 1000 entries is constructed, with the keys in the index
more or less evenly distributed over the main file. Now it will take on average 500 accesses to
the index file followed by 500 accesses to the main file to find the record. The average search
length is reduced from 500,000 to 1000.

Additions to the file are handled in the following manner: Each record in the main file
contains an additional field not visible to the application, which is a pointer to the overflow
file. When a new record is to be inserted into the file, it is added to the overflow file. The
record in the main file that immediately precedes the new record in logical sequence is
updated to contain a pointer to the new record in the overflow file. If the immediately
preceding record is itself in the overflow file, then the pointer in that record is updated. As
with the sequential file, the indexed sequential file is occasionally merged with the overflow
file in batch mode.

The indexed sequential file greatly reduces the time required to access a single record,
without sacrificing the sequential nature of the file. To process the entire file sequentially,
the records of the main file are processed in sequence until a pointer to the overflow file is
found, then accessing continues in the overflow file until a null pointer is encountered, at
which time accessing of the main file is resumed where it left off.
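
The overflow mechanism can be sketched in Python with records held as dictionaries. The hidden pointer field is called "next" here, overflow positions are list indexes, and None plays the role of the null pointer; all of these names are assumptions for illustration.

```python
def insert(main: list, overflow: list, key, data) -> None:
    """Add a record to the overflow file and link it from its logical predecessor."""
    pred = None
    for record in main:             # last main-file record whose key precedes the new key
        if record["key"] < key:
            pred = record
        else:
            break
    if pred is None:
        raise ValueError("key precedes the first record of the main file")
    # If the immediate predecessor is itself in the overflow file, follow the chain.
    while pred["next"] is not None and overflow[pred["next"]]["key"] < key:
        pred = overflow[pred["next"]]
    overflow.append({"key": key, "data": data, "next": pred["next"]})
    pred["next"] = len(overflow) - 1


def scan(main: list, overflow: list):
    """Yield all keys in logical sequence, detouring through overflow chains."""
    for record in main:
        yield record["key"]
        ptr = record["next"]
        while ptr is not None:      # a null pointer sends us back to the main file
            yield overflow[ptr]["key"]
            ptr = overflow[ptr]["next"]
```

Inserting keys 15, 12, 17 and 25 into a main file holding 10, 20 and 30 leaves the main file untouched, yet a scan still returns 10, 12, 15, 17, 20, 25, 30.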

To provide even greater efficiency in access, multiple levels of indexing can be used. Thus
the lowest level of index file is treated as a sequential file and a higher-level index file is
created for that file. Consider again a file with 1 million records. A lower-level index with
10,000 entries is constructed. A higher-level index into the lower level index of 100
entries can then be constructed. The search begins at the higher-level index (average length
= 50 accesses) to find an entry point into the lower-level index. This index is then
searched (average length = 50) to find an entry point into the main file, which is then
searched (average length = 50). Thus the average length of search has been reduced from
500,000 to 1000 to 150.
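
The averages quoted in these examples can be reproduced with a short calculation. It assumes, as the text does, that keys are evenly distributed and that each level is searched sequentially, so that on average half of the entries in the relevant portion are examined. The function name is hypothetical.

```python
def average_accesses(records: int, index_sizes: list) -> float:
    """Average search length for a file with the given index levels.

    index_sizes lists the number of entries per index, highest level first;
    an empty list means a plain sequential file."""
    sizes = list(index_sizes) + [records]
    total = sizes[0] / 2                    # the top level is scanned from its start
    for upper, lower in zip(sizes, sizes[1:]):
        total += (lower / upper) / 2        # each upper entry covers lower/upper items
    return total
```

For 1 million records, this gives 500,000 with no index, 1000 with a single index of 1000 entries, and 150 with indexes of 100 and 10,000 entries.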

The Indexed File

The indexed sequential file retains one limitation of the sequential file: effective processing
is limited to that which is based on a single field of the file. For example, when it is
necessary to search for a record on the basis of some other attributes than the key field, both
forms of sequential file are inadequate. In some applications, the flexibility of efficiently
searching by various attributes is desirable.

To achieve this flexibility, a structure is needed that employs multiple indexes, one for each
type of field that may be the subject of a search. In the general indexed file, the concept of
sequentiality and a single key are abandoned. Records are accessed only through their
indexes. The result is that there is now no restriction on the placement of records as long as
a pointer in at least one index refers to that record. Furthermore, variable-length
records can be employed.

Two types of indexes are used. An exhaustive index contains one entry for every record in
the main file. The index itself is organized as a sequential file for ease of searching. A
partial index contains entries to records where the field of interest exists. With
variable-length records, some records will not contain all fields. When a new record is
added to the main file, all of
the index files must be updated. Indexed files are used mostly in applications where
timeliness of information is critical and where data are rarely processed exhaustively.
Examples are airline reservation systems and inventory control systems.
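
A general indexed file can be sketched in Python with one dictionary per searchable field. The class name, the field names in the test data and the use of list positions as pointers are assumptions for illustration.

```python
from collections import defaultdict


class IndexedFile:
    """Records are stored in arrival order and reached only through indexes."""

    def __init__(self, indexed_fields):
        self.records = []                                   # no key sequence is kept
        self.indexes = {f: defaultdict(list) for f in indexed_fields}

    def add(self, record: dict) -> int:
        """Store the record and update every index that applies to it."""
        pos = len(self.records)
        self.records.append(record)
        for field, index in self.indexes.items():
            if field in record:         # a partial index skips records lacking the field
                index[record[field]].append(pos)
        return pos

    def find(self, field: str, value) -> list:
        """Follow the pointers held in one index."""
        return [self.records[p] for p in self.indexes[field].get(value, [])]
```

An index on a field that every record contains is exhaustive, while an index on an optional field is partial. Each added record touches every index, which is the update cost mentioned above.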

The Direct or Hashed File

The direct or hashed file exploits the capability found on disks to access directly any block
of a known address. As with sequential and indexed sequential files, a key field is required
in each record. However, there is no concept of sequential ordering here. The direct file
makes use of hashing on the key value. Direct files are often used where very rapid access
is required, where fixed length records are used, and where records are always
accessed one at a time. Examples are directories, pricing tables, schedules, and name lists.
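
Hashing on the key value can be sketched as below. The class name, the use of the CRC-32 checksum as the hash function and the handling of collisions by keeping several records per block are all assumptions for illustration; the text does not prescribe a particular scheme.

```python
import zlib


class DirectFile:
    """Each key is hashed to a block address; records are fetched one at a time."""

    def __init__(self, n_blocks: int):
        self.blocks = [[] for _ in range(n_blocks)]

    def _address(self, key: str) -> int:
        # CRC-32 stands in for the hash function; any deterministic hash would do.
        return zlib.crc32(key.encode("utf-8")) % len(self.blocks)

    def put(self, key: str, value) -> None:
        block = self.blocks[self._address(key)]
        for i, (existing, _) in enumerate(block):
            if existing == key:
                block[i] = (key, value)     # replace the record with the same key
                return
        block.append((key, value))          # colliding keys share the block

    def get(self, key: str):
        for existing, value in self.blocks[self._address(key)]:
            if existing == key:
                return value
        return None
```

A lookup goes straight to one block without reading any other, but the records have no useful order, so the file cannot be processed in key sequence.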

CLASS AND HOME EXERCISE

1. List the different file organisation methods.
2. In what situation will a particular file access method be useful?
3. Outline the shortfalls of each of the file organisations.

SUMMARY

In this lecture, you have learnt that:

File organisation refers to the logical structuring of the records as determined by the way in
which they are accessed.
Short access time, ease of update, economy of storage, simple maintenance and reliability
are important criteria in choosing a file organisation.
Major types of file organisation methods are pile, sequential file, indexed sequential file,
indexed file, direct/hashed file.
File organisation determines the applicable access methods. Access methods are principally
sequential and direct.
Each file organisation method has its peculiar advantages and disadvantages.

FILE MANAGEMENT

File Management System

The file management system (FMS) is the subsystem of an operating system that manages the
data storage organisation on secondary storage, and provides services to processes related to
their access. In this sense, it interfaces the application programs with the low-level
media-I/O (e.g. disk I/O) subsystem, freeing application programmers from having to
deal with low-level intricacies and allowing them to implement I/O using convenient
data-organisational abstractions such as files and records. On the other hand, the FMS
services often are the only ways through which applications can access the data stored in the
files, thus achieving an encapsulation of the data themselves which can be usefully
exploited for the purposes of data protection, maintenance and control.

Typically, the only way that a user or application may access files is through the file
management system. This relieves the user or programmer of the necessity of developing
special-purpose software for each application and provides the system with a consistent,
well-defined means of controlling its most important asset.

Objectives of File Management System

We can summarise the objectives of a File Management System as follows:

Data Management. An FMS should provide data management services to applications
through convenient abstractions, simplifying the common operations involved in data
access and modification and making them device-independent.

Generality with respect to storage devices. The FMS data abstractions and access methods
should remain unchanged irrespective of the devices involved in data storage.

Validity. An FMS should guarantee that at any given moment the stored data reflect the
operations performed on them, regardless of the time delays involved in actually
performing those operations. Appropriate access synchronization mechanisms should be
used to enforce validity when multiple accesses from independent processes are possible.

Protection. Illegal or potentially dangerous operations on the data should be
controlled by the FMS, by enforcing a well-defined data protection policy.

Concurrency. In multiprogramming systems, concurrent access to the data should be
allowed with minimal differences with respect to single-process access, save for access
synchronization enforcement.

Performance. The above functionalities should be offered while at the same time achieving a
good compromise in terms of data access speed and data transfer rate.

File Management Functions

With respect to meeting user requirements, the extent of such requirements depends on the
variety of applications and the environment in which the computer system will be used. For
an interactive, general-purpose system, the following constitute a minimal set of
requirements:

Each user should be able to create, delete, read, write, and modify files.
Each user may have controlled access to other users’ files.
Each user may control what types of accesses are allowed to the user’s files.
Each user should be able to restructure the user’s files in a form appropriate to the problem.
Each user should be able to move data between files.
Each user should be able to back up and recover the user’s files in case of damage.
Each user should be able to access his or her files by name rather than by numeric
identifier.

File System Architecture


One way of getting a feel for the scope of file management is to look at a depiction of a
typical software organisation, as suggested in Figure 2. Of course, different systems will be
organised differently, but this organisation is reasonably representative.

Fig 2: File System Software Architecture


Source: Operating Systems: Internals and Design Principles, 5th ed. by William
Stallings (2004).

Device Drivers
At the lowest level, device drivers communicate directly with peripheral devices
or their controllers or channels. A device driver is responsible for starting I/O operations on
a device and processing the completion of an I/O request. For file operations, the typical
devices controlled are disk and tape drives. Device drivers are usually considered to be
part of the operating system.

Basic File System


The next level is referred to as the basic file system, or the physical I/O level. This is the
primary interface with the environment outside of the computer system. It deals with blocks
of data that are exchanged with disk or tape systems. Thus, it is concerned with the
placement of those blocks on the secondary storage device and with the buffering of those
blocks in main memory. It does not understand the contents of the data or the structure of
the files involved. The basic file system is often considered part of the operating system.

Basic I/O Supervisor


The basic I/O supervisor is responsible for all file I/O initiation and termination. At this
level, control structures are maintained that deal with device I/O, scheduling, and file
status. The basic I/O supervisor selects the device on which file I/O is to be performed,
based on the particular file selected. It is also concerned with scheduling disk and tape
accesses to optimize performance. I/O buffers are assigned and secondary memory is
allocated at this level. The basic I/O supervisor is part of the operating system.

Logical I/O
Logical I/O enables users and applications to access records. Thus, whereas the basic file
system deals with blocks of data, the logical I/O module deals with file records. Logical I/O
provides a general-purpose record I/O capability and maintains basic data about files. The
level of the file system closest to the user is often termed the access method. It provides a
standard interface between applications and the file systems and devices that hold the data.
Different access methods reflect different file structures and different ways of accessing and
processing the data. Some of the most common access methods are shown in Figure 01, and
they have been described in the previous unit.
CLASS AND HOME ASSESSMENT
1. What is a file management system?
2. Give examples of devices controlled by device drivers.
3. What are the different functions performed by file management systems?

Elements of File Management


Another way of viewing the functions of a file system is shown in Figure 03. Let us
follow this diagram from left to right. Users and application programs interact with the
file system by means of commands for creating and deleting files and for performing
operations on files. Before performing any operation, the file system must identify and
locate the selected file. This requires the use of some sort of directory that serves to
describe the location of all files, plus their attributes. In addition, most shared systems
enforce user access control: Only authorised users are allowed to access particular files in
particular ways.

Fig 3: Elements of File Management

Source: Operating Systems: Internals and Design Principles, 5th ed., by William
Stallings (2004).

The basic operations that a user or application may perform on a file are performed at the
record level. The user or application views the file as having some structure that organises
the records, such as a sequential structure (e.g., personnel records are stored alphabetically
by last name). Thus, to translate user commands into specific file manipulation commands,
the access method appropriate to this file structure must be employed. Whereas users and
applications are concerned with records or fields, I/O is done on a block basis. Thus, the
records or fields of a file must be organised as a sequence of blocks for output and
unblocked after input. To support block I/O of files, several functions are needed. The
secondary storage must be managed. This involves allocating files to free blocks on
secondary storage and managing free storage so as to know what blocks are available for
new files and growth in existing files. In addition, individual block I/O requests must be
scheduled. Both disk scheduling and file allocation are concerned with optimising
performance. As might be expected, these functions therefore need to be considered
together.
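
The blocking and unblocking of records described above can be sketched in Python. This is
only an illustration under stated assumptions: records are fixed-length, the record size
(16 bytes) and block size (64 bytes) are arbitrary, and the names block_records and
unblock_records are invented for this example rather than taken from any real file system.

```python
RECORD_SIZE = 16   # assumed fixed record length in bytes
BLOCK_SIZE = 64    # assumed block size in bytes (4 records per block)
RECORDS_PER_BLOCK = BLOCK_SIZE // RECORD_SIZE


def block_records(records):
    """Pack fixed-length records into full-size blocks for output."""
    blocks = []
    for i in range(0, len(records), RECORDS_PER_BLOCK):
        chunk = records[i:i + RECORDS_PER_BLOCK]
        # Pad each record to RECORD_SIZE, then pad the block to BLOCK_SIZE.
        data = b"".join(r.ljust(RECORD_SIZE, b"\0") for r in chunk)
        blocks.append(data.ljust(BLOCK_SIZE, b"\0"))
    return blocks


def unblock_records(blocks):
    """Split blocks back into records after input, dropping the padding."""
    records = []
    for block in blocks:
        for i in range(0, BLOCK_SIZE, RECORD_SIZE):
            record = block[i:i + RECORD_SIZE].rstrip(b"\0")
            if record:            # an all-padding slot holds no record
                records.append(record)
    return records


names = [b"Adams", b"Baker", b"Clark", b"Davis", b"Evans"]
blocks = block_records(names)     # five records need two blocks
restored = unblock_records(blocks)
```

Here the user sees five records, while the I/O layer sees only two equal-sized blocks.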

Furthermore, the optimisation will depend on the structure of the files and the access
patterns. Accordingly, developing an optimum file management system from the
point of view of performance is an exceedingly complicated task.

Figure 3 suggests a division between what might be considered the concerns of the file
management system as a separate system utility and the concerns of the operating system,
with the point of intersection being record processing. This division is arbitrary; various
approaches are taken in various systems.

Operations Supported by File Management System


Users and applications wish to make use of files. Typical operations that must be supported
include the following:

Retrieve_All
Retrieve all the records of a file. This will be required for an application that must process
all of the information in the file at one time. For example, an application that produces a
summary of the information in the file would need to retrieve all records. This
operation is often equated with the term sequential processing, because all of the records
are accessed in sequence.

Retrieve_One
This requires the retrieval of just a single record. Interactive, transaction-oriented
applications need this operation.

Retrieve_Next
This requires the retrieval of the record that is “next” in some logical sequence to the most
recently retrieved record. Some interactive applications, such as filling in forms, may
require such an operation. A program that is performing a search may also use this
operation.

Retrieve_Previous
Similar to Retrieve_Next, but in this case the record that is “previous” to the currently
accessed record is retrieved.

Insert_One
Insert a new record into the file. It may be necessary that the new record fit into a particular
position to preserve a sequencing of the file.

Delete_One
Delete an existing record. Certain linkages or other data structures may need to be updated
to preserve the sequencing of the file.

Update_One
Retrieve a record, update one or more of its fields, and rewrite the updated record back into
the file. Again, it may be necessary to preserve sequencing with this operation. If the length
of the record has changed, the update operation is generally more difficult than if the length
is preserved.

Retrieve_Few
Retrieve a number of records. For example, an application or user may wish to retrieve all
records that satisfy a certain set of criteria.
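
The operations listed above can be pictured with a small in-memory sketch. It is a toy
model, not a real file management system: the class name SequentialFile, its method names,
and the sample staff records are all assumptions made for illustration, and the records are
simply kept in key order to stand in for a sequential file.

```python
import bisect


class SequentialFile:
    """Toy in-memory file whose records are kept sorted by key."""

    def __init__(self):
        self._keys = []      # record keys, always in sorted order
        self._records = {}   # key -> record (a dict of fields)
        self._pos = -1       # index of the most recently retrieved record

    def _at(self, pos):
        if not 0 <= pos < len(self._keys):
            raise IndexError("no record at that position")
        self._pos = pos
        return self._records[self._keys[pos]]

    def insert_one(self, key, record):
        if key in self._records:
            raise KeyError(f"duplicate key: {key}")
        bisect.insort(self._keys, key)   # preserve the sequencing
        self._records[key] = record
        self._pos = -1                   # position is reset after a change

    def delete_one(self, key):
        del self._records[key]
        self._keys.remove(key)
        self._pos = -1

    def update_one(self, key, **fields):
        self._records[key].update(fields)

    def retrieve_one(self, key):
        return self._at(self._keys.index(key))

    def retrieve_next(self):
        return self._at(self._pos + 1)

    def retrieve_previous(self):
        return self._at(self._pos - 1)

    def retrieve_all(self):
        return [self._records[k] for k in self._keys]

    def retrieve_few(self, predicate):
        return [r for r in self.retrieve_all() if predicate(r)]


staff = SequentialFile()
staff.insert_one("Okoro", {"name": "Okoro", "dept": "CSC"})
staff.insert_one("Bello", {"name": "Bello", "dept": "MTH"})
staff.insert_one("Musa", {"name": "Musa", "dept": "CSC"})
```

Note that insert_one must find the right position for the new record, whereas retrieve_all
simply walks the keys in order, which is why it is equated with sequential processing.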

The nature of the operations that are most commonly performed on a file will influence the
way the file is organized, as discussed under file organisation, which is covered in the next unit. It
should be noted that not all file systems exhibit the sort of structure discussed in this
subsection. On UNIX and UNIX-like systems, the basic file structure is just a stream of
bytes. For example, a C program is stored as a file but does not have physical fields,
records, and so on.

CLASS AND HOME EXERCISE

1. List different operations supported by a file management system.


2. What are the objectives of a file management system?

SUMMARY

In this lecture, you have learnt that:

File management system is the subsystem of an operating system that manages the data
storage organisation on secondary storage, and provides services to processes related to file
access.
FMS objectives include data management, protection of files against dangerous operations,
control over, etc.
Different operations are supported by FMS which include retrieve_one, retrieve_all,
and so on.
To a user, FMS serves as an interface to file creation and deletion, file ownership
and access control, logical identification
of data and technical failure prevention as a result of data redundancy.
FILE DIRECTORIES
Concept of File Directory

To keep track of files, the file system normally provides directories, which, in many systems,
are themselves files. The structure of the directories and the relationship among them are the
main areas where file systems tend to differ, and it is also the area that has the most
significant effect on the user interface provided by the file system.

Contents of File Directory

Table 8 suggests the information typically stored in the directory for each file in
the system. From the user’s point of view, the directory provides a mapping between file
names, known to users and applications, and the files themselves. Thus, each file entry
includes the name of the file. Virtually all systems deal with different types of files and
different file organisations, and this information is also provided. An important category of
information about each file concerns its storage, including its location and size. In shared
systems, it is also important to provide information that is used to control access to the file.
Typically, one user is the owner of the file and may grant certain access privileges to other
users. Finally, usage information is needed to manage the current use of the file and to record
the history of its usage.
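
The categories of information just described can be grouped into a single directory entry.
The sketch below is an assumption-laden illustration: the class name DirectoryEntry, every
field name, and the sample values are invented here, and a real file system would choose
its own layout.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class DirectoryEntry:
    # Basic information
    name: str
    file_type: str                 # e.g. "text" or "binary"
    organisation: str              # e.g. "sequential" or "indexed"
    # Storage information
    start_block: int               # location on secondary storage
    size_bytes: int
    # Access control information
    owner: str
    permissions: dict = field(default_factory=dict)   # user -> rights
    # Usage information
    created: datetime = field(default_factory=datetime.now)
    last_read: Optional[datetime] = None


entry = DirectoryEntry(
    name="notes.txt",
    file_type="text",
    organisation="sequential",
    start_block=120,
    size_bytes=2048,
    owner="user_a",
)

# The directory maps file names, known to users, to the entries themselves.
directory = {entry.name: entry}
```

Looking up directory["notes.txt"] then yields the location, size, owner and usage data
needed before any operation on the file can proceed.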

Table 8: Information Elements of a File Directory

File Directory Structure


The number of directories varies from one operating system to another. In this section, we
describe the most common schemes for defining the logical structure of a directory. These
are:

• Single-Level Directory

• Two-Level Directory

• Tree-Structured Directory

• Acyclic Graph Directory

Single-Level Directory
In a single-level directory system, all the files are placed in one directory. This is
very common on single-user operating systems. A single-level directory has significant
limitations when the number of files increases or when there is more than one user.
Since all files are in the same directory, they must have unique names. If there are two
users who call their data file “cit381note.doc”, then the unique-name rule is violated.
Even with a single user, as the number of files increases, it becomes difficult to
remember the names of all the files in order to create only files with unique names.

Figure 5 below shows the structure of a single-level directory system.

Fig 5: Single Level Directory


Source: Operating System Concepts with Java, 6th ed., by Abraham
Silberschatz and others (2004).
Two-Level Directory

In the two-level directory system, the system maintains a master block that has one
entry for each user. This master block contains the address of each user’s directory.
There are still problems with the two-level directory structure. It effectively isolates
one user from another. This design eliminates name conflicts among users, which is an
advantage because users are completely independent, but it is a disadvantage when users
want to cooperate on some task and access one another’s files. Some systems simply do
not allow local files to be accessed by other users. The structure is also unsatisfactory
for users with many files, because it is quite common for users to want to group their
files together in a logical way.
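
The difference between the single-level and two-level schemes can be sketched with Python
dictionaries. This is an illustration only: the helper create, the exception NameClash and
the user names userA and userB are hypothetical, and a dictionary merely stands in for a
directory.

```python
class NameClash(Exception):
    """Raised when a file name already exists in a directory."""


def create(directory, name, contents):
    """Add a file to a directory, enforcing the unique-name rule."""
    if name in directory:
        raise NameClash(name)
    directory[name] = contents


# Single-level: every user shares one directory, so names collide.
single_level = {}
create(single_level, "cit381note.doc", "user A's notes")
try:
    create(single_level, "cit381note.doc", "user B's notes")
    clash = False
except NameClash:
    clash = True

# Two-level: the master block maps each user to that user's own directory.
master_block = {"userA": {}, "userB": {}}
create(master_block["userA"], "cit381note.doc", "user A's notes")
create(master_block["userB"], "cit381note.doc", "user B's notes")
```

The same file name is rejected in the shared directory but accepted once per user in the
two-level scheme. Neither user can reach the other's file without going through the master
block, which is the isolation noted above.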

Figure 6 below shows the two-level directory.


Fig 6: Two-Level Directory

Tree-Structured Directories

In the tree-structured directory, the directories themselves are considered as files. This leads to
the possibility of having sub-directories that can contain files and sub-subdirectories. An
interesting policy decision in a tree-structured directory structure is how to handle the
deletion of a directory. If a directory is empty, its entry in its containing directory can simply
be deleted. However, suppose the directory to be deleted is not empty, but contains several
files or sub-directories then it becomes a bit problematic. Some systems will not delete a
directory unless it is empty. Thus, to delete a directory, someone must first delete all the files
in that directory. If there are any subdirectories, this procedure must be applied recursively to
them so that they can be deleted too. This approach may result in a substantial amount of
work. An alternative approach is just to assume that when a request is made to delete a
directory, all of that directory’s files and sub-directories are also to be deleted. The
tree structure is the most common directory structure.
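
Both deletion policies can be observed from Python's standard library. In this sketch
os.rmdir plays the part of a system that refuses to delete a non-empty directory, and
shutil.rmtree plays the part of the recursive alternative. The directory and file names
are made up, and everything happens inside a temporary scratch directory.

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()                      # scratch area for the demo
project = os.path.join(root, "project")
os.makedirs(os.path.join(project, "drafts"))   # directory with a sub-directory
with open(os.path.join(project, "drafts", "a.txt"), "w") as f:
    f.write("draft")

# Policy 1: refuse to delete a directory that is not empty.
try:
    os.rmdir(project)
    refused = False
except OSError:
    refused = True

# Policy 2: delete the directory together with everything beneath it.
shutil.rmtree(project)
removed = not os.path.exists(project)

os.rmdir(root)   # the scratch root is empty now, so rmdir succeeds
```

The second policy is more convenient but also more dangerous, since one request can remove
an entire subtree.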

A typical tree-structured directory system is shown in Figure 7.

Fig. 7: Tree-Structure System

Source: Operating System Concepts with Java, 6th ed., by Abraham
Silberschatz and others (2004).
Acyclic-Graph Directories

The acyclic-graph directory structure is an extension of the tree-structured directory
structure. In the tree-structured directory, files and directories starting from some fixed
directory are owned by one particular user. In the acyclic structure, this restriction is
removed, and thus a directory or a file under a directory can be owned by several users.

Figure 8 shows an acyclic-graph directory structure.

Fig 8: Acyclic Graph Directory


Source: Operating System Concepts with Java, 6th ed., by Abraham
Silberschatz and others (2004).

CLASS AND HOME EXERCISE

1. Explain briefly what you understand by directory structure.


2. Outline the different directory structures.

Path Names
When a file system is organized as a directory tree, some way is needed for specifying the
filenames. The use of a tree-structured directory minimizes the difficulty in assigning
unique names. Any file in the system can be located by following a path from the root or
master directory down various branches until the file is reached. The series of directory
names, culminating in the file name itself, constitutes a pathname for the file. Two different
methods commonly used are:

Absolute Path name


Relative Path name

Absolute Path Name


With this path name, each file is given a path consisting of the path from the root directory
to the file. As an example, the file in the lower left-hand corner of Figure 09 has the
pathname /User_B/Word/Unit_A/ABC. The slash is used to delimit names in the sequence.
The name of the master directory is implicit, because all paths start at that directory. Note
that it is perfectly acceptable to have several files with the same file name, as long as they
have unique pathnames, which is equivalent to saying that the same file name may be used
in different directories. In this example, there is another file in the system with the file name
ABC, but that has the pathname /User_B/Draw/ABC.

Fig 9:Another Representation of a Tree-Structure Directory


Source: Operating Systems: Internals and Design Principles, 5th ed., by William
Stallings (2004).

Note that absolute file names always start at the root directory and are unique. In UNIX the
file components of the path are separated by /. In MS-DOS the separator is \. In
MULTICS it is >. No matter which character is used, if the first character of the path
name is the separator, then the path is absolute.

Relative Path Name


Although the pathname facilitates the selection of file names, it would be awkward for a
user to have to spell out the entire pathname every time a reference is made to a file.
Typically, an interactive user or a process has associated with it a current directory, often
referred to as the working directory. Files are then referenced relative to
the working directory. For example, if the working directory for user B is “Word,” then the
pathname Unit_A/ABC is sufficient to identify the file in the lower left-hand corner of
Figure 09. When an interactive user logs on, or when a process is created, the default for the
working directory is the user’s home directory. During execution, the user can navigate up or
down in the tree to change to a different working directory.
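
Absolute and relative path names can be tried out with Python's posixpath module, which
follows the UNIX convention of / as the separator. The path names are the ones used in the
example above. The ".." notation for moving up one level is a UNIX convention that these
notes do not otherwise cover, so treat that line as an extra assumption.

```python
import posixpath

absolute = "/User_B/Word/Unit_A/ABC"   # starts with the separator
working_dir = "/User_B/Word"           # working directory of user B
relative = "Unit_A/ABC"                # interpreted relative to working_dir

# A path is absolute when its first character is the separator.
is_abs = posixpath.isabs(absolute)
is_rel = not posixpath.isabs(relative)

# Resolving a relative path name against the working directory.
resolved = posixpath.join(working_dir, relative)

# Going the other way: shorten an absolute path name for this working directory.
shortened = posixpath.relpath(absolute, working_dir)

# Navigating up one level and down another branch (UNIX ".." convention).
other = posixpath.normpath(posixpath.join(working_dir, "../Draw/ABC"))
```

Both ABC files share a file name, yet their absolute path names differ, which is what keeps
them distinct.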

CLASS AND HOME EXERCISE

1. What is a path name?


2. What is the difference between relative and absolute path names?
3. Use the diagram below to answer the following questions:
KWASU

FILE-A
FILE-B
STUDENT PG TEST.TXT
EXAM.DOC
DEPT.RTF

STAFF ASSU SALARY.XLS

I.D.MDB
NASU ARREARS.REC

a. Write out the absolute pathname for EXAM.DOC, TEST.TXT, SALARY.XLS, and ARREARS.REC
b. Write out the relative pathname for DEPT.RTF and I.D.MDB using
STUDENT and STAFF as the working directories respectively.

FILE AND DIRECTORY OPERATIONS


Operations on Files and Directories

The operating system provides system calls to create, write, read, reposition, truncate and
delete files. The following sub-units discuss the specific duties a file system must perform
for each of these basic file operations.
File Operations

The following are various operations that can take place on a file:

a. Creating a File
When creating a file, a space in the file system must be found for the file and then an entry
for the new file must be made in the directory. The directory entry records the name of the
file and the location in the file system.

b. Opening a File
Before using a file, a process must open it. The purpose of the OPEN call is to allow the
system to fetch the attributes and list of secondary storage disk addresses into main memory
for rapid access on subsequent calls.

c. Closing a File
When all the accesses are finished, the attributes and secondary storage addresses are no
longer needed, so the file should be closed to free up internal table space. Many systems
encourage this by imposing a maximum number of open files on processes.

d. Writing a File
To write a file, a system call is made specifying both the name of the file and the
information to be written to the file. Given the name of the file, the system searches the
directory to find the location of the file. The directory entry will need to store a pointer to
the current block of the file (usually the beginning of the file). Using this pointer, the
address of the next block, where the information will be written, can be computed. The write
pointer must be updated after each write so that successive writes can be used to write a
sequence of blocks to the file. It is also important to make sure that existing data is not
overwritten in an append operation, i.e. when a block of data is added at the
end of an already existing file.

e. Reading a File
To read a file, a system call is made that specifies the name of the file and where (in
memory) the next block of the file should be put. Again, the directory is searched for the
associated directory entry, and the directory will need a pointer to the next block to be
read. Once the block is read, the pointer is updated.

f. Repositioning a File
When repositioning a file, the directory is searched for the appropriate entry, and the current
file position is set to a given value. This file operation is also called file seek.

g. Truncating a File
The user may erase some contents of a file but keep its attributes. Rather than forcing the
user to delete the file and then recreate it, this operation allows all the attributes to remain
unchanged, except the file size.

h. Deleting a File
To delete a file, the directory is searched for the named file. Once the associated
directory entry has been found, the space allocated to the file is released (so it can be
reused by other files) and the directory entry is invalidated.

i. Renaming a File
It frequently happens that a user needs to change the name of an existing file. This system
call makes that possible. It is not always strictly necessary, because the file can always be
copied to a new file with the new name, and the old file then deleted.

j. Appending a File
This call is a restricted form of the WRITE call. It can only add data to the end of the file.
Systems that provide a minimal set of system calls do not generally have APPEND, but many
systems provide multiple ways of doing the same thing, and these systems sometimes have
APPEND.

The ten operations described comprise only the minimal set of required file operations.
Others may include copying and executing a file. Also of use are facilities to lock
sections of an open file for multiprogramming access, to share sections, and even to
map sections into memory on virtual-memory systems. This last function allows a part of the
virtual address space to be logically associated with a section of a file. Reads and writes to
that memory region are then treated as reads and writes to the file.
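
Most of the file operations described in this section can be exercised from Python, whose
built-in open and os functions wrap the underlying system calls. The sketch below works in
a temporary directory. The file names and contents are made up, and binary mode is used so
that file positions are plain byte offsets.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "notes.txt")

# Create and write: opening with "wb" makes the directory entry for the file.
with open(path, "wb") as f:
    f.write(b"0123456789")

# Append: a restricted write that can only add data at the end of the file.
with open(path, "ab") as f:
    f.write(b"ABC")

# Open, reposition (seek), read, truncate; leaving the block closes the file.
with open(path, "r+b") as f:
    f.seek(10)                 # reposition to byte offset 10
    tail = f.read()            # reads the appended bytes
    f.truncate(5)              # keep only the first 5 bytes
    f.seek(0)
    head = f.read()

size_after_truncate = os.path.getsize(path)

# Rename the file, then delete it and clean up the scratch directory.
new_path = os.path.join(workdir, "renamed.txt")
os.rename(path, new_path)
renamed = os.path.exists(new_path) and not os.path.exists(path)
os.remove(new_path)
os.rmdir(workdir)
```

After truncation the file keeps its name and other attributes, and only its size changes.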

HOME AND CLASS EXERCISE


1. Write briefly on the five types of operation that can be performed on a
file.
2. What is the difference between appending and writing a file?
Directory Operations

When considering a particular directory structure, we need to keep in mind the operations
that are to be performed on a directory.

a. Create a File
New files need to be created and added to the directory.

b. Delete a File

When a file is no longer needed, we want to remove it from the directory. A directory
itself can be deleted only when it is empty.

c. Open a Directory

Directories can be read. For example, to list all files in a directory, a listing program opens
the directory to read out the names of all the files it contains. Before a directory can be
read, it must be opened.

d. Close a Directory

When a directory has been read, it should be closed to free up internal table space.

e. Read a Directory

This call returns the next entry in an open directory. Formerly, it was possible to read
directories using the usual READ system call, but that approach has the disadvantage of
forcing the programmer to know and deal with the internal structure of directories. In
contrast, READDIR always returns one entry in a standard format, no matter which of the
possible directory structure is being used.

f. Rename a File

Because the name of a file represents its contents to its users, the name must be changeable
when the contents or use of the file changes. Renaming a file may also allow its
position within the directory structure to be changed.

g. Search for a File

We need to be able to search a directory structure to find the entry for a particular file.

h. List a Directory

We need to list the files in a directory and the contents of the directory entry for each file
in the list.
Note that the above list gives the most important operations, but there are a few others
as well, for example, for managing the protection information associated with a
directory.
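
The open, read, close, search, list and rename operations on a directory can be sketched
with Python's os.scandir, which hands back one entry at a time in a standard form. The
directory and file names below are made up, and the whole demonstration runs inside a
temporary directory that is removed at the end.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    docs = os.path.join(top, "docs")
    os.mkdir(docs)                                   # create a directory
    for name in ("b.txt", "a.txt"):                  # create two files in it
        with open(os.path.join(docs, name), "w") as f:
            f.write(name)

    # Open the directory, read it entry by entry, then close it.
    with os.scandir(docs) as entries:
        listing = sorted((e.name, e.stat().st_size) for e in entries)

    # Search the listing for a particular file name.
    found = any(name == "a.txt" for name, _ in listing)

    # Rename the directory; its files move with it.
    papers = os.path.join(top, "papers")
    os.rename(docs, papers)
    moved = os.path.isfile(os.path.join(papers, "a.txt"))
```

The program never needs to know how the directory is stored internally, because each entry
arrives in the same format.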

File Sharing
In a multiuser system, there is almost always a requirement for allowing files to be shared
among a number of users. Two issues arise: access rights and the management of
simultaneous access.
Access Rights
The file system should provide a flexible tool for allowing extensive file sharing among
users. The file system should provide a number of options so that the way in which a
particular file is accessed can be controlled. Typically, users or groups of users are granted
certain access rights to a file. A wide range of access rights are in use. The following list is
representative of access rights that can be assigned to a particular user for a particular file:

None: The user may not even know of the existence of the file, much less access it.
To enforce this restriction, the user would not be allowed to read the user directory that
contains this file.
Knowledge: The user can determine that the file exists and who its owner is. The user
is then able to petition the owner for additional access rights.
Execution: The user can load and execute a program but cannot copy it. Proprietary
programs are often made accessible with this restriction.
Reading: The user can read the file for any purpose, including copying and execution.
Some systems are able to enforce a distinction between viewing and copying. In the former
case, the contents of the file can be displayed to the user, but the user has no means for
making a copy.
Appending: The user can add data to the file, often only at the end, but cannot modify
or delete any of the file’s contents. This right is useful in collecting data from a number of
sources.
Updating: The user can modify, delete, and add to the file’s data. This normally
includes writing the file initially, rewriting it completely or in part, and removing all
or a portion of the data. Some systems distinguish among different degrees of updating.
Changing protection: The user can change the access rights granted to other users.
Typically, this right is held only by the owner of the file. In some systems, the owner
can extend this right to others. To prevent abuse of this mechanism, the file owner
will typically be able to specify which rights can be changed by the holder of this
right.
Deletion: The user can delete the file from the file system.
These rights can be considered to constitute a hierarchy, with each right implying those
that precede it. Thus, if a particular user is granted the updating right for a particular file,
then that user is also granted the following rights: knowledge, execution, reading, and
appending.
One user is designated as owner of a given file, usually the person who initially created
the file. The owner has all of the access rights listed previously and may grant rights to
others. Access can be provided to different classes of users:
Specific user: Individual users who are designated by user ID.
User groups: A set of users who are not individually defined. The system must have some
way of keeping track of the membership of user groups.
All: All users who have access to this system. These are public files.
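
The hierarchy of access rights described above, in which each right implies those that
precede it, can be modelled with an ordered enumeration. This is a sketch under
assumptions: the names Right, allows, acl and can, and the users in the access list, are
invented for illustration, and real systems record access rights in other ways.

```python
from enum import IntEnum


class Right(IntEnum):
    """Access rights in the order listed, weakest first."""
    NONE = 0
    KNOWLEDGE = 1
    EXECUTION = 2
    READING = 3
    APPENDING = 4
    UPDATING = 5
    CHANGING_PROTECTION = 6
    DELETION = 7


def allows(granted, requested):
    """A granted right implies every right that precedes it."""
    return requested <= granted


# Hypothetical access list for one file: user ID -> highest right granted.
acl = {
    "owner": Right.DELETION,
    "ada": Right.UPDATING,
    "guest": Right.KNOWLEDGE,
}


def can(user, requested):
    """Users missing from the list fall back to the right NONE."""
    return allows(acl.get(user, Right.NONE), requested)
```

With this table, ada may read and append because she holds the updating right, while guest
can only learn that the file exists.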

Simultaneous Access
When more than one user is granted access to append to or update a file, the operating
system or file management system must enforce discipline. A brute-force approach is to
allow a user to lock the entire file when it is to be updated. A finer grain of control is
to lock individual records during update.
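
Record-level locking can be pictured with threads standing in for users. This is only an
analogy: a real file management system locks records on secondary storage rather than
entries in a Python dictionary, and the names records, record_locks and update_record are
hypothetical.

```python
import threading

records = {1: 100, 2: 200}                             # record key -> value
record_locks = {key: threading.Lock() for key in records}


def update_record(key, delta):
    """Lock only the record being updated, not the whole file."""
    with record_locks[key]:
        records[key] = records[key] + delta


def worker():
    for _ in range(1000):
        update_record(1, 1)


# Four "users" update record 1 at the same time.
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each update holds the lock for record 1, no increment is lost, and record 2 stays
available to other users throughout. Replacing the per-record locks with a single lock for
the whole dictionary would correspond to the brute-force approach of locking the entire
file.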

HOME AND CLASS EXERCISE

1. How can access to files in networked environment be controlled?


2. What are the different groups of people access can be granted to?

SUMMARY
In this lecture, you have learnt that:
Different operations can be performed on files and directories.
Examples of operations on files include creating, reading, closing, writing, opening,
repositioning, renaming, and others.
Creating, deleting, opening, closing and reading are some of the operations that
can be performed on directories.
The file system provides a number of options with regard to the right to access a file, so as
to control the way in which a particular file is accessed.
More than one user can be granted access to a file, but with some level of discipline.
