OS Lec 11 & 12
• Identifier: Every file is identified by a unique tag number within a file system known as an identifier.
• Type: This attribute is required for systems that support various types of files.
• Protection: This attribute assigns and controls the access rights for reading, writing, and executing the file.
• Time, date and security: This information is recorded for protection, security, and usage monitoring.
File System
• File Type
• File type refers to the ability of the operating system to distinguish different types of files, such as text files, source
files, and binary files. Many operating systems support several types of files. Operating systems like MS-DOS and
UNIX have the following types of files −
• Ordinary files
• These are the files that contain user information.
• These may contain text, databases, or executable programs.
• The user can apply various operations to such files, such as adding, modifying, or deleting content, or removing the entire file.
• Directory files
• These files contain a list of file names and other information related to those files.
• Special files
• These files are also known as device files.
• These files represent physical devices such as disks, terminals, printers, networks, and tape drives.
• These files are of two types −
• Character special files − data is handled character by character, as in the case of terminals or printers.
• Block special files − data is handled in blocks as in the case of disks and tapes.
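The file types listed above can be told apart programmatically. The following is a minimal sketch, assuming a POSIX-like system and Python's standard `os` and `stat` modules; the function name `classify` and its labels are illustrative, not part of any operating system interface.

```python
import os
import stat

def classify(path):
    """Return a coarse file-type label for path, using the categories above."""
    mode = os.lstat(path).st_mode  # lstat: inspect the entry itself, not a link target
    if stat.S_ISREG(mode):
        return "ordinary"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special"
    if stat.S_ISBLK(mode):
        return "block special"
    return "other"  # e.g. symbolic link, FIFO, socket

print(classify(os.curdir))
```

On a typical UNIX system, a terminal device would be reported as character special and a disk device as block special.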
Access Method
• The operating system can access the information in a file in several ways, called file access methods. These
methods provide operations such as reading from or writing to a file stored on the computer. The main file access methods
are:
• Sequential Access Method
• Direct Access Method
• Indexed Access Method
• Sequential Access Method :
• Among all the access methods, this is considered the simplest. As the name suggests, the information in a file is processed
sequentially, one record after another. Because of its simplicity, most compilers, editors, etc. use this method. Processing is
carried out with two operations, read and write. A read operation reads the next portion of the file and, after a successful
read, automatically advances the file pointer, which tracks the I/O location, to the next record.
A write operation appends to the end of the file and moves the pointer to the end of the newly written record.
• While processing records sequentially, some records can be skipped in either direction (forward or backward), and the
file can be reset, or rewound, to its beginning. The above figure shows a tape model of
sequential access of files.
• Advantages of Sequential Access Method :
• This method of file access is easy to implement.
• It provides fast access to the next record when records are processed in their stored order (for example, lexicographic order).
• Disadvantages of Sequential Access Method :
• This type of file access is slow if the record to be accessed next is not adjacent to the current record.
• Inserting a new record may require moving a large proportion of the file.
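The read, write, and rewind operations of the sequential (tape) model can be sketched as follows. This is a minimal in-memory illustration, not how any particular operating system implements it; the class name `SequentialFile`, the method names, and the fixed 16-byte record size are assumptions made for the example.

```python
import io

RECORD_SIZE = 16  # assumed fixed record length in bytes

class SequentialFile:
    def __init__(self):
        self._data = io.BytesIO()  # stands in for the file contents
        self._pos = 0              # index of the next record to read

    def write_next(self, record):
        """Append a record at the end of the file and leave the pointer after it."""
        if len(record) > RECORD_SIZE:
            raise ValueError("record too long")
        self._data.seek(0, io.SEEK_END)
        self._data.write(record.ljust(RECORD_SIZE, b"\0"))  # pad to fixed size
        self._pos = self._data.tell() // RECORD_SIZE

    def read_next(self):
        """Return the record at the pointer and advance; None at end of file."""
        self._data.seek(self._pos * RECORD_SIZE)
        raw = self._data.read(RECORD_SIZE)
        if len(raw) < RECORD_SIZE:
            return None
        self._pos += 1
        return raw.rstrip(b"\0")  # strip the padding added by write_next

    def rewind(self):
        """Reset the pointer to the beginning of the file."""
        self._pos = 0

tape = SequentialFile()
tape.write_next(b"alpha")
tape.write_next(b"beta")
tape.rewind()
first = tape.read_next()
second = tape.read_next()
```

Reaching the second record requires reading past the first, which is the cost the disadvantages above refer to.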
• Direct Access Method :
• This access method is also called relative access: records can be read or written in any order, irrespective of their sequence. The
file is viewed as a numbered sequence of blocks or records, as on a disk, where any block can be reached directly. For example, block 40 can be
accessed first, followed by block 10 and then block 30, and so on. This eliminates the need for sequential read or write operations.
• A major example of this type of file access is a database, where the block that answers a query is accessed directly for a fast
response. This type of access saves a lot of time when a large amount of data is present. In such cases, hash functions and
indexes are used to locate a particular block.
• The read next and write next operations of sequential access are replaced by read n and write n, where n is the block number. An
alternative approach is to use a direct-access operation to position the file at block n and then use the sequential read next
operation to read that block.
• This can be done by using relative block numbers, which are indexes relative to the beginning of the file. For example, relative
block numbers 0, 1, 2, ... can be allocated to physical block numbers 72, 1423, 20, etc. This approach is adopted by some
operating systems, while others support only sequential or only direct access.
• Advantages of Direct Access Method :
• The files can be immediately accessed decreasing the average access time.
• In the direct access method, in order to access a block, there is no need of traversing all the blocks present before it.
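The read n and write n operations, together with relative block numbers, can be sketched as below. This is an illustrative in-memory model only; the names `DirectFile`, `read_block`, `write_block`, and `read_relative`, and the 512-byte block size, are assumptions. The mapping table reuses the block numbers 72, 1423, 20 from the example above.

```python
import io

BLOCK_SIZE = 512  # assumed block size in bytes

class DirectFile:
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._data = io.BytesIO(bytes(BLOCK_SIZE * num_blocks))  # zero-filled "disk"

    def _seek(self, n):
        if not 0 <= n < self.num_blocks:
            raise IndexError("block number out of range")
        self._data.seek(n * BLOCK_SIZE)  # jump straight to block n

    def write_block(self, n, payload):
        """write n: overwrite block n without touching earlier blocks."""
        if len(payload) > BLOCK_SIZE:
            raise ValueError("payload larger than a block")
        self._seek(n)
        self._data.write(payload.ljust(BLOCK_SIZE, b"\0"))

    def read_block(self, n):
        """read n: return block n directly."""
        self._seek(n)
        return self._data.read(BLOCK_SIZE)

def read_relative(disk, table, rel):
    """Read relative block rel of a file whose blocks live at table[rel] on disk."""
    return disk.read_block(table[rel])

disk = DirectFile(2048)
table = [72, 1423, 20]  # relative block 0, 1, 2 -> physical block
disk.write_block(table[1], b"second block of the file")
data = read_relative(disk, table, 1).rstrip(b"\0")
```

The user of the file only ever names relative blocks 0, 1, 2; where they sit on the disk is hidden in the table.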
• Indexed Access Method
• This method extends the direct access method by adding an index. A particular record is
located by searching the index, and the file is then accessed directly using the pointers or addresses stored in the index,
as shown below.
• To understand the concept, consider a book store whose database contains a 12-digit ISBN and a four-digit product price for each book. If a
disk block holds 2048 bytes (2 KB), then 128 records of 16 bytes (12 for the ISBN and 4 for the price) can be stored in a single
block. A file of 128000 records therefore occupies 1000 blocks, so the index needs only 1000 entries, each entry carrying
10 digits. To find the price of a book, a binary search is performed over the index to identify the block containing that book, and only
that block is then read.
• A drawback of this method is that it becomes ineffective for a large database with very large files, because the index itself
grows too large.
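The book store example can be sketched in a few lines. This is a minimal illustration under stated assumptions: the records are generated, the index holds the first ISBN of each block, and the helper names `build_blocks`, `build_index`, and `lookup` are hypothetical.

```python
import bisect

RECORDS_PER_BLOCK = 128  # 2048-byte block / 16-byte record

def build_blocks(records):
    """Sort records by ISBN and split them into blocks of RECORDS_PER_BLOCK."""
    records = sorted(records)
    return [records[i:i + RECORDS_PER_BLOCK]
            for i in range(0, len(records), RECORDS_PER_BLOCK)]

def build_index(blocks):
    """One index entry per block: the smallest ISBN stored in that block."""
    return [block[0][0] for block in blocks]

def lookup(isbn, index, blocks):
    """Binary-search the index, then scan only the one block that can hold isbn."""
    i = bisect.bisect_right(index, isbn) - 1  # last block whose first ISBN <= isbn
    if i < 0:
        return None
    for key, price in blocks[i]:
        if key == isbn:
            return price
    return None

# Generated demo data: 128000 records of (12-digit ISBN, four-digit price).
records = [(f"{n:012d}", n % 10000) for n in range(128000)]
blocks = build_blocks(records)
index = build_index(blocks)
price = lookup("000000054321", index, blocks)
```

Here `len(index)` is 1000, so the binary search touches about ten index entries before a single block is read.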
Directory Structure
• A directory is a collection of related files on the disk. In simple words, a directory is like a container that
holds files and folders. A directory entry can store all of a file's attributes or only some of them.
With the help of the directory, we can maintain the information related to
the files.
• There are various types of information which are stored in a directory:
• Name: - Name is the name of the directory, which is visible to the user.
• Type: - Type means the kind of directory structure in use, such as single-level directory, two-level directory,
tree-structured directory, or acyclic graph directory.
• Location: - Location is the device, and the position on that device, where the header of a file is stored.
• Size: - Size means the number of words/blocks/bytes in the file.
• Position: - Position means the position of the next-read pointer and the next-write pointer.
• Protection: - Protection means access control for read/write/delete/execute operations.
• Usage: - Usage means the time of creation, modification, access, etc.
• Mounting: - Mounting indicates whether the root of a file system is grafted into the existing tree of another file system.
• Operations on Directory
• The various types of operations on the directory are:
• Creating: - In this operation, a directory is created. The name of the directory should be unique.
• Deleting: - If there is a file that we don't need, we can delete that file from the directory. We can also remove
the whole directory if it is no longer required. An empty directory can also be deleted; an empty directory is a
directory that contains only the dot and dot-dot entries.
• Searching: - In this operation, we search a directory for a specific file or for another directory.
• List a directory: - In this operation, we retrieve the list of all files in the directory, along with the
content of the directory entry for every file in the list.
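The four operations above map onto standard library calls in most languages. The following sketch uses Python's `os` and `tempfile` modules inside a temporary directory; the function name `directory_operations_demo` and the file names are made up for the example.

```python
import os
import tempfile

def directory_operations_demo():
    """Create, list, search, and delete a directory; return what was observed."""
    with tempfile.TemporaryDirectory() as root:
        project = os.path.join(root, "project")
        os.mkdir(project)  # creating: fails if the name already exists
        for name in ("a.txt", "b.txt"):
            with open(os.path.join(project, name), "w") as f:
                f.write("demo\n")

        # listing: scandir yields one entry per file, with its attributes
        listing = sorted(entry.name for entry in os.scandir(project))

        # searching: look for a specific name in the directory
        found = "b.txt" in os.listdir(project)

        # deleting: remove the files first, then the now-empty directory
        for name in listing:
            os.remove(os.path.join(project, name))
        os.rmdir(project)  # rmdir only succeeds on an empty directory
        exists_after = os.path.exists(project)
    return listing, found, exists_after

result = directory_operations_demo()
```

Note that `os.rmdir` refuses to remove a directory that still contains entries, which is why the files are removed first.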
• Types of Directory Structure
• There are various types of directory structure:
• Single-Level Directory
• Two-Level Directory
• Tree-Structured Directory
• Acyclic Graph Directory
• Single-level directory –
The single-level directory is the simplest directory structure. All files are contained in the same directory, which makes it easy to support and
understand.
• A single-level directory has a significant limitation, however, when the number of files increases or when the system has more than
one user. Since all the files are in the same directory, they must have unique names. If two users both call their data set test, then the
unique-name rule is violated.
• Advantages
• Since there is only a single directory, its implementation is very easy.
• If the directory holds only a few files, searching is fast.
• Operations like file creation, searching, deletion, and updating are very easy in such a directory structure.
• Disadvantages:
• Name collisions are likely, because two files cannot have the same name.
• Searching becomes time-consuming if the directory is large.
• Files of the same type cannot be grouped together.
• Two-level directory –
As we have seen, a single-level directory often leads to
confusion of file names among different users. The solution to
this problem is to create a separate directory for each user.
• In the two-level directory structure, each user has their
own user file directory (UFD). The UFDs have similar
structures, but each lists only the files of a single user. The
system's master file directory (MFD) is searched whenever a
user logs in. The MFD is indexed by user name or
account number, and each entry points to the UFD for that user.
• Advantages:
• We can give a full path like /User-name/directory-name/.
• Different users can have the same directory name as well as the same file
name.
• Searching for files becomes easier due to path names
and user grouping.
• Disadvantages:
• A user is not allowed to share files with other users.
• It is still not very scalable: two files of the same type cannot
be grouped together within the same user's directory.
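The MFD-to-UFD lookup described in this section can be modelled with nested dictionaries. This is a toy sketch, not an operating system implementation; the class `TwoLevelDirectory`, its methods, and the user names are invented for illustration.

```python
class TwoLevelDirectory:
    def __init__(self):
        self.mfd = {}  # master file directory: user name -> that user's UFD

    def add_user(self, user):
        """Create an empty user file directory (UFD) for a new user."""
        self.mfd.setdefault(user, {})

    def create(self, user, filename, contents):
        """Create a file in the user's UFD; names must be unique per user only."""
        ufd = self.mfd[user]
        if filename in ufd:
            raise FileExistsError(f"{filename} already exists for {user}")
        ufd[filename] = contents

    def lookup(self, path):
        """Resolve a path of the form /user-name/file-name."""
        user, filename = path.strip("/").split("/", 1)
        return self.mfd[user][filename]

fs = TwoLevelDirectory()
fs.add_user("alice")
fs.add_user("bob")
fs.create("alice", "test", "alice's data")
fs.create("bob", "test", "bob's data")  # same name, different UFD: no collision
```

Because each UFD is a flat dictionary, a user still cannot group files into subdirectories, which is the scalability limit noted above.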
• Tree-Structured Directory:
• Once we have seen a two-level directory as a tree of height 2,
the natural generalization is to extend the directory structure to a
tree of arbitrary height.
This generalization allows users to create their own
subdirectories and to organize their files accordingly.
• A tree structure is the most common directory structure. The
tree has a root directory, and every file in the system has a
unique path.
• Advantages:
• Very general, since a full path name can be given.
• Very scalable; the probability of name collision is lower.
• Searching becomes very easy; we can use both absolute and
relative paths.
• Disadvantages:
• Not every file fits into the hierarchical model; a file may need to be
saved in multiple directories.
• We cannot share files.
• It can be inefficient, because reaching a file may require traversing multiple
directories.
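Absolute and relative path lookup in a tree-structured directory can be sketched with nested dictionaries. The helper names `resolve` and `find` and the sample tree are assumptions for the example; `posixpath` is Python's standard module for UNIX-style paths.

```python
import posixpath

def resolve(path, cwd="/"):
    """Turn an absolute or relative path into a normalized absolute path."""
    if not posixpath.isabs(path):
        path = posixpath.join(cwd, path)  # relative paths start at the current directory
    return posixpath.normpath(path)       # collapses "." and ".." components

def find(tree, path, cwd="/"):
    """Walk the tree from the root, one path component per level."""
    node = tree
    for part in resolve(path, cwd).strip("/").split("/"):
        if part:
            node = node[part]  # raises KeyError if the component does not exist
    return node

# A small directory tree: dicts are directories, strings are file contents.
tree = {"home": {"alice": {"notes.txt": "hello"}, "bob": {}}, "etc": {}}

by_absolute = find(tree, "/home/alice/notes.txt")
by_relative = find(tree, "../alice/notes.txt", cwd="/home/bob")
```

Each path component costs one directory lookup, which illustrates why deep trees make access slower.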
• Acyclic graph directory –
An acyclic graph is a graph with no cycles; it allows
subdirectories and files to be shared. The same file or subdirectory may be
in two different directories. It is a natural generalization of the
tree-structured directory.
• It is used in situations such as two programmers
working on a joint project who both need to access its files. The
associated files are stored in a subdirectory, separating them
from other projects and from the files of other programmers. Since they
are working on a joint project, both want the subdirectory to appear
in their own directories, so the common subdirectory should
be shared. Here we use an acyclic graph directory.
• Note that a shared file is not the same as a copy of the file:
if either programmer makes a change in the shared subdirectory, the change
is visible in both directories.
• Advantages:
• We can share files.
• Searching is easy because a file can be reached through different paths.
• Disadvantages:
• Files are shared via linking, which can cause problems on deletion.
• If the link is a soft link, deleting the file leaves us with a
dangling pointer.
• In the case of a hard link, to delete a file we have to delete all the
references associated with it.
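The difference between soft links and hard links on deletion can be observed directly. The sketch below assumes a POSIX-like system where `os.link` and `os.symlink` are permitted; the function name `link_demo` and the file names are illustrative.

```python
import os
import tempfile

def link_demo():
    """Share one file through a hard link and a soft link, then delete the original."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "report.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("shared data")

        os.link(original, hard)     # hard link: a second name for the same file
        os.symlink(original, soft)  # soft link: a small file holding the path
        links_before = os.stat(original).st_nlink  # reference count of the file

        os.remove(original)         # removes one name, not necessarily the data

        with open(hard) as f:
            via_hard = f.read()     # data survives while a hard link remains
        # The soft link still exists as an entry, but its target is gone.
        dangling = os.path.islink(soft) and not os.path.exists(soft)
    return links_before, via_hard, dangling

result = link_demo()
```

The data is released only when the last hard link is removed, while the soft link is left pointing at a path that no longer exists.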
File System Mounting
• Mounting makes the files of a file system accessible to a user or a group of users by attaching that file system to the directory structure. A mount can be local or
remote: a local mount attaches disk drives on the same machine, while a remote mount uses the Network File System
(NFS) to connect to directories on other machines so that they can be used as if they were part of the user's file system.
• The directory structure can be built out of multiple volumes, which must be mounted to make them available within the
file-system namespace. The procedure for mounting is simple: the OS is given the name of the device and the location within the file
structure at which to attach the file system (the mount point).
• For example, in a UNIX system, there is a single directory tree, and all accessible storage must have a location in that
tree. Mounting is used to make the storage accessible. A file system containing users' home directories might be
mounted as /home; its directories are then accessed by prefixing their names with /home, as in /home/janc. Similarly, if the file system is
mounted as /user, then we would use /user/janc to access it. Next, the operating system verifies that the device contains a valid file system
by asking the device driver to read the device directory and checking that the directory has the expected format. Finally, the operating system
notes in its directory structure that a file system is mounted at the specified mount point.
• There are two types of mounts: remote mounts and local mounts. A remote mount attaches a file system that resides on a remote system, with data
transmitted over a network connection. Remote file systems, such as Network File System (NFS), require that the files be
exported before they can be mounted. Local mounts are mounts done on your local system.
• Each file system is associated with a different device (logical volume). Before you can use a file system, it must be connected to the
existing directory structure (either the root file system or to another file system that is already connected). The mount command
makes this connection.
• The same file system, directory, or file can be accessed by multiple paths. For example, if you have one database and several users
using this database, it can be useful to have several mounts of the same database. Each mount should have its own name and
password for tracking and job-separating purposes. This is accomplished by mounting the same file system on different mount
points. For example, you can mount /home/server/database at the mount points /home/user1, /home/user2,
and /home/user3.
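How an operating system might decide which mounted file system a path belongs to can be sketched as a longest-prefix match over a mount table. This is a simplified model, not the actual kernel mechanism; the class `MountTable` and the device names `disk0` and `disk1` are hypothetical.

```python
class MountTable:
    def __init__(self):
        self._mounts = {}  # mount point -> device name

    def mount(self, device, mount_point):
        """Record that device is attached at mount_point."""
        self._mounts[mount_point.rstrip("/") or "/"] = device

    def resolve(self, path):
        """Return (device, path inside that file system) for an absolute path."""
        best = None
        for mp in self._mounts:
            prefix = mp if mp.endswith("/") else mp + "/"
            if path == mp or path.startswith(prefix):
                if best is None or len(mp) > len(best):
                    best = mp  # prefer the deepest (longest) matching mount point
        if best is None:
            raise LookupError(f"no file system mounted for {path}")
        return self._mounts[best], path[len(best):].lstrip("/")

table = MountTable()
table.mount("disk0", "/")      # root file system (hypothetical device name)
table.mount("disk1", "/home")  # home directories on a second device
home_hit = table.resolve("/home/janc/notes.txt")
root_hit = table.resolve("/etc/passwd")
```

Mounting the same device at several mount points, as in the database example, would simply add several entries that map to the same device name.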
File Sharing
• File sharing is the practice of sharing or offering access to digital information or resources, including documents, multimedia
(audio/video), graphics, computer programs, images and e-books. It is the private or public distribution of data or resources in a
network with different levels of sharing privileges. The first implemented method involves manually transferring files between
machines via programs like ftp. The second major method uses a distributed file system (DFS) in which remote directories are
visible from a local machine. In some ways, the third method, the World Wide Web, is a reversion to the first.
A browser is needed to gain access to the remote files, and separate operations (essentially a wrapper for ftp) are used to transfer
files. ftp is used for both anonymous and authenticated access.
• The Client-Server Model
• Remote file systems allow a computer to mount one or more file systems from one or more remote machines. In this case, the
machine containing the files is the server, and the machine seeking access to the files is the client. The client-server relationship is
common with networked machines. Generally, the server declares that a resource is available to clients and specifies exactly which
resource (in this case, which files) and exactly which clients. A server can serve multiple clients, and a client can use multiple
servers, depending on the implementation details of a given client-server facility. The server usually specifies the available files on a
volume or directory level.
• Distributed Information Systems
• To make client-server systems easier to manage, distributed information systems, also known as distributed naming services, provide
unified access to the information needed for remote computing. The domain name system (DNS) provides
host-name-to-network-address translations for the entire Internet (including the World Wide Web). Before DNS became widespread, files containing the
same information were sent via e-mail or ftp between all networked hosts. This methodology was not scalable. DNS is further
discussed in Section 16.5.1. Other distributed information systems provide a user name/password/user ID/group ID space for a
distributed facility. UNIX systems have employed a wide variety of distributed-information methods.
• Failure Modes
• Local file systems can fail for a variety of reasons, including failure of the disk containing the file system, corruption of the directory
structure or other disk-management information (collectively called metadata), disk-controller failure, cable failure, and host-adapter
failure. User or systems-administrator failure can also cause files to be lost or entire directories or volumes to be deleted. Many of
these failures will cause a host to crash and an error condition to be displayed, and human intervention will be required to repair the
damage. Remote file systems have even more failure modes. Because of the complexity of network systems and the required
interactions between remote machines, many more problems can interfere with the proper operation of remote file systems. In the
case of networks, the network can be interrupted between two hosts.
Any Questions?