
Sequential Files

Uploaded by

csoundes2005
Copyright © All Rights Reserved
File Structure and Data Structure : FSDS

Chapter 2 : Sequential Structures

Dr. Lakehal Abderrahim

Email: abderrahim.lakehal@univ-setif.dz
THE BLOCKS OF A FILE

1 - Global Organization of Blocks

[Figure: blocks F1, F2, F3, F4 stored contiguously (array view); blocks D1, D2, D3, D4 scattered and linked together (list view)]

• Either the file is "seen as an array -F-": all the blocks that make it up are contiguous.

• Or the file is "seen as a list -D-": the blocks are not necessarily contiguous, but are linked together.

The characteristics required to manage a file viewed as an array include:
• The number of the first block,
• The number of the last block (or, alternatively, the number of blocks used).

Header (Number of the first block, Number of blocks)

2 - Internal Organization of Blocks

The blocks are supposed to contain the records of a file. The records:

• Can be of fixed length,

[Figure: a block of records numbered 1, 2, 3, all of the same size]

• Can be of variable length.

[Figure: a block of records numbered 1, 2, 3, of different sizes]
2 - Internal Organization of Blocks (Fixed-length records)

CONST MaxE = 10;  // Maximum number of records per block

Type Trec = Struct  // Content of a record
    Id : String;
    Name : String;
    Age : Integer;
End

Type TBloc = Struct  // Content of a block
    tab : Table [MaxE] of Trec  // Table of records, with at most MaxE (10) records
    NR : Integer                // Number of inserted records, 0 ≤ NR ≤ MaxE
End
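As a rough illustration of the declaration above, the following Python sketch serializes a block of fixed-length records with the standard `struct` module. The field widths (8 bytes for Id, 16 bytes for Name, a 32-bit Age) and the names `REC`, `pack_block` and `unpack_block` are assumptions made for this example; the slides do not specify them.

```python
import struct

MAX_E = 10                       # maximum number of records per block (MaxE)
REC = struct.Struct("<8s16si")   # assumed widths: Id 8 bytes, Name 16 bytes, Age 32-bit int

def pack_block(records):
    """Serialize up to MAX_E (id, name, age) records, followed by the counter NR."""
    if len(records) > MAX_E:
        raise ValueError("block overflow")
    body = b"".join(REC.pack(i.encode(), n.encode(), a) for i, n, a in records)
    body += bytes(REC.size * (MAX_E - len(records)))   # unused slots stay zeroed
    return body + struct.pack("<i", len(records))

def unpack_block(data):
    """Return the list of (id, name, age) records stored in a packed block."""
    (nr,) = struct.unpack_from("<i", data, REC.size * MAX_E)
    out = []
    for k in range(nr):
        i, n, a = REC.unpack_from(data, k * REC.size)
        out.append((i.rstrip(b"\0").decode(), n.rstrip(b"\0").decode(), a))
    return out
```

Because every record occupies the same number of bytes, record k always starts at offset k * REC.size, which is what makes the array view of a block possible.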
2 - Internal Organization of Blocks (Fixed-length records: limitations)

❑ Fixed-length records are not suitable when one or more fields of the record have a variable size (for example, a field of list type).

❑ They are also not suitable when the number of fields varies from one record to another.

[Figure: a record field, contrasting its allocated size (fixed length) with the size actually occupied by its value (actual length)]

2 - Internal Organization of Blocks (Variable-length records)

• A variable-length record is viewed as a sequence of bytes or characters (of variable length).
• To separate the fields within the record, either a special character can be used, or each field can be prefixed by its size.

RECORD R1: three (03) positions are used to indicate the size of each field.

004ABCD 010ABCDEFGHIJ 006ABCDEF

Field 1 is represented on 4 characters, Field 2 on 10 characters, and Field 3 on 6 characters.

RECORD R2: the special character & is used to separate the fields.

fgDR&ABCDEFGHIJ&ABCDFG
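The two field-separation techniques can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the spaces shown between fields on the slide are only visual, so the encoded record contains no spaces.

```python
def encode_prefixed(fields, width=3):
    """Prefix each field with its size, written on `width` digits."""
    for f in fields:
        if len(f) >= 10 ** width:
            raise ValueError("field too long for the size prefix")
    return "".join(f"{len(f):0{width}d}{f}" for f in fields)

def decode_prefixed(record, width=3):
    """Read back the fields of a size-prefixed record."""
    fields, pos = [], 0
    while pos < len(record):
        size = int(record[pos:pos + width])
        pos += width
        fields.append(record[pos:pos + size])
        pos += size
    return fields

def encode_separated(fields, sep="&"):
    """Join the fields with a special character that must not occur in any field."""
    if any(sep in f for f in fields):
        raise ValueError("separator found inside a field")
    return sep.join(fields)

def decode_separated(record, sep="&"):
    """Split a separator-delimited record back into its fields."""
    return record.split(sep)
```

With the values of record R1, `encode_prefixed(["ABCD", "ABCDEFGHIJ", "ABCDEF"])` yields the string 004ABCD010ABCDEFGHIJ006ABCDEF. The separator variant is shorter but forbids the separator character inside field values.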
2 - Internal Organization of Blocks (Variable-length records)
Field separation: using a fixed number of positions to represent the size limits the maximum size of a value. In the previous example, with 3 positions, a field cannot exceed 999 characters (or bytes). If, on the other hand, the maximum size is not known when the method is designed, the size can be represented using a variable number of positions, ending with a special character.

004ABCD 999ABCDEFGHIJklfl;ro… 006ABCDEF
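A size written on a variable number of positions and closed by a special character could look like the sketch below. The terminator `:` and the function names are assumptions chosen for the example; the slides do not name a particular character.

```python
def encode_field(value, end=":"):
    """Write the size in as many digits as needed, then `end`, then the value."""
    return f"{len(value)}{end}{value}"

def decode_fields(record, end=":"):
    """Decode a concatenation of fields produced by encode_field."""
    fields, pos = [], 0
    while pos < len(record):
        stop = record.index(end, pos)          # end of the size digits
        size = int(record[pos:stop])
        fields.append(record[stop + 1:stop + 1 + size])
        pos = stop + 1 + size
    return fields
```

Since the size is read before the value, the value itself may contain the terminator without confusing the decoder, and fields longer than 999 characters are no longer a problem.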


Record separation
To separate records from one another, the same techniques are used as for separating fields within a record: either a special character (for example '$'), or a size prefix on each record.
[Figure: two blocks of records. In the first, records 1, 2, 3 are each prefixed by their length: 980LVM5-0ABCD…… 014jjnhf8;;-gh 070jhfgdjfklpk…. In the second, records 1, 2, 3 are separated by the special character ($): LVM&5-0&AB…&...$ jKll&jnh&8;;..$ J90&hfgd&jfk…$]
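Reading the records back out of a block can be sketched as follows, once for each separation technique. The function names are hypothetical, and the length-prefixed variant assumes that the stored length counts only the record's content, not the prefix itself.

```python
def split_by_separator(block, end="$", sep="&"):
    """Records end with `end`; fields inside a record are separated by `sep`."""
    return [rec.split(sep) for rec in block.split(end) if rec]

def split_by_length(block, width=3):
    """Each record is prefixed by its length, written on `width` digits."""
    records, pos = [], 0
    while pos < len(block):
        size = int(block[pos:pos + width])
        records.append(block[pos + width:pos + width + size])
        pos += width + size
    return records
```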
2 - Internal Organization of Blocks (Variable-length records)

❑ In the case of variable-sized records, the block cannot be defined as an array of records because the elements of an array must always be of the same size.

❑ The solution is to consider the block as a large fixed-size character array containing the different records (stored character by character).

Tbloc = struct  // Structure of a block
    tab: array[b] of char  // an array of b characters
    next: integer          // number of next record
End

Note: even if the records are of variable length, the block size remains fixed.
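A minimal Python sketch of a block seen as a fixed-size character array is given below. The capacity `B = 32` and the `free` attribute (index of the first unused character) are assumptions added for the example; `free` is a hypothetical helper and is not the `next` field of the declaration above.

```python
B = 32  # block capacity in characters (assumed value)

class Block:
    def __init__(self):
        self.tab = [" "] * B   # fixed-size character array
        self.free = 0          # hypothetical: index of the first unused character

    def append(self, record):
        """Copy the record character by character; return False if it does not fit."""
        if self.free + len(record) > B:
            return False
        for ch in record:
            self.tab[self.free] = ch
            self.free += 1
        return True

    def text(self):
        """Return the used part of the block as a string."""
        return "".join(self.tab[:self.free])
```

The array always has B cells, whatever the sizes of the records copied into it, which matches the note that the block size remains fixed.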
2 - Internal Organization of Blocks (Variable-length records with/without overlap)

❑ To minimize wasted space in blocks (in the case of variable format only), one can opt for an organization with overlap between two or more blocks.
❑ When a new record is inserted into a block that is not yet full and the remaining empty space cannot hold the whole record, the record is split into two parts: the first part fills all the empty space in the block, and the rest (the second part) is inserted into a new block allocated to the file.
❑ The record is then said to span 2 blocks.
❑ This approach can easily be generalized to support records that span multiple blocks (as in the case of large records, possibly larger than a physical block).

[Figure: BLOCK 1 and BLOCK 2 holding length-prefixed records (980LVM5-0ABCD., 008jjnhf8;;, 070jhfgdjfklpk…., 080jhfg Iiilk0-jnh…., 008jjnhf8;;); the record starting with 080jhfg begins at the end of BLOCK 1 and continues in BLOCK 2 (overlap)]
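The insertion with overlap described above can be sketched in Python as follows. Blocks are modeled as strings of at most `b` characters; the function name and this representation are assumptions for illustration only.

```python
def insert_spanning(blocks, record, b):
    """Append `record` to a file of character blocks of capacity b, spanning blocks if needed."""
    if not blocks:
        blocks.append("")
    while record:
        room = b - len(blocks[-1])
        if room == 0:                  # last block is full: allocate a new one
            blocks.append("")
            room = b
        blocks[-1] += record[:room]    # first part fills the remaining space
        record = record[room:]         # the rest goes to the next block
    return blocks
```

The loop handles the general case mentioned above: a record longer than `b` characters simply fills several consecutive blocks. Reading the file back amounts to concatenating the blocks and then splitting the records by length prefix or separator.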
TAXONOMY OF SIMPLE FILE STRUCTURES

Type of sequential file structures

Sequential access methods for organizing data on disk use the following notation:
T: for a file viewed as a table, L: for a file viewed as a list
O: for an ordered file, Ō: for an unordered file
F: for fixed-format records, V: for variable-format records
C: with overlap of records between blocks, C̄: without overlap

The leaves of the following tree represent the 12 sequential access methods (overlap applies to the variable format only, so there is no C / C̄ level under F):

Sequential files
    T
        O: F | V C | V C̄
        Ō: F | V C | V C̄
    L
        O: F | V C | V C̄
        Ō: F | V C | V C̄
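The count of 12 can be checked by enumerating the leaves of the tree. The short Python sketch below does so under the assumption, taken from the overlap discussion, that the C / C̄ choice exists only for variable-format records; the function name is hypothetical.

```python
from itertools import product

def sequential_methods():
    """List the names of the sequential access methods (the leaves of the tree)."""
    names = []
    for view, order, fmt in product("TL", ("O", "Ō"), "FV"):
        if fmt == "F":
            names.append(view + order + fmt)      # overlap is not applicable
        else:
            names.extend(view + order + fmt + c for c in ("C", "C̄"))
    return names
```

The enumeration gives 2 views × 2 orderings × (1 fixed + 2 variable) = 12 names, including TOF and LOF.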
1. Example: File type T Ō V C

The T Ō V C method represents the organization of a file viewed as a table (T), unordered (Ō), with variable-sized records (V) and accepting overlaps between blocks (C):

[Figure: contiguous blocks BLOCK 0, BLOCK 1, BLOCK 2, ..., BLOCK N-1 filled with records; some records overlap two consecutive blocks]

Search is sequential, insertion is done at the end of the file, and deletion is logical.
2. Example: File type LOF

In the case of an LOF file (file viewed as a list, ordered, with fixed-size records), each block may contain, for example, a record array (tab), an integer indicating the number of records in the array (nb), and an integer to keep track of the next block in the list (next):

[Figure: the file head points to the first block; blocks 0, 1, 2, 3 each hold a record array (Tab), a counter (nb) and a link (next); the next field of the last block is -1]

The search is sequential, insertion only causes intra-block shifts (to maintain the order of records), and deletion can be either logical or physical.
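A sequential search in an LOF file can be sketched as below. Blocks are modeled as a dictionary from block number to a block with the fields tab and next; this in-memory representation and the function name are assumptions for the example, and nb is implied by the length of tab.

```python
def lof_search(blocks, head, key):
    """Follow the chain of blocks from `head`; return (found, block number, position)."""
    i = head
    while i != -1:
        buf = blocks[i]                      # one block read
        for j, k in enumerate(buf["tab"]):
            if k == key:
                return True, i, j
            if k > key:                      # ordered file: the key cannot appear later
                return False, i, j
        i = buf["next"]
    return False, -1, 0
```

Because the file is ordered, the search can stop at the first key greater than the one sought, but it still has to follow the links block by block, hence the sequential cost.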
3. Example: File type TOF

(file viewed as a table, ordered, with fixed-size records)

[Figure: the file head followed by contiguous blocks 0, 1, 2, 3, each holding a record array (Tab)]

❑ The search for a record is binary (fast).

❑ Insertion can cause intra- and inter-block shifts (costly).

❑ Deletion can be done by reverse shifts (physical deletion, costly) or simply by using a Boolean indicator (logical deletion, much faster).
Example: File type TOF (continued)

❑ The initial loading operation consists of constructing an ordered file with n initial records, leaving some empty space in each block. This helps minimize the shifts that future insertions might cause.

❑ Over time, the file's load factor (number of insertions / number of available spaces in the file) increases with new insertions, and logical deletions do not free up any space. As a result, performance tends to degrade. It is therefore recommended to reorganize the file by performing a new initial load. This is the periodic reorganization operation.
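The load factor defined above is a simple ratio, sketched here in Python. The reorganization threshold of 0.9 and both function names are assumptions for illustration; the slides do not give a threshold.

```python
def load_factor(insertions, nb_blocks, b):
    """Number of insertions divided by the number of record slots in the file."""
    return insertions / (nb_blocks * b)

def needs_reorganization(insertions, nb_blocks, b, threshold=0.9):
    """Hypothetical rule: reorganize once the load factor reaches `threshold`."""
    return load_factor(insertions, nb_blocks, b) >= threshold
```

For example, 45 insertions in a file of 3 blocks of 30 records give a load factor of 45 / 90 = 0.5.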
FILE DECLARATION
Let b = 30  // maximum capacity of the blocks (in number of records)

// The types used:

Trec = struct
    deleted: boolean   // boolean for logical deletion
    key: someType      // the field used as the search key
    field3: someType   // other fields of the record, not important here
    field4: someType
    ...
End

Tbloc = struct  // Structure of a block
    tab: array[b] of Trec  // an array of records with a maximum capacity = b
    NB: integer            // number of records in tab (≤ b)
End

[Figure: block i of type Tbloc, made of Tab, a table of 30 records of type Trec (indexes 0 to 29), and the integer NB; record number 26 is expanded to show its fields deleted, key, field3, field4]
GLOBAL VARIABLES: F AND BUF

F: File of Tbloc, Buffer buf, Header (integer, integer)

/* Description of the header of file F:

The header contains two integer-type characteristics.

• The first keeps track of the number of blocks used (or the logical number of the last block in the file).

• The second serves as an insertion counter, used to quickly calculate the load factor and thus determine whether file reorganization is necessary.

*/
SEARCH MODULE: (BINARY SEARCH)

Input
The key (c) to search for.

Output
The boolean Trouv (found), the block number (i) containing the key, and the index (j) (position within the block).

The cost of the search operation is logarithmic: in the worst case, binary search performs about log₂ N block reads for a file consisting of N blocks. The complexity is O(log N).
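The algorithm itself is not reproduced in the text, so the following Python code is only a sketch of one common way to write it, not necessarily the version given in the course. It assumes the file is a list of non-empty blocks, each a sorted list of keys, and it ignores the deleted flag; the name `tof_search` is hypothetical.

```python
from bisect import bisect_left

def tof_search(blocks, key):
    """Binary search over blocks, then inside the block; return (found, i, j)."""
    lo, hi = 0, len(blocks) - 1
    while lo <= hi:
        i = (lo + hi) // 2
        buf = blocks[i]                      # one block read per iteration
        if key < buf[0]:
            hi = i - 1
        elif key > buf[-1]:
            lo = i + 1
        else:                                # key lies within the range of block i
            j = bisect_left(buf, key)
            return buf[j] == key, i, j
    # Not found: report the position where the key should be inserted.
    if lo < len(blocks):
        return False, lo, 0
    if blocks:
        return False, len(blocks) - 1, len(blocks[-1])
    return False, 0, 0
```

Each iteration halves the interval of candidate blocks and reads exactly one block, which is where the logarithmic number of block reads comes from. When the key is absent, the returned (i, j) is the place where it would be inserted.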
INSERTION MODULE: (WITH POSSIBLE INTRA- AND INTER-BLOCK SHIFTS)

Input

The record to be inserted (e). It contains the key, which is used to search for the insertion location (block i, position j).
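The insertion algorithm is not reproduced in the text; the Python sketch below shows one possible way to implement the intra- and inter-block shifts, under simplifying assumptions: blocks are sorted lists of keys with capacity `b`, the target block is located by a plain scan rather than by the binary search module, and the name `tof_insert` is hypothetical.

```python
from bisect import bisect_left

def tof_insert(blocks, key, b):
    """Insert `key` in order; return False if it is already present."""
    if not blocks:
        blocks.append([key])
        return True
    i = 0                                    # first block whose last key is >= key
    while i < len(blocks) - 1 and blocks[i][-1] < key:
        i += 1
    j = bisect_left(blocks[i], key)
    if j < len(blocks[i]) and blocks[i][j] == key:
        return False
    carry = key
    while True:
        blocks[i].insert(j, carry)           # intra-block shift
        if len(blocks[i]) <= b:
            return True
        carry = blocks[i].pop()              # overflow: the last record leaves the block
        i, j = i + 1, 0                      # inter-block shift to the next block
        if i == len(blocks):
            blocks.append([carry])           # allocate a new block at the end of the file
            return True
```

In the worst case the overflow propagates to the last block, so every block from i onward is read and rewritten. This is the cost that the empty space left by the initial loading is meant to reduce.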
LOGICAL DELETION:

Consists of searching for the record and setting its 'deleted' field to true.
Input:
The location of the record to delete (record j in block i).
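Logical deletion can be sketched in a few lines of Python. Records are modeled as dictionaries with the fields key and deleted; this representation and the function name are assumptions for the example.

```python
def logical_delete(blocks, i, j):
    """Mark record j of block i as deleted; return False if it was already deleted."""
    rec = blocks[i][j]       # on disk: read block i into the buffer
    if rec["deleted"]:
        return False
    rec["deleted"] = True    # on disk: write the modified block back
    return True
```

The record stays in place and no shift is needed, so the operation costs one block read and one block write once the location is known.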
INITIAL LOADING

The initial loading of an ordered file consists of constructing a new file that contains n records from the start, while leaving some space in each block. This space can be used later for new insertions and helps avoid inter-block shifts (which are very costly in terms of disk access).

Input:
u: a decimal value between 0 and 1 that indicates the loading rate.
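A minimal sketch of the initial loading in Python is shown below. It assumes that the n records arrive already sorted and that each block receives at most ⌊u × b⌋ records (at least one); the function name is hypothetical.

```python
def initial_load(sorted_keys, b, u):
    """Build blocks of capacity b, each filled with at most int(u * b) records."""
    per_block = max(1, int(u * b))
    return [sorted_keys[k:k + per_block]
            for k in range(0, len(sorted_keys), per_block)]
```

With b = 4 and u = 0.5, each block receives 2 records and keeps 2 free slots for later insertions.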
REORGANIZATION

The reorganization of the file involves copying the records to a new file in such a way that the new blocks contain some empty space (1-u). This operation is similar to the initial loading, except that the records are read from the old file.
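Reorganization can be sketched as a copy loop over the old file. The sketch assumes that logically deleted records are not copied (the slides do not state this explicitly, but it is what reclaims their space), that records are dictionaries with the fields key and deleted, and that each new block receives at most ⌊u × b⌋ records; the function name is hypothetical.

```python
def reorganize(old_blocks, b, u):
    """Copy the non-deleted records into new blocks filled to about u * b."""
    per_block = max(1, int(u * b))
    new_blocks, buf = [], []
    for old in old_blocks:                 # read the old file block by block
        for rec in old:
            if rec["deleted"]:
                continue                   # assumption: deleted records are dropped
            buf.append(dict(rec))
            if len(buf) == per_block:      # buffer reached the loading rate: write it
                new_blocks.append(buf)
                buf = []
    if buf:
        new_blocks.append(buf)             # last, partially filled block
    return new_blocks
```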
MERGING OF 2 ORDERED FILES (TOF)
We traverse the two files (F1 and F2) in parallel with two buffers (buf1 and buf2), and
fill a third buffer (buf3) to construct a third file (F3) in ascending order.
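The merge described above can be sketched in Python as follows. Files are modeled as lists of sorted blocks of keys, `b` is the capacity of the output blocks, and duplicate keys are kept; these choices and the function name are assumptions for the example.

```python
def merge_tof(f1, f2, b):
    """Merge two ordered files into a new ordered file with blocks of capacity b."""
    f3, buf3 = [], []

    def records(f):
        # Read the file one block (buffer) at a time, yielding keys in order.
        for buf in f:
            yield from buf

    def write(key):
        buf3.append(key)
        if len(buf3) == b:                 # buf3 is full: write it to F3
            f3.append(list(buf3))
            buf3.clear()

    it1, it2 = records(f1), records(f2)
    x, y = next(it1, None), next(it2, None)
    while x is not None and y is not None:
        if x <= y:
            write(x)
            x = next(it1, None)
        else:
            write(y)
            y = next(it2, None)
    while x is not None:                   # F2 exhausted: copy the rest of F1
        write(x)
        x = next(it1, None)
    while y is not None:                   # F1 exhausted: copy the rest of F2
        write(y)
        y = next(it2, None)
    if buf3:
        f3.append(list(buf3))              # last, partially filled block
    return f3
```

Each block of F1 and F2 is read once and each block of F3 is written once, so the cost is linear in the total number of blocks.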