7.3 Section 3 File Organisation

The document outlines file organization and database concepts, detailing the hierarchy of data from bits to databases. It discusses variable and fixed length records, transaction and master files, and various file organization methods including serial, sequential, and indexed sequential. Additionally, it covers the advantages and disadvantages of each method, as well as the processes for adding, deleting, and updating records.

Uploaded by

nicholastakudzwa8


SIR.

OWEN NYAMAROPA

7.3 SECTION 3: FILE ORGANISATION AND DATABASE CONCEPTS

RECORDS AND FILES

HIERARCHY OF DATA
BIT All data is stored in a computer’s memory or storage devices in the form of binary
digits or bits. A bit can be either ‘ON’ or ‘OFF’, representing 1 or 0.
BYTE A group of eight bits. One byte represents one character or, in different contexts,
other data such as a sound, part of a picture, etc. The most common code used to
represent characters is ASCII (American Standard Code for Information Interchange).
FIELD Characters are grouped together to form fields. Data held about a person can be
split into many fields, e.g. ID Number, Surname, First Name, Address, DOB, etc.
RECORD All the information about one person or item is held in a record.
FILE A file is a collection of records. A stock file will contain a record for each item of
stock, a payroll file a record for each employee, and so on.
DATABASE A database may consist of many different files, linked together so that information
can be retrieved from several files simultaneously.
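The hierarchy can be illustrated with a short Python sketch. The field names and sample values below are made up for illustration and are not taken from the notes.

```python
# Hypothetical sample data, used only to illustrate the hierarchy.
char = "A"
byte_value = char.encode("ascii")[0]   # one character is held in one byte
bits = format(byte_value, "08b")       # the eight bits that make up that byte

# A record groups the fields about one person; a file groups records.
record = {"id_number": "1001", "surname": "Moyo", "first_name": "Tariro"}
staff_file = [record]

print(char, byte_value, bits)
```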

Variable length records

 Records in a file may not all be of the same length. They are called variable length records.
Variable length records may be used when either:
 The number of characters in a field varies between records.
 Records have a varying number of fields.
 A variable length record has to have some way of showing where each field ends, and where the
record ends, in order that it can be processed. There are two ways of doing this:
 Use a special end-of-field character at the end of each field, and an end-of-record marker at
the end of each record.
 Use a character count at the beginning of each field and an end-of-record marker.
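The two ways of marking field and record boundaries can be sketched in Python as follows. The choice of "|" as the end-of-field character, a newline as the end-of-record marker, and a two-digit character count are assumptions made for this sketch; the notes do not specify them.

```python
END_OF_FIELD = "|"     # assumed end-of-field character
END_OF_RECORD = "\n"   # assumed end-of-record marker

def encode_delimited(fields):
    """Method 1: end-of-field character after each field, then end-of-record."""
    return "".join(f + END_OF_FIELD for f in fields) + END_OF_RECORD

def decode_delimited(text):
    body = text[:-len(END_OF_RECORD)]
    return body.split(END_OF_FIELD)[:-1]   # the last split item is always empty

def encode_counted(fields):
    """Method 2: a two-digit character count before each field."""
    return "".join(f"{len(f):02d}{f}" for f in fields) + END_OF_RECORD

def decode_counted(text):
    fields, pos = [], 0
    while text[pos] != END_OF_RECORD:
        length = int(text[pos:pos + 2])
        fields.append(text[pos + 2:pos + 2 + length])
        pos += 2 + length
    return fields

record = ["1001", "Moyo", "12 Main Street"]
print(repr(encode_delimited(record)))
print(repr(encode_counted(record)))
```

In this sketch method 1 cannot store a field that itself contains the end-of-field character, and method 2 is limited to fields of fewer than 100 characters.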
Advantages of variable length records
 Less space is wasted on the storage medium
 No truncation of data occurs
 It enables as many fields as necessary to be held on a particular record.
 It may reduce time taken to read a file because the records are more tightly packed.
Disadvantages
 The processing required to separate out the fields is more complex.
 The record cannot be updated in situ.
 It is harder to estimate file sizes accurately when a new system is being designed.

Fixed length records
 When data is stored in fixed length records, the same number of bytes is allocated to each data
item (field) with no reference to how much data is stored.
Advantages
 Data can be updated in situ
 It’s possible to estimate file size accurately when a new system is being designed.
Disadvantages
 It wastes space on storage medium.
 Data that is longer than its field is truncated.
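A minimal sketch of a fixed length record, using Python's `struct` module. The field widths (4, 10 and 8 bytes) and the sample values are assumptions chosen for illustration.

```python
import struct

# Assumed layout: 4-byte ID, 10-byte surname, 8-byte date of birth = 22 bytes.
RECORD = struct.Struct("<4s10s8s")

def pack_record(id_number, surname, dob):
    # The "s" format pads short values with zero bytes and truncates long ones.
    return RECORD.pack(id_number.encode("ascii"),
                       surname.encode("ascii"),
                       dob.encode("ascii"))

def unpack_record(data):
    return tuple(f.rstrip(b"\x00").decode("ascii") for f in RECORD.unpack(data))

short = pack_record("1001", "Moyo", "19990101")            # 6 bytes of padding wasted
long_ = pack_record("1002", "Nyamaropa-Moyo", "19981231")  # surname is truncated

# Every record has the same size, so the file size is easy to estimate.
estimated_file_size = 1000 * RECORD.size
print(unpack_record(long_), estimated_file_size)
```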

Transaction file
 A transaction file is a collection of records used in batch processing to update a master file.
 It contains data of all transactions that have occurred in the last period. A period may be a day, a
week, or a month.
Master files
 Master files are permanent files of data which are the principal source of information for a job.
 They are kept up-to-date by applying the transactions that occur during the operation of
business. They contain two basic types of data:
 Data of a more or less permanent nature such as name, address, rate of pay etc.
 Data which will change every time transactions are applied to the file, e.g. gross pay to date,
tax paid to date, etc.

File organisation
 Files stored on magnetic media can be organised in a number of ways. The method chosen will
depend on several factors such as:
 How the file is to be used
 How many records are processed each time the file is updated.
 Whether individual records need to be quickly accessed.
Serial file organisation
 Records on serial files are not in any particular sequence.
 Records are stored in the order in which they are received, with new records added to the end
of the file.
 Serial files are used as temporary files to store transaction data.
Access method
 Serial access: records are read from the disk into main memory one after the other, i.e.
in the order in which they occur on the disk.
 The method can be used with magnetic tape.
Adding a record – new records are simply appended to the end of the file.
Deleting a record – create a new tape, copy all the records up to the record to be deleted, leaving
that one off the new tape and then copy out all the rest of the records to the new tape.
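Adding to and deleting from a serial file can be sketched as below. The sketch assumes one record per line of text with the key before the first comma, and uses a temporary folder in place of a tape; the file names are hypothetical.

```python
import os
import tempfile

def append_record(path, record):
    """Add a record: simply append it to the end of the serial file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")

def delete_record(old_path, new_path, key):
    """Delete a record: copy every record except the one with `key` to a new file."""
    with open(old_path, encoding="utf-8") as old, \
         open(new_path, "w", encoding="utf-8") as new:
        for line in old:
            if line.split(",")[0] != key:
                new.write(line)

with tempfile.TemporaryDirectory() as folder:
    old_path = os.path.join(folder, "transactions.txt")
    new_path = os.path.join(folder, "transactions_new.txt")
    for record in ["17,bolt", "04,nut", "23,washer"]:   # arrival order, not key order
        append_record(old_path, record)
    delete_record(old_path, new_path, "04")
    with open(new_path, encoding="utf-8") as f:
        remaining = f.read().splitlines()

print(remaining)
```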
Uses of serial files
 Used as transaction files for recording data in the order in which events take place.
 Used to update the master file (MF) and to restore data in the event of a disaster such as a disk head crash.

Sequential file organisation

 Records are stored one after the other but in a sequence according to the record key.
 The method of access is serial/sequential.
 The method can be used with magnetic tape/magnetic disk

Adding a record – make a new copy of file, copying over all records until a new one can be written in
its proper place and then copy over the rest of the records.

Deleting a record – make a new copy of the file on a new tape/disk, copying over all the records
except the one to be deleted.
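Both operations amount to copying the file while keeping the key order. The sketch below uses Python lists of `(key, data)` tuples as a stand-in for the old and new files; that representation is an assumption made for brevity.

```python
def insert_record(old_records, new_record):
    """Copy records to a new file, writing new_record in its proper key position."""
    new_file = []
    inserted = False
    for record in old_records:
        if not inserted and new_record[0] < record[0]:
            new_file.append(new_record)
            inserted = True
        new_file.append(record)
    if not inserted:              # the new key is higher than every existing key
        new_file.append(new_record)
    return new_file

def delete_record(old_records, key):
    """Copy every record except the one with the given key."""
    return [record for record in old_records if record[0] != key]

old_file = [(10, "Banda"), (20, "Dube"), (40, "Ncube")]
print(insert_record(old_file, (30, "Moyo")))
print(delete_record(old_file, 20))
```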

Uses of sequential files

 Used as master file for high hit rate applications such as payroll.

Updating sequential files
 A master file (MF) is updated when one or more records are altered by applying a transaction or a
transaction file (TF) to it.
 The method used to update a sequential file is called updating by copying.
 It requires the transaction file to be sorted in the same order as the MF.

MF Day 1 (Grandfather) + TF Day 1 -> Update -> MF Day 2 (Father)
MF Day 2 (Father) + TF Day 2 -> Update -> MF Day 3 (Son)

Grandfather-Father-Son method of updating.

The steps are as follows

1. A record is read from the MF into memory.
2. A record is read from the TF into memory.
3. The record keys from each file are compared. If no updating is required to the MF record in
memory (the master key is less than the transaction key), the master record is copied from
memory to a new MF on a different tape or area of disk and another MF record is read into
memory, overwriting the previous one. This step is then repeated.
4. If there is a transaction for the master record currently in memory, the record is updated. It is
retained in memory in case there are any more transactions that apply to it. Steps 2 – 4 are
then repeated.

NB Three versions of the MF are created, and this is called grandfather-father-son.

Algorithm to update a sequential master file

Open master file for reading
Open transaction file for reading
Open new master file for writing
Read first master record
Repeat
    Read next transaction record
    While master record key < transaction record key
        Write master record to new master file
        Read next master record
    EndWhile
    Update record
Until End Of File (transaction)
Write master record to new master file
While Not End Of File (master)
    Read next master record
    Write master record to new master file
EndWhile
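The algorithm can be sketched in Python as below. This is a minimal sketch under stated assumptions: both files are lists of `(key, value)` tuples sorted by key, "update record" means adding the transaction amount to the master value, and every transaction key matches an existing master record. None of these details is fixed by the notes.

```python
def update_master(master, transactions):
    """Update by copying: build a new master file and leave the old one untouched."""
    new_master = []
    remaining = iter(master)
    current = next(remaining, None)          # master record held in memory
    for t_key, amount in transactions:
        # Copy across master records that have no transaction.
        while current is not None and current[0] < t_key:
            new_master.append(current)
            current = next(remaining, None)
        if current is None or current[0] != t_key:
            raise KeyError(f"no master record for transaction key {t_key}")
        # Update the record but keep it in memory for further transactions.
        current = (current[0], current[1] + amount)
    if current is not None:
        new_master.append(current)           # write the last record held in memory
    new_master.extend(remaining)             # copy the rest of the master file
    return new_master

father = [(1, 10), (2, 20), (3, 30), (4, 40)]
todays_transactions = [(2, 5), (2, 1), (4, -10)]
son = update_master(father, todays_transactions)
print(son)
```

Because the old master file is never modified, it can be kept together with the transaction file as a backup, which is what makes the grandfather-father-son scheme possible.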

Random files (hash file, direct or relative file)

 A direct access file is a collection of records, where each record is stored at a disk address
calculated from the record’s primary key.
 Records are stored or retrieved according to either disk address or their relative position within
the file.
 Relative file addressing is that record number 1 is stored in block 1, record number 2 in block 2
and so on.
 Using relative position to store the records is a waste of space, e.g. if there are 1 000 records to
store and each record key is 5 digits, we need 99999 blocks to store the records.
 Using disk address: a hashing algorithm is used to translate the key into an address.
 Synonyms (collisions) occur when two record keys generate the same address.
 One way of resolving synonyms is to place the record that caused the collision in the next
available free space. When the highest address is reached, the next record can be stored at
address 0 (known as wrap round).
 When this method is used to search for a record, the search has to continue until the record is
found or a blank space is found.
 Another method is to use a separate overflow area and leave a tag in the original location to
indicate where to look next.
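The next-available-space method with wrap round can be sketched as follows. The file size of 10 addresses and the use of key modulo file size as the hashing algorithm are assumptions chosen to keep the example small.

```python
TABLE_SIZE = 10                      # assumed number of addresses in the file
table = [None] * TABLE_SIZE          # None represents a blank space

def hash_address(key):
    return key % TABLE_SIZE

def store(key, data):
    address = hash_address(key)
    for _ in range(TABLE_SIZE):
        if table[address] is None:
            table[address] = (key, data)
            return address
        address = (address + 1) % TABLE_SIZE   # wrap round to address 0
    raise RuntimeError("file is full")

def find(key):
    address = hash_address(key)
    for _ in range(TABLE_SIZE):
        slot = table[address]
        if slot is None:
            return None               # blank space found: the record is not on file
        if slot[0] == key:
            return slot[1]
        address = (address + 1) % TABLE_SIZE
    return None

first = store(19, "a")    # home address 9
second = store(29, "b")   # synonym of 19: wraps round to address 0
third = store(39, "c")    # synonym again: next free space is address 1
print(first, second, third, find(29), find(49))
```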

Adding a record
 Apply hashing algorithm to the key field to generate storage address. E.g. the address of record
75481 would be calculated as follows:
75481/1000 = 75 remainder 481. (where 1000 is the number of records to be stored).
Address = 481
 If the address is already full, the record can be put in the next available place, or it can be
placed in an overflow area with a tag left in the original location.
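A minimal sketch of adding records with a separate overflow area. The key 75481 and the divisor 1000 come from the example above; the list-based layout, in which each slot holds a key, the data and a tag, is an assumption made for illustration.

```python
NUM_ADDRESSES = 1000
home = [None] * NUM_ADDRESSES   # each used slot: [key, data, tag]
overflow = []                   # separate overflow area; a tag is an index into it

def hash_address(key):
    quotient, remainder = divmod(key, NUM_ADDRESSES)
    return remainder            # 75481 / 1000 = 75 remainder 481

def add_record(key, data):
    address = hash_address(key)
    if home[address] is None:
        home[address] = [key, data, None]
        return ("home", address)
    # The home address is full: follow the tags to the end of the chain.
    slot = home[address]
    while slot[2] is not None:
        slot = overflow[slot[2]]
    overflow.append([key, data, None])
    slot[2] = len(overflow) - 1  # leave a tag showing where to look next
    return ("overflow", slot[2])

a = add_record(75481, "first")
b = add_record(12481, "second")   # synonym: also hashes to 481
c = add_record(3481, "third")     # another synonym
print(a, b, c)
```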
Properties of a good hashing algorithm
The algorithm should be chosen so that:
 It can generate any of the available addresses on the file;
 It is fast to calculate;
 It minimises collisions (synonyms).

Deleting a record
 Leave the deleted record in place and set a flag labelling it deleted. The record will be logically
deleted but physically present.
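The reason for flagging rather than blanking a record can be seen in a sketch. It assumes synonyms are placed in the next available space, so a blank space would end a search too early. The table size of 7 and the dictionary layout are assumptions for illustration.

```python
SIZE = 7                       # assumed number of addresses
table = [None] * SIZE          # each used slot holds key, data and a deleted flag

def probe(key):
    """Yield addresses from the home address onwards, wrapping round once."""
    start = key % SIZE
    for step in range(SIZE):
        yield (start + step) % SIZE

def put(key, data):
    for address in probe(key):
        if table[address] is None:
            table[address] = {"key": key, "data": data, "deleted": False}
            return address
    raise RuntimeError("file is full")

def get(key):
    for address in probe(key):
        slot = table[address]
        if slot is None:
            return None                        # a blank space ends the search
        if slot["key"] == key and not slot["deleted"]:
            return slot["data"]
    return None

def delete(key):
    for address in probe(key):
        slot = table[address]
        if slot is None:
            return False
        if slot["key"] == key and not slot["deleted"]:
            slot["deleted"] = True             # logically deleted, physically present
            return True
    return False

put(3, "a")
put(10, "b")      # synonym of 3, stored at address 4
delete(3)
print(get(3), get(10), table[3])
```

If address 3 had been blanked instead, the search for key 10 would stop there and wrongly report the record as missing.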
Updating the file
 A record is read into memory, updated, and written back to its original location.
 This is called updating by overlay or updating in place or updating in situ.
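Updating in situ can be sketched with fixed length records and `seek`. An in-memory `io.BytesIO` object stands in for the disk file, and the record layout (key, 10-byte name, quantity) is an assumption for this sketch.

```python
import io
import struct

RECORD = struct.Struct("<I10si")   # assumed layout: key, 10-byte name, quantity

def update_in_situ(f, position, change):
    """Read the record at `position`, change its quantity, write it back in place."""
    offset = position * RECORD.size
    f.seek(offset)
    key, name, quantity = RECORD.unpack(f.read(RECORD.size))
    f.seek(offset)                 # go back to the record's original location
    f.write(RECORD.pack(key, name, quantity + change))

disk = io.BytesIO()
for key, name, qty in [(101, b"bolt", 40), (102, b"nut", 75), (103, b"washer", 12)]:
    disk.write(RECORD.pack(key, name, qty))

size_before = len(disk.getvalue())
update_in_situ(disk, 1, -5)        # sell 5 of item 102
disk.seek(1 * RECORD.size)
updated = RECORD.unpack(disk.read(RECORD.size))
size_after = len(disk.getvalue())
print(updated[0], updated[2], size_before, size_after)
```

No other record is copied or moved, and the file size is unchanged, which is why this only works when the updated record occupies the same space as the original.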

Use of direct access files

 Used in situations where extremely fast access to individual records is required, e.g. in an airline
booking system where thousands of bookings are made every day for each airline from terminals
all over the country.

Indexed Sequential file Organisation

 Records are held in sequence in blocks and space is left in each block when the file is created so
that additional records can be added and the correct sequence maintained.
 An index is held in the front of the file showing the highest key in each block of records.
 Records that will not fit in the home block have to be placed in an overflow area, and a tag
giving the record key and its overflow address is left in the home block to show where the record
is.
 An indexed sequential file consists of 3 areas:
1. A home area where the records are initially stored.
2. An index area containing an entry for each block, giving the block address and the highest key
in the block.
3. An overflow area to hold records that have been subsequently added and will not fit into
the correct home block.
 For a file held on a disk pack, more than one level of index is required. The indexing technique
required is cylinder-surface-sector indexing.
 For each disk pack in the file, there is a cylinder index or primary index which is read into
memory and held there while the file is in use.
 It contains a list of the highest key in each cylinder of the file.
 When looking for a record, with a particular key, this index is searched from the beginning until
an entry is found which is greater than or equal to the key required. The process is called
seeking.
 A cylinder on a disk is made of all the tracks that can be accessed from one position.
 Having reached the correct cylinder, a further index is read and searched. This is the surface
index or secondary index, which holds a list of surface numbers and the highest key to be found
on each.
 By comparing these hi-keys with the one required, the correct surface can be selected. The
process is known as switching.
 Once on the right track, a third level of index, the sector index, can be read and searched to give
the number of the sector in which the record is to be found.
Example: looking for a record with key 5584

Cylinder   Hi-key
0          193
1          346
.          …
19         4382
20         5495
21         6608
.          …
199        49999

 Searching the index, the first hi-key which is greater than or equal to 5584 is 6608. So the
record, if it exists, is on cylinder 21. Thus, the read/write heads are moved to cylinder 21.
On arrival, the surface index, located on surface 0, is read.

Surface    Hi-key
0          5510
1          5622
2          5843
.          ……
.          ……
7          6608

 This means that record 5584 should be on surface 1, so the read head of that surface is
activated. The sector index, located on sector 0 of cylinder 21, surface 1, is then read.

Sector     Hi-key
0          5521
1          5538
2          5560
3          5568
4          5583
5          5597
6          5606
7          5622

 The record with key 5584 should be in sector 5, so sector 5 is read into memory. It will then be
serially searched until the correct record is located. If it is not found then:
 Either the record does not exist.
 Or when the record was added to the file, there was no room for it in cylinder 21, surface 1,
sector 5 and so the record was overflowed elsewhere.
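The three-level search in this example can be sketched as below. Only the index rows printed in the example are included; the rows elided with dots are left out rather than guessed, which does not affect the result for key 5584.

```python
# (number, hi-key) pairs copied from the example; elided rows are omitted.
cylinder_index = [(0, 193), (1, 346), (19, 4382), (20, 5495), (21, 6608), (199, 49999)]
surface_index = [(0, 5510), (1, 5622), (2, 5843), (7, 6608)]       # cylinder 21
sector_index = [(0, 5521), (1, 5538), (2, 5560), (3, 5568),
                (4, 5583), (5, 5597), (6, 5606), (7, 5622)]        # cylinder 21, surface 1

def search_index(index, key):
    """Search from the beginning for the first hi-key >= key."""
    for number, hi_key in index:
        if hi_key >= key:
            return number
    return None                  # key is higher than every hi-key in the index

key = 5584
cylinder = search_index(cylinder_index, key)   # seeking
surface = search_index(surface_index, key)     # switching
sector = search_index(sector_index, key)
print(cylinder, surface, sector)
```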

Advantages of the file organisation

 Faster than serial search of a sequential file.
 It can be processed either randomly using the indexes or sequentially without using the indexes
Disadvantages
 The disk accessing and searching is time consuming
 The indexes take up quite a lot of space.

Blocks
 Both disks and tapes transfer data between CPU and backing store in chunks called blocks.
 Blocking factor is the number of records stored in each block.
 A block can be called a sector on a disk.
 A user must specify the number of records in each sector/block when setting up an indexed
sequential file.
 A blocking strategy is to put several records in one block, but to leave enough free space for
extra records to fit in each block before overflow occurs.
 Block packing density refers to the ratio of tracks initially set aside for records to the number
of available tracks on the cylinder.

File reorganisation
If records are continually added to and deleted from an indexed sequential file, a large proportion
of records will end up in the overflow area. This increases access time, since several blocks may have
to be read to locate a record. So it becomes necessary to reorganise the file, i.e. copy the records to
a new file, allowing free space in each block for additional records and recreating the indexes at the
same time.
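Reorganisation can be sketched as rebuilding the home blocks and the index from all the live records. The block size, the fraction of each block that is filled, and the use of lists of `(key, data)` tuples are assumptions made for this sketch.

```python
def reorganise(records, block_size, fill):
    """Rebuild the home blocks and the index from home-area and overflow records."""
    per_block = max(1, int(block_size * fill))   # records written per block
    ordered = sorted(records)                    # restore key sequence
    blocks = [ordered[i:i + per_block]
              for i in range(0, len(ordered), per_block)]
    index = [block[-1][0] for block in blocks]   # highest key in each block
    return blocks, index

# Records gathered from the home area and the overflow area, in no particular order.
live_records = [(5, "e"), (1, "a"), (9, "i"), (3, "c"), (7, "g")]
blocks, index = reorganise(live_records, block_size=4, fill=0.5)
print(blocks)
print(index)
```

With a block size of 4 and a fill of 0.5, each block starts with two records and two free spaces, so later additions can be placed in key sequence without going to overflow.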

Uses of indexed sequential files

 Suitable for real-time stock control systems, e.g. when a purchase is made the stock record can be
immediately updated and written back to the file (updated in situ). The file can also be
processed sequentially, without using the index at all, when reports of sales or stock are needed.
