MCS 207
In addition,
this system is used for issue and return of books to its students. This database is used to find the
following details by the students of the university and the staff of the library: List of the classification
number, ISBN number, Title, Author Names, Subject Area of the books. Searching of books using
subject area, Title and Author name. List of books that are issued to a specific student. Draw an ER
diagram for the library. Specify key attributes and constraints on each entity type and on each
relationship type. Note any unspecified requirements and make appropriate assumptions to make the
specification complete.
To create an Entity-Relationship (ER) diagram for the university library system, we'll identify entities, their
attributes, and the relationships between them. Here's a breakdown of the entities and their key
attributes based on the requirements you have described:
### Entities:
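1. **Book**
   - **Attributes:**
     - Classification Number (primary key)
     - ISBN Number
     - Title
     - Author Names
     - Subject Area
   - **Constraints:**
     - Classification Number is unique for each copy and cannot be null; copies of the same book share an ISBN Number.
2. **Student**
   - **Attributes:**
     - Student ID (primary key)
     - Name
     - Phone Number
   - **Constraints:**
     - Student ID is unique and cannot be null.
3. **Library Staff**
   - **Attributes:**
     - Staff ID (primary key)
     - Name
     - Position
     - Email
   - **Constraints:**
     - Staff ID and Email are unique.
4. **Loan/Transaction**
   - **Attributes:**
     - Transaction ID (primary key)
     - Date Issued
     - Due Date
     - Date Returned
   - **Constraints:**
     - Transaction ID is unique; each transaction refers to one student (Student ID) and one book copy (Classification Number).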
### Relationships:
1. **Issues**
   - **Attributes:**
     - Date Issued, Due Date, and Date Returned (recorded through the Loan/Transaction entity)
   - **Constraints:**
     - A student can borrow multiple books, and a book can be issued to multiple students over time
       (many-to-many relationship).
```plaintext
[Student] ------< Issues >------ [Book]
    |                               |
    |                               |
[Loan/Transaction]                  |
                             [Library Staff]
```
### Assumptions:
- Each book can be written by multiple authors, and each author can write multiple books.
- The library can have multiple copies of the same book, but each copy can be identified by a unique
Classification Number.
- The system records the transactions (issue and return) using a unique Transaction ID.
- Library staff manage the database and can issue and return books.
### Notes:
- A separate "Return" relationship between the Student and Book entities is optional: a return completes
an existing Issue, so it can be recorded on the same transaction (to maintain the record of transactions).
- Unspecified requirements for the system may include the management of overdue fines, book
reservations, and detailed cataloging procedures which can be included if necessary.
This ER diagram structure serves as a foundation for creating a comprehensive database for the library
system. If any specific element or requirement needs further exploration, feel free to ask!
d. Design normalised tables in 3NF for the ER diagram drawn in part (c), with the required integrity
constraints.
To design normalized tables in Third Normal Form (3NF) for the university library system based on the ER
diagram provided, we’ll convert the identified entities and relationships into tables while ensuring we
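meet the 3NF criteria. The key requirements for 3NF are:
1. The table must be in Second Normal Form (2NF): every non-prime attribute depends on the whole of
every candidate key.
2. There must be no transitive dependencies; that is, no non-prime attribute may depend on the key only
through another non-key attribute.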
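To make the idea of a transitive dependency concrete, here is a minimal Python sketch. The helper `holds` and the sample columns (`StudentID`, `DeptID`, `DeptName`) are hypothetical and are not part of the library design; note also that a dependency holding in sample rows only suggests it and does not prove it for the schema.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in these rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample data: StudentID -> DeptID and DeptID -> DeptName.
rows = [
    {"StudentID": 1, "DeptID": "CS", "DeptName": "Computer Science"},
    {"StudentID": 2, "DeptID": "CS", "DeptName": "Computer Science"},
    {"StudentID": 3, "DeptID": "MA", "DeptName": "Mathematics"},
]

key_to_dept = holds(rows, ["StudentID"], ["DeptID"])
dept_to_name = holds(rows, ["DeptID"], ["DeptName"])
dept_to_key = holds(rows, ["DeptID"], ["StudentID"])

# DeptName depends on the key only through DeptID, a non-key attribute.
transitive = key_to_dept and dept_to_name and not dept_to_key
print("Transitive dependency present:", transitive)
```

In such a case, 3NF would move `DeptID` and `DeptName` into their own table and leave only `DeptID` as a foreign key in the original one.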
Given the entities identified (Book, Student, Library Staff, and Loan/Transaction), here is how the tables
will look:
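```sql
-- Reconstructed definition: types other than SubjectArea's are assumptions.
CREATE TABLE Books (
    ClassificationNumber VARCHAR(20) PRIMARY KEY,
    ISBN VARCHAR(20) NOT NULL, Title VARCHAR(255) NOT NULL,
    AuthorNames VARCHAR(255), SubjectArea VARCHAR(255)
);
```
**Integrity Constraints:**
- `ClassificationNumber` is the primary key, so it is unique and not null.
- `ISBN` and `Title` must not be null.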
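```sql
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,  -- length is an assumption
    PhoneNumber VARCHAR(15)
);
```
**Integrity Constraints:**
- `StudentID` is the primary key, so it is unique and not null.
- `Name` must not be null.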
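```sql
CREATE TABLE LibraryStaff (
    StaffID INT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Position VARCHAR(100),
    Email VARCHAR(100) UNIQUE  -- columns other than Position are reconstructed
);
```
**Integrity Constraints:**
- `StaffID` is the primary key, so it is unique and not null.
- `Email` must be unique.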
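```sql
CREATE TABLE Transactions (
    TransactionID INT PRIMARY KEY,
    StudentID INT,
    ClassificationNumber VARCHAR(20),
    DateIssued DATE NOT NULL, DueDate DATE NOT NULL,  -- reconstructed columns
    DateReturned DATE,
    FOREIGN KEY (StudentID) REFERENCES Students(StudentID),
    FOREIGN KEY (ClassificationNumber) REFERENCES Books(ClassificationNumber)
);
```
**Integrity Constraints:**
- `TransactionID` is the primary key, so it is unique and not null.
- `StudentID` and `ClassificationNumber` are foreign keys referencing the Students and Books tables.
- `DateReturned` stays null until the book is returned.

**Summary:**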
- **Books Table**: Stores unique records of books. Attributes such as ISBN and Title are directly related
to the book classified by the primary key.
- **Students Table**: Contains unique records of students with no dependencies on non-key attributes.
- **Library Staff Table**: Maintains records of library staff, ensuring that staff identifiers and emails are
unique.
- **Transactions Table**: Links students to books in terms of issued transactions while capturing
pertinent issuing details. This table uses foreign keys to reference StudentID and ClassificationNumber,
establishing the relationships without introducing transitive dependencies.
By ensuring these tables adhere to 3NF, we maintain the integrity and minimize redundancy within our
database for the university library system. If you need further clarification or specifics on any aspect,
please let me know!
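As a rough check that tables of this shape support the required query "list of books issued to a specific student", the sketch below runs a trimmed version of the schema in an in-memory SQLite database. The column types, the helper `books_issued_to`, and all sample rows (including the ISBN values) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Books (
    ClassificationNumber VARCHAR(20) PRIMARY KEY,
    ISBN VARCHAR(20) NOT NULL,
    Title VARCHAR(255) NOT NULL,
    SubjectArea VARCHAR(255)
);
CREATE TABLE Students (
    StudentID INT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    PhoneNumber VARCHAR(15)
);
CREATE TABLE Transactions (
    TransactionID INTEGER PRIMARY KEY,
    StudentID INT NOT NULL REFERENCES Students(StudentID),
    ClassificationNumber VARCHAR(20) NOT NULL
        REFERENCES Books(ClassificationNumber),
    DateIssued DATE NOT NULL,
    DueDate DATE NOT NULL,
    DateReturned DATE
);
""")

# Made-up sample rows.
conn.executemany("INSERT INTO Books VALUES (?, ?, ?, ?)", [
    ("C-001", "000-0000000001", "Database Basics", "Databases"),
    ("C-002", "000-0000000002", "Query Tuning", "Databases"),
])
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", [
    (1, "Asha", "555-0101"),
    (2, "Ravi", "555-0102"),
])
conn.executemany("INSERT INTO Transactions VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, "C-001", "2024-01-05", "2024-01-19", None),          # still on loan
    (2, 1, "C-002", "2024-01-06", "2024-01-20", "2024-01-10"),  # returned
    (3, 2, "C-002", "2024-01-11", "2024-01-25", None),
])

def books_issued_to(conn, student_id):
    """Titles currently on loan to one student (DateReturned is still NULL)."""
    rows = conn.execute(
        """SELECT b.Title
           FROM Transactions t
           JOIN Books b ON b.ClassificationNumber = t.ClassificationNumber
           WHERE t.StudentID = ? AND t.DateReturned IS NULL
           ORDER BY b.Title""",
        (student_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(books_issued_to(conn, 1))
```

With foreign keys switched on, an attempt to insert a transaction for a student who does not exist is rejected with `sqlite3.IntegrityError`, which is the referential integrity the design relies on.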
Explain how the secondary index can be created in a file. Also, explain the advantages and disadvantages
of using secondary indexes. When should you use secondary Indexes? Give reasons in support of your
answer.
A secondary index is an additional index structure that allows efficient searching of records in a data file
based on attributes other than the primary key. Unlike the primary index, which is typically created on a
unique, ordering key, a secondary index can be created on any attribute (or combination of attributes)
and may contain duplicate values.
A secondary index can be created in the following steps:
1. Choose the Indexing Attribute(s): First, you need to decide which attribute(s) you want to create
   the secondary index on. This decision is driven by the types of queries you anticipate running
   frequently that don't use the primary key. For example, in a library database, you might want
   secondary indexes on Author_Name or Subject_Area.
2. Scan the Data File: The system needs to read through the entire data file, record by record.
3. Extract Indexing Attribute and Record Pointer: For each record, the value(s) of the chosen
   indexing attribute(s) are extracted, along with a pointer (or reference) to the actual location of
   that record in the data file. This pointer could be:
   - Record ID or Primary Key Value: If the primary key is known and relatively small, the
     secondary index might store the indexing attribute value and the corresponding primary
     key value. This requires an additional lookup in the primary index (if it's a separate
     structure) or the data file itself using the primary key to retrieve the full record.
   - Block Number and Offset: The pointer could directly store the block number and the
     offset within that block where the record resides in the data file. This allows direct
     access to the record but might require adjustments if records are moved or the file is
     reorganized.
4. Build the Index Structure: The extracted (indexing attribute value, pointer) pairs are then used
   to build the secondary index structure. Common index structures include:
   - Ordered File: The pairs are sorted based on the indexing attribute value and stored in a
     separate file. For attributes with duplicate values, multiple pointers will be associated
     with the same attribute value.
   - B-trees or B+ trees: These tree-based structures are highly efficient for searching,
     insertion, and deletion, even with a large number of entries. The indexing attribute value
     acts as the search key in the tree. In a B+ tree only the leaf nodes hold pointers to the
     data records, whereas in a B-tree the internal nodes hold them as well.
   - Hash Tables: If range queries are not frequent, a hash table can provide very fast
     lookups for specific attribute values. The indexing attribute value is hashed to find the
     bucket containing the pointers.
5. Store the Index: The created secondary index structure is stored persistently, usually in a
   separate file or a dedicated area within the database system's storage.
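The steps above can be sketched in a few lines of Python for the ordered-file variant. This is only an illustration under simplifying assumptions: the "data file" is an in-memory list, a list position stands in for the record pointer, and the names `data_file`, `build_secondary_index`, and `lookup` are hypothetical.

```python
import bisect

# Hypothetical data file, stored in no particular order of author.
data_file = [
    {"class_no": "C-003", "title": "Operating Systems", "author": "Rao"},
    {"class_no": "C-001", "title": "Database Basics", "author": "Mehta"},
    {"class_no": "C-002", "title": "Query Tuning", "author": "Mehta"},
    {"class_no": "C-004", "title": "Networks", "author": "Anand"},
]

def build_secondary_index(records, attribute):
    """Scan every record, extract (attribute value, pointer) pairs, and sort them."""
    pairs = [(rec[attribute], ptr) for ptr, rec in enumerate(records)]
    pairs.sort()
    return pairs

def lookup(index, records, value):
    """Binary-search the index, then follow each matching pointer into the file."""
    pos = bisect.bisect_left(index, (value,))  # first entry whose key >= value
    results = []
    while pos < len(index) and index[pos][0] == value:
        results.append(records[index[pos][1]])
        pos += 1
    return results

author_index = build_secondary_index(data_file, "author")
print(author_index)
print([rec["title"] for rec in lookup(author_index, data_file, "Mehta")])
```

Duplicate author values simply appear as adjacent index entries with different pointers. A real system would persist the sorted pairs and typically use a B+ tree rather than a flat sorted list.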
Advantages of secondary indexes:
- Improved Query Performance: Secondary indexes significantly speed up queries that search
  based on the indexed attribute(s). Without a secondary index on an attribute like Author_Name,
  a query to find all books by a specific author would require a full scan of the data file. With the
  index, the system can quickly locate the relevant entries in the index and then directly access the
  corresponding records in the data file.
- Support for Diverse Search Criteria: They allow users and applications to efficiently search the
  data based on various criteria beyond the primary key, making the system more flexible and
  user-friendly.
- Faster Sorting on Non-Primary Key Attributes: If the results of a query need to be sorted by an
  attribute with a secondary index, the system can often leverage the sorted order of the index to
  avoid a separate sorting operation on the entire data file.
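One way to observe the effect on query processing is to ask a database for its query plan before and after the index exists. The sketch below uses SQLite through Python's `sqlite3` module; the table `books`, the index name `idx_books_author`, and the helper `plan` are assumptions made for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (class_no TEXT PRIMARY KEY, title TEXT, author TEXT)"
)

def plan(conn, sql, params=()):
    """Return the human-readable detail text of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column holds the detail

query = "SELECT title FROM books WHERE author = ?"

before = plan(conn, query, ("Mehta",))  # no index on author: full table scan
conn.execute("CREATE INDEX idx_books_author ON books(author)")
after = plan(conn, query, ("Mehta",))   # the planner can now use the index

print("before:", before)
print("after: ", after)
```

The first plan reports a scan of the whole table, while the second reports a search that uses `idx_books_author`.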
Disadvantages of secondary indexes:
- Increased Storage Overhead: Secondary indexes require additional storage space to store the
  index structure itself. The size of the index depends on the number of records in the data file,
  the size of the indexing attribute(s), and the type of index structure used.
- Increased Maintenance Overhead: When the data file is modified (insertion, deletion, or update
  of records), the secondary indexes also need to be updated to reflect these changes. This adds to
  the processing time of these operations. For example, inserting a new book might require adding
  an entry to the author index, the subject index, etc.
- Potential Performance Degradation for Certain Operations: While secondary indexes speed up
  retrieval based on the indexed attributes, they can slightly slow down operations that modify the
  data, especially if there are many secondary indexes. The system needs to update each index
  whenever the corresponding data changes.
- Complexity of Index Management: Managing multiple indexes adds complexity to the database
  system's implementation and maintenance. Choosing the right attributes to index and optimizing
  index performance requires careful consideration.
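The maintenance cost can be illustrated with a toy model in which every insert must also touch each secondary index. The class `IndexedFile` and its `index_writes` counter are hypothetical and only count extra writes; they do not measure real I/O.

```python
from collections import defaultdict

class IndexedFile:
    """Toy data file that keeps one hash-style secondary index per attribute."""

    def __init__(self, indexed_attributes):
        self.records = []
        self.indexes = {attr: defaultdict(list) for attr in indexed_attributes}
        self.index_writes = 0  # extra writes caused purely by index upkeep

    def insert(self, record):
        ptr = len(self.records)
        self.records.append(record)
        # Every secondary index needs a new (value -> pointer) entry.
        for attr, index in self.indexes.items():
            index[record[attr]].append(ptr)
            self.index_writes += 1
        return ptr

books = [
    {"title": "Database Basics", "author": "Mehta", "subject": "Databases"},
    {"title": "Query Tuning", "author": "Mehta", "subject": "Databases"},
    {"title": "Networks", "author": "Anand", "subject": "Networking"},
]

plain = IndexedFile([])
indexed = IndexedFile(["author", "subject"])
for book in books:
    plain.insert(book)
    indexed.insert(book)

print(plain.index_writes, indexed.index_writes)
```

Three inserts cost no index writes without indexes and six with two indexes, so the overhead grows with the number of indexes on the file.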
When to use secondary indexes:
- Frequent Queries on Non-Primary Key Attributes: If queries frequently search or filter on
  attributes other than the primary key (for example, Author_Name or Subject_Area in a library
  database), those attributes are good candidates for a secondary index.
  - Reasoning: To significantly improve the response time of these common queries, making
    the system more performant and user-friendly.
- Need to Sort Results by Non-Primary Key Attributes: If you often need to present data sorted by
  attributes other than the primary key, a secondary index on that attribute can speed up the
  sorting process.
  - Reasoning: To avoid costly sorting operations on the entire data file, especially for large
    datasets.
- Queries Involve a Specific Subset of Records Based on Non-Key Attributes: If your queries often
  narrow down the search space based on certain non-key attribute values before looking up
  records by primary key or other criteria, a secondary index on those attributes helps.
  - Reasoning: To quickly identify the relevant subset of records without scanning the entire
    file.
When not to use secondary indexes:
- The Indexed Attribute is Rarely Used in Queries: Creating an index on an attribute that is
  seldom used will result in unnecessary storage and maintenance overhead without providing
  significant performance benefits.
  - Reasoning: To avoid wasting storage space and slowing down data modification
    operations without any corresponding gain in query performance.
- The Data File is Very Small: For very small files, the overhead of maintaining an index might
  outweigh the performance benefits of using it for searches. A full file scan might be just as fast
  or even faster.
  - Reasoning: The initial cost of accessing the index and then the data records might be
    comparable to or greater than simply scanning the small file.
- The Data File Has High Update Frequency: If the data file is updated very frequently (insertions,
  deletions, updates), the overhead of maintaining multiple secondary indexes can become
  significant and impact the overall performance of data modification operations.
- The Indexed Attribute Has Very Few Distinct Values: An index on an attribute with very few
  distinct values (e.g., a boolean flag with mostly one value) might not be very effective in
  narrowing down the search space.
  - Reasoning: The index might still point to a large number of records, and the system
    might end up having to examine many of them anyway.
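The point about attributes with few distinct values can be quantified with a simple selectivity estimate. The functions `selectivity` and `avg_matches_per_lookup` and the sample columns below are illustrative assumptions, not a standard API.

```python
def selectivity(values):
    """Fraction of distinct values: near 1.0 is highly selective, near 0 is not."""
    return len(set(values)) / len(values)

def avg_matches_per_lookup(values):
    """Average number of records an equality lookup on this attribute returns."""
    return len(values) / len(set(values))

# Hypothetical columns from a 100-record file.
is_reference_copy = [True] * 99 + [False]   # boolean flag, mostly one value
classification_numbers = list(range(100))   # unique per record

print(selectivity(is_reference_copy), avg_matches_per_lookup(is_reference_copy))
print(selectivity(classification_numbers), avg_matches_per_lookup(classification_numbers))
```

An index lookup on the flag still leaves 50 records to examine on average (and 99 for the common value), so it saves little compared with a scan, whereas a lookup on the unique attribute returns a single record.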
In summary, the decision to create secondary indexes involves a trade-off between improved query
performance for specific types of searches and the overhead of increased storage space and
maintenance costs. Careful analysis of the expected query patterns and data modification frequency is
crucial for making informed decisions about which attributes to index.