Moocs

The document is a seminar report on Machine Learning Foundations submitted by Rajendra Singh Rathor for the B.Tech in CSE program at Graphic Era Hill University. It includes a certificate of completion, acknowledgments, and detailed content covering various aspects of database management systems, including SQL commands, data manipulation, and relational algebra. The report emphasizes the importance of data integrity, query writing, and the use of ER models in database design.

MOOC BASED SEMINAR REPORT

On

Machine Learning Foundations


Submitted in partial fulfilment of the requirement for Seminar in 6th Semester

of
B.Tech in CSE
RAJENDRA SINGH RATHOR

Under the Guidance of


Mr. Aviral Awasthi
(Assistant Professor, DEPT. OF CSE)

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING


GRAPHIC ERA HILL UNIVERSITY
BHIMTAL

SESSION (2024-2025)
CERTIFICATE

THIS IS TO CERTIFY THAT RAJENDRA SINGH RATHOR HAS SATISFACTORILY
PRESENTED MOOC BASED SEMINAR ON THE COURSE TITLE MACHINE LEARNING
FOUNDATION COURSE IN PARTIAL FULFILMENT OF THE SEMINAR
PRESENTATION REQUIREMENT IN 6TH SEMESTER OF B.TECH. DEGREE COURSE
PRESCRIBED BY GRAPHIC ERA HILL UNIVERSITY DURING THE ACADEMIC
SESSION 2024-2025

MOOCS - Coordinator and Mentor

Mr. Aviral Awasthi

SIGNATURE
TABLE OF CONTENTS

S. NO.   CONTENT
1        ACKNOWLEDGEMENT
2        INTRODUCTION
3        WEEK 1
4        WEEK 2
5        WEEK 3
6        WEEK 4
7        WEEK 5
8        WEEK 6
ACKNOWLEDGEMENT

I take this opportunity to express my profound gratitude and deep regards to my guide, Mr. Aviral
Awasthi, for his exemplary guidance, monitoring, and constant encouragement throughout the course.
The blessing, help, and guidance given by him from time to time helped me throughout the project. The
success and final outcome of this course required a lot of guidance and assistance from many people,
and I am extremely privileged to have received it throughout the completion of my report. All that I have
done is due to such supervision and assistance, and I would not forget to thank them. I am
thankful and fortunate to have received constant encouragement, support, and guidance from all the
people around me, which helped me in successfully completing my online course.
Introduction

Introduction to the DBMS Course

The Database Management System (DBMS) course introduces the concepts and principles behind
managing data in an organized and efficient way. A DBMS is a software system that helps users
create, manage, and interact with databases.
The course covers various aspects of databases, including how data is stored, accessed, and
maintained. It also provides insights into solving real-world problems by organizing data
systematically and securely. This module lays the foundation for understanding how modern
applications rely on databases to function smoothly.

Tables and Keys


Tables are the core structure of any database. They store data in rows and columns, making it easy to
organize and retrieve information. Each column represents a specific attribute, while rows hold
individual records. For example, a "Students" table may have columns like ID, Name, and Age, and
each row would store details for one student.
Keys are essential for ensuring data uniqueness and relationships between tables:
• Primary Key: Uniquely identifies each record in a table. For instance, ID in the "Students"
table can serve as a primary key.
• Foreign Key: Links two tables together by referencing the primary key in another table. This
helps establish relationships between related data.
• Composite Key: A combination of two or more attributes used as a key when a single
attribute isn’t sufficient.
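The three kinds of keys above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The "Courses" and "Enrollments" tables and the sample rows are hypothetical additions for illustration; only the "Students" table with ID, Name, and Age comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Students (
    ID   INTEGER PRIMARY KEY,          -- primary key
    Name TEXT,
    Age  INTEGER
);
CREATE TABLE Courses (                  -- hypothetical table
    CourseID INTEGER PRIMARY KEY,
    Title    TEXT
);
CREATE TABLE Enrollments (              -- hypothetical table
    StudentID INTEGER REFERENCES Students(ID),       -- foreign key
    CourseID  INTEGER REFERENCES Courses(CourseID),  -- foreign key
    PRIMARY KEY (StudentID, CourseID)                -- composite key
);
""")
conn.execute("INSERT INTO Students VALUES (1, 'John Doe', 20)")
conn.execute("INSERT INTO Courses VALUES (101, 'Databases')")
conn.execute("INSERT INTO Enrollments VALUES (1, 101)")


def try_insert(sql):
    """Run an INSERT and report whether a key rule rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_pk = try_insert("INSERT INTO Students VALUES (1, 'Jane Roe', 22)")
dangling_fk = try_insert("INSERT INTO Enrollments VALUES (99, 101)")
duplicate_pair = try_insert("INSERT INTO Enrollments VALUES (1, 101)")
print(duplicate_pk, dangling_fk, duplicate_pair)
```

Each of the three inserts breaks one key rule: a repeated primary key, a foreign key pointing at a student that does not exist, and a repeated composite key.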
Integrity Constraints
Integrity constraints ensure that the data in a database remains accurate, consistent, and reliable. Some
common types of constraints include:
• Primary Key Constraint: Ensures that no two rows in a table have the same primary key
value.
• Foreign Key Constraint: Maintains consistency between tables by ensuring that the value in
the foreign key column matches a value in the referenced table.
• Not Null Constraint: Ensures that certain fields cannot have null (empty) values.
• Unique Constraint: Guarantees that all values in a column are unique.
• Check Constraint: Validates that the data in a column satisfies a specific condition, such as
ensuring the Age column has values greater than zero.
Integrity constraints play a critical role in preserving the quality and reliability of data stored in
databases. They prevent errors and maintain the trustworthiness of the database system.
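As a rough illustration of how these constraints behave, the sketch below declares Not Null, Unique, and Check constraints on a "Students" table in SQLite and tries to insert rows that break each one. The Email column and all sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Students (
    ID    INTEGER PRIMARY KEY,
    Name  TEXT NOT NULL,            -- Not Null constraint
    Email TEXT UNIQUE,              -- Unique constraint (hypothetical column)
    Age   INTEGER CHECK (Age > 0)   -- Check constraint from the text
)
""")
conn.execute("INSERT INTO Students VALUES (1, 'John Doe', 'john@example.com', 20)")

# Each row below violates exactly one constraint.
bad_rows = {
    "not_null": (2, None, "a@example.com", 21),
    "unique": (3, "Jane Roe", "john@example.com", 22),
    "check": (4, "Sam Poe", "sam@example.com", 0),
}

results = {}
for constraint, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?, ?, ?)", row)
        results[constraint] = "ok"
    except sqlite3.IntegrityError:
        results[constraint] = "rejected"

print(results)
```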

ER Models and Diagrams

The Entity-Relationship (ER) model is a high-level data modeling technique
used to visually represent the structure of a database. It helps in understanding
how different entities (objects) in a database relate to one another. This model is
often used in the design phase of database development to create a blueprint of
the database.
Key components of the ER model include:
• Entities: Objects or things in the real world that can be identified
distinctly. For example, a "Student" or "Course".
• Attributes: Properties or details about an entity. For example, a student
entity may have attributes like Name, Age, and Roll Number.
• Relationships: Associations between entities. For example, a student
enrolls in a course.
ER Diagrams
An ER diagram (Entity-Relationship diagram) is a graphical representation of
the ER model. It is widely used to visualize how entities and their relationships
are structured in a database. The main elements of an ER diagram are:
Entities
• Represented as rectangles.
• Example: A "Student" entity might be shown as a rectangle labeled
"Student".
Attributes
• Represented as ovals and connected to their respective entities.
• Example: Attributes like Name and Roll Number would be connected to
the "Student" entity.
Relationships
• Represented as diamonds between two or more entities.
• Example: A "Student" entity might have a relationship called "Enrolled
In" with a "Course" entity.

Cardinality
• Specifies how many instances of one entity are associated with instances
of another entity.
• Types of cardinality include:
o One-to-One (1:1): A student has one ID card.
o One-to-Many (1:N): A department has many students, but each student
belongs to one department.
o Many-to-Many (M:N): Students enroll in many courses, and
courses have many students.
Importance of ER Models and Diagrams
• Clarity: Provides a clear structure for the database design.
• Simplicity: Simplifies communication between developers and
stakeholders.
• Error Reduction: Identifies potential design issues early in the
development process.
• Foundation for Implementation: Acts as a roadmap for creating the
actual database.

Relational Algebra

Introduction to Relational Algebra


Relational Algebra is a procedural query language used in databases. It provides
a set of operations to retrieve and manipulate data stored in relational databases.
Instead of directly accessing the data, relational algebra uses mathematical
notations to describe the steps needed to get the desired output.
Relational Algebra is essential for understanding how database queries are
processed and optimized. It forms the theoretical foundation for Structured
Query Language (SQL) and ensures accurate and efficient data handling.
Basic Operators in Relational Algebra
Relational Algebra includes several basic operators to perform operations on
relations (tables). These operators can be divided into two categories: Unary
Operators (operate on one relation) and Binary Operators (operate on two
relations).
Unary Operators
1. Selection (σ): Retrieves rows from a table that satisfy a specific
condition.
Example: σ(Age > 18) (Students) retrieves all students older than 18.
2. Projection (π): Retrieves specific columns from a table.
Example: π(Name, Age) (Students) retrieves only the Name and Age
columns.
3. Rename (ρ): Renames a relation or its attributes.
Example: ρ(NewName, Students) renames the "Students" table to
"NewName".
Binary Operators
1. Union (∪): Combines rows from two relations, eliminating duplicates.
Example: Students1 ∪ Students2 gives a list of all unique students.
2. Intersection (∩): Retrieves rows that are common in two relations.
Example: Students1 ∩ Students2 gives students present in both lists.
3. Difference (-): Retrieves rows that are in one relation but not in another.
Example: Students1 - Students2 gives students only in Students1.
4. Cartesian Product (×): Combines every row of one relation with every
row of another.
Example: Students × Courses pairs all students with all courses.
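The unary and binary operators above can be imitated in plain Python to see what each one returns. This is only a sketch: relations are modelled as lists of dictionaries or sets of tuples, and the helper names (select, project, rename) and sample rows are made up for the example.

```python
from itertools import product

students = [
    {"Name": "Asha", "Age": 17},
    {"Name": "Ravi", "Age": 19},
    {"Name": "Meena", "Age": 22},
]


def select(relation, predicate):
    """Selection: keep only the rows that satisfy the condition."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def rename(relation, mapping):
    """Rename: give attributes new names according to the mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in relation]


adults = select(students, lambda row: row["Age"] > 18)
adult_names = project(adults, ["Name"])
renamed = rename(adult_names, {"Name": "StudentName"})

# Binary operators on relations modelled as sets of tuples.
students1 = {("Asha",), ("Ravi",)}
students2 = {("Ravi",), ("Meena",)}
courses = {("DBMS",), ("OS",)}

union = students1 | students2
intersection = students1 & students2
difference = students1 - students2
cartesian = {s + c for s, c in product(students1, courses)}
```

Using sets for the binary operators gives duplicate elimination for free, which matches the set-based definition of union in relational algebra.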
Joins
Joins are used to combine data from two or more relations based on a related
column. They are fundamental in relational databases to bring meaningful information
together.
• Theta Join (θ): Combines rows from two relations that satisfy a
condition.
• Equi Join: A special case of Theta Join where the condition is equality.
• Natural Join: Combines rows that have equal values in all columns sharing the
same name, keeping only one copy of each shared column.
• Outer Join: Includes unmatched rows in the results:
o Left Outer Join: Includes all rows from the left relation, even if
there is no match in the right relation.
o Right Outer Join: Includes all rows from the right relation, even if
there is no match in the left relation.
o Full Outer Join: Includes all rows from both relations, with
unmatched rows filled with nulls.
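The difference between a join that keeps only matching rows and a left outer join can be seen in a small SQLite sketch. The table layout and rows are assumptions for illustration; the first query is an equi join written with SQL's JOIN keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Enrollments (StudentID INTEGER, Course TEXT);
INSERT INTO Students VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Enrollments VALUES (1, 'DBMS');
""")

# Equi join: only students that have a matching enrollment row.
inner = conn.execute("""
    SELECT s.Name, e.Course
    FROM Students s JOIN Enrollments e ON s.ID = e.StudentID
    ORDER BY s.ID
""").fetchall()

# Left outer join: every student, with NULL where no enrollment matches.
left = conn.execute("""
    SELECT s.Name, e.Course
    FROM Students s LEFT JOIN Enrollments e ON s.ID = e.StudentID
    ORDER BY s.ID
""").fetchall()

print(inner)
print(left)
```

In the second result the student without an enrollment still appears, and Python shows the missing course as None.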

Division Operator
The Division operator is used to find rows in one relation that match all values
of another relation. It is typically used when dealing with "all" conditions.
Example:
If we have two tables:
• StudentsCourses (Student, Course)
• Courses (Course)
The query "Which students are enrolled in all courses?" can be answered using
the division operator:
StudentsCourses ÷ Courses
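One way to picture the division above is the following Python sketch, which treats StudentsCourses as a set of (Student, Course) pairs. The function name divide and the sample data are hypothetical.

```python
def divide(students_courses, courses):
    """Return the students that are paired with every course in `courses`."""
    students = {student for student, _ in students_courses}
    return {
        student
        for student in students
        if all((student, course) in students_courses for course in courses)
    }


students_courses = {
    ("Asha", "DBMS"),
    ("Asha", "OS"),
    ("Ravi", "DBMS"),
}
courses = {"DBMS", "OS"}

enrolled_in_all = divide(students_courses, courses)
print(enrolled_in_all)
```

Only the student paired with both courses survives the division; a student missing even one course is dropped.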
Structured Query Language (SQL) stands as the cornerstone of relational
database management systems (RDBMS), serving as a universal language for
managing, manipulating, and querying data. Its significance lies in its ability to
interact with databases, enabling users to retrieve, insert, update, and delete data
with ease and efficiency.
SQL operates on the principle of relational algebra, offering a standardized
approach to interact with databases irrespective of the underlying RDBMS
platform. It provides a rich set of commands and syntax for performing various
operations on data, making it a versatile tool for developers, data analysts, and
database administrators alike.
The primary components of SQL include:
1. Data Definition Language (DDL): DDL commands facilitate the creation,
modification, and deletion of database objects such as tables, indexes, and
views. Commands like CREATE, ALTER, and DROP are used to define the
structure and organization of data within a database.
2. Data Manipulation Language (DML): DML commands enable users to
manipulate the data stored in tables. Commands like SELECT, INSERT,
UPDATE, and DELETE are used to retrieve, add, modify, and remove data
from tables, facilitating seamless data manipulation.
3. Data Control Language (DCL): DCL commands govern the access and
permissions granted to users for database objects. Commands like GRANT and
REVOKE regulate user privileges, ensuring data security and integrity within
the database environment.
4. Data Query Language (DQL): DQL is primarily concerned with querying
data from databases using SELECT statements. It allows users to retrieve
specific data subsets based on specified criteria, enabling comprehensive data
analysis and reporting.

SQL

Introduction

The SQL course commenced with an in-depth introduction to Structured Query
Language (SQL), elucidating its significance in database management systems.
It laid the foundation by explaining the fundamental concepts and principles of
SQL, providing a comprehensive understanding of its usage and applications in
database management.

Commands and Data Types

This section delved into the various SQL commands essential for data
manipulation and retrieval. It covered a wide array of commands including
SELECT, INSERT, UPDATE, and DELETE, elucidating their syntax and
usage. Additionally, the course explored different data types supported by SQL,
emphasizing their significance in defining the structure of a database.

Constraints

Constraints play a crucial role in maintaining data integrity within a database.
This section provided an extensive overview of constraints in SQL, including
NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, and CHECK
constraints. It elucidated how constraints ensure data consistency and prevent
anomalies in the database.

Sample Tables

Understanding the structure and organization of tables is imperative in SQL. In
this section, participants were introduced to sample tables representing different
entities and relationships. They learned to create, modify, and manage tables
using SQL commands, thereby gaining practical experience in database schema
design.

Query Writing

Query writing is a fundamental skill in SQL for extracting meaningful insights
from databases. This section focused on honing query writing skills, covering
various clauses such as SELECT, FROM, WHERE, GROUP BY, HAVING, and ORDER BY.
Through hands-on exercises, participants gained proficiency in crafting
complex queries to retrieve specific data from databases efficiently.
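A single query can combine all of the clauses named in the Query Writing section. The sketch below runs one in SQLite; the Dept column and the sample rows are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Dept TEXT);
INSERT INTO Students VALUES
    (1, 'Asha', 20, 'CSE'),
    (2, 'Ravi', 22, 'CSE'),
    (3, 'Meena', 17, 'CSE'),
    (4, 'Kiran', 21, 'ECE');
""")

rows = conn.execute("""
    SELECT Dept, COUNT(*) AS Adults   -- columns to return
    FROM Students                     -- source table
    WHERE Age >= 18                   -- filter rows before grouping
    GROUP BY Dept                     -- one group per department
    HAVING COUNT(*) >= 2              -- filter groups after grouping
    ORDER BY Dept                     -- sort the final result
""").fetchall()

print(rows)
```

WHERE removes the under-age student before grouping, and HAVING then drops the department that has fewer than two remaining students.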

Joins in SQL

Joins are pivotal for fetching data from multiple tables based on common
columns. This section elucidated different types of joins including INNER
JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. Participants
learned to leverage joins effectively to combine data from disparate tables,
facilitating comprehensive data analysis.

Subquery

Subqueries enable nesting one query within another, allowing for complex data
retrieval and manipulation. This section explored the concept of subqueries,
illustrating their applications in filtering, sorting, and aggregating data.
Participants gained insights into writing efficient subqueries to solve intricate
database problems.
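As a small illustration of nesting, the sketch below uses a subquery in the WHERE clause to find students older than the average age. The table contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
INSERT INTO Students VALUES (1, 'Asha', 20), (2, 'Ravi', 22), (3, 'Meena', 18);
""")

# The inner query computes one value (the average age);
# the outer query compares every row against it.
older_than_average = conn.execute("""
    SELECT Name
    FROM Students
    WHERE Age > (SELECT AVG(Age) FROM Students)
    ORDER BY Name
""").fetchall()

print(older_than_average)
```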

Derived Table

Derived tables, also known as inline views, are temporary result sets generated
within a SQL query. This section delved into the creation and utilization of
derived tables to simplify complex queries and improve query performance.
Participants learned to leverage derived tables effectively for data analysis and
reporting purposes.
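A derived table is simply a subquery placed in the FROM clause and given an alias. The following SQLite sketch, with a hypothetical Dept column and sample rows, first computes the average age per department and then filters that temporary result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Dept TEXT);
INSERT INTO Students VALUES
    (1, 'Asha', 20, 'CSE'),
    (2, 'Ravi', 22, 'CSE'),
    (3, 'Kiran', 19, 'ECE');
""")

# The inner SELECT is the derived table, aliased as d.
rows = conn.execute("""
    SELECT d.Dept, d.AvgAge
    FROM (SELECT Dept, AVG(Age) AS AvgAge
          FROM Students
          GROUP BY Dept) AS d
    WHERE d.AvgAge > 20
    ORDER BY d.Dept
""").fetchall()

print(rows)
```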

Views

Views provide a virtual representation of data stored in one or more tables,
offering a layer of abstraction for data access. This section elucidated the
creation, modification, and usage of views in SQL. Participants learned to
create custom views tailored to specific business requirements, enhancing data
accessibility and security.
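The sketch below creates a view in SQLite and shows that it reflects later changes to the underlying table, since a view stores a query rather than a copy of the data. The view name AdultStudents and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
INSERT INTO Students VALUES (1, 'Asha', 20), (2, 'Meena', 17);

-- The view exposes only names and ages of students aged 18 or more.
CREATE VIEW AdultStudents AS
    SELECT Name, Age FROM Students WHERE Age >= 18;
""")

query = "SELECT Name FROM AdultStudents ORDER BY Name"
before = conn.execute(query).fetchall()

# A new row in the base table shows up in the view automatically.
conn.execute("INSERT INTO Students VALUES (3, 'Ravi', 19)")
after = conn.execute(query).fetchall()

print(before, after)
```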

In conclusion, the SQL course comprehensively covered essential topics
ranging from basic commands to advanced query writing techniques and
database management concepts. Participants acquired practical skills and
theoretical knowledge essential for proficiently working with
databases using SQL.

Data Manipulation Language
Introduction to Data Manipulation Language

Data Manipulation Language (DML) is a category of SQL (Structured Query
Language) commands used to manage and manipulate data stored in a database.
Unlike commands that define the structure of the database (DDL), DML
focuses on handling the data itself, such as adding, updating, or removing
records from database tables.

DML operations are essential for interacting with the database and ensuring
data remains up-to-date and accurate.

INSERT Command

The INSERT command is used to add new rows (records) to a table.

Example:

Imagine a "Students" table with columns for ID, Name, and Age. To add a new
student, you would use the INSERT command. For example, you could add a
student with an ID of 1, a name of "John Doe," and an age of 20.

This allows new records to be stored in the database efficiently.

Points to Note:

• All required columns must be provided with appropriate values.
• If some columns are left out, they must either have default values or
allow null values.
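The INSERT example and the points above might look like this in practice. It is a sketch in SQLite; the default age of 18 and the second student are assumptions added to show what happens when a column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Students (
    ID   INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Age  INTEGER DEFAULT 18   -- hypothetical default, used when Age is omitted
)
""")

# All columns supplied: the student from the example in the text.
conn.execute(
    "INSERT INTO Students (ID, Name, Age) VALUES (?, ?, ?)",
    (1, "John Doe", 20),
)
# Age left out: the column falls back to its default value.
conn.execute("INSERT INTO Students (ID, Name) VALUES (?, ?)", (2, "Jane Roe"))

rows = conn.execute("SELECT ID, Name, Age FROM Students ORDER BY ID").fetchall()
print(rows)
```

Passing values through ? placeholders rather than building the SQL string by hand is the usual way to supply data from Python.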

UPDATE Command

The UPDATE command is used to modify existing records in a table.

Example:

Suppose you need to update the age of the student with an ID of 1 in the
"Students" table. If the student’s age was originally 20 and you want to change
it to 21, you would use the UPDATE command.

This is useful for correcting or revising existing information without deleting
and re-entering the data.

Points to Note:

• The condition specified ensures that only the intended records are
updated.
• If no condition is given, all records in the table will be updated, which
might not be desirable.
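The UPDATE example from the text, changing the age of the student with ID 1 from 20 to 21, can be sketched in SQLite as follows. The second student is a hypothetical row added to show that the condition leaves other records untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
INSERT INTO Students VALUES (1, 'John Doe', 20), (2, 'Jane Roe', 19);
""")

# The WHERE clause restricts the change to the student with ID 1.
cursor = conn.execute("UPDATE Students SET Age = ? WHERE ID = ?", (21, 1))
updated = cursor.rowcount  # number of rows the statement changed

rows = conn.execute("SELECT ID, Name, Age FROM Students ORDER BY ID").fetchall()
print(updated, rows)
```

Checking the row count after an UPDATE is a simple guard against a condition that matched more, or fewer, records than intended.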

DELETE Command

The DELETE command is used to remove records from a table.

Example:

If a student with an ID of 1 leaves the system and you no longer need to keep
their information, you can use the DELETE command to remove their record
from the "Students" table.

This helps maintain a clean and accurate database by removing outdated or
unnecessary data.

Points to Note:

• Be cautious while using the DELETE command, especially without
conditions, as it can remove all records from a table.
• For large-scale deletions, ensure proper backups are in place.
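A minimal SQLite sketch of the DELETE example is shown below, removing only the student with ID 1. The second student is a hypothetical row included to confirm that the condition protects the rest of the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
INSERT INTO Students VALUES (1, 'John Doe', 20), (2, 'Jane Roe', 19);
""")

# Without the WHERE clause this statement would empty the whole table.
cursor = conn.execute("DELETE FROM Students WHERE ID = ?", (1,))
deleted = cursor.rowcount  # number of rows removed

remaining = conn.execute("SELECT ID, Name FROM Students ORDER BY ID").fetchall()
print(deleted, remaining)
```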

Data Definition Language
Introduction to Data Definition Language

Data Definition Language (DDL) consists of commands used to define, modify,
or remove the structure of database objects such as tables, indexes, and
schemas. Unlike Data Manipulation Language (DML), which deals with data,
DDL commands focus on the design and structure of the database.

DDL operations are crucial for creating and maintaining the database
framework, ensuring that it meets the requirements of the application or
organization.

CREATE Command

The CREATE command is used to create new database objects like tables,
indexes, or schemas.

Example:

If you want to create a table named "Students" with columns for ID, Name, and
Age, the CREATE command helps set up the structure for storing this data.

This is the starting point for defining the data storage framework of the
database.

ALTER Command

The ALTER command is used to modify the structure of an existing database
object, such as adding a new column to a table or changing the data type of an
existing column.

Example:

If the "Students" table needs an additional column for storing email addresses,
the ALTER command can be used to add the new column without affecting the
existing data.
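The CREATE and ALTER examples can be tried together in SQLite, as in the sketch below. The column types and the sample row are assumptions; SQLite supports adding a column with ALTER TABLE, though other kinds of ALTER vary between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define the structure of the "Students" table.
conn.execute("CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.execute("INSERT INTO Students VALUES (1, 'John Doe', 20)")

# ALTER: add an Email column; existing rows keep their data.
conn.execute("ALTER TABLE Students ADD COLUMN Email TEXT")

# PRAGMA table_info lists one row per column; index 1 is the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(Students)")]
row = conn.execute("SELECT * FROM Students").fetchone()
print(columns, row)
```

The existing row is unchanged apart from the new column, which starts out as NULL (None in Python).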

DROP Command

The DROP command is used to permanently delete database objects such as
tables or schemas. Once executed, the object and all its data are removed and
cannot be recovered unless a backup exists.

Example:

If the "Students" table is no longer needed, the DROP command can remove it
from the database entirely.

This command is useful for cleaning up unnecessary or outdated objects, but it
must be used with caution due to its irreversible nature.
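The effect of DROP can be checked with a short SQLite sketch: after the command runs, the table no longer appears in the catalog and queries against it fail. The column types and sample row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.execute("INSERT INTO Students VALUES (1, 'John Doe', 20)")
conn.commit()

# DROP removes the table definition together with all of its rows.
conn.execute("DROP TABLE Students")

# sqlite_master is SQLite's catalog of schema objects.
tables = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

try:
    conn.execute("SELECT * FROM Students")
    still_exists = True
except sqlite3.OperationalError:
    # SQLite reports that the table does not exist any more.
    still_exists = False

print(tables, still_exists)
```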

TRUNCATE Command

The TRUNCATE command is used to remove all rows from a table without
deleting the table itself. Unlike the DELETE command, TRUNCATE resets the
table and frees up storage space used by the data.

Example:

If you want to remove all records from the "Students" table but keep the table
structure intact for future use, the TRUNCATE command is ideal.

It is typically faster than DELETE for bulk data removal.

DELETE Command (in Context of DDL)

DELETE is a DML command rather than a DDL command, but it is contrasted with
TRUNCATE here because both remove rows from a table. Unlike TRUNCATE, DELETE allows you
to specify which records to remove by using a condition.

Example:

If you only need to remove records of students older than 25 from the
"Students" table, DELETE is appropriate.

This makes DELETE more flexible than TRUNCATE but less efficient for
clearing entire tables.

Data Control Language
Introduction to Data Control Language

Data Control Language (DCL) consists of commands used to control access to
data in a database. DCL ensures that only authorized users have access to
sensitive information or specific database operations. The two primary DCL
commands are GRANT and REVOKE, which help manage user permissions
and maintain data security.

GRANT Command

The GRANT command is used to provide specific privileges to users or roles in
the database. These privileges define what operations a user can perform, such
as reading, writing, or modifying data.

Example:

If a user named "Alice" needs permission to read data from a table called
"Students," the GRANT command can give her the necessary access.

GRANT can assign various privileges, including:

• SELECT: Allows the user to view data.
• INSERT: Allows the user to add new records.
• UPDATE: Allows the user to modify existing records.
• DELETE: Allows the user to remove records.
• ALL PRIVILEGES: Grants all possible permissions to the user.

The GRANT command ensures that users only have access to the data and
operations they need, minimizing security risks.

REVOKE Command

The REVOKE command is used to take back privileges previously granted to a
user or role. It ensures that access to certain data or operations can be restricted
when it is no longer required or if there are security concerns.

The REVOKE command helps maintain strict control over database security by
allowing administrators to modify or remove user privileges as necessary.

Normalization
Introduction to Normalization

Normalization is a process used in database design to organize data into tables
in a way that reduces redundancy and ensures data integrity. It involves dividing
large tables into smaller ones and defining relationships between them.

The main goals of normalization are:

• To minimize duplicate data.
• To ensure data is stored logically and consistently.
• To make it easier to maintain and update the database.

Normalization is achieved through a series of steps, called normal forms (NF),
each with specific rules and criteria.

First Normal Form (1NF)

A table is in the First Normal Form (1NF) if:

1. All columns contain atomic (indivisible) values.
2. Each column contains values of a single type.
3. Each row in the table is unique, meaning there are no duplicate rows.

Example:

A table storing student details is not in 1NF if the "Subjects" column contains
multiple subjects like "Math, Science". To make it 1NF, each subject should be
stored in a separate row.
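The 1NF fix described above, giving each subject its own row, can be sketched in a few lines of Python. The column names other than "Subjects" and the sample rows are hypothetical.

```python
# Not in 1NF: the Subjects column holds several values in one field.
unnormalized = [
    {"StudentID": 1, "Name": "Asha", "Subjects": "Math, Science"},
    {"StudentID": 2, "Name": "Ravi", "Subjects": "Math"},
]

# In 1NF: one atomic subject per row.
first_normal_form = [
    {"StudentID": row["StudentID"], "Name": row["Name"], "Subject": subject.strip()}
    for row in unnormalized
    for subject in row["Subjects"].split(",")
]

for row in first_normal_form:
    print(row)
```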

Second Normal Form (2NF)

A table is in the Second Normal Form (2NF) if:

1. It is in 1NF.
2. All non-key attributes are fully dependent on the primary key.

Example:

In a "StudentSubjects" table, if "StudentID" and "SubjectID" together form the
primary key but "StudentName" depends only on "StudentID," the table is not
in 2NF. To make it 2NF, "StudentName" should be moved to a separate table
with "StudentID" as its primary key.
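The 2NF decomposition in this example can be pictured with plain Python data. The sketch splits "StudentSubjects" into one structure keyed by StudentID and one holding only the key pairs; the sample values are made up.

```python
# StudentName depends only on StudentID, which is part of the composite key.
student_subjects = [
    {"StudentID": 1, "SubjectID": "M1", "StudentName": "Asha"},
    {"StudentID": 1, "SubjectID": "S1", "StudentName": "Asha"},
    {"StudentID": 2, "SubjectID": "M1", "StudentName": "Ravi"},
]

# New table keyed by StudentID: each name is now stored once.
students = {row["StudentID"]: row["StudentName"] for row in student_subjects}

# Remaining table keeps only the composite key (StudentID, SubjectID).
enrollments = sorted({(row["StudentID"], row["SubjectID"]) for row in student_subjects})

print(students)
print(enrollments)
```

After the split, changing a student's name means updating a single entry instead of every row that mentions the student.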

Third Normal Form (3NF)

A table is in the Third Normal Form (3NF) if:

1. It is in 2NF.
2. There are no transitive dependencies.

Example:

In a "Students" table, if "StudentID" determines "Department" and
"Department" determines "DepartmentHead," the table is not in 3NF. To
achieve 3NF, "DepartmentHead" should be moved to a separate table with
"Department" as its primary key.

Boyce-Codd Normal Form (BCNF)

A table is in the Boyce-Codd Normal Form (BCNF) if:

1. It is in 3NF.
2. For every functional dependency (A → B), A must be a superkey (a key
that uniquely identifies rows).

Example:

In a "Courses" table, if both "CourseID" and "InstructorID" together form the
primary key, but "InstructorID" also uniquely determines "InstructorName," the
table violates BCNF. To fix this, "InstructorName" should be moved to a
separate table with "InstructorID" as its primary key.
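The BCNF rule can be checked against sample data with the sketch below. The helper names determines and is_superkey and the rows are hypothetical, and a check on sample rows only shows that the data is consistent with a dependency; it does not prove the dependency holds for the schema in general.

```python
def determines(rows, lhs, rhs):
    """True if rows with equal lhs values always have equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    keys = [tuple(row[a] for a in attributes) for row in rows]
    return len(keys) == len(set(keys))


courses = [
    {"CourseID": "C1", "InstructorID": "I1", "InstructorName": "Rao"},
    {"CourseID": "C2", "InstructorID": "I1", "InstructorName": "Rao"},
    {"CourseID": "C2", "InstructorID": "I2", "InstructorName": "Sen"},
]

# InstructorID -> InstructorName holds in the data ...
fd_holds = determines(courses, ["InstructorID"], ["InstructorName"])
# ... but InstructorID alone does not identify a row.
lhs_is_superkey = is_superkey(courses, ["InstructorID"])
violates_bcnf = fd_holds and not lhs_is_superkey

print(fd_holds, lhs_is_superkey, violates_bcnf)
```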

Normalization helps in designing efficient and reliable databases by reducing
redundancy and improving consistency.

File Structure
Files and Indexing

In databases, files store the data, and indexing helps improve the speed of data
retrieval. Indexing is a technique used to quickly locate and access data in a
database table without having to scan every row.

Files:

Files are used to store data in a structured way. In databases, data is usually
stored in files that represent tables. These files can be organized in different
formats, such as flat files or more complex hierarchical or relational formats.

Indexing:

Indexing involves creating a separate data structure (an index) that points to the
locations of data within the file. This index allows for faster retrieval of data by
searching through the index rather than scanning the entire file.

There are different types of indexing techniques:

• Primary Index: Built on the primary key, it ensures unique access to
records.
• Secondary Index: Built on non-primary key attributes, it allows efficient
access to records based on attributes other than the primary key.
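The idea of an index as a separate structure that points to record locations can be sketched in Python. Here a sorted list of (Name, position) pairs acts as a secondary index over an unsorted list of records, and a binary search replaces a full scan. The function name lookup_by_name and the data are hypothetical, and real database indexes are usually tree-based rather than flat lists.

```python
import bisect

# The "file": records stored in arrival order, not sorted by Name.
records = [
    {"ID": 3, "Name": "Meena"},
    {"ID": 1, "Name": "Asha"},
    {"ID": 2, "Name": "Ravi"},
]

# Secondary index on Name: sorted (key, position-in-file) pairs.
name_index = sorted((row["Name"], position) for position, row in enumerate(records))
index_keys = [key for key, _ in name_index]


def lookup_by_name(name):
    """Find matching records through the index instead of scanning the file."""
    i = bisect.bisect_left(index_keys, name)  # binary search on the sorted keys
    matches = []
    while i < len(name_index) and name_index[i][0] == name:
        matches.append(records[name_index[i][1]])
        i += 1
    return matches


print(lookup_by_name("Ravi"))
```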

B-Trees

B-Trees are a type of self-balancing tree data structure that maintains sorted
data and allows searches, insertions, deletions, and other operations in
logarithmic time. They are widely used in database indexing because they keep
data sorted and are efficient for disk-based storage.

Characteristics of B-Trees:

• Balanced: All leaf nodes are at the same level.
• Multi-way Tree: Each node can have multiple children.
• Efficient: B-Trees allow fast searching, insertion, and deletion, making
them suitable for use in databases.

B-Trees are used in situations where the data is too large to fit into memory and
needs to be stored on disk, such as in databases and filesystems.

B+ Trees

B+ Trees are an extension of B-Trees in which all data records are stored in the
leaf nodes, and internal nodes store only keys and pointers to child nodes. The
main advantage of B+ Trees is that they allow efficient range queries, since all
the data records are stored in leaf nodes that are linked sequentially.

Characteristics of B+ Trees:

• Leaf Nodes Linked: All leaf nodes are linked together, making range
queries faster and easier.
• Internal Nodes Store Keys: Internal nodes do not store data, only keys
that help in navigating the tree.
• Efficient for Range Queries: Since leaf nodes are linked in a sequence,
you can perform range queries by simply following the links between leaf
nodes.
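The range-query behaviour described above can be sketched with only the leaf level of a B+ Tree. This is a simplification under stated assumptions: the internal nodes are replaced by a binary search over the first key of each leaf, and the class and function names (Leaf, build_leaves, range_query) are hypothetical.

```python
import bisect


class Leaf:
    """A leaf node holding sorted keys and a link to the next leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None


def build_leaves(sorted_keys, capacity):
    """Split sorted keys into leaves of at most `capacity` keys and link them."""
    leaves = [
        Leaf(sorted_keys[i:i + capacity])
        for i in range(0, len(sorted_keys), capacity)
    ]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_query(leaves, low, high):
    """Return all keys k with low <= k <= high by walking the leaf links."""
    # Stand-in for descending the internal nodes: locate the starting leaf.
    first_keys = [leaf.keys[0] for leaf in leaves]
    start = max(bisect.bisect_right(first_keys, low) - 1, 0)

    leaf, result = leaves[start], []
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # past the end of the range: stop early
            if key >= low:
                result.append(key)
        leaf = leaf.next  # follow the link to the next leaf
    return result


leaves = build_leaves([5, 10, 15, 20, 25, 30, 35], capacity=3)
found = range_query(leaves, 12, 30)
print(found)
```

Once the starting leaf is found, the query never goes back up the tree; it only follows the next links until it passes the upper bound.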

Conclusion
In this seminar report on Database Management Systems (DBMS), we have
explored several fundamental concepts and modules that form the backbone of
modern database management. From understanding the core principles of
DBMS to diving deep into the technical aspects of Data Definition Language
(DDL), Data Manipulation Language (DML), Normalization, File Structures,
and more, this report aims to provide a comprehensive overview of how
databases are designed, managed, and optimized.

The ER Models and Diagrams help in visualizing relationships and ensuring
that the database structure aligns with real-world processes. The introduction of
Relational Algebra and Data Manipulation Language enhances our ability to
query and manipulate data effectively. Normalization, particularly through the
1NF, 2NF, 3NF, and BCNF techniques, ensures data integrity and eliminates
redundancy, while File Structures like B-Trees and B+ Trees optimize the
efficiency of database queries.

Through this exploration, we see that DBMS not only plays a crucial role in
organizing, storing, and retrieving data but also supports the smooth functioning
of various applications, ranging from small-scale systems to large enterprise-
level solutions. The knowledge gained from studying these topics allows us to
better understand the complexity of database design and management, preparing
us for real-world challenges.

The understanding of Data Control Language (DCL) and security mechanisms
provided here further highlights the importance of securing data access and
preventing unauthorized manipulation. With this knowledge, we are equipped to
build more efficient, scalable, and secure databases that meet the growing
demands of data in today's digital age.

In conclusion, Database Management Systems serve as the backbone of modern
information systems. Understanding the key components, from the basic
structures to the advanced mechanisms of data handling, is vital for anyone
pursuing a career in this field. This seminar has provided valuable insights into
DBMS, enhancing our ability to design, manage, and query databases
effectively.

Certificate
