PracticalList - EDT - BCA - 2024 SET B1 - 4

The document outlines the practical workbook plan for the BCA 6th semester course CS5006: Essentials of Data and Text Processing at B.V. Patel Institute of Computer Science for the academic year 2024-2025. It details course objectives, outcomes, and the mapping of course outcomes to program outcomes, along with a schedule of practical assignments and their respective assessment criteria. Each practical task focuses on various aspects of data analysis, including data cleaning, transformation, visualization, and web scraping.


B.V. Patel Institute of Computer Science 2024-2025

Practical Workbook Plan

BCA (6th Semester)

CS5006: Essentials of Data and Text Processing

Course Objective: To familiarise students with the concepts of data and text analysis and levels of measurement, and to enable them to choose relevant cleaning and transformation techniques that overcome data fallacies, so that data can be represented and analysed effectively to identify useful patterns.

Course Outcomes: Upon completion of the course, the student shall be able to

CO1: Describe the data types and quality measurements used in data analytics.
CO2: Discuss the importance of, and differences between, Data Mining, Data Science and Machine Learning for emerging domains.
CO3: Identify and use selected data acquisition techniques for data gathering through scraping data from specific data sources.
CO4: Summarise, describe and visualise data by utilising relevant data representation techniques.
CO5: Apply relevant data cleaning and transformation techniques to standardise the data for analytics along with dimension reduction.
CO6: Understand and apply approaches to text and document processing for statistical modelling and document summarization.

Programme Outcomes: The student will have

PO1: Ability to understand the concepts of key areas in computer science.


PO2: Ability to design and develop system, component or process as well as test and maintain it so as to provide promising solutions to industry and society.
PO3: Effective communication and presentation skill.
PO4: Ability to understand professional and ethical responsibility.
PO5: Recognition of the need for life-long learning.

Programme Outcomes and Course Outcomes mapping:

Course Outcomes Programme Outcomes


PO1 PO2 PO3 PO4 PO5
CO1 √
CO2 √ √
CO3 √ √
CO4 √ √ √
CO5 √ √
CO6 √ √ √ √ √

MS. BHUMIKA DESAI, MS. JENISHA TAILOR 1



Unit  | Number of Questions | Time required to implement and debug the question (in hours) | Minimum required number of workbook certification | Submission Deadline
1     | 2                   | 9                                                            | 2                                                 | 4th Week of Semester
2     | 1                   | 6                                                            | 1                                                 | 6th Week of Semester
3     | 3                   | 12                                                           | 3                                                 | 8th Week of Semester
4     | 1                   | 6                                                            | 1                                                 | 10th Week of Semester
5     | 1                   | 6                                                            | 1                                                 | 12th Week of Semester
6     | 2                   | 9                                                            | 2                                                 | 14th Week of Semester
TOTAL | 10                  | 48                                                           | 10                                                | -


Set B

Practical No. 1    Enrollment No:

Practical Problem: Consider the University data set:

Student ID | Full Name   | Course | Major            | Year of Study | Gender | Age | CGPA | Attendance % | Scholarship
S001       | Aarav Shah  | B.Tech | Computer Science | 2nd Year      | Male   | 20  | 8.5  | 92%          | Yes
S002       | Riya Patel  | B.Sc.  | Mathematics      | 3rd Year      | Female | 21  | 7.8  | 88%          | No
S003       | Manan Mehta | B.Com  | Finance          | 1st Year      | Male   | 19  | 8.2  | 85%          | Yes
S004       | Pooja Desai | B.A.   | Literature       | 4th Year      | Female | 22  | 9.1  | 95%          | Yes
S005       | Karan Singh | B.Tech | Electrical       | 2nd Year      | Male   | 20  | 7.5  | 80%          | No
S006       | Sneha Roy   | B.Sc.  | Physics          | 3rd Year      | Female | 21  | 8.3  | 90%          | Yes
S007       | Ankit Kumar | B.Com  | Marketing        | 1st Year      | Male   | 18  | 7.9  | 87%          | No

1. How many elements are in the data set? Write down these elements.
2. How many variables are in the data set? Write down these variables.
3. How many observations are in the data set? Write down these observations.
4. Which of the above variables are qualitative and which are quantitative?
5. Identify the scale of measurement used to store data in each of the above
given variables and justify the same.
6. Write all steps to store this tabular data into CSV format.
7. Write the statement to read and view CSV file of above given data in R.
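
Steps 6 and 7 can be sketched as follows. The task itself asks for R, where `read.csv()` followed by `View()` or `print()` is the usual route; the block below is only a Python analogue using the standard library. The file name `university.csv` and the temporary directory are assumptions, and only the first three rows of the data set are included for brevity.

```python
import csv
import os
import tempfile

# Column names and three sample rows taken from the University data set above.
HEADER = ["Student ID", "Full Name", "Course", "Major", "Year of Study",
          "Gender", "Age", "CGPA", "Attendance %", "Scholarship"]
ROWS = [
    ["S001", "Aarav Shah", "B.Tech", "Computer Science", "2nd Year", "Male", 20, 8.5, "92%", "Yes"],
    ["S002", "Riya Patel", "B.Sc.", "Mathematics", "3rd Year", "Female", 21, 7.8, "88%", "No"],
    ["S003", "Manan Mehta", "B.Com", "Finance", "1st Year", "Male", 19, 8.2, "85%", "Yes"],
]


def write_csv(path):
    """Store the table as comma-separated text: one header line, then one line per student."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(ROWS)


def read_csv(path):
    """Read the file back; every value is returned as a string keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Hypothetical file name, written to the system temporary directory.
csv_path = os.path.join(tempfile.gettempdir(), "university.csv")
write_csv(csv_path)
records = read_csv(csv_path)
for record in records:
    print(record["Student ID"], record["Full Name"], record["CGPA"])
```

Because CSV stores everything as text, numeric columns such as Age and CGPA need explicit conversion (for example `float(record["CGPA"])`) before any arithmetic.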

Objective(s): To understand the types of data and to identify the scale of measurement for each variable.
Pre-requisite: Basics of data and its characteristics
CO(s) to be achieved: CO1
Skill mapped: Analytical Skill, Technical Writing Skill
Nature of workbook submission: Handwritten practical solution with output
Assessment (Parameter [Marks]):
Understanding of Scale of Measurements [5 Marks]
Basics of R [5 Marks]
Signature & Date


Practical No. 2 Enrollment No:


Practical Problem: Refer to the Kaggle data sets (https://www.kaggle.com/datasets?fileType=csv) and download the dataset for Car Price Prediction.
Answer the following questions:
1. Download the dataset and summarize it using Python or R.
2. Calculate the mean, median, and standard deviation for the Price column.
3. Visualize the distribution of Mileage using a histogram.
4. Check for missing values in the dataset and handle them by replacing with
the mean.
5. Identify categorical variables in the dataset (e.g., Car Make, Transmission)
and explain why they are categorical.
6. Normalize the Price column to the range [0, 1].
7. Find the correlation between Price and Mileage.
8. Identify any outliers in the Price column using the IQR method.
9. Categorize the Price column into "Low", "Medium", and "High" based on
predefined thresholds.
10. Create a simple scatter plot between Mileage and Price.
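
Several of the numeric tasks above (2, 6, 7, 8 and 9) can be sketched with the Python standard library. The values below are hypothetical stand-ins, since the column names and contents of the actual Kaggle file are not given here, and the Low/Medium/High thresholds are likewise assumed for illustration. The histogram and scatter plot (tasks 3 and 10) need a plotting library and are left out.

```python
import math
import statistics

# Hypothetical values; the real Kaggle file's column names and contents may differ.
prices = [450000, 520000, 610000, 580000, 700000, 495000, 2500000]
mileage = [22.0, 20.5, 18.0, 19.0, 16.5, 21.0, 9.0]

# Task 2: central tendency and spread of Price (sample standard deviation).
mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
stdev_price = statistics.stdev(prices)

# Task 6: min-max normalisation maps the smallest price to 0 and the largest to 1.
lo, hi = min(prices), max(prices)
normalised = [(p - lo) / (hi - lo) for p in prices]


# Task 7: Pearson correlation coefficient between two equally long lists.
def pearson(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


r = pearson(prices, mileage)

# Task 8: IQR rule. Values beyond 1.5 * IQR from the quartiles are flagged.
# Quartile conventions vary between tools; this uses the library default.
q1, _, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1
outliers = [p for p in prices if p < q1 - 1.5 * iqr or p > q3 + 1.5 * iqr]


# Task 9: categorise using assumed thresholds (not specified in the task).
def price_band(price, low_limit=500000, high_limit=1000000):
    if price < low_limit:
        return "Low"
    if price < high_limit:
        return "Medium"
    return "High"


categories = [price_band(p) for p in prices]

print(mean_price, median_price, stdev_price)
print(round(r, 3), outliers, categories)
```

With real data the same steps apply once the Price and Mileage columns have been read from the CSV file and converted to numbers.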
Objective(s): To apply subset operations on a data frame and perform basic operations on it.
Pre-requisite: Scale of Measurement, Fundamental knowledge of Central Tendency
CO(s) to be achieved: CO1
Skill mapped: Technical Skill, Technical Writing Skill
Nature of workbook submission: Handwritten practical solution with output
Assessment (Parameter [Marks]):
Understanding of Scale of Measurements [2 Marks]
Operations in R [5 Marks]
Technical Skill [3 Marks]
Signature & Date


Practical No. 3    Enrollment No:

Practical Problem:
1. Construct a sitemap to extract job postings, including titles, companies, locations, and salaries, from a job portal (e.g., LinkedIn, Glassdoor, or Indeed).
2. Perform web scraping across multiple job listings using the designed sitemap.
3. Clearly outline the detailed steps to accomplish tasks 1 and 2.
4. Verify if the scraped job posting dataset contains any missing or null values.
5. Display the total count of missing values in the dataset.
6. Calculate the percentage of missing values for each column in the dataset.
7. Remove all rows containing missing values from the dataset.
8. Drop all columns where missing values are present.
9. Replace missing values in numerical variables with a default value of -1, and provide the
command to replace them with the median.
10. Use three distinct imputation methods (e.g., mode substitution, predictive modeling, KNN imputation) to handle missing values in categorical variables.
11. Replace missing values in categorical variables with the second most frequent value in
each respective variable.
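
The missing-value tasks (4 to 9 and 11) can be sketched in Python without external libraries. The rows below are hypothetical and stand in for whatever the scraper returns; `None` is assumed to mark a value that could not be scraped, and the column names are illustrative only.

```python
import statistics
from collections import Counter

# Hypothetical scraped rows; None marks a value the scraper could not find.
jobs = [
    {"title": "Data Analyst", "company": "Acme", "location": "Surat", "salary": 600000},
    {"title": "Data Engineer", "company": None, "location": "Pune", "salary": None},
    {"title": "ML Engineer", "company": "Globex", "location": "Pune", "salary": 900000},
    {"title": "Data Analyst", "company": "Initech", "location": None, "salary": 700000},
    {"title": "BI Developer", "company": "Acme", "location": "Surat", "salary": None},
    {"title": "Data Analyst", "company": "Acme", "location": "Surat", "salary": 650000},
]
columns = list(jobs[0])

# Tasks 4-6: missing count per column, overall total, and percentage per column.
missing = {c: sum(row[c] is None for row in jobs) for c in columns}
total_missing = sum(missing.values())
missing_pct = {c: 100 * missing[c] / len(jobs) for c in columns}

# Task 7: keep only the rows with no missing value.
complete_rows = [row for row in jobs if all(v is not None for v in row.values())]

# Task 8: keep only the columns with no missing value.
kept_columns = [c for c in columns if missing[c] == 0]
reduced = [{c: row[c] for c in kept_columns} for row in jobs]


def fill(rows, column, value):
    """Return copies of the rows with missing entries in `column` replaced by `value`."""
    return [{**row, column: value if row[column] is None else row[column]} for row in rows]


# Task 9: numeric column filled with -1, or with the median of the observed values.
observed = [row["salary"] for row in jobs if row["salary"] is not None]
salary_minus_one = fill(jobs, "salary", -1)
salary_median = fill(jobs, "salary", statistics.median(observed))


def second_most_frequent(rows, column):
    """Second most common non-missing value; assumes at least one value is present."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    ranked = counts.most_common(2)
    # With only one distinct category, this falls back to that category.
    return ranked[-1][0]


# Task 11: categorical column filled with its second most frequent value.
location_filled = fill(jobs, "location", second_most_frequent(jobs, "location"))

print(missing, total_missing, missing_pct)
print(len(complete_rows), kept_columns)
```

Dropping rows or columns (tasks 7 and 8) discards data, which is why the percentage from task 6 is usually checked first before choosing between deletion and imputation.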

Objective(s): To gain experience of data scraping and understand how to handle missing values in a data set.
Pre-requisite: Basics of Data Scraping and Data Pre-processing
CO(s) to be achieved: CO1, CO3, CO4
Skill mapped: Technical Skill, Analysis Skill
Nature of workbook submission: Handwritten practical solution with output
Assessment (Parameter [Marks]):
Knowledge of Data Pre-processing [5 Marks]
Technical Knowledge [2 Marks]
Technical Writing [3 Marks]
Signature & Date


Practical No. 4    Enrollment No:

Practical Problem: Consider the following dataset to solve the below-given tasks:

Brand      | Model   | Mileage (km/l) | Price   | Consumer Satisfaction (1 to 5)
Toyota     | Corolla | 18             | 1500000 | 4
Honda      | Civic   | 16             | 1400000 | 5
Hyundai    | Elantra | 17             | 1350000 | 3
Ford       | Focus   | 14             | 1300000 | 3
Volkswagen | Jetta   | 15             | 1450000 | 2
Nissan     | Altima  | 13             | 1250000 | 1
Perform the following operations on the dataset:

1. Smooth the Mileage attribute using equal-width bins.
2. Generalize the Consumer Satisfaction attribute into three categories: High (4-5), Medium (3), and Low (1-2).
3. Normalize the Price attribute to the range [0, 1].
4. Replace any missing values in the Mileage attribute with the mean value.
5. Calculate the percentage of each Consumer Satisfaction category in the dataset.
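
The five operations can be sketched in Python on the table above. The task does not state how many bins to use or which smoothing rule to apply, so the choice of three bins and smoothing by bin means is an assumption; task 5 is read here as the share of each satisfaction category.

```python
from collections import Counter

# Rows copied from the table above: (brand, model, mileage km/l, price, satisfaction).
cars = [
    ("Toyota", "Corolla", 18, 1500000, 4),
    ("Honda", "Civic", 16, 1400000, 5),
    ("Hyundai", "Elantra", 17, 1350000, 3),
    ("Ford", "Focus", 14, 1300000, 3),
    ("Volkswagen", "Jetta", 15, 1450000, 2),
    ("Nissan", "Altima", 13, 1250000, 1),
]
mileage = [c[2] for c in cars]
prices = [c[3] for c in cars]
ratings = [c[4] for c in cars]


def smooth_by_bin_means(values, k):
    """Equal-width binning into k bins, then replace each value by its bin mean.

    Assumes the values are not all equal (otherwise the bin width would be zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # The maximum lies exactly on the upper edge, so clamp it into the last bin.
    idx = [min(int((v - lo) / width), k - 1) for v in values]
    means = {}
    for b in set(idx):
        members = [v for v, i in zip(values, idx) if i == b]
        means[b] = sum(members) / len(members)
    return [means[i] for i in idx]


def generalise(rating):
    """Map a 1-5 rating to High (4-5), Medium (3) or Low (1-2)."""
    if rating >= 4:
        return "High"
    if rating == 3:
        return "Medium"
    return "Low"


def min_max(values):
    """Rescale values linearly so the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def fill_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


smoothed = smooth_by_bin_means(mileage, k=3)        # task 1 (3 bins assumed)
categories = [generalise(r) for r in ratings]       # task 2
normalised = min_max(prices)                        # task 3
filled = fill_with_mean(mileage)                    # task 4 (no gaps in this table)
share = {cat: 100 * n / len(categories)             # task 5
         for cat, n in Counter(categories).items()}

print(smoothed, categories, normalised, share)
```

With three bins of width 5/3 km/l, the mileages 13 and 14 fall in the first bin, 15 and 16 in the second, and 17 and 18 in the third, so each value is replaced by 13.5, 15.5 or 17.5.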

Objective(s): The student shall be able to understand data representation and data summarisation by applying their various techniques.
Pre-requisite: Fundamentals of Central Tendency
CO(s) to be achieved: CO1, CO4
Skill mapped: Technical Skill, Analysis Skill, Technical Writing Skill
Nature of workbook submission: Handwritten practical solution with output
Assessment (Parameter [Marks]):
Data Representation [5 Marks]
Data Summarisation [5 Marks]
Signature & Date
