IDA - Sample Questions FA1

The document consists of multiple choice, short answer, and practical questions focused on data science concepts, particularly in Python and machine learning. It covers topics such as data manipulation with Pandas, Exploratory Data Analysis (EDA), supervised and unsupervised learning, feature engineering, and model evaluation techniques. The questions are designed to assess knowledge and practical skills in handling datasets and implementing machine learning algorithms.

Uploaded by lufunosape

Section A: Multiple Choice Questions

1. Which of the following is a popular Python library used for data manipulation?
a) NumPy
b) Pandas
c) Matplotlib
d) Seaborn
2. In Exploratory Data Analysis (EDA), which technique is commonly used to
visualize the distribution of a single variable?
a) Scatter plot
b) Histogram
c) Box plot
d) Heatmap
3. Which of the following is an example of a supervised learning algorithm?
a) K-means clustering
b) Linear Regression
c) Principal Component Analysis (PCA)
d) DBSCAN
4. Which Pandas function is used to load a CSV file into a DataFrame?
a) read_file()
b) load_csv()
c) read_csv()
d) import_csv()
5. Which of the following metrics is used to evaluate the performance of a
classification model?
a) Mean Squared Error (MSE)
b) Accuracy
c) R-squared
d) Adjusted R-squared

Section B: Short Answer Questions (20 Marks)

Answer all questions. Each question carries 5 marks.

1. Explain the importance of feature engineering in data science. Provide an
example of a feature engineering technique.
2. Describe how you would handle missing data in a dataset. Mention at least two
techniques.
3. Differentiate between supervised and unsupervised learning with examples of
algorithms used in each.
4. What is the purpose of splitting a dataset into training and testing sets in
machine learning?
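
As an illustration of the ideas behind Section B questions 1 and 2, the following is a minimal sketch, not a model answer. It assumes a small made-up DataFrame (the column names age, region, total_spent and num_orders are hypothetical) and shows two common ways of handling missing data (dropping rows and imputing values), followed by two simple feature engineering steps (a derived ratio feature and one-hot encoding).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; every column name and value here is made up.
df = pd.DataFrame({
    "age": [25, np.nan, 40, 31],
    "region": ["North", "South", None, "North"],
    "total_spent": [200.0, 150.0, np.nan, 90.0],
    "num_orders": [4, 3, 2, 3],
})

# Missing data, technique 1: drop every row with at least one missing value.
dropped = df.dropna()

# Missing data, technique 2: impute instead of dropping.
imputed = df.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].median())
imputed["total_spent"] = imputed["total_spent"].fillna(imputed["total_spent"].mean())
imputed["region"] = imputed["region"].fillna(imputed["region"].mode()[0])

# Feature engineering, step 1: derive a new numeric feature from two existing ones.
imputed["avg_order_value"] = imputed["total_spent"] / imputed["num_orders"]

# Feature engineering, step 2: one-hot encode the categorical column.
encoded = pd.get_dummies(imputed, columns=["region"])

print(dropped.shape)
print(encoded.columns.tolist())
```

Dropping rows is simple but discards data, while imputation keeps every row at the cost of introducing estimated values; which is preferable depends on how much data is missing and why.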

Section C: Practical and Theoretical Questions (25 Marks)

Answer all questions. Each question carries 5 marks.

1. You have been given a dataset data.csv containing information about customer
purchases. Write a Python script using Pandas to perform the following tasks:
a) Load the dataset into a DataFrame.
b) Display the first 5 rows of the DataFrame.
c) Handle any missing values by filling them with the mean of the respective
columns.
2. Given a DataFrame df with a categorical column Category and a numerical
column Sales, write a Python script to create a bar plot showing the total sales
for each category.
3. Using the Python library Scikit-learn, write a script to split the dataset df into
training and testing sets with 80% of the data for training and 20% for testing.
4. Explain the working principle of a Decision Tree algorithm. How does a Decision
Tree make decisions at each node, and what are the criteria used to split the
data?
5. Evaluate the performance of a Decision Tree model by explaining how the
Accuracy score and Confusion Matrix can be used to assess its effectiveness.
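
The practical tasks in Section C (questions 1, 2, 3 and 5) can be sketched in one short script. This is one possible approach under stated assumptions, not a definitive answer: the real data.csv is not available, so an in-memory CSV stands in for it, and the Quantity and Returned columns, the choice of Returned as the target, and the random_state values are all hypothetical.

```python
import io

import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in for data.csv so the sketch runs on its own. With the real file,
# pd.read_csv("data.csv") would be used. Quantity and Returned are made up.
csv_text = """Category,Sales,Quantity,Returned
A,100,1,0
B,,2,1
A,50,,0
C,80,4,1
B,120,5,0
C,60,3,1
A,90,2,0
B,110,6,1
C,70,1,0
A,,3,1
"""

# Question 1a: load the dataset into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Question 1b: display the first 5 rows.
print(df.head())

# Question 1c: fill missing numeric values with the mean of each column.
df = df.fillna(df.mean(numeric_only=True))

# Question 2: total sales per category. Calling totals.plot(kind="bar")
# would draw the bar plot (this needs matplotlib installed).
totals = df.groupby("Category")["Sales"].sum()
print(totals)

# Question 3: 80% training / 20% testing split.
X = df[["Sales", "Quantity"]]
y = df["Returned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Question 5: fit a Decision Tree and evaluate it on the held-out test set.
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)                   # fraction of correct predictions
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])   # rows: actual, columns: predicted
print(acc)
print(cm)
```

Accuracy summarises performance in a single number, while the confusion matrix shows which classes are being confused with each other. With only ten made-up rows the scores here are meaningless; the point is the sequence of calls. By default, scikit-learn's DecisionTreeClassifier chooses splits using Gini impurity, which relates to the splitting criteria asked about in question 4.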
