GE02 (DAVP) Assignment

The document outlines an assignment for a course on Data Analysis and Visualization using Python, detailing tasks across six sets. Each set includes specific data manipulation and analysis tasks using Python libraries such as Numpy and Pandas, focusing on employee salaries, footballer goals, growth rates, dataframes, and personal fitness tracking. The assignment emphasizes practical application of data structures, statistical calculations, and visualization techniques.

Uploaded by

U Soni

ASSIGNMENT SHEET

Name of the Paper : Data Analysis and Visualization using Python


Unique Paper Code : 2344001201
Name of the Course : GE- Computer Science
Semester : IInd Semester
Academic Year : 2022-2023

SET-1
Consider the annual income in INR lakhs of 16 employees of a company:

Salary = [7.50, 5.80, 4.45, 6.90, 5.12, 10.0, 5.20, 9.75, 6.10, 8.50, 8.96, 8.10,
7.30, 4.80, 8.70, 6.50 ]

Import the NumPy library and use suitable commands to:


a. Use this data to create a 4x4 two dimensional array.
b. Calculate the monthly income of each of these employees and store the result
in a new 2D array.
c. Give an increment of Rs. 1.5 lakhs to all the employees whose salary is less
than the average salary of the company.
d. Reshape this array into a 3D array of shape (2,2,4).
e. Perform a transpose of this 3D array so that the result is a new 3D array of the
shape (4,2,2).
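
One possible approach to SET-1 is sketched below. It is not an official solution. The variable names (annual, monthly, revised, cube, transposed) are chosen here for illustration, the "average salary" in part c is assumed to be the mean of all 16 values, and the axis order used in part e is only one of several permutations that yield the shape (4,2,2).

```python
import numpy as np

salary = [7.50, 5.80, 4.45, 6.90, 5.12, 10.0, 5.20, 9.75,
          6.10, 8.50, 8.96, 8.10, 7.30, 4.80, 8.70, 6.50]

# a. Arrange the 16 annual salaries as a 4x4 array.
annual = np.array(salary).reshape(4, 4)

# b. Monthly income is the annual income divided by 12 (new array).
monthly = annual / 12

# c. Add 1.5 lakhs wherever the salary is below the overall mean
#    (assumption: "average" means the mean of all 16 values).
average = annual.mean()
revised = np.where(annual < average, annual + 1.5, annual)

# d. Reshape into a 3D array of shape (2, 2, 4).
cube = revised.reshape(2, 2, 4)

# e. Move the last axis to the front to get shape (4, 2, 2).
#    cube.T would also give (4, 2, 2), with a different axis order.
transposed = np.transpose(cube, (2, 0, 1))

print(annual.shape, cube.shape, transposed.shape)
```

np.where returns a new array, so annual keeps the original figures and revised holds the incremented ones.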

SET-2
Consider a dictionary:
dict1 = {'Chhetri': 80, 'Shabbir': 23, 'Gouramangi': 6,
'Subrata': 92, 'Vijayan': 29, 'Gawli': None, 'Nabi': 7,
'Renedy': 4, 'Lalpekhlua': 23, 'Baichung': 41, 'Surkumar': 2}
Write suitable Python command(s) in Pandas library:
a. Create a Pandas Series for the dictionary dict1 where the key is name of the footballer and the value is
the number of goals scored by him. The Series should have the names of the footballers as its index and
values as goals scored.
b. Display the names of Footballers who have scored more than 20 goals.
c. Due to the good performance of the top six footballers, their rankings have increased and the number of
goals scored by them needs to be increased by 25. Update the Series to reflect these changes.
d. Include a 12th man named 'Mondal' in the above Series whose number of goals scored is not known.
e. Display the list of Footballers whose number of goals scored is NOT NULL.
f. Due to injury, 'Shabbir' was replaced by 'Sandhu', whose number of goals scored is 5. Reflect this change
in the Series and display the new Series.
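
A minimal sketch of SET-2 follows, under some stated assumptions: the dictionary keys are written as quoted strings and NULL is represented by Python's None (pandas stores it as NaN), "top six" is taken to mean the six highest scorers, and part f is interpreted as renaming the 'Shabbir' entry to 'Sandhu' and setting its value to 5. The names goals, over_20, top_six and not_null are illustrative only.

```python
import numpy as np
import pandas as pd

dict1 = {"Chhetri": 80, "Shabbir": 23, "Gouramangi": 6,
         "Subrata": 92, "Vijayan": 29, "Gawli": None, "Nabi": 7,
         "Renedy": 4, "Lalpekhlua": 23, "Baichung": 41, "Surkumar": 2}

# a. Footballer names become the index; None is stored as NaN.
goals = pd.Series(dict1, dtype="float64")

# b. Names of footballers with more than 20 goals.
over_20 = goals[goals > 20].index.tolist()
print(over_20)

# c. Add 25 goals to the six highest scorers (NaN entries are ignored).
top_six = goals.nlargest(6).index
goals.loc[top_six] += 25

# d. Add a 12th footballer whose goal count is unknown.
goals.loc["Mondal"] = np.nan

# e. Footballers whose goal count is not null.
not_null = goals[goals.notna()].index.tolist()
print(not_null)

# f. Replace 'Shabbir' with 'Sandhu' (assumed: rename in place, set goals to 5).
goals = goals.rename(index={"Shabbir": "Sandhu"})
goals.loc["Sandhu"] = 5
print(goals)
```

Renaming keeps the entry at its original position in the Series. Dropping 'Shabbir' and adding 'Sandhu' as a new label would instead place the newcomer at the end.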
SET-3
Consider a list of values:

rate = [4.23,3.8,2.98,2.56,3,114,3.8,3.78,2.98,4.8,4.10,3.65]

a. Import the appropriate Python libraries to create a one-dimensional ndarray called growth_rate
from the list rate. Create another one-dimensional array named twos having the same number of
elements as growth_rate, all set to 2.
b. Use the NumPy library to find the index of the maximum and the minimum values in the array
growth_rate.
c. Concatenate the two arrays growth_rate and twos, and reshape the resulting array to have four
rows and appropriate number of columns, call it results.
d. Find the mean, median, mode and standard deviation of each column in results array.
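
The sketch below shows one way to approach SET-3. It takes the list rate exactly as written, i.e. as 12 separate values, which is what allows the 24 concatenated elements to form four rows. NumPy has no built-in mode function, so column_mode is a hypothetical helper defined here; when several values tie, it returns the smallest. The standard deviation uses NumPy's default (population) form.

```python
import numpy as np

rate = [4.23, 3.8, 2.98, 2.56, 3, 114, 3.8, 3.78, 2.98, 4.8, 4.10, 3.65]

# a. One-dimensional arrays: the data, and an equally long array of 2s.
growth_rate = np.array(rate, dtype=float)
twos = np.full_like(growth_rate, 2)

# b. Positions of the largest and smallest values.
max_index = int(np.argmax(growth_rate))
min_index = int(np.argmin(growth_rate))

# c. Join the arrays and reshape to four rows; -1 lets NumPy
#    work out the number of columns (24 / 4 = 6).
results = np.concatenate([growth_rate, twos]).reshape(4, -1)


def column_mode(column):
    """Hypothetical helper: most frequent value, smallest on ties."""
    values, counts = np.unique(column, return_counts=True)
    return values[np.argmax(counts)]


# d. Column-wise statistics (axis=0 runs down each column).
col_mean = results.mean(axis=0)
col_median = np.median(results, axis=0)
col_mode = np.apply_along_axis(column_mode, 0, results)
col_std = results.std(axis=0)  # population standard deviation (ddof=0)

print(max_index, min_index, results.shape)
```

Pass ddof=1 to std if the sample standard deviation is wanted instead.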

SET-4
Create a dataframe object for the following data:

Write suitable Python command(s) in Pandas library to:


a. Display the number of rows and columns present in the DataFrame df.
b. Display the names of columns that have NULL values present in them, along with the count of NULL
values. Also replace the NULL values present in the column with the lowest value in that column.
c. Create a new column in df named Rating, which contains the mean of User_rating and Critic_rating
columns. Create another column, Profit, which contains the difference of Gross_collections and Budget.
d. Find the correlation between Budget and Rating. Based on the correlation values between two variables,
what inference(s) can be drawn about the relationship between them?
e. Find the most profitable director.
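
The data table for SET-4 is not reproduced in this sheet, so the sketch below runs on a small placeholder DataFrame. Every value in it is invented purely so that the commands can execute, and the column name Director is an assumption (only User_rating, Critic_rating, Gross_collections and Budget are named in the tasks). "Most profitable" is assumed to mean the highest total Profit per director.

```python
import numpy as np
import pandas as pd

# Placeholder rows only: NOT the assignment's data.
df = pd.DataFrame({
    "Director": ["A", "B", "A", "C"],          # assumed column name
    "Budget": [100.0, 80.0, np.nan, 60.0],
    "Gross_collections": [250.0, 90.0, 300.0, np.nan],
    "User_rating": [7.5, 6.0, 8.0, np.nan],
    "Critic_rating": [8.0, 5.5, 7.0, 6.5],
})

# a. Number of rows and columns.
n_rows, n_cols = df.shape

# b. Columns containing nulls with their counts, then fill each
#    of those columns with its own minimum value.
null_counts = df.isna().sum()
null_counts = null_counts[null_counts > 0]
print(null_counts)
for col in null_counts.index:
    df[col] = df[col].fillna(df[col].min())

# c. Derived columns.
df["Rating"] = df[["User_rating", "Critic_rating"]].mean(axis=1)
df["Profit"] = df["Gross_collections"] - df["Budget"]

# d. Pearson correlation between Budget and Rating (between -1 and 1).
corr = df["Budget"].corr(df["Rating"])

# e. Director with the highest total profit (assumed interpretation).
best_director = df.groupby("Director")["Profit"].sum().idxmax()

print(n_rows, n_cols, round(corr, 3), best_director)
```

A correlation near +1 or -1 suggests a strong linear relationship, a value near 0 suggests little linear relationship, and in neither case does it establish causation.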
SET-5
Consider a list of values:
bag = [25,26,21,22,31,29,33,34,26,30,31,46]
a. Import the appropriate Python libraries to create an ndarray called bag_weights, having 3 rows and 4
columns from the list bag.
b. Display the mean, variance and median of the given data in bag_weights.
c. Write a command to display the count of values greater than the median in bag_weights.
d. Transpose bag_weights and then split it into two arrays, bagA and bagB, having 2 rows and 3
columns each.
e. Sort bagA such that it brings the highest value of the row in the first column. Sort bagB such that it
brings the lowest value of the row in the first column.
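
SET-5 could be approached along the following lines. This is a sketch rather than a model answer: names such as count_above, bagA_sorted and bagB_sorted are illustrative, the variance is NumPy's default population variance, and the split in part d is assumed to be by rows (the first two rows of the transposed array go to bagA and the last two to bagB).

```python
import numpy as np

bag = [25, 26, 21, 22, 31, 29, 33, 34, 26, 30, 31, 46]

# a. 3 rows x 4 columns.
bag_weights = np.array(bag).reshape(3, 4)

# b. Summary statistics over all 12 values.
mean = bag_weights.mean()
variance = bag_weights.var()        # population variance (ddof=0)
median = np.median(bag_weights)
print(mean, variance, median)

# c. Count of values strictly greater than the median.
count_above = int((bag_weights > median).sum())
print(count_above)

# d. Transpose to 4x3, then split row-wise into two 2x3 arrays.
bagA, bagB = np.vsplit(bag_weights.T, 2)

# e. Sort each row: descending for bagA, ascending for bagB.
bagA_sorted = np.sort(bagA, axis=1)[:, ::-1]
bagB_sorted = np.sort(bagB, axis=1)
print(bagA_sorted)
print(bagB_sorted)
```

np.sort only sorts in ascending order, so the descending result for bagA is obtained by reversing each sorted row.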

SET-6
Priya maintained a healthy lifestyle by tracking her physical activity for 7 days leading up to a fitness challenge. She
also took three health assessments (one for each metric) at the end of each day to monitor her progress. The three
health metrics were – Cardio Endurance, Flexibility, and Strength. Now, Priya has stored her daily exercise hours in
a Series object and her daily performance scores in all three assessments in a DataFrame object (wherein, columns
represent the metric and rows represent her daily scores). Use random functions of the Numpy library to fill the
Series and DataFrame objects.

Now, perform the following operations on the Series and DataFrame objects using Matplotlib or Seaborn library:
a. Draw a line curve between Priya’s exercise hours and her performance in each assessment (across 7
days) in the same subplot using appropriate legend.
b. Provide the following style properties to each of the line curves:
i. Line curve for Cardio Endurance scores should be of magenta color, should be drawn using dashed
line, and should display the data points using square-shaped markers.
ii. Line curve for Flexibility scores should be of orange color, should be drawn using dotted line, and
should display the data points using triangle-shaped markers.
iii. Line curve for Strength scores should be of blue color, should be drawn using dash-dotted line, and
should display the data points using circle-shaped markers.
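
A possible Matplotlib sketch for SET-6 is given below. The value ranges for hours (0.5 to 3.0) and scores (50 to 100), the fixed random seed, the day labels and the axis titles are all assumptions made for illustration. Exercise hours are placed on the x-axis and the points are sorted by hours so that each line reads left to right. That is one reading of the task, and plotting against the day number would be another.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Random data for 7 days (ranges and seed are assumptions).
rng = np.random.default_rng(seed=0)
days = [f"Day {i}" for i in range(1, 8)]

hours = pd.Series(rng.uniform(0.5, 3.0, size=7).round(2),
                  index=days, name="Exercise hours")
scores = pd.DataFrame(rng.integers(50, 101, size=(7, 3)), index=days,
                      columns=["Cardio Endurance", "Flexibility", "Strength"])

# b. Colour, line style and marker requested for each metric.
styles = {
    "Cardio Endurance": dict(color="magenta", linestyle="--", marker="s"),
    "Flexibility": dict(color="orange", linestyle=":", marker="^"),
    "Strength": dict(color="blue", linestyle="-.", marker="o"),
}

# a. All three curves on one subplot, ordered by exercise hours.
order = hours.sort_values().index
fig, ax = plt.subplots()
for metric, style in styles.items():
    ax.plot(hours.loc[order], scores.loc[order, metric],
            label=metric, **style)

ax.set_xlabel("Exercise hours")
ax.set_ylabel("Assessment score")
ax.set_title("Exercise hours vs. assessment scores")
ax.legend()
# Call plt.show() to display the figure when running interactively.
```

Keeping the style settings in a dictionary means each curve's colour, line style and marker can be changed in one place.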
