GE - Computer Science
(iii) Score[Score['Score3'] < 80]
(iv) Score['Class'].value_counts().sort_index()
(v) Score.sum(axis="columns")
Write a function diff to compute the difference between the maximum and minimum of each column of dataframe Score and apply it to dataframe Score. (10)

Name of the Paper : Data Analysis and Visualization Using Python
Name of the Course : Computer Science: Generic Elective (G.E.) (NEP-UGCF-2022)
Semester : II
Duration : 3 Hours          Maximum Marks : 90
Instructions for Candidates
Question 1 in Section A is compulsory.
4. Attempt any 4 questions
Section A
Assume numpy has been imported as np and pandas has been imported as pd.
(iv) arr1 > 5
(v) arr2[2] = 4
(b) List and describe different types of sampling of data. (5)
(c) Consider the Series object Company having 'Company_Name' as index and Profit (in Crores) as values: (3)

(ii) Assign rank in descending order
(iii) Retrieve all values except NaN. (6)
(v) To create an array of 1's with the same shape and type as the given array num.
(b) Consider the dataframe Score given below:
Name  Class  Score1  Score2  Score3
A     1      85      90      88
B     2      74      86      80
C     1      83      71      92
D     2      64      68      13
E     2      11      62      72
F     1      90      87      92
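As an illustration of the kind of column-wise function the paper asks for under the name diff, here is a minimal sketch. It uses only two rows of Score whose digits are unambiguous in the printed table (the first and last rows, the last assumed to be named F), and it assumes Name is an ordinary column, so only the numeric columns are passed to the function. It is one possible approach, not a model answer.

```python
import pandas as pd

# Two rows of Score with clearly legible digits (assumption: last row is named 'F').
Score = pd.DataFrame({
    'Name': ['A', 'F'],
    'Class': [1, 1],
    'Score1': [85, 90],
    'Score2': [90, 87],
    'Score3': [88, 92],
})

def diff(column):
    """Return the maximum minus the minimum of one column."""
    return column.max() - column.min()

# apply() calls diff once per column; the text column Name is left out
# because subtracting strings is not defined.
result = Score.select_dtypes(include='number').apply(diff)
print(result)
```

With the full table the same call works unchanged; only the DataFrame construction would grow.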
positions.
Import necessary libraries. Assign the title of the plot as 'Revenue vs Expenditure' and label y-axis as 'Expenditure'. Assign red color to 'Expenditure' data points and green color to 'Revenue' data points.
(e) Define correlation and covariance. Outline the difference between the two. (5)
(f) Create a DataFrame having five rows and four columns and populate it with random values in the range 1 to 100. Set the index of the rows as ['L', 'M', 'N', 'O', 'P'] and column indexes as ['Col1', 'Col2', 'Col3', 'Col4']. (4)
(g) Give the output of the following code : (3)
import pandas as pd
s1 = pd.Series(['Certificate', 'Bachelor', 'Master', 'Doctorate'], index = [2,4,6,8])
s1.reindex(range(10), method = 'ffill')
print(s1)

5 (a) Define categorical and interval data. Give example of each. (4)
(b) What is hierarchical indexing? Why do we use hierarchical indexing in pandas? Which pandas feature enables you to have multiple index levels on an axis? Give an example of hierarchical indexing. (6)
(c) Consider the data frame df2 given below: (5)
   Name   Age
0  Rohit  10
1  Amit   13
2  Ankur  12
Write python commands to perform following operations:
(i) Create a new object df3 by reindexing df2 row index as [0, 1, 2, 3, 4] and column index as ['x','y'].
(ii) Delete the entry of 'Amit' from df2.
(iii) Rename index of df2 as [1, 2, 3].
(iv) Check if the entry 'Rohit' exists in df2.
(v) Modify Age of 'Ankur' to 15 using loc
command.
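The operations listed for df2 can be sketched as follows. This is one hedged way to approach them, assuming the columns are named Name and Age and that Ankur's age reads 12 in the printed table; the variable names has_rohit, without_amit and renamed are introduced here only for illustration.

```python
import pandas as pd

# df2 as read from the question (column names and Ankur's age are assumptions).
df2 = pd.DataFrame({'Name': ['Rohit', 'Amit', 'Ankur'], 'Age': [10, 13, 12]})

# (i) Reindex rows to 0..4 and columns to ['x', 'y'];
#     labels that do not exist in df2 are filled with NaN.
df3 = df2.reindex(index=[0, 1, 2, 3, 4], columns=['x', 'y'])

# (ii) Keep every row except Amit's (returns a new DataFrame).
without_amit = df2[df2['Name'] != 'Amit']

# (iii) Relabel the row index as 1, 2, 3 on a copy.
renamed = df2.copy()
renamed.index = [1, 2, 3]

# (iv) Check whether 'Rohit' occurs in the Name column.
has_rohit = bool((df2['Name'] == 'Rohit').any())

# (v) Use loc with a boolean row mask and a column label to set Ankur's age.
df2.loc[df2['Name'] == 'Ankur', 'Age'] = 15
```

Because the requested column labels 'x' and 'y' do not exist in df2, every cell of df3 is NaN, which is the behaviour reindex is meant to demonstrate.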
(ii) Print the first 10 rows of data.
(iii) Display the 5 summary statistics for each column of data.
(iv) Remove the rows with all null values
(v) Identify duplicate values in data.
(c) Consider the following piece of code and give the output: (5)
import pandas as pd
df1 = pd.DataFrame({'id': [1,3,6,7], 'val': ['a','b','c','d']})
df2 = pd.DataFrame({'id': [1,2,3,5,6,8], 'val': ['p','q','r','s','t','u']})
df3 = pd.merge(df1, df2, on = 'id', how = 'outer')
print(df3)
How many NaN values are there in the data frame df3? Write pandas command to replace NaN with the last known valid value in df3.

2 (a) Consider the following DataFrame House_Rent given below: (10)
Rooms  Area  Bathroom  Furnishing_Status  Rent
2      1100  2         Unfurnished        10000
2      800   1         Semi-Furnished     16000
2      900   2         Furnished          22000
1      250   1         Unfurnished        5000
2      1000  2         Semi-Furnished     23000
3      1200  2         Semi-Furnished     25000
1      400   1         Unfurnished        7000
1      250   1         Furnished          6500
1      375   1         Unfurnished        6000
3      900   2         Unfurnished        8500
3      1286  2         Furnished          35000
2      600   1         Semi-Furnished     8000
2      800   1         Unfurnished        12000
Write python commands to perform the following operations:
(i) Find the index of house with maximum rent.
(ii) Sort the dataframe House_Rent on 'Area'.
(iii) Calculate total Area and total rent.
(iv) Compute the count of houses having rooms
1, 2, 3 etc.
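A minimal sketch of the four House_Rent operations is given below. To keep it short it loads only the first five rows of the table, and the column spellings (Rooms, Area, Bathroom, Furnishing_Status, Rent) are assumed readings of the printed header. The result names are hypothetical and the calls shown are one reasonable choice among several.

```python
import pandas as pd

# First five rows of House_Rent only (column names are assumed spellings).
House_Rent = pd.DataFrame({
    'Rooms': [2, 2, 2, 1, 2],
    'Area': [1100, 800, 900, 250, 1000],
    'Bathroom': [2, 1, 2, 1, 2],
    'Furnishing_Status': ['Unfurnished', 'Semi-Furnished', 'Furnished',
                          'Unfurnished', 'Semi-Furnished'],
    'Rent': [10000, 16000, 22000, 5000, 23000],
})

# (i) Index label of the row with the highest rent.
max_rent_index = House_Rent['Rent'].idxmax()

# (ii) Rows ordered by Area, smallest first (returns a new DataFrame).
sorted_by_area = House_Rent.sort_values(by='Area')

# (iii) Column totals for Area and Rent.
totals = House_Rent[['Area', 'Rent']].sum()

# (iv) Number of houses for each room count, ordered by room count.
room_counts = House_Rent['Rooms'].value_counts().sort_index()
```

On the full thirteen-row table the same four statements apply without change.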
(v) Create a new DataFrame df having a
Large']. (5)
(b) Write a numpy code to create a 3D array a3 of size 4 x 5 x 3 of random numbers in range 1 to 60 and swap axis 1 with axis 2. Identify the number of matrices in the array a3, dimension of a matrix in array a3 and the datatype of array a3. (5)

(c) Consider numpy array arr given below: (5)
4 (a) What is data wrangling? Identify the possible issues that can arise in data wrangling process? (5)
(b) Consider a csv file test.csv having 3 columns and 50 rows. Write python command to perform following operations : (5)
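For the test.csv question, the usual inspection and cleaning steps (reading the file, viewing the first rows, summary statistics, dropping all-null rows, flagging duplicates) can be sketched as below. Since the file itself is not part of the paper, the sketch reads a tiny in-memory stand-in; the column names a, b, c and all values are hypothetical.

```python
import io
import pandas as pd

# Stand-in for test.csv (hypothetical columns and values, not from the paper).
csv_text = "a,b,c\n1,2,3\n1,2,3\n,,\n4,5,6\n"

# With the real file this would be pd.read_csv('test.csv').
data = pd.read_csv(io.StringIO(csv_text))

# First 10 rows (fewer if the file is shorter).
first_rows = data.head(10)

# describe() reports count, mean and std along with the five-number
# summary (min, 25%, 50%, 75%, max) for each numeric column.
summary = data.describe()

# Drop only those rows in which every value is null.
cleaned = data.dropna(how='all')

# Boolean mask that is True for each row repeating an earlier row.
dup_mask = data.duplicated()
```

Passing how='all' matters here: the default how='any' would also drop rows that are only partly empty.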