
GE - Computer Scien EaQvs42

The document contains a question paper with various programming tasks related to data analysis using Python, including operations with pandas and numpy. It includes instructions for candidates, a breakdown of sections, and specific coding tasks to perform on dataframes and arrays. The paper emphasizes the application of functions, data manipulation, and visualization techniques.

Uploaded by

JOHN sorus

[This question paper contains 12 printed pages.]

Your Roll No...............

Sr. No. of Question Paper : 6060 H
Unique Paper Code : 2344001201
Name of the Paper : Data Analysis and Visualization Using Python
Name of the Course : Computer Science: Generic Elective (G.E.) (NEP-UGCF-2022)
Semester : II

Duration : 3 Hours Maximum Marks : 90

Instructions for Candidates

1. Write your Roll No. on the top immediately on receipt of this question paper.
2. This question paper has [illegible]
3. Question 1 in Section [illegible]
4. Attempt any 4 questions [illegible]
5. Parts of a question must be [illegible]
6. Section A carries 30 marks and each question in Section B carries 15 marks.
7. Use of Calculator is not allowed.

(2000) P.T.O.

6060 (page 12)

Give the output of following commands :

(i) Score[['Name','Class']]
(ii) Score[Score['Class'] == 1]['Name']
(iii) Score[Score['Score3'] < 80]
(iv) Score['Class'].value_counts().sort_index()
(v) Score.sum(axis="columns")

Write a function diff to compute the difference between the maximum and minimum of each column of dataframe Score and apply it to dataframe Score. (10)
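A minimal sketch of one way to write the diff function asked for above. The Score values are an assumed reading of the scanned table (several digits are unclear in the source). Restricting the function to numeric columns is also an assumption, made because subtracting the strings in Name would raise a TypeError.

```python
import pandas as pd

# Assumed reading of the scanned Score table; some digits are unclear in the source.
Score = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Class": [1, 2, 1, 2, 2, 1],
    "Score1": [85, 74, 83, 64, 71, 90],
    "Score2": [90, 86, 71, 68, 62, 87],
    "Score3": [88, 80, 92, 73, 72, 92],
})


def diff(column):
    """Return the spread (maximum minus minimum) of one column."""
    return column.max() - column.min()


# apply() calls diff once per column; Name is left out because it holds strings.
result = Score.select_dtypes(include="number").apply(diff)
print(result)
```

DataFrame.apply works column by column by default (axis=0), so diff receives one Series at a time and the result is a Series indexed by column name.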
6060 (page 2)

Section A

Assume numpy has been imported as np and pandas has been imported as pd.

1. (a) Consider the following numpy arrays : (5)

       arr1 = np.array([[4,3,2], [1,9,6]])
       arr2 = np.array([[3,7,5], [2,9,8], [5,1,6]])

       Give the output of the following commands :

       (i) arr2[1][]
       (ii) arr1[:2, -1]
       (iii) arr1 * 3
       (iv) arr1 > 5
       (v) arr2[2] = 4

   (b) List and describe different types of sampling of data. (5)

   (c) Consider the Series object Company having 'Company_Name' as index and Profit (in Crores) as values: (3)

6060 (page 11)

       (ii) Assign rank in descending order
       (iii) Retrieve all values except NaN. (6)

7. (a) Write Numpy commands to perform the following operations on array num : (5)

       (i) Create an array num containing values from 31 to 46.
       (ii) Convert datatype of array num to floating type data.
       (iii) Reshape array num to an array of size 4x4.
       (iv) Replace the diagonal elements of array num to 0.
       (v) To create an array of 1's with the same shape and type as the given array num.

   (b) Consider the dataframe Score given below:

       Name  Class  Score1  Score2  Score3
       A     1      85      90      88
       B     2      74      86      80
       C     1      83      71      92
       D     2      64      68      73
       E     2      71      62      72
       F     1      90      87      92
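The array operations in question 7(a) above can be sketched as follows. This is one possible answer rather than a model solution; the variable name ones is introduced here only for illustration.

```python
import numpy as np

# (i) Values 31 to 46 inclusive; arange excludes the stop value, so stop is 47.
num = np.arange(31, 47)

# (ii) Convert the integer array to floating point.
num = num.astype(float)

# (iii) Sixteen values fit a 4x4 array exactly.
num = num.reshape(4, 4)

# (iv) fill_diagonal modifies the array in place.
np.fill_diagonal(num, 0)

# (v) ones_like copies the shape and dtype of num.
ones = np.ones_like(num)

print(num)
print(ones)
```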
6060 (page 3)

       Company_Name  Profit
       TCS           350
       Reliance      204
       LET           800
       Wipro         150

       Write the python commands to perform the following operations :

       (i) To display the Company_Name having profit > 250.
       (ii) To display the index.
       (iii) To assign name 'Company_Name' to index.

   (d) Write a python code to draw a scatter plot comparing monthly revenue (in Crores) and monthly expenditure (in Crores) of a company for year 2021. (5)

       revenue = [581, 684, 739, 563, 856, 716, 589, 820, 792, 695, 770, 812]
       expenditure = [631, 545, 435, 532, 688, 540, 485, 679, 709, 535]

6060 (page 10)

6. (a) Consider the pandas series s2 = pd.Series([2, 4, 6, 8, 10, 12]).

       Write python code to plot cumulative sum of s2. Set the x limit to [0, 10] and y limit to [0, 50]. Set the style of line graph to dot(.) pattern and marker to star shape. Set appropriate values for xticks and yticks. (5)

   (b) Consider dataframe df given below: (4)

       number    One  Two  Three
       State
       Ohio      0    1    2
       Colorado  3    4    5

       Provide the output of following commands.

       (i) df.stack()
       (ii) df.unstack(level=0)

   (c) Consider the series a given below and write commands to perform the following operations :

       a = pd.Series([6, np.nan, -4, np.nan, 3, 8, np.nan, 5])

       (i) Sort the values and keep NaN in initial positions.
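A minimal sketch for the series a in question 6(c) above. It covers sorting with NaN placed first, and also the two follow-on operations of the same question (rank in descending order, retrieve all values except NaN). The result names are hypothetical.

```python
import numpy as np
import pandas as pd

a = pd.Series([6, np.nan, -4, np.nan, 3, 8, np.nan, 5])

# Sort ascending, but place the NaN entries at the start instead of the end.
sorted_a = a.sort_values(na_position="first")

# Rank in descending order: the largest value gets rank 1; NaN stays NaN.
ranked = a.rank(ascending=False)

# Keep only the non-missing values.
non_null = a.dropna()

print(sorted_a)
print(ranked)
print(non_null)
```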
6060 (page 4)

       Import necessary libraries. Assign the title of the plot as 'Revenue vs Expenditure' and label y-axis as 'Expenditure'. Assign red color to 'Expenditure' data points and green color to 'Revenue' data points.

   (e) Define correlation and covariance. Outline the difference between the two. (5)

   (f) Create a DataFrame having five rows and four columns and populate it with random values in the range 1 to 100. Set the index of the rows as ['L', 'M', 'N', 'O', 'P'] and column indexes as ['Col1', 'Col2', 'Col3', 'Col4']. (4)

   (g) Give the output of the following code : (3)

       import pandas as pd
       s1 = pd.Series(['Certificate', 'Bachelor', 'Master', 'Doctorate'], index = [2,4,6,8])
       s1.reindex(range(10), method = 'ffill')
       print(s1)

6060 (page 9)

5. (a) Define categorical and interval data. Give example of each. (4)

   (b) What is hierarchical Indexing? Why do we use hierarchical indexing in pandas? Which pandas feature enables you to have multiple index levels on an axis? Give an example of hierarchical indexing. (6)

   (c) Consider the data frame df2 given below: (5)

          Name   Age
       0  Rohit  10
       1  Amit   13
       2  Ankur  12

       Write python commands to perform following operations:

       (i) Create a new object df3 by reindexing df2 row index as [0, 1, 2, 3, 4] and column index as ['x','y'].
       (ii) Delete the entry of 'Amit' from df2.
       (iii) Rename index of df2 as [1, 2, 3].
       (iv) Check if the entry 'Rohit' exists in df2.
       (v) Modify Age of 'Ankur' to 15 using loc command.
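The df2 operations in question 5(c) above could look roughly like this. It is a sketch under stated assumptions: the intermediate names (without_amit, renamed, has_rohit) are hypothetical, and each step leaves df2 itself untouched until the final loc assignment so that the steps do not interfere with each other.

```python
import pandas as pd

df2 = pd.DataFrame({"Name": ["Rohit", "Amit", "Ankur"], "Age": [10, 13, 12]})

# (i) Reindex rows and columns; labels absent from df2 are filled with NaN.
df3 = df2.reindex(index=[0, 1, 2, 3, 4], columns=["x", "y"])

# (ii) Drop the row whose Name is 'Amit' (boolean filtering is one option).
without_amit = df2[df2["Name"] != "Amit"]

# (iii) Replace the row labels 0, 1, 2 with 1, 2, 3 on a copy.
renamed = df2.copy()
renamed.index = [1, 2, 3]

# (iv) True if any cell in the Name column equals 'Rohit'.
has_rohit = (df2["Name"] == "Rohit").any()

# (v) loc takes a row selector and a column label.
df2.loc[df2["Name"] == "Ankur", "Age"] = 15

print(df3)
print(without_amit)
print(renamed)
print(has_rohit)
print(df2)
```

Because neither 'x' nor 'y' is an existing column of df2, every cell of df3 is NaN.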
6060 (page 5)

Section B

2. (a) Consider the following DataFrame House_Rent given below: (10)

       Rooms  Area  Bathroom  Furnishing_Status  Rent
       2      1100  2         Unfurnished        10000
       2      800   1         Semi-Furnished     16000
       2      900   2         Furnished          22000
       1      250   1         Unfurnished        5000
       2      1000  2         Semi-Furnished     23000
       3      1200  2         Semi-Furnished     25000
       1      400   1         Unfurnished        7000
       1      250   1         Furnished          6500
       1      375   1         Unfurnished        6000
       3      900   2         Unfurnished        8500
       3      1286  2         Furnished          35000
       2      600   1         Semi-Furnished     8000
       2      800   1         Unfurnished        12000

       Write python commands to perform the following operations:

       (i) Find the index of house with maximum rent.
       (ii) Sort the dataframe House_Rent on 'Area'.
       (iii) Calculate total Area and total rent.
       (iv) Compute the count of houses having rooms 1, 2, 3 etc.

6060 (page 8)

       (i) Read the file test.csv into a DataFrame data.
       (ii) Print the first 10 rows of data.
       (iii) Display the 5 summary statistics for each column of data.
       (iv) Remove the rows with all null values
       (v) Identify duplicate values in data.

   (c) Consider the following piece of code and give the output: (5)

       import pandas as pd
       df1 = pd.DataFrame({'id': [1,3,6,7], 'val': ['a','b','c','d']})
       df2 = pd.DataFrame({'id': [1,2,3,5,6,8], 'val': ['p','q','r','s','t','u']})
       df3 = pd.merge(df1, df2, on = 'id', how = 'outer')
       print(df3)

       How many NaN values are there in the data frame df3? Write pandas command to replace NaN with the last known valid value in df3.
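A minimal sketch for exploring the merge question above: it rebuilds the two frames, counts the missing values produced by the outer join, and forward-fills them. The names nan_count and filled are hypothetical, and ffill is used here as one reading of "last known valid value".

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 3, 6, 7], "val": ["a", "b", "c", "d"]})
df2 = pd.DataFrame({"id": [1, 2, 3, 5, 6, 8], "val": ["p", "q", "r", "s", "t", "u"]})

# An outer join keeps every id from both frames; the clashing 'val' columns
# are renamed val_x (from df1) and val_y (from df2).
df3 = pd.merge(df1, df2, on="id", how="outer")

# Count missing cells over the whole frame.
nan_count = int(df3.isna().sum().sum())

# Forward fill: each NaN takes the last valid value above it in its column.
filled = df3.ffill()

print(df3)
print(nan_count)
print(filled)
```

Forward fill cannot fill a NaN in the first row of a column, because there is no earlier value to copy; that case does not arise here since id 1 is present in both frames.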
6060 (page 6)

       (v) Create a new DataFrame df having a hierarchical index on columns "Rooms" and "Furnishing Status".

   (b) Refer to DataFrame House_Rent given in question 2(a). Write a python code to plot a bar plot displaying no of Furnished, Unfurnished, Semi-Furnished houses. Import appropriate libraries. The title of graph should be "House Data". Give appropriate labels for x and y axis. Save the figure with name "house.jpg". (5)

3. (a) Write python code to create a numpy array a1 containing 50 floating points values in the range 0 to 1. Put the data of numpy array a1 into 5 bins. Set the precision to 4. Assign names to bins as ['Small','Medium','Large','x-Large','xx-Large']. (5)

   (b) Write a numpy code to create a 3D array a3 of size 4 x 5 x 3 of random numbers in range 1 to 60 and swap axis 1 with axis 2. Identify the number of matrices in the array a3, dimension of a matrix in array a3 and the datatype of array a3. (5)

6060 (page 7)

   (c) Consider numpy array arr given below: (5)

       arr = [[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11],
              [12, 13, 14, 15],
              [16, 17, 18, 19],
              [20, 21, 22, 23]]

       Write numpy commands to retrieve following elements:

       (i) (1,4), (3,1), (5,0), and (2,3)
       (ii) Retrieve 0, 2, 4 rows (use positive index)
       (iii) Retrieve 1, 3, 5 rows (use negative index)
       (iv) Retrieve values greater than 10
       (v) Retrieve rows 1 to 4.

4. (a) What is data wrangling? Identify the possible issues that can arise in data wrangling process? (5)

   (b) Consider a csv file test.csv having 3 columns and 50 rows. Write python command to perform following operations : (5)
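For the array arr in question 3(c) above, parts (ii) to (v) can be sketched with fancy indexing, boolean masking and slicing. Two assumptions are made: rows are counted from 0, and "rows 1 to 4" includes row 4. The result names are hypothetical.

```python
import numpy as np

# The 6x4 array from the question holds the integers 0..23 row by row.
arr = np.arange(24).reshape(6, 4)

# (ii) Rows 0, 2 and 4, selected with a list of positive indices.
even_rows = arr[[0, 2, 4]]

# (iii) Rows 1, 3 and 5, written as negative indices counted from the end.
odd_rows = arr[[-5, -3, -1]]

# (iv) A boolean mask returns the matching values as a 1-D array.
greater_than_10 = arr[arr > 10]

# (v) Slicing excludes the stop index, so 1:5 covers rows 1, 2, 3 and 4.
rows_1_to_4 = arr[1:5]

print(even_rows)
print(odd_rows)
print(greater_than_10)
print(rows_1_to_4)
```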
