CS3361 Lab Exp

Uploaded by

aiguestacc

CS3361 – DATA SCIENCE LABORATORY

(Regulations 2021)

1. NumPy Programs

a) Write a NumPy program to create a null vector of size 10 and update sixth value to 11

b) Write a NumPy program to convert an array to a float type

c) Write a NumPy program to create a 3x3 matrix with values ranging from 2 to 10

d) Write a NumPy program to convert a list of numeric values into a one-dimensional NumPy
array

e) Write a NumPy program to convert an array to a float type

f) Write a NumPy program to create an empty and a full array

g) Write a NumPy program to convert a list and tuple into arrays

h) Write a NumPy program to find the real and imaginary parts of an array of complex numbers

i) Write a NumPy program to merge three given NumPy arrays of same shape

j) Write a NumPy program to add a border (filled with 0's) around an existing array

k) Write a NumPy program to append values to the end of an array

l) Write a NumPy program to convert the values of Centigrade degrees into Fahrenheit
degrees and vice versa. Values have to be stored into a NumPy array.

m) Write a NumPy program to convert a NumPy array into a csv file

n) Write a NumPy program to perform the basic arithmetic operations

o) Write a NumPy program to transpose an array

p) Using NumPy, create an array with 5 dimensions and verify that it has 5 dimensions

q) Using NumPy, sort a boolean array

r) Create two arrays of six elements. Write a NumPy program to count the number of
instances of a value occurring in one array on the condition of another array.
Sample Output:
Original arrays:
[ 10 -10 10 -10 -10 10]
[0.85 0.45 0.9 0.8 0.12 0.6 ]
Number of instances of a value occurring in one array on the condition of another array:
3
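
A minimal sketch of one way to get the sample output above. The condition itself is an assumption, since the exercise does not state it: here we count the entries equal to 10 in the first array where the matching entry in the second array exceeds 0.5.

```python
import numpy as np

# Sample arrays taken from the exercise's sample output.
x = np.array([10, -10, 10, -10, -10, 10])
y = np.array([0.85, 0.45, 0.9, 0.8, 0.12, 0.6])

# Assumed condition: x equals 10 AND the matching y value exceeds 0.5.
count = int(np.sum((x == 10) & (y > 0.5)))
print("Number of instances:", count)
```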

s) Create a 2-dimensional array of size 2 x 3, composed of 4-byte integer elements. Write a
NumPy program to find the number of occurrences of a sequence in the said array.
Sample Output:
Original NumPy array:
[[1 2 3]
[2 1 2]]
Type: <class 'numpy.ndarray'>
Sequence: 2,3
Number of occurrences of the said sequence: 2

t) Write a NumPy program to combine the last element of the first array with the first element
of the second, for two given ndarrays with different shapes.

Sample Output:
Original arrays:
['PHP', 'JS', 'C++']
['Python', 'C#', 'NumPy']
After Combining:
['PHP' 'JS' 'C++Python' 'C#' 'NumPy']
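
One possible approach, sketched with the sample lists above: join the two boundary strings first, then concatenate the remaining pieces. The variable names are illustrative only.

```python
import numpy as np

array1 = ['PHP', 'JS', 'C++']
array2 = ['Python', 'C#', 'NumPy']

# Merge the last item of array1 with the first item of array2,
# then stitch the untouched parts around the merged item.
merged = array1[-1] + array2[0]
result = np.concatenate([array1[:-1], [merged], array2[1:]])
print(result)
```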

u) Write a NumPy program to convert a Python dictionary to a NumPy ndarray.


Sample Output:
Original dictionary:
{'column0': {'a': 1, 'b': 0.0, 'c': 0.0, 'd': 2.0},
'column1': {'a': 3.0, 'b': 1, 'c': 0.0, 'd': -1.0},
'column2': {'a': 4, 'b': 1, 'c': 5.0, 'd': -1.0},
'column3': {'a': 3.0, 'b': -1.0, 'c': -1.0, 'd': -1.0}}
Type: <class 'dict'>
ndarray:
[[ 1. 0. 0. 2.]
[ 3. 1. 0. -1.]
[ 4. 1. 5. -1.]
[ 3. -1. -1. -1.]]
Type: <class 'numpy.ndarray'>
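
A short sketch of one way to do the conversion, assuming each outer key becomes a row and each inner key a column, both taken in sorted key order:

```python
import numpy as np

data = {
    'column0': {'a': 1, 'b': 0.0, 'c': 0.0, 'd': 2.0},
    'column1': {'a': 3.0, 'b': 1, 'c': 0.0, 'd': -1.0},
    'column2': {'a': 4, 'b': 1, 'c': 5.0, 'd': -1.0},
    'column3': {'a': 3.0, 'b': -1.0, 'c': -1.0, 'd': -1.0},
}

# Build one row per outer key; inner values follow sorted inner keys.
result = np.array(
    [[inner[key] for key in sorted(inner)] for _, inner in sorted(data.items())],
    dtype=float,
)
print(result)
print("Type:", type(result))
```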

v) Write a NumPy program to search the index of a given array in another given array.

Sample Output:
Original NumPy array:
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
Searched array:
[4 5 6]
Index of the searched array in the original array:
[1]
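
The search can be sketched as a row-wise comparison: a row matches when all of its elements equal the searched array. This assumes the searched array has the same length as a row.

```python
import numpy as np

arr = np.arange(1, 13).reshape(4, 3)   # rows [1 2 3] ... [10 11 12]
target = np.array([4, 5, 6])

# Broadcasting compares target against every row; all(axis=1) keeps full matches.
idx = np.where((arr == target).all(axis=1))[0]
print(idx)
```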

2. Pandas Programs

a) Create your own simple Pandas DataFrame and print its values

b) Create your own DataFrame from dict of ndarray/list

c) Perform appending, slicing, addition and deletion of rows with a Pandas DataFrame.

d) Using Pandas, create a DataFrame from a list of dictionaries, with row indices and column
indices.

e) Use index label to delete or drop rows from a Pandas DataFrame.

f) Write a Pandas program to get the powers of array values element-wise.


Note: First array elements raised to powers from second array
Sample data: {'X':[78,85,96,80,86], 'Y':[84,94,89,83,86],'Z':[86,97,96,72,83]}
Expected Output:
    X   Y   Z
0  78  84  86
1  85  94  97
2  96  89  96
3  80  83  72
4  86  86  83

g) Write a Pandas program to select the specified columns and rows from a given DataFrame.
Select 'name' and 'score' columns in rows 1, 3, 5, 6 from the following data frame.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Select specific columns and rows:
      name  score
b     Dima    9.0
d    James    NaN
f  Michael   20.0
g  Matthew   14.5
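
A minimal sketch following the problem statement, which asks for the 'name' and 'score' columns in rows 1, 3, 5 and 6 (positions counted from 0):

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
                      'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

df = pd.DataFrame(exam_data, index=labels)

# iloc picks rows by position; the column list then narrows the result.
selected = df.iloc[[1, 3, 5, 6]][['name', 'score']]
print(selected)
```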

h) Write a Pandas program to create and display a DataFrame from a specified dictionary data
which has the index labels.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew',
'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:

   attempts       name qualify  score
a         1  Anastasia     yes   12.5
b         3       Dima      no    9.0
....
i         2      Kevin      no    8.0
j         1      Jonas     yes   19.0

i) Write a Pandas program to select the rows where the number of attempts in the examination is
greater than 2.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew',
'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Number of attempts in the examination is greater than 2:
name score attempts qualify
b Dima 9.0 3 no
d James NaN 3 no
f Michael 20.0 3 yes
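
The selection above can be sketched with a boolean mask on the 'attempts' column:

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
                      'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

df = pd.DataFrame(exam_data, index=labels)

# The comparison yields a True/False Series that filters the rows.
many_attempts = df[df['attempts'] > 2]
print(many_attempts)
```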

j) Write a Pandas program to get the first 3 rows of a given DataFrame.


Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew',
'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
First three rows of the data frame:
attempts name qualify score
a 1 Anastasia yes 12.5
b 3 Dima no 9.0
c 2 Katherine yes 16.5

k) Write a Pandas program to get first n records of a DataFrame.


Sample Output:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     2     5     5
2     3     6     8
3     4     9    12
4     7     5     1
5    11     0    11
First 3 rows of the said DataFrame:
   col1  col2  col3
0     1     4     7
1     2     5     5
2     3     6     8

l) Write a Pandas program to select all columns, except one given column in a DataFrame.

Sample Output:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6    12
3     4     9     1
4     7     5    11
All columns except 'col3':
   col1  col2
0     1     4
1     2     5
2     3     6
3     4     9
4     7     5

m) Write a Pandas program to select the rows where the score is missing, i.e. is NaN.

Sample Python dictionary data and list labels:


exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura',
'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Rows where score is missing:
attempts name qualify score
d 3 James no NaN
h 1 Laura no NaN
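
A short sketch of the missing-value filter, using `isna()` on the 'score' column:

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
                      'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

df = pd.DataFrame(exam_data, index=labels)

# isna() marks NaN entries; the mask keeps only those rows.
missing_score = df[df['score'].isna()]
print(missing_score)
```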

n) Write a Pandas program to count the number of rows and columns of a DataFrame.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael',
'Matthew', 'Laura', 'Kevin', 'Jonas'],
'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
Expected Output:
Number of Rows: 10
Number of Columns: 4

o) Write a Pandas program to count number of columns of a DataFrame.
Sample Output:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6    12
3     4     9     1
4     7     5    11
Number of columns:
3

p) Write a Pandas program to group by the first column and get second column as lists in
rows

Sample data:
Original DataFrame
col1 col2
0 C1 1
1 C1 2
2 C2 3
3 C2 3
4 C2 4
5 C3 6
6 C2 5
Group on the col1:
col1
C1 [1, 2]
C2 [3, 3, 4, 5]
C3 [6]
Name: col2, dtype: object
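
One way to sketch the grouping, rebuilding the sample frame above and collecting each group's 'col2' values into a list:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['C1', 'C1', 'C2', 'C2', 'C2', 'C3', 'C2'],
                   'col2': [1, 2, 3, 3, 4, 6, 5]})

# apply(list) turns each group's col2 values into a plain Python list.
grouped = df.groupby('col1')['col2'].apply(list)
print(grouped)
```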

q) Write a Pandas program to check whether a given column is present in a DataFrame or not.
Sample data:
Original DataFrame
   col1  col2  col3
0     1     4     7
1     2     5     8
2     3     6    12
3     4     9     1
4     7     5    11
Col4 is not present in DataFrame.
Col1 is present in DataFrame.
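
A minimal sketch of the membership check. The helper name `has_column` is hypothetical, and the check is case-sensitive, so the lowercase names from the sample frame are used:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 7],
                   'col2': [4, 5, 6, 9, 5],
                   'col3': [7, 8, 12, 1, 11]})

def has_column(frame, name):
    """Return True when `name` is one of the frame's column labels."""
    return name in frame.columns

for name in ['col4', 'col1']:
    status = "present" if has_column(df, name) else "not present"
    print(f"{name} is {status} in DataFrame.")
```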

r) Write a Pandas program to get the numeric representation of an array by identifying distinct
values of a given column of a dataframe.
Sample Output:
Original DataFrame:
             Name Date_Of_Birth   Age
0  Alberto Franco    17/05/2002  18.5
1    Gino Mcneill    16/02/1999  21.2
2     Ryan Parkes    25/09/1998  22.5
3    Eesha Hinton    11/05/2002  22.0
4    Gino Mcneill    15/09/1997  23.0
Numeric representation of an array by identifying distinct values:
[0 1 2 3 1]
Index(['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton'], dtype='object')
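
The sample output above matches what `pd.factorize` returns for the 'Name' column, so a sketch along those lines could look like this:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill'],
    'Date_Of_Birth': ['17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],
    'Age': [18.5, 21.2, 22.5, 22.0, 23.0],
})

# factorize assigns an integer code to each distinct value, in order of first appearance.
codes, uniques = pd.factorize(df['Name'])
print(codes)
print(uniques)
```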

s) Write a Pandas program to check for inequality of two given DataFrames.


Sample Output:
Original DataFrames:
      W     X   Y   Z
0  68.0  78.0  84  86
1  75.0  85.0  94  97
2  86.0   NaN  89  96
3  80.0  80.0  83  72
4   NaN  86.0  86  83
      W   X   Y   Z
0  78.0  78  84  86
1  75.0  85  84  97
2  86.0  96  89  96
3  80.0  80  83  72
4   NaN  76  86  83
Check for inequality of the said dataframes:
       W      X      Y      Z
0   True  False  False  False
1  False  False   True  False
2  False   True  False  False
3  False  False  False  False
4   True   True  False  False
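
A sketch of the comparison using `DataFrame.ne`. Note that NaN never compares equal to anything, including another NaN, which is why row 4 of column W is reported as unequal.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'W': [68, 75, 86, 80, np.nan],
                    'X': [78, 85, np.nan, 80, 86],
                    'Y': [84, 94, 89, 83, 86],
                    'Z': [86, 97, 96, 72, 83]})
df2 = pd.DataFrame({'W': [78, 75, 86, 80, np.nan],
                    'X': [78, 85, 96, 80, 76],
                    'Y': [84, 84, 89, 83, 86],
                    'Z': [86, 97, 96, 72, 83]})

# Element-wise "not equal"; True marks cells that differ between the frames.
result = df1.ne(df2)
print(result)
```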

3. Using Pandas library,

i. Load the iris.CSV file

ii. Convert it into the data frame and read it.

iii. Display records only with species "Iris-setosa".
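
The three steps can be sketched as follows. To keep the example self-contained, an in-memory string stands in for the file; the column names and the three-row excerpt are assumptions about the file's layout, not taken from this sheet.

```python
import io
import pandas as pd

# Stand-in for iris.CSV: a three-row excerpt in an ASSUMED column layout.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

# With the real file on disk this would be pd.read_csv("iris.CSV").
df = pd.read_csv(io.StringIO(csv_text))
print(df.head())

# Keep only the rows whose species column is "Iris-setosa".
setosa = df[df['species'] == 'Iris-setosa']
print(setosa)
```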

4. Reading data from text files, Excel and the web and exploring various commands for doing
descriptive analytics on the Iris data set

5. Use the diabetes data set from Pima Indians Diabetes data set for performing the following:

Apply Univariate analysis:

Frequency
Mean
Median
Mode
Variance
Standard Deviation
Skewness and Kurtosis

6. Use the diabetes data set from Pima Indians Diabetes data set for performing the following:

Apply Bivariate analysis:

Linear and logistic regression modeling

Multiple Regression analysis

7. Apply and explore various plotting functions on UCI data set for performing the following:

Normal values
Density and contour plots
Three-dimensional plotting
Correlation and scatter plots
Histograms

8. Apply and explore various plotting functions on Pima Indians Diabetes data set for
performing the following:

Normal values
Density and contour plots
Three-dimensional plotting
Correlation and scatter plots
Histograms

9. Compare the results of the Univariate and Bivariate analysis for the UCI diabetes data set

10. Perform Multiple Regression analysis on your own dataset (for example, a car dataset with
the columns Company Name, Model, Volume, Weight, CO2), using more than one independent
variable to predict a value based on two or more variables.

11. Using various functions in numpy library, mathematically calculate the values for a normal
distribution and create Histograms to plot the probability distribution curve
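
A minimal NumPy-only sketch of the calculation. The mean, standard deviation, sample size and bin count are assumed values chosen for illustration.

```python
import numpy as np

mu, sigma = 0.0, 1.0            # assumed parameters of the distribution
x = np.linspace(mu - 5 * sigma, mu + 5 * sigma, 1001)

# Probability density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))
pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Draw samples and bin them; density=True scales the bars to unit total area.
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, 10_000)
density, edges = np.histogram(samples, bins=30, density=True)

# Sanity check: the curve should enclose an area close to 1.
area = np.sum(pdf) * (x[1] - x[0])
print("Area under the curve:", round(float(area), 4))
```

To draw the result, matplotlib's `plt.hist(samples, bins=30, density=True)` followed by `plt.plot(x, pdf)` overlays the curve on the histogram.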

12. Using the plt.contour(), plt.contourf(), plt.imshow(), plt.colorbar(), and plt.clabel() functions,
visualize a contour plot

13. Using the “concrete strength” dataset, explore relationships between two continuous variables
with Scatterplots

14. Draw a Scatter Plot for the following Pandas DataFrame with Team name and Rank Points as
x and y axis

["Australia", 2500],["Bangladesh", 1000],["England", 2000],["India", 3000],["Srilanka", 1500]

15. Make a three-dimensional plot with 50 randomly generated data points for x, y, and z. Set the
point color as red, and size of the point as 50.

16. How will you plot and visualize geographical data with the help of Basemap? State the
procedure for it with an example.

17. Perform Univariate analysis with the following pandas DataFrame

'points': [1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2]

'assists': [5, 7, 7, 9, 12, 9, 9, 4, 6, 8, 8, 9, 3, 2, 6]

'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 6, 6, 7, 8, 7, 9, 15]
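
One way to sketch the univariate summary for this data is to collect the per-column statistics into a single table. When a column has several modes, this sketch keeps only the smallest one.

```python
import pandas as pd

df = pd.DataFrame({
    'points': [1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2],
    'assists': [5, 7, 7, 9, 12, 9, 9, 4, 6, 8, 8, 9, 3, 2, 6],
    'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 6, 6, 7, 8, 7, 9, 15],
})

# One row per variable, one column per statistic.
summary = pd.DataFrame({
    'mean': df.mean(),
    'median': df.median(),
    'mode': df.mode().iloc[0],      # first (smallest) mode of each column
    'variance': df.var(),           # sample variance (ddof=1)
    'std': df.std(),                # sample standard deviation (ddof=1)
    'skewness': df.skew(),
    'kurtosis': df.kurt(),
})
print(summary)

# Frequency table for a single variable.
print(df['points'].value_counts().sort_index())
```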

18. Perform Bivariate analysis using the pandas DataFrame that contains information about two
variables: (1) Hours spent studying and (2) Exam score received by 20 different students:
