Pandas_Worksheet
a. Create a series with 5 elements. Display the series sorted by index and, separately,
sorted by values.
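One possible approach to part (a) is sketched below. The values and index labels are illustrative assumptions, chosen so that the two orderings differ.

```python
import pandas as pd

# Five illustrative values with deliberately unordered labels (assumed data)
s = pd.Series([3, 5, 1, 4, 2], index=["d", "a", "e", "b", "c"])

by_index = s.sort_index()    # ordered by label: a, b, c, d, e
by_values = s.sort_values()  # ordered by value: 1, 2, 3, 4, 5

print("Sorted on index:")
print(by_index)
print("Sorted on values:")
print(by_values)
```

Both methods return a new Series and leave `s` unchanged; pass `ascending=False` to reverse either ordering.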
b. Create a series with N elements, some of which are duplicates. Find the minimum and
maximum ranks assigned to the values using the ‘first’ and ‘max’ ranking methods.
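A minimal sketch for part (b), assuming N = 6 and an illustrative set of values with repeats. `Series.rank` accepts `method="first"` (ties ranked in order of appearance) and `method="max"` (every tied value gets the highest rank of its group).

```python
import pandas as pd

# Assumed sample data: 7 appears three times, 3 appears twice
s = pd.Series([7, 3, 7, 1, 3, 7])

rank_first = s.rank(method="first")  # ties broken by position in the series
rank_max = s.rank(method="max")      # ties share the largest rank in the group

print(pd.DataFrame({"value": s, "first": rank_first, "max": rank_max}))

# Minimum and maximum rank assigned under each method
print("first:", rank_first.min(), rank_first.max())
print("max:  ", rank_max.min(), rank_max.max())
```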
c. Display the index value of the minimum and maximum element of a Series.
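Part (c) can be approached with `Series.idxmin` and `Series.idxmax`, which return the index label (not the position) of the smallest and largest element. The data below is an assumed example.

```python
import pandas as pd

# Assumed sample series with string labels
s = pd.Series([23, 5, 42, 17], index=["w", "x", "y", "z"])

min_label = s.idxmin()  # label of the smallest value
max_label = s.idxmax()  # label of the largest value

print("Index of minimum element:", min_label)
print("Index of maximum element:", max_label)
```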
2. Create a data frame having at least 3 columns and 50 rows to store numeric data
generated using a random function. Replace 10% of the values with null values at index
positions that are also generated using a random function. Do the following:
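The setup for this question could be sketched as follows. The column labels `A`, `B`, `C`, the seed, and the use of NumPy's `default_rng` are assumptions; any random generator would serve.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # seed fixed only for reproducibility

# 50 rows x 3 columns of random floats in [0, 1)
data = rng.random((50, 3))

# Pick 10% of the cells (15 of 150) at random, without repeats
n_null = data.size // 10
flat_positions = rng.choice(data.size, size=n_null, replace=False)
rows, cols = np.unravel_index(flat_positions, data.shape)
data[rows, cols] = np.nan

df = pd.DataFrame(data, columns=["A", "B", "C"])
print(df.isna().sum())  # null count per column
```

Sampling flat positions with `replace=False` guarantees that exactly 10% of the cells become null, which sampling row and column positions independently would not.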
b. Drop the column having more than 5 null values.
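A hedged sketch for part (b). To make the result predictable, the nulls here are placed by hand rather than at random (an assumption for illustration): column `A` receives 7 nulls and column `B` receives 3.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=1)
df = pd.DataFrame(rng.random((50, 3)), columns=["A", "B", "C"])

# .loc label slices are inclusive, so 0:6 covers 7 rows and 0:2 covers 3 rows
df.loc[0:6, "A"] = np.nan
df.loc[0:2, "B"] = np.nan

null_counts = df.isna().sum()
to_drop = null_counts[null_counts > 5].index  # columns with more than 5 nulls
cleaned = df.drop(columns=to_drop)

print(null_counts)
print("Remaining columns:", list(cleaned.columns))
```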
c. Identify the row label having maximum of the sum of all values in a row and drop that row.
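Part (c) might be handled as below, using a small assumed frame with row labels `r1` to `r3` so the result is easy to check by hand. `DataFrame.sum(axis=1)` skips nulls by default.

```python
import numpy as np
import pandas as pd

# Small assumed frame; row sums are r1 = 6, r2 = 17, r3 = 9
df = pd.DataFrame(
    {"A": [1.0, 9.0, 2.0], "B": [2.0, np.nan, 3.0], "C": [3.0, 8.0, 4.0]},
    index=["r1", "r2", "r3"],
)

row_sums = df.sum(axis=1)       # NaN values are ignored in the sum
max_label = row_sums.idxmax()   # row label with the largest sum
trimmed = df.drop(index=max_label)

print("Row with maximum sum:", max_label)
print(trimmed)
```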
f. Display a summary of the data distribution for all attributes in the dataframe.
g. Compute the pairwise correlation between all attributes in the dataframe.
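Parts (f) and (g) map onto `DataFrame.describe` and `DataFrame.corr`. The sketch below assumes a freshly generated 50 x 3 frame with hypothetical column labels.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=2)
df = pd.DataFrame(rng.random((50, 3)), columns=["A", "B", "C"])

# (f) count, mean, std, min, quartiles and max for every numeric column
summary = df.describe()

# (g) pairwise Pearson correlation between all columns
corr = df.corr()

print(summary)
print(corr)
```

`corr` uses the Pearson coefficient by default; `method="spearman"` or `method="kendall"` selects a rank-based measure instead.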
4. Consider the Titanic dataset, which contains information about passengers on board the
Titanic, including their age, gender, passenger class, survival status, and other attributes.
Write a program using the Pandas library to perform the following operations on the Titanic
dataset:
b. Check for any duplicate records and missing values in the dataset and handle them
appropriately.
c. Calculate and display the total number of passengers who survived and those who did not.
d. Filter the DataFrame to select only the records of passengers who were under the age of
18.
e. Calculate the average age for passengers belonging to each of the passenger class.
h. Calculate the correlation between age and fare attributes of the dataset.
j. Create a contingency table that shows the count of passengers based on their survival
status (survived or not) and passenger class (first, second, or third class).
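The operations listed above could be sketched as follows. Since no file is supplied here, the code builds a tiny stand-in frame; the column names (`Survived`, `Pclass`, `Sex`, `Age`, `Fare`) follow a commonly distributed version of the dataset and are an assumption, as are all the values. Filling missing ages with the median is one reasonable choice among several.

```python
import numpy as np
import pandas as pd

# Stand-in data (assumed); with a real file one would use pd.read_csv instead
titanic = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0],
    "Pclass":   [3, 1, 3, 2, 1, 3, 3],
    "Sex":      ["male", "female", "female", "male", "female", "male", "male"],
    "Age":      [22.0, 38.0, 4.0, np.nan, 16.0, 30.0, 22.0],
    "Fare":     [7.25, 71.28, 16.70, 13.00, 57.00, 8.05, 7.25],
})

# (b) duplicates: count them, then keep the first occurrence of each record
n_dupes = int(titanic.duplicated().sum())
titanic = titanic.drop_duplicates().reset_index(drop=True)

# (b) missing values: report them, then fill Age with the median age
missing_before = titanic.isna().sum()
titanic["Age"] = titanic["Age"].fillna(titanic["Age"].median())

# (c) number of passengers who survived (1) and who did not (0)
survival_counts = titanic["Survived"].value_counts()

# (d) passengers under the age of 18
minors = titanic[titanic["Age"] < 18]

# (e) average age within each passenger class
mean_age_by_class = titanic.groupby("Pclass")["Age"].mean()

# (h) correlation between age and fare
age_fare_corr = titanic["Age"].corr(titanic["Fare"])

# (j) contingency table: survival status against passenger class
table = pd.crosstab(titanic["Survived"], titanic["Pclass"])

print(n_dupes, missing_before["Age"])
print(survival_counts)
print(minors)
print(mean_age_by_class)
print(age_fare_corr)
print(table)
```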
5. Consider the following data frame containing a family name, gender of the family member
and her/his monthly income in each record.
b. Calculate and display the member with the highest monthly income.
c. Calculate and display the monthly income of all members with income greater than
Rs. 60000.00.
d. Calculate and display the average monthly income of the female members.
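A minimal sketch for the parts listed above. The records and the column names `Name`, `Gender` and `MonthlyIncome` are assumptions made for illustration and should be replaced with the data frame given in the question.

```python
import pandas as pd

# Assumed sample records (family name, gender, monthly income in Rs.)
family = pd.DataFrame({
    "Name": ["Shah", "Vats", "Kumar", "Shah", "Kumar"],
    "Gender": ["Male", "Female", "Female", "Female", "Male"],
    "MonthlyIncome": [44000.00, 65000.00, 130000.00, 59000.00, 71000.00],
})

# (b) the record of the member with the highest monthly income
top_member = family.loc[family["MonthlyIncome"].idxmax()]

# (c) members whose monthly income exceeds Rs. 60000.00
high_income = family[family["MonthlyIncome"] > 60000.00]

# (d) average monthly income of the female members
female_mean = family.loc[family["Gender"] == "Female", "MonthlyIncome"].mean()

print(top_member)
print(high_income[["Name", "MonthlyIncome"]])
print(round(female_mean, 2))
```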
6. Consider two Excel files containing the attendance of two workshops. Each file has three
fields, ‘Name’, ‘Date’, and ‘Duration’ (in minutes), where names are unique within a file.
Note that duration may take one of three values (30, 40, 50) only. Import the data into two
data frames and do the following:
a. Perform merging of the two data frames to find the names of students who had attended
both workshops.
b. Find names of all students who have attended a single workshop only.
c. Merge two data frames row-wise and find the total number of records in the data frame.
d. Merge two data frames row-wise and use two columns, viz. names and dates, as a
multi-level row index.
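The four parts above can be sketched as follows. To keep the example self-contained, the two data frames are built in memory with assumed names, dates and durations; with real files they would come from `pd.read_excel` (the file names would be whatever the workshop files are called).

```python
import pandas as pd

# Assumed attendance data standing in for the two Excel files
w1 = pd.DataFrame({
    "Name": ["Asha", "Bilal", "Chen"],
    "Date": ["2024-01-10", "2024-01-10", "2024-01-10"],
    "Duration": [30, 40, 50],
})
w2 = pd.DataFrame({
    "Name": ["Bilal", "Chen", "Divya"],
    "Date": ["2024-01-17", "2024-01-17", "2024-01-17"],
    "Duration": [40, 30, 50],
})

# (a) inner join on Name keeps only students present in both files
both = pd.merge(w1, w2, on="Name", suffixes=("_w1", "_w2"))
both_names = both["Name"].tolist()

# (b) outer join with an indicator column; keep names found in one file only
outer = pd.merge(w1[["Name"]], w2[["Name"]], on="Name", how="outer", indicator=True)
single_names = outer.loc[outer["_merge"] != "both", "Name"].tolist()

# (c) row-wise merge (concatenation) and total record count
stacked = pd.concat([w1, w2], ignore_index=True)
total_records = len(stacked)

# (d) row-wise merge with Name and Date as a two-level row index
indexed = pd.concat([w1, w2]).set_index(["Name", "Date"])

print(both_names)
print(single_names)
print(total_records)
print(indexed)
```

The `indicator=True` option adds a `_merge` column with the values `left_only`, `right_only` and `both`, which makes part (b) a simple filter.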