Prac 2
Practical – 2
Create a NumPy array of shape (5, 5) with values ranging from 1 to 25.
• Perform the following operations:
• Flatten the array into a 1D array.
• Calculate the mean, median, and standard deviation of the array.
• Reshape the array back into a 5x5 matrix and replace all values greater than 10 with 0.
import numpy as np
# Build the 5x5 array holding the values 1 to 25
array_2d = np.arange(1, 26).reshape(5, 5)
array_flattened = array_2d.flatten()
mean_value = np.mean(array_flattened)
median_value = np.median(array_flattened)
std_deviation = np.std(array_flattened)
# Copy so that zeroing values below does not also modify array_flattened
array_reshaped = array_flattened.reshape(5, 5).copy()
array_reshaped[array_reshaped > 10] = 0
print("Original 2D Array:")
print(array_2d)
print("\nFlattened Array:")
print(array_flattened)
print("\nMean:", mean_value)
print("Median:", median_value)
print("Standard Deviation:", std_deviation)
print("\nModified 2D Array:")
print(array_reshaped)
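As a quick check on the steps above, the following sketch recomputes the statistics for the values 1 to 25 and shows why the reshaped array may need to be copied before values are overwritten. It assumes only NumPy; the names `flat`, `view`, and `safe` are illustrative and not part of the original practical. Note that `np.std` computes the population standard deviation by default (`ddof=0`).

```python
import numpy as np

array_2d = np.arange(1, 26).reshape(5, 5)
flat = array_2d.flatten()               # flatten() always returns a copy

print(np.mean(flat))                    # 13.0
print(np.median(flat))                  # 13.0
print(round(np.std(flat), 4))           # 7.2111 (population, ddof=0)
print(round(np.std(flat, ddof=1), 4))   # 7.3598 (sample, ddof=1)

# reshape() on a contiguous array returns a view that shares memory
view = flat.reshape(5, 5)
print(np.shares_memory(flat, view))     # True

# Masking a copy leaves the flattened array untouched
safe = flat.reshape(5, 5).copy()
safe[safe > 10] = 0
print(flat.max(), safe.max())           # 25 10
```

Without the `.copy()`, assigning through the reshaped view would also zero the same entries in the flattened array.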
22SE02ML063 Business Analytics
• Create two NumPy arrays: a 3x3 matrix of random integers between 1 and 10 and a 3x1 column vector of random integers between 1 and 5.
• Perform the following:
o Multiply the matrix by the column vector.
o Transpose the resulting matrix.
o Find the determinant of the original 3x3 matrix.
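import numpy as np
matrix = np.random.randint(1, 11, size=(3, 3))        # integers 1 to 10 (upper bound is exclusive)
column_vector = np.random.randint(1, 6, size=(3, 1))  # integers 1 to 5
result_matrix = matrix @ column_vector                # shape (3, 1); assumes "multiply" means a matrix product
transposed_matrix = result_matrix.T
determinant = np.linalg.det(matrix)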
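The word "multiply" in the task can be read in two ways, so a small sketch with fixed values may help compare them. The matrix and vector below are hand-picked for reproducibility (an assumption; the practical itself uses random integers), and the names `product` and `scaled` are illustrative.

```python
import numpy as np

matrix = np.array([[2, 1, 0],
                   [1, 3, 1],
                   [0, 1, 4]])
column_vector = np.array([[1], [2], [3]])

# Reading 1: matrix product, (3, 3) @ (3, 1) -> (3, 1)
product = matrix @ column_vector
print(product.T)              # [[ 4 10 14]]

# Reading 2: element-wise product with broadcasting -> (3, 3),
# where row i of the matrix is scaled by entry i of the vector
scaled = matrix * column_vector
print(scaled.T.shape)         # (3, 3)

# np.linalg.det works in floating point, so round before comparing
determinant = np.linalg.det(matrix)
print(round(determinant, 6))  # 18.0
```

Either reading yields something that can be transposed; the determinant depends only on the original 3x3 matrix.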
Create a Pandas DataFrame with columns Name, Age, Height, and City with the following data:
• Perform the following tasks:
o Display the first 3 rows of the DataFrame.
o Add a new column Weight with random values.
o Filter the rows where Age is greater than 25 and display only the Name and Height columns.
import numpy as np
import pandas as pd
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [23, 30, 35, 22, 28],
    "Height": [5.5, 6.0, 5.8, 5.9, 5.7],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"]
}
df = pd.DataFrame(data)
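The code above stops once the DataFrame is built. One way the three tasks could be completed is sketched here; the weight range (50 to 90) and the fixed seed are assumptions made for reproducibility, since the task only asks for random values, and the name `older` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [23, 30, 35, 22, 28],
    "Height": [5.5, 6.0, 5.8, 5.9, 5.7],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

# Task 1: display the first three rows
print(df.head(3))

# Task 2: add random weights (the 50-90 range is an assumed choice)
rng = np.random.default_rng(seed=0)
df["Weight"] = rng.uniform(50, 90, size=len(df)).round(1)

# Task 3: rows with Age > 25, keeping only Name and Height
older = df.loc[df["Age"] > 25, ["Name", "Height"]]
print(older)
```

With this data the filter keeps Bob, Charlie, and Eve.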
Create a DataFrame containing Name, Age, Salary columns with some missing (NaN) values.
• Fill the missing Age values with the mean value of the column.
• Drop any rows where Salary is missing.
import numpy as np
import pandas as pd
data_with_nan = {
    "Name": ["Frank", "Grace", "Hank", "Ivy", "Jack"],
    "Age": [25, np.nan, 29, np.nan, 32],
    "Salary": [50000, 60000, np.nan, 75000, 80000]
}
df_nan = pd.DataFrame(data_with_nan)
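# Assign the result back; fillna(inplace=True) on a column selection may not update df_nan in recent pandas
df_nan["Age"] = df_nan["Age"].fillna(df_nan["Age"].mean())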
df_nan.dropna(subset=["Salary"], inplace=True)
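The order of the two steps matters slightly, because the mean Age is computed from whichever rows are present at that point. The sketch below reuses the same data and compares filling before dropping with dropping before filling; the names `fill_then_drop` and `drop_then_fill` are illustrative.

```python
import numpy as np
import pandas as pd

df_nan = pd.DataFrame({
    "Name": ["Frank", "Grace", "Hank", "Ivy", "Jack"],
    "Age": [25, np.nan, 29, np.nan, 32],
    "Salary": [50000, 60000, np.nan, 75000, 80000],
})

# Fill first: the mean uses all three known ages (25, 29, 32)
fill_then_drop = df_nan.assign(Age=df_nan["Age"].fillna(df_nan["Age"].mean()))
fill_then_drop = fill_then_drop.dropna(subset=["Salary"])

# Drop first: Hank (age 29) is removed, so the mean uses only 25 and 32
drop_then_fill = df_nan.dropna(subset=["Salary"])
drop_then_fill = drop_then_fill.assign(
    Age=drop_then_fill["Age"].fillna(drop_then_fill["Age"].mean())
)

print(fill_then_drop["Age"].round(2).tolist())  # [25.0, 28.67, 28.67, 32.0]
print(drop_then_fill["Age"].round(2).tolist())  # [25.0, 28.5, 28.5, 32.0]
```

The practical fills first, which matches the order in which the tasks are listed.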
Create a line plot that represents the relationship between two lists x = [1, 2, 3, 4, 5] and y = [2, 4, 6, 8, 10].
• Label the x-axis as "X values" and the y-axis as "Y values".
• Add a title "Simple Line Plot".
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.plot(x, y, marker='o')
plt.xlabel("X values")
plt.ylabel("Y values")
plt.title("Simple Line Plot")
plt.grid(True)
plt.show()
Plot histograms for both total_bill and tip. Compare their distributions.
• Create overlapping histograms for total_bill for lunch and dinner times. What differences do you notice?
• Adjust the number of bins in the histogram to 50. How does it affect the visualization?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = sns.load_dataset('tips')
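Observation: Increasing the number of bins to 50 narrows each bin, which reveals finer structure in the distribution but also makes the histogram noisier, because each bin holds fewer observations.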
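The histogram steps are not shown in the code above, which only loads the dataset. A minimal sketch of how they might be drawn with Matplotlib follows; the function name `plot_tips_histograms` is hypothetical, and it assumes a DataFrame with `total_bill`, `tip`, and `time` columns such as the one returned by `sns.load_dataset('tips')`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_tips_histograms(tips: pd.DataFrame, bins: int = 50):
    """Draw the three histograms from the task and return the Figure."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Separate histograms for total_bill and tip
    axes[0].hist(tips["total_bill"], bins=bins, color="steelblue")
    axes[0].set_title("Total Bill")
    axes[1].hist(tips["tip"], bins=bins, color="darkorange")
    axes[1].set_title("Tip")

    # Overlapping histograms share bin edges so the bars line up
    edges = np.histogram_bin_edges(tips["total_bill"], bins=bins)
    for label in ("Lunch", "Dinner"):
        subset = tips.loc[tips["time"] == label, "total_bill"]
        axes[2].hist(subset, bins=edges, alpha=0.5, label=label)
    axes[2].set_title("Total Bill by Time")
    axes[2].legend(title="Time")

    for ax in axes:
        ax.set_ylabel("Count")
    fig.tight_layout()
    return fig
```

Calling `plot_tips_histograms(data, bins=50)` followed by `plt.show()` would display the figure; repeating the call with a smaller `bins` value gives the comparison asked for in the last task.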
Create a boxplot comparing tip amounts for smokers and non-smokers. What trends can you identify?
• Add a swarmplot over the boxplot (use sns.swarmplot) for total_bill by day. Does it add any additional insights?
• Group the boxplot by sex and time (e.g., use hue='sex' and x='time') to see if there are any differences in spending habits.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = sns.load_dataset('tips')
plt.figure(figsize=(8, 6))
sns.boxplot(x='smoker', y='tip', data=data)
plt.title("Boxplot of Tip Amounts for Smokers and Non-Smokers")
plt.xlabel("Smoker")
plt.ylabel("Tip Amount")
plt.show()
Observation: The boxplot compares the median, interquartile range, and outliers of tip amounts for smokers and non-smokers, which shows whether either group tends to tip more.
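To back the visual comparison with numbers, the quartiles that the boxplot draws can also be tabulated. The helper below is a sketch with a hypothetical name, `summarize_tips_by_smoker`, and assumes a DataFrame with `smoker` and `tip` columns like the tips dataset.

```python
import pandas as pd


def summarize_tips_by_smoker(tips: pd.DataFrame) -> pd.DataFrame:
    """Return count and quartiles of tip for each smoker group."""
    grouped = tips.groupby("smoker", observed=True)["tip"]
    # describe() includes the 25%, 50% (median) and 75% quantiles,
    # which correspond to the box edges and centre line in a boxplot
    return grouped.describe()[["count", "25%", "50%", "75%"]]


# Example usage with the dataset loaded above:
# print(summarize_tips_by_smoker(data))
```

The `50%` column gives the median for each group, so the two rows can be compared directly.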
plt.figure(figsize=(10, 6))
sns.boxplot(x='day', y='total_bill', data=data, palette='Set2')
sns.swarmplot(x='day', y='total_bill', data=data, color='black', alpha=0.7)
plt.title("Boxplot with Swarmplot Overlay of Total Bill by Day")
plt.xlabel("Day")
plt.ylabel("Total Bill")
plt.show()
Observation: The swarmplot overlays every individual bill on the boxplot, so it shows how many observations each day has and where they cluster, detail that the box summary alone hides.
plt.figure(figsize=(10, 6))
sns.boxplot(x='time', y='total_bill', hue='sex', data=data, palette='coolwarm')
plt.title("Boxplot of Total Bill Grouped by Sex and Time")
plt.xlabel("Time")
plt.ylabel("Total Bill")
plt.legend(title="Sex")
plt.show()
Observation: Grouping by time with hue='sex' places the male and female boxes side by side for lunch and dinner, so differences in median total bill and spread between the groups can be read directly.