batch2 ds
Program:
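import numpy as np
# Convert a list and a tuple to NumPy arrays
list_data = [1, 2, 3, 4, 5]
tuple_data = (6, 7, 8, 9, 10)  # assumed sample values; the original definition is missing
array_from_list = np.array(list_data)
array_from_tuple = np.array(tuple_data)
# Conversion helpers: F = C * 9/5 + 32 and C = (F - 32) * 5/9
def centigrade_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32
def fahrenheit_to_centigrade(fahrenheit):
    return (fahrenheit - 32) * 5 / 9
centigrade_values = np.array([0, 10, 20, 30, 40, 50])  # values taken from the output line
fahrenheit_array = np.array([32, 50, 68, 86, 104, 122])  # assumed sample values; the original definition is missing
fahrenheit_values = centigrade_to_fahrenheit(centigrade_values)
centigrade_from_fahrenheit = fahrenheit_to_centigrade(fahrenheit_array)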
output:
Centigrade values: [ 0 10 20 30 40 50]
2. i. Write a NumPy program to find the real and imaginary parts of an array of
complex numbers.
Program:
import numpy as np
# Create a NumPy array of complex numbers
complex_array = np.array([2 + 3j, 4 - 5j, -1 + 2j, 3 + 4j])
# Extract the real parts of the complex numbers
real_parts = np.real(complex_array)
# Extract the imaginary parts of the complex numbers
imaginary_parts = np.imag(complex_array)
# Print the results
print("Complex array:", complex_array)
print("Real parts:", real_parts)
print("Imaginary parts:", imaginary_parts)
output:
Complex array: [ 2.+3.j 4.-5.j -1.+2.j 3.+4.j]
Real parts: [ 2. 4. -1. 3.]
Imaginary parts: [ 3. -5. 2. 4.]
ii. Write a NumPy program to convert a NumPy array into a CSV file.
Program:
import numpy as np
output:
1,2,3
4,5,6
7,8,9
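The body of this program is missing from the record. A minimal sketch that would produce the output above, assuming np.savetxt is the intended approach. The file name array.csv and the use of the temporary directory are illustrative choices, not taken from the original program.

```python
import os
import tempfile

import numpy as np

# 3x3 integer array matching the CSV rows shown in the output
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Hypothetical output location: a file named array.csv in the temp directory
csv_path = os.path.join(tempfile.gettempdir(), "array.csv")

# fmt="%d" writes integers without a decimal point; delimiter="," makes it CSV
np.savetxt(csv_path, array, delimiter=",", fmt="%d")

# Read the file back to confirm what was written
with open(csv_path) as f:
    csv_text = f.read()
print(csv_text)
```

Without fmt="%d", np.savetxt writes values in scientific notation (for example 1.000000000000000000e+00), which would not match the output shown.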
3. i. Write a NumPy program to perform the basic arithmetic operations
Program:
import numpy as np
# Addition
# Subtraction
# Multiplication
# Division
output:
Array 1: [10 20 30 40 50]
Array 2: [1 2 3 4 5]
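Only the section comments of this program survive. A minimal sketch of the four operations, using the two arrays shown in the output; the result variable names and print labels are assumptions.

```python
import numpy as np

# Input arrays taken from the output lines above
array1 = np.array([10, 20, 30, 40, 50])
array2 = np.array([1, 2, 3, 4, 5])

# Element-wise arithmetic; NumPy applies each operator position by position
addition = array1 + array2
subtraction = array1 - array2
multiplication = array1 * array2
division = array1 / array2  # true division always returns floats

print("Array 1:", array1)
print("Array 2:", array2)
print("Addition:", addition)
print("Subtraction:", subtraction)
print("Multiplication:", multiplication)
print("Division:", division)
```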
Program:
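import numpy as np
# 3x3 array matching the "Original Array" output
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
transposed_array = np.transpose(array)
# Equivalent shorthand:
# transposed_array = array.T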
print("Original Array:")
print(array)
print("\nTransposed Array:")
print(transposed_array)
output:
Original Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Transposed Array:
[[1 4 7]
 [2 5 8]
 [3 6 9]]
4. i. Using NumPy, create an array with 5 dimensions and verify that it has 5
dimensions.
Program:
import numpy as np
output:
Array Shape: (2, 3, 4, 5, 6)
Number of Dimensions: 5
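The array-creation lines are missing from the program. A minimal sketch that reproduces the shape and dimension count shown above, assuming the array is filled with zeros (the original may have used random or other values).

```python
import numpy as np

# Shape taken from the output; the zero fill is an assumption
array_5d = np.zeros((2, 3, 4, 5, 6))

# .shape lists the length of each axis, .ndim counts the axes
print("Array Shape:", array_5d.shape)
print("Number of Dimensions:", array_5d.ndim)
```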
sorted_array = np.sort(boolean_array)
output:
Original Boolean Array: [ True False True False True False]
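Only the np.sort call and one output line survive from this program. A minimal sketch, assuming the input is the boolean array shown in the output; the "Sorted Boolean Array" label is an assumption.

```python
import numpy as np

# Values taken from the "Original Boolean Array" output line
boolean_array = np.array([True, False, True, False, True, False])

# np.sort orders booleans with False (0) before True (1)
sorted_array = np.sort(boolean_array)

print("Original Boolean Array:", boolean_array)
print("Sorted Boolean Array:", sorted_array)
```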
data = {
df = pd.DataFrame(data)
print(df)
output:
      Name  Age         City
0    Alice   24     New York
1      Bob   27  Los Angeles
2  Charlie   22      Chicago
3    David   32      Houston
4      Eve   29      Phoenix
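The dictionary in this program is cut off after its opening brace. A minimal sketch that rebuilds it from the printed table, assuming each column was given as a plain Python list.

```python
import pandas as pd

# Column values reconstructed from the printed DataFrame
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [24, 27, 22, 32, 29],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
}

# Each dict key becomes a column; rows get the default 0..4 index
df = pd.DataFrame(data)
print(df)
```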
ii. Create your own DataFrame from a dict of ndarray/list.
Program:
import pandas as pd
import numpy as np
data = {
df = pd.DataFrame(data)
print(df)
output:
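The dictionary contents and the output of this program are missing. A minimal sketch of the idea, mixing a Python list and a NumPy ndarray as dict values; the column names and values here are illustrative assumptions, not the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one column as a list, one as an ndarray
data = {
    "Name": ["Alice", "Bob", "Charlie"],   # plain Python list
    "Age": np.array([24, 27, 22]),         # NumPy ndarray
}

# All values must have the same length, whichever container they use
df = pd.DataFrame(data)
print(df)
print(df.dtypes)
```

pandas raises a ValueError if the lists or arrays differ in length, so that is the first thing to check when the constructor fails.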
Program:
import pandas as pd
data = {
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
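# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement (assumes new_row is a dict)
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)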
print(df)
print(sliced_df)
print(df)
df = df.drop(2)
print("\nDataFrame after deleting row with index 2:")
print(df)
output:
Original DataFrame:
2 Charlie 22 Chicago
3 David 32 Houston
2 Charlie 22 Chicago
3 David 32 Houston
4 Eve 29 Phoenix
2 Charlie 22 Chicago
2 Charlie 22 Chicago
3 David 32 Houston
4 Eve 29 Phoenix
5 Frank 30 Dallas
DataFrame after deleting row with index 2:
3 David 32 Houston
4 Eve 29 Phoenix
5 Frank 30 Dallas
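Several lines of this program and its output are missing. A minimal sketch of the three row operations (append, slice, delete), rebuilt from the surviving output rows. The slice range 2:4 is an assumption based on the rows that appear in the output.

```python
import pandas as pd

# Starting data, taken from the rows visible in the output
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Age": [24, 27, 22, 32, 29],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
})

# Append a row with pd.concat (DataFrame.append no longer exists in pandas 2.x)
new_row = {"Name": "Frank", "Age": 30, "City": "Dallas"}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

# Slice rows by position: positions 2 and 3 (assumed range)
sliced_df = df.iloc[2:4]
print(sliced_df)

# Delete the row whose index label is 2; drop returns a new DataFrame
df_after_drop = df.drop(2)
print("\nDataFrame after deleting row with index 2:")
print(df_after_drop)
```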
7. i. Using Pandas, create a DataFrame from a list of dictionaries, with row
indices and column indices.
Program:
import pandas as pd
data = [
print(df)
output:
Name Age City
C Charlie 22 Chicago
D David 32 Houston
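The list of dictionaries is cut off in the program. A minimal sketch, assuming four records with row labels A to D. Only rows C and D appear in the surviving output, so the A and B rows are assumptions.

```python
import pandas as pd

# One dictionary per row; rows C and D match the output, A and B are assumed
data = [
    {"Name": "Alice", "Age": 24, "City": "New York"},
    {"Name": "Bob", "Age": 27, "City": "Los Angeles"},
    {"Name": "Charlie", "Age": 22, "City": "Chicago"},
    {"Name": "David", "Age": 32, "City": "Houston"},
]

# index sets the row labels; columns fixes the column order
df = pd.DataFrame(data, index=["A", "B", "C", "D"], columns=["Name", "Age", "City"])
print(df)
```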
ii. Use index label to delete or drop rows from a Pandas DataFrame.
Program:
import pandas as pd
data = {
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# 1. Drop a row by index label (e.g., drop row with index 'B')
df_dropped = df.drop('B')
print(df_dropped)
# 2. Drop multiple rows by index labels (e.g., drop rows with index 'A' and 'D')
df_dropped_multiple = df.drop(['A', 'D'])
print(df_dropped_multiple)
# 3. Drop a row in-place (this will modify the original DataFrame)
df.drop('C', inplace=True)
print(df)
output:
Original DataFrame:
C Charlie 22 Chicago
D David 32 Houston
C Charlie 22 Chicago
D David 32 Houston
C Charlie 22 Chicago
D David 32 Houston
8. Using the Pandas library:
i. Load the iris.csv file.
ii. Convert it into a DataFrame and read it.
iii. Display only the records with species "Iris-setosa".
Program:
import pandas as pd
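# Step 1: Load the CSV file into a DataFrame
df = pd.read_csv('iris.csv')
# Step 2: Display the entire DataFrame or the first few rows to ensure it's loaded correctly
print("First few records of the DataFrame:")
print(df.head())
# Step 3: Keep only the Iris-setosa records (assumes the column is named 'species')
setosa_df = df[df['species'] == 'Iris-setosa']
print(setosa_df)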
output:
First few records of the DataFrame:
...
9. Use the diabetes data set from UCI to perform univariate analysis.
Program:
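import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# You can replace this URL with the actual URL of the dataset or load it from a local file.
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'
# The file has no header row, so the column names are supplied here
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
           'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']
df = pd.read_csv(url, names=columns)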
print(df.head())
print("\nSummary Statistics:")
print(df.describe())
df.hist(bins=20, figsize=(15,10))
plt.tight_layout()
plt.show()
plt.figure(figsize=(15, 10))
sns.boxplot(data=df)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
sns.countplot(x='Outcome', data=df)
plt.show()
output:
10. Use the Pima Indians Diabetes data set to perform bivariate
analysis.
Program:
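import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Pima Indians Diabetes data (URL assumed to be the one used for the univariate analysis)
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'
# The file has no header row, so the column names are supplied here
columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
           'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']
df = pd.read_csv(url, names=columns)
print(df.head())
# Step 1: Correlation heatmap (the heatmap call is assumed; the original plotting line is missing)
plt.figure(figsize=(10, 8))
correlation_matrix = df.corr()
sns.heatmap(correlation_matrix, annot=True)
plt.show()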
plt.figure(figsize=(15, 10))
plt.subplot(2, 3, 1)
plt.title('Glucose vs Outcome')
plt.subplot(2, 3, 2)
plt.title('BMI vs Outcome')
plt.subplot(2, 3, 3)
sns.scatterplot(x='Age', y='Outcome', data=df)
plt.title('Age vs Outcome')
plt.subplot(2, 3, 4)
plt.title('Insulin vs Outcome')
plt.subplot(2, 3, 5)
plt.title('BloodPressure vs Outcome')
plt.subplot(2, 3, 6)
plt.title('Pregnancies vs Outcome')
plt.tight_layout()
plt.show()
# Step 3: Pairplot to visualize the relationships between multiple features and 'Outcome'
sns.pairplot(df, hue='Outcome')  # call assumed; the original plotting line is missing
plt.show()
output:
import pandas as pd
import statsmodels.api as sm
data = {
'CO2': [120, 110, 140, 200, 180] # CO2 emissions in grams per km
# Convert to DataFrame
df = pd.DataFrame(data)
# Since we are predicting CO2 based on Volume and Weight, we can drop 'Company Name' and
# 'Model' for now
X = df[['Volume', 'Weight']]
y = df['CO2']
X = sm.add_constant(X)  # add the intercept column
model = sm.OLS(y, X).fit()  # ordinary least squares fit
print(model.summary())
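from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import matplotlib.pyplot as plt
regressor = LinearRegression()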
# Train the model
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
print(f"Coefficients: {regressor.coef_}")
print(f"Intercept: {regressor.intercept_}")
r2 = r2_score(y_test, y_pred)
print(f"R-squared: {r2}")
plt.scatter(y_test, y_pred)
plt.xlabel("Actual CO2")
plt.ylabel("Predicted CO2")
plt.show()
output:
12. Perform bivariate analysis using a pandas DataFrame that contains
information about two variables: (1) hours spent studying and (2) exam score
received by 20 different students.
Program:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = {
    'Hours Studying': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
    'Exam Score': [35, 40, 50, 60, 65, 70, 75, 80, 85, 88, 90, 92, 94, 95, 96, 98, 99, 99, 100, 100]
}
df = pd.DataFrame(data)
print("Descriptive Statistics:")
print(df.describe())
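plt.figure(figsize=(8, 6))
# Scatter plot of the two variables (call assumed; the original plotting line is missing)
plt.scatter(df['Hours Studying'], df['Exam Score'], label='Students')
plt.xlabel('Hours Studying')
plt.ylabel('Exam Score')
plt.grid(True)
plt.legend()
plt.show()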
plt.xlabel('Hours Studying')
plt.ylabel('Exam Score')
plt.show()
output:
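The plots are not reproduced in this record. As a numeric companion to the scatter plot, a minimal sketch that quantifies the relationship with the Pearson correlation coefficient and a least-squares line. It uses the data from the program; the choice of np.corrcoef and np.polyfit is an assumption, not taken from the original.

```python
import numpy as np

# Data from the program above
hours = np.arange(1, 21)
scores = np.array([35, 40, 50, 60, 65, 70, 75, 80, 85, 88,
                   90, 92, 94, 95, 96, 98, 99, 99, 100, 100])

# Pearson correlation: off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(hours, scores)[0, 1]

# Degree-1 polynomial fit gives the slope and intercept of the best-fit line
slope, intercept = np.polyfit(hours, scores, 1)

print(f"Pearson correlation: {r:.3f}")
print(f"Fitted line: score = {slope:.2f} * hours + {intercept:.2f}")
```

The correlation is strongly positive, but the scores flatten out near 100, so a straight line is only a rough summary of this data.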
Program:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import skew, kurtosis
data = {
df = pd.DataFrame(data)
print("Descriptive Statistics:")
print(df.describe())
plt.figure(figsize=(12, 6))
plt.subplot(1, 3, 1)
plt.title('Distribution of Points')
plt.xlabel('Points')
plt.ylabel('Frequency')
plt.subplot(1, 3, 2)
plt.title('Distribution of Assists')
plt.xlabel('Assists')
plt.ylabel('Frequency')
plt.subplot(1, 3, 3)
plt.title('Distribution of Rebounds')
plt.xlabel('Rebounds')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()
plt.figure(figsize=(12, 6))
plt.subplot(1, 3, 1)
sns.boxplot(y=df['points'], color='blue')
plt.title('Boxplot of Points')
plt.subplot(1, 3, 2)
sns.boxplot(y=df['assists'], color='green')
plt.title('Boxplot of Assists')
plt.subplot(1, 3, 3)
sns.boxplot(y=df['rebounds'], color='red')
plt.title('Boxplot of Rebounds')
plt.tight_layout()
plt.show()
points_skew = skew(df['points'])
points_kurt = kurtosis(df['points'])
assists_skew = skew(df['assists'])
assists_kurt = kurtosis(df['assists'])
rebounds_skew = skew(df['rebounds'])
rebounds_kurt = kurtosis(df['rebounds'])
output:
14. i) Using various functions in the NumPy library, mathematically calculate the
values for a normal distribution and create histograms to plot the probability
distribution curve.
Program:
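import numpy as np
import matplotlib.pyplot as plt
# Draw samples from a normal distribution (mean, standard deviation and sample size are assumed values)
mu, sigma = 0, 1
data = np.random.normal(mu, sigma, 1000)
plt.figure(figsize=(10, 6))
count, bins, ignored = plt.hist(data, bins=30, density=True, alpha=0.6, color='g')
# Overlay the calculated probability density curve (line assumed; the original is missing)
pdf = 1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(bins - mu) ** 2 / (2 * sigma ** 2))
plt.plot(bins, pdf, linewidth=2, color='r')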
plt.xlabel("Data points")
plt.ylabel("Density")
plt.grid(True)
plt.show()
output:
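The plot is not reproduced in this record. A minimal numeric sketch of the same idea: compute the normal density f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi)) with NumPy and compare it with a density-normalized histogram of random samples. The mean, standard deviation, sample size and seed are assumed values.

```python
import numpy as np

mu, sigma = 0.0, 1.0                 # assumed parameters
rng = np.random.default_rng(0)       # fixed seed so the run is repeatable
data = rng.normal(mu, sigma, 10_000)

# density=True scales the bar heights so the total bar area is 1
counts, bin_edges = np.histogram(data, bins=30, density=True)
centers = (bin_edges[:-1] + bin_edges[1:]) / 2

# Normal probability density evaluated at the bin centers
pdf = np.exp(-(centers - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Total histogram area, and the largest gap between histogram and formula
area = np.sum(counts * np.diff(bin_edges))
max_gap = np.max(np.abs(counts - pdf))

print(f"Histogram area: {area:.4f}")
print(f"Largest difference from the density formula: {max_gap:.4f}")
```

With more samples the gap shrinks, which is the numeric counterpart of the histogram bars hugging the curve in the plot.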
14. ii) Using the plt.contour(), plt.contourf(), plt.imshow(), plt.colorbar(), and plt.clabel()
functions, visualize a contour plot.
Program:
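import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) * np.cos(Y)  # example surface (assumed; the original definition is missing)
# Filled contours as the background
plt.contourf(X, Y, Z, levels=20, cmap='viridis')
# Add a colorbar
plt.colorbar()
# Labelled contour lines on top of the filled contours
contours = plt.contour(X, Y, Z, levels=10, colors='black')
plt.clabel(contours, inline=True, fontsize=8)
plt.show()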
output:
15. Make a three-dimensional plot with 50 randomly generated data points for x,
y, and z. Set the point color to red and the point size to 50.
Program:
import matplotlib.pyplot as plt
import numpy as np
x = np.random.rand(50) * 10
y = np.random.rand(50) * 10
z = np.random.rand(50) * 10
# Create a 3D plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Scatter the points in red with marker size 50
ax.scatter(x, y, z, c='red', s=50)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
plt.show()
output: