batch2 ds

Uploaded by

ece apce

1)i. Write a NumPy program to convert a list and tuple into arrays.

Program:
import numpy as np

# Convert a list to a NumPy array

list_data = [1, 2, 3, 4, 5]

array_from_list = np.array(list_data)

print("Array from list:", array_from_list)

# Convert a tuple to a NumPy array

tuple_data = (10, 20, 30, 40, 50)

array_from_tuple = np.array(tuple_data)

print("Array from tuple:", array_from_tuple)

output:
Array from list: [1 2 3 4 5]

Array from tuple: [10 20 30 40 50]
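
As a small follow-up sketch (not part of the original exercise), `np.array` also accepts nested sequences, and the resulting `dtype` is inferred from the elements. The variable names below are illustrative.

```python
import numpy as np

# A nested list becomes a 2-D array; the shape follows the nesting
nested = [[1, 2, 3], [4, 5, 6]]
matrix = np.array(nested)
print("Shape of nested list array:", matrix.shape)

# Mixing ints and floats in a tuple promotes the whole array to float
mixed = np.array((1, 2.5, 3))
print("dtype of mixed tuple array:", mixed.dtype)
```

Checking `shape` and `dtype` right after conversion is a quick way to confirm the array has the layout you expect.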

ii. Write a NumPy program to convert the values of Centigrade degrees into Fahrenheit degrees and vice versa. Values have to be stored into a NumPy array.
Program:
import numpy as np

# Function to convert Centigrade to Fahrenheit

def centigrade_to_fahrenheit(celsius):

    return (celsius * 9/5) + 32

# Function to convert Fahrenheit to Centigrade

def fahrenheit_to_centigrade(fahrenheit):

    return (fahrenheit - 32) * 5/9

# Create a NumPy array of Centigrade temperatures

centigrade_values = np.array([0, 10, 20, 30, 40, 50])

# Convert Centigrade to Fahrenheit

fahrenheit_values = centigrade_to_fahrenheit(centigrade_values)

# Create a NumPy array of Fahrenheit temperatures

fahrenheit_array = np.array([32, 50, 68, 86, 104, 122])

# Convert Fahrenheit to Centigrade

centigrade_from_fahrenheit = fahrenheit_to_centigrade(fahrenheit_array)

# Print the results

print("Centigrade values:", centigrade_values)

print("Converted Fahrenheit values:", fahrenheit_values)

print("\nFahrenheit values:", fahrenheit_array)

print("Converted Centigrade values:", centigrade_from_fahrenheit)

output:
Centigrade values: [ 0 10 20 30 40 50]

Converted Fahrenheit values: [ 32. 50. 68. 86. 104. 122.]

Fahrenheit values: [ 32 50 68 86 104 122]

Converted Centigrade values: [ 0. 10. 20. 30. 40. 50.]
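
One way to sanity-check the two formulas is a round trip: converting to Fahrenheit and back should return the starting values. The sketch below assumes the same sample temperatures as the program above and uses `np.allclose` because the results are floats.

```python
import numpy as np

celsius = np.array([0, 10, 20, 30, 40, 50])

# Centigrade -> Fahrenheit -> Centigrade
fahrenheit = celsius * 9 / 5 + 32
back_to_celsius = (fahrenheit - 32) * 5 / 9

# Compare with a tolerance, since floating-point results are rarely bit-exact
round_trip_ok = np.allclose(back_to_celsius, celsius)
print("Round trip matches:", round_trip_ok)
```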

2. i. Write a NumPy program to find the real and imaginary parts of an array of
complex numbers.
Program:
import numpy as np
# Create a NumPy array of complex numbers
complex_array = np.array([2 + 3j, 4 - 5j, -1 + 2j, 3 + 4j])
# Extract the real parts of the complex numbers
real_parts = np.real(complex_array)
# Extract the imaginary parts of the complex numbers
imaginary_parts = np.imag(complex_array)
# Print the results
print("Complex array:", complex_array)
print("Real parts:", real_parts)
print("Imaginary parts:", imaginary_parts)

output:
Complex array: [ 2.+3.j 4.-5.j -1.+2.j 3.+4.j]
Real parts: [ 2. 4. -1. 3.]
Imaginary parts: [ 3. -5. 2. 4.]
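
The same parts are also available as the array attributes `.real` and `.imag`. The following sketch, with illustrative variable names, uses them to rebuild the original array and to compute the magnitudes with `np.abs`.

```python
import numpy as np

z = np.array([2 + 3j, 4 - 5j, -1 + 2j, 3 + 4j])

# Attribute access is equivalent to np.real / np.imag
real_parts = z.real
imag_parts = z.imag

# Recombine the parts; this should reproduce the original array
rebuilt = real_parts + 1j * imag_parts

# Magnitude |z| = sqrt(real**2 + imag**2)
magnitudes = np.abs(z)
print("Magnitudes:", magnitudes)
```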

ii. Write a NumPy program to convert a NumPy array into a CSV file.
Program:
import numpy as np

# Create a NumPy array

array_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Save the NumPy array into a CSV file

np.savetxt('array_data.csv', array_data, delimiter=',', fmt='%d')

print("Array has been saved to 'array_data.csv'.")

output:
Array has been saved to 'array_data.csv'.

Contents of array_data.csv:
1,2,3
4,5,6
7,8,9
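
To confirm the file was written correctly, it can be read back with `np.loadtxt` and compared with the original array. This is a sketch that writes to a temporary directory (an assumption made here so that no file is left behind).

```python
import os
import tempfile

import numpy as np

array_data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "array_data.csv")

    # Write the array as comma-separated integers
    np.savetxt(csv_path, array_data, delimiter=",", fmt="%d")

    # Read it back with the same delimiter and an integer dtype
    loaded = np.loadtxt(csv_path, delimiter=",", dtype=int)

print("Round trip matches:", np.array_equal(loaded, array_data))
```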
3. i. Write a NumPy program to perform the basic arithmetic operations
Program:
import numpy as np

# Create two NumPy arrays

array1 = np.array([10, 20, 30, 40, 50])

array2 = np.array([1, 2, 3, 4, 5])

# Addition

addition_result = array1 + array2

# Subtraction

subtraction_result = array1 - array2

# Multiplication

multiplication_result = array1 * array2

# Division

division_result = array1 / array2

# Exponentiation (array1 raised to the power of array2)

exponentiation_result = array1 ** array2

# Print the results

print("Array 1:", array1)

print("Array 2:", array2)

print("\nAddition (Array1 + Array2):", addition_result)

print("Subtraction (Array1 - Array2):", subtraction_result)

print("Multiplication (Array1 * Array2):", multiplication_result)

print("Division (Array1 / Array2):", division_result)

print("Exponentiation (Array1 ** Array2):", exponentiation_result)

output:
Array 1: [10 20 30 40 50]

Array 2: [1 2 3 4 5]

Addition (Array1 + Array2): [11 22 33 44 55]

Subtraction (Array1 - Array2): [ 9 18 27 36 45]

Multiplication (Array1 * Array2): [ 10 40 90 160 250]

Division (Array1 / Array2): [10. 10. 10. 10. 10.]

Exponentiation (Array1 ** Array2): [ 10 400 27000 2560000 312500000]
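
Each operator above has an equivalent NumPy function, and integer floor division and remainder are available as well. A brief sketch using the same two arrays:

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])
b = np.array([1, 2, 3, 4, 5])

# Function forms of the operators
sum_ufunc = np.add(a, b)        # same as a + b
power = np.power(a, b)          # same as a ** b

# Integer-preserving division and remainder
floor_div = a // b              # stays an integer array, unlike a / b
remainder = a % b

print("np.power:", power)
print("Floor division:", floor_div)
print("Remainder:", remainder)
```

Note that `a / b` always returns floats, while `a // b` keeps the integer dtype.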

ii.Write a NumPy program to transpose an array.

Program:
import numpy as np

# Create a 2D NumPy array

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Transpose the array

transposed_array = np.transpose(array)

# Alternatively, you can also use the shorthand `.T` to transpose

# transposed_array = array.T

# Print the original and transposed arrays

print("Original Array:")

print(array)

print("\nTransposed Array:")

print(transposed_array)

output:
Original Array:

[[1 2 3]

[4 5 6]

[7 8 9]]

Transposed Array:

[[1 4 7]

[2 5 8]

[3 6 9]]
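
Two details are easy to miss with a square example: transposing swaps the shape of a non-square array, and it has no effect on a 1-D array. The sketch below uses illustrative arrays to show both.

```python
import numpy as np

# Non-square array: shape (2, 3) becomes (3, 2)
rect = np.array([[1, 2, 3], [4, 5, 6]])
rect_t = rect.T
print("Shapes:", rect.shape, "->", rect_t.shape)

# The transpose is a view on the same data, not a copy
shares = np.shares_memory(rect, rect_t)
print("Shares memory with original:", shares)

# A 1-D array is unchanged by .T; reshape to get a column instead
vec = np.array([1, 2, 3])
column = vec.reshape(-1, 1)
print("1-D shape after .T:", vec.T.shape, "column shape:", column.shape)
```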
4) i. Using NumPy, create an array with 5 dimensions and verify that it has 5 dimensions.
Program:
import numpy as np

# Create a 5-dimensional NumPy array with random integers

array_5d = np.random.randint(1, 10, size=(2, 3, 4, 5, 6))

# Verify the number of dimensions using .ndim

print("Array Shape:", array_5d.shape)

print("Number of Dimensions:", array_5d.ndim)

output:
Array Shape: (2, 3, 4, 5, 6)

Number of Dimensions: 5
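
An alternative sketch, assuming only that a 5-dimensional array of any shape is acceptable: the `ndmin` argument of `np.array` pads a smaller array with leading axes of length 1.

```python
import numpy as np

# ndmin=5 prepends axes of length 1 until the array has 5 dimensions
arr = np.array([1, 2, 3, 4], ndmin=5)

print("Array:", arr)
print("Shape:", arr.shape)
print("Number of dimensions:", arr.ndim)
```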

ii. Using NumPy, Sort a boolean array.

Program:
import numpy as np

# Create a boolean NumPy array

boolean_array = np.array([True, False, True, False, True, False])

# Sort the boolean array

sorted_array = np.sort(boolean_array)

# Print the original and sorted arrays

print("Original Boolean Array:", boolean_array)

print("Sorted Boolean Array:", sorted_array)

output:
Original Boolean Array: [ True False True False True False]

Sorted Boolean Array: [False False False True True True]
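
`np.sort` places `False` before `True` because `False` compares as 0 and `True` as 1. As a sketch, a descending order can be obtained by reversing the sorted result, and the number of `True` values can be counted directly.

```python
import numpy as np

flags = np.array([True, False, True, False, True, False])

# Reverse the ascending sort to put True values first
descending = np.sort(flags)[::-1]

# Count the True entries without sorting
true_count = np.count_nonzero(flags)

print("Descending:", descending)
print("Number of True values:", true_count)
```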

5) i. Create your own simple Pandas DataFrame and print its values.
Program:
import pandas as pd

# Create a simple dictionary with data

data = {

    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],

    'Age': [24, 27, 22, 32, 29],

    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']

}

# Create a DataFrame from the dictionary

df = pd.DataFrame(data)

# Print the DataFrame

print(df)

output:
Name Age City
0 Alice 24 New York
1 Bob 27 Los Angeles
2 Charlie 22 Chicago
3 David 32 Houston
4 Eve 29 Phoenix
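
If "values" is read as the underlying data rather than the printed table, `DataFrame.to_numpy()` returns it as a NumPy array. A minimal sketch on a shortened, illustrative version of the DataFrame:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [24, 27],
})

# All columns at once: mixed text and numbers give an object array
values = df.to_numpy()
print(values)

# A single numeric column keeps its numeric dtype
ages = df['Age'].to_numpy()
print(ages)
```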
ii. Create your own DataFrame from a dict of ndarrays/lists.
Program:
import pandas as pd

import numpy as np

# Create a dictionary with NumPy arrays or lists

data = {

    'Product': ['Laptop', 'Phone', 'Tablet', 'Monitor', 'Keyboard'],

    'Price': np.array([1000, 600, 300, 250, 100]),

    'Stock': np.array([50, 200, 150, 80, 500])

}

# Create a DataFrame from the dictionary

df = pd.DataFrame(data)

# Print the DataFrame

print(df)

output:

Product Price Stock

0 Laptop 1000 50
1 Phone 600 200
2 Tablet 300 150
3 Monitor 250 80
4 Keyboard 100 500
6. Perform appending, slicing, addition and deletion of rows with a Pandas
DataFrame.

Program:
import pandas as pd

# Create a simple DataFrame

data = {

    'Name': ['Alice', 'Bob', 'Charlie', 'David'],

    'Age': [24, 27, 22, 32],

    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']

}

df = pd.DataFrame(data)

# Print the original DataFrame

print("Original DataFrame:")

print(df)

# 1. Appending a new row to the DataFrame

new_row = {'Name': 'Eve', 'Age': 29, 'City': 'Phoenix'}

df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)  # DataFrame.append was removed in pandas 2.0

print("\nDataFrame after appending a new row:")

print(df)

# 2. Slicing the DataFrame (selecting specific rows)

sliced_df = df[1:3] # Selecting rows 1 and 2 (indexing starts from 0)

print("\nSliced DataFrame (rows 1 to 2):")

print(sliced_df)

# 3. Adding a new row with 'loc'

df.loc[len(df)] = ['Frank', 30, 'Dallas']

print("\nDataFrame after adding a new row with 'loc':")

print(df)

# 4. Deleting a row (deleting row with index 2)

df = df.drop(2)
print("\nDataFrame after deleting row with index 2:")

print(df)

output:
Original DataFrame:

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

3 David 32 Houston

DataFrame after appending a new row:

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

3 David 32 Houston

4 Eve 29 Phoenix

Sliced DataFrame (rows 1 to 2):

Name Age City

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

DataFrame after adding a new row with 'loc':

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

2 Charlie 22 Chicago

3 David 32 Houston

4 Eve 29 Phoenix

5 Frank 30 Dallas
DataFrame after deleting row with index 2:

Name Age City

0 Alice 24 New York

1 Bob 27 Los Angeles

3 David 32 Houston

4 Eve 29 Phoenix

5 Frank 30 Dallas
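
One detail worth checking after a deletion: `drop` leaves a gap in the index, so `df.loc[len(df)] = ...` may overwrite an existing row instead of adding a new one. A minimal sketch on a smaller, hypothetical DataFrame, using `reset_index(drop=True)` to renumber the rows first:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
})

df = df.drop(2)
index_after_drop = df.index.tolist()      # labels 0, 1, 3 (gap at 2)

# len(df) is now 3 and label 3 already belongs to David,
# so df.loc[len(df)] = ... would overwrite that row.
df = df.reset_index(drop=True)            # renumber rows 0..n-1
index_after_reset = df.index.tolist()

df.loc[len(df)] = ['Grace', 41]           # now adds a genuinely new row
print(df)
```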

7.i. Using Pandas, Create a DataFrame with a list of dictionaries, row indices,
and column indices.
Program:

import pandas as pd

# Create a list of dictionaries

data = [

    {'Name': 'Alice', 'Age': 24, 'City': 'New York'},

    {'Name': 'Bob', 'Age': 27, 'City': 'Los Angeles'},

    {'Name': 'Charlie', 'Age': 22, 'City': 'Chicago'},

    {'Name': 'David', 'Age': 32, 'City': 'Houston'}

]

# Define custom row indices and column indices

row_indices = ['A', 'B', 'C', 'D']

column_indices = ['Name', 'Age', 'City']

# Create the DataFrame

df = pd.DataFrame(data, index=row_indices, columns=column_indices)

# Print the DataFrame

print(df)
output:
Name Age City

A Alice 24 New York

B Bob 27 Los Angeles

C Charlie 22 Chicago

D David 32 Houston
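
When the dictionaries do not all share the same keys, pandas fills the missing cells with `NaN`. The sketch below uses two illustrative records, the second without a `City` key.

```python
import pandas as pd

records = [
    {'Name': 'Alice', 'Age': 24, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 27},               # no 'City' key
]

df = pd.DataFrame(records, index=['A', 'B'], columns=['Name', 'Age', 'City'])
print(df)

# The missing cell is reported as NaN
city_missing = pd.isna(df.loc['B', 'City'])
print("City missing for row B:", city_missing)
```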

ii. Use index label to delete or drop rows from a Pandas DataFrame.
Program:
import pandas as pd

# Create a simple DataFrame

data = {

    'Name': ['Alice', 'Bob', 'Charlie', 'David'],

    'Age': [24, 27, 22, 32],

    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']

}

df = pd.DataFrame(data)

# Set custom row indices

df.index = ['A', 'B', 'C', 'D']

# Print the original DataFrame

print("Original DataFrame:")

print(df)

# 1. Drop a row by index label (e.g., drop row with index 'B')

df_dropped = df.drop('B')

print("\nDataFrame after dropping row with index 'B':")

print(df_dropped)

# 2. Drop multiple rows by index labels (e.g., drop rows with index 'A' and 'D')

df_dropped_multiple = df.drop(['A', 'D'])

print("\nDataFrame after dropping rows with index 'A' and 'D':")

print(df_dropped_multiple)
# 3. Drop a row in-place (this will modify the original DataFrame)

df.drop('C', inplace=True)

print("\nDataFrame after dropping row with index 'C' in-place:")

print(df)

output:
Original DataFrame:

Name Age City

A Alice 24 New York

B Bob 27 Los Angeles

C Charlie 22 Chicago

D David 32 Houston

DataFrame after dropping row with index 'B':

Name Age City

A Alice 24 New York

C Charlie 22 Chicago

D David 32 Houston

DataFrame after dropping rows with index 'A' and 'D':

Name Age City

B Bob 27 Los Angeles

C Charlie 22 Chicago

DataFrame after dropping row with index 'C' in-place:

Name Age City

A Alice 24 New York

B Bob 27 Los Angeles

D David 32 Houston
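
By default `drop` raises a `KeyError` when a label does not exist. The sketch below, on a small illustrative DataFrame, shows that behaviour and the `errors='ignore'` option that skips missing labels.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22]},
    index=['A', 'B', 'C'],
)

# Dropping a label that is not in the index raises KeyError
try:
    df.drop('Z')
    missing_label_raised = False
except KeyError:
    missing_label_raised = True

# errors='ignore' returns the DataFrame unchanged instead
unchanged = df.drop('Z', errors='ignore')

# Dropping by position: look the label up through df.index first
without_first = df.drop(df.index[0])

print("KeyError raised:", missing_label_raised)
print(without_first)
```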
8. Using the Pandas library:
i. Load the iris.csv file.
ii. Convert it into a DataFrame and read it.
iii. Display only the records with species "Iris-setosa".
Program:
import pandas as pd

# Step 1: Load the iris CSV file into a Pandas DataFrame

# Replace 'iris.csv' with the correct file path if necessary

df = pd.read_csv('iris.csv')

# Step 2: Display the entire DataFrame or the first few rows to ensure it's loaded correctly

print("First few records of the DataFrame:")

print(df.head())

# Step 3: Display only the records with species 'Iris-setosa'

setosa_df = df[df['species'] == 'Iris-setosa']

# Display the filtered DataFrame

print("\nRecords with species 'Iris-setosa':")

print(setosa_df)

output:
First few records of the DataFrame:

sepal_length sepal_width petal_length petal_width species

0 5.1 3.5 1.4 0.2 Iris-setosa

1 4.9 3.0 1.4 0.2 Iris-setosa

2 4.7 3.2 1.3 0.2 Iris-setosa

3 4.6 3.1 1.5 0.2 Iris-setosa

4 5.0 3.6 1.4 0.2 Iris-setosa

Records with species 'Iris-setosa':

sepal_length sepal_width petal_length petal_width species

0 5.1 3.5 1.4 0.2 Iris-setosa

1 4.9 3.0 1.4 0.2 Iris-setosa

2 4.7 3.2 1.3 0.2 Iris-setosa

3 4.6 3.1 1.5 0.2 Iris-setosa

4 5.0 3.6 1.4 0.2 Iris-setosa

...
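
Column names and species labels differ between copies of the iris file, so it may help to inspect them before filtering. The sketch below reads a few illustrative rows from an in-memory string instead of a file; the column names are assumed to match the output above.

```python
import io

import pandas as pd

# A handful of rows standing in for iris.csv (illustrative only)
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

df = pd.read_csv(io.StringIO(csv_text))

# Check the column names and the labels actually present
print(df.columns.tolist())
print(df['species'].value_counts())

# Filter with a boolean mask on the species column
setosa_df = df[df['species'] == 'Iris-setosa']
print(setosa_df)
```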

9. Use the diabetes data set from UCI to perform univariate analysis.
Program:

import pandas as pd

import matplotlib.pyplot as plt

import seaborn as sns

# Step 1: Load the diabetes dataset from the UCI repository

# You can replace this URL with the actual URL of the dataset or load it from a local file.

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'

columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',

'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

df = pd.read_csv(url, names=columns)

# Step 2: Check the first few rows of the dataset

print(df.head())

# Step 3: Summary statistics for numerical features

print("\nSummary Statistics:")

print(df.describe())

# Step 4: Visualizing the distribution of each feature (Univariate Analysis)

# Histograms for all features

df.hist(bins=20, figsize=(15,10))

plt.tight_layout()

plt.show()

# Step 5: Boxplots for all features to check for outliers

plt.figure(figsize=(15, 10))
sns.boxplot(data=df)

plt.xticks(rotation=45)

plt.tight_layout()

plt.show()

# Step 6: Checking the distribution of 'Outcome' (Diabetes status)

sns.countplot(x='Outcome', data=df)

plt.title('Distribution of Outcome (Diabetes Status)')

plt.show()

output:
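
A point often raised about this dataset is that zeros in columns such as Glucose, BloodPressure, SkinThickness, Insulin and BMI stand in for missing measurements, which distorts histograms and summary statistics. The sketch below shows one possible way to mark them as missing before the analysis; the three rows are hypothetical and only two of the columns are included.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration; not taken from the real dataset
sample = pd.DataFrame({
    'Glucose': [148, 0, 183],
    'BMI': [33.6, 26.6, 0.0],
    'Outcome': [1, 0, 1],
})

# Columns where a zero is not a plausible measurement
zero_as_missing = ['Glucose', 'BMI']

cleaned = sample.copy()
cleaned[zero_as_missing] = cleaned[zero_as_missing].replace(0, np.nan)

# Count missing values per column; Outcome keeps its legitimate zeros
missing_counts = cleaned.isna().sum()
print(missing_counts)
```

`describe()` and `hist()` skip `NaN` values, so the statistics then reflect only the recorded measurements.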

10. Use the Pima Indians Diabetes data set to perform bivariate analysis.
Program:

import pandas as pd

import seaborn as sns

import matplotlib.pyplot as plt

# Load the dataset from the UCI repository or local file

url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv'

columns = ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',

'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']

df = pd.read_csv(url, names=columns)

# Display first few rows of the dataset

print(df.head())

# Step 1: Correlation Heatmap to analyze relationships between numerical features

plt.figure(figsize=(10, 8))

correlation_matrix = df.corr()

sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)

plt.title('Correlation Heatmap of Diabetes Dataset')

plt.show()

# Step 2: Scatter plots between features and target variable 'Outcome'

plt.figure(figsize=(15, 10))

# Plotting scatter plot for 'Glucose' vs 'Outcome'

plt.subplot(2, 3, 1)

sns.scatterplot(x='Glucose', y='Outcome', data=df)

plt.title('Glucose vs Outcome')

# Plotting scatter plot for 'BMI' vs 'Outcome'

plt.subplot(2, 3, 2)

sns.scatterplot(x='BMI', y='Outcome', data=df)

plt.title('BMI vs Outcome')

# Plotting scatter plot for 'Age' vs 'Outcome'

plt.subplot(2, 3, 3)
sns.scatterplot(x='Age', y='Outcome', data=df)

plt.title('Age vs Outcome')

# Plotting scatter plot for 'Insulin' vs 'Outcome'

plt.subplot(2, 3, 4)

sns.scatterplot(x='Insulin', y='Outcome', data=df)

plt.title('Insulin vs Outcome')

# Plotting scatter plot for 'BloodPressure' vs 'Outcome'

plt.subplot(2, 3, 5)

sns.scatterplot(x='BloodPressure', y='Outcome', data=df)

plt.title('BloodPressure vs Outcome')

# Plotting scatter plot for 'Pregnancies' vs 'Outcome'

plt.subplot(2, 3, 6)

sns.scatterplot(x='Pregnancies', y='Outcome', data=df)

plt.title('Pregnancies vs Outcome')

plt.tight_layout()

plt.show()

# Step 3: Pairplot to visualize the relationships between multiple features and 'Outcome'

sns.pairplot(df, hue='Outcome', diag_kind='hist', markers=["o", "s"])

plt.suptitle('Pairplot of Features with Outcome', y=1.02)

plt.show()
output:

11. Perform multiple regression analysis on your own dataset (for example, a car dataset with the fields Company Name, Model, Volume, Weight and CO2) with more than one independent variable, to predict a value based on two or more variables.
Program:
# Import necessary libraries

import pandas as pd

import statsmodels.api as sm

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LinearRegression

from sklearn.metrics import mean_squared_error, r2_score

import matplotlib.pyplot as plt

# Step 1: Create or Load your Dataset

# Sample data representing car information

data = {

    'Company Name': ['Toyota', 'Honda', 'Ford', 'BMW', 'Audi'],

    'Model': ['Corolla', 'Civic', 'Focus', 'X5', 'A4'],

    'Volume': [1.8, 2.0, 1.5, 3.0, 2.5], # Engine volume in liters

    'Weight': [1300, 1200, 1400, 2000, 1800], # Weight in kilograms

    'CO2': [120, 110, 140, 200, 180] # CO2 emissions in grams per km

}

# Convert to DataFrame

df = pd.DataFrame(data)

# Step 2: Preprocess the Data

# Since we are predicting CO2 based on Volume and Weight, we can drop 'Company Name' and 'Model' for now

df = df.drop(columns=['Company Name', 'Model'])

# Independent variables (Volume, Weight)

X = df[['Volume', 'Weight']]

# Dependent variable (CO2)

y = df['CO2']

# Step 3: Add a constant to the independent variables (for intercept)

X = sm.add_constant(X)

# Step 4: Perform Multiple Regression using statsmodels

model = sm.OLS(y, X).fit()

# Step 5: Display the summary of the regression analysis

print("Multiple Regression Analysis Summary (statsmodels):")

print(model.summary())

# Step 6: Perform Multiple Regression using scikit-learn

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(df[['Volume', 'Weight']], df['CO2'], test_size=0.2,

random_state=42)

# Initialize the Linear Regression model

regressor = LinearRegression()
# Train the model

regressor.fit(X_train, y_train)

# Predict on the test set

y_pred = regressor.predict(X_test)

# Step 7: Evaluate the model

print("\nMultiple Regression Analysis using scikit-learn:")

print(f"Coefficients: {regressor.coef_}")

print(f"Intercept: {regressor.intercept_}")

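# Calculate R-squared value and Mean Squared Error (MSE)
# Note: with only 5 rows and test_size=0.2 the test set holds a single sample,
# so R-squared is not well defined (scikit-learn warns and returns NaN); use more rows for a meaningful score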

r2 = r2_score(y_test, y_pred)

mse = mean_squared_error(y_test, y_pred)

print(f"R-squared: {r2}")

print(f"Mean Squared Error: {mse}")

# Step 8: Plotting the results

plt.scatter(y_test, y_pred)

plt.xlabel("Actual CO2")

plt.ylabel("Predicted CO2")

plt.title("Actual vs Predicted CO2")

plt.show()

output:
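
As a cross-check on the fitted coefficients, the same model can be solved directly as a least-squares problem with `np.linalg.lstsq`. This is a sketch on the five sample cars from the program above, fitted on all rows without a train/test split.

```python
import numpy as np

volume = np.array([1.8, 2.0, 1.5, 3.0, 2.5])
weight = np.array([1300, 1200, 1400, 2000, 1800])
co2 = np.array([120, 110, 140, 200, 180])

# Design matrix: a column of ones for the intercept, then the predictors
X = np.column_stack([np.ones_like(volume), volume, weight])

# Solve min ||X @ coef - co2||^2
coef, residuals, rank, _ = np.linalg.lstsq(X, co2, rcond=None)
intercept, b_volume, b_weight = coef

predicted = X @ coef
print("Intercept:", intercept)
print("Volume coefficient:", b_volume)
print("Weight coefficient:", b_weight)
print("Predicted CO2:", predicted)
```

With an intercept column included, the coefficients should agree with an ordinary least squares fit on the same rows.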
12.Perform Bivariate analysis using the pandas DataFrame that contains
information about two variables: (1) Hours spent studying and (2) Exam score
received by 20 different students
Program:
import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

from scipy.stats import pearsonr

# Step 1: Create the DataFrame

data = {

    'Hours Studying': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20],

    'Exam Score': [35, 40, 50, 60, 65, 70, 75, 80, 85, 88, 90, 92, 94, 95, 96, 98, 99, 99, 100, 100]

}

# Convert the dictionary to a pandas DataFrame

df = pd.DataFrame(data)

# Step 2: Descriptive Statistics

print("Descriptive Statistics:")

print(df.describe())

# Step 3: Calculate Correlation

correlation, _ = pearsonr(df['Hours Studying'], df['Exam Score'])

print(f"\nCorrelation between Hours Studying and Exam Score: {correlation:.2f}")

# Step 4: Scatter Plot

plt.figure(figsize=(8, 6))

plt.scatter(df['Hours Studying'], df['Exam Score'], color='blue', label='Data Points')

plt.title('Hours Studying vs Exam Score')

plt.xlabel('Hours Studying')

plt.ylabel('Exam Score')

plt.grid(True)

plt.legend()
plt.show()

# Step 5: Linear Regression Line (Fit a regression line)

sns.regplot(x='Hours Studying', y='Exam Score', data=df, scatter_kws={'color':'blue'},

line_kws={'color':'red'})

plt.title('Linear Regression Line: Hours Studying vs Exam Score')

plt.xlabel('Hours Studying')

plt.ylabel('Exam Score')

plt.show()

output:
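
The correlation and the slope of the regression line can also be computed with NumPy alone, which gives a quick numerical check on the plots. A sketch using the same 20 observations:

```python
import numpy as np

hours = np.arange(1, 21)
scores = np.array([35, 40, 50, 60, 65, 70, 75, 80, 85, 88,
                   90, 92, 94, 95, 96, 98, 99, 99, 100, 100])

# Pearson correlation: off-diagonal entry of the 2x2 correlation matrix
r = np.corrcoef(hours, scores)[0, 1]

# Straight-line fit: polyfit returns the slope first, then the intercept
slope, intercept = np.polyfit(hours, scores, 1)

print(f"Pearson r: {r:.2f}")
print(f"Fitted line: score = {slope:.2f} * hours + {intercept:.2f}")
```

The scores flatten out near 100, so a straight line is only an approximation for the upper range of study hours.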

13. Perform Univariate analysis with the following pandas DataFrame:
'points': [1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2]
'assists': [5, 7, 7, 9, 12, 9, 9, 4, 6, 8, 8, 9, 3, 2, 6]
'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 6, 6, 7, 8, 7, 9, 15]

Program:
import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

import seaborn as sns

# Step 1: Create the DataFrame

data = {

    'points': [1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2],

    'assists': [5, 7, 7, 9, 12, 9, 9, 4, 6, 8, 8, 9, 3, 2, 6],

    'rebounds': [11, 8, 10, 6, 6, 5, 9, 12, 6, 6, 7, 8, 7, 9, 15]

}

# Convert the dictionary to a pandas DataFrame

df = pd.DataFrame(data)

# Step 2: Descriptive Statistics for each column

print("Descriptive Statistics:")

print(df.describe())

# Step 3: Visualizing the Distribution of each variable

# Plot histograms for each variable

plt.figure(figsize=(12, 6))

# Histogram for 'points'

plt.subplot(1, 3, 1)

sns.histplot(df['points'], kde=True, color='blue', bins=10)

plt.title('Distribution of Points')

plt.xlabel('Points')

plt.ylabel('Frequency')

# Histogram for 'assists'

plt.subplot(1, 3, 2)

sns.histplot(df['assists'], kde=True, color='green', bins=10)

plt.title('Distribution of Assists')
plt.xlabel('Assists')

plt.ylabel('Frequency')

# Histogram for 'rebounds'

plt.subplot(1, 3, 3)

sns.histplot(df['rebounds'], kde=True, color='red', bins=10)

plt.title('Distribution of Rebounds')

plt.xlabel('Rebounds')

plt.ylabel('Frequency')

plt.tight_layout()

plt.show()

# Step 4: Box plots to visualize outliers

plt.figure(figsize=(12, 6))

# Box plot for 'points'

plt.subplot(1, 3, 1)

sns.boxplot(y=df['points'], color='blue')

plt.title('Boxplot of Points')

# Box plot for 'assists'

plt.subplot(1, 3, 2)

sns.boxplot(y=df['assists'], color='green')

plt.title('Boxplot of Assists')

# Box plot for 'rebounds'

plt.subplot(1, 3, 3)

sns.boxplot(y=df['rebounds'], color='red')

plt.title('Boxplot of Rebounds')
plt.tight_layout()

plt.show()

# Step 5: Skewness and Kurtosis

from scipy.stats import skew, kurtosis

# Skewness and Kurtosis for 'points'

points_skew = skew(df['points'])

points_kurt = kurtosis(df['points'])

# Skewness and Kurtosis for 'assists'

assists_skew = skew(df['assists'])

assists_kurt = kurtosis(df['assists'])

# Skewness and Kurtosis for 'rebounds'

rebounds_skew = skew(df['rebounds'])

rebounds_kurt = kurtosis(df['rebounds'])

print("\nSkewness and Kurtosis:")

print(f"Points: Skewness = {points_skew:.2f}, Kurtosis = {points_kurt:.2f}")

print(f"Assists: Skewness = {assists_skew:.2f}, Kurtosis = {assists_kurt:.2f}")

print(f"Rebounds: Skewness = {rebounds_skew:.2f}, Kurtosis = {rebounds_kurt:.2f}")

output:
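
The box plots flag outliers visually. The sketch below applies the usual 1.5 x IQR rule numerically to the 'points' column; the rule and its threshold are a common convention assumed here, not something the exercise prescribes.

```python
import numpy as np

points = np.array([1, 1, 2, 3.5, 4, 4, 4, 5, 5, 6.5, 7, 7.4, 8, 13, 14.2])

# Quartiles and interquartile range
q1, q3 = np.percentile(points, [25, 75])
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = points[(points < lower_fence) | (points > upper_fence)]
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("Outliers in points:", outliers)
```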
14. i) Using various functions in numpy library, mathematically calculate the
values for a normal distribution and create Histograms to plot the probability
distribution curve.
Program:
import numpy as np

import matplotlib.pyplot as plt

# Step 1: Parameters for the normal distribution

mu = 0 # Mean of the distribution

sigma = 1 # Standard deviation

size = 10000 # Number of data points to generate

# Step 2: Generate random samples from a normal distribution

data = np.random.normal(mu, sigma, size)

# Step 3: Plot the histogram

plt.figure(figsize=(10, 6))
count, bins, ignored = plt.hist(data, bins=30, density=True, alpha=0.6, color='g')

# Step 4: Calculate the Probability Density Function (PDF)

# Define the normal distribution function

def normal_distribution(x, mu, sigma):

    return (1/np.sqrt(2 * np.pi * sigma**2)) * np.exp(-0.5 * ((x - mu) / sigma)**2)

# Step 5: Generate points for the normal distribution curve

x_values = np.linspace(min(bins), max(bins), 100)

pdf_values = normal_distribution(x_values, mu, sigma)

# Step 6: Plot the PDF curve over the histogram

plt.plot(x_values, pdf_values, 'k', linewidth=2)

plt.title("Normal Distribution with Histogram")

plt.xlabel("Data points")

plt.ylabel("Density")

plt.grid(True)

plt.show()

output:
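
Two numerical checks can complement the plot: the density should integrate to about 1, and the sample mean and standard deviation should be close to `mu` and `sigma`. The sketch below uses a seeded generator and a simple Riemann sum; the seed and the integration range are arbitrary choices.

```python
import numpy as np

mu, sigma = 0, 1

# Area under the PDF over [-6, 6], approximated by a Riemann sum
x = np.linspace(-6, 6, 2001)
dx = x[1] - x[0]
pdf = (1 / np.sqrt(2 * np.pi * sigma**2)) * np.exp(-0.5 * ((x - mu) / sigma) ** 2)
area = np.sum(pdf) * dx

# Sample statistics from a reproducible random generator
rng = np.random.default_rng(0)
samples = rng.normal(mu, sigma, 10000)

print(f"Area under PDF: {area:.4f}")
print(f"Sample mean: {samples.mean():.3f}, sample std: {samples.std():.3f}")
```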
14.ii) Using plt.contour(), plt.contourf(), plt.imshow(), plt.colorbar(), plt.clabel()
functions visualize a contour plot.
Program:
import numpy as np

import matplotlib.pyplot as plt

# Create some sample data

x = np.linspace(-3, 3, 100)

y = np.linspace(-3, 3, 100)

X, Y = np.meshgrid(x, y)

Z = np.sin(X**2 + Y**2) / (X**2 + Y**2)

# Create a contour plot

plt.contour(X, Y, Z, levels=20, cmap='viridis')

# Create a filled contour plot

plt.contourf(X, Y, Z, levels=20, cmap='viridis', alpha=0.7)

# Add a colorbar

plt.colorbar()

# Add labels to the contour lines

plt.clabel(plt.contour(X, Y, Z, levels=20, colors='k'), inline=True, fontsize=10)

# Display the plot

plt.show()

output:
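
The task statement also lists `plt.imshow()`, which the program above does not call. The sketch below shows one way it could be combined with labelled contour lines, using the axes-level equivalents of the `plt` functions. The non-interactive `Agg` backend is an assumption made so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to display a window
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-3, 3, 100)
y = np.linspace(-3, 3, 100)
X, Y = np.meshgrid(x, y)
Z = np.sin(X**2 + Y**2) / (X**2 + Y**2)

fig, ax = plt.subplots()

# Show Z as an image; extent maps pixel indices to data coordinates,
# and origin='lower' puts y = -3 at the bottom like the contour plot
image = ax.imshow(Z, extent=[-3, 3, -3, 3], origin="lower", cmap="viridis")
fig.colorbar(image, ax=ax)

# Overlay labelled contour lines on top of the image
lines = ax.contour(X, Y, Z, levels=8, colors="k")
ax.clabel(lines, inline=True, fontsize=8)
```

To view the result, either call `fig.savefig(...)` with a file name of your choice or drop the `matplotlib.use("Agg")` line and finish with `plt.show()`.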
15. Make a three-dimensional plot with 50 randomly generated data points for x, y, and z. Set the point color to red and the point size to 50.

Program:
import matplotlib.pyplot as plt

from mpl_toolkits.mplot3d import Axes3D

import numpy as np

# Generate 50 random data points for x, y, and z

np.random.seed(42) # Set a seed for reproducibility

x = np.random.rand(50) * 10

y = np.random.rand(50) * 10

z = np.random.rand(50) * 10

# Create a 3D plot

fig = plt.figure()

ax = fig.add_subplot(111, projection='3d')

# Plot the points with specified color and size

ax.scatter(x, y, z, c='red', s=50)

# Set labels for axes

ax.set_xlabel('X')

ax.set_ylabel('Y')

ax.set_zlabel('Z')

# Show the plot

plt.show()

output:
