
WEEK 7

ANOVA - ANALYSIS OF VARIANCE

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import skew, kurtosis
import statsmodels.api as sm
%matplotlib inline

# Load the dataset
path = 'Imports_Autos_85.csv'
headers = ["symboling", "normalized-losses", "make", "fuel-type", "aspiration",
"num-of-doors", "body-style", "drive-wheels", "engine-location",
"wheel-base", "length", "width", "height", "curb-weight", "engine-type",
"num-of-cylinders", "engine-size", "fuel-system", "bore", "stroke",
"compression-ratio", "horsepower", "peak-rpm", "city-mpg", "highway-mpg", "price"]
df = pd.read_csv(path, names=headers)

# Display the first few rows of the dataset
print(df.head())

symboling normalized-losses make fuel-type aspiration num-of-doors \
0 3 ? alfa-romero gas std two
1 3 ? alfa-romero gas std two
2 1 ? alfa-romero gas std two
3 2 164 audi gas std four
4 2 164 audi gas std four

body-style drive-wheels engine-location wheel-base ... engine-size \
0 convertible rwd front 88.6 ... 130
1 convertible rwd front 88.6 ... 130
2 hatchback rwd front 94.5 ... 152
3 sedan fwd front 99.8 ... 109
4 sedan 4wd front 99.4 ... 136

fuel-system bore stroke compression-ratio horsepower peak-rpm city-mpg \
0 mpfi 3.47 2.68 9.0 111 5000 21
1 mpfi 3.47 2.68 9.0 111 5000 21
2 mpfi 2.68 3.47 9.0 154 5000 19
3 mpfi 3.19 3.40 10.0 102 5500 24
4 mpfi 3.19 3.40 8.0 115 5500 18

highway-mpg price
0 27 13495
1 27 16500
2 26 16500
3 30 13950
4 22 17450

[5 rows x 26 columns]

# Convert relevant columns to numeric, coercing the '?' placeholders to NaN
numeric_columns = ["symboling", "normalized-losses", "wheel-base", "length", "width", "height",
"curb-weight", "engine-size", "bore", "stroke", "compression-ratio",
"horsepower", "peak-rpm", "city-mpg", "highway-mpg", "price"]
df[numeric_columns] = df[numeric_columns].apply(pd.to_numeric, errors='coerce')

# Fill missing values with the column mean
df[numeric_columns] = df[numeric_columns].fillna(df[numeric_columns].mean())
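An optional sanity check, added here as a sketch: confirm that no missing values remain after the mean-fill.

# Optional check: count remaining NaNs (expected to be 0 after the fill)
print(df[numeric_columns].isna().sum().sum())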

# Display basic statistics
print(df.describe())

symboling normalized-losses wheel-base length width \
count 205.000000 205.000000 205.000000 205.000000 205.000000
mean 0.834146 122.000000 98.756585 174.049268 65.907805
std 1.245307 31.681008 6.021776 12.337289 2.145204
min -2.000000 65.000000 86.600000 141.100000 60.300000
25% 0.000000 101.000000 94.500000 166.300000 64.100000
50% 1.000000 122.000000 97.000000 173.200000 65.500000
75% 2.000000 137.000000 102.400000 183.100000 66.900000
max 3.000000 256.000000 120.900000 208.100000 72.300000

height curb-weight engine-size bore stroke \
count 205.000000 205.000000 205.000000 205.000000 205.000000
mean 53.724878 2555.565854 126.907317 3.329751 3.255423
std 2.443522 520.680204 41.642693 0.270844 0.313597
min 47.800000 1488.000000 61.000000 2.540000 2.070000
25% 52.000000 2145.000000 97.000000 3.150000 3.110000
50% 54.100000 2414.000000 120.000000 3.310000 3.290000
75% 55.500000 2935.000000 141.000000 3.580000 3.410000
max 59.800000 4066.000000 326.000000 3.940000 4.170000


compression-ratio horsepower peak-rpm city-mpg highway-mpg \
count 205.000000 205.000000 205.000000 205.000000 205.000000
mean 10.142537 104.256158 5125.369458 25.219512 30.751220
std 3.972040 39.519211 476.979093 6.542142 6.886443
min 7.000000 48.000000 4150.000000 13.000000 16.000000
25% 8.600000 70.000000 4800.000000 19.000000 25.000000
50% 9.000000 95.000000 5200.000000 24.000000 30.000000
75% 9.400000 116.000000 5500.000000 30.000000 34.000000
max 23.000000 288.000000 6600.000000 49.000000 54.000000

price
count 205.000000
mean 13207.129353
std 7868.768212
min 5118.000000
25% 7788.000000
50% 10595.000000
75% 16500.000000
max 45400.000000

# Visualizing the distribution of the target variable (price)
sns.histplot(df['price'], kde=True)
plt.title('Distribution of Car Prices')
plt.xlabel('Price')
plt.ylabel('Frequency')
plt.show()

# Visualize relationships between variables and price
sns.pairplot(df, x_vars=['engine-size', 'horsepower', 'curb-weight', 'highway-mpg'], y_vars='price', kind='reg')
plt.show()

# Calculate and display skewness and kurtosis
print("Skewness:")
print(df[numeric_columns].skew())
print("\nKurtosis:")
print(df[numeric_columns].kurtosis())

Skewness:
symboling 0.211072

normalized-losses 0.854802
wheel-base 1.050214
length 0.155954
width 0.904003
height 0.063123
curb-weight 0.681398
engine-size 1.947655
bore 0.020211
stroke -0.689784
compression-ratio 2.610862
horsepower 1.397763
peak-rpm 0.073591
city-mpg 0.663704
highway-mpg 0.539997
price 1.827324
dtype: float64

Kurtosis:
symboling -0.676271
normalized-losses 1.404644
wheel-base 1.017039
length -0.082895
width 0.702764
height -0.443812
curb-weight -0.042854
engine-size 5.305682
bore -0.785040
stroke 2.174471
compression-ratio 5.233054
horsepower 2.678182
peak-rpm 0.086770
city-mpg 0.578648
highway-mpg 0.440070
price 3.354216
dtype: float64

numeric_df = df[numeric_columns]
correlation_matrix = numeric_df.corr()
print(correlation_matrix)

symboling normalized-losses wheel-base length \
symboling 1.000000 0.465190 -0.531954 -0.357612
normalized-losses 0.465190 1.000000 -0.056518 0.019209
wheel-base -0.531954 -0.056518 1.000000 0.874587
length -0.357612 0.019209 0.874587 1.000000
width -0.232919 0.084195 0.795144 0.841118
height -0.541038 -0.370706 0.589435 0.491029
curb-weight -0.227691 0.097785 0.776386 0.877728
engine-size -0.105790 0.110997 0.569329 0.683360
bore -0.130083 -0.029266 0.488760 0.606462
stroke -0.008689 0.054929 0.160944 0.129522
compression-ratio -0.178515 -0.114525 0.249786 0.158414
horsepower 0.071389 0.203434 0.351957 0.554434
peak-rpm 0.273679 0.237748 -0.360704 -0.287031
city-mpg -0.035823 -0.218749 -0.470414 -0.670909
highway-mpg 0.034606 -0.178221 -0.544082 -0.704662
price -0.082201 0.133999 0.583168 0.682986

width height curb-weight engine-size bore \
symboling -0.232919 -0.541038 -0.227691 -0.105790 -0.130083
normalized-losses 0.084195 -0.370706 0.097785 0.110997 -0.029266
wheel-base 0.795144 0.589435 0.776386 0.569329 0.488760
length 0.841118 0.491029 0.877728 0.683360 0.606462
width 1.000000 0.279210 0.867032 0.735433 0.559152
height 0.279210 1.000000 0.295572 0.067149 0.171101
curb-weight 0.867032 0.295572 1.000000 0.850594 0.648485
engine-size 0.735433 0.067149 0.850594 1.000000 0.583798
bore 0.559152 0.171101 0.648485 0.583798 1.000000
stroke 0.182939 -0.055351 0.168783 0.203094 -0.055909
compression-ratio 0.181129 0.261214 0.151362 0.028971 0.005201
horsepower 0.642195 -0.110137 0.750968 0.810713 0.575737
peak-rpm -0.219859 -0.320602 -0.266283 -0.244599 -0.254761
city-mpg -0.642704 -0.048640 -0.757414 -0.653658 -0.584508
highway-mpg -0.677218 -0.107358 -0.797465 -0.677470 -0.586992
price 0.728699 0.134388 0.820825 0.861752 0.532300

stroke compression-ratio horsepower peak-rpm \
symboling -0.008689 -0.178515 0.071389 0.273679
normalized-losses 0.054929 -0.114525 0.203434 0.237748
wheel-base 0.160944 0.249786 0.351957 -0.360704
length 0.129522 0.158414 0.554434 -0.287031
width 0.182939 0.181129 0.642195 -0.219859
height -0.055351 0.261214 -0.110137 -0.320602
curb-weight 0.168783 0.151362 0.750968 -0.266283
engine-size 0.203094 0.028971 0.810713 -0.244599
bore -0.055909 0.005201 0.575737 -0.254761
stroke 1.000000 0.186105 0.088264 -0.066844
compression-ratio 0.186105 1.000000 -0.205740 -0.435936
horsepower 0.088264 -0.205740 1.000000 0.130971

peak-rpm -0.066844 -0.435936 0.130971 1.000000
city-mpg -0.042179 0.324701 -0.803162 -0.113723
highway-mpg -0.043961 0.265201 -0.770903 -0.054257
price 0.082095 0.070990 0.757917 -0.100854

city-mpg highway-mpg price
symboling -0.035823 0.034606 -0.082201
normalized-losses -0.218749 -0.178221 0.133999
wheel-base -0.470414 -0.544082 0.583168
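A heatmap makes a 16-column correlation matrix much easier to scan. A minimal sketch, using the matplotlib and seaborn imports already loaded above:

# Sketch: visualize the correlation matrix as a heatmap
plt.figure(figsize=(12, 10))
sns.heatmap(correlation_matrix, cmap='coolwarm', center=0)
plt.title('Correlation Matrix of Numeric Features')
plt.show()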

import statsmodels.formula.api as smf

# 7. ANOVA Analysis
# Group prices by 'make' and inspect the groups
grouped_test2 = df[['make', 'price']].groupby(['make'])
print(grouped_test2.head(2))
print(grouped_test2.get_group('honda')['price'])

# Fit an OLS model using a formula; Q() quotes the hyphenated column names
model = smf.ols('price ~ Q("engine-size") + Q("horsepower") + Q("curb-weight") + Q("highway-mpg")', data=df).fit()

# ANOVA table (Type II) for the regression terms
anova_results = sm.stats.anova_lm(model, typ=2)
print(anova_results)

make price
0 alfa-romero 13495.000000
1 alfa-romero 16500.000000
3 audi 13950.000000
4 audi 17450.000000
10 bmw 16430.000000
11 bmw 16925.000000
18 chevrolet 5151.000000
19 chevrolet 6295.000000
21 dodge 5572.000000
22 dodge 6377.000000
30 honda 6479.000000
31 honda 6855.000000
43 isuzu 6785.000000
44 isuzu 13207.129353
47 jaguar 32250.000000
48 jaguar 35550.000000
50 mazda 5195.000000
51 mazda 6095.000000
67 mercedes-benz 25552.000000
68 mercedes-benz 28248.000000
75 mercury 16503.000000
76 mitsubishi 5389.000000
77 mitsubishi 6189.000000
89 nissan 5499.000000
90 nissan 7099.000000
107 peugot 11900.000000
108 peugot 13200.000000
118 plymouth 5572.000000
119 plymouth 7957.000000
125 porsche 22018.000000
126 porsche 32528.000000
130 renault 9295.000000
131 renault 9895.000000
132 saab 11850.000000
133 saab 12170.000000
138 subaru 5118.000000
139 subaru 7053.000000
150 toyota 5348.000000
151 toyota 6338.000000
182 volkswagen 7775.000000
183 volkswagen 7975.000000
194 volvo 12940.000000
195 volvo 13415.000000
30 6479.0
31 6855.0
32 5399.0
33 6529.0
34 7129.0
35 7295.0
36 7295.0
37 7895.0
38 9095.0
39 8845.0
40 10295.0
41 12945.0
42 10345.0
Name: price, dtype: float64
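Note that anova_lm above tests the regression terms rather than the make groups. A one-way ANOVA of price across makes could look like the following sketch (the three makes are an illustrative choice):

# Sketch: one-way ANOVA testing whether mean price differs across makes
from scipy.stats import f_oneway

f_val, p_val = f_oneway(
    grouped_test2.get_group('honda')['price'],
    grouped_test2.get_group('subaru')['price'],
    grouped_test2.get_group('jaguar')['price'],
)
print("ANOVA results: F =", f_val, ", p =", p_val)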

# 8. Regression Plots
sns.regplot(x='engine-size', y='price', data=df)
plt.title('Engine Size vs Price')
plt.show()

sns.regplot(x='horsepower', y='price', data=df)
plt.title('Horsepower vs Price')
plt.show()

sns.regplot(x='curb-weight', y='price', data=df)
plt.title('Curb Weight vs Price')
plt.show()


sns.regplot(x='highway-mpg', y='price', data=df)
plt.title('Highway MPG vs Price')
plt.show()


WEEK 8
CALCULATING THE SKEWNESS OF A DATA SET
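Skewness measures the asymmetry of a distribution about its mean. A minimal sketch, assuming a small illustrative dataset, computing it both from the moment formula g1 = m3 / m2**1.5 and with scipy.stats.skew:

import numpy as np
from scipy.stats import skew

data = [2, 8, 0, 4, 1, 9, 9, 0]   # illustrative values
arr = np.array(data, dtype=float)

# Moment-based (biased) sample skewness: m3 / m2**1.5
m2 = np.mean((arr - arr.mean())**2)
m3 = np.mean((arr - arr.mean())**3)
print("Manual skewness:", m3 / m2**1.5)

# scipy.stats.skew uses the same biased estimator by default (bias=True)
print("scipy skewness:", skew(arr))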
WEEK 9
5-POINT SUMMARY

In [3]: import numpy as np

# Example dataset
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Desired percentiles (e.g. the 25th, 50th, and 75th)
percentiles = [25, 50, 75]

# Calculate each percentile using (n + 1)-based positions
for percentile in percentiles:
    # Calculate the (1-based) position within the sorted data
    position = (percentile / 100) * (len(data) + 1)

    # Interpolate if the position falls between two data points
    if position.is_integer():
        value = data[int(position) - 1]
    else:
        lower_index = int(position)
        upper_index = lower_index + 1
        lower_value = data[lower_index - 1]
        upper_value = data[upper_index - 1]
        value = lower_value + (position - lower_index) * (upper_value - lower_value)

    print(f"{percentile}th percentile:", value)

25th percentile: 27.5
50th percentile: 55.0
75th percentile: 82.5

In [7]: # Calculate minimum and maximum
minimum = np.min(data)
maximum = np.max(data)

# Calculate quartiles
Q1 = np.percentile(data, 25)
median = np.percentile(data, 50)
Q3 = np.percentile(data, 75)

# Print the five-number summary
print("Minimum:" , minimum)
print("First Quartile (Q1):" , Q1)
print("Median (Q2):" , median)
print("Third Quartile (Q3):" , Q3)
print("Maximum:" , maximum)

Minimum: 10
First Quartile (Q1): 32.5
Median (Q2): 55.0
Third Quartile (Q3): 77.5
Maximum: 100
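Note that NumPy's Q1 (32.5) differs from the 27.5 computed by the manual loop above: np.percentile defaults to the 'linear' method, which interpolates on (n - 1)-based positions, while the loop uses (n + 1)-based positions. Assuming NumPy 1.22 or newer, the two can be reconciled via the method argument:

# 'weibull' uses the same (n + 1)-based positions as the manual loop
print(np.percentile(data, 25, method='weibull'))   # 27.5
print(np.percentile(data, 25))                     # 32.5 (default 'linear')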


In [8]: import matplotlib.pyplot as plt

# Calculate Quartiles
Q1 = np.percentile(data, 25)
median = np.percentile(data, 50)
Q3 = np.percentile(data, 75)

# Calculate interquartile range (IQR)
IQR = Q3 - Q1
print("IQR: ", IQR)

IQR: 45.0
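The IQR is commonly used to flag outliers: values more than 1.5 * IQR beyond the quartiles. A short sketch:

# Flag values beyond the 1.5 * IQR fences
lower_fence = Q1 - 1.5 * IQR
upper_fence = Q3 + 1.5 * IQR
outliers = [x for x in data if x < lower_fence or x > upper_fence]
print("Fences:", lower_fence, upper_fence)   # -35.0 145.0
print("Outliers:", outliers)                 # [] for this dataset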


In [9]: # Create a box plot
plt.figure(figsize=(8, 6))
plt.boxplot(data, vert=False, patch_artist=True)
plt.title('Box Plot of Data with Interquartile Range (IQR)')
plt.xlabel('Values')
plt.ylabel('Data')
plt.xticks(fontsize=10)
plt.yticks([])
plt.grid(True)

# Highlight median and IQR
plt.scatter(median, 1, color='red', label='Median')
plt.scatter([Q1, Q3], [1, 1], color='blue', label='Q1/Q3')
plt.plot([Q1, Q1], [0.75, 1.25], color='blue')
plt.plot([Q3, Q3], [0.75, 1.25], color='blue')

plt.text(Q1, 1.4, f'Q1 ({Q1})', ha='center')
plt.text(Q3, 1.4, f'Q3 ({Q3})', ha='center')
plt.text(median, 1.4, f'Median ({median})', ha='center', color='red')

plt.legend()
plt.show()

WEEK 10
UNIVARIATE, BIVARIATE, MULTIVARIATE DESCRIPTIVE STATISTICAL MEASURES
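A minimal sketch of the three levels, assuming the car dataframe df from Week 7 is still available:

# Univariate: summary statistics of a single variable
print(df['price'].mean(), df['price'].median(), df['price'].std())

# Bivariate: covariance and correlation between two variables
print(df['price'].cov(df['horsepower']))
print(df['price'].corr(df['horsepower']))

# Multivariate: covariance matrix across several variables
print(df[['price', 'horsepower', 'engine-size']].cov())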
WEEK 11
NORMAL DISTRIBUTION

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

n = np.arange(0,30)
print(n)

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
24 25 26 27 28 29]

rate = 12
poisson = stats.poisson.pmf(n,rate)

print(poisson)

print(poisson[5])

[6.14421235e-06 7.37305482e-05 4.42383289e-04 1.76953316e-03
5.30859947e-03 1.27406387e-02 2.54812775e-02 4.36821900e-02
6.55232849e-02 8.73643799e-02 1.04837256e-01 1.14367916e-01
1.14367916e-01 1.05570384e-01 9.04889002e-02 7.23911201e-02
5.42933401e-02 3.83247107e-02 2.55498071e-02 1.61367203e-02
9.68203217e-03 5.53258981e-03 3.01777626e-03 1.57449196e-03
7.87245981e-04 3.77878071e-04 1.74405263e-04 7.75134504e-05
3.32200502e-05 1.37462277e-05]
0.012740638735861376

print(poisson[0] + poisson[1] + poisson[2] + poisson[3] + poisson[4])

0.007600390681067
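The same cumulative probability P(X <= 4) is available directly from the CDF:

# Equivalent to summing the first five PMF values above
print(stats.poisson.cdf(4, rate))   # 0.007600390681067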

plt.plot(n,poisson, 'o-')
plt.show()

# Parameters
n = 10 # Number of trials
p = 0.5 # Probability of success

# Creating a binomial distribution object
binom_dist = stats.binom(n, p)

# Probability mass function (PMF) for a given number of successes (k)
k = 3
pmf = binom_dist.pmf(k)
print(f"PMF at k={k}: {pmf}")

# Cumulative distribution function (CDF) for a given number of successes (k)
cdf = binom_dist.cdf(k)
print(f"CDF at k={k}: {cdf}")

# Generating random samples
samples = binom_dist.rvs(size=1000)
print(f"Random samples: {samples[:10]}")

PMF at k=3: 0.1171875
CDF at k=3: 0.171875
Random samples: [5 2 4 6 5 5 5 5 5 3]
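As a quick check, the PMF value agrees with the closed form C(10, 3) * p**3 * (1 - p)**7:

# Sanity check: binomial PMF at k=3 by hand
from math import comb
print(comb(10, 3) * 0.5**3 * 0.5**7)   # 0.1171875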

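The week's title topic, the normal distribution itself, does not appear in the cells above. A minimal sketch plotting the standard normal PDF with the imports already in place:

# Sketch: standard normal PDF (mean 0, standard deviation 1)
x = np.linspace(-4, 4, 200)
plt.plot(x, stats.norm.pdf(x, 0, 1))
plt.title('Standard Normal Distribution')
plt.xlabel('x')
plt.ylabel('Density')
plt.show()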
WEEK 12
LINEAR REGRESSION
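A minimal sketch, assuming the Week 7 dataframe df: fit a simple linear regression of price on engine size with statsmodels OLS.

import statsmodels.api as sm

X = sm.add_constant(df['engine-size'])   # add an intercept column
model = sm.OLS(df['price'], X).fit()
print(model.params)      # intercept and slope
print(model.rsquared)    # proportion of variance explained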
WEEK 13
T-TEST
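A minimal sketch, again assuming the Week 7 dataframe df: Welch's two-sample t-test comparing mean price between gas and diesel cars.

from scipy.stats import ttest_ind

gas = df.loc[df['fuel-type'] == 'gas', 'price']
diesel = df.loc[df['fuel-type'] == 'diesel', 'price']
t_stat, p_val = ttest_ind(gas, diesel, equal_var=False)   # Welch's t-test
print("t =", t_stat, ", p =", p_val)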
