
Cota12 6

The document contains a Python script that processes the Iris dataset using pandas and scikit-learn. It loads the dataset, explores its structure, and splits it into training and testing sets for model training. The script also includes a reference to the Gaussian Naive Bayes classifier, indicating an intention to perform classification on the dataset.

Uploaded by

omkarmagdum818

cota12-6

March 25, 2025

[1]:

[1]: 'name : Omkar Magdum\n Rollno:COTC53'

[2]: import pandas as pd

[3]: data = pd.read_csv("iris.csv")

[4]: data.head()

[4]:    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
     0   1            5.1           3.5            1.4           0.2  Iris-setosa
     1   2            4.9           3.0            1.4           0.2  Iris-setosa
     2   3            4.7           3.2            1.3           0.2  Iris-setosa
     3   4            4.6           3.1            1.5           0.2  Iris-setosa
     4   5            5.0           3.6            1.4           0.2  Iris-setosa

[5] : data.shape

[5]: (150, 6)

[6]: data.head()

[6]:    Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
     0   1            5.1           3.5            1.4           0.2  Iris-setosa
     1   2            4.9           3.0            1.4           0.2  Iris-setosa
     2   3            4.7           3.2            1.3           0.2  Iris-setosa
     3   4            4.6           3.1            1.5           0.2  Iris-setosa
     4   5            5.0           3.6            1.4           0.2  Iris-setosa

[7]: data.tail()

[7]:       Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
     145  146            6.7           3.0            5.2           2.3  Iris-virginica
     146  147            6.3           2.5            5.0           1.9  Iris-virginica
     147  148            6.5           3.0            5.2           2.0  Iris-virginica
     148  149            6.2           3.4            5.4           2.3  Iris-virginica
     149  150            5.9           3.0            5.1           1.8  Iris-virginica

[8]: data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Id             150 non-null    int64
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB

[9]: data.describe()

[9]:                Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
     count  150.000000     150.000000    150.000000     150.000000    150.000000
     mean    75.500000       5.843333      3.054000       3.758667      1.198667
     std     43.445368       0.828066      0.433594       1.764420      0.763161
     min      1.000000       4.300000      2.000000       1.000000      0.100000
     25%     38.250000       5.100000      2.800000       1.600000      0.300000
     50%     75.500000       5.800000      3.000000       4.350000      1.300000
     75%    112.750000       6.400000      3.300000       5.100000      1.800000
     max    150.000000       7.900000      4.400000       6.900000      2.500000

[10]: data.isnull().sum()

[10]: Id               0
      SepalLengthCm    0
      SepalWidthCm     0
      PetalLengthCm    0
      PetalWidthCm     0
      Species          0
      dtype: int64
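
Alongside the null check, a per-class row count is a common sanity check before splitting the data. The following is a minimal sketch, assuming scikit-learn's bundled copy of Iris as a stand-in for iris.csv (so the labels lack the `Iris-` prefix); the name `species` is illustrative and does not appear in the notebook.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: the bundled dataset stands in for iris.csv.
iris = load_iris()

# Map the integer targets back to their class names.
species = pd.Series(iris.target_names[iris.target], name="Species")

# Count how many rows belong to each class.
counts = species.value_counts()
print(counts.to_dict())
```

A balanced count here suggests that a plain shuffled split is unlikely to starve any class, although a stratified split would guarantee it.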

[11]: # x keeps Id plus the four measurement columns
      x = data.drop(['Species'], axis=1)
      # y keeps the remaining columns: Id and Species
      y = data.drop(['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm',
                     'PetalWidthCm'], axis=1)

print(x)
print(y)
print(x.shape)
print(y.shape)

      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
0      1            5.1           3.5            1.4           0.2
1      2            4.9           3.0            1.4           0.2
2      3            4.7           3.2            1.3           0.2
3      4            4.6           3.1            1.5           0.2
4      5            5.0           3.6            1.4           0.2
..   ...            ...           ...            ...           ...
145  146            6.7           3.0            5.2           2.3
146  147            6.3           2.5            5.0           1.9
147  148            6.5           3.0            5.2           2.0
148  149            6.2           3.4            5.4           2.3
149  150            5.9           3.0            5.1           1.8

[150 rows x 5 columns]
      Id         Species
0      1     Iris-setosa
1      2     Iris-setosa
2      3     Iris-setosa
3      4     Iris-setosa
4      5     Iris-setosa
..   ...             ...
145  146  Iris-virginica
146  147  Iris-virginica
147  148  Iris-virginica
148  149  Iris-virginica
149  150  Iris-virginica

[150 rows x 2 columns]
(150, 5)
(150, 2)
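
The shapes above show that x still carries the Id column and that y is a two-column frame (Id and Species). Id is only a row counter, and scikit-learn classifiers generally expect a one-dimensional target, so one possible alternative is sketched below. This is not the notebook's own code: it assumes scikit-learn's bundled Iris data as a stand-in for iris.csv, rebuilds the same column names for illustration, and uses `X` as a hypothetical name for the feature frame.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: rebuild a frame shaped like iris.csv from the bundled dataset,
# so the sketch runs without the CSV file.
iris = load_iris()
data = pd.DataFrame(
    iris.data,
    columns=["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"],
)
data.insert(0, "Id", range(1, len(data) + 1))
data["Species"] = ["Iris-" + iris.target_names[i] for i in iris.target]

# Features: the four measurements only; Id is dropped because it is a row counter.
X = data.drop(columns=["Id", "Species"])

# Target: a 1-D Series holding the class label for each row.
y = data["Species"]

print(X.shape)  # (150, 4)
print(y.shape)  # (150,)
```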

[12] : from sklearn.model_selection import train_test_split


X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2,
                                                    shuffle=True)

print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

(120, 5)
(30, 5)
(120, 2)
(30, 2)
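
The split above is shuffled without a fixed seed, so the rows that land in each set change from run to run. A minimal sketch of a reproducible, class-balanced variant follows. It assumes scikit-learn's bundled Iris data in place of iris.csv, and the `stratify` and `random_state` arguments (with the arbitrary seed 42) are additions that the notebook does not use.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: bundled dataset as a stand-in for iris.csv; as_frame=True
# returns a pandas DataFrame of features and a Series of integer labels.
X, y = load_iris(return_X_y=True, as_frame=True)

# stratify=y keeps the class proportions equal in both sets;
# random_state fixes the shuffle so the split is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)     # (120, 4) (30, 4)
print(y_test.value_counts().tolist())  # [10, 10, 10]
```

With 50 rows per class and a 20% test fraction, stratification places 10 rows of each species in the test set.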

[14]: from sklearn.naive_bayes import GaussianNB

[15]: GaussianNB()

[15]: GaussianNB()
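
The notebook stops at constructing the estimator. A minimal sketch of how fitting and scoring might continue is shown below. It assumes scikit-learn's bundled Iris data in place of iris.csv, a one-dimensional target, and accuracy as the metric; the names `model`, `y_pred`, and `accuracy` are illustrative and do not come from the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: bundled dataset as a stand-in for iris.csv.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing, as in the notebook's split
# (stratify and random_state are additions for repeatability).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit Gaussian Naive Bayes on the training rows only.
model = GaussianNB()
model.fit(X_train, y_train)

# Predict the held-out rows and compare with the true labels.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.2f}")
```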
