Anomaly Detection-1


Anomaly Detection -1: Isolation Forest
Complete Theory and Code Implementation
iwmduvk9p

October 20, 2023

1 Isolation Forest (Anomaly Detection)


[1]: # importing the dataset
import pandas as pd
df = pd.read_csv('healthcare.csv')
df.head(3)

[1]: 0 1
0 1.616671 1.944522
1 1.256461 1.609444
2 -2.343919 4.392961

[2]: # now plot the data points in a scatter plot
import matplotlib.pyplot as plt
plt.scatter(df.iloc[:,0], df.iloc[:,1])

[2]: <matplotlib.collections.PathCollection at 0x7f99a3c91c60>

[3]: # importing the Isolation forest from sklearn
import warnings
warnings.filterwarnings('ignore')
from sklearn.ensemble import IsolationForest
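The idea behind the Isolation Forest imported above is that anomalies can be separated from the rest of the data with fewer random splits than normal points. The following is a minimal pure-Python sketch of that idea only, not scikit-learn's implementation: it omits subsampling and the path-length normalisation, and the helper names (`isolation_depth`, `mean_depth`) and the synthetic data are assumptions made for illustration.

```python
import random

def isolation_depth(point, data, rng, depth=0, max_depth=50):
    """Count the random axis-aligned splits needed to isolate `point` in `data`."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    # choose a random feature and a random split value within its range
    feature = rng.randrange(len(point))
    lo = min(row[feature] for row in data)
    hi = max(row[feature] for row in data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # keep only the rows that fall on the same side of the split as `point`
    if point[feature] < split:
        same_side = [row for row in data if row[feature] < split]
    else:
        same_side = [row for row in data if row[feature] >= split]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)

def mean_depth(point, data, n_trees=200, seed=0):
    """Average isolation depth of `point` over `n_trees` random trees."""
    rng = random.Random(seed)
    return sum(isolation_depth(point, data, rng) for _ in range(n_trees)) / n_trees

# synthetic data (assumption): one tight cluster plus a single far-away point
data_rng = random.Random(42)
cluster = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
data = cluster + [outlier]
# use the cluster point closest to the origin as a typical "normal" point
inlier = min(cluster, key=lambda p: p[0] ** 2 + p[1] ** 2)

outlier_depth = mean_depth(outlier, data)
inlier_depth = mean_depth(inlier, data)
print(outlier_depth, inlier_depth)
```

The far-away point should be isolated after noticeably fewer splits on average than the point in the middle of the cluster, which is the signal an Isolation Forest turns into an anomaly score.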

[4]: # contamination is the expected proportion of outliers and sets the decision threshold;
# 'auto' uses the threshold from the original Isolation Forest paper
clf = IsolationForest(contamination='auto')

clf.fit(df)  # fit the model on df

prediction = clf.predict(df)
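To make the role of `contamination` more concrete, here is a small sketch on synthetic stand-in data (an assumption, since `healthcare.csv` is not available here). It compares `'auto'`, where scikit-learn uses a fixed score offset of -0.5, with an explicit value of 0.05, where the threshold is chosen so that roughly 5% of the training points are flagged. The value 0.05 and the variable names are illustrative only.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
# synthetic stand-in data (assumption): 190 clustered points plus 10 scattered ones
X = np.vstack([rng.normal(0, 1, size=(190, 2)),
               rng.uniform(-8, 8, size=(10, 2))])

# 'auto': the threshold is a fixed offset on the anomaly score
auto_clf = IsolationForest(contamination='auto', random_state=0).fit(X)
# explicit value: the threshold is set from a percentile of the training scores
fixed_clf = IsolationForest(contamination=0.05, random_state=0).fit(X)

# fraction of training points labelled -1 (outlier) by each model
auto_fraction = np.mean(auto_clf.predict(X) == -1)
fixed_fraction = np.mean(fixed_clf.predict(X) == -1)
print(auto_clf.offset_, auto_fraction, fixed_fraction)
```

With an explicit `contamination`, the share of flagged training points follows that value; with `'auto'`, the share depends on the data, so it is worth checking rather than assuming it gives the best result.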

[5]: prediction  # +1 = normal data points, -1 = outliers

[5]: array([ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, -1, 1, 1,
1, 1, -1, 1, -1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1,
1, 1, 1, 1, -1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1, -1, 1,
-1, 1, -1, 1, 1, 1, 1, -1, 1, 1, 1, 1, -1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1,
-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, -1, 1, 1,
1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1,

1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1,
-1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, -1, 1, 1, -1,
-1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, -1, 1, 1, 1, 1, -1, -1, 1])
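The +1/-1 labels hide how strongly each point was flagged. As a hedged sketch on synthetic data (an assumption, not the notebook's dataset), the example below uses `decision_function`, whose sign matches the label returned by `predict`: negative scores correspond to -1 and non-negative scores to +1.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(1)
# synthetic data (assumption): a cluster of 200 points plus one far-away point
X = np.vstack([rng.normal(0, 1, size=(200, 2)), [[7.0, 7.0]]])

clf = IsolationForest(contamination='auto', random_state=1).fit(X)
labels = clf.predict(X)            # +1 for normal points, -1 for outliers
scores = clf.decision_function(X)  # lower means more anomalous

# rebuild the labels from the sign of the scores
labels_from_scores = np.where(scores < 0, -1, 1)
# position of the most anomalous point according to the scores
most_anomalous = int(np.argmin(scores))
print(most_anomalous, scores[most_anomalous])
```

Sorting by `scores` gives a ranking of the points, which can be more useful than the binary labels when deciding which cases to inspect first.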

[6]: # getting the indices of the outliers


import numpy as np
index = np.where(prediction < 0 )
index

[6]: (array([ 20, 24, 45, 48, 53, 55, 63, 72, 74, 78, 83, 85, 87,
92, 97, 108, 114, 119, 130, 133, 141, 151, 167, 179, 187, 199,
212, 217, 220, 221, 227, 242, 247, 248]),)
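Note that `np.where` with a single condition returns a tuple containing one index array, which is why the output above is wrapped in `(..., )`. The sketch below uses small hypothetical stand-ins for `prediction` and `df.values` to show two equivalent ways of selecting the outlier rows: taking element `[0]` of the tuple, or using a boolean mask directly.

```python
import numpy as np

# hypothetical stand-ins for the notebook's `prediction` and `df.values`
prediction = np.array([1, -1, 1, 1, -1])
x = np.array([[0.1, 0.2], [5.0, 5.1], [0.0, -0.1], [0.3, 0.1], [-4.8, 5.2]])

index = np.where(prediction < 0)[0]  # [0] unpacks the index array from the tuple
mask = prediction < 0                # equivalent boolean mask

outliers_by_index = x[index]  # rows selected by integer positions
outliers_by_mask = x[mask]    # the same rows selected by the mask
print(index, outliers_by_mask)
```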

[7]: # converting the DataFrame into a NumPy array


x = df.values

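[8]: # overlay the outliers (outlined in red) on the full scatter plot
index = np.where(prediction < 0)[0]  # 1-D array of outlier row positions

plt.scatter(df.iloc[:,0], df.iloc[:,1])
plt.scatter(x[index,0], x[index,1], edgecolors='r')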

[8]: <matplotlib.collections.PathCollection at 0x7f9997924040>

2 Observation:
The data points marked in red are the outliers flagged by the Isolation Forest.
Here, these red points are taken to represent the persons in our dataset who have a disease.