Fda E0323040 20 12 24

The document demonstrates various data analysis techniques using Python, including calculating autocorrelation, handling time series data, and imputing missing values in a dataset. It showcases the use of libraries such as pandas, numpy, and matplotlib for data manipulation and visualization. Additionally, it illustrates how to save data to CSV and Excel formats, including writing to multiple sheets in an Excel file.

Uploaded by

e0323040

from statsmodels.tsa.stattools import acf

x=[2,4,6,8,10]

autocorr_values = acf(x, nlags=2,fft=False)

print(f"Autocorrelation for lag 1: {autocorr_values[1]}")


print(f"Autocorrelation for lag 2: {autocorr_values[2]}")

Autocorrelation for lag 1: 0.4


Autocorrelation for lag 2: -0.1
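
The two values above can be checked by hand. The following is a minimal sketch, assuming the biased sample estimator (the lagged products of deviations from the mean, divided by the total sum of squared deviations), which matches the `acf` call above with its default settings. The helper name `manual_acf` is hypothetical and not part of statsmodels. For x = [2, 4, 6, 8, 10] the deviations are -4, -2, 0, 2, 4, the sum of squares is 40, the lag-1 products sum to 16 and the lag-2 products sum to -4.

```python
import numpy as np

def manual_acf(values, lag):
    """Sample autocorrelation at `lag` (hypothetical helper, biased estimator)."""
    x = np.asarray(values, dtype=float)
    dev = x - x.mean()                      # deviations from the mean
    if lag == 0:
        return 1.0
    # Products of deviations that are `lag` steps apart, over the total sum of squares
    return float(np.sum(dev[:-lag] * dev[lag:]) / np.sum(dev ** 2))

x = [2, 4, 6, 8, 10]
print(manual_acf(x, 1))   # 16 / 40 = 0.4
print(manual_acf(x, 2))   # -4 / 40 = -0.1
```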

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

# Simulate a random walk: the cumulative sum of 100 standard-normal steps
np.random.seed(42)
time_series = np.random.randn(100).cumsum()

# Plot the raw series
plt.figure(figsize=(10, 4))
plt.plot(time_series, label='Time Series')
plt.legend()
plt.xlabel('Time step')
plt.ylabel('Value')
plt.title('Time Series')

# plot_acf draws on its own new figure, so the ACF labels belong after this call
ts = pd.Series(time_series)
plot_acf(ts, lags=10, title='Autocorrelation Function')
plt.xlabel('Lag')
plt.ylabel('Autocorrelation')
plt.show()

import random

random.seed(10)
print(random.random())
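
Seeding matters because it makes the pseudo-random sequence repeatable. A small sketch of that property, using only the standard `random` module (the variable names are illustrative):

```python
import random

# Re-seeding with the same value restarts the generator at the same state
random.seed(10)
first = random.random()

random.seed(10)
second = random.random()

print(first == second)   # True: same seed, same first draw

# Without re-seeding, the next draw continues the sequence
third = random.random()
print(third == first)    # False
```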

#Interval

import pandas as pd

data = {
    "timestamp": [
        "2024-12-01 10:00:00",
        "2024-12-01 10:10:00",
        "2024-12-01 10:15:00",
        "2024-12-01 10:35:00",
        "2024-12-01 10:45:00"
    ],
    "value": [10, 15, 20, 25, 30]
}

df = pd.DataFrame(data)
df

df["timestamp"]=pd.to_datetime(df["timestamp"])
df["time_diff"]=df["timestamp"].diff().dt.total_seconds()
threshold=1.5*df["time_diff"].median()
irregular_intervals=df[df["time_diff"]> threshold]

print("Time Differences (in seconds): ")


print(df[["timestamp","time_diff"]])
print("\nIrregular Intervals:")
print(irregular_intervals)
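
For the five timestamps above, the gaps are 600, 300, 1200 and 600 seconds, so the median is 600 and the threshold is 900. Only the 20-minute gap before 10:35 exceeds it. The check is one-sided, so the unusually short 5-minute gap is not flagged. The same steps can be wrapped in a reusable function; this is a sketch, and both the name `find_irregular_intervals` and the `factor` parameter are hypothetical.

```python
import pandas as pd

def find_irregular_intervals(timestamps, factor=1.5):
    """Return rows whose gap from the previous timestamp exceeds factor * median gap."""
    frame = pd.DataFrame({"timestamp": pd.to_datetime(pd.Series(timestamps))})
    # Gap to the previous row, in seconds (NaN for the first row)
    frame["time_diff"] = frame["timestamp"].diff().dt.total_seconds()
    # median() skips the leading NaN by default
    threshold = factor * frame["time_diff"].median()
    return frame[frame["time_diff"] > threshold]

stamps = [
    "2024-12-01 10:00:00",
    "2024-12-01 10:10:00",
    "2024-12-01 10:15:00",
    "2024-12-01 10:35:00",
    "2024-12-01 10:45:00",
]
irregular = find_irregular_intervals(stamps)
print(irregular)
```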

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as s

train = pd.read_csv("/content/titanic_train.csv")
train.isnull().sum()
train["Pclass"].unique()

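def impute_age(cols):
    # Select by label; positional cols[0] on a labeled Series is deprecated in recent pandas
    Age = cols["Age"]
    Pclass = cols["Pclass"]

    # Fill a missing age with a fixed value chosen by passenger class
    if pd.isnull(Age):
        if Pclass == 1:
            return 37
        elif Pclass == 2:
            return 29
        else:
            return 24
    else:
        return Age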

train["Age"]=train[["Age","Pclass"]].apply(impute_age,axis=1)
train.isnull().sum()
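
The fill values 37, 29 and 24 in `impute_age` are hard-coded, and the source does not say how they were chosen. One possible alternative, sketched here on a small made-up DataFrame rather than the Titanic file, is to compute the median age per class from the data and fill from that. The `toy` data below is hypothetical and exists only to make the example self-contained.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Titanic training set
toy = pd.DataFrame({
    "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
    "Age": [38.0, np.nan, 36.0, 30.0, np.nan, 22.0, 26.0, np.nan],
})

# Median age within each class, aligned to the original rows
class_median = toy.groupby("Pclass")["Age"].transform("median")

# Fill only the missing ages; observed ages are left untouched
toy["Age"] = toy["Age"].fillna(class_median)
print(toy)
```

This avoids the row-wise `apply` and keeps the fill values tied to whatever data is loaded, at the cost of depending on each class having at least one observed age.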

import pandas as pd
import numpy as np

data = {
    'Name': ["Mr.John Smith", "Mrs.Emily Davis", "Miss.Laura Wilson",
             "Mr.Alan Brown", "Mrs.Clara Clark"],
    'Age': [np.nan, 28, np.nan, 45, np.nan],
    'Pclass': [1, 2, 3, 1, 3]
}
df=pd.DataFrame(data)
print("Data Frame:")
df

def fill_age(row):
    # Fill a missing age with a default chosen by the title in the name
    if pd.isnull(row['Age']):
        if 'Mr.' in row['Name']:
            return 40
        elif 'Mrs.' in row['Name']:
            return 35
        else:
            return 30
    return row['Age']

df['Age']=df.apply(fill_age,axis=1)
print("\nUpdated Data Frame:")
df
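
The same title-based fill can also be written without a row-wise `apply`. This is a sketch under one assumption that differs slightly from `fill_age`: it reads the title from the start of the name only, whereas `fill_age` matches the substring anywhere in the name. The names `people`, `Title` and `default_age` are illustrative.

```python
import numpy as np
import pandas as pd

people = pd.DataFrame({
    "Name": ["Mr.John Smith", "Mrs.Emily Davis", "Miss.Laura Wilson",
             "Mr.Alan Brown", "Mrs.Clara Clark"],
    "Age": [np.nan, 28, np.nan, 45, np.nan],
})

# Pull the leading title: word characters up to and including the first period
people["Title"] = people["Name"].str.extract(r"^(\w+\.)", expand=False)

# Same defaults as fill_age: 40 for Mr., 35 for Mrs., 30 for anything else
default_age = people["Title"].map({"Mr.": 40, "Mrs.": 35}).fillna(30)

# Only missing ages are replaced
people["Age"] = people["Age"].fillna(default_age)
print(people)
```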

file_path="/content/output.csv"

df.to_csv(file_path)

file_path="/content/output.xlsx"

df.to_excel(file_path)
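
Both `to_csv` and `to_excel` write the row index as an extra first column unless `index=False` is passed. The sketch below shows the effect for CSV using an in-memory string instead of a file, so it does not depend on the `/content/` paths used above. The small `frame` is made up for illustration.

```python
import io
import pandas as pd

frame = pd.DataFrame({"Name": ["A", "B"], "Age": [40, 28]})

# Default: the row index is written as an unnamed first column
with_index = frame.to_csv()

# index=False writes only the data columns
without_index = frame.to_csv(index=False)

print(with_index)
print(without_index)

# Reading the default output back produces an extra "Unnamed: 0" column
round_trip = pd.read_csv(io.StringIO(with_index))
print(list(round_trip.columns))
```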

import pandas as pd
import numpy as np

data = {
    'Name': ["Darshan", "Dejo", "Dheepak", "Oorakathan"],
    'Age': [19, 20, 19, 19],
    'Course': ["AIDA", "AIDA", "AIML", "AIML"]
}
df1=pd.DataFrame(data)
print("Data Frame:")
df1

import pandas as pd
import numpy as np

data = {
    'Name': ["Mr.John Smith", "Mrs.Emily Davis", "Miss.Laura Wilson",
             "Mr.Alan Brown", "Mrs.Clara Clark"],
    'Age': [np.nan, 28, np.nan, 45, np.nan],
    'Pclass': [1, 2, 3, 1, 3]
}
df2=pd.DataFrame(data)
print("Data Frame:")
df2

file_path="output_multiplesheets.xlsx"

# Write each DataFrame to its own sheet in one workbook
with pd.ExcelWriter(file_path, engine="openpyxl") as writer:
    df1.to_excel(writer, sheet_name="People", index=False)
    df2.to_excel(writer, sheet_name="Cities", index=False)
