Financial Analytics With Python

This document provides an introduction to performing financial analytics using Python. It discusses installing Python and relevant libraries like Pandas, NumPy, and Matplotlib. It then covers topics like reading financial data, analyzing time series data through techniques like moving averages and volatility calculation, performing regression analysis, and data visualization. Code examples demonstrate how to work with DataFrames, group data, plot time series and correlations between variables. The document is a guide for learning to manipulate, model and visualize financial data in Python.

Uploaded by

Harshit Singh

Financial Analytics

with Python
Contents

• Installation – Python Software
• Pandas, NumPy, Matplotlib – Introduction
• Data Analysis in Python – How to work with DataFrames
• Financial Time Series Analysis – using Python
• Regression Analysis – using Python
• Data Visualisation
a) One-Dimensional Data Set
b) Two-Dimensional Data Set
c) Other Plot Styles
d) Financial Plots
e) 3D Plotting
How to install Python?

Install the latest version of Python through the Anaconda distribution, which is free to download:

https://www.anaconda.com/distribution/
Pandas, NumPy, Matplotlib - Basics

Pandas
• The Pandas module is used for working with tabular data.
• It lets us work with data in table form, such as data read from CSV files or SQL databases. We can also create tables of our own, and edit or add columns and rows.
• Pandas provides two powerful objects that are very useful for working with and analysing data:
• DataFrame: a two-dimensional data structure in which data is aligned in tabular fashion in rows and columns.
• Series: a one-dimensional labelled array capable of holding data of any type (integer, string, float, Python objects, etc.).
• A DataFrame is a collection of Series that can be used to analyse the data.
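
The relationship between the two objects can be sketched in a few lines. The labels, column names and numbers below are made up purely for illustration.

```python
import pandas as pd

# A Series: one-dimensional, labelled values
prices = pd.Series([101.5, 102.0, 100.8], index=['Mon', 'Tue', 'Wed'], name='price')

# A DataFrame: a table whose columns are Series sharing one index
table = pd.DataFrame({'price': prices, 'volume': [1200, 950, 1430]})

print(type(table['price']))   # selecting one column gives back a Series
print(table.loc['Tue'])       # one row, selected by its index label
```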
Pandas, NumPy, Matplotlib - Basics

NumPy
• The NumPy module is mainly used for working with numerical data.
• It provides us with a powerful object known as an Array.
• An array is a data structure that stores values of the same data type.
• With Arrays, we can perform mathematical operations on multiple values at the same time, and also perform operations between different Arrays, similar to matrix operations.
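
A small sketch of what "operations on multiple values at the same time" looks like in practice. The array contents are arbitrary example values.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

scaled = a * 2     # one operation applied to every element
total = a + b      # element-wise operation between two arrays
dot = a @ b        # matrix-style inner product of the two arrays

print(scaled, total, dot)
```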
Pandas, NumPy, Matplotlib - Basics

Matplotlib
• The Matplotlib module is used for data visualization.
• It provides functionality for drawing charts and graphs, so that we can better understand and present the data visually.

• Together, Pandas, NumPy and Matplotlib allow us to analyse, manipulate and visualise data in very useful ways.
Data analysis in Python
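from pylab import plt
# the style was called 'seaborn' before Matplotlib 3.6; newer versions name it 'seaborn-v0_8'
plt.style.use('seaborn-v0_8')
import matplotlib as mpl
mpl.rcParams['font.family'] = 'serif'

Note:
Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface
for drawing attractive and informative statistical graphics. The style sheet applied above is bundled with Matplotlib, so Seaborn itself does not need to be installed.

PyLab is a module of the plotting library Matplotlib. It combines the numerical module numpy with
the graphical plotting module pyplot in one namespace. The current Matplotlib documentation
discourages pylab in favour of importing matplotlib.pyplot directly.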
Data analysis in Python

import numpy as np
import pandas as pd

First Steps with DataFrame

> df = pd.DataFrame([10, 20, 30, 40], columns=['numbers'],index=['a', 'b', 'c', 'd'])

> df

> df.index # the index values

> df.columns # the column names

> df.loc['c'] #selection via index


Data analysis in Python

> df.loc[['a', 'd']] #selection of multiple indices

> df.loc[df.index[1:3]] # selection via Index object

> df.sum() # sum per column

> df.apply(lambda x: x ** 2) # square of every element

> df['floats'] = (1.5, 2.5, 3.5, 4.5) # new column is generated

> df

> df['floats'] # selection of column


Data analysis in Python

> df['names'] = pd.DataFrame(['Yves', 'Guido', 'Felix', 'Francesc'], index=['d', 'a', 'b', 'c'])
# values are aligned by index label, not by position

> df

> pd.concat([df, pd.DataFrame([{'numbers': 100, 'floats': 5.75, 'names': 'Henry'}])], ignore_index=True)
# DataFrame.append() was removed in pandas 2.0; pd.concat() replaces it
# temporary object; df not changed

> df = pd.concat([df, pd.DataFrame({'numbers': 100, 'floats': 5.75, 'names': 'Henry'}, index=['z'])])

> df.join(pd.DataFrame([1, 4, 9, 16, 25],
                       index=['a', 'b', 'c', 'd', 'y'],
                       columns=['squares']))
# temporary object

> df = df.join(pd.DataFrame([1, 4, 9, 16, 25],
                            index=['a', 'b', 'c', 'd', 'y'],
                            columns=['squares']),
               how='outer')
df
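
The difference between the default join and how='outer' may be easier to see on a tiny example. The two frames below are hypothetical and only loosely mirror the ones used on this slide.

```python
import pandas as pd

left = pd.DataFrame({'numbers': [10, 20, 30]}, index=['a', 'b', 'c'])
right = pd.DataFrame({'squares': [1, 4, 25]}, index=['a', 'b', 'y'])

# default (left) join: keeps only the index labels of `left`
left_join = left.join(right)

# outer join: keeps the union of both indices; missing values become NaN
outer_join = left.join(right, how='outer')

print(left_join)
print(outer_join)
```

In the left join the label 'y' is dropped, while in the outer join it is kept with NaN in the 'numbers' column.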
Data analysis in Python

> df[['numbers', 'squares']].mean() # column-wise mean

> df[['numbers', 'squares']].std() # column-wise standard deviation

Next Steps with DataFrame

> a = np.random.standard_normal((9, 4))
a.round(6)

> df = pd.DataFrame(a)
df

> df.columns = ['No1', 'No2', 'No3', 'No4']
df
Data analysis in Python

> df['No2'].iloc[3] # value in column No2 at index position 3

> dates = pd.date_range('2015-1-1', periods=9, freq='M') # month-end dates; pandas 2.2 and later prefer freq='ME'
dates

> df.index = dates
df

> np.array(df).round(6)
Base of Analytics

> df.sum()

> df.mean()

> df.cumsum()

> df.describe()

> np.sqrt(abs(df))

> np.sqrt(abs(df)).sum()

> %matplotlib inline
df.cumsum().plot(lw=2.0, grid=True)
# tag: dataframe_plot
# title: Line plot of a DataFrame object
Base of Analytics

> type(df)

> df['No1']

> type(df['No1'])

> import matplotlib.pyplot as plt
df['No1'].cumsum().plot(style='r', lw=2., grid=True)
plt.xlabel('date')
plt.ylabel('value')
# tag: time_series
# title: Line plot of a Series object
Base of Analytics

Groupby Operations

> df['Quarter'] = ['Q1', 'Q1', 'Q1', 'Q2', 'Q2', 'Q2', 'Q3', 'Q3', 'Q3']
df

> groups = df.groupby('Quarter')

> groups.mean()

> groups.max()

> groups.size()

> df['Odd_Even'] = ['Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even',
                    'Odd', 'Even', 'Odd']
Base of Analytics

> groups = df.groupby(['Quarter', 'Odd_Even'])

> groups.size()

> groups.mean()
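
Because the slide data is random, the grouped results differ on every run. The sketch below repeats the same groupby pattern on a small fixed table, with a hypothetical 'value' column, so the results can be checked by hand.

```python
import pandas as pd

df = pd.DataFrame({
    'value':    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    'Quarter':  ['Q1', 'Q1', 'Q1', 'Q2', 'Q2', 'Q2'],
    'Odd_Even': ['Odd', 'Even', 'Odd', 'Even', 'Odd', 'Even'],
})

# one group per quarter: Q1 averages (1, 2, 3), Q2 averages (4, 5, 6)
by_quarter = df.groupby('Quarter')['value'].mean()

# one group per (quarter, odd/even) combination
by_both = df.groupby(['Quarter', 'Odd_Even'])['value'].sum()

print(by_quarter)
print(by_both)
```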
Financial Time Series Analysis - using Python

> raw = pd.read_csv('Desktop/Dataset1.csv',
                    index_col=0, parse_dates=True)
raw.info()

> data = pd.DataFrame(raw['.SPX'])
data.columns = ['Close']

> data.tail()

> data['Close'].plot(figsize=(8, 5), grid=True);
# tag: dax
# title: Historical S&P 500 index levels

> %time data['Return'] = np.log(data['Close'] / data['Close'].shift(1))
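
The 'Return' column holds daily log returns, r_t = ln(P_t / P_{t-1}). A minimal sketch on three made-up prices shows how shift(1) lines up each price with the previous one, and why log returns are convenient to add up.

```python
import numpy as np
import pandas as pd

close = pd.Series([100.0, 110.0, 99.0], name='Close')

# shift(1) moves every price down one row, so each row divides by the prior price
log_ret = np.log(close / close.shift(1))

# log returns are additive: their sum equals ln(last price / first price)
total = log_ret.sum()

print(log_ret)
print(total, np.log(99.0 / 100.0))
```

The first entry is NaN because the first price has no predecessor.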


Financial Time Series Analysis - using Python

> data[['Close', 'Return']].tail()

> data[['Close', 'Return']].plot(subplots=True, style='b',
                                 figsize=(8, 5), grid=True);
# tag: dax_returns
# title: The S&P 500 index and daily log returns

> data['42d'] = data['Close'].rolling(window=42).mean() # moving averages

> data['252d'] = data['Close'].rolling(window=252).mean()

> data[['Close', '42d', '252d']].tail()

> data[['Close', '42d', '252d']].plot(figsize=(8, 5), grid=True)
# tag: dax_trends
# title: The S&P 500 index and moving averages
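
How rolling(window=...).mean() behaves is easiest to see with a short window. The sketch below uses a 3-observation window on made-up values instead of the 42 and 252 days used above.

```python
import pandas as pd

close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# each value is the mean of the current and the two preceding observations
ma3 = close.rolling(window=3).mean()

print(ma3)
```

The first two entries are NaN because a full window of three observations is not yet available. In the same way, the '252d' column above is empty for its first 251 rows.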
Financial Time Series Analysis - using Python

> import math
data['Mov_Vol'] = data['Return'].rolling(window=252).std() * math.sqrt(252)
# moving annualized volatility

> data[['Close', 'Mov_Vol', 'Return']].plot(subplots=True, style='b',
                                            figsize=(8, 7), grid=True);
# tag: dax_mov_std
# title: The S&P 500 index and moving, annualized volatility
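
Multiplying by sqrt(252) converts a daily standard deviation into an annual one, assuming roughly 252 trading days per year and independent daily returns. The sketch below applies the same calculation to simulated returns. The 1% daily volatility and the random seed are arbitrary assumptions.

```python
import math
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# hypothetical daily log returns with a true daily standard deviation of 1%
returns = pd.Series(rng.normal(loc=0.0, scale=0.01, size=1000))

daily_vol = returns.rolling(window=252).std()
annual_vol = daily_vol * math.sqrt(252)   # square-root-of-time scaling

print(annual_vol.dropna().tail())
```

With a true daily volatility of 1%, the annualized values should fluctuate around 0.01 * sqrt(252), which is about 0.16.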
Regression Analysis using Python

> import pandas as pd

> spx = pd.DataFrame(raw['.SPX'])

> np.round(spx.tail())

> vix = pd.DataFrame(raw['.VIX'])
vix.info()

> data = spx.join(vix)

> data.tail()

> data.plot(subplots=True, grid=True, style='b', figsize=(8, 6));
# tag: spx_vix
# title: The S&P 500 Index and the VIX volatility index
Regression Analysis using Python

> rets = np.log(data / data.shift(1))
rets.head()

> rets.dropna(inplace=True)

> rets.plot(subplots=True, grid=True, style='b', figsize=(8, 6));
# tag: es50_vs_rets
# title: Log returns of S&P 500 and VIX

> xdat = rets['.SPX'].values
ydat = rets['.VIX'].values
reg = np.polyfit(x=xdat, y=ydat, deg=1) # reg[0] is the slope, reg[1] the intercept
reg
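
np.polyfit with deg=1 fits a straight line by least squares and returns the coefficients with the highest power first. A quick check on made-up points that lie exactly on a known line:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = -2.0 * x + 0.5            # points exactly on the line y = -2x + 0.5

slope, intercept = np.polyfit(x, y, deg=1)    # coefficients, highest power first
fitted = np.polyval([slope, intercept], x)    # evaluate the fitted line

print(slope, intercept)
```

For the index data, a negative slope would indicate that VIX returns tend to fall when S&P 500 returns rise.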
Regression Analysis using Python

> plt.plot(xdat, ydat, 'r.')
ax = plt.axis() # grab axis values
x = np.linspace(ax[0], ax[1] + 0.01)
plt.plot(x, np.polyval(reg, x), 'b', lw=2)
plt.grid(True)
plt.axis('tight')
plt.xlabel('S&P 500 returns')
plt.ylabel('VIX returns')
# tag: scatter_rets
# title: Scatter plot of log returns and regression line

> rets.corr()

> rets['.SPX'].rolling(window=252).corr(rets['.VIX']).plot(grid=True, style='b')
# tag: roll_corr
# title: Rolling correlation between S&P 500 and VIX
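
The difference between corr() and a rolling correlation can be sketched with simulated data. The two series below are hypothetical and are deliberately built to move in opposite directions. The window of 60 and the seed are arbitrary.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
x = pd.Series(rng.normal(size=300))
y = -0.8 * x + 0.2 * pd.Series(rng.normal(size=300))   # constructed to move against x

full_corr = x.corr(y)                      # a single number for the whole sample
roll_corr = x.rolling(window=60).corr(y)   # one number per 60-observation window

print(full_corr)
print(roll_corr.dropna().describe())
```

The rolling version shows whether the relationship is stable over time, which a single full-sample coefficient cannot reveal.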
Data Visualisation

> from pylab import plt
plt.style.use('seaborn-v0_8') # this style was called 'seaborn' before Matplotlib 3.6
import matplotlib as mpl
mpl.rcParams['font.family'] = 'serif'

Two-Dimensional Plotting
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

One-Dimensional Data Set


np.random.seed(1000)
y = np.random.standard_normal(20)
Data Visualisation

x = range(len(y))
plt.plot(x, y)
# tag: matplotlib_0
# title: Plot given x- and y-values

plt.plot(y)
# tag: matplotlib_1
# title: Plot given data as 1d-array

plt.plot(y.cumsum())
# tag: matplotlib_2
# title: Plot given a 1d-array with method attached

plt.plot(y.cumsum())
plt.grid(True) # adds a grid
plt.axis('tight') # adjusts the axis ranges
# tag: matplotlib_3_a
# title: Plot with grid and tight axes
Data Visualisation

plt.plot(y.cumsum())
plt.grid(True)
plt.xlim(-1, 20)
plt.ylim(np.min(y.cumsum()) - 1,
np.max(y.cumsum()) + 1)
# tag: matplotlib_3_b
# title: Plot with custom axes limits

plt.figure(figsize=(7, 4))
# the figsize parameter defines the
# size of the figure in (width, height)
plt.plot(y.cumsum(), 'b', lw=1.5)
plt.plot(y.cumsum(), 'ro')
plt.grid(True)
plt.axis('tight')
plt.xlabel('index')
plt.ylabel('value')
plt.title('A Simple Plot')
# tag: matplotlib_4
# title: Plot with typical labels
Data Visualisation

Two-Dimensional Data Set


np.random.seed(2000)
y = np.random.standard_normal((20, 2)).cumsum(axis=0)

plt.figure(figsize=(7, 4))
plt.plot(y, lw=1.5)
# plots two lines
plt.plot(y, 'ro')
# plots the points of both data sets as red circles
plt.grid(True)
plt.axis('tight')
plt.xlabel('index')
plt.ylabel('value')
plt.title('A Simple Plot')
# tag: matplotlib_5
# title: Plot with two data sets
Data Visualisation

plt.figure(figsize=(7, 4))
plt.plot(y[:, 0], lw=1.5, label='1st')
plt.plot(y[:, 1], lw=1.5, label='2nd')
plt.plot(y, 'ro')
plt.grid(True)
plt.legend(loc=0)
plt.axis('tight')
plt.xlabel('index')
plt.ylabel('value')
plt.title('A Simple Plot')
# tag: matplotlib_6
# title: Plot with labeled data sets
Data Visualisation

fig, ax1 = plt.subplots()


plt.plot(y[:, 0], 'b', lw=1.5, label='1st')
plt.plot(y[:, 0], 'ro')
plt.grid(True)
plt.legend(loc=8)
plt.axis('tight')
plt.xlabel('index')
plt.ylabel('value 1st')
plt.title('A Simple Plot')
ax2 = ax1.twinx()
plt.plot(y[:, 1], 'g', lw=1.5, label='2nd')
plt.plot(y[:, 1], 'ro')
plt.legend(loc=0)
plt.ylabel('value 2nd')
# tag: matplotlib_8
# title: Plot with two data sets and two y-axes
Data Visualisation

plt.figure(figsize=(7, 5))
plt.subplot(211)
plt.plot(y[:, 0], lw=1.5, label='1st')
plt.plot(y[:, 0], 'ro')
plt.grid(True)
plt.legend(loc=0)
plt.axis('tight')
plt.ylabel('value')
plt.title('A Simple Plot')
plt.subplot(212)
plt.plot(y[:, 1], 'g', lw=1.5, label='2nd')
plt.plot(y[:, 1], 'ro')
plt.grid(True)
plt.legend(loc=0)
plt.axis('tight')
plt.xlabel('index')
plt.ylabel('value')
# tag: matplotlib_9
# title: Plot with two sub-plots
Data Visualisation
plt.figure(figsize=(9, 4))
plt.subplot(121)
plt.plot(y[:, 0], lw=1.5, label='1st')
plt.plot(y[:, 0], 'ro')
plt.grid(True)
plt.legend(loc=0)
plt.axis('tight')
plt.xlabel('index')
plt.ylabel('value')
plt.title('1st Data Set')
plt.subplot(122)
plt.bar(np.arange(len(y)), y[:, 1], width=0.5,
color='g', label='2nd')
plt.grid(True)
plt.legend(loc=0)
plt.axis('tight')
plt.xlabel('index')
plt.title('2nd Data Set')
# tag: matplotlib_10
# title: Plot combining line/point sub-plot with bar sub-plot
# size: 80
Data Visualisation – Other Plot Styles
y = np.random.standard_normal((1000, 2))

plt.figure(figsize=(7, 5))
plt.plot(y[:, 0], y[:, 1], 'ro')
plt.grid(True)
plt.xlabel('1st')
plt.ylabel('2nd')
plt.title('Scatter Plot')
# tag: matplotlib_11_a
# title: Scatter plot via +plot+ function

plt.figure(figsize=(7, 5))
plt.scatter(y[:, 0], y[:, 1], marker='o')
plt.grid(True)
plt.xlabel('1st')
plt.ylabel('2nd')
plt.title('Scatter Plot')
# tag: matplotlib_11_b
# title: Scatter plot via +scatter+ function
Data Visualisation – Other Plot Styles
c = np.random.randint(0, 10, len(y))

plt.figure(figsize=(7, 5))
plt.scatter(y[:, 0], y[:, 1], c=c, marker='o')
plt.colorbar()
plt.grid(True)
plt.xlabel('1st')
plt.ylabel('2nd')
plt.title('Scatter Plot')
# tag: matplotlib_11_c
# title: Scatter plot with third dimension

plt.figure(figsize=(7, 4))
plt.hist(y, label=['1st', '2nd'], bins=25)
plt.grid(True)
plt.legend(loc=0)
plt.xlabel('value')
plt.ylabel('frequency')
plt.title('Histogram')
# tag: matplotlib_12_a
# title: Histogram for two data sets
Data Visualisation – Other Plot Styles
plt.figure(figsize=(7, 4))
plt.hist(y, label=['1st', '2nd'], color=['b', 'g'],
stacked=True, bins=20)
plt.grid(True)
plt.legend(loc=0)
plt.xlabel('value')
plt.ylabel('frequency')
plt.title('Histogram')
# tag: matplotlib_12_b
# title: Stacked histogram for two data sets

fig, ax = plt.subplots(figsize=(7, 4))


plt.boxplot(y)
plt.grid(True)
plt.setp(ax, xticklabels=['1st', '2nd'])
plt.xlabel('data set')
plt.ylabel('value')
plt.title('Boxplot')
# tag: matplotlib_13
# title: Boxplot for two data sets
# size: 70
Data Visualisation – Other Plot Styles
from matplotlib.patches import Polygon
def func(x):
    return 0.5 * np.exp(x) + 1

a, b = 0.5, 1.5 # integral limits
x = np.linspace(0, 2)
y = func(x)

fig, ax = plt.subplots(figsize=(7, 5))
plt.plot(x, y, 'b', linewidth=2)
plt.ylim(bottom=0) # the ymin keyword is no longer accepted by newer Matplotlib versions

# Illustrate the integral value, i.e. the area under the function
# between lower and upper limit
Ix = np.linspace(a, b)
Iy = func(Ix)
verts = [(a, 0)] + list(zip(Ix, Iy)) + [(b, 0)]
poly = Polygon(verts, facecolor='0.7', edgecolor='0.5')
ax.add_patch(poly)
Data Visualisation – Other Plot Styles
plt.text(0.5 * (a + b), 1, r"$\int_a^b f(x)\mathrm{d}x$",
horizontalalignment='center', fontsize=20)

plt.figtext(0.9, 0.075, '$x$')


plt.figtext(0.075, 0.9, '$f(x)$')

ax.set_xticks((a, b))
ax.set_xticklabels(('$a$', '$b$'))
ax.set_yticks([func(a), func(b)])
ax.set_yticklabels(('$f(a)$', '$f(b)$'))
plt.grid(True)
# tag: matplotlib_math
# title: Exponential function, integral area and Latex labels
# size: 60
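
The shaded area in this figure is the integral of func between a and b. As a rough numerical cross-check, which is not part of the original slides, the same Ix and Iy points can be fed into a hand-written trapezoidal rule and compared with the closed-form value.

```python
import numpy as np

def func(x):
    return 0.5 * np.exp(x) + 1

a, b = 0.5, 1.5
Ix = np.linspace(a, b)     # 50 points by default
Iy = func(Ix)

# trapezoidal rule written out by hand: average height times width per interval
approx = np.sum((Iy[1:] + Iy[:-1]) / 2 * np.diff(Ix))

# closed form: an antiderivative of func is 0.5 * exp(x) + x
exact = (0.5 * np.exp(b) + b) - (0.5 * np.exp(a) + a)

print(approx, exact)
```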
Data Visualisation – Financial Plots
import pandas as pd
import cufflinks as cf # install first: pip install cufflinks (in Anaconda Prompt)

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

# data from FXCM Forex Capital Markets Ltd.
raw = pd.read_csv('Desktop/Dataset2.csv',
                  index_col=0, parse_dates=True)
raw.columns

quotes = raw[['OpenAsk', 'HighAsk', 'LowAsk', 'CloseAsk']]

qf = cf.QuantFig(quotes.iloc[-100:], title='EUR/USD', legend='top',
                 name='EUR/USD', datalegend=False)

iplot(qf.iplot(asFigure=True))

qf.add_bollinger_bands(periods=15, boll_std=2)
Data Visualisation – Financial Plots
iplot(qf.iplot(asFigure=True))
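
Bollinger bands are conventionally a moving average plus and minus a multiple of the moving standard deviation. The sketch below computes them with plain pandas for periods=15 and two standard deviations. It illustrates the usual definition and is not necessarily how cufflinks calculates the bands internally. The prices are simulated and only borrow the 'CloseAsk' column name.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# hypothetical closing prices in place of the real quote data
close = pd.Series(1.10 + rng.normal(scale=0.001, size=100).cumsum(), name='CloseAsk')

periods, n_std = 15, 2
middle = close.rolling(window=periods).mean()          # middle band
width = n_std * close.rolling(window=periods).std()    # band half-width
upper, lower = middle + width, middle - width

print(pd.DataFrame({'lower': lower, 'middle': middle, 'upper': upper}).tail())
```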

3d Plotting

strike = np.linspace(50, 150, 24)
ttm = np.linspace(0.5, 2.5, 24)
strike, ttm = np.meshgrid(strike, ttm)

strike[:2]

iv = (strike - 100) ** 2 / (100 * strike) / ttm
# generate fake implied volatilities
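
np.meshgrid turns the two one-dimensional arrays into two-dimensional grids, so that the formula is evaluated for every (strike, time-to-maturity) pair at once. A reduced version with 3 strikes and 2 maturities, with the grid sizes chosen only to keep the output readable:

```python
import numpy as np

strike = np.linspace(50, 150, 3)    # 50, 100, 150
ttm = np.linspace(0.5, 2.5, 2)      # 0.5, 2.5
strike_grid, ttm_grid = np.meshgrid(strike, ttm)

# same formula as above, applied element-wise to the two grids
iv = (strike_grid - 100) ** 2 / (100 * strike_grid) / ttm_grid

print(strike_grid.shape)   # one row per maturity, one column per strike
print(iv)
```

The value is zero at a strike of 100 and grows as the strike moves away from it, which is what produces the smile shape in the surface plot.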
Data Visualisation – Financial Plots
> from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure(figsize=(9, 6))
ax = fig.add_subplot(projection='3d') # fig.gca(projection='3d') no longer works in newer Matplotlib versions

surf = ax.plot_surface(strike, ttm, iv, rstride=2, cstride=2,
                       cmap=plt.cm.coolwarm, linewidth=0.5,
                       antialiased=True)

ax.set_xlabel('strike')
ax.set_ylabel('time-to-maturity')
ax.set_zlabel('implied volatility')

fig.colorbar(surf, shrink=0.5, aspect=5)
# tag: matplotlib_17
# title: 3d surface plot for (fake) implied volatilities
# size: 70
Data Visualisation – Financial Plots
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111, projection='3d')
ax.view_init(30, 60)

ax.scatter(strike, ttm, iv, zdir='z', s=25,
           c='b', marker='^')

ax.set_xlabel('strike')
ax.set_ylabel('time-to-maturity')
ax.set_zlabel('implied volatility')

# tag: matplotlib_18
# title: 3d scatter plot for (fake) implied volatilities
# size: 70
Thank You
