
Bank Nifty Predictive Analysis through Time Series Forecasting

Submitted by:
Mohd Saad Khan, Roll No. 21MFE019
Mohd Owais, Roll No. 21MFE016

Supervised by:
Prof. Dr. Moonis Shakeel, Department of Economics, Jamia Millia Islamia

Course: MSc. Banking and Financial Analytics
Semester: 3rd
Introduction
● Bank Nifty prediction using machine learning helps estimate the future value of banking stocks and other financial assets traded on an exchange.
● The underlying idea of predicting Bank Nifty is to gain significant profit.
● Machine-learning and statistical models such as LSTMs, ARIMA, and GARCH are popular choices for forecasting stock-market time series.
● It is vital to examine the accuracy of forecasts, because each method has restrictions in application.
● The main goal of this paper is to see which forecasting approach provides the best prediction in terms of minimum squared error and maximum forecasting accuracy.
Research Methodologies
Data Collection
● We imported our data from public sources such as Yahoo Finance (via the yfinance library).
Data and Pattern Analysis
● First we sorted the data with the pandas library; then we plotted it with libraries such as seaborn and matplotlib.
Popular Forecasting Methods
● The methods we will use to predict the Nifty Bank prices are LSTMs, ARIMA, and GARCH.
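The two classical models named here can be illustrated in miniature. The sketch below is on synthetic data with illustrative parameters (not the fits used later in this notebook): an AR(1) one-step forecast estimated by least squares, and the GARCH(1,1) conditional-variance recursion, both in plain NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- AR(1): r_t = phi * r_{t-1} + eps_t, phi estimated by least squares ---
r = np.empty(500)
r[0] = 0.0
for t in range(1, 500):
    r[t] = 0.6 * r[t - 1] + rng.normal(scale=0.01)  # true phi = 0.6

x, y = r[:-1], r[1:]
phi_hat = (x @ y) / (x @ x)    # OLS slope through the origin
one_step = phi_hat * r[-1]     # one-step-ahead forecast of the next return

# --- GARCH(1,1): sigma2_t = omega + alpha*r_{t-1}^2 + beta*sigma2_{t-1} ---
omega, alpha, beta = 1e-6, 0.1, 0.85  # illustrative parameters, not estimates
sigma2 = np.empty_like(r)
sigma2[0] = r.var()
for t in range(1, len(r)):
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
```

In practice one would fit these with dedicated libraries (e.g. statsmodels for ARIMA, arch for GARCH), which handle parameter estimation and diagnostics; the recursion above only shows the model structure.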
Assessment Metric
● Root-mean-square error (RMSE) is a commonly used metric for evaluating the accuracy of a model's predictions. Other metrics, such as the F-score, precision, and the classification report, are also available for evaluating classification-style models.
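The RMSE metric just described can be sketched in a few lines (illustrative arrays, not this notebook's actual predictions):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# errors are -1, +1, -2 -> squared mean 2 -> RMSE = sqrt(2)
print(rmse([100.0, 102.0, 101.0], [101.0, 101.0, 103.0]))  # → 1.4142135623730951
```

Note the parenthesisation: the squaring must happen inside the mean, `np.mean((err) ** 2)`, not `np.mean(err) ** 2`, which would instead give the absolute mean error.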
Internship experience

● Account opening
● Cheque clearing
● Locker Notices
● Updating database manually
● e-KYC
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import yfinance as yf
import math
from sklearn.preprocessing import MinMaxScaler

asss=['AUBANK','AXISBANK','BANDHANBNK','BANKBARODA','FEDERALBNK','HDFCBANK','ICICIBANK',
      'IDFCFIRSTB','INDUSINDBK','KOTAKBANK','PNB','SBIN']
components=list(asss)

components

['AUBANK',
'AXISBANK',
'BANDHANBNK',
'BANKBARODA',
'FEDERALBNK',
'HDFCBANK',
'ICICIBANK',
'IDFCFIRSTB',
'INDUSINDBK',
'KOTAKBANK',
'PNB',
'SBIN']

# append the NSE suffix to each ticker
for i in range(len(components)):
    components[i] = components[i] + '.NS'
components

['AUBANK.NS',
'AXISBANK.NS',
'BANDHANBNK.NS',
'BANKBARODA.NS',
'FEDERALBNK.NS',
'HDFCBANK.NS',
'ICICIBANK.NS',
'IDFCFIRSTB.NS',
'INDUSINDBK.NS',
'KOTAKBANK.NS',
'PNB.NS',
'SBIN.NS']

components.append('^NSEBANK')
components

['AUBANK.NS',
'AXISBANK.NS',
'BANDHANBNK.NS',
'BANKBARODA.NS',
'FEDERALBNK.NS',
'HDFCBANK.NS',
'ICICIBANK.NS',
'IDFCFIRSTB.NS',
'INDUSINDBK.NS',
'KOTAKBANK.NS',
'PNB.NS',
'SBIN.NS',
'^NSEBANK']

data=yf.download(components,start='2007-4-1',end='2023-03-31')
data

[*********************100%***********************] 13 of 13 completed
Adj Close

AUBANK.NS AXISBANK.NS BANDHANBNK.NS BANKBARODA.NS FEDERALBNK.NS HDFCBANK

Date

2007-04-02 NaN 81.345421 NaN 31.156727 9.322885

2007-04-03 NaN 82.443481 NaN 31.257248 9.465549

2007-04-04 NaN 81.239159 NaN 31.589773 9.490828

2007-04-05 NaN 79.955132 NaN 33.066803 9.992247

2007-04-09 NaN 88.438591 NaN 34.837681 10.341372

... ... ... ... ... ...

2023-03-23 569.650024 848.799988 212.500000 162.699997 127.000000 1563.150

2023-03-24 559.900024 839.900024 203.100006 159.800003 126.849998 1560.650

2023-03-27 566.900024 833.349976 197.800003 161.250000 125.449997 1567.449

2023-03-28 563.549988 832.450012 187.449997 160.750000 125.699997 1580.199

2023-03-29 585.549988 842.599976 189.899994 164.449997 129.100006 1587.800

3947 rows × 78 columns

banknifty=yf.download('^NSEBANK',start='2007-4-1',end='2023-4-1')

[*********************100%***********************] 1 of 1 completed

banknifty

Open High Low Close Adj Close Volume

Date

2007-09-17 6898.000000 6977.200195 6843.000000 6897.100098 6897.020020


2007-09-18 6921.149902 7078.950195 6883.600098 7059.649902 7059.567871

2007-09-19 7111.000000 7419.350098 7111.000000 7401.850098 7401.764160

2007-09-20 7404.950195 7462.899902 7343.600098 7390.149902 7390.063965

2007-09-21 7378.299805 7506.350098 7367.149902 7464.500000 7464.413086

... ... ... ... ... ...

2023-03-24 39555.250000 39767.898438 39294.898438 39395.351562 39395.351562 166100

2023-03-27 39484.699219 39695.199219 39273.750000 39431.300781 39431.300781 194200

2023-03-28 39545.050781 39645.199219 39326.101562 39567.898438 39567.898438 202300

2023-03-29 39611.550781 40055.000000 39609.550781 39910.148438 39910.148438 259600

2023-03-31 40231.250000 40690.398438 40180.199219 40608.648438 40608.648438 188000

3538 rows × 6 columns

from datetime import date

plt.figure(figsize=(16,8))
plt.title('Price Chart')
plt.plot(banknifty['Close'])
plt.xlabel('Date', fontsize=18)
plt.ylabel('Closing Price',fontsize=18)
plt.show()
## volume graph
banknifty['Volume'].plot(label='nifty bank volume',figsize=(16,8))

<Axes: xlabel='Date'>
### log return
logreturn=banknifty['Log Return']=np.log(banknifty['Close']/banknifty['Close'].shift(1))
logreturn

Date
2007-09-17 NaN
2007-09-18 0.023294
2007-09-19 0.047335
2007-09-20 -0.001582
2007-09-21 0.010010
...
2023-03-24 -0.005608
2023-03-27 0.000912
2023-03-28 0.003458
2023-03-29 0.008612
2023-03-31 0.017350
Name: Close, Length: 3538, dtype: float64

banknifty['Log Return'].plot(title='log return',figsize=(16,8))

<Axes: title={'center': 'log return'}, xlabel='Date'>


# histogram of closing prices
plt.hist(banknifty['Close'], bins=500)
plt.show()

## histogram volume
plt.hist(banknifty['Volume'])
plt.show()
#only closing data
close_df=banknifty.filter(['Close'])

#convert to numpy array


dataset=close_df.values

dataset

array([[ 6897.10009766],
[ 7059.64990234],
[ 7401.85009766],
...,
[39567.8984375 ],
[39910.1484375 ],
[40608.6484375 ]])

#size of training data


training_data_len=math.ceil(len(dataset)*0.8)
training_data_len

2831

#scaling the data


scaler= MinMaxScaler(feature_range=(0,1))
scaled_data=scaler.fit_transform(dataset)

scaled_data

array([[0.08738522],
[0.09137815],
[0.09978408],
...,
[0.88992216],
[0.89832931],
[0.9154875 ]])

training_dataset = scaled_data[0:training_data_len, :]
# splitting into x_train and y_train using 60-day sliding windows
x_train = []
y_train = []
for i in range(60, len(training_dataset)):
    x_train.append(training_dataset[i-60:i, 0])
    y_train.append(training_dataset[i, 0])
    if i <= 61:
        print(x_train)
        print(y_train)
        print()

[array([0.08738522, 0.09137815, 0.09978408, 0.09949667, 0.10132303,


0.10590183, 0.10536755, 0.10848108, 0.11039096, 0.11551386,
0.11417019, 0.11688209, 0.1153591 , 0.11067591, 0.1053 ,
0.11191887, 0.11523014, 0.11837806, 0.112856 , 0.12151001,
0.12558525, 0.11693122, 0.10486644, 0.10032326, 0.10386544,
0.1169607 , 0.12009389, 0.12500676, 0.13289806, 0.13906002,
0.13742404, 0.13936462, 0.14392006, 0.15281483, 0.14880715,
0.14435486, 0.1412573 , 0.13600175, 0.13310194, 0.14051545,
0.15410076, 0.15021101, 0.15068879, 0.14877522, 0.14586679,
0.13486074, 0.13901211, 0.13849873, 0.1420495 , 0.14077952,
0.13832553, 0.14209741, 0.14827165, 0.14880715, 0.14805916,
0.15332084, 0.15427518, 0.15859726, 0.16027992, 0.16583393])]
[0.16499874668438996]

[array([0.08738522, 0.09137815, 0.09978408, 0.09949667, 0.10132303,


0.10590183, 0.10536755, 0.10848108, 0.11039096, 0.11551386,
0.11417019, 0.11688209, 0.1153591 , 0.11067591, 0.1053 ,
0.11191887, 0.11523014, 0.11837806, 0.112856 , 0.12151001,
0.12558525, 0.11693122, 0.10486644, 0.10032326, 0.10386544,
0.1169607 , 0.12009389, 0.12500676, 0.13289806, 0.13906002,
0.13742404, 0.13936462, 0.14392006, 0.15281483, 0.14880715,
0.14435486, 0.1412573 , 0.13600175, 0.13310194, 0.14051545,
0.15410076, 0.15021101, 0.15068879, 0.14877522, 0.14586679,
0.13486074, 0.13901211, 0.13849873, 0.1420495 , 0.14077952,
0.13832553, 0.14209741, 0.14827165, 0.14880715, 0.14805916,
0.15332084, 0.15427518, 0.15859726, 0.16027992, 0.16583393]), array([0.09137815, 0.09978408, 0.09949667, 0.10132303, 0.10590183,
0.10536755, 0.10848108, 0.11039096, 0.11551386, 0.11417019,
0.11688209, 0.1153591 , 0.11067591, 0.1053 , 0.11191887,
0.11523014, 0.11837806, 0.112856 , 0.12151001, 0.12558525,
0.11693122, 0.10486644, 0.10032326, 0.10386544, 0.1169607 ,
0.12009389, 0.12500676, 0.13289806, 0.13906002, 0.13742404,
0.13936462, 0.14392006, 0.15281483, 0.14880715, 0.14435486,
0.1412573 , 0.13600175, 0.13310194, 0.14051545, 0.15410076,
0.15021101, 0.15068879, 0.14877522, 0.14586679, 0.13486074,
0.13901211, 0.13849873, 0.1420495 , 0.14077952, 0.13832553,
0.14209741, 0.14827165, 0.14880715, 0.14805916, 0.15332084,
0.15427518, 0.15859726, 0.16027992, 0.16583393, 0.16499875])]
[0.16499874668438996, 0.16165185396165127]

## converting to numpy arrays
x_train = np.array(x_train)
y_train = np.array(y_train)

## LSTM expects 3-D input: (samples, timesteps, features)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

x_train.shape

(2771, 60, 1)

from tensorflow.keras.layers import Dense, Dropout,LSTM

from tensorflow.keras.models import Sequential

## building LSTM
model=Sequential()
model.add(LSTM(50,return_sequences=True,input_shape=(x_train.shape[1],1)))
model.add(LSTM(50,return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))

##compiling model
model.compile(optimizer='adam',loss='mean_squared_error')

model.fit( x_train, y_train, batch_size=1, epochs=1)

2771/2771 [==============================] - 68s 23ms/step - loss: 8.4483e-04


<keras.callbacks.History at 0x7fc98f73c610>

##create the testing data set


## creating a new array containing the scaled values from index 2771 (training_data_len - 60) to 3538
testing_data=scaled_data[training_data_len-60: , :]
#create the data set
x_test=[]
y_test=dataset[training_data_len:, :]
for i in range(60, len(testing_data)):
    x_test.append(testing_data[i-60:i, 0])

## converting to numpy array


x_test=np.array(x_test)

##reshaping the data


x_test=np.reshape(x_test,(x_test.shape[0],x_test.shape[1],1))

## get the model's predicted price values


prediction=model.predict(x_test)
prediction=scaler.inverse_transform(prediction)

23/23 [==============================] - 1s 13ms/step


## model evaluation: root-mean-square error (RMSE)
rmse_model = np.sqrt(np.mean((prediction - y_test) ** 2))
rmse_model

1417.3565986341937

## plot the data


train = close_df[:training_data_len]
valid = close_df[training_data_len:].copy()  # .copy() avoids SettingWithCopyWarning
valid['prediction'] = prediction
## visualization
plt.figure(figsize=(16,8))
plt.title('Model output')
plt.xlabel('Date',fontsize=18)
plt.ylabel('Close price INR',fontsize=18)
plt.plot(train['Close'])
plt.plot(valid[['Close','prediction']])
plt.legend(['Train','Val','Predictions'],loc='lower right')
plt.show()

## actual and predicted prices
valid

Close prediction

Date

2020-05-29 19297.250000 17952.277344

2020-06-01 19959.900391 18246.832031

2020-06-02 20530.199219 18636.347656

2020-06-03 20940.699219 19099.777344

2020-06-04 20390.449219 19595.921875

... ... ...

2023-03-24 39395.351562 37641.164062

2023-03-27 39431.300781 37667.566406

2023-03-28 39567.898438 37673.480469

2023-03-29 39910.148438 37684.117188

2023-03-31 40608.648438 37740.035156

707 rows × 2 columns
