Paper 109
2 authors, including:
Nisha Shetty
Manipal Academy of Higher Education
All content following this page was uploaded by Nisha Shetty on 07 January 2019.
1 Introduction
This section describes the limitations of the traditional approach to stock market
analysis and lists the benefits of using machine learning and sentiment analysis.
The stock market is a highly volatile, non-deterministic system with a vast number of
factors influencing the direction of the trend on varying scales and across multiple
layers. The Efficient Market Hypothesis (EMH) states that the market is self-correcting,
i.e. the current stock price reflects the most relevant cumulative price, which is
neither undervalued nor overvalued, and any new information is instantly reflected in a
price change [1]. In layman’s terms, “the market is unbeatable”: no investor can gain an
advantage over it. Existing research, however, indicates otherwise: market trends can be
predicted by analyzing the patterns of stock movement. The traditional approach applies
the following models for this.
Fundamental analysis
This approach focuses mainly on a company’s past performance and
credibility. Performance measures such as the P/E ratio are used to filter
stocks that are likely to see a positive price surge. The approach rests on the
theory that profitable companies will continue to be profitable, because the
market rewards them with a sustained uptrend.
Technical analysis
This approach predicts future prices by applying time series analysis to
previous trends. Statistical techniques such as Bollinger Bands and simple
moving averages are applied to predict the successive trends.
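The two indicators named above can be illustrated with a minimal sketch. The window length, the band width k and the use of the population standard deviation are illustrative assumptions, not settings taken from this paper.

```python
from statistics import mean, pstdev


def simple_moving_average(prices, window):
    """Mean of each run of `window` consecutive prices."""
    if window < 1 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [mean(prices[i - window + 1:i + 1])
            for i in range(window - 1, len(prices))]


def bollinger_bands(prices, window=20, k=2.0):
    """Return (lower, middle, upper) per window: SMA -/+ k standard deviations."""
    if window < 1 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    bands = []
    for i in range(window - 1, len(prices)):
        chunk = prices[i - window + 1:i + 1]
        middle = mean(chunk)
        spread = k * pstdev(chunk)  # assumption: population standard deviation
        bands.append((middle - spread, middle, middle + spread))
    return bands
```

A closing price that moves outside the upper or lower band is commonly read as an unusually strong move relative to recent volatility.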
Computer science provides cutting-edge machine learning tools such as SVM and EML,
which can analyze data and perform knowledge discovery at large scale in a short amount
of time. Two approaches to stock market prediction are proposed in this research.
Qualitative Analysis
News feeds about the stock market strongly affect the market trend:
negative news tends to push prices downhill. The media, social networks
and stock market data are therefore highly coupled, which makes the
system more unpredictable. Existing research points out that in a crisis,
stocks mimic each other, which leads to market crashes [1]. Nowadays, Twitter
has emerged as one of the fastest ways of consuming media.
By combining news feeds and Twitter feeds, the general public
sentiment about a company can be gauged. Text mining and sentiment
analysis are useful tools for analysis at this scale.
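As a rough illustration of how text can be turned into a sentiment score, the sketch below uses a tiny hand-made word list and averages the polarity over all articles about a company. The word lists, the scoring rule and the function names are hypothetical and are not the lexicon or scoring used in this research.

```python
import re

# Hypothetical mini-lexicon, for illustration only.
POSITIVE = {"gain", "profit", "surge", "growth", "upgrade"}
NEGATIVE = {"loss", "fraud", "crash", "decline", "downgrade"}


def article_polarity(text):
    """Score one article in [-1, 1] from counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def company_sentiment(articles):
    """Average polarity over the articles; no articles counts as neutral."""
    if not articles:
        return 0.0
    return sum(article_polarity(a) for a in articles) / len(articles)
```

A production system would replace the word lists with a trained classifier or a full sentiment lexicon; the averaging step stays the same.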
Quantitative Analysis
Historical data is now readily available for most markets. Using this data,
we can apply multiple machine learning models to forecast prices and guide
future investments. These models can be trained for individual stocks, with
the bias adjusted toward the most reflective features. They can also be trained to
work in different scenarios and on overall market movement.
2 Literature Survey
In [4], Paul D. Yoo et al. investigate the success of machine learning models and
event-driven models such as sentiment analysis in predicting stock market trends.
They also highlight that macro-economic conditions, such as international and
political events, affect market trends and need to be taken into consideration.
The research done by Dongning Rao et al. in [6] provides great insight into the proper
implementation of sentiment analysis. They propose increasing the size of the corpus
(training data) with each test. This is done by adding non-polarizing words that are
found in the test data but are not present in the corpus, which makes the training
data more effective with each successive test.
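The corpus-growing idea can be sketched as follows. This is only an illustration of the general idea, not the procedure of [6]: the dictionary layout, the tokenizer and the assumption that every unseen word is neutral (polarity 0) are all hypothetical.

```python
import re


def expand_corpus(corpus, test_text):
    """Add words from test_text that the corpus lacks, tagged as neutral.

    `corpus` maps a word to its polarity (+1, -1 or 0) and is updated in place.
    Returns the list of newly added words, in order of first appearance.
    """
    added = []
    for word in re.findall(r"[a-z']+", test_text.lower()):
        if word not in corpus:
            corpus[word] = 0  # assumption: unseen words are non-polarizing
            added.append(word)
    return added
```

Each call leaves existing polarities untouched, so the corpus only grows, and repeated tests on similar text add fewer and fewer words.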
3 Methodology
(1)
Where α is the acceptable error rate.
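A minimal sketch of such a tolerance-based accuracy is given below. It assumes that a prediction counts as correct when its percentage error relative to the actual closing price is strictly below α, which is what the captions of Table 1 and Table 2 suggest; the function name and the exact error definition are assumptions.

```python
def accuracy_within(actual, predicted, alpha):
    """Percentage of predictions whose relative error is below alpha percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = 0
    for a, p in zip(actual, predicted):
        error_pct = abs(p - a) / abs(a) * 100  # assumes non-zero actual prices
        if error_pct < alpha:
            hits += 1
    return 100 * hits / len(actual)
```

For example, with actual prices of 100 and predictions of 101, 103, 96 and 110, one of four predictions falls within α = 2 and three of four within α = 5.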
The accuracies observed for α = 2 and α = 5 are given in Table 1 and Table 2
respectively.
Table 1. Accuracy for closing price prediction (error rate less than 2 %)

Classifier               Accuracy
Lasso                    40.79 %
LassoLars                51.61 %
Elastic Net              40.79 %
Ridge Regressor          85.4 %
SVR (kernel = linear)    0.97 %
SVR (kernel = RBF)       0.97 %
Random forest            15.44 %
Ada boost                3.99 %
Decision Tree            3.67 %

Table 2. Accuracy for closing price prediction (error rate less than 5 %)

Classifier               Accuracy
Lasso                    64.03 %
LassoLars                72.49 %
Elastic Net              64.03 %
Ridge Regressor          94.2 %
SVR (kernel = linear)    2.37 %
SVR (kernel = RBF)       2.37 %
Random forest            29.49 %
Ada boost                7.49 %
Decision Tree            9.29 %

Since the Ridge regressor gives the most accurate results on our dataset, it
was selected as the regressor for the Machine Learning module, which provides
the Stock Prediction value.
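To make the idea of ridge regression concrete, the sketch below fits a one-feature ridge model in closed form. It is not the configuration used in this work: the single feature, the `penalty` value and the unpenalized intercept are illustrative assumptions, and `penalty` here is the regularization strength, unrelated to the error rate α.

```python
def fit_ridge_1d(xs, ys, penalty=1.0):
    """Fit y = slope * x + intercept with an L2 penalty on the slope only."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and equal in length")
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx + penalty == 0:
        raise ValueError("constant feature needs a positive penalty")
    slope = sxy / (sxx + penalty)  # the penalty shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new feature value."""
    slope, intercept = model
    return slope * x + intercept
```

With `penalty=0` this reduces to ordinary least squares; larger values trade some fit on the training data for a more stable slope.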
The formula in Equation (2) gives the Stock Prediction value.
(2)
(3)
where n is the number of news articles pertaining to each stock.
3.3 Fuzzy logic Module
The purpose of this module is to output Stock Faith, which is the strength of the
recommendation.
The activation rules for this module are:
IF the News Sentiment is good OR the Stock Prediction value is good,
THEN the Stock Faith will be high.
IF the Stock Prediction value is average, THEN the Stock Faith will be
medium.
IF the News Sentiment is poor AND the Stock Prediction value is poor,
THEN the Stock Faith will be low.
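The three rules above can be sketched as a small Mamdani-style inference. The 0 to 10 scale, the triangular membership functions, the use of max for OR and min for AND, and centroid defuzzification are assumptions made for illustration; the membership functions actually used in this module are not restated here.

```python
def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed fuzzy sets on a 0-10 scale, shared by both inputs and the output.
def poor(x):
    return tri(x, 0, 0, 5)


def average(x):
    return tri(x, 0, 5, 10)


def good(x):
    return tri(x, 5, 10, 10)


def stock_faith(news_sentiment, stock_prediction, steps=100):
    """Return a crisp Stock Faith in [0, 10], or None if no rule fires."""
    high = max(good(news_sentiment), good(stock_prediction))  # rule 1: OR -> max
    medium = average(stock_prediction)                        # rule 2
    low = min(poor(news_sentiment), poor(stock_prediction))   # rule 3: AND -> min
    num = den = 0.0
    for i in range(steps + 1):
        y = 10 * i / steps
        # Clip each output set by its rule strength, then take the union.
        mu = max(min(high, good(y)), min(medium, average(y)), min(low, poor(y)))
        num += y * mu
        den += mu
    return num / den if den > 0 else None  # centroid defuzzification
```

Because the rule base has only three rules, some input pairs (for example an average News Sentiment with a very poor Stock Prediction value) fire no rule at all; the sketch returns `None` in that case rather than inventing a default.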
The complete operation is illustrated in Fig. 2.
Case 1: IF the News Sentiment is good OR the Stock Prediction value is good,
THEN the Stock Faith will be high, as shown in Fig. 3.
Case 2: IF the News Sentiment is poor AND the Stock Prediction value is poor,
THEN the Stock Faith will be low, as shown in Fig. 4.
5 Scope
The National Stock Exchange of India (NSE), located in Mumbai, ranks as the 12th
largest in the world. NSE India has 1659 companies listed for public trading. Of
these, only 50 (known as the Nifty50) are the focus of most investors. The Nifty50
acts as a barometer for Indian stock market growth. The Indian economy relies mostly
on exporting agricultural goods and services such as software and technical support.
Unfortunately, only 4 % of India’s GDP is derived from the stock market. This is much
less than in other developing countries, where the share ranges from 20 % to 40 %.
This untapped resource can be monetized more efficiently to contribute to the
development of India.
6 Conclusion and Future Work
In this research, we propose that existing work [1-8] may be integrated into a robust
model to predict the NSE stock market accurately. The model can be improved by
defining more refined fuzzy rules, and increasing the scale and timeframe of the
training data can result in better predictions. A trading model using the proposed
methodology can be developed to compute total returns on investments in real time,
which would help validate the accuracy of the model. The model can then be used to
recommend the best stocks for investment.
References