
JCSSP 2025 817 826

This research paper compares three models—Multiple Linear Regression, Support Vector Machine (SVM), and Long Short-Term Memory (LSTM)—to predict the stock price of Garuda Indonesia (GIAA) using stock price data and the exchange rate between IDR and USD. The findings indicate that Multiple Linear Regression outperformed the other models in predicting stock price movements, highlighting the influence of the IDR/USD exchange rate on stock prices. The study aims to provide insights for researchers and investors in making informed stock investment decisions.


Journal of Computer Science

Original Research Paper

Comparative Study of Garuda Indonesia Stock Price Prediction Using SVM, LSTM and Multiple Linear Regression
1Muhammad Naufal Luthfi and 2Evaristus Didik Madyatmadja
1BINUS Graduate Program Master of Information Systems Management, Bina Nusantara University, Jakarta, Indonesia
2Department of Information Systems, School, Bina Nusantara University, Jakarta, Indonesia
Article history
Received: 05-08-2024
Revised: 30-10-2024
Accepted: 18-12-2024

Corresponding Author:
Muhammad Naufal Luthfi
BINUS Graduate Program Master of Information Systems Management, Bina Nusantara University, Jakarta, Indonesia
Email: muhammad.luthfi008@binus.ac.id

Abstract: Shares are one of the investment products used by many people. They offer an interesting option for saving or investing and can provide attractive returns based on company growth. Many factors, both internal and external to the company, can affect share prices. This research was conducted on the stock price of Garuda Indonesia (share code GIAA), one of the big airlines in Indonesia. Machine learning and deep learning are popular approaches that give insight and recommendations for stock price movement and prediction. In this study, the researchers compare multiple linear regression, support vector machine, and long short-term memory models, using stock price data and IDR/USD exchange rate data, to give new insight to other researchers and to support investors in making better decisions on stock investment strategies. The results show that multiple linear regression gave the best result in predicting the stock price movement of Garuda Indonesia using the IDR/USD exchange rate, with the best R-Squared, MAPE, MSE, and RMSE values. This suggests that the IDR/USD exchange rate and the stock price movement are related.
Keywords: Machine Learning, Artificial Intelligence, Data Mining, Stock Price Prediction, Garuda Indonesia
Introduction
Shares are an investment instrument used by many people because they can provide attractive returns. Shares can be interpreted as the participation of a person or corporate body in a company or limited liability company (PT). The average return obtained by each person who invests capital reaches between 11-13% per year, supported by the performance of the IHSG. However, many investors in Indonesia who take part in purchasing shares get results that do not match expectations or even suffer losses.
The advantages of buying shares include dividends, which give a percentage of profits to shareholders; the amount given depends on the results of the shareholders' general meeting. Shareholders also obtain capital gains when they sell their shares at a price higher than the purchase price.
Apart from the advantages, there are also disadvantages in buying shares. The first is not getting dividends because the company invested in does not perform well or its performance declines. The second is the opposite of capital gain, namely capital loss, where shareholders sell shares at a price that is lower than the purchase price. The last one is liquidity risk, where the issuer goes bankrupt or is liquidated, so that in the worst case shareholders will not make a profit.
According to CNBC Indonesia news on January 25, 2024 (Puspadini, 2024), the number of investors in Indonesia reached 12.2 million Single Investor Identification (SID), as stated by the development director of the Indonesian stock exchange, Mr. Jeffrey Hendrick, who said that the number of investors only covers 5% of the total population of Indonesia. This figure is very small compared to Singapore, where it has reached 30-40% of the total population. Fig. 1 shows the development of the number of stock investors in Indonesia from March 2023 to December 2023.
Fig. 1: Number of investors in Indonesia

© 2025 Muhammad Naufal Luthfi and Evaristus Didik Madyatmadja. This open-access article is distributed under a Creative
Commons Attribution (CC-BY) 4.0 license.
Muhammad Naufal Luthfi and Evaristus Didik Madyatmadja / Journal of Computer Science 2025, 21 (4): 817.826
DOI: 10.3844/jcssp.2025.817.826

In the current technological era, artificial intelligence can help predict stock prices by studying patterns in each variable, in ways that can be better than simple statistical methods. Machine Learning (ML) and Deep Learning (DL) can make predictions based on time series data. ML is used to read data patterns to determine stock data movements and reduce investment risk when making decisions. Meanwhile, DL is a technology that uses neural networks inspired by the human brain to solve complex non-linear problems.
Predicting time series data is generally very difficult due to the unprecedented changes caused by changing economic trends. Therefore, an assessment of forecasting accuracy is necessary when using various machine learning models, since each model has several limitations. Some examples of models used to analyze time series data include Support Vector Machine (SVM), Long Short-Term Memory (LSTM) and multiple linear regression.
Many previous studies have addressed stock prediction with machine learning and artificial intelligence, among others: Analysis and forecasting of time-series data using S-ARIMA, CNN and LSTM (Dwivedi et al., 2021); StockPred: A framework for stock price prediction (Sharaf et al., 2021); and A Comparison of ARIMA and LSTM in forecasting time series (Siami-Namini et al., 2018).
The research problem is formulated as two questions. The first is which of the three proposed machine learning models performs best, and the second is how the USD/IDR exchange rate affects the stock price. The goal of this research is to identify the best of the proposed models and to determine whether the USD/IDR exchange rate affects the stock price.
The research benefits are considered from two perspectives. The first is for researchers: we expect that this research can contribute to machine learning and deep learning knowledge for predicting stock price movement, so that other researchers can use it as a reference. The second is for investors: we hope this research can help investors in making buying and selling decisions and serve as a reference for stock trading.
Ethical Considerations
This study does not involve any student or organization questionnaires. The stock price data and the Indonesian Rupiah and US dollar exchange rate data were downloaded from the public pages of each organization.
Literature Review
Stock
Shares are securities traded on the stock exchange. The stock exchange or capital market is a place where private company activities take place in the form of investment. Shares are one of the ways for companies to raise capital. Issuing shares benefits two parties: the company, which gets capital income for company expansion, and investors, who invest in the form of shares and gain profits from the expansion or development of the business.
Stock prices are always moving, and two kinds of factors can influence them: external factors and internal factors. The external factors are macroeconomic fundamental conditions, fluctuations in the rupiah exchange rate against foreign currencies, government policies, panic factors and market manipulation; these factors occur outside the company's internal control. The internal factors that can influence share prices arise from company activities, including company fundamental factors, corporate actions, and projections of future company performance (Otoritas Jasa Keuangan, 2019).
Data Mining
Data Mining is a method used to explore large amounts of data and look for patterns or insights in the data collection. The techniques used in data mining are statistical techniques to summarize and analyze patterns and trends in data, machine learning to build algorithms that learn from data and make predictions and, finally, database systems to store, manage and retrieve data automatically and efficiently.
There are several methodologies for carrying out data mining processes, including CRISP-DM (Cross-Industry Standard Process for Data Mining), KDD (Knowledge Discovery in Database) and SEMMA (Sample, Explore, Modify, Model, Assess). In this research, we use the CRISP-DM framework as the data mining process.
Machine Learning
Machine learning is a technique for studying patterns and shapes using data and statistics. Machine learning models take data as input so that the model can provide output in the form of decisions, recommendations, and predictions. There are three categories of how machine learning learns:
• Supervised learning is where the model learns a pattern from data that has been labeled. This model can map input to output
• Unsupervised learning is used on data that does not have labels. It is tasked with reviewing data so that hidden patterns or data groupings can be explored. It is usually used in clustering or grouping analysis
• Reinforcement learning is a method for agents that must make decisions and take actions in certain circumstances with the aim of maximizing rewards
The three categories above are used under different conditions, depending on the form of the data presented for studying patterns.


Linear Model
The linear model is a fundamental algorithm that models a linear relationship between the input features (independent variables) and the target output (dependent variable). It is simple, interpretable, and efficient in training and prediction. Linear models have parameters in the form of coefficients that determine the relationship between the independent variables and the dependent variable. A linear model is trained by learning the optimal parameters that minimize a loss function, which measures the error between the predicted value and the actual value.
Some linear models used in machine learning include linear regression, multiple linear regression, logistic regression, support vector machine, and linear discriminant analysis. According to Nunno (2014), stock price movements can be predicted using several regression models such as support vector machine, linear regression, multiple linear regression and neural network-based regression.
Multiple Linear Regression
Multiple Linear Regression is a model that shows the relationship between one dependent variable and two or more independent variables. Multiple linear regression is usually used to carry out predictive analysis with the aim of supporting business decisions (Panwar et al., 2021).
Support Vector Machine
SVM is an algorithm used for classification and regression in supervised learning. SVM can find patterns in complex data. SVM works by looking for a hyperplane that maximizes the margin between two classes. The hyperplane itself is the boundary that SVM uses to separate the two classes of data.
Recurrent Neural Network
The basic feature of a Recurrent Neural Network (RNN) is that the model contains at least one feedback loop, so that activations can flow around in a loop. RNN is designed to process sequential data such as text, speech, or time series data. The RNN module will perform at least one repetition (Mittal and Chauhan, 2021).
Long Short-Term Memory
Long Short-Term Memory (LSTM) is a type of RNN designed to overcome the weaknesses of standard RNNs. LSTM introduces special gates that control the flow of information in and out of the memory cell, which is the storage unit of the network. These gates determine which information is stored, which is discarded, and how much new input is let in.
Cross Industry Process for Data Mining (CRISP-DM)
Cross Industry Process for Data Mining (CRISP-DM) is a methodology that provides an approach and process description for data mining projects. There are six implementation processes in the CRISP-DM methodology: business understanding, data understanding, data preparation, modeling, evaluation, and deployment (Pete et al., 2000).
Brzozowska et al. (2023) stated that CRISP-DM is able to achieve good-quality model results; the resulting model can be used for analytical support, planning, and decision making. CRISP-DM also facilitates the smooth operation of an organization by improving the accuracy of decision making and identifying the decision-making process.
Related Work
Irzky Ardyanta and Sari (2021) discuss two traditional techniques used by investors to predict stock prices, namely forecasting using previous data and fundamental analysis. The study uses sentiment analysis, technical analysis, and fundamental analysis to predict stock price movements in Indonesia, with a Support Vector Machine (SVM) as the prediction model. It also uses news data from the macroeconomic and microeconomic spheres, foreign stock price movements, and currency movements between the Rupiah and the Dollar. The average accuracy produced by SVM is 65.33%, and the authors conclude that the approach is useful for analyzing stock value movements in Indonesia.
Dwivedi et al. (2021) analyzed and forecasted time-series data, namely Nifty-500 index data on the stock market, using S-ARIMA, CNN, and LSTM, and compared and evaluated the performance of each model, demonstrating promising results. TensorFlow and Keras were used for implementation, and MSE was used to measure the accuracy of the forecasts made by the models. The results show that deep learning models outperform traditional machine learning models.
Gao (2021) discusses the use of Recurrent Neural Network (RNN) and Sequence to Sequence (Seq2Seq) models in machine learning, specifically for tasks such as machine translation and stock index prediction. The paper explains the function and structure of RNN, Long Short-Term Memory (LSTM) and attention layers, and provides code examples for implementing Seq2Seq to predict stock price indices. It also compares the performance of differentiated time-series data models in predicting movements of the Dow Jones Industrial Average (DJIA) and finds that the LSTM and Seq2Seq models outperform the other models in terms of mean squared error, with relatively low latency and sequential prediction.


Yan et al. (2021) discuss the use of LSTM deep neural networks to predict stock prices based on data from the previous N days. By comparing LSTM with other neural networks and traditional statistical models, the aim is to improve prediction accuracy on financial market time series data. The research also focuses on optimizing the training process, comparing different optimization methods, and exploring the effect of the input provided. The results show that LSTM deep neural networks are very effective in predicting stock prices and in solving the problem of vanishing gradients in traditional RNNs.
Siami-Namini et al. (2018) compare the performance of ARIMA models with LSTM-based deep learning models in predicting economic and financial time series data. The paper discusses the ARIMA and LSTM algorithms and evaluates accuracy using RMSE. It shows that LSTM outperforms ARIMA with an average error reduction in the range of 84-87%. It also emphasizes the influence of parameters such as epochs and neurons in training LSTM models and supports the use of deep learning-based algorithms in the economic and financial sectors.
Febrilia et al. (2021) discuss the implementation of a Support Vector Machine (SVM) to predict stock movements of Garuda Indonesia Tbk. The data used is Garuda Indonesia stock data from March 18, 2019 to April 23, 2021. The SVM algorithm achieved a prediction accuracy score of 0.545. It was concluded that SVM was able to help investors in making decisions about buying and selling shares.
Bansal et al. (2022) predict stock prices using five machine learning models: K-Nearest Neighbors, linear regression, support vector regression, decision tree regression, and long short-term memory. The data was taken from 12 Indian companies over more than 7 years. The results show that LSTM outperforms all other algorithms in terms of accuracy, followed by SVR in second place; linear regression and decision tree regression show the same results, and the last algorithm, K-NN, shows poor results that are less than satisfactory for predicting stock prices.
Akhtar et al. (2022) predict stock price movements with a focus on preprocessing raw data and using machine learning algorithms. The machine learning models used are random forest and support vector machine. The proposed method obtained an accuracy score of 78.7% for support vector machine and 80.8% for random forest.
Ketut et al. (2023) compare optimizers (Adam, SGD, RMSprop) for LSTM models aimed at predicting the share price of Telkom Indonesia, Tbk from January 1, 2019, to January 11, 2023. LSTM shows very good prediction accuracy, with low Mean Absolute Percentage Error (MAPE) values, and the Adam optimizer achieves an accuracy of 98.45%.
Chrysmien and Jayadi (2022) compare LSTM, MLR and ARIMA models on stock price movements with additional sentiment data and Rupiah and USD exchange rate movements. The stock data analyzed is that of FREN, a telecommunications company in Indonesia. The results show that Multiple Linear Regression is the best model for predicting stock prices, with figures of 473.875 in MSE and 21.768 in RMSE on the training data and 74.181 in MSE and 8.612 in RMSE on the testing data.
It can be concluded from previous research that Long Short-Term Memory (LSTM) shows good results in predicting stock price movements (Bansal et al., 2022; Dwivedi et al., 2021; Gao, 2021; Ketut et al., 2023; Siami-Namini et al., 2018; Yan et al., 2021), and several studies show that SVM is the best model for predicting stock prices (Akhtar et al., 2022; Febrilia et al., 2021; Ardyanta and Sari, 2021). Other researchers have also stated that multiple linear regression is the best model for predicting stock prices (Chrysmien and Jayadi, 2022).
Materials and Methods
The researchers use the CRISP-DM framework to predict stock price movements. This chapter explains the six stages: business understanding, data understanding, data preparation, modeling, evaluation and deployment.
Business Understanding
Garuda Indonesia is the first civil airline in Indonesia, founded on January 26, 1949, under the initiative of the Republic of Indonesia Air Force. At present, Garuda Indonesia serves more than 90 local and international destinations, with 600 flights a day. The Garuda Indonesia group also operates a fleet of around 210 aircraft; 142 planes operate under the main Garuda Indonesia brand and 68 planes operate under the Citilink brand.
Garuda Indonesia is a member of SkyTeam and the 2nd largest airline in Indonesia after Lion Air. Garuda Indonesia shares were chosen as the topic for discussing stock price movements due to the release of an Unusual Market Activity (UMA) notice by the Indonesia Stock Exchange (BEI) in January 2023, because the share movement looked very unusual.
Garuda Indonesia (GIAA) shares belong to the aviation transportation sector, another name for commercial air transportation services, which operate in the


transportation and logistics sector, including the airline industry. In this sector, there are also stocks such as PT AirAsia Indonesia, Tbk (CMPP) and PT Jaya Trishindo, Tbk (HELI).
Data was collected from 2 website portals: finance.yahoo.com for data on Garuda Indonesia shares (Fig. 2) and bi.go.id for data on changes in the exchange rate of the Indonesian Rupiah (Rp) to the United States Dollar (USD) (Fig. 3), covering the time span from January 2023 to January 2024.
Fig. 2: Garuda Indonesia stock price on the Yahoo Finance website
Fig. 3: Bank Indonesia website page
Data Understanding
At the data understanding stage, the author describes each attribute contained in the data and defines the data that will be used in this research. Because the research obtained two different types of data, the attributes are explained in Table 1 for the stock movement data and in Table 2 for the Rupiah and USD exchange rate movements; Table 3 lists the selected attributes.

Table 1: Stock data variable attribute
| Attribute name | Description |
|---|---|
| Date | Date of the share sale and purchase |
| Open | Opening price of the stock in one trading day |
| High | Highest price of the stock in one trading day |
| Close | Closing price of the stock in one trading day |
| Adjusted close | Share price at the end of the trading day, adjusted for distributions and corporate actions that occur before the next day opens |
| Low | Lowest price of the stock in one trading day |
| Volume | Number of shares traded in a certain period |

Table 2: Currency exchange data variable
| Variable name | Description |
|---|---|
| Date | Exchange rate date |
| Forex Sell | Forex selling price |
| Forex Buy | Forex buying price |
| Value | The exchange value between the currencies |

Table 3: Selection data attribute
| No. | Variable |
|---|---|
| 1 | Date |
| 2 | Open |
| 3 | High |
| 4 | Low |
| 5 | Close |
| 6 | Adj Close |
| 7 | Volume |
| 8 | Forex Sell |
| 9 | Forex Buy |

Data Preprocessing
Here, the process carried out consists of data visualization, data integration, data cleaning and data testing. In this research, data preprocessing was carried out in Google Colab using Python. The goal of data preprocessing is to obtain the best data quality so that the data can be passed on to the next stage. The data in the research is combined under the name "master data," which is a combination of stock movement data and IDR to USD currency movement data.
Data Selection
By referring to the data above, we can determine which attributes to use in this research. Table 3 lists the variables that are used.
Data Visualization
Stock Data Visualization
The graph in Fig. 4 shows that when the stock price is low or moving towards a low position, transaction volume increases. This suggests an inverse relationship between stock price and trading volume.
Exchange Rate Visualization
Fig. 5 shows that when stock prices rise (Fig. 4), the currency exchange rate also rises.


Data Integration
At this stage, we combine the stock data and the exchange rate data that have been loaded in Google Colab, as seen in Fig. 6, so that all attributes of the dataset can be viewed.
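The integration step can be illustrated with a minimal sketch. The paper does not state how the two sources were joined, so this example assumes an inner join on the Date column, keeping only trading days that have a published exchange rate. The function name `merge_on_date` and the sample values are hypothetical; the column names follow Tables 1 and 2.

```python
from datetime import date


def merge_on_date(stock_rows, forex_rows):
    """Inner-join two lists of dicts on their "Date" key."""
    forex_by_date = {row["Date"]: row for row in forex_rows}
    merged = []
    for stock in stock_rows:
        forex = forex_by_date.get(stock["Date"])
        if forex is None:
            continue  # assumption: skip days without an exchange rate
        combined = dict(stock)
        # Copy the exchange rate columns, without duplicating "Date".
        combined.update({k: v for k, v in forex.items() if k != "Date"})
        merged.append(combined)
    return merged


# Hypothetical sample rows; the values are made up for illustration.
stock_rows = [
    {"Date": date(2023, 1, 2), "Close": 204.0, "Volume": 1000},
    {"Date": date(2023, 1, 3), "Close": 210.0, "Volume": 1500},
]
forex_rows = [
    {"Date": date(2023, 1, 2), "Forex Sell": 15650.0, "Forex Buy": 15490.0},
    {"Date": date(2023, 1, 4), "Forex Sell": 15700.0, "Forex Buy": 15540.0},
]
master = merge_on_date(stock_rows, forex_rows)
```

Only January 2 appears in both sources here, so `master` holds a single combined row. With a dataframe library the same step would typically be a merge on the date column.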
Cleaning Data
In this process, we check whether there are missing values in the dataset and remove columns that the models do not need, so that the models can run optimally.
Checking for Missing Value
At this stage, we check whether there are any missing values in the data. Fig. 7 shows that the dataset has no missing values in any of its attributes.
Fig. 4: Visualization of stock price exchange
Remove the "NO" and "Value" Columns
Because neither the sequence number column ("NO") nor the exchange value column ("Value") is used in the models that will be implemented, we drop these two columns; Fig. 8 shows the dataset after the columns are dropped.
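The two cleaning steps, counting missing values per column and dropping the unused columns, can be sketched in plain Python as follows. This is an illustration only, not the authors' notebook code. The helper names `count_missing` and `drop_columns` are hypothetical, and the sample rows deliberately contain one missing value, whereas the paper's dataset has none.

```python
def count_missing(rows):
    """Count missing (None or empty-string) entries per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = value is None or value == ""
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


def drop_columns(rows, columns):
    """Return copies of the rows without the given columns."""
    unwanted = set(columns)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


# Hypothetical sample rows with one missing "Forex Sell" value.
rows = [
    {"NO": 1, "Close": 204.0, "Forex Sell": 15650.0, "Value": 1},
    {"NO": 2, "Close": 210.0, "Forex Sell": None, "Value": 1},
]
missing = count_missing(rows)
cleaned = drop_columns(rows, ["NO", "Value"])
```

A dataset with no missing values, as reported in the paper, would give a count of zero for every column.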
Fig. 5: IDR/USD exchange rate visualization
Splitting the Data
Before implementing the models, the dataset is split into 2 parts: 80% for the training data and 20% for the testing data. According to Bichri et al. (2024), using more than 70% of the dataset for training is required to achieve better performance.
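A minimal sketch of the 80/20 split is given below. The paper does not say whether the rows were shuffled; this example assumes a chronological split, where the earliest 80% of rows form the training set, which is a common choice for time series. The function name `chronological_split` is hypothetical.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split ordered rows into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Ten ordered observations stand in for the daily rows of the dataset.
observations = list(range(10))
train, test = chronological_split(observations)
```

Keeping the order means the test set contains only dates later than every training date, so the models are evaluated on data they could not have seen.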

Fig. 6: Combining stock data and exchange rate data
Testing the Data
Testing the data is a critical component of the CRISP-DM process and is needed before the data is used in our models. The first test uses correlation metrics to determine whether the variables are linearly related, that is, mutually correlated. The second test is the Anderson-Darling test, which checks whether the data comes from a specific distribution; this test is very useful for small data sizes. The last test is the Durbin-Watson test, which detects autocorrelation in the residuals of regression models.
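Two of the three tests reduce to short formulas, sketched here in plain Python. The Pearson coefficient measures linear correlation between two variables. The Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals; values near 2 indicate no first-order autocorrelation. The function names and sample values are illustrative assumptions, not taken from the paper, and the Anderson-Darling test is omitted because it relies on tabulated critical values.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up values: a perfectly linear pair and strictly alternating residuals.
r = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

Strictly alternating residuals push the statistic towards 4 (negative autocorrelation), while constant residuals give 0 (positive autocorrelation).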
Fig. 7: Checking missing value
Fig. 8: Remove column "NO" and "Value"
Modelling
At this stage, stock prices are predicted using a machine learning and artificial intelligence approach. Data preprocessing is implemented first: the stock data and the Rupiah/USD exchange rate data are combined, the combined data is checked for missing values, and the data is standardized and tested before it is used to train the models. The three proposed models (LSTM, MLR and SVM) are then implemented, and the expected result is a prediction of the stock price. Fig. 9 shows the workflow that will be used.


Histogram of Residuals
As shown in Fig. 10, the residuals are distributed symmetrically around zero, suggesting normality in the data that we use for our model.
Q-Q Plot of Residual Results
The Q-Q plot of residuals in Fig. 11 confirms that the residuals closely follow a normal distribution, with most points lying on or near the 45-degree line and only minor deviations at the tails. Such slight deviations are normal and acceptable.
Correlation Metrics
Fig. 12 shows that the dependent variable in this study (Close) has a very strong correlation (0.99-1) with all stock data except Volume (0.075), which is almost uncorrelated, followed by a correlation of moderate level (0.44).
Fig. 9: Research workflow
As seen in Fig. 9, in the research workflow the stock data and the IDR/USD exchange rate data go through data preprocessing, which includes integrating and combining the data, checking for missing values and standardizing the data, resulting in the master dataset. Next, we split the data into training and testing sets. The machine learning models (LSTM, MLR and SVM) are then trained to learn the data pattern and finally produce the stock price prediction as output.
Long Short-Term Memory
The LSTM model uses an input shape with a look-back of one time step and one dense layer, and is trained by minimizing the mean squared error loss with the Adam optimizer for 100 epochs.
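The paper does not show how the LSTM input is prepared. One common reading of a look-back of one is that each training sample consists of the previous time step's value and the target is the next value. The sketch below builds such input/target pairs in plain Python. The function name `make_windows`, the `look_back` parameter and the sample prices are assumptions for illustration.

```python
def make_windows(series, look_back=1):
    """Turn a 1-D series into (inputs, targets) pairs for a sequence model."""
    inputs, targets = [], []
    for i in range(len(series) - look_back):
        inputs.append(series[i:i + look_back])   # the look_back previous values
        targets.append(series[i + look_back])    # the value to predict
    return inputs, targets


# Made-up closing prices for illustration.
closes = [204.0, 210.0, 208.0, 215.0]
X, y = make_windows(closes, look_back=1)
```

For a Keras LSTM layer, inputs built this way would then be reshaped to (samples, time steps, features) before training.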
Multiple Linear Regression
Multiple linear regression is the model used to learn the relationship between the features and the target. The model is trained on the training data so that it can learn the patterns in the data. The trained model is then used to predict the target value on the test data.
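The train-then-predict procedure can be sketched with ordinary least squares, which is the standard way to fit a multiple linear regression. The paper does not give its implementation, so this is a dependency-free illustration that solves the normal equations by Gauss-Jordan elimination. The names `fit_ols` and `predict` and the toy data are assumptions.

```python
def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by solving (A^T A) b = A^T y."""
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    p = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(A, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; the system is singular")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(p):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] for i in range(p)]  # [intercept, b1, ..., bk]


def predict(coefs, X):
    """Apply fitted coefficients to new feature rows."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in X]


# Toy training data generated from y = 1 + 2*x1 + 3*x2 (no noise).
X_train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y_train = [1.0, 3.0, 4.0, 6.0, 8.0]
coefs = fit_ols(X_train, y_train)
y_pred = predict(coefs, [[3.0, 2.0]])
```

On noise-free data the fit recovers the generating coefficients. In practice a library routine would be used, and strongly correlated features can make the system close to singular.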
Fig. 10: Histogram of residual
Support Vector Machine
The SVM model is built in several steps. First, we define the parameter grid for grid search, that is, the set of parameter values that will be tested to find the best combination for the SVR model. Second, we create a basic SVR model and the grid search. Third, we create a function that determines the best parameters and trains the model. Finally, we make predictions on the testing data.
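The grid search idea can be sketched without any machine learning library: every combination of parameter values is scored and the best one is kept. The paper does not list the grid it used, so the parameter values below are hypothetical (`C` and `epsilon` are standard SVR hyperparameters). The scoring function is a toy stand-in for "fit an SVR with these parameters and return its validation error".

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every parameter combination and return the one with the lowest score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid; not the grid used in the paper.
param_grid = {"C": [0.1, 1.0, 10.0], "epsilon": [0.01, 0.1]}


def toy_validation_error(params):
    # Stand-in for training an SVR and measuring its validation error.
    return (params["C"] - 1.0) ** 2 + (params["epsilon"] - 0.1) ** 2


best_params, best_score = grid_search(param_grid, toy_validation_error)
```

In a real pipeline the scoring function would train the model on the training split and evaluate it on held-out data, usually with cross-validation.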

Fig. 11: Q-Q Plot of residuals
Results
This section corresponds to the fifth stage of CRISP-DM, Evaluation. It presents the data analysis results and the data testing results using the correlation matrix, the Anderson-Darling test and the Durbin-Watson test. It then elaborates on the result of each model from the previous section. The metrics used to evaluate the stock price prediction are R-Squared, MAPE (Mean Absolute Percentage Error), MSE (Mean Squared Error) and RMSE (Root Mean Squared Error), computed on both the training and the testing data.
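The four evaluation metrics follow directly from their standard definitions, sketched below in plain Python. The function name `regression_metrics` and the sample values are illustrative assumptions. MAPE is returned as a fraction, so 0.05 corresponds to 5%.

```python
import math


def regression_metrics(actual, predicted):
    """Return R-squared, MAPE (as a fraction), MSE and RMSE."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n  # needs actual != 0
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mape": mape, "mse": mse, "rmse": rmse}


# Made-up prices for illustration.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
```

MSE and RMSE depend on the scale of the target, so values computed on standardized data are not directly comparable with values computed on raw prices.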
Data Testing Results
The data testing results are presented using a histogram of residuals, a boxplot and a Q-Q plot, together with the corresponding diagrams or graphs.
Fig. 12: Data correlation


Fig. 13: Anderson darling test
Fig. 14: Durbin watson test

Anderson Darling Test
Fig. 13 shows the result of the Anderson Darling test. The p-value of 0.8728666510020506 gives no evidence against normality, which indicates that the residuals from the regression model are normally distributed. This is a desirable outcome because it validates one of the key assumptions of linear regression, supporting the reliability and validity of the model's inferences and predictions.

Durbin Watson Test
Fig. 14 shows a Durbin Watson statistic of approximately 1.98, so the residuals are not significantly autocorrelated. The assumption of independence of residuals in the regression model is therefore satisfied, which is important for the validity of the regression results.

Model Performance Results
In this section, we view the predicted and actual values of each model on a graph, together with the model performance based on the evaluation metrics R-squared, Mean Absolute Percentage Error, Mean Squared Error, and Root Mean Squared Error shown in Table 4.

Table 4: Model evaluation matrix
Model        SVM                     LSTM                    MLR
evaluation   Data train  Data test   Data train  Data test   Data train  Data test
R2           0.995       0.9816      0.9421      0.823       0.9962      0.992
Mape         0.013       0.0174      11.090      0.021       0.0125      0.014
MSE          2.534       5.1069      0.0005      6.460       1.8992      2.206
RMSE         1.591       2.2598      0.0229      2.541       1.3781      1.485

The R2 results show that MLR has the highest R2 values for both training (0.996) and testing data (0.992), which indicates that MLR has a very strong fit. SVM is slightly lower than MLR, with R2 values of 0.995 for training and 0.982 for testing. LSTM shows lower R2 values, especially on the test data (0.824), compared with the training data (0.942).
Mean Absolute Percentage Error shows that MLR has the lowest score on both training (1.26%) and test (1.46%) data, indicating that MLR has the highest accuracy in percentage terms. It is followed by SVM, with slightly higher MAPE values for training (1.35%) and test (1.74%). LSTM has the highest MAPE on both training (11.09%) and test (2.12%), suggesting larger deviations from the actual values.
The Mean Squared Error and Root Mean Squared Error results show that MLR has the lowest MSE and RMSE on the test data and lower values than SVM on the training data, indicating that MLR has high accuracy. SVM has slightly higher MSE and RMSE than MLR. LSTM shows the highest MSE and RMSE on the test data, meaning that its predictions are less accurate than those of MLR and SVM. The LSTM training MSE and RMSE in Table 4 (0.0005 and 0.0229) are far smaller than its test values and appear to be reported on a different scale, so they are not directly comparable with the other entries.

Discussion
To justify the model selection: given these metrics, MLR appears to be the most effective model for this dataset because of its simplicity and strong performance. It not only captures the underlying pattern effectively but also generalizes well, as evidenced by the low error metrics on both the training and test data. SVM could be an alternative, but it does not outperform MLR. LSTM shows higher error and appears to be less effective at capturing the data pattern here. LSTM may not be suitable in this case because its performance metrics are considerably poorer, especially on the test data.
The predicted and actual values for each model are plotted in Fig. 15.
SVM and MLR follow the actual values quite well, meaning that these models capture the pattern effectively. This aligns with their low error metrics such as MSE, RMSE, and MAPE. The LSTM model does not capture the peaks and valleys as well as SVM and MLR do. Its prediction curve is smoother, with an almost flat trend, showing that LSTM might not be well suited to this dataset. This analysis supports selecting MLR as the primary model, with SVM as a potential secondary option.
This result is consistent with the research of Chrysmien and Jayadi (2022), which showed that multiple linear regression is the best model for predicting stock prices. The same is found in this research, where multiple linear regression is the best model for predicting the stock price when combined with the exchange rate between IDR and USD.
Fig. 15: Actual value and predicted value in each model


Conclusion
In this study, the focus was on proposing an approach that uses two combined datasets: the exchange rate between IDR and USD and the stock price data. Using artificial intelligence and machine learning to learn from the data and make predictions also gives investors insight for making decisions. From the correlation matrix results, we can conclude that the exchange rate data and the stock price data are correlated with each other and that each variable has an effect on the other.
Based on the results for each model, we can rank the models according to the different metrics. The conclusion is that multiple linear regression performs best across all metrics for both test data and train data. The second rank goes to Support Vector Machine, which performs consistently better than LSTM but not as well as Multiple Linear Regression. The last rank goes to Long Short-Term Memory because it lags behind significantly, with lower R-squared and higher error rates, and may not be the best fit for this particular problem and dataset.
This research, by combining the Garuda Indonesia stock price dataset with the IDR/USD exchange rate dataset, can hopefully bring a new perspective on how using machine learning and artificial intelligence for analysis can also help decision-making in stock trading. This research can also help investors understand stock price movements when deciding whether to buy, sell, or hold.

Acknowledgment
We acknowledge Prof. Dr. Evaristus Didik Madyatmadja, ST., M.Kom., M.T., Lecturer of Information Systems, Department of Information Systems, School, Bina Nusantara University for his advice, motivation, input and support during the creation of the manuscript.

Funding Information
The authors have not received any financial support or funding to report.

Author’s Contributions
Muhammad Naufal Luthfi: Wrote the research, specifically analyzing and understanding the data, writing and executing the code, following the research framework, obtaining the results, and drawing conclusions based on the research purposes.
Evaristus Didik Madyatmadja: Designed the research framework, specifically providing guidance on creating the research framework based on previous research and offering recommendations for data analysis based on the research findings.

Ethics
This article is original work by the first and second authors and has not been previously published.

References
Akhtar, M. M., Zamani, A. S., Khan, S., Shatat, A. S. A., Dilshad, S., & Samdani, F. (2022). Stock market prediction based on statistical data using machine learning algorithms. Journal of King Saud University-Science, 34(4). https://doi.org/10.1016/j.jksus.2022.101940
Bansal, M., Goyal, A., & Choudhary, A. (2022). Stock Market Prediction with High Accuracy using Machine Learning Techniques. Procedia Computer Science, 215, 247–265. https://doi.org/10.1016/j.procs.2022.12.028
Bichri, H., Chergui, A., & Hain, M. (2024). Investigating the Impact of Train / Test Split Ratio on the Performance of Pre-Trained Models with Custom Datasets. International Journal of Advanced Computer Science and Applications (IJACSA) (Vol. 15, Issue 2). www.ijacsa.thesai.org
Brzozowska, J., Pizoń, J., Baytikenova, G., Gola, A., Zakimova, A., & Piotrowska, K. (2023). Data Engineering in Crisp-Dm Process Production Data – Case Study. Applied Computer Science, 19(3), 83–95. https://doi.org/10.35784/acs-2023-26
Chrysmien, & Jayadi, R. (2022). Comparison Stock Price Prediction Between Arima, Multiple Linear Regression and Lstm Models by Adding Stock Sentiment Analysis and Usd/Idr Fluctuation. Journal of Theoretical and Applied Information Technology, 100(4), 1158–1169.
Dwivedi, S. A., Attry, A., Parekh, D., & Singla, K. (2021). Analysis and forecasting of Time-Series data using S-ARIMA, CNN and LSTM. Proceedings - IEEE 2021 International Conference on Computing, Communication, and Intelligent Systems, ICCCIS 2021, 131–136. https://doi.org/10.1109/ICCCIS51004.2021.9397134
Febrilia, R., Wulandari, T., & Anubhakti, D. (2021). Implementasi algoritma support vector machine (svm) dalam memprediksi harga saham pt. Garuda indonesia tbk. In indonesia journal information system (idealis) (vol. 4, Issue 2). http://jom.fti.budiluhur.ac.id/index.php/IDEALIS/index
Gao, Z. (2021). Stock Price Prediction with ARIMA and Deep Learning Models. 2021 IEEE 6th International Conference on Big Data Analytics, ICBDA 2021, 61–68. https://doi.org/10.1109/ICBDA51983.2021.9403037


Irzky ARDYANTA, E., & Sari, H. (2021). A Prediction of Stock Price Movements Using Support Vector Machines in Indonesia. Journal of Asian Finance, 8(8), 399–0407. https://doi.org/10.13106/jafeb.2021.vol8.no8.0399
Ketut, I., Enriko, A., Nizar Gustiyana, F., Putra, R. H., & Kunci, K. (2023). Komparasi Hasil Optimasi Pada Prediksi Harga Saham PT. Telkom Indonesia Menggunakan Algoritma Long Short-Term Memory. Jurnal Media Informatika Budidarma. https://doi.org/10.30865/mib.v7i2.5822
Mittal, S., & Chauhan, A. (2021). A RNN-LSTM-Based Predictive Modelling Framework for Stock Market Prediction Using Technical Indicators. International Journal of Rough Sets and Data Analysis, 7(1), 1–13. https://doi.org/10.4018/ijrsda.288521
Nunno, L. (2014). Stock Market Price Prediction Using Linear and Polynomial Regression Models.
Otoritas Jasa Keuangan. (2019, November 25). Penyebab naik turun harga saham suatu perusahaan. Https://Sikapiuangmu.Ojk.Go.Id/FrontEnd/CMS/Article/10507.
Panwar, B., Dhuriya, G., Johri, P., Yadav, S. S., & Gaur, N. (2021). Stock Market Prediction Using Linear Regression and SVM. 2021 International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), 629–631. https://doi.org/10.1109/ICACITE51222.2021.9404733
Pete, C., Julian, C., Randy, K., Thomas, K., Thomas, R., Colin, S., & Wirth, R. (2000). Crisp-Dm 1.0. CRISP-DM Consortium, 76.
Puspadini, M. (2024, January 25). BEI Sebut Investor RI Dua Kali Lipat dari Singapura, Tapi. CNBC Indonesia.
Siami-Namini, S., Tavakoli, N., & Siami Namin, A. (2018). A Comparison of ARIMA and LSTM in Forecasting Time Series. Proceedings - 17th IEEE International Conference on Machine Learning and Applications, ICMLA 2018, 1394–1401. https://doi.org/10.1109/ICMLA.2018.00227
Yan, X., Weihan, W., & Chang, M. (2021). Research on financial assets transaction prediction model based on LSTM neural network. Neural Computing and Applications, 33(1), 257–270. https://doi.org/10.1007/s00521-020-04992-7
