83 319 2 PB
(Received: November 30, 2022 Revised: December 23, 2022 Accepted: January 20, 2023, Available online: January 31, 2023)
Abstract
Growing competition in the business world pushes every manager and company to be more adept at devising marketing
strategies that increase sales. One way to keep up with market competition is to analyze customer purchase transaction data
and use the results to support policy and decision making in marketing strategy, because transaction data reveals existing
purchase patterns. Market Basket Analysis is a data mining method that, in this study, uses the FP-Growth algorithm to find
associated products. This research uses 150 sales transactions and 26 products taken from sales transaction archives. The
minimum support value is set at 50% and the minimum confidence at ≥ 0.75. The test results show that 9 products have
superior support values and meet the minimum value. The resulting rules include D → W with a confidence value of 0.870
(if consumers buy Wardah Lightening Gentle Wash, then they buy Azarine Sunscreen SPF50) and A → E → O with a
confidence value of 0.808 (if consumers buy Emina Face Wash, then they buy Azarine Night Moisturizer and Himalaya
Neem Mask).
Keywords: Data Mining, Association Rule, Market Basket Analysis, FP-Growth Algorithm
1. Introduction
Growing competition in the business world pushes every manager and company to be more adept at devising strategies
that attract consumer attention, increase sales, and ensure business continuity. Trading activities, especially
transactions, produce a large amount of data that is stored in company archives. However, the stored data is often kept
only as an archive, keeps growing, and is not otherwise utilized. Companies can in fact use this transaction data as an
important source of information for decision making in marketing strategy by identifying consumer purchasing
patterns. Knowledge of these patterns can be used to suggest purchases in the form of product packages, built from
products that have similar purchasing criteria in each transaction.
Several studies have applied association analysis to sales data. In [1], [2], the FP-Growth algorithm was applied to
prescription transaction data to produce information for determining the layout of goods according to consumer
purchasing patterns. The resulting association rules can be used as input for the pharmacy in arranging its goods.
In [3], market basket analysis with association rule techniques was applied to shape policies and business strategies
for PT Mora Telematics Indonesia. That study used the Frequent Pattern Growth (FP-Growth) algorithm, which finds
patterns using a tree data structure called an FP-Tree. The analysis of sales transaction data from January 2018 to
April 2018 produced 7 association rules. The rule with the highest lift ratio states that if there is an OxygenHome 25 -
Super Double installation, there will be an OxygenHome 15 - Super Double installation, with a lift ratio of 4.59%, a
support value of 3.125%, and a confidence value of 0.67%.
The research in [4] used the market basket analysis method with association rules to analyze transaction data in an
online store. The results showed that the products most often purchased together are food products, beauty products,
and household goods. These studies indicate that market basket analysis with association rules can be used to analyze
transaction data in various stores and to find products that are often purchased together in one shopping cart, which
helps store owners increase sales and offer products that suit customer needs. Taken together, the previous studies
suggest that Market Basket Analysis with the FP-Growth algorithm can be a solution for managing transaction data in
order to determine consumer purchasing patterns.
To keep the collected data from going unused and to turn it into a useful data set, this study carries out an
association analysis using association rule techniques with the FP-Growth algorithm to determine consumer purchasing
patterns. These patterns can later support a marketing strategy by recommending purchases in the form of product
packages, built from products with similar criteria across transactions. The aim is to identify products that are
purchased together in one transaction so that the results can serve as recommendations for product packages.
2. Literature Review
Research conducted by [5], [6] aims to improve the decision-making process for supermarkets in organizing their
product catalogs. The research focuses on determining the relationship between products purchased at a particular
store by utilizing the Apriori algorithm and Market Basket Analysis. The results of this study can help companies to
strategize better in placing products close to each other to increase the likelihood of consumers buying them together.
The research in [7] used transaction data to provide information for creating menu packages at Angkringan Waru and to
describe the relationship between food and beverage menu items using the FP-Growth algorithm. The results provide
recommendations for Angkringan Waru menu packages consisting of two items, namely a snack and a drink, which make
it easier for customers to choose menu packages and help sellers increase their overall sales.
The research in [8] on Market Basket Analysis identified customer purchasing patterns by finding important
associations between products purchased together. The results showed that using only the most popular items yields
almost the same frequent itemsets and association rules as counting all items, in a shorter time.
3. Methodology
Figure 1 gives an overview, as a block diagram, of the implementation using association rules and the FP-Growth
algorithm for data processing. The stages in the block diagram are as follows:
1) Data processing starts with the input of transaction data, followed by a pre-processing stage.
2) Data cleaning removes or corrects irrelevant and duplicate data.
3) The cleaned data is combined according to the similarity of the criteria of each record.
4) Data that matches the analysis needs is selected for processing at a later stage.
5) Data transformation converts the data into a form suitable for the data mining process. This step, also called
normalization, changes the measurement scale of the data as required by the research so that it meets the
assumptions of the analysis and data processing.
6) The mining process uses the FP-Growth algorithm to find products that customers often purchase together in one
shopping cart. It goes through several stages: building a frequency table that counts the occurrences of each
item in the dataset; creating an FP-Tree, whose nodes represent frequently occurring itemset patterns and show
the relationships between items; and frequent pattern mining, in which the algorithm searches for frequent
patterns by following the paths between nodes in the FP-Tree. The detected patterns are stored in a frequent
pattern table together with their frequency of occurrence in the dataset.
7) Support and confidence are calculated to evaluate the strength of a detected pattern. Support measures how
often a pattern occurs in the dataset, while confidence measures how likely a pattern is to appear given that
another pattern has appeared.
8) Association rules are determined by setting minimum values as a reference and keeping the patterns that meet
the predetermined support and confidence criteria.
9) Final evaluation assesses the strength and validity of the patterns detected in the dataset.
10) The final stage produces the results of processing the transaction data of Toko Gudang Kosmetik Purwokerto
with the FP-Growth algorithm: a list of association rules detected in the dataset, stored in an association rule
table together with their support and confidence values.
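The pre-processing stages (steps 1 to 5) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout `(tid, product)`, the sample rows, and the helper name `preprocess` are hypothetical and are not taken from the study's data or tooling.

```python
from collections import defaultdict

# Hypothetical raw sales rows: (transaction id, product name). Not the study's data.
raw_rows = [
    (1, "EM Face Serum"),
    (1, "EM Face Serum"),           # duplicate row, dropped during cleaning
    (1, " AZ Night Moisturizer "),  # stray whitespace, corrected during cleaning
    (2, "EM Face Wash"),
    (2, ""),                        # blank product name, treated as irrelevant
    (3, "EM Face Wash"),
    (3, "EM Moisturizing Cream"),
]

def preprocess(rows):
    """Clean the rows and group them into one item list per transaction."""
    baskets = defaultdict(set)
    for tid, product in rows:
        name = product.strip()
        if not name:                # data cleaning: skip blank entries
            continue
        baskets[tid].add(name)      # a set removes duplicates within a transaction
    # Transformation: sorted lists give a stable representation for mining
    return {tid: sorted(items) for tid, items in sorted(baskets.items())}

transactions = preprocess(raw_rows)
for tid, items in transactions.items():
    print(tid, items)
```

Grouping into sets is one simple way to handle duplicates; a real archive may need additional rules for what counts as irrelevant data.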
3.2. Dataset
After all frequent itemsets are known, the next step is to form association rules that satisfy the minimum confidence
value. The support of itemsets A and B is calculated as follows:
$$\mathrm{Support}(A, B) = P(A \cap B) = \frac{\sum \text{Transactions containing } A \text{ and } B}{\sum \text{Transactions}} \times 100\% \qquad (2)$$
According to equation (2), the support value of itemset (product variation group) A with respect to itemset B is equal
to the probability that itemsets A and B occur together. The confidence of itemset A with respect to itemset B is equal
to the probability of the combination of itemsets A and B divided by the probability of itemset A, as given in
equation (5):
$$\mathrm{Confidence}(A \rightarrow B) = P(B \mid A) = \frac{\text{Number of transactions containing } A \text{ and } B}{\text{Number of transactions containing } A} \qquad (5)$$
The association rule mining procedure is a way to find relationships between items in a dataset.
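As a small illustration of the support and confidence measures, the sketch below computes both on a toy list of transactions. The transactions and the function names `support` and `confidence` are assumptions made for this example, not the study's data or implementation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base > 0 else 0.0

# Toy transactions using letter codes (hypothetical, for illustration only)
toy = [
    {"A", "E", "X"},
    {"A", "E"},
    {"A", "P"},
    {"E", "X"},
]

print(support(toy, {"A", "E"}))       # A and E appear together in 2 of 4 transactions
print(confidence(toy, {"A"}, {"E"}))  # 2 of the 3 transactions with A also contain E
```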
4. Results and discussion
TID ITEM
1 {B,Q,X}
2 {A,E,F,J,M,O,P,Q,R,U,V,X}
3 {A,C,H,L,N,O,P,S,X}
… …
150 {A,E,P,T,X}
Table 2 shows the transaction data used as research material, which has been adjusted so that it can be read during
the mining process using the FP-Growth algorithm.
Table 3 shows the result of normalizing the cleaned transaction data, a total of 150 transactions.
Table. 4. Data initialization
ITEM Code
EM Face Wash A
EM Face Serum B
EM Moisturizing Cream C
... ...
Table 4 shows the data initialization process, in which each item is assigned a code. This step simplifies the
calculations and data processing in the following stages. For example, EM Face Wash is given code A, EM Face Serum
code B, EM Moisturizing Cream code C, and GW Yellow VIT C 125 ml code Z.
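The initialization step can be sketched as a simple lookup table. The three product names below come from Table 4; the names `code_of` and `encode` are hypothetical, and the single-letter scheme assumes at most 26 products.

```python
from string import ascii_uppercase

# First three products listed in Table 4; the rest of the list is elided in the source.
products = ["EM Face Wash", "EM Face Serum", "EM Moisturizing Cream"]

# Assign codes A, B, C, ... in list order (assumes no more than 26 products)
code_of = {name: ascii_uppercase[i] for i, name in enumerate(products)}

def encode(basket):
    """Replace product names in one transaction with their letter codes."""
    return sorted(code_of[name] for name in basket)

print(code_of)
print(encode(["EM Moisturizing Cream", "EM Face Wash"]))
```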
Table. 5. Initialized transaction data
TID ITEM
1 {B,Q,X}
2 {A,E,F,J,M,O,P,Q,R,U,V,X}
3 {A,C,H,L,N,O,P,S,X}
... ...
150 {A,E,P,T,X}
Table 5 shows the transaction data after initialization based on each product code. For example, TID (transaction) 1
sold products B, Q, X and transaction 2 sold products A, E, F, J, M, O, P, Q, R, U, V, X. The initialized transaction
data is used as sample data and converted into binominal format, as shown in the table below:
Table. 6. Transaction data in binominal format
A B C D E ... Y Z
0 1 0 0 0 ... 0 1
1 0 0 0 1 ... 0 0
1 0 1 0 0 ... 0 0
1 0 0 0 1 ... 0 0
Table 6 shows the data in binominal format, in which 1 indicates that the product was sold in that transaction and 0
indicates that it was not. The RapidMiner tool requires the data in binominal form, so once the data is in this tabular
format it is ready to be imported into RapidMiner.
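The conversion to the 0/1 layout can be sketched as follows. The three transactions are the ones shown in Table 5; restricting the columns to A through E and the function name `to_binominal` are choices made only for this illustration.

```python
def to_binominal(transactions, items):
    """Return one 0/1 row per transaction, with one column per item code."""
    return [[1 if item in t else 0 for item in items] for t in transactions]

# First three transactions from Table 5
transactions = [
    {"B", "Q", "X"},
    {"A", "E", "F", "J", "M", "O", "P", "Q", "R", "U", "V", "X"},
    {"A", "C", "H", "L", "N", "O", "P", "S", "X"},
]

columns = list("ABCDE")  # only the first five item codes, to keep the output short
matrix = to_binominal(transactions, columns)

print(" ".join(columns))
for row in matrix:
    print(" ".join(str(v) for v in row))
```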
Table. 7. Frequency of each buyer making transactions
TID Total Transaction
1 3
2 12
3 9
... ...
150 5
Table 7 shows the result of processing the data in binominal format, namely the total number of products sold in each
transaction. TID denotes the transaction number or transaction id, while the total transaction is the number of
products purchased in that transaction.
Table. 8. Frequency of occurrence of each item/product
Code Item Frequency
A EM Face Wash 72
B EM Face Serum 22
C EM Moisturizing Cream 25
Table 8 shows the frequency of each item/product obtained from processing the data in binominal format, that is, the
total sales of each product across all sales transactions. For example, the product with code A, EM Face Wash, was
sold 72 times across the 150 transactions.
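Both counts, items per transaction and occurrences per item, can be sketched with the standard library. Only the three transactions visible in Table 5 are used here, so the item frequencies are not the study's totals; the variable names are illustrative.

```python
from collections import Counter

# First three transactions from Table 5; the remaining ones are elided in the source.
transactions = {
    1: {"B", "Q", "X"},
    2: {"A", "E", "F", "J", "M", "O", "P", "Q", "R", "U", "V", "X"},
    3: {"A", "C", "H", "L", "N", "O", "P", "S", "X"},
}

# Number of products sold in each transaction (row totals of the 0/1 table)
items_per_transaction = {tid: len(items) for tid, items in transactions.items()}

# Number of transactions in which each product appears (column totals)
item_frequency = Counter(item for items in transactions.values() for item in items)

print(items_per_transaction)
print(item_frequency.most_common(3))
```

For these three transactions the row totals are 3, 12, and 9, which agrees with the first rows of Table 7.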
The frequency of occurrence of each item/product is used in the Association Rule calculations, which depend on two
parameters: support (support value) and confidence (certainty value). Support is a measure of how dominant an itemset
is across all transactions. The support value in the FP-Growth algorithm is calculated as follows:
$$\mathrm{Support}(X \cup Y) = \frac{\mathrm{count}(X \cup Y)}{n} \qquad (6)$$
where count(X ∪ Y) is the number of transactions that contain both X and Y, and n is the total number of transactions.
Table. 9. Frequency and support value
Table 9 shows the calculation of the support value for each product based on its frequency, or total sales. For
example, the EM Face Wash product with a frequency of 72 produces a support value of 56%.
In this study, the minimum support count is set at 50%, so products with a support value below 50% do not enter the
FP-Tree formation stage.
After the frequency of occurrence of each item is calculated, the products that meet the support count of 50% are
those with frequencies above 9, namely the products with codes A, D, E, F, O, P, Q, R, and X. These 9 products are
included in the FP-Tree; the remaining items are not used because they have no significant effect.
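The pruning step before tree construction can be sketched as below: items under the minimum support are dropped, and the remaining items in each transaction are ordered by descending frequency, as FP-Growth commonly does. The four toy transactions and the function names are assumptions for illustration, not the study's data.

```python
from collections import Counter

def frequent_items(transactions, min_support):
    """Return {item: count} for items whose support meets the threshold."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    return {item: c for item, c in counts.items() if c / n >= min_support}

def prune_and_order(transactions, freq):
    """Drop infrequent items and sort the rest by descending count, then by code."""
    ordered = []
    for t in transactions:
        kept = [item for item in set(t) if item in freq]
        kept.sort(key=lambda item: (-freq[item], item))
        ordered.append(kept)
    return ordered

# Toy transactions (hypothetical)
toy = [
    {"B", "Q", "X"},
    {"A", "E", "Q", "X"},
    {"A", "C", "X"},
    {"A", "E", "X"},
]

freq = frequent_items(toy, min_support=0.5)
ordered = prune_and_order(toy, freq)
print(freq)
print(ordered)
```

In this toy data, item B falls below the threshold, so the first transaction is reduced to X and Q before it is inserted into the tree.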
Table. 10. Products that meet the minimum support
Code Item
A EM Face Wash
E WD Moisturizer SPF 28
F WD Sunscreen
P AZ Moisturizer SPF 25
Q AZ Night Moisturizer
X AZ Night Moisturizer
Table 10 lists the products that meet the minimum support count of 50%. For example, the product with code A, EM
Face Wash, meets the minimum support count with a support of 56%. The next step in the FP-Growth algorithm is tree
formation, which is based on table 4.9. The formation of the FP-Tree starts from TID 1 and continues to TID 150. The
following is a sample of the formation for TID 1, with itemset {Q,X}.
TID 1 consists of products {Q,X}, meaning that in transaction number 1 two of the products sold meet the minimum
support value: product Q (Azarine Night Moisturizer) and product X (GW Pink 125ml). TIDs are read from 1 to 150; the
reading up to TID 150 is not displayed in this study because the resulting image would be difficult to read. After all
TIDs have been read, the next step is to look for the paths that end in (A), (D), (E), (F), (O), (P), (Q), (R), and
(X), that is, the products with support values above 50%. The node paths are read for all 9 products. An example of
the formation of the paths for node A can be seen in Figure 2. Figure 3 shows the formation of an FP-Tree with the
paths containing node A, namely {D,X,A}, {X,F,A}, and {R,A}. These paths show the products that are related to
product A. For example, consumers who buy product A also buy product D and product X; consumers who buy product X
and product F also buy product A; and consumers who buy product R buy product A in the same transaction.
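The tree formation and the reading of paths that end in a given node can be sketched as follows. This is a minimal illustration, not the RapidMiner implementation used in the study: the `Node` class, the function names, and the pre-ordered toy transactions are all assumptions.

```python
class Node:
    """One FP-Tree node: an item code, a count, a parent link, and child nodes."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(ordered_transactions):
    """Insert each (already pruned and ordered) transaction as a path from the root."""
    root = Node(None)
    header = {}  # item code -> list of tree nodes that carry this item
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1
            node = child
    return root, header

def prefix_paths(header, item):
    """Return (path, count) for every branch of the tree that ends in `item`."""
    paths = []
    for node in header.get(item, []):
        path = []
        parent = node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        paths.append((path[::-1], node.count))
    return paths

# Toy transactions, already filtered by minimum support and ordered by frequency
ordered = [
    ["X", "Q"],
    ["X", "A", "E", "Q"],
    ["X", "A"],
    ["X", "A", "E"],
]

root, header = build_fp_tree(ordered)
print(prefix_paths(header, "E"))
print(prefix_paths(header, "Q"))
```

Shared prefixes such as X and A are stored once with an incremented count, which is what keeps the tree compact compared with the raw transaction list.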
Figure. 2. TID 1
In this study, the minimum confidence value is set at ≥ 0.75. From the confidence calculation on the patterns formed
above, the association rules that meet the confidence requirement of ≥ 0.75 are: D → W = 0.870 (if consumers buy
Wardah Lightening Gentle Wash, then they buy Azarine Sunscreen SPF50) and X → E → O = 0.808 (if consumers buy
Emina Face Wash, then they buy Azarine Night Moisturizer and Himalaya Neem Mask).
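Filtering rules by a minimum confidence can be sketched as below. For brevity the sketch counts item pairs directly instead of mining an FP-Tree, so it illustrates only the thresholding step; the toy transactions and the function name `pair_rules` are hypothetical, and the thresholds mirror the 50% and 0.75 values used in this study.

```python
from itertools import combinations

def pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, confidence) for item pairs that pass both thresholds."""
    n = len(transactions)
    items = sorted(set().union(*transactions))

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    rules = []
    for a, b in combinations(items, 2):
        both = count({a, b})
        if both / n < min_support:
            continue  # the pair itself is not frequent enough
        for x, y in ((a, b), (b, a)):
            conf = both / count({x})
            if conf >= min_confidence:
                rules.append((x, y, round(conf, 3)))
    return rules

# Toy transactions (hypothetical)
toy = [
    {"B", "Q", "X"},
    {"A", "E", "Q", "X"},
    {"A", "C", "X"},
    {"A", "E", "X"},
]

rules = pair_rules(toy, min_support=0.5, min_confidence=0.75)
for x, y, conf in rules:
    print(f"{x} -> {y} = {conf}")
```

Note that a rule and its reverse can have different confidence values, which is why each direction is tested separately.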
5. Conclusion
Overall, from 150 sales transactions and 26 products, 9 products have superior support values and meet the minimum
support value. Consumers tend to buy items that are related to each other, as in D → W = 0.870 (if consumers buy
Wardah Lightening Gentle Wash, then they buy Azarine Sunscreen SPF50) and X → E → O = 0.808 (if consumers buy
Emina Face Wash, then they buy Azarine Night Moisturizer and Himalaya Neem Mask). The rules obtained provide new
information about customer purchasing patterns in each transaction, which companies, especially shops, can use to
increase sales.
References
[1] L. Shabtay, P. Fournier-Viger, R. Yaari, and I. Dattner, “A guided FP-Growth algorithm for mining
multitude-targeted item-sets and class association rules in imbalanced data,” Inf. Sci. (Ny)., vol. 553, pp.
353–375, 2021.
[2] F. Fitriana, E. Utami, and H. Al Fatta, “Analisis Sentimen Opini Terhadap Vaksin Covid-19 pada Media Sosial
Twitter Menggunakan Support Vector Machine dan Naive Bayes,” vol. 5, no. 1, pp. 19–25, 2021.
[3] N. Ramadhani, A. Supikar, and W. Zumam, “Penerapan Market Basket Analysis Menggunakan Metode
Multilevel Association Rules dan Algoritma ML_T2L1 Pada Data Order PT. Unirama,” InfoTekJar J. Nas.
Inform. dan Teknol. Jar., vol. 4, no. 2, pp. 261–274, 2020.
[4] E. Elisa, “Market Basket Analysis Pada Mini Market Ayu Dengan Algoritma Apriori,” J. RESTI (Rekayasa Sist.
dan Teknol. Informasi), vol. 2, no. 2, pp. 472–478, 2018.
[5] V. Pathan and P. P. Shende, “A study on Market Basket Analysis and Association Mining,” 2019.
[6] S. Bag, G. Srivastava, M. M. Al Bashir, S. Kumari, M. Giannakis, and A. H. Chowdhury, “Journey of customers
in this digital era: Understanding the role of artificial intelligence technologies in user engagement and
conversion,” Benchmarking An Int. J., vol. ahead-of-p, no. ahead-of-print, Jan. 2021, doi:
10.1108/BIJ-07-2021-0415.
[7] A. Ilham et al., “Market Basket Analysis Using Apriori and FP-Growth for Analysis Consumer Expenditure
Patterns at Berkah Mart in Pekanbaru Riau,” in Journal of Physics: Conference Series, 2018, vol. 1114, no. 1, p.
12131.
[8] H. Hruschka, “Comparing unsupervised probabilistic machine learning methods for market basket analysis,”
Rev. Manag. Sci., vol. 15, no. 2, pp. 497–527, 2021.
[9] R. Moodley, F. Chiclana, F. Caraffini, and J. Carter, “A product-centric data mining algorithm for targeted
promotions,” J. Retail. Consum. Serv., vol. 54, p. 101940, 2020, doi:
https://doi.org/10.1016/j.jretconser.2019.101940.
[10] N. Isa, N. A. Kamaruzzaman, M. A. Ramlan, N. Mohamed, and M. Puteh, “Market basket analysis of customer
buying patterns at corm café,” Int. J. Eng. Technol, vol. 7, pp. 119–123, 2018.
[11] M. P. Tana, F. Marisa, and I. D. Wijaya, “Penerapan Metode Data Mining Market Basket Analysis Terhadap Data
Penjualan Produk Pada Toko Oase Menggunakan Algoritma Apriori,” JIMP (Jurnal Inform. Merdeka Pasuruan),
vol. 3, no. 2, 2018.
[12] A. Musalem, L. Aburto, and M. Bosch, “Market basket analysis insights to support category management,” Eur.
J. Mark., 2018.
[13] K. Tatiana and M. Mikhail, “Market basket analysis of heterogeneous data sources for recommendation system
improvement,” Procedia Comput. Sci., vol. 136, pp. 246–254, 2018.
[14] V. Santarcangelo, G. M. Farinella, A. Furnari, and S. Battiato, “Market basket analysis from egocentric videos,”
Pattern Recognit. Lett., vol. 112, pp. 83–90, 2018.
[15] M. A. Valle, G. A. Ruz, and R. Morrás, “Market basket analysis: Complementing association rules with
minimum spanning trees,” Expert Syst. Appl., vol. 97, pp. 146–162, 2018.
[16] D. L. Olson and G. Lauhoff, “Market basket analysis,” in Descriptive Data Mining, Springer, 2019, pp. 31–44.
[17] A. A. Aldino, E. D. Pratiwi, S. Sintaro, and A. D. Putra, “Comparison of market basket analysis to determine
consumer purchasing patterns using fp-growth and apriori algorithm,” in 2021 International Conference on
Computer Science, Information Technology, and Electrical Engineering (ICOMITEE), 2021, pp. 29–34.
[18] E. Umar, D. Manongga, and A. Iriani, “Market Basket Analysis Menggunakan Association Rule dan Algoritma
Apriori Pada Produk Penjualan Mitra Swalayan Salatiga,” J. MEDIA Inform. BUDIDARMA, vol. 6, no. 3, pp.
1367–1377, 2022.
[19] L. A. M. Fajar and R. Rismayati, “Rekomendasi Paket Menu Angkringan Waru Tanjung Bias Dengan Algoritma
Frequent Pattern Growth Berbasis Web,” JTIM J. Teknol. Inf. Dan Multimed., vol. 3, no. 2, pp. 91–97, 2021.
[20] A. Griva, C. Bardaki, K. Pramatari, and D. Papakiriakopoulos, “Retail business analytics: Customer visit
segmentation using market basket data,” Expert Syst. Appl., vol. 100, pp. 1–16, 2018.
[21] S. K. Dubey, S. Mittal, S. Chattani, and V. K. Shukla, “Comparative Analysis of Market Basket Analysis through
Data Mining Techniques,” in 2021 International Conference on Computational Intelligence and Knowledge
Economy (ICCIKE), 2021, pp. 239–243.
[22] I. Qoniah and A. T. Priandika, “Analisis Market Basket Untuk Menentukan Asossiasi Rule Dengan Algoritma
Apriori (Studi Kasus: Tb. Menara),” J. Teknol. Dan Sist. Inf., vol. 1, no. 2, pp. 26–33, 2020.