A Comprehensive Study On Using Data Mining in ERP Systems
Subramaniakrishnan K., Kishan Grish, Anjali Bundela, Mohd. Haseeb, Imroz Farooq
Abstract
Enterprise Resource Planning (ERP) helps an enterprise collect disparate data into a single
database. Departments such as sales, production and finance can use a single application, so
that their data is stored in that one database. This makes it much easier to analyse the
stored data from all departments, and gives rise to the need for tools to analyse that data.
Data mining plays a big role in ERP implementation, as it is crucial to the successful
running of an enterprise today. It extracts information from the data that helps
management arrive at better decisions and forecasts. This article discusses
the ways in which data mining techniques can be applied to various
departments of an organization, focusing mainly on Sales & Marketing and CRM, based on
existing literature in the field.
Introduction
The growth of an organization in the current business environment rests heavily on
the customer data that it obtains on a daily basis. Many small and large
companies face an acute problem when they try to analyse and manage the large
volume of data from all their processes effectively (Jagtap & Jaiswal, 2014). This data
cannot be analysed by the naked eye, nor can predictions be based on a hunch. It needs
specific tools and software so that it can be analysed and recorded in a proper format.
Businesses keep changing at a rapid pace, and with this change comes the need to switch from
legacy business systems. That is why ERP systems are used, so that businesses can integrate
their data efficiently (Saad & Alghamdi, 2012). With ERP, it becomes much easier to
manage and filter data.
With the progress of information technology, many new tools have brought
advancements such as object-oriented techniques, soft computing techniques, AI, the internet,
data warehouses and data mining technology. One of the most critical application areas for
data mining is ERP. Using data mining, organizations can perform
effective market analysis, study customer feedback, identify similar products, retain
highly profitable customers and make smart business decisions (Amirthalingam,
Shaheen, Kousar, & Bilfaqih, 2014). Data mining and clustering are increasingly used in
new applications to arrive at better solutions as technologies improve.
The ERP system, which holds data from all departments, uses data mining and certain
algorithms to improve marketability and
efficiency. ERP brings all the scattered data together, which lets data mining analyse
the hidden data patterns that help the company improve its business and functioning
(Kasapoglu & Gursoy, 2015). Data mining can help organizations create value for their
customers, both internal and external. An ERP database on its own only supports analysis of
past data, but when ERP and data mining are integrated, future trends can be predicted.
Although ERP has grown at a tremendous rate in recent years, many companies and
organizations still face problems in implementing it. ERP systems are
able to solve the problems that concern mid-level management. However, firstly, the systems
are not equipped to analyse the whole of the data, so a huge amount of data goes to
waste. Secondly, ERP systems are not able to help high-level management come
up with better strategy and decisions (Sathiyamoorthi & Bhaskaran, 2009). An ERP
system eliminates most cross-functional coordination issues and provides a
single comprehensive database that can be made available to every member of the
organization. Data from all sections of the organization is collected, and real-time updates
are made and fed into the various modular applications. Data mining is used to analyse all
this data and to find hidden patterns and anomalies.
Literature Review
A major problem when integrating ERP and data mining through data warehouses is
inconsistent, incomplete and duplicate data. This data has to be cleaned properly, and any
duplicates found have to be eliminated. The data is first collected and transferred to a
data staging area so that it can be processed. The next stage is transformation, which
checks the data's accuracy and removes unwanted data. After this, the quality of the data
is verified. In the next stage, as the system keeps collecting
data, old and outdated data has to be removed periodically. Since the data warehouse plays a
major role in ERP systems, its safety is a top priority, and it also has to be easily
accessible to everyone in the organization for real-time updates (Sathiyamoorthi & Bhaskaran, 2009).
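The staging, transformation and purging stages described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the field names
(order_id, customer, product, order_date), the sample rows and the cutoff date are all
hypothetical and do not come from the cited study.

```python
from datetime import date

REQUIRED = ("order_id", "customer", "product", "order_date")  # hypothetical schema

def clean_records(staged, cutoff):
    """Drop incomplete, duplicate and outdated records from a staging list."""
    seen = set()
    cleaned = []
    for rec in staged:
        # Incomplete data: skip rows with a missing or empty required field.
        if any(rec.get(field) in (None, "") for field in REQUIRED):
            continue
        # Transformation: normalise text so that equal values compare equal.
        customer = rec["customer"].strip().title()
        product = rec["product"].strip().lower()
        # Duplicate data: keep only the first occurrence of each order line.
        key = (rec["order_id"], customer, product)
        if key in seen:
            continue
        # Outdated data: purge rows older than the cutoff date.
        if rec["order_date"] < cutoff:
            continue
        seen.add(key)
        cleaned.append({"order_id": rec["order_id"], "customer": customer,
                        "product": product, "order_date": rec["order_date"]})
    return cleaned

# Made-up staging rows, for illustration only.
staged = [
    {"order_id": 1, "customer": " acme ltd ", "product": "Pen", "order_date": date(2020, 3, 1)},
    {"order_id": 1, "customer": "Acme Ltd", "product": "pen", "order_date": date(2020, 3, 1)},
    {"order_id": 2, "customer": "Zenith", "product": "", "order_date": date(2020, 4, 1)},
    {"order_id": 3, "customer": "Zenith", "product": "ink", "order_date": date(2015, 1, 1)},
    {"order_id": 4, "customer": "Zenith", "product": "ink", "order_date": date(2021, 1, 1)},
]
warehouse_rows = clean_records(staged, cutoff=date(2019, 1, 1))
```

In this sketch the second row is dropped as a duplicate, the third as incomplete and the
fourth as outdated, so only orders 1 and 4 reach the warehouse.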
Online Analytical Processing (OLAP) techniques help in the graphical representation of
data and thus give different perspectives on the data collected by ERP systems, which
enables the organization to integrate business processes and to optimize the available
resources. OLAP and data mining have different processing methods but can be
integrated to accomplish new tasks. A data warehouse holds a collection of
integrated, subject-oriented and non-volatile data that supports business
intelligence and decision making. Running OLAP and data mining as separate processes
makes the system complex, since both engines shut down at the end of their own process
and combined analytics is difficult to produce. Integrating their processes therefore helps
to get a better understanding of the data (Abdellatif & Elsoud, 2011). Organizations that
use ERP systems can profit from the insights that OLAP and data mining approaches
deliver. As they can perform diverse tasks on ERP data, their integration with
knowledge discovery is very valuable for performing new tasks that may be required by
an organization's managers and ERP users. This integration can increase customer loyalty
and, ultimately, the growth of the enterprise, and helps in managing
customers more effectively.
Using ERP, customer leads can be generated and products can easily be marketed.
The most difficult part is to figure out which customer to target for a
specific product, says Bhadlawala (2013), who carried out a study analysing
customer order data with the Apriori algorithm so that customer interests can be
identified. The algorithm processes the product sales data and
produces combinations of products, discarding any combination
that does not satisfy the conditions. The algorithm first collects all the order data from
the sales department and produces a list of distinct customers who can be targeted for
marketing. It then selects customers one by one and performs an analysis on the
data generated, building the combinations of products bought by each customer across all
their orders from the organization. Once the data has been processed completely, the results
contain product combinations that can be matched to the interests of the customers. The data
obtained from orders can be compared with the output of the Apriori algorithm to make
relevant comparisons. The study assumes that the product quantity in each order is
one, but real data can have a quantity of 'n', which could be used in an enhanced study.
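A minimal sketch of the general Apriori idea is given below: it counts candidate product
combinations level by level and discards those that fall below a minimum support count.
It is not the customer interest finder algorithm of the cited study; the function name,
the min_support parameter and the sample orders are assumptions made for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset bought in >= min_support orders."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct product on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many orders contain each candidate combination.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        # Discard combinations that do not satisfy the support condition.
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Made-up orders with a quantity of one per product, as the study assumes.
orders = [
    {"pen", "ink"},
    {"pen", "paper"},
    {"pen", "ink", "paper"},
    {"ink"},
]
frequent_sets = apriori(orders, min_support=2)
```

With these sample orders, the pairs pen and ink, and pen and paper, each appear in two
orders and are kept, while ink and paper appear together only once and are discarded.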
All retail stores that have a computerised billing system need data mining, since it
helps them analyse the huge amounts of data they collect. This analysis helps in business
processes such as marketing promotions, inventory management and customer relationship
management. Kasapoglu and Gursoy (2015) also discuss the same method, which analyses the
purchase behaviour of customers. It tells the retailer which products are bought
together. Knowing this helps the seller understand
market needs, and the products can be placed together in the store so that the
customer does not miss an opportunity to buy them. This information also helps the retailer
plan purchasing, inventory and supply chain management. Data mining and Enterprise
Resource Planning go hand in hand in the retail sector.
Jagtap and Jaiswal (2014) describe an
Android application that gives the owner access to the profit and loss
statements and provides graphs for analysing the data. The owner can handle
various operations such as payments, bills and suppliers through the application, which
makes the system more effective. Clustering accuracy can be increased, and
clustering time reduced, by using algorithms such as K-means and Apriori, which
enable sampling of a large percentage of the data and in turn provide flexibility in
interaction with the database. Jagtap and Jaiswal (2014) also describe a standalone model
used by employees so that the daily transactional data in the database can be stored,
retrieved and updated, using algorithms such as K-means for clustering and aggregation of
data. This model alone is not sufficient for effective use of the ERP system. They
say that a web server model should also be used by the employees so that the
standalone model's efficiency is increased. SQL queries are handled by the web
server so that no errors are generated, which also helps in effective retrieval of data
from the database.
Hanumanth and PrasadaBapu (2013) explain the importance of the clustering data
mining technique in sales and distribution functions. Sales and distribution is
important in a company, as it handles business processes such as forecasting sales,
managing inventory and logistics, pricing and customer relationships. Steel products were
chosen for the study, which relates to customer demand. It aims to cluster the sales data
and show that the clusters are meaningful. Important aspects of the sales function include
product attributes, annual sales targets, setting prices, net sales realization, handling
quality complaints and launching new products. Data mining was used to understand these
aspects from the customer's point of view.
The study uses Weka, a software package that contains machine learning algorithms for data
mining tasks, among them CLOPE, DBScan, EM and XMeans. The paper explains
hierarchical clustering, partition clustering and density-based clustering. To implement
clustering, the first step is to import the data with an SQL query on the applicable source
fields. The authors chose the ZeroR classifier. The attributes are then evaluated, the
clustering algorithms are run, and the results are presented in visual form. The results
show that the clustering algorithms help in determining the demand and net sales
realization for in-development products. The algorithms also give data on lost order
opportunities and sales volume forecasts for new products, and the results indicate
that a relationship can be established between
pre-sale and post-sale activities. For any data mining application to be successful, it is
essential that its results are precise and acceptable. According to the results of the
study, the clustering techniques are reliable, and the clustering algorithms are simple
and easy to implement. They work well in practice and are fast and computationally
efficient.
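To make the partition clustering idea concrete, the following is a minimal sketch of plain
K-means in Python. It is not one of the Weka algorithms used in the cited study (CLOPE,
DBScan, EM, XMeans); it stands in for them as the simplest partition-based method. The
two-attribute sales points and the choice of the first k points as starting centroids are
assumptions made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Partition points into k clusters; returns (centroids, clusters)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters

# Made-up (order volume, net realization) pairs, for illustration only.
sales = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(sales, k=2)
```

On this sample the low-volume and high-volume orders separate into two clusters of three
points each. Real sales data would normally need scaling of the attributes first, since
K-means relies on Euclidean distance.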
OLAP, which is similar to data mining, mostly deals with data in the form of cubes and
supports analysis across various dimensions in real time. Data mining, on the other
hand, helps in analysing future trends by sifting through the huge amount of data stored in
the repositories. OLAP helps to evaluate queries, whereas data mining helps in building a
model. Abdellatif and Elsoud (2011) say that applying association rules in OLAP presents
the data as a multi-dimensional cube that aids understanding, whereas applying
the Apriori algorithm with association rules in data mining yields the associations between
products and helps in analysing customer purchase behaviour. OLAP helps in
analysing only past data, whereas data mining helps in predicting the future along a single
dimension. Integrating the two has a greater advantage: it finds interesting
patterns among multiple dimensions at different levels of abstraction. It is therefore
better for ERP systems, which hold all of the organization's data, to integrate data
mining and OLAP for better business functioning. Analysis is done by choosing a
testing attribute in OLAP, which classifies the principal point, and data mining is then
carried out from that point, sub-division by sub-division. Testing is done to extract the
results; afterwards, the testing quality is assessed with the help of the web tool from the
testing times, abnormalities and errors.
A Data Transformation Service (DTS) is used in ERP systems; it provides the language for
the Meta Data Architecture Method that helps in data mining. The data is read using the
language provided by the ERP system and then transformed. Chen and Chang (2002)
infer that, although OLAP and data mining each have their own processing
steps such as testing and DTS, they become complex when functioning individually, so it is
in the best interest of ERP to integrate them to achieve the best results. Baiju and
Vardhini (2017) reach the same conclusion as Chen and Chang (2002), with one further
deduction: there are 4 steps in integrating
OLAP and ERP in data mining, and a self-service tool named Power BI is
needed to complete the integration successfully.
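The cube view that OLAP provides can be illustrated with a small roll-up over a list of
sales facts. This is a sketch only: the dimension names (region, product, quarter), the
measure (amount) and the sample facts are hypothetical, and a real OLAP engine would
precompute and index such aggregates.

```python
from collections import defaultdict

def roll_up(facts, dimensions, measure):
    """Sum `measure` over the chosen dimensions, as in an OLAP roll-up."""
    cube = defaultdict(float)
    for row in facts:
        # The cell of the cube is identified by the values of the kept dimensions.
        cell = tuple(row[d] for d in dimensions)
        cube[cell] += row[measure]
    return dict(cube)

# Made-up sales facts, for illustration only.
facts = [
    {"region": "north", "product": "pen", "quarter": "Q1", "amount": 100},
    {"region": "north", "product": "ink", "quarter": "Q1", "amount": 200},
    {"region": "south", "product": "pen", "quarter": "Q2", "amount": 50},
]
by_region = roll_up(facts, ("region",), "amount")
by_region_product = roll_up(facts, ("region", "product"), "amount")
```

Rolling up to fewer dimensions gives the coarser level of abstraction, while keeping more
dimensions gives the finer one. A mining step such as association rule discovery could then
be run on the rows behind any single cell.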
Data mining is used to extract information, and can be considered a Decision Support
System that can be used to generate rules. Saad and Alghamdi (2012) discuss an ERP-CRM
model which can be used to resolve business issues. Whenever a customer sends a query to
the company, it is directed to the department concerned, evaluated and answered.
The database then stores the whole process for future use. With the
ERP-CRM model in place, customer queries can be predicted from
the available data, which frees up time for the managers who deal with such
queries. This can become a decision-making process in itself. The Frequent Pattern growth
(FP-growth) algorithm was applied to the available data using a tool called RapidMiner, so
that rules can be generated for the future and customer queries can be answered more
easily. The database contained many queries, of which 15 were selected for the
implementation. This input file was then processed using the algorithm, producing a list
of rules. The rules help management guide the
respective customers. The knowledge management database helps customers through
an automated response machine based on the rules, and it is updated as and when
new queries are saved in the database. It can also be used by people inside the
organization.
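The kind of rule produced from stored queries can be sketched as follows. Note that this is
not FP-growth and does not use RapidMiner: for a handful of records, brute-force counting
of item pairs stands in for the frequent pattern step and yields "if A then B" rules with
a support count and a confidence. The query attributes and both thresholds are assumptions
made for illustration.

```python
from collections import Counter
from itertools import permutations

def pair_rules(transactions, min_support, min_confidence):
    """Return sorted (antecedent, consequent, support, confidence) tuples."""
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = set(t)
        item_counts.update(items)
        # Count every ordered pair (a, b) that occurs in the same record.
        pair_counts.update(permutations(sorted(items), 2))
    rules = []
    for (a, b), support in pair_counts.items():
        # Confidence of a -> b: share of records with a that also contain b.
        confidence = support / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)

# Made-up query records (query topic, handling department), for illustration only.
queries = [
    {"billing", "finance"},
    {"billing", "finance"},
    {"billing", "sales"},
    {"delivery", "logistics"},
]
rules = pair_rules(queries, min_support=2, min_confidence=0.6)
```

Here the rule from finance to billing holds in every record that mentions finance, while
the rule from billing to finance holds in two of the three billing records. An automated
response system could look up the matching rule when a new query arrives.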
Data Analysis
Bhadlawala (2013) proposed using the Apriori algorithm with association rules to arrive at
a new algorithm, the customer interest finder algorithm. Data from the sales
department is processed to generate results. The data is coupled
with the Apriori algorithm to give a meaningful analysis from which marketing tactics can
be derived. The data shows how many times each item has been bought by the respective
customers, as well as the combinations purchased by the customers.
From the table above, we can infer that CD sells the most. For Alice, CD and milk have the
same priority. Bread has a priority of 11 but for James, buying bread alone is not worthwhile.
This data can then be used to draw up marketing plans, so that existing customers can
be targeted and new or potential customers can be attracted toward
the products.
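The two kinds of counts described above, items per customer and combinations per customer,
can be computed with a short Python sketch. It illustrates the counting step only and does
not reproduce the study's data or its priority values; the customer identifiers and orders
are made up.

```python
from collections import Counter, defaultdict
from itertools import combinations

def customer_interest(orders):
    """Count, per customer, how often each item and each item pair was ordered."""
    item_counts = defaultdict(Counter)
    combo_counts = defaultdict(Counter)
    for customer, products in orders:
        # Quantity is treated as one per product, so duplicates are dropped.
        unique = sorted(set(products))
        item_counts[customer].update(unique)
        combo_counts[customer].update(combinations(unique, 2))
    return item_counts, combo_counts

# Made-up (customer, products) orders, for illustration only.
orders = [
    ("c1", ["pen", "ink"]),
    ("c1", ["pen", "paper", "ink"]),
    ("c2", ["paper"]),
]
item_counts, combo_counts = customer_interest(orders)
```

For customer c1 the pair of ink and pen occurs in both orders, which marks it as a
combination worth promoting to that customer, whereas c2 has no combinations at all.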
Conclusion
The importance and applications of data mining have been discussed; it plays a
significant role in an organization by reducing confusion and dilemma in
decision making. Thanks to developments in technology, companies can update even
the smallest piece of information from time to time. Market basket analysis done by data
mining is advantageous not only for the seller, but also for the customer. Shopping is made
easy by placing together products that customers might buy together. However, anyone who
wants to apply data mining has to be cautious about the reliability of the data being
collected and the safety of the data warehouse. Once reliable data is collected and mined,
it gives a visualization of the top priorities and helps in mapping the business processes,
which results in more revenue. Even though data mining has many advantages on its own,
integrating it with OLAP results in a better process that reduces complexity in functioning
and gives a multi-dimensional view that reveals the hidden facts in a dataset.
Data mining together with OLAP also helps organizations gain better insight into what
lies ahead. Data mining brings some cost with it, but the money
invested is likely to pay off through a better understanding of the
business.
References
1. Abdellatif, T. S., & Elsoud, M. A. (2011). Comparing Online Analytical Processing and
Data Mining Tasks In Enterprise Resource Planning Systems, 8(6), 161–175.
2. Amirthalingam, G., Shaheen, R., Kousar, M., & Bilfaqih, S. M. (2014). Integrated Data
Mining and Knowledge Discovery Techniques in ERP, 2(4), 210–214.
3. Baiju, B. V., & Vardhini, R. (2017). An Extensive Study on Data Warehouse, OLAP used
for Implementing ERP Systems, 8087–8092. https://doi.org/10.15680/IJIRCCE.2017.
4. Bhadlawala, S. (2013). Efficient Application of Data Mining for Marketing and Sales
Decision Making in ERP, 1–5.
5. Chen, R.-S., & Chang, C. (2002). A Web-based Data Mining System for ERP Decision
Making, 0–5.
6. Hanumanth, S., & PrasadaBapu, M. (2013). Analysis and Prediction of Sales Data in
SAP-ERP System using Clustering Algorithms, 1(4), 95–109.
7. Jagtap, P., & Jaiswal, S. (2014). Conceptual Model of ERP with Web Server and Android
Application Using K-means Clustering Based on Data Mining, 7782, 10–12.
8. Kasapoglu, O., & Gursoy, U. (2015). Data Mining and ERP: An Application
in Retail Sector, (June), 212–221. https://doi.org/10.20472/IAC.2015.017.040
9. Saad, A., & Alghamdi, A. (2012). Rules Generation from ERP Database: A Successful
Implementation of Data Mining, 12(3), 21–29.
10. Sathiyamoorthi, V., & Bhaskaran, V. M. (2009). Data Mining for Intelligent Enterprise
Resource Planning System, 2(3), 1–5.