
Sales Analysis of E-Commerce Websites Using Data M

This document discusses using data mining techniques to analyze sales data from e-commerce websites. Specifically, it proposes using decision trees to develop pricing models based on customers' past purchasing behaviors. The goal is to optimize prices to maximize profits and customer satisfaction. Data mining can help e-commerce companies better understand customer purchasing patterns to develop effective marketing strategies.


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/290786852

Sales Analysis of E-Commerce Websites using Data Mining Techniques

Article  in  International Journal of Computer Applications · January 2016


DOI: 10.5120/ijca2016907812


All content following this page was uploaded by Anurag Bejju on 05 March 2016.



International Journal of Computer Applications (0975 – 8887)
Volume 133 – No.5, January 2016

Sales Analysis of E-Commerce Websites using Data Mining Techniques
Anurag Bejju
Department of Computer Science
Birla Institute of Technology & Science,
Pilani, Dubai Campus, P.O.Box 345055,
Dubai International Academic City, Dubai,
UAE

ABSTRACT
In the emerging global economy, e-commerce is a strong catalyst for economic development. The rapid growth in the use of the Internet and Web-based applications is decreasing the operational costs of large enterprises, extending trading opportunities and lowering the financial barriers to active e-commerce participation. Many companies are restructuring their business strategies to attain maximum value in terms of profits as well as customer satisfaction. Business leaders around the globe are realizing that e-commerce is not just the trading of products and information over the Internet; rather, it provides an opportunity to compete with other giants in the market. Data mining (DM) is used to extract knowledge from available information in order to help companies make well-weighed decisions. An organization needs to invest only in the group of products that are frequently purchased by its customers, and to price them appropriately, in order to attain maximum customer satisfaction. The objective of this paper is to evaluate, propose and improve traditional pricing strategies by using web mining techniques to collect information from e-commerce websites and applying data mining methods to extract useful information from it. The proposed strategy can be generated by optimizing decision trees in an iterative process and exploiting information about the historical buying behavior of customers.

Keywords
E-Commerce, Data Mining, ID3 Algorithm

1. INTRODUCTION
The Web is one of the most revolutionary technologies to have changed the business environment, and it has a dramatic impact on the future of electronic commerce (EC). The future of EC will accelerate the shift of power toward the consumer, which will lead to fundamental changes in the way companies relate to their customers and compete with one another. Previous studies in the Information Science (IS) literature, such as The Consumer Behavior towards online shopping of electronics in Pakistan (Adil Bashir 2013), Online Consumer Behaviour (Dr. Bas Donkers 2013), Influencing the online consumer's behavior: the Web experience (Efthymios Constantinides 2010) and Post-purchase behavior (Dibb et al., 2004; Jobber, 2010; Boyd et al., 2012; Kotler, 2011; Brassington and Pettitt, 2013), have proposed various models explaining customer buying behavior. These research models typically derive hypotheses from a literature review. Based on these hypotheses, multi-channel customer choice data can be evaluated. Commerce networks involve buying and selling activities among individuals or organizations [1].

Getting a deeper understanding of e-commerce networks, such as the Flipkart market space, in terms of structure, interactions, trust and reputation has tremendous value in developing business strategies and building effective user applications. Nowadays, web data provides comparative advantages for mass merchants to analyze and reveal important parts of online consuming behavior [2]. This paper discusses examples of multi-channel strategies and designs a pricing model which focuses on the 4 P's of the marketing mix. Based on the analysis of the retailer's transaction data and a literature review, we derive hypotheses to explain consumer purchasing behavior.

2. BACKGROUND
The e-commerce industry represents one of the largest industries worldwide. For example, in the United States, it is the second largest industry in terms of both the number of establishments and profits, with $3.8 trillion in sales annually [3]. In addition, this industry is facing trends similar to those affecting other sectors, for instance, the globalization of markets, aggressive competition, increasing cost pressures and the rise of customized demand with high product variants. Manual capture of sales information increases transaction costs and can cause inventory inaccuracies.

This kind of processing involves numerous human interventions at different levels, such as order taking, data entry, processing of the order, invoicing and forwarding. The accuracy of the resulting model is questionable, and a few important factors may not be considered while developing it. To overcome this problem, data mining can be used to analyze big data and develop efficient marketing strategies. The domain is ideal because many of the ingredients required for successful data mining are easily satisfied: data records are plentiful, electronic collection provides reliable data, insight can easily be turned into action, and return on investment can be measured [4].

3. DATA MINING AND CONSUMER BEHAVIOR IN E-COMMERCE
In the past few years, the development of the World Wide Web has exceeded all expectations. Retrieving data has become a very difficult task, taking into consideration the impressive variety of the Web. The Web consists of several types of data, such as text, images, audio or video, structured records such as lists or tables, and hyperlinks. Web content mining can be used to mine text, graphs and pictures from a Web page and apply data mining algorithms to generate patterns used for knowledge discovery [5]. For a successful e-commerce site, reducing user-perceived latency is the second most important quality after good site-navigation quality.

The most successful approach towards reducing user-perceived latency has been the extraction of path traversal patterns from past users' buying history to predict future user buying behavior and to fetch the required resources [6]. Vallamkondu & Gruenwald (2003) describe an approach to predict user behavior in e-commerce sites. The core of their approach involves extracting knowledge from integrated data of purchase and path traversal patterns of past users to develop a pricing model which focuses on profits as well as customer satisfaction [7]. Web sites are often used to establish a company's image, to promote and sell goods and to provide customer support. The success of a web site directly affects the success of the company in an electronic market.

3.1 Product Strategy
A product is anything that can be offered to a market for attention, acquisition, use, or consumption that might satisfy a want or need (Kotler, 2001). In an e-commerce marketing strategy, it is important to remember that information is now its own viable product. In the physical world, a shopper who wants to buy something has to manually sift through millions of choices. A complete search of all offerings would be extremely expensive, time-consuming and practically impossible. Instead, consumers rely on product suppliers and retailers to aid them in the search. This allows the suppliers and providers to use the consumers' cost of search as a competitive advantage. However, on the Internet, consumers can search much more comprehensively and at virtually no cost [11].

By using the direct access to consumers enabled by the Internet, companies can collect information, identify target consumers, and better introduce products or services to meet consumers' needs. If a customer finds all the desired product types, it will directly affect the customer satisfaction index (refer to Table 1.1). After analyzing the training set of 1185 instances, it was found that 60% of customers were satisfied with phones, while TVs had the least customer satisfaction in terms of online product rating.

3.2 Price Strategy
In the earliest days of Internet commerce, many economists and media observers predicted that competition among Internet retailers would quickly resemble perfect competition. After all, the Internet already reduces search costs relative to visiting physical stores, and comparison sites could be expected to lower search costs still further. The question of how pricing impacts consumer purchasing behavior is interesting. In this paper, we discuss one such application, measuring the potential magnitude of bias in the consumer price index arising from underweighting Internet commerce. Price is the only element of the marketing mix to generate revenues. Internet pricing decisions will be just as crucial as they traditionally have been.

The Internet will lead to increased price competition and the standardization of prices. Also, the ability to compare prices across all suppliers using the Internet and online shopping services will lead to increased price competition. Finally, the price of providing Internet-based services often contains little or no marginal costs. Organizations will have to employ new pricing models when selling over the Internet. A negative correlation of -0.17907 was found between price and online rating on the training set of 1185 instances. It was also found that

3.3 Productivity Concept
Business profits can be increased by increasing revenue through stronger sales and/or by decreasing the costs associated with constant sales. One of the major factors in customer satisfaction is the availability and timeliness of the delivery of products. If a customer has to wait to receive a product, it can be detrimental to their feeling of satisfaction [13]. With that in mind, avoiding back orders should be a major goal of any business.

Table 3. Correlation Between Productivity and Online Rating

Through a strong inventory or warehouse management system, you will be able to use product demand forecasts and lead time tracking to ensure that your warehouse is always stocked with the necessary products at the proper times (refer to Fig 1.3). A positive correlation of 0.065128 was found between productivity and online rating on the training set of 1185 instances.
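The correlation values quoted in Sections 3.2 and 3.3 relate an attribute to the online rating. The paper does not say which coefficient was used, so the sketch below assumes the Pearson coefficient. The function name `pearson` and the toy price and rating values are hypothetical and are not taken from the paper's data set.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two attributes around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy records (assumption): price and online rating per product.
prices = [500, 1200, 3000, 8000, 14000]
ratings = [4, 4, 3, 3, 2]
r_price = pearson(prices, ratings)
print(f"price vs. rating: r = {r_price:.4f}")
```

A value near zero, such as the 0.065128 reported for productivity, indicates a weak linear relationship, while the sign shows whether the rating tends to rise or fall with the attribute.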


Fig 1.1. Application of ID3 algorithm on data extracted from e-commerce website


Fig 1.2. Partial Decision Tree after applying ID3 Algorithm

Fig 1.3. Decision Tree developed using ID3 Algorithm

4. APPLICATION OF ID3 ALGORITHM TO PREDICT ONLINE RATING OF A PRODUCT
Decision trees are used in the visualization of probabilistic business models. By generating a tree, a customer's area of interest for the products can be determined. ID3 (Iterative Dichotomiser 3) is a simple decision tree algorithm developed by Ross Quinlan (1983). It creates a decision tree from a given data set by using a top-down greedy approach that checks each attribute at every tree node. In the decision tree method, the information gain approach is generally used to determine a suitable attribute for each node of the generated decision tree. So, the entropy of each attribute is calculated first, and the information gain is calculated accordingly. The attribute which has the maximum information gain is set at the root node of the tree, and sub-trees with further nodes are generated from it in the same way.

Fig 1.4. Performance Evaluation
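The attribute-selection step of ID3 described in this section can be sketched as follows. This is a minimal illustration in Python, not the author's implementation. The function names and the already-discretized toy records are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(part) / len(rows) * entropy(part)
                    for part in partitions.values())
    return base - remainder

def best_attribute(rows, attributes, target):
    """Greedy ID3 choice: the attribute with the maximum information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical, already-discretized toy records (assumption, not the paper's data).
rows = [
    {"type": "laptop", "price": "high", "rating": 3},
    {"type": "laptop", "price": "low",  "rating": 3},
    {"type": "phone",  "price": "high", "rating": 4},
    {"type": "phone",  "price": "low",  "rating": 4},
    {"type": "tv",     "price": "high", "rating": 2},
    {"type": "tv",     "price": "low",  "rating": 2},
]
root = best_attribute(rows, ["type", "price"], "rating")
print("root attribute:", root)
```

In this toy set, splitting on `type` separates the ratings completely while `price` tells us nothing, so `type` is chosen as the root. The full algorithm would then repeat the same selection on each partition.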


The product name, product price, quantity, type and online rating are first mined from the flipkart.com website. Unique attribute values are deleted, and continuous attributes like price and quantity are discretized to get effective results. The equal-width binning technique divides the range of possible values into N sub-ranges of the same size [14].

5. EXTRACT CLASSIFICATION RULES
Data classification is an important data mining task that tries to identify common characteristics in a set of N objects contained in a database and to categorize them into different groups. We extract classification IF-THEN rules from those equivalence classes. For an equivalence class, a rule such as IF (a is greater than or equal to 500 AND a is less than 14600) THEN IF b is equal to "laptop" THEN IF (c is greater than or equal to 25 AND c is less than 650) THEN x = 3 can be pruned by following a path in this tree. Here x is the rating which a product can get. Using these rules, a web application is developed with JavaScript, HTML and CSS [15].

6. MODEL EVALUATION
The confusion matrix is a useful tool for analyzing how well a classifier can recognize tuples of different classes. TP and TN tell us when the classifier is getting things right, while FP and FN tell us when the classifier is getting things wrong (mislabeling). Given m classes (where $m \ge 2$), a confusion matrix is a table of at least size m by m. An entry $CM_{i,j}$ in the first m rows and m columns indicates the number of tuples of class i that were labelled by the classifier as class j. For a classifier to have good accuracy, ideally most of the tuples would be represented along the diagonal of the confusion matrix, from entry $CM_{1,1}$ to entry $CM_{m,m}$, with the rest of the entries being zero or close to zero. In this case m = 4, so we have a 4 × 4 matrix. After applying the ID3 algorithm, this model has 86.4780% accuracy (i.e., it correctly predicted about 86 out of every 100 test cases) [16].

7. CONCLUSION
In this paper, a detailed study based on data mining techniques was conducted in order to extract knowledge from a data set with information about users' history associated with an e-commerce website. These data sets are directly mined from Flipkart.com using online software which converts HTML documents to data tables. The main purpose of web mining the data is to apply a set of descriptive data mining techniques to induce rules that allow data analysts working at e-commerce companies to make strategic decisions to boost their sales as well as provide effective customer service. The techniques used to discover patterns are web mining and decision tree algorithms. In the future, this study can be used to analyze e-commerce websites and obtain interesting knowledge to further the companies' profits.

Many of the e-commerce strategy frameworks offer a unique contribution to strategic planning but with limited solutions. This model, based on web mining, integrates McCarthy's 4Ps to provide a complete analysis of e-business strategies. Thus managers can use an organized and precise process to make more successful and effective decisions. Aggressive competition has been observed in the market space among the companies, thus accelerating the consumer dynamics. E-commerce will lead to increased price competition, and this web application will provide an efficient way to price a particular product. It was found that price, product and production had an impact on online customer ratings. This model considers these three attributes, which are correlated to customer satisfaction, and helps the marketer make an informed decision.

8. REFERENCES
[1] Han, Jiawei, Micheline Kamber, and Jian Pei. "Data mining: concepts and techniques," Morgan Kaufmann, 2006.
[2] Quinlan, J. R. (1986). "Induction of decision trees," Machine Learning, Vol. 1-1, pp. 81-106.
[3] J. R. Quinlan, "C4.5: Programs for Machine Learning," Morgan Kaufmann Publishers, Inc., 1993.
[4] Ding Xiang-wu and Wang Bin, "An Improved Pre-pruning Algorithm Based on ID3," Jisuanji Yu Xiandaihua, Vol. 9, pp. 47, 2008.
[5] Ming Fan, Xiaofeng Meng translated, "Data mining techniques and concepts," Machinery Industry Press, Beijing, pp. 136-145, Feb. 2004.
[6] N R Srinivasa Raghavan, "Data mining in e-commerce: A survey," Sadhana, Vol. 30, Parts 2 & 3, April/June 2005, pp. 275-289.
[7] B. Schafer, J.A. Konstan, and J. Riedl, "E-Commerce Recommendation Applications," Data Mining and Knowledge Discovery, Kluwer Academic, 2001, pp. 115-153.
[8] P. Resnick et al., "GroupLens: An Open Architecture for Collaborative Filtering of Netnews," Proc. ACM 1994 Conf. Computer Supported Cooperative Work, ACM Press, 1994, pp. 175-186.
[9] J. Breese, D. Heckerman, and C. Kadie, "Empirical Analysis of Predictive Algorithms for Collaborative Filtering," Proc. 14th Conf. Uncertainty in Artificial Intelligence, Morgan Kaufmann, 1998, pp. 43-52.
[10] Baesens, B. Verstreeten, G. Poel, D. (2004). Bayesian network classifiers for identifying the slope of the customer lifecycle of long-life customers. European Journal of Operation Research, 156, 508-523.
[11] Cheng, C.H. Chen, Y.S. (2009). Classifying the segmentation of customer value via RFM model and RS theory. Expert System with Applications, 36, 3, 4176-4184.
[12] Hwang, H., Jung, T. Suh, E., (2004). An LTV model and customer segmentation based on customer value
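As a closing illustration, the data preparation, rule and evaluation steps of Sections 4 to 6 can be tied together in a short Python sketch. It is a minimal sketch under stated assumptions, not the author's implementation (the paper's application was written in JavaScript, HTML and CSS). All function names, the bin boundaries and the toy labels are hypothetical. The symbols a, b, c and x are kept exactly as they appear in the example rule of Section 5.

```python
def equal_width_bin(value, low, high, n_bins):
    """Return the 0-based index of the equal-width bin that `value` falls into."""
    if not low <= value <= high:
        raise ValueError("value outside [low, high]")
    width = (high - low) / n_bins
    # The maximum value is placed in the last bin instead of opening a new one.
    return min(int((value - low) / width), n_bins - 1)

def rule_rating(a, b, c):
    """The example IF-THEN rule of Section 5; returns None when it does not fire."""
    if 500 <= a < 14600 and b == "laptop" and 25 <= c < 650:
        return 3
    return None

def confusion_matrix(actual, predicted, classes):
    """matrix[i][j] = number of tuples of class i labelled as class j."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of tuples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical toy labels for m = 4 rating classes (assumption, not the paper's data).
classes = [1, 2, 3, 4]
actual = [1, 2, 3, 4, 3, 2, 1, 4]
predicted = [1, 2, 3, 4, 3, 3, 1, 4]
matrix = confusion_matrix(actual, predicted, classes)
print("bin of 500 in [0, 1000] with 4 bins:", equal_width_bin(500, 0, 1000, 4))
print("rule output:", rule_rating(1000, "laptop", 100))
print("accuracy:", accuracy(matrix))
```

The accuracy is the sum of the diagonal entries divided by the number of test tuples. This is the quantity the paper reports as 86.4780% for its 4 × 4 matrix.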
