Auto Sales Forecast

report file

Uploaded by

Manku Kashyap
Research on auto sales forecast based on online reviews: Take R brand automobile as an example

Yao Chen
Management School, Shanghai University of International Business and Economics, Shanghai
cheny@suibe.edu.cn

Mengyao Yao
Management School, Shanghai University of International Business and Economics, Shanghai
yaomengy1994@163.com

Jia Zhang
Economics and Management School, Tongji University, Shanghai
ericaqian@hotmail.com

Abstract—The rapid development of the automobile industry has led to the blind expansion of production capacity at various auto companies, which has brought serious potential overcapacity. Accurate forecasting of car sales is not only of great importance to automobile manufacturers and distributors in setting strategy and targets; it also influences the sound growth of China's automobile market and the healthy development of the national economy. This article takes R brand automobiles as the research object and collects the related online review data. The review data are processed quantitatively through natural language processing and a text sentiment model. Sales forecasts are made by combining the internal historical sales data of the automobile company with the online reviews of R brand and the sales and review data of competitive brands. Compared with the benchmark model, the sales forecasting model that adds the sentiment variables and the influence factors of the competitive brands improves the fit to the data.

Keywords—Online reviews, Sentiment analysis, Sales forecast

978-1-5386-5178-0/18/$31.00©2018 IEEE

I. INTRODUCTION

After several decades of development, China's automobile industry has grown into an important pillar industry of China's economy. China has also become the world's largest automobile producer and automotive consumer market, ranking first in the world in annual car sales for 8 consecutive years. In 2017, China's car production was 29 million, a year-on-year growth of 3.19%. Overall, China's auto market is still a very competitive industry. The development of the automobile industry concerns not only the industry itself but also the livelihood of the nation. However, the rapid development of the automobile industry has led each automobile company to expand capacity blindly, causing a huge waste of human and material resources. Therefore, accurate forecasting of automobile sales is very important to the production management of the enterprise.

Existing research on automobile sales forecasting in enterprises mostly uses a single data source, namely historical sales data, and makes little use of external user review data, ignoring the influence of the public opinion expressed in reviews on users' purchase intention. With the development of social networks, the network has become an important reference for consumer purchasing decisions and has an impact on product sales. The internet has become one of the quickest ways for users to obtain information. According to the China Internet Network Information Center, more than 80% of internet users browse product reviews before buying products. Online reviews are the most direct expression of users' feelings; they reflect the evaluation and trend of the market subject and provide the necessary foundation for the study of social and economic behavior.

In addition, the automobile is a special commodity. Unlike general FMCG, its use cycle is long and its purchase price is high, so users do more preparation before buying, mainly by browsing online reviews. Based on this, this paper studies the impact of online reviews on car sales. A regression forecast model is established using automobile online review data and historical sales data. It provides a new research idea for finding more scientific and efficient forecasting methods and for establishing more accurate forecasting models in subsequent research on sales forecasting.

II. LITERATURE REVIEW

A. Online Reviews
Online reviews are also called online consumer reviews or user generated content. They are a form of online word-of-mouth that evaluates products mainly in text form: the consumer's comments on a product or company submitted via the Internet [1]. This paper holds that online reviews are users' evaluations of purchased goods or services, published through various network channels. Consumers adjust their purchase decisions through online reviews and in turn generate online reviews, which are shared and disseminated through various online communities or comment sites for free reading and sharing by more web users.

B. Sentiment analysis
Sentiment analysis is also known as opinion extraction, opinion mining, or sentiment classification. It is the process of processing, analyzing and even reasoning about material that carries subjective human emotion [2]. Among representative studies, Hu and Liu [3] and Scaffidi [4] identify feature-opinion pairs based on the proximity of feature words and opinion words and on their co-occurrence frequency, while Choi and Cardie [5] and Zhang [6] identify feature-opinion pairs through sequence labeling models.

C. Prediction research based on sentiment analysis
Domestic and foreign scholars use the sentiment of online reviews to make forecasts in various industries. Gross-Klussmann and others, by identifying the sentiment in the news, predicted high-frequency yields, trading volumes and volatility on the London Stock Exchange [7]. In domestic research, Wang (2016) studied the correlation between big data and carbon price prediction and discussed the most effective prediction model, showing that the network structure ADL model can significantly improve the predictive effect [8]. Huang (2015), based on the theory of behavioral finance, discussed the influence of micro-blog sentiment on stock market forecasting; the experimental results show that a predictive model with micro-blog information can obtain higher accuracy [9]. Meng (2017) took skin care products as an example and studied the influence of online word-of-mouth on product sales. Compared with the benchmark model, the sales forecasting model with affective variables improves the fit to the data, while the overall predictive effect of consumers' scores is not very good and helps only at some nodes [10].

D. Auto Sales Forecast
The forecasting of auto sales falls mainly into two categories. One uses traditional data to predict; the other builds on the traditional analysis and adds big data.

The use of traditional data mainly involves macro factors and the internal data of automobile enterprises. Zhao (2014) studied the main factors that influence automobile sales, including economy, price and environment, and also considered government policy, transportation infrastructure and other influencing variables [11]. Chen (2011) forecast the demand of the whole Chinese market using only historical automobile sales data and the autoregressive integrated moving average (ARIMA) model; the experimental results show that the ARIMA model performs well in predicting macroscopic car sales [12].

In recent years, more and more studies have taken the impact of big data into account. Cui (2014) revealed the relationship between network search data and car sales; the analysis and test results show that, compared with the traditional vehicle sales forecasting method, this method has high prediction accuracy [13]. Liu (2017) discussed the impact of historical sales and brand sentiment on car sales forecasts using online review data and sales data. The results show that the average predictive error of the predictive model is 5.93%, which is 6.24% lower than that of the ARIMA model [14].

III. DATA ANALYSIS MODEL AND APPLICATION TO PRACTICE

A. Data sources and processing

1) Data sources and sentiment analysis
The main research object of this paper is the R automobile brand, together with two major competitive brands, H brand and G brand. The three models in the experiment are of the same type (SUV), are in the same price range (10 to 15 million), and are at the forefront in market reputation.

The sales data analyzed in this paper cover July 2016 to April 2018. The sales data for R models are provided by the R company; those for H and G vehicles are collected from professional vehicle channels, as shown in the following table:

TABLE I. THE EMOTIONAL INDEX OF THREE BRAND CARS
Time      R brand   H brand   G brand
2016.7    1600      39100     10100
2016.8    4322      40600     10100
2016.9    11450     53300     14100
…         …         …         …
2018.2    20409     30253     20238
2018.3    22065     38358     21919
2018.4    18500     34000     23000

The online review data for the three brands were collected mainly from professional automobile forums, including Automobile Comment Network Forum, Pacific Automobile Network Forum, Sohu Portal Automobile Forum and Auto House forum, from June 2016 to April 2018. In total, 31,933 comments were collected for R brand, and 12,715 and 4,210 for H brand and G brand respectively. In addition, the data include micro-blogs, WeChat, news and other social media channels.

The online review text mainly undergoes four preprocessing operations: comment text segmentation, Chinese word segmentation, removal of common stop words, and word frequency statistics. After preprocessing, the affective tendency must be analyzed. The objective affective tendency is determined by the consumer's opinion of the product, and the subjective affective tendency is determined by the emotion polarity. Therefore, on the basis of the extracted feature words, an emotional dictionary is used to analyze affective tendency at a fine-grained level. We use the 3,730 positive evaluation words, 3,116 negative evaluation words, 836 positive affective words and 1,254 negative affective words provided by the HowNet 2007 emotional vocabulary as the lexical basis of a reference dictionary, and supplement it with the various expressions found in online evaluations in the automobile field. Abstract adjectives, verbs, negation words and degree adverbs are used to construct supplementary opinion words and affective words, and the polarity of opinion words is judged by manual annotation.

After judging the emotion of each comment, we use the periodic affective tendency model to reflect reviewers' overall positive or negative emotional inclination over a given observation period. The bullish index proposed by Antweiler and Frank has been validated as the most stable measure of emotional inclination. It uses the number of positive comments $M_t^{pos}$ and the number of negative comments $M_t^{neg}$ in observation period $t$ as indicator items:

$$B_t = \ln\left(\frac{1 + M_t^{pos}}{1 + M_t^{neg}}\right)$$

The monthly affective indices of the three models were calculated according to the periodic affective tendency model and are listed in the table below.
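The dictionary-based polarity judgment and the Antweiler and Frank bullish index described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tiny English lexicon, the whitespace tokenizer (a stand-in for Chinese word segmentation), the negation rule and all function names are assumptions made for the example. Only the index itself, ln((1 + positive count) / (1 + negative count)) per observation period, comes from the text.

```python
import math
from collections import defaultdict

# Hypothetical toy lexicon; the paper uses the HowNet vocabulary plus a
# hand-built automotive supplement, which is not reproduced here.
POSITIVE = {"comfortable", "quiet", "spacious"}
NEGATIVE = {"noisy", "expensive", "cramped"}
NEGATIONS = {"not", "never"}
STOP_WORDS = {"the", "is", "a", "and", "but", "very"}


def tokenize(comment):
    """Lower-case whitespace tokenizer with punctuation and stop-word removal."""
    words = (w.strip(".,!?;:") for w in comment.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


def comment_polarity(comment):
    """Return +1, -1 or 0; a negation word flips the next opinion word."""
    score = 0
    negate = False
    for word in tokenize(comment):
        if word in NEGATIONS:
            negate = True
            continue
        hit = (word in POSITIVE) - (word in NEGATIVE)
        if hit:
            score += -hit if negate else hit
            negate = False
    return (score > 0) - (score < 0)


def bullish_index(n_pos, n_neg):
    """Bullish index for one period: ln((1 + n_pos) / (1 + n_neg))."""
    return math.log((1 + n_pos) / (1 + n_neg))


def monthly_bullish_index(reviews):
    """reviews: iterable of (month, comment) pairs. Returns {month: index}."""
    counts = defaultdict(lambda: [0, 0])  # month -> [positive, negative]
    for month, comment in reviews:
        tally = counts[month]
        polarity = comment_polarity(comment)
        if polarity > 0:
            tally[0] += 1
        elif polarity < 0:
            tally[1] += 1
    return {m: bullish_index(pos, neg) for m, (pos, neg) in counts.items()}


if __name__ == "__main__":
    sample = [
        ("2016.6", "The cabin is quiet and spacious"),
        ("2016.6", "not comfortable"),
        ("2016.7", "noisy and expensive"),
    ]
    print(monthly_bullish_index(sample))
```

Adding 1 to both counts keeps the index defined for months with no negative (or no positive) comments, and a month with equal counts scores 0.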
TABLE II. THE SALES VOLUME OF THREE BRAND CARS
Time      R brand   H brand   G brand
2016.6    0.96      1.17      2.24
2016.7    0.94      1.55      1.64
2016.8    0.98      1.45      1.42
2016.9    1.06      1.78      1.70
…         …         …         …
2018.2    3.47      4.01      2.73
2018.3    5.45      4.00      2.48
2018.4    5.90      4.01      3.15

2) Basic trend of variable
Standardizing the sales and emotional data gives the following time series of R-brand car sales and other factors:

Fig. 1. Sales sequence diagram.

Fig. 2. R brand sales volume and emotion sequence diagram.

Fig. 1 shows that the sales trends of R brand cars and the competitive brands are similar. Fig. 2 shows that both R brand's own comments and the comments on competitive brands influence R brand car sales with a lag.

B. Regressive analysis prediction model

1) Impact of early sales
Current sales forecasting research mainly uses a product's earlier sales data to forecast future sales. In [15], based on historical sales data, the autoregressive moving average model is used to predict overall automobile demand in China. Among methods that forecast future sales from earlier sales, the most commonly used is the autoregressive model. Let $y_t$ denote car sales in month $t$, $t = 1, 2, \ldots, N$, and let $\{y_t\}$ denote the entire time series $y_1, y_2, \ldots$. Then:

$$y_t = \sum_{i=1}^{p} \alpha_i y_{t-i} + c + \varepsilon_t$$

Here $p$ is the number of preceding months whose sales are examined for their impact on month-$t$ sales; $y_t$ is the sales in period $t$; $c$ is the constant term; $\varepsilon_t$ is the error term of period $t$; and $\alpha_i$ are the model parameters obtained by least squares regression.

Before the regression analysis, the R car sales data are first tested for stationarity. The autocorrelation diagram is as follows:

Fig. 3. Autocorrelation diagram of R brand cars sales.

Fig. 3 shows that the autocorrelation coefficient always fluctuates around 0, which indicates that the sequence is a stationary time series. Both the autocorrelation and the partial autocorrelation tail off, so p=1 and q=1. Therefore, according to the characteristics of the autocorrelation and partial autocorrelation graphs, the ARIMA model is selected for fitting; the model is ARIMA(1,0,1)(0,0,0). The fitting curve is shown in Fig. 4: the forecast trend basically coincides with actual sales, but the explanatory power of the model is not high. In the model statistics of Fig. 5, the stationary R² only reaches 0.523 and the mean absolute percentage error (MAPE) is 63.5%, so the model fit is mediocre. The p value of the Q statistic is greater than 0.05, so the null hypothesis cannot be rejected; the residual series of the fitted model is therefore white noise and the information in the sequence has been fully extracted.

Fig. 4. Fitting curve of ARIMA model.

Fig. 5. Model statistics of ARIMA.

So the R-brand car time series fitting equation can be expressed as:

$$y_t = c + \phi_1 y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}$$

where $c$, $\phi_1$ and $\theta_1$ stand for the estimated constant, autoregressive coefficient and moving-average coefficient.
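To make the autoregressive step concrete, the sketch below computes a sample autocorrelation (the quantity plotted in an autocorrelation diagram), fits the p = 1 case of the autoregressive model by least squares, and scores the in-sample fit with MAPE. It is an illustration under stated assumptions rather than the authors' procedure: the moving-average term of ARIMA(1,0,1) is deliberately left out to keep the example dependency-free, the function names are hypothetical, and the sales figures in the demo are made up.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


def fit_ar1(series):
    """Least-squares fit of y_t = c + alpha * y_{t-1} + error; returns (c, alpha)."""
    x = series[:-1]  # lagged values y_{t-1}
    y = series[1:]   # current values y_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    alpha = sxy / sxx
    c = mean_y - alpha * mean_x
    return c, alpha


def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    total = sum(abs((p - a) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up monthly sales (arbitrary units), for illustration only.
    sales = [0.9, 1.1, 1.0, 1.4, 1.3, 1.7, 1.6, 2.0]
    c, alpha = fit_ar1(sales)
    fitted = [c + alpha * prev for prev in sales[:-1]]
    print("lag-1 autocorrelation:", autocorrelation(sales, 1))
    print("c, alpha:", c, alpha)
    print("in-sample MAPE (%):", mape(sales[1:], fitted))
```

A full ARIMA(1,0,1) fit would also estimate the moving-average coefficient, which normally requires a statistics package rather than a closed-form least-squares solution.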
2) Impact of online reviews sentiment
The rapid growth of online commodity review data provides a good opportunity to understand the views of ordinary consumers, and many scholars have begun to study the hidden value of online review data. Previous studies [16] showed that the number of times a product is mentioned in the comment text and the emotional tendencies contained in the data have a significant effect on predicting product sales. In particular, for high-involvement products such as automobiles and housing, which require customers to invest time and energy, sentiment tendencies in online review data are more significant for sales forecasts. For these reasons, this paper classifies and quantifies the affective tendencies of online word-of-mouth reviews of different brands in the automotive field, and uses brand sentiment as an important factor in predicting car sales. Let $x_t$ denote the affective value of online car reviews in month $t$, $t = 1, 2, \ldots, N$, where $x_t$ is obtained from the sentiment analysis model described earlier. The autoregressive prediction model that includes the emotional inclination of the brand's online reviews is then:

$$y_t = \sum_{j=1}^{q} \beta_j x_{t-j} + \sum_{i=1}^{p} \alpha_i y_{t-i} + c + \varepsilon_t$$

Here $p$ is the number of preceding months whose sales are examined for their impact on month-$t$ sales, and $q$ is the number of preceding months whose review affective index is examined for its impact on month $t$. $y_t$ is the sales in period $t$; $c$ is the constant term; $\varepsilon_t$ is the error term of period $t$; and $\alpha_i$, $\beta_j$ are the model parameters obtained by least squares regression.

First, the R brand's own online review emotion factor was added to the basic time series model, giving the results in Fig. 6:

Fig. 6. Model statistics of model adding R brand sentiment.

It can be seen from Fig. 6 that the explanatory degree of the model after adding the R brand sentiment variable is 0.558 and the MAPE of the model is 51.6%, which is a little better than the basic model. The significance of the R brand's own sentiment variable is less than 0.1. The model fitting equation can be expressed as:

$$y_t = -c_0 + \beta_3 x_{t-3} + \phi_1 y_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}$$

where $c_0$, $\beta_3$, $\phi_1$ and $\theta_1$ stand for the magnitudes of the estimated coefficients.

3) Impact of competitive brand cars
The automobile is a special commodity: unlike general FMCG, its use cycle is long and its purchase price is high. Consumers repeatedly compare competing cars within the range of their purchasing power. Therefore, the sales and online reviews of competitive automobile brands are an important factor in predicting car sales. Let $y_t$ denote the target brand's car sales in month $t$ and $y_t^{c}$ the competitive brand's car sales in month $t$, $t = 1, 2, \ldots, N$, with $\{y_t\}$ denoting the entire time series $y_1, y_2, \ldots$:

$$y_t = \sum_{h=1}^{l} \delta_h x^{c}_{t-h} + \sum_{k=1}^{m} \gamma_k y^{c}_{t-k} + \sum_{j=1}^{q} \beta_j x_{t-j} + \sum_{i=1}^{p} \alpha_i y_{t-i} + c + \varepsilon_t$$

Here $x_t^{c}$ is the review affective index of the competitive brand in month $t$. $p$ is the number of preceding months whose sales are examined for their impact on month-$t$ sales; $q$ is the number of preceding months whose review affective index is examined; $l$ is the number of preceding months whose competitive-brand review affective index is examined; and $m$ is the number of preceding months whose competitive-brand sales are examined. $y_t$ is the sales in period $t$; $c$ is the constant term; $\varepsilon_t$ is the error term of period $t$; and $\alpha_i$, $\beta_j$, $\gamma_k$, $\delta_h$ are the model parameters obtained by least squares regression.

Building on the model with the R brand's emotional variables, the emotional variables of the two competitive brands were added. The emotion indices of the individual H brand and G brand cars are combined, weighted by each month's market share, into the affective influence variable of the comprehensive competitive brand. The regression results are shown in Fig. 7:

Fig. 7. Model statistics of model adding competitive brand sentiment.

It can be seen from Fig. 7 that the explanatory degree of the model after adding the competitive brand sentiment variable is 0.6 and the MAPE of the model is 47.7%. The model fits best when the comprehensive competitive brand emotion variable is added with a lag of three periods. But the significance of the variable is not high, so we discard the competitive brands' sentiment variable.

Therefore, considering the impact of the sales variables of the two competitive products, a regression analysis was similarly conducted, using market share to obtain the comprehensive sales volume of competitive products. The results are shown in Fig. 8:

Fig. 8. Model statistics of model adding competitive brand sales.

It can be seen from Fig. 8 that the explanatory degree of the model after adding the comprehensive sales variable is 0.79 and the MAPE of the model is 23.9%. The significance of the variable is reasonable. The model fitting equation can be expressed as:

$$y_t = -c_0 + \phi_1 y_{t-1} + \beta_3 x_{t-3} + \gamma_1 y^{c}_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}$$

where $c_0$, $\phi_1$, $\beta_3$, $\gamma_1$ and $\theta_1$ stand for the magnitudes of the estimated coefficients.

4) Accuracy of model prediction
In order to evaluate the accuracy of model prediction, the mean absolute percentage error (MAPE) is used:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{\hat{y}_i - y_i}{y_i} \right|$$

where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value and $n$ is the number of observations.

Combining the results of the several models gives the following comparison of model parameters:

TABLE III. COMPARISON OF MODEL PARAMETERS
Model                                         Stationary R-Squared   MAPE
Base Time Series model                        0.52                   63.5%
Model add R brand sentiment variables         0.6                    51.6%
Model add Competitive Goods Sales variable    0.79                   23.9%

We can see that with the addition of emotional variables and the sales variables of competitive brands, the explanatory power of the model improves.

IV. CONCLUSION

How to accurately predict car sales has become a common concern for many parties. Based on an in-depth analysis of sales research in the existing automotive field, this paper comprehensively considers historical sales time series data, the emotional tendencies of the brand's online reviews, and the influencing factors of competitive product sales and sentiment trends, and proposes a regression model for car sales forecasting based on network big data and traditional time series analysis. The regression model refines car sales forecasting to the granularity of a single car brand. The experimental results show that model predictions that take into account brand sentiment factors and competing sales factors can improve forecast accuracy, and they validate the value of network big data for sales forecasting. This provides a new research idea for finding more scientific and efficient forecasting methods and for establishing more accurate forecasting models in subsequent research on sales forecasting.

REFERENCES
[1] Hennig-Thurau T., Gwinner K.P., Walsh G., et al., "Electronic Word-of-Mouth Via Consumer-Opinion Platforms: What Motivates Consumers to Articulate Themselves on the Internet?". Journal of Interactive Marketing, 2004, vol. 18(1), pp. 38-52.
[2] Pang B., Lee L., "Opinion mining and sentiment analysis". Foundations and Trends in Information Retrieval, 2008, vol. 2(1-2), pp. 1-135.
[3] Hu M., Liu B., "Mining Opinion Features in Customer Reviews". The 19th National Conference on Artificial Intelligence, 2004.
[4] Scaffidi C., Bierhoff K., Chang E., "Red Opal: Product-feature Scoring from Reviews". The 8th ACM Conference on Electronic Commerce, 2007.
[5] Choi Y., Cardie C., "Hierarchical Sequential Learning for Extracting Opinions and Their Attributes". The Association for Computational Linguistics Conference, 2010.
[6] Zhang S., Jia W., Xia Y., et al., "Extracting Product Features and Sentiments from Chinese Customer Reviews". The 7th International Conference on Language Resources and Evaluation, 2010.
[7] Gross-Klussmann A., Hautsch N., "When machines read the news: Using automated text analytics to quantify high frequency news-implied market reactions". Journal of Empirical Finance, 2011, vol. 27(2), pp. 321-340.
[8] Wang Na, "Carbon price prediction based on large data". Statistical Research, 2016, vol. 33(11), pp. 56-62.
[9] Huang Ruipeng, Zuo Wenming, Bi Linyan, "Stock market Forecast based on micro-blog Emotion Information". Journal of Management Engineering, 2015, vol. 29(01), pp. 47-52+215.
[10] Meng Yuan, Wang Hongwei, Wang Wei, "Impact of Internet Word of Mouth on Product Sales: Based on Fine-grained Sentiment Analysis". Management Review, 2017, vol. 29(01), pp. 144-154.
[11] Zhao Ying, "Research on China's Automobile Sales Forecasting Model Based on Regression Analysis". Central China Normal University, 2014.
[12] Chen D., "Chinese automobile demand prediction based on ARIMA model". International Conference on Biomedical Engineering and Informatics. IEEE, 2011, pp. 2197-2201.
[13] Cui Dongjia, "An Empirical Study of Brand Automobile Sales Forecast in the Background of Big Data Era". Henan University, 2014.
[14] Liu Yezhen, Zhang Xu, Wang Jinkun, "Automobile sales forecasting Model considering brand emotion". Journal of Hefei University of Technology (Natural Science Edition), 2017, vol. 40(09), pp. 1276-1282.
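As a supplementary sketch of the regression in Section III.B, the code below combines two competitor series into one market-share-weighted series and estimates, by ordinary least squares, a model with lagged own sales, own sentiment lagged three months, and lagged competitor sales. Several things here are assumptions rather than details from the paper: the weighting uses each competitor's share of combined competitor sales in that month, the moving-average error term is omitted, and the function names and the default lag are illustrative only.

```python
def weighted_composite(values_h, values_g, sales_h, sales_g):
    """Month-by-month weighted average of two competitor series.

    Assumed weighting: each brand's share of the two brands' combined sales.
    """
    composite = []
    for vh, vg, sh, sg in zip(values_h, values_g, sales_h, sales_g):
        composite.append((vh * sh + vg * sg) / (sh + sg))
    return composite


def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_lagged_regression(y, x, yc, sent_lag=3):
    """OLS for y_t = c + phi*y_{t-1} + beta*x_{t-sent_lag} + gamma*yc_{t-1}.

    y: own sales, x: own sentiment index, yc: composite competitor sales.
    Returns [c, phi, beta, gamma].
    """
    start = max(1, sent_lag)
    rows = [[1.0, y[t - 1], x[t - sent_lag], yc[t - 1]]
            for t in range(start, len(y))]
    target = [y[t] for t in range(start, len(y))]
    k = 4
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    return solve_linear(xtx, xty)
```

With only about twenty monthly observations, as in the study period, each added lag costs a degree of freedom, which is one reason to keep the lag structure this small.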
