Auto Sales Forecast
Abstract—The rapid development of the automobile industry has led auto companies to expand production capacity blindly, creating a serious risk of overcapacity. Accurate forecasting of car sales is not only of great importance to automobile manufacturers and distributors in setting strategies and targets; it also influences the sound growth of China's automobile market and the healthy development of the national economy. This article takes the R brand automobile as the research object and collects the related online review data, which are quantified through natural language processing and a text sentiment model. Sales forecasts are made by combining the company's internal historical sales data with the online reviews of the R brand and the sales and review data of competitive brands. Compared with the benchmark model, the sales forecasting model that adds the sentiment variables and the influence factors of the competitive brands fits the data better.

Keywords—Online reviews, Sentiment analysis, Sales forecast

978-1-5386-5178-0/18/$31.00 ©2018 IEEE

I. INTRODUCTION

After several decades of development, China's automobile industry has grown into an important pillar of the national economy. China has also become the world's largest automobile producer and consumer market, ranking first in the world in annual car sales for eight consecutive years. In 2017, China produced 29 million cars, a year-on-year increase of 3.19%. Overall, China's auto market remains highly competitive. The development of the automobile industry matters not only to the industry itself but also to the livelihood of the nation. However, rapid growth has led automobile companies to expand capacity blindly, wasting considerable human and material resources. Accurate forecasting of automobile sales is therefore very important to enterprise production management.

Existing research on automobile sales forecasting in enterprises mostly relies on a single data source, namely historical sales data, and makes little use of external user review data, ignoring the influence of public opinion expressed in reviews on users' purchase intention. With the development of social networks, the internet has become an important reference for consumer purchasing decisions and affects product sales. It has also become one of the quickest ways for users to obtain information. According to the China Internet Network Information Center, more than 80% of internet users browse product reviews before buying. Online reviews are the most direct expression of users' feelings; they reflect the evaluations and trends of market participants and provide a necessary foundation for the study of social and economic behavior.

In addition, the automobile is a special commodity. Unlike general fast-moving consumer goods (FMCG), it has a long use cycle and a high purchase price, so users prepare more before buying, mainly by browsing online reviews. On this basis, this paper studies the impact of online reviews on car sales. A regression forecasting model is established from automobile online review data and historical sales data. It offers a new research idea for finding more scientific and efficient forecasting methods and for building more accurate forecasting models in subsequent research on sales forecasting.

II. LITERATURE REVIEW

A. Online Reviews
Online reviews, also called online consumer reviews or user-generated content, are a form of online word-of-mouth. They evaluate products mainly in text form and consist of consumers' comments on a product or company submitted via the Internet [1]. This paper regards online reviews as evaluations of purchased goods or services that users publish through various network channels. Consumers adjust their purchase decisions on the basis of online reviews and in turn generate reviews that are shared and disseminated through online communities or comment sites, where other web users can read and share them freely.

B. Sentiment analysis
Sentiment analysis, also known as opinion extraction, opinion mining, or sentiment classification, is the process of processing, analyzing, and even drawing inferences from material that carries subjective human emotion [2]. Among representative studies, Hu and Liu [3] and Scaffidi [4] identify feature-opinion pairs from the proximity of feature words to opinion words and from their co-occurrence frequency, while Choi and Cardie [5] and Zhang [6] identify feature-opinion pairs with sequence labeling models.

C. Prediction research based on sentiment analysis
Scholars in China and abroad have used the sentiment of online reviews to make forecasts in various industries. Gross-Klussmann and others, by identifying the sentiment in news, predicted high-frequency returns, trading volumes, and volatility on the London Stock Exchange [7]. In domestic research, Wang (2016) studied the correlation between big data and carbon price prediction and discussed the most effective prediction model, finding that the network-structure ADL model can significantly improve predictive performance [8]. Huang (2015), drawing on behavioral finance theory, discussed the influence of micro-blog sentiment on stock market forecasts; the experimental results show that a predictive model incorporating micro-blog information achieves higher accuracy [9]. Meng (2017), taking skin care products as an example, studied the influence of online word-of-mouth on product sales. Compared with the benchmark model, the sales forecasting model with sentiment variables fits the data better, whereas the overall predictive effect of consumers' ratings is weak and appears only at some nodes [10].

D. Auto Sales Forecast
Forecasts of auto sales fall into two main categories: those that use traditional data, and those that build on traditional analysis and add big data.

Traditional data mainly comprise macro factors and the internal data of automobile enterprises. Zhao (2014) studied the main factors that influence automobile sales, including economy, price, and environment, and also considered government policy, transportation infrastructure, and other variables [11]. Chen (2011) forecast demand for the whole Chinese market using only historical auto sales data and the autoregressive integrated moving average (ARIMA) model; the experimental results show that the ARIMA model predicts macroscopic car sales well [12].

In recent years, more and more studies have taken the impact of big data into account. Cui (2014) revealed the relationship between network search data and car sales; analysis and tests show that this method predicts more accurately than traditional vehicle sales forecasting methods [13]. Liu (2017) discussed the impact of historical sales and brand sentiment on car sales forecasts using online review data and sales data. The results show that the average prediction error of the model is 5.93%, which is 6.24% lower than that of the ARIMA model [14].

III. DATA ANALYSIS MODEL AND APPLY MODELS TO PRACTICE

A. Data sources and processing

1) Data sources and sentiment analysis
The main research object of this paper is the R automobile brand, together with two major competitive brands, H brand and G brand. The three models studied are of the same type (SUV), fall in the same price range (10 to 15 million), and are among the leaders in market reputation.

The sales data analyzed in this paper cover July 2016 to April 2018. The sales data for the R models were provided by the R company; those for the H and G vehicles were collected from professional vehicle channels, as shown in the following table:

TABLE I. THE SALES VOLUME OF THREE BRAND CARS

Time      R brand   H brand   G brand
2016.7    1600      39100     10100
2016.8    4322      40600     10100
2016.9    11450     53300     14100
…         …         …         …
2018.2    20409     30253     20238
2018.3    22065     38358     21919
2018.4    18500     34000     23000

The online review data for the three brands were collected mainly from professional automobile forums, including the Automobile Comment Network Forum, the Pacific Automobile Network Forum, the Sohu Portal Automobile Forum, and the Auto House forum, from June 2016 to April 2018. A total of 31,933 reviews were collected for the R brand, and 12,715 and 4,210 for the H and G brands, respectively. The data also include micro-blogs, WeChat, news, and other social media channels.

Four text preprocessing operations are applied to the review text: splitting the comment text, Chinese word segmentation, removal of common stop words, and word frequency statistics. After preprocessing, the sentiment tendency must be analyzed. The objective sentiment tendency is determined by the consumer's opinion of the product, and the subjective sentiment tendency is determined by the sentiment polarity. Therefore, on the basis of the extracted feature words, a sentiment dictionary is used for fine-grained analysis of sentiment tendency. The 3,730 positive evaluation words, 3,116 negative evaluation words, 836 positive sentiment words, and 1,254 negative sentiment words provided by the 2007 version of the HowNet sentiment vocabulary serve as the lexical basis for a reference dictionary, which is supplemented with the various expressions found in online automobile reviews. Adjectives, verbs, negation words, and degree adverbs are extracted to construct the supplementary opinion words and sentiment words, and the polarity of the opinion words is judged by manual annotation.

After the sentiment of each comment is judged, the periodic sentiment tendency model is used to reflect reviewers' overall positive or negative inclination in a given observation period. The bullish index proposed by Antweiler and Frank has been validated as the most stable measure of sentiment inclination. It takes the number of positive comments $M_t^{pos}$ and the number of negative comments $M_t^{neg}$ in observation period $t$ as indicator items:

$$B_t = \ln\left(\frac{1 + M_t^{pos}}{1 + M_t^{neg}}\right)$$

The monthly sentiment indices of the three models were calculated according to the periodic sentiment tendency model, as shown in the following table:

TABLE II. THE EMOTIONAL INDEX OF THREE BRAND CARS

Time      R brand   H brand   G brand
2016.6    0.96      1.17      2.24
2016.7    0.94      1.55      1.64
2016.8    0.98      1.45      1.42
2016.9    1.06      1.78      1.70
…         …         …         …
2018.2    3.47      4.01      2.73
2018.3    5.45      4.00      2.48
2018.4    5.90      4.01      3.15

In the model, $p$ is the number of preceding months whose sales are examined for their impact on sales in month $t$; $y_t$ is the sales in period $t$; $c$ is the constant term; $\varepsilon_t$ is the error term in period $t$; and $\varphi_i$ are the model parameters obtained by least squares regression.

Before the regression analysis, the R car sales data are first tested for stationarity; the autocorrelation diagram is as follows:
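As a minimal sketch of the steps described above, the following Python code computes a bullish index from positive and negative comment counts, a sample autocorrelation function of the kind used in a stationarity check, and a least squares fit of an autoregressive model of order p. The function names, the use of NumPy, the assumed index form ln((1 + positive) / (1 + negative)), the assumed model form y_t = c + phi_1 y_{t-1} + ... + phi_p y_{t-p} + e_t, and the synthetic sales series are all illustrative assumptions, not the authors' implementation or data.

```python
import numpy as np


def bullish_index(n_pos, n_neg):
    """Bullish index ln((1 + n_pos) / (1 + n_neg)) for one observation period.

    Assumption: the standard Antweiler-Frank log form; n_pos and n_neg are
    the counts of positive and negative comments in the period.
    """
    return float(np.log((1.0 + n_pos) / (1.0 + n_neg)))


def autocorrelation(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    y = np.asarray(series, dtype=float)
    centered = y - y.mean()
    denom = np.sum(centered ** 2)
    return np.array([
        np.sum(centered[lag:] * centered[:len(y) - lag]) / denom
        for lag in range(max_lag + 1)
    ])


def fit_ar_least_squares(series, p):
    """Fit y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + e_t by least squares.

    Returns (c, phi), where phi[i - 1] is the coefficient on y_{t-i}.
    """
    y = np.asarray(series, dtype=float)
    if len(y) <= p + 1:
        raise ValueError("series is too short for the requested order p")
    # Design matrix: a column of ones, then the lag-1 ... lag-p columns.
    lags = [y[p - i:len(y) - i] for i in range(1, p + 1)]
    X = np.column_stack([np.ones(len(y) - p)] + lags)
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return float(coef[0]), coef[1:]


if __name__ == "__main__":
    # Hypothetical monthly comment counts (positive, negative).
    print("bullish index:", bullish_index(120, 45))

    # Hypothetical monthly sales series, for illustration only.
    sales = [1.0, 2.5, 3.2, 3.7, 3.8, 4.1, 3.9, 4.0, 4.2, 4.1]
    print("autocorrelation:", autocorrelation(sales, 3))
    c, phi = fit_ar_least_squares(sales, p=1)
    print("constant:", c, "AR coefficients:", phi)
```

An autocorrelation that decays quickly toward zero is consistent with a stationary series, while a slow decay suggests differencing before the regression. A dedicated time series library would normally replace these hand-written helpers in practice.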
Fig. 8. Model statistics of model adding competitive brand sales.

It can be seen from Fig. 8 that the explanatory power of the model with the comprehensive sales variable added is 0.79 and its MAPE is 23.9%. The significance of the variables is reasonable. The fitted model equation can be expressed as:

[Equation: fitted model for $\hat{y}_t$; the coefficients are illegible in the source.]

4) Accuracy of model prediction
To evaluate the accuracy of model prediction, the mean absolute percentage error (MAPE) is used:

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$

where $y_i$ is the actual sales, $\hat{y}_i$ is the predicted sales, and $n$ is the number of observations.

Combining the results of the several models gives the following comparison table of model parameters:

TABLE III. COMPARISON OF MODEL PARAMETERS

Model                                           Stationary R-squared   MAPE
Base time series model                          0.52                   63.5%
Model adding R brand sentiment variables        0.6                    51.6%
Model adding competitive goods sales variable   0.79                   23.9%

We can see that, as the sentiment variables and the sales variables of competitive brands are added, the explanatory power of the model improves.

IV. CONCLUSION

How to accurately predict car sales has become a common concern for many parties. Based on an in-depth analysis of sales research in the existing automotive field, this paper

REFERENCES

[1] Hennig-Thurau T., Gwinner K.P., Walsh G., et al., “Electronic Word-of-Mouth Via Consumer-Opinion Platforms: What Motivates Consumers to Articulate Themselves on the Internet?”. Journal of Interactive Marketing, 2004, vol. 18(1), pp. 38-52.
[2] Pang B., Lee L., “Opinion mining and sentiment analysis”. Foundations and Trends in Information Retrieval, 2008, vol. 2(1-2), pp. 1-135.
[3] Hu M., Liu B., “Mining Opinion Features in Customer Reviews”. The 19th National Conference on Artificial Intelligence, 2004.
[4] Scaffidi C., Bierhoff K., Chang E., “Red Opal: Product-feature Scoring from Reviews”. The 8th ACM Conference on Electronic Commerce, 2007.
[5] Choi Y., Cardie C., “Hierarchical Sequential Learning for Extracting Opinions and Their Attributes”. The Association for Computational Linguistics Conference, 2010.
[6] Zhang S., Jia W., Xia Y., et al., “Extracting Product Features and Sentiments from Chinese Customer Reviews”. The 7th International Conference on Language Resources and Evaluation, 2010.
[7] Gross-Klussmann A., Hautsch N., “When machines read the news: Using automated text analytics to quantify high frequency news-implied market reactions”. Journal of Empirical Finance, 2011, vol. 27(2), pp. 321-340.
[8] Wang Na, “Carbon price prediction based on large data”. Statistical Research, 2016, vol. 33(11), pp. 56-62.
[9] Huang Ruipeng, Zuo Wenming, Bi Linyan, “Stock market Forecast based on micro-blog Emotion Information”. Journal of Management Engineering, 2015, vol. 29(01), pp. 47-52+215.
[10] Meng Yuan, Wang Hongwei, Wang Wei, “Impact of Internet Word of Mouth on Product Sales: Based on Fine-grained Sentiment Analysis”. Management Review, 2017, vol. 29(01), pp. 144-154.
[11] Zhao Ying, “Research on China's Automobile Sales Forecasting Model Based on Regression Analysis”. Central China Normal University, 2014.
[12] Chen D., “Chinese automobile demand prediction based on ARIMA model”. International Conference on Biomedical Engineering and Informatics. IEEE, 2011, pp. 2197-2201.
[13] Cui Dongjia, “An Empirical Study of Brand Automobile Sales Forecast in the Background of Big Data Era”. Henan University, 2014.
[14] Liu Yezhen, Zhang Xu, Wang Jinkun, “Automobile sales forecasting Model considering brand emotion”. Journal of Hefei University of Technology (Natural Science Edition), 2017, vol. 40(09), pp. 1276-1282.