Forecasting Private Consumption With Google Trends Data: Jaemin Woo - Ann L. Owen


Received: 29 October 2017 Revised: 26 July 2018 Accepted: 19 October 2018

DOI: 10.1002/for.2559

RESEARCH ARTICLE

Forecasting private consumption with Google Trends data

Jaemin Woo | Ann L. Owen

Department of Economics, Hamilton College, Clinton, New York, USA

Correspondence
Ann L. Owen, Department of Economics, Hamilton College, 198 College Hill Road, Clinton, NY 13323.
Email: aowen@hamilton.edu

Abstract
This paper examines the predictive relationship of consumption‐related and news‐related Google Trends data to changes in private consumption in the USA. The results suggest that (1) Google Trends‐augmented models provide additional information about consumption over and above survey‐based consumer sentiment indicators, (2) consumption‐related Google Trends data provide information about pre‐consumption research trends, (3) news‐related Google Trends data provide information about changes in durable goods consumption, and (4) the combination of news and consumption‐related data significantly improves forecasting models. We demonstrate that applying these insights improves forecasts of private consumption growth over forecasts that do not utilize Google Trends data and over forecasts that use Google Trends data, but do not take into account the specific ways in which it informs forecasts.

KEYWORDS
consumer sentiment indicators, forecasting, Google Trends, private consumption

Journal of Forecasting. 2019;38:81–91. wileyonlinelibrary.com/journal/for © 2018 John Wiley & Sons, Ltd.

1 | INTRODUCTION

Recent developments in the technology industry and the rapid increase in the availability of online data such as Google Trends (trends.google.com) provide researchers with new sets of data that can complement survey data such as the Michigan Consumer Sentiment Index (MCSI) and the Conference Board Consumer Confidence Index (CCI) in economic forecasts. Some potential contributions of Google Trends data to consumption forecasts are a result of the transparent channels of influence, a large sample size, and its low cost. Unlike survey responses on consumer attitudes, Google Trends data measures the behavior of consumers such as their prepurchase research activities and news readership. Google Trends' large quantity of individual‐level data supplements the small samples of consumer surveys. Moreover, Google Trends data are easily accessible with no cost and updated daily. This paper takes advantage of these benefits and extends the literature on consumption forecasting by introducing Google Trends News Search data in addition to using Google Trends consumption‐related search data. We also examine how the predictive relationship of Google Trends data changes as we forecast different components of consumption (durables, nondurables, and services) over different timeframes (1‐month‐ahead forecasts and nowcasts). This helps us to gain more insight into the information content of Google Trends data. We demonstrate that these insights can be used to improve forecasts.

Our work is most similar to Vosen and Schmidt (2011), who show that consumption‐related Google Trends search data outperforms the University of Michigan and the Conference Board's survey‐based measures of consumer confidence in forecasts of private consumption. We extend these findings by treating the Google search data as complementary to the survey‐based measures of consumer confidence rather than as substitutes, and confirming that Google Trends augmented models provide additional information about current and future
consumption over and above that provided by the survey‐based measures of consumer confidence. We also employ a different approach by forecasting durable goods, nondurable goods, and services consumption separately to gain better insight into the ways in which the Google Trends data inform the forecast. Specifically, our findings suggest that consumption‐related Google Trends data provide information about preconsumption research for durable goods purchases and that news‐related Google Trends data provide information about consumer sentiment related to durable goods consumption. We find that the combination of news and consumption‐related search data improves forecasting models the most. These findings suggest how Google Trends data should be used in order to achieve the largest improvements in forecasts and we demonstrate that in an application that compares our best forecast to several different approaches.

Our work is related to two different strands of the literature. The first strand relates survey‐based consumer sentiment indicators to changes in private consumption, and the second uses Google Trends data to improve a variety of economic forecasts. There is a general consensus in the extensive literature examining survey‐based consumer sentiment indicators that they can predict changes in private consumption (see, e.g., Carroll, Fuhrer, & Wilcox, 1994; Wilcox, 2007). Despite their predictive ability, MCSI and CCI are criticized for their opaque channels of influence on consumption and high correlation with other macroeconomic indicators. For instance, Carroll et al. (1994) conclude that the predictive power of MCSI comes from a complex mix of direct and indirect channels of influence that are unobservable. In a similar vein, Fuhrer (1993) finds that other macroeconomic indicators explain 70% of the variation in MCSI.

There is also an extensive literature on various applications of Google Trends data, which simply indicates a given search term or category's relative search popularity in a geographic region. For example, Chen, So, Wu, and Yan (2015) predict official recession dates with searches related to “recession,” “foreclosure help,” or “layoff”; Preis, Moat, and Stanley (2013) predict stock market movements with search terms related to finance; and Wu and Brynjolfsson (2013) predict housing prices and sales with searches related to “real estate agencies” and “real estate listings.”¹ In theory, macroeconomic variables indicate a household's ability to spend, survey data on sentiment reflect its willingness to spend, and Google Trends depicts its preparatory steps toward spending.

¹ See also Breyer et al. (2011), Choi and Varian (2012), Drake, Roulstone, and Thornock (2012), D'Amuri and Marucci (2015), Ginsberg et al. (2008), and Penna and Huang (2009).

To examine the relationship between consumption‐related Google Trends data and personal consumption, Vosen and Schmidt (2011) compare the suitability of Google Trends and consumer sentiment indices (MCSI and CCI) as predictors of consumption and conclude that, albeit by a small amount, Google Trends is a better predictor than MCSI or CCI. Penna and Huang (2009) find that the levels of MCSI, CCI, and Google Trends are highly correlated (0.9 correlation coefficient). However, they also find that their month‐to‐month changes are only moderately correlated (0.4 correlation coefficient) and, more importantly, that Google Trends can predict MCSI and CCI, and not the other way around. This leads us to believe that these three indicators are not substitutes, as in Vosen and Schmidt (2011), but rather complements. Survey‐based consumer sentiment indicators still contain useful information about consumption behavior that is difficult to extract using Google Trends data. For these reasons, we focus on examining the marginal information that Google Trends data add to forecast models.

Of course, there may be a relationship between consumer sentiment and the news media; Doms and Morin (2004) confirm that financial news can affect consumer sentiment. They directly count the number of articles that have the words “recession” or “layoff” in the title published by a sample of 70 news media agencies. We replicate their methodology but instead use news‐related Google Trends data that measure the relative search frequencies of the words “recession” and “layoff” in “news.google.com.” Unlike Doms and Morin's volume data, the Google Trends data will allow us to more directly and conveniently examine households' interest in financial news articles. An increase in the search frequencies of the words “recession” and “layoff” should indicate a near‐future decrease in consumer sentiment, and therefore consumption.

This study contributes to the existing literature in a few important ways. It examines how both consumption‐related and news‐related Google Trends can predict different components of consumption. To our knowledge, this study is the first to use the Google Trends' news search function, to examine Google Trends data's marginal information over and above survey‐based measures of consumer sentiment, and to use components of consumption (durables, nondurables, and services) as dependent variables. Applying these contributions allows us to improve forecasts. In what follows, we explain in detail our data, methods, and results, starting with the data and methods in Section 2.

2 | DATA AND METHODS

For each of the three components of consumption, we observe whether or not recursive window ordinary least
squares (OLS) models augmented with three different specifications of Google Trends data can reduce the forecasting errors of baseline models for 1‐month‐ahead forecasts and nowcasts.

The baseline model for 1‐month‐ahead forecasts and nowcasts is

$$C_{i,t+h} = \alpha\, C_{i,t-2} + \beta\, \mathrm{MCSI}_{t-1} + \gamma\, \mathrm{CCI}_{t-1} + \theta\, \mathrm{DI}_{t-2} + \delta\, \mathrm{VIX}_{t-1} + \varphi\, \mathrm{TBill}_{t-1} + \varepsilon_{t+h},$$

where C is the monthly 12‐month growth rate of components of real private consumption. The subscript i refers to the different types of consumption (durable, nondurable, and services), t is the month at the time of prediction, and h is 1 for 1‐month‐ahead forecasts and 0 for nowcasts. We use 12‐month growth rates of seasonally adjusted monthly real personal consumption expenditure of durable goods, nondurable goods, and services released by the Bureau of Economic Analysis. Whereas previous literature uses total personal consumption as the dependent variable, we use these three components and distinct models for each component because they react differently to economic events, as can be seen in Figure 1. These three components are only moderately correlated; the 12‐month growth rate of durable goods expenditure is far more volatile than that of nondurable goods and services, as consumers faced with economic challenges during a downturn have greater flexibility to delay purchasing durable goods than some nondurable goods and services. Finally, studying the components of consumption allows us to examine Google Trends' forecasting abilities in detail.

In our main specification, we use 12‐month growth rates rather than month‐on‐month growth rates because the Google Trends data are not seasonally adjusted. Furthermore, this specification allows the easiest comparison to previous literature; Vosen and Schmidt (2011) also forecast 12‐month growth rates. Reliable seasonal adjustment of the Google Trends data is difficult because the available time period is relatively short (from 2008 for Google Trends news data) and the time period contains a very unusual economic period, the Great Recession, followed by an uncharacteristic recovery. Nonetheless, in supplementary estimations, we do attempt to seasonally adjust the Google Trends data using the Census Bureau's X‐13 ARIMA SEATS program and use that in forecasts of month‐on‐month growth rates for comparison.

In the baseline model, six independent variables are selected to account for the information already provided by other widely used macroeconomic indicators. We use both the University of Michigan Consumer Sentiment Index (MCSI) and the Conference Board Consumer Confidence Index (CCI). Although they measure similar things using similarly designed survey methodologies, CCI is believed to put heavier emphasis on the health of the labor market (Vosen & Schmidt, 2011). Consequently, CCI's 12‐month growth rates are more exaggerated than MCSI's. Nonetheless, the two indicators are highly linearly correlated at 0.805. We include both indicators, with the belief that the two indicators provide similar yet slightly different information.

The CBOE Volatility Index (VIX), which measures the expectation of 30‐day volatility based on options‐market data, controls for the changes in consumption due to unusual changes in the stock market. Similarly, real disposable income controls for the income effect on consumption. The secondary market rate of the 3‐month Treasury Bill controls for the effect of monetary policy on consumption. All of the independent variables are measured in 12‐month growth rates, for easy comparison with the rest of the variables in the prediction models. In addition, all the variables are lagged to the latest monthly observation available at the time of each prediction.

In the augmented models, we add combinations of the Google Trends consumption data and Google Trends news data to the baseline specification. To create the Google Trends consumption data for each of the three components of personal consumption expenditure used as dependent variables, we follow Vosen and Schmidt's (2011) method and identify the Google Trends categories that are intuitively related to the Bureau of Economic

FIGURE 1 12‐month year‐over‐year growth rates of components of consumption (Source: Bureau of Economic Analysis)
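The 12‐month growth rates shown in Figure 1 and the lag structure of the baseline model in Section 2 can be illustrated with a minimal sketch. It is not the authors' code: the function names `growth_12m` and `baseline_design` are hypothetical, the data are synthetic, and the regression is fitted without an intercept only because the printed equation shows none.

```python
import numpy as np

def growth_12m(levels):
    """12-month percentage growth: 100 * (x_t / x_{t-12} - 1); first 12 entries are NaN."""
    x = np.asarray(levels, dtype=float)
    out = np.full(x.shape, np.nan)
    out[12:] = 100.0 * (x[12:] / x[:-12] - 1.0)
    return out

def baseline_design(c, mcsi, cci, di, vix, tbill, h):
    """Build (y, X) for C_{t+h} on C_{t-2}, MCSI_{t-1}, CCI_{t-1}, DI_{t-2}, VIX_{t-1}, TBill_{t-1}.

    All inputs are assumed to be growth-rate series of equal length;
    h = 1 gives the 1-month-ahead setup and h = 0 the nowcast setup.
    """
    c, mcsi, cci, di, vix, tbill = (
        np.asarray(a, dtype=float) for a in (c, mcsi, cci, di, vix, tbill)
    )
    t = np.arange(2, len(c) - h)  # months with a t-2 lag and a t+h target
    y = c[t + h]
    X = np.column_stack(
        [c[t - 2], mcsi[t - 1], cci[t - 1], di[t - 2], vix[t - 1], tbill[t - 1]]
    )
    return y, X

# Synthetic illustration (made-up numbers, not the paper's data).
rng = np.random.default_rng(0)
n, h = 72, 1
mcsi, cci, di, vix, tbill = rng.normal(size=(5, n))
true_coef = np.array([0.5, 0.2, -0.1, 0.3, -0.2, 0.1])
c = np.zeros(n)
c[:3] = rng.normal(size=3)
for s in range(3, n):
    t = s - h  # month at which the value for month s would be predicted
    c[s] = true_coef @ np.array(
        [c[t - 2], mcsi[t - 1], cci[t - 1], di[t - 2], vix[t - 1], tbill[t - 1]]
    )

y, X = baseline_design(c, mcsi, cci, di, vix, tbill, h)
coef_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimates of the six coefficients
```

Because the synthetic consumption series is generated exactly from the same lag structure, the least‐squares fit recovers `true_coef`; with real data the residual term would of course not vanish.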
Analysis' categorization. Since the 2011 study, Google has eliminated or updated some of its Trends categories. We are still able to identify and extract most of these Google Trends categories and find intuitively related replacements for the ones that are missing. Table 1 shows the Google Trends categories used for each of the three components of consumption. The data for each of these Google Trends categories depict the overall relative popularity of all the Google searches that are related to the category. To measure the relative popularity, Google takes an unbiased sample from all the search queries in a given region over a given period. Then, Google divides the number of searches related to the given category by the sample size. Lastly, Google standardizes these numbers so that the highest proportion is equal to 100. We extract all Google Trends data at the monthly level and convert them to 12‐month growth rates to account for seasonality. One of the shortfalls of using Google's predetermined categories is that the exact list of searches that are contained in each category is unknown.

Many of the categories that are in the same components of consumption are highly correlated. In order to mitigate the problem of multicollinearity, we use principal components analysis. In the forecast equation, we use all the factors that are required to explain 90% of the variation in the Google Trends categories representing each component of consumption, which range from four factors for nondurable consumption to seven factors for services. (Results of the principal components analysis are available from the authors upon request.)

News‐related Google Trends data are similar to the consumption‐related Google Trends data discussed. The news‐related data only go as far back as January 2008 (January 2009 when converted to 12‐month growth rates), and the sample search queries that are used to calculate the data are limited to those submitted to “news.google.com” or the Google news search function under its search bar. These news‐related Google Trends data allow one to measure the relative popularity of news articles with specific keywords. For this research, we extract the data for the keywords “recession” and “layoff.”

The three augmented models include combinations of Google Trends data. Table 2 shows the independent variables used in each model, where the “Baseline Variables” refers to all the independent variables used in the baseline model discussed above. “90% PCA” refers to all the principal components that are required to explain 90% of variation in the Google Trends categories related to the relevant component of consumption. The [News

TABLE 1 Components of consumption and matching Google Trends categories

Google Trends categories
Durable consumption | Nondurable consumption | Services consumption
Auto Vehicles | Alcoholic Beverages | Home Financing
Auto Financing | Food & Drink | Home Improvement
Automotive Industry | Grocery & Food Retailers | Home Insurance
Auto Insurance | Non‐alcoholic Beverages | Homemaking & Interior Décor
Vehicle Brands | Apparel | Drugs & Medications
Vehicle Shopping | Apparel Services | Health Insurance
Computer Electronics | Footwear | Medical Facilities & Services
Consumer Electronics | Undergarments | Auto Financing
Home Appliances | Athletic Apparel | Auto Insurance
Home Financing | Electricity | Entertainment Industry
Home Furnishing | Energy Utilities | Movies
Home Gardening | Oil & Gas | Computer & Video Games
Home Improvement | Beauty & Fitness | Ticket Sales
Home Insurance | Chemical Industry | Food & Drink
Homemaking and Interior Décor | Drugs & Medications | Grocery & Food Retailers
Book Retailers | Face & Body Care | Hotels & Accommodations
Arts & Entertainment | Hair Care | Restaurants
Entertainment Industry | Health | Home Financing
Movies | Newspapers | Home Insurance
Computer & Video Games | Tobacco Products | Insurance
Mobile Wireless | | Internet & Telecom
Internet & Telecom | | Retirement Pension
 | | Social Services
 | | Waste Management

TABLE 2 Private consumption prediction model specifications

Variables | Baseline | News Only | 90% PCA | News +90% PCA
Baseline variables | X | X | X | X
News Google Trends (“recession” & “layoff”) | | X | | X
Consumption Google Trends (90% PCA) | | | X | X
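The index scaling and the 90% rule behind the “Consumption Google Trends (90% PCA)” row of Table 2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: `trends_index` and `pca_factors` are hypothetical names, the index is left unrounded, and the category series are standardized before the decomposition, which the paper does not specify.

```python
import numpy as np

def trends_index(category_counts, sample_sizes):
    """Share of sampled searches in a category, rescaled so the peak share equals 100."""
    share = np.asarray(category_counts, dtype=float) / np.asarray(sample_sizes, dtype=float)
    return 100.0 * share / share.max()

def pca_factors(X, threshold=0.90):
    """Return the leading principal-component scores whose cumulative share of
    variance first reaches `threshold`, together with each retained share.

    X has one row per month and one column per Google Trends category.
    Standardizing the columns first is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)  # variance share of each component
    k = int(np.searchsorted(np.cumsum(explained), threshold)) + 1
    k = min(k, len(s))
    scores = U[:, :k] * s[:k]  # factor series that would enter the forecast equation
    return scores, explained[:k]
```

For example, `trends_index([10, 20, 40], [100, 100, 100])` yields 25, 50, and 100. In the paper's application the retained factors number between four (nondurables) and seven (services); the number returned here depends entirely on the input data.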
+90% PCA] model includes both news‐related Google Trends, “recession” and “layoff,” and relevant principal components.

All Google Trends data are based on the USA only and were extracted on February 25, 2017. For our main results, we rely on the Google Trends data from January 2009 to November 2014, because that is the time period over which the annual growth rates of the Google Trends news data are available. Although Google Trends consumption data are available over a slightly longer timeframe, the news data are what limit our sample.

2.1 | Procedure

For each component of consumption and model specified in Table 2, we conduct out‐of‐sample 1‐month‐ahead forecasts and nowcasts using the recursive window method.² The out‐of‐sample prediction timeline is January 2015 to November 2016. Using the recursive window method, the forecast and nowcast models for January 2015 have the smallest training observations (January 2009 to November 2014 for nowcasts and January 2009 to October 2014 for forecasts), and the predictions should become more accurate over the course of the prediction timeline because the number of training observations increases.

² We also experimented with a 60‐month rolling window sample, but found the recursive window procedure produced lower forecast errors, suggesting that the relationships estimated are relatively stable.

For each component and model's forecasts and nowcasts, we calculate the root mean squared forecasting error (RMSFE). Then we compare the RMSFE of each augmented model to that of the relevant baseline model. A smaller RMSFE indicates more accurate predictions. In order to compare the magnitude of these errors to another benchmark, we also provide some error statistics from the Survey of Professional Forecasters of the Federal Reserve Bank of Philadelphia.

We also conduct other estimates to check for result robustness. First, to show that Google Trends augmented models can improve other baseline models, we use Vosen and Schmidt's baseline model. Their baseline includes 12‐month growth rate of S&P 500, 3‐month Treasury rate, real personal income and either MCSI or CCI. We include both MCSI and CCI for simplicity and to reduce the number of specifications. As mentioned above, we also forecast month‐on‐month growth rates after seasonally adjusting the Google Trends data. In addition, we use 60‐month rolling window OLS models for the [90% PCA] models to estimate out‐of‐sample 1‐month‐ahead forecasts and nowcasts and compare the errors to our baselines. We use these 60‐month rolling window OLS [90% PCA] models and recession data to forecast the changes in components of consumption from 2010 to 2016. These estimates allow us to check for result robustness to changes in forecast method and training observation timeline. Finally, we estimate fixed‐sample OLS models where training observations are fixed over time. This allows us to examine the sign and significance of the coefficients.

3 | RESULTS

Our main results are the following: (1) Google Trends augmented models provide additional information about consumption over and above that provided by survey‐based measures of consumer sentiment; (2) consumption‐related Google Trends data provide information about preconsumption research trends; (3) news‐related Google Trends data provide information about changes in durable goods consumption; and (4) the combination of news and consumption‐related data improves forecasting models at the 1‐month horizon the most. We discuss these results and their robustness in subsequent sections. We conclude by applying these insights to demonstrate that they can improve forecasts.

Result 1: Google Trends augmented models provide additional information about changes in consumption.

Table 3 summarizes the results of 1‐month‐ahead RMSFE in 12‐month percentage growth rates. In order to compare the forecast performance of the augmented models conveniently, we calculate the percentage of baseline errors that augmented models reduce (Table 4). On average, augmented models reduce 1‐month‐ahead baseline forecast errors by 11.15%. The 1‐month‐ahead [News +90% PCA] forecast models perform the best for all three components of consumption, reducing errors by 16.59%, on average. This model was exceptionally effective in reducing errors in services consumption predictions (21.33% reduction).

Similarly, the augmented models reduce nowcast errors as well (Table 4). On average, augmented models reduce nowcast errors by 7.14%. As is the case for 1‐month‐ahead forecasts, the [News +90% PCA] model continues to perform well for services consumption. The best models for durable and nondurable goods consumption are [News Only] and [90% PCA], respectively.

We also find improved forecasts with an alternative baseline model. Under the Vosen and Schmidt (2011) baseline model, the results are similar (Tables 5 and 6). On average, the augmented models reduced errors by 6.79% for forecasts and 4.30% for nowcasts. The
TABLE 3 RMSFE (12‐month % growth): forecast timeline January 2015 to November 2016

Dependent variable | Statistics | Baseline | News Only | 90% PCA | News +90% PCA | Average
Forecast
Durable | RMSFE | 1.836 | 1.670 | 1.703 | 1.610 | 1.705
Nondurable | RMSFE | 0.842 | 0.846 | 0.750 | 0.706 | 0.786
Services | RMSFE | 0.395 | 0.374 | 0.320 | 0.310 | 0.350
Nowcast
Durable | RMSFE | 1.680 | 1.475 | 1.604 | 1.620 | 1.595
Nondurable | RMSFE | 0.670 | 0.723 | 0.612 | 0.624 | 0.657
Services | RMSFE | 0.289 | 0.282 | 0.241 | 0.239 | 0.263
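RMSFE values such as those in Table 3 come from out‐of‐sample predictions in which the model is re‐estimated for every target month. The sketch below shows one way to compute them, along with the percentage reduction relative to a baseline. It is a simplified illustration rather than the authors' implementation: the function names are hypothetical, and the rows of `y` and `X` are assumed to be aligned so that every row before the target row is already observed when that row is predicted.

```python
import numpy as np

def rmsfe(actual, predicted):
    """Root mean squared forecasting error."""
    err = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def window_forecasts(y, X, first_test, window=None):
    """Out-of-sample OLS predictions for rows first_test, first_test + 1, ...

    window=None gives a recursive (expanding) window: each fit uses all rows
    before the target row. An integer such as window=60 gives a rolling
    window that keeps only the most recent `window` rows.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    preds = []
    for j in range(first_test, len(y)):
        start = 0 if window is None else max(0, j - window)
        coef, *_ = np.linalg.lstsq(X[start:j], y[start:j], rcond=None)
        preds.append(X[j] @ coef)  # prediction for the held-out row j
    return np.array(preds)

def reduction_pct(baseline_rmsfe, augmented_rmsfe):
    """Percentage of the baseline error removed by an augmented model."""
    return 100.0 * (baseline_rmsfe - augmented_rmsfe) / baseline_rmsfe
```

As a check against the published figures, `reduction_pct(1.836, 1.670)` evaluates to about 9.04, the durable goods [News Only] forecast entry. Other entries recomputed from the rounded RMSFE values can differ slightly in the last digits from the published reductions.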

augmented models' ability to reduce the average forecast and nowcast errors suggests that consumption‐related and news‐related Google Trends data contribute additional information about changes in consumption to baseline models.

TABLE 4 RMSFE reduction (% of baseline error): forecast timeline January 2015 to November 2016

Dependent variable | News Only | 90% PCA | News +90% PCA | Average | Best model
Forecast
Durable | 9.04 | 7.23 | 12.30 | 9.52 | News +90% PCA
Nondurable | −0.51 | 10.82 | 16.15 | 8.82 | News +90% PCA
Services | 5.18 | 18.84 | 21.33 | 15.12 | News +90% PCA
Average | 4.57 | 12.29 | 16.59 | 11.15 |
Nowcast
Durable | 12.22 | 4.52 | 3.54 | 6.76 | News Only
Nondurable | −7.89 | 8.67 | 6.87 | 2.55 | 90% PCA
Services | 2.43 | 16.58 | 17.30 | 12.10 | News +90% PCA
Average | 2.25 | 9.93 | 9.24 | 7.14 |
Note. Some models result in larger error.

Finally, in supplementary estimations, we find that the month‐on‐month forecasts for durable goods and services consumption as well as the nondurable goods nowcast are also improved with Google Trends data (Tables 7 and 8). Unlike with the 12‐month forecast, we do not find improved forecasts for nondurable consumption or improved nowcasts for durable and services. These slightly weaker results for the usefulness of the Google Trends data could be due to the difficulty of seasonally adjusting the Google Trends data with our limited sample or because the baseline month‐on‐month forecasts are better (lower RMSFE), thus leaving less opportunity for improvement. Forecasting the longer‐term 12‐month growth rate is an inherently more difficult task and the additional information provided by Google Trends is more useful. The following sections discuss the nature of the contributions of Google Trends data to forecasts and the potential channels through which Google Trends can indicate changes in consumption.

Result 2: Consumption‐related Google Trends data provide information about pre‐consumption research.

The characteristics of online preconsumption research vary by the type of consumption. Durable goods, which are defined as goods that last more than 3 years, require more in‐depth and extensive research than nondurable goods do because durable goods generally are larger investments with a longer lifetime.³ In contrast, nondurable goods require relatively brief research, if any, because they are usually smaller investments. Services consumption, however, is not defined in terms of the lifetime of the products or cost, and therefore the characteristics of its online preconsumption research are difficult to generalize. Following these ideas, consumption‐related Google Trends data should be more effective as a leading indicator for durable goods than it is for nondurable goods or services.

³ See, for example, Nelson (1970) for a discussion of the costs and benefits that consumers weigh when deciding to search for more information about purchases.

Evidence from a comparison of the error reduction of 1‐month‐ahead and nowcast [PCA 90%] models suggests that the pattern holds (Table 9). In the table, a positive reduction difference means that the model was more effective in forecasting than nowcasting consumption. In other words, the Google Trends data in that model are more effective as leading indicators of consumption than as contemporaneous indicators. Conversely, negative difference means that the related Google Trends data are stronger contemporaneous indicators. Under our baseline model, the Google Trends models are more effective in forecasting than in nowcasting for all types of consumption, suggesting that Google Trends data provide
information on preconsumption research for all types. Of those models, however, especially effective is the durable goods model, which reduces forecast errors by 2.70 percentage points more than it reduces nowcast errors. This confirms our belief that Google Trends data are most effective in providing leading information about durable goods consumption due to the nature of the products. The results are consistent under the alternative baseline model (Table 9) and the 60‐month rolling window method (Tables 10, 11, and 12). Under the 60‐month rolling window models, the error reduction difference of durable goods forecasts is the largest of the three. Under the alternative baseline, Google Trends models are more effective in nowcasting nondurable goods and services consumption than they are in forecasting them.⁴

⁴ Durable goods Google Trends models performed worse than baseline models.

In summary, durable goods models augmented with consumption‐related Google Trends data are more effective in forecasting than nowcasting consumption under all three different specifications. That is, durable goods Google Trends data provide more leading information about consumption than contemporaneous information. For nondurable goods and services, the results are mixed, suggesting that Google Trends data contain similar magnitudes of leading and contemporaneous information. We believe the difference in Google Trends' ability to lead components of consumption is due to the fact that generally durable goods preconsumption research occurs further out in advance than do nondurable or services‐related ones. Next, we discuss the nature of the information that news‐related Google Trends data provide.

TABLE 5 Alternative baseline model RMSFE (12‐month % growth): forecast timeline January 2015 to November 2016

Dependent variable | Statistics | Baseline | News Only | 90% PCA | News +90% PCA | Average
Forecast
Durable | RMSFE | 1.831 | 1.688 | 1.870 | 1.614 | 1.751
Nondurable | RMSFE | 0.833 | 0.791 | 0.752 | 0.680 | 0.764
Services | RMSFE | 0.326 | 0.336 | 0.305 | 0.303 | 0.318
Nowcast
Durable | RMSFE | 1.543 | 1.329 | 1.612 | 1.399 | 1.471
Nondurable | RMSFE | 0.628 | 0.696 | 0.560 | 0.593 | 0.619
Services | RMSFE | 0.254 | 0.258 | 0.235 | 0.233 | 0.245

TABLE 6 Alternative baseline model RMSFE reduction (% of baseline error): forecast timeline January 2015 to November 2016

Dependent variable | News Only | 90% PCA | News +90% PCA | Average | Best model
Forecast
Durable | 7.78 | −2.17 | 11.83 | 5.81 | News +90% PCA
Nondurable | 5.10 | 9.81 | 18.38 | 11.09 | News +90% PCA
Services | −3.02 | 6.46 | 6.92 | 3.45 | News +90% PCA
Average | 3.29 | 4.70 | 12.37 | 6.79 |
Nowcast
Durable | 13.91 | −4.44 | 9.34 | 6.27 | News Only
Nondurable | −10.67 | 10.97 | 5.57 | 1.95 | 90% PCA
Services | −1.59 | 7.43 | 8.19 | 4.67 | News +90% PCA
Average | 0.55 | 4.65 | 7.70 | 4.30 |
Note. Some models result in larger error.

TABLE 7 RMSFE of month‐on‐month forecasts (12‐month % growth): forecast timeline January 2015 to November 2016

Dependent variable | Statistics | Baseline | News Only | 90% PCA | News +90% PCA
Forecast
Durable | RMSFE | 1.094 | 1.078 | 1.093 | 1.078
Nondurable | RMSFE | 0.372 | 0.386 | 0.373 | 0.387
Services | RMSFE | 0.163 | 0.162 | 0.184 | 0.184
Nowcast
Durable | RMSFE | 1.096 | 1.103 | 1.107 | 1.158
Nondurable | RMSFE | 0.318 | 0.332 | 0.317 | 0.331
Services | RMSFE | 0.157 | 0.157 | 0.161 | 0.162
Note. Sample starts in 2008. 90% PCA data are monthly growth rates of the principal components.

Result 3: News‐related Google Trends data provide additional information about durable goods consumption.

News‐related Google Trends provides data on the relative popularity of news search queries. We use these data to see whether or not an increase in the search queries related to news articles that convey negative sentiments about the economy is followed by a reduction in consumption. If consumers have a precautionary savings
TABLE 8 RMSFE reduction of month‐on‐month forecasts (% of baseline error): forecast timeline January 2015 to November 2016

Dependent variable   News Only   90% PCA   News +90% PCA   Average   Best model
Forecast
  Durable              1.52        0.07       1.45           1.01    News Only
  Nondurable          −3.85       −0.43      −4.09          −2.79    Baseline
  Services             1.03      −12.64     −12.65          −8.09    News Only
  Average             −0.43       −4.33      −5.10          −3.29
Nowcast
  Durable             −0.59       −0.94      −5.57          −2.36    Baseline
  Nondurable          −4.64        0.16      −4.11          −2.87    90% PCA
  Services            −0.06       −2.37      −3.14          −1.86    Baseline
  Average             −1.76       −1.05      −4.27          −2.36
Note. Sample starts in 2008. 90% PCA data are monthly growth rate of PCAs.

TABLE 9 Difference in error reductions (1‐month‐ahead minus nowcast): forecast timeline January 2015 to November 2016

Dependent variable   Standard baseline (PCA 90%)   Alternative baseline (PCA 90%)
Durable^a             2.70                          2.269
Nondurable            2.15                         −1.159
Services              2.25                         −0.966
Note. Negative number means nowcast error reduction is larger.
a Both models increased RMSFE.

TABLE 11 RMSFE reduction using 60‐month rolling window models (% of baseline error): forecast timeline January 2015 to November 2016

Dependent variable   90% PCA
Forecast
  Durable            14.00
  Nondurable          9.79
  Services            9.39
  Average            11.06
Nowcast
  Durable            10.06
  Nondurable          8.22
  Services           15.94
  Average            11.41

TABLE 12 Difference in error reduction using 60‐month rolling window models (1 month ahead minus nowcast): forecast timeline January 2015 to November 2016

Dependent variable   Standard baseline (PCA 90%)
Durable               3.95
Nondurable            1.57
Services             −6.55
Note. Negative number means nowcast error reduction is larger.
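
The error‐reduction figures in the tables above appear to follow directly from the RMSFEs. The following is a minimal Python sketch, assuming that "% of baseline error" means 100 × (baseline RMSFE − augmented RMSFE) / baseline RMSFE and that the "difference" tables subtract the nowcast reduction from the 1‐month‐ahead reduction. The function names are illustrative and not taken from the paper, and because the inputs are the rounded RMSFEs printed in Table 10, the output can differ from the published reductions in the second decimal place.

```python
import math


def rmsfe(errors):
    """Root mean squared forecast error of a sequence of forecast errors."""
    errors = list(errors)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def pct_reduction(baseline_rmsfe, model_rmsfe):
    """Error reduction as a percentage of the baseline RMSFE.

    Assumed definition: positive values mean the augmented model improves
    on the baseline; negative values mean it does worse.
    """
    return 100.0 * (baseline_rmsfe - model_rmsfe) / baseline_rmsfe


# Rounded RMSFEs as printed in Table 10: (baseline, 90% PCA).
table10 = {
    "Durable": {"forecast": (1.616, 1.390), "nowcast": (1.698, 1.527)},
    "Nondurable": {"forecast": (0.636, 0.573), "nowcast": (0.621, 0.570)},
    "Services": {"forecast": (0.305, 0.276), "nowcast": (0.300, 0.252)},
}

for name, horizons in table10.items():
    forecast_cut = pct_reduction(*horizons["forecast"])
    nowcast_cut = pct_reduction(*horizons["nowcast"])
    # Difference as in the "1 month ahead minus nowcast" tables.
    gap = forecast_cut - nowcast_cut
    print(f"{name:<11} forecast {forecast_cut:6.2f}  "
          f"nowcast {nowcast_cut:6.2f}  difference {gap:6.2f}")
```

Under these assumptions, a negative difference (as for services) indicates that the augmented model helps the nowcast more than the 1‐month‐ahead forecast.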

TABLE 10 RMSFE using 60‐month rolling window models (12‐month % growth): forecast timeline January 2015 to November 2016

Dependent variable   Statistics   Baseline   90% PCA
Forecast
  Durable            RMSFE        1.616      1.390
  Nondurable         RMSFE        0.636      0.573
  Services           RMSFE        0.305      0.276
Nowcast
  Durable            RMSFE        1.698      1.527
  Nondurable         RMSFE        0.621      0.570
  Services           RMSFE        0.300      0.252

motive, consumers who read about negative economic outlooks will be more likely to consume less and save more in preparation for an economic downturn or potential unemployment.5 We extract data for the search queries “recession” and “layoff” and find supporting evidence for their negative correlation with durable goods consumption.

5 See Browning and Lusardi (1996) for a discussion of savings motives.

Models augmented with news‐related Google Trends reduce the baseline errors of durable consumption forecasts and nowcasts by 9.04% and 12.22%, respectively (Table 4). Under the alternative baseline model, the results are similar (Table 6). The [News Only] models reduce errors of durable goods consumption forecasts and nowcasts by 7.78% and 13.91%, respectively. This implies that the relative popularity of news articles with negative sentiments provides useful information on durable goods consumption, as expected.

The opposite is true for nondurable goods consumption. Under the standard baseline model, the [News Only] models actually increase the errors of both forecasts and nowcasts. Using the alternative baseline model, the same is true for nowcasts but not forecasts. This relatively weak performance of the news‐related Google Trends data suggests that nondurable goods consumption is not correlated with the popularity of articles conveying negative images of the state of the economy.

The evidence is also mixed for services consumption. Using our baseline model, the [News Only] models decrease the baseline errors. Under the alternative
baseline models, they increase errors. A potential explanation for this result is that it is difficult to generalize the effect of economic sentiment on services consumption because it includes a much larger variety of items.

Although we do not report the detailed results here, the regression outputs of fixed‐observation OLS models further strengthen the argument that durable goods consumption is negatively correlated with news‐related Google Trends. The coefficients of lagged values of “Layoff” and “Recession” Google Trends in [News Only] and [News +90% PCA] models are mostly negative for all three dependent variables and both forecast horizons. These coefficients are significant only when the dependent variable is durable goods consumption or 1‐month‐ahead nondurable goods consumption.

The relatively strong performance of the [News Only] model for durable goods consumption suggests that changes in durable goods consumption are vulnerable to changes in the readership of news articles conveying pessimistic outlooks on the economy. A potential explanation for this result is that durable goods consumption is more responsive to changes in sentiment because the goods are usually larger investments. In fact, the 12‐month growth rate of durable goods consumption is the most volatile of the three components, with a standard deviation of over 5 compared to standard deviations of less than 2 for the other two components of consumption. Furthermore, changes in durable goods are more strongly correlated with changes in CCI and MCSI than are those of nondurable goods and services consumption. This evidence implies that durable goods consumption is closely related to changes in sentiment, and that news‐related Google Trends data provide marginal information to baseline models. This conclusion is supported in the month‐on‐month growth rate forecasts as well (Tables 7 and 8). In these results, the [News Only] specification outperforms all others for durable goods consumption.

Result 4: Models augmented with both Google Trends news and consumption searches perform the best for 1‐month‐ahead forecasts.

The [News +90% PCA] model reduces the average 1‐month‐ahead baseline errors by 16.59% under our baseline model and 12.37% under the alternative baseline model (Tables 4 and 6). The [News +90% PCA] model was also the best‐performing 1‐month‐ahead forecast model for each component of consumption under both baseline models.

For nowcasts, however, the best‐performing models vary by component. For durable goods consumption, the [News Only] model reduces errors the most under both baseline models. For nondurable goods and services consumption, [90% PCA] and [News +90% PCA], respectively, perform the best. This implies that consumption‐related Google Trends data disrupt durable nowcasts and news‐related data disrupt nondurable nowcasts. For services consumption, both news‐related and consumption‐related data provide useful information, and therefore reduce baseline errors.

Models augmented with both consumption‐related and news‐related Google Trends data reduce 1‐month‐ahead forecast errors more than do models augmented with only one type of Google Trends data. The best models for nowcast predictions are mixed, since not all data provide information about contemporaneous change in components of consumption.

The results above indicate that Google Trends data can improve forecasts, but the best model to use depends on the type of consumption and the timeframe. For example, in Table 4, we find that the best model for the durable goods nowcast incorporates only news searches, indicating that consumers are sensitive to negative news when deciding to purchase durable goods. However, we also show that if we use that same information to predict the current month's nondurable consumption, we actually make our forecast worse than if we did not use Google Trends data at all. These results suggest that we can improve our forecasts with a better understanding of how and why Google Trends data predict consumption, using the Google search data only for the consumption types and timeframes for which they provide additional information about consumer behavior. We now demonstrate this with a forecast that selects the best Google Trends information to include for each type of consumption and timeframe. We compare our best forecast to our baseline model that does not use Google Trends data, an alternative approach that uses Google Trends data in the same way for all types of consumption and timeframes, and an alternative benchmark, the median forecast from the Survey of Professional Forecasters.

The Survey of Professional Forecasters provides quarterly annualized consumption growth forecasts, so we adapt our methods to make quarterly forecasts. Specifically, we make out‐of‐sample forecasts for total private consumption growth for the current quarter using the information that would have been available to forecasters in the second month of the quarter. This includes using the initial releases for the different components of consumption and of lagged disposable income; we update the sample used for each quarter forecasted, simulating the process that forecasters would have used in real time. We forecast the three different components of private consumption for each month of the quarter using (1)
the baseline model, (2) augmented models in which the same model is used across all timeframes and consumption types, and (3) the best model (augmented or baseline) for each specific timeframe and type of consumption. In spite of the fact that we are using initial releases for the most recent data, in calculating the RMSFE for each forecast we use the final revised values for each of the components of consumption. In other words, we judge how well the forecast predicts the change in the sum of actual consumption components, even though some of the information forecasters have at the time is imperfect and will eventually be revised.6

6 Note that we must also “forecast” consumption growth for the previous month because that information would not be known to the forecaster in the middle of the quarter.

For example, to forecast nondurable consumption 1 month ahead, the best model is News +90% PCA, but the lowest RMSFE for the previous month forecast is achieved with the baseline model. Therefore, we switch the model based on the type of consumption and timeframe being forecast. Once we estimate consumption growth for each type of consumption and timeframe, we calculate a weighted average growth rate, using the weights of each component in total consumption, and then average the monthly estimates across the 3 months in a quarter. Then, we compare the forecasted weighted average for each quarter to the actual.

The resulting RMSFEs for each forecast are shown in Table 13. While all but one of our forecasts are significantly better than the median forecast of the Survey of Professional Forecasters, the approach of using different information to forecast different types of consumption and timeframes (the “Best Models” approach) results in a slightly lower RMSFE than the second‐best approach that uses the same model for all types and timeframes of consumption. Interestingly, the baseline model outperforms models that incorporate news searches uniformly across all types of estimations, in spite of the fact that the models with news searches are sometimes the best model (e.g., nondurable goods nowcast and forecast).

TABLE 13 RMSFEs for quarterly consumption growth forecasts: out‐of‐sample forecasts for 2015:Q1 to 2016:Q4

Model                                  RMSFE
Best models combined                   0.4488
PCA 90 model                           0.4584
Baseline model                         0.5322
News + PCA 90 model                    0.5451
SPF current quarter median response    0.7760
News Only model                        3.7157

The ultimate conclusion of this exercise is that Google Trends data have the potential to improve forecasts; however, they do not always do so. Care must be taken to consider the ways in which consumers are using Google searches to inform purchasing decisions.

4 | CONCLUSION

This study shows the potential of Google Trends as a consumption indicator. Google Trends indicates changes in future and contemporaneous consumption components by providing information about preconsumption research trends and the popularity of economy‐related news articles. Consumption forecast models augmented with Google Trends data reduce forecast errors of baseline models, and the results are nontrivial and robust to changes in baseline models, measurements of errors, and statistical methods. That said, our results also suggest that the way Google Trends data are added to forecast models should carefully consider how search data are used by consumers and firms. Specifically, we demonstrate that different treatment of the Google Trends data for each of the components of consumption (durables, nondurables, and services) results in a better forecast. Furthermore, the optimal use of Google Trends data differs depending on whether the forecaster is attempting to predict future consumption or obtain a nowcast. Therefore, a uniform approach to using Google Trends data for all economic series and timeframes does not result in the best forecast.

The implications of this study are significant. Economists and policy makers can utilize daily and weekly Google Trends data to estimate high‐frequency changes in consumption. Because consumption is the largest part of the economy, unusual changes in consumption forecasts associated with changes in consumption‐related and news‐related Google Trends should be indicative of future economic events. Future research should investigate the potential of Google Trends data further, allowing its usefulness to vary with economic conditions. For example, the relationship between Google searches and consumption behavior may weaken in times of economic uncertainty. With further research and refinement, Google Trends data will allow policy makers to respond to economic events in a timelier and more appropriate manner.

ACKNOWLEDGMENTS

We are grateful for helpful comments and advice from Chris Georges, Dan Kraynak, Javier Pereira, and Jeff Pliskin.

ORCID

Ann L. Owen https://orcid.org/0000-0001-6513-2491

REFERENCES

Breyer, B. N., Sen, S., Aaronson, D. S., Stoller, M. L., Erickson, B. A., & Eisenberg, M. L. (2011). Use of Google Insights for Search to track seasonal and geographic kidney stone incidence in the United States. Urology, 78(2), 267–271.

Browning, M., & Lusardi, A. (1996). Household saving: Micro theories and micro facts. Journal of Economic Literature, 34(4), 1797–1855.

Carroll, C. D., Fuhrer, J. C., & Wilcox, D. W. (1994). Does consumer sentiment forecast household spending? If so, why? American Economic Review, 84(5), 1397–1408.

Chen, T., So, E. P. K., Wu, L., & Yan, I. K. M. (2015). The 2007–2008 U.S. recession: What did the real‐time Google Trends data tell the United States? Contemporary Economic Policy, 33(2), 395–403.

Choi, H., & Varian, H. (2012). Predicting the present with Google Trends. The Economic Record, 88, 2–9.

Doms, M. E., & Morin, N. J. (2004). Consumer sentiment, the economy, and the news media (Working Paper No. 2004–09). San Francisco, CA: Federal Reserve Bank.

Fuhrer, J. C. (1993). What role does consumer sentiment play in the U.S. macroeconomy. New England Economic Review, January, 32–44.

Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., & Brilliant, L. (2008). Detecting influenza epidemics using search engine query data. Nature, 457, 1012–1014.

Nelson, P. (1970). Information and consumer behavior. Journal of Political Economy, 78(2), 311–329.

Penna, N. D., & Huang, H. (2009). Constructing consumer sentiment index for U.S. using internet search patterns (Working Paper No. 2009–26). Edmonton, Canada: University of Alberta.

Preis, T., Moat, H. S., & Stanley, H. E. (2013). Quantifying trading behavior in financial markets using Google Trends. Scientific Reports, 3, article no. 1684.

Vosen, S., & Schmidt, T. (2011). Forecasting private consumption: Survey‐based indicators vs. Google Trends. Journal of Forecasting, 30(6), 565–578.

Wilcox, J. A. (2007). Forecasting components of consumption with components of consumer sentiment. Business Economics, 42(4), 22–32.

Wu, L., & Brynjolfsson, E. (2013). The future of prediction: How Google Searches foreshadow housing prices and sales. In A. Goldfarb, S. M. Greenstein, & C. E. Tucker (Eds.), Economic analysis of the digital economy (pp. 89–118). Cambridge, MA: National Bureau of Economic Research.

AUTHOR BIOGRAPHIES

Jaemin Woo is a Research Analyst at The Brattle Group. He studied economics at Hamilton College. His research interests are big‐data analysis and inequality.

Ann L. Owen is a Professor of Economics and the Henry Platt Bristol Professor of Public Policy at Hamilton College. She received her Ph.D. from Brown University. Her research focuses on empirical macroeconomics and inequality.

How to cite this article: Woo J, Owen AL. Forecasting private consumption with Google Trends data. Journal of Forecasting. 2019;38:81–91. https://doi.org/10.1002/for.2559
