Identifying Long-Term Precursors of Financial Market Crashes Using Correlation Patterns
10 September 2018
1. Introduction
robust and stable. We thus have a considerable degree of confidence in determining the
“optimal” number of market states identified by the new prescription. For our research,
we have used adjusted closing price data from Yahoo Finance [24] for the S&P 500 (USA)
and Nikkei 225 (JPN) stock exchanges, for the 32-year period (1985-2016). The stock
list has been filtered to retain only those stocks that were included in the market index
for the entire period of 32 years. Our main finding, among others, is that there exist
four market states in USA and five in JPN. We then study the dynamical transitions
between the market states in a probabilistic manner; we also analyze the co-occurrence
of paired market states and find that the probability of remaining in the same state
is much higher than that of jumping to another state. The transitions mainly occur among
adjacent states, with a few rare intermittent transitions to remote states. The state
adjacent to the critical state may indicate a “precursor” to the critical state (market
crash), and this novel method of identifying long-term precursors may be very helpful
for constructing an early warning system in financial markets and in other complex
systems.
The paper is organized as follows: We briefly present the methodology and the
data description. Then we present the main data analyses along with the
above-mentioned findings. Finally, we present a summary and concluding remarks.
Table 1. Abbreviations of different sectors for S&P 500 and Nikkei 225 markets
the adjusted closing price of the k-th stock at time t (trading day). Then, the
cross-correlation matrix is constructed using equal-time Pearson cross-correlation coefficients,
$C_{ij}(\tau) = (\langle r_i r_j \rangle - \langle r_i \rangle \langle r_j \rangle)/(\sigma_i \sigma_j)$,
where i, j = 1, ..., N and τ indicates the end date of the time-epoch of size M days.
Here, we computed the daily return cross-correlation matrix C(τ) over the short
time-epoch of M = 20 days, for (a) USA with N = 194
stocks of S&P 500 for a return series of T = 8060 days, and (b) JPN with N = 165
stocks of Nikkei 225 for T = 7990 days, during the calendar period 1985-2016. We use
time-epochs of 20 days, such that there is a balance between choosing short time-epochs
for detecting changes and long ones for reducing fluctuations. In figure 1, we show the
time evolution of the return of the market index, r(τ ), along with the mean market
correlation (average of all the elements of the cross-correlation matrix), µ(τ ), and the
Gini coefficient that characterizes the inequality in the distribution of the correlation
coefficients. Evidently, whenever there is a market crash (a fall in r(τ)), the mean
market correlation µ(τ) rises sharply and the Gini coefficient falls drastically, indicating
that the market is highly correlated and all the stocks behave similarly (see Ref. [10]).
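The construction of the epoch-wise correlation matrices and of µ(τ) can be sketched as follows. This is a minimal illustration on synthetic returns, not the authors' code: the function names, the toy data, and the choice to include the diagonal in the average are assumptions, and the 10-day shift between epochs is the value quoted later in the text.

```python
import numpy as np

def epoch_correlations(returns, epoch=20, shift=10):
    """Yield (end_index, C) for each time-epoch of `epoch` days.

    `returns` is a (T, N) array of daily returns; C is the N x N Pearson
    cross-correlation matrix computed inside one epoch.
    """
    T = returns.shape[0]
    for start in range(0, T - epoch + 1, shift):
        window = returns[start:start + epoch]
        yield start + epoch - 1, np.corrcoef(window, rowvar=False)

def mean_correlation(C):
    """Average of all elements of C (diagonal included; an assumption)."""
    return float(C.mean())

# Synthetic returns (hypothetical data): one shared factor plus noise.
rng = np.random.default_rng(0)
T, N = 200, 8
common = rng.standard_normal((T, 1))
returns = 0.5 * common + rng.standard_normal((T, N))

frames = list(epoch_correlations(returns))
mu = np.array([mean_correlation(C) for _, C in frames])
```

Each entry of `mu` is the mean correlation of one epoch, indexed by the epoch's end date as in the definition of τ above.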
Since the assumption of stationarity manifestly fails for longer return time series, it is
often useful to break the long time series of length T into n shorter time-epochs of
size M (such that T/M = n). The assumption of stationarity then improves for the
shorter time-epochs used. However, if there are N return time series such that N > M ,
then this implies that the correlation matrices are highly singular with N − M + 1 zero
eigenvalues, leading to poor statistics. As mentioned in the introduction, we thus use
the power map technique [19, 21, 22] to suppress the noise present in the correlation
structure of short time series. In this method, a non-linear distortion is given to each
cross-correlation coefficient within an epoch by:
$C_{ij} \to (\mathrm{sign}\, C_{ij})\,|C_{ij}|^{1+\epsilon}$, where ε is
the noise-suppression parameter. This also gives rise to an “emerging spectrum” of
eigenvalues, arising from the breaking of the degeneracy of the zero eigenvalues (see
Ref. [15] for a recent review).
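The two statements above, the N − M + 1 zero eigenvalues for N > M and the lifting of this degeneracy by the power map, can be checked numerically. The sketch below uses random synthetic returns; the helper names and the tolerance used to call an eigenvalue "zero" are assumptions made for illustration.

```python
import numpy as np

def power_map(C, eps):
    """Apply C_ij -> sign(C_ij) * |C_ij|**(1 + eps) element-wise."""
    return np.sign(C) * np.abs(C) ** (1.0 + eps)

def count_zero_eigenvalues(C, tol=1e-8):
    """Number of eigenvalues of the symmetric matrix C with |lambda| < tol."""
    return int(np.sum(np.abs(np.linalg.eigvalsh(C)) < tol))

# Short epoch with more series than days (N > M), synthetic data.
rng = np.random.default_rng(1)
M, N = 10, 30
returns = rng.standard_normal((M, N))
C = np.corrcoef(returns, rowvar=False)        # singular N x N matrix
C_eps = power_map(C, eps=0.6)

zeros_before = count_zero_eigenvalues(C)      # N - M + 1 for generic data
zeros_after = count_zero_eigenvalues(C_eps)   # degeneracy is lifted
```

The map leaves the unit diagonal and the symmetry of the matrix intact, so the distorted matrix can still be diagonalized with a symmetric eigensolver.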
Identifying long-term precursor 5
Figure 1. Results of market evolution for (a) USA and (b) JPN,
respectively. The top row shows the returns of the respective market indices. The
middle row shows the mean market correlation (averaged over all the cross-correlation
coefficients) of the respective markets. The bottom row shows the inequality in the
distribution of the cross-correlation coefficients, as characterized by the Gini coefficient.
Evidently, whenever there is a market crash, the mean market correlation becomes very
high and the Gini coefficient becomes very low, indicating that all the stocks behave
very similarly.
The effect of the variation of the parameter ε on noise reduction and on determining the
optimal number of market states can thus be better captured through the MDS. The
question is: what should be the ideal choice of the noise-suppression parameter ε? A very
small value of ε, say ε = 0.01, surely breaks the degeneracy of eigenvalues (giving rise
to an “emerging spectrum” with interesting properties [10]) but does not contribute
much to noise-suppression. On the other hand, a large value, say ε = 0.5, suppresses
the noise in the correlation pattern and helps in better clustering; however, the
emerging spectrum approaches the main Marčenko-Pastur distribution [26]. In
this paper, we are more interested in noise-suppression in the cross-correlation matrix
within a single time-epoch rather than properties of the emerging spectrum; hence, we
use ε = 0.6, and this choice of a high value is based on the robustness and on finding distinct
clusters of stocks using MDS. The effect can be clearly seen in the supplementary
figures S2 and S3. Further, our main aim is to find the optimal number of market
states, based on correlation structures which are similar and appear more frequently.
Hence, we formulate a similarity measure between different cross-correlation matrices
at different time-epochs τ, and then find similar groups of correlation frames across
different time-epochs. We find that with ε = 0.6, the noise-suppressed cross-correlation
structures can be grouped well into similar clusters, as we will describe below. However,
we find that the number of market states is not very sensitive to the noise-suppression
parameter. A higher value of ε lowers the mean of the cross-correlation coefficients, µ
(see supplementary figure S1) and the maximum eigenvalue λmax of the cross-correlation
matrix.
Figure 2 shows the results of the noise-suppression on the short time cross-
correlation matrix using power mapping method [10, 16, 19, 27]. Figure 2(a) shows
a correlation frame computed for the short time-epoch M = 20 days for USA with
N = 194 stocks of S&P 500 ending on 30/11/2001 (arbitrarily chosen date). The
eigenvalue spectrum and MDS map of the correlation frame is shown in figures 2(b)
and (c), respectively. As mentioned earlier, for any short time series M < N , the
highly singular correlation matrices will have N − M + 1 degenerate eigenvalues at zero.
Hence, in our case the eigenvalue spectrum consists of 175 eigenvalues at zero, followed
by 19 distinct positive eigenvalues. The non-linear power mapping method removes the
degeneracy of eigenvalues at zero, leading to an emerging spectrum [10, 15]. Figure 2(d)
shows the correlation pattern for ε = 0.01. The effect of the small distortion on the
corresponding eigenvalue spectrum and MDS map is shown in figures 2(e) and (f),
respectively. The effect is less visible on the MDS map; λmax reduces its value by a small
amount from 44.05 to 43.67. Next, we use a high value of the noise-suppression parameter
ε = 0.6 to reduce considerably the noise of the correlation frame (shown in figure 2(g)).
The effect of ε = 0.6 is clearly visible on the corresponding eigenvalue spectrum and
MDS map, as shown in figures 2(h) and (i), respectively. The shape of the eigenvalue
spectrum changes completely. The emerging spectrum from 175 eigenvalues at zero
is now non-degenerate in nature, and shows a spread around zero with some negative
eigenvalues. The insets of figures 2(e) and (h) show the emerging spectra in greater detail,
while in the inset of figure 2(b) the emerging spectrum is absent. Note that, for ε = 0.6,
the value of the highest eigenvalue λmax decreases by a large amount to 27.27; the clusters
of stocks in the MDS maps are distinct and denser as compared to low noise-suppression
(ε = 0.01) or no noise-suppression (ε = 0).
are two correlation matrices C(τ1) and C(τ2) at different time-epochs τ1 and τ2,
each computed over a short time-epoch of M days, then to quantify the similarity
between the correlation structures, the similarity measure is computed as:
$\zeta(\tau_1, \tau_2) \equiv \langle | C_{ij}(\tau_1) - C_{ij}(\tau_2) | \rangle$,
where $|\ldots|$ denotes the absolute value and $\langle \ldots \rangle$ denotes the
average over all matrix elements {ij} [13]. We then use the MDS map to visualize the
information contained in the n × n similarity matrix, where each element is ζ(τp, τq), with
p, q = 1, ..., n.
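A minimal sketch of the similarity matrix and of a two-dimensional embedding is given below. The embedding uses classical (Torgerson) MDS written directly in NumPy, which is a swapped-in variant chosen for self-containedness and may differ from the MDS algorithm used in the paper; the function names and the random test frames are assumptions.

```python
import numpy as np

def similarity_matrix(frames):
    """zeta(p, q) = mean over i, j of |C_ij(tau_p) - C_ij(tau_q)|."""
    stack = np.asarray(frames)                 # shape (n, N, N)
    flat = stack.reshape(len(stack), -1)       # shape (n, N*N)
    diff = np.abs(flat[:, None, :] - flat[None, :, :])
    return diff.mean(axis=2)                   # shape (n, n)

def classical_mds(D, dim=2):
    """Classical (Torgerson) MDS from an n x n matrix of pairwise distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dim]       # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

# Twelve synthetic correlation frames of six series each (hypothetical data).
rng = np.random.default_rng(2)
frames = [np.corrcoef(rng.standard_normal((20, 6)), rowvar=False)
          for _ in range(12)]
zeta = similarity_matrix(frames)
coords = classical_mds(zeta, dim=2)
```

The broadcasted difference holds n × n × N² numbers in memory, which is fine for this toy case; for several hundred frames of roughly two hundred stocks one would fill ζ row by row instead.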
Interestingly, the noise-suppression applied to individual correlation frames in short
time-epochs has a dramatic effect on the similarity matrix too. Figure 3 shows the
effect of noise-suppression on the similarity matrix [13] and the corresponding MDS map.
Each correlation frame is computed with N = 194 stocks of USA; hence, for the time
series of length T = 8060 days during the period 1985-2016, there are n = 805
correlation frames constructed from short time-epochs of M = 20 days and shifts of ∆τ = 10
days (50% overlapping time-epochs). Similarly, we have N = 165 stocks of JPN; the
time series of length T = 7990 days in the same period yields n = 798 correlation frames.
The sharp changes in the structural patterns of the similarity matrices become evident
at higher ε = 0.6. It is noteworthy that figure 3(e) shows the block structure for the
USA market and reveals the fact that the behavior of the USA market was relatively calmer
till 2002 and became more volatile afterwards, with the red-yellow stripes highlighting the
crash periods. Similarly, figure 3(g) shows that the JPN market became more volatile
from 1990 onward; also, it went through more critical periods as compared to the USA
market. Importantly, the MDS maps with the noise-suppression parameter ε = 0.6 are
more compact and denser, which leads to better clustering and determination of the optimal
number of market states (see also supplementary figures S2 and S3).
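The frame counts quoted above follow from the epoch length and the shift. As a small check, assuming epochs of M days that start every ∆τ days and must fit entirely inside the series (the function name is hypothetical):

```python
def number_of_frames(T, M=20, shift=10):
    """Count epochs of length M that fit in T days when shifted by `shift` days."""
    return (T - M) // shift + 1

n_usa = number_of_frames(8060)   # USA, T = 8060 days
n_jpn = number_of_frames(7990)   # JPN, T = 7990 days
```

With M = 20 and ∆τ = 10 this gives 805 frames for USA and 798 for JPN, matching the values in the text.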
number of clusters k for USA and JPN are shown in figures 4(a) and (b), respectively.
The standard deviations of the intra-cluster distances measured for 500 initial conditions
are shown as the error bars. The insets of figures 4(a) and (b) show the plots for 500
initial conditions. As mentioned earlier, the value of k is optimized by keeping the
standard deviation lowest and the number of clusters highest; note that for k = 1,
the standard deviations are always trivially zero. We find that for USA, the standard
deviations are low till k = 4 and then grow for higher number of clusters; thus, k = 4 is
the optimal number of clusters. For JPN, which is more complex than USA, the standard
deviation is low for k = 1, 2, 3, increases for k = 4 and then decreases drastically for
k = 5; beyond that again the standard deviation is higher. Thus, k = 5 is the optimum
number of clusters for JPN.
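The selection procedure described above can be sketched as a scan over k with many random initial conditions. The code below is an illustration on synthetic two-dimensional points, not the authors' implementation: it uses a plain Lloyd's k-means, and it assumes that the intra-cluster distance is the mean distance of each point to the centroid of its own cluster, since the precise definition is not given in this section.

```python
import numpy as np

def kmeans(points, k, rng, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old centroid if a cluster happens to be empty.
        new = np.array([points[labels == c].mean(axis=0) if np.any(labels == c)
                        else centroids[c] for c in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def intra_cluster_distance(points, labels, centroids):
    """Mean distance of every point to the centroid of its own cluster."""
    return float(np.linalg.norm(points - centroids[labels], axis=1).mean())

def scan_k(points, k_values, n_init=50, seed=0):
    """For each k, return (mean, std) of the intra-cluster distance over n_init runs."""
    rng = np.random.default_rng(seed)
    out = {}
    for k in k_values:
        d = [intra_cluster_distance(points, *kmeans(points, k, rng))
             for _ in range(n_init)]
        out[k] = (float(np.mean(d)), float(np.std(d)))
    return out

# Three well-separated synthetic blobs standing in for an MDS map.
rng = np.random.default_rng(3)
points = np.vstack([rng.normal(c, 0.3, size=(40, 2))
                    for c in ((0, 0), (4, 0), (0, 4))])
stats = scan_k(points, k_values=range(1, 7))
```

The standard deviation for k = 1 vanishes because every initial condition converges to the same single centroid, which is the trivial case noted in the text; the paper's criterion then picks the largest k for which the spread stays low.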
The final k-means clustering of the correlation frames in the similarity matrix is
therefore performed for k = 4 clusters (USA) and k = 5 clusters (JPN), as shown
in figures 5(a) and (b), respectively. We identify the points in each cluster (different
colors represent different clusters) with similar correlation patterns and nearby mean
correlation as one market state. Based on k-means clustering, figure 5(c) shows four
different market states S1, S2, S3 and S4 of USA, where S1 corresponds to the calm
state (with low mean correlation) and S4 corresponds to the crash or critical state (with
high mean correlation); figure 5(d) shows five market states S1, S2, S3, S4 and S5 of
JPN, where S1 corresponds to the calm state and S5 corresponds to the critical state.
The states are arranged in the increasing order of mean correlation. Here,
we can also see clear structural differences among the correlation matrices, e.g.,
there are strong intra-sectoral correlations within the energy, finance and utility sectors,
in each of the market states of USA.
Figure 5. Market states. (a) Classification of the USA market into four market
states. (b) Classification of the JPN market into five market states. k-means clustering
is performed on the MDS map constructed from the noise-suppressed (ε = 0.6) similarity
matrix. The coordinates assigned in the MDS map are the corresponding correlation
frames. For USA, we have 805 correlation frames of time-epoch M = 20 days with a
shift of ∆τ = 10 days; for JPN, we have 798 correlation frames for the same. (c) shows
the four different states of the USA market S1, S2, S3 and S4, where S1 corresponds to
a calm state (with low mean correlation) and S4 corresponds to the crash or critical
state (with high mean correlation). (d) shows the five different states of the JPN market
S1, S2, S3, S4 and S5, where S1 corresponds to the calm state and S5 corresponds to
the critical state.
It may also be mentioned that the selection of the noise-suppression parameter ε = 0.6
is not totally arbitrary. We compared the plots of the average intra-cluster distance as a
function of the number of clusters for both USA and JPN, using ε ranging from 0.1 to
0.7 (shown in supplementary figures S2 and S3). The outcome of the comparison is that
ε = 0.6 yields the best results.
Figure 6. Dynamical evolution of market states for USA and JPN. (a)
Temporal dynamics of the USA market in four different states (S1, S2, S3 and S4) for the
period 1985-2016. (b) Probability plot of the four market states, where the length of each
colored segment corresponds to the evolution probability of the respective state during
110 days (10 overlapping epochs). (c) and (d) show similar results for JPN with five market
states (S1, S2, S3, S4 and S5).
depends only on the previous state via $W_{ij}$, and in no way on the previous history, we
obtain
$$P_i(n+1) = \sum_j W_{ji} P_j(n), \tag{1}$$
where the sum is over all possible states j. After long times, it is plausible, and can
in fact be proved rigorously, that the probability distribution becomes independent of
n; in other words, the distribution reaches an equilibrium state $P_i^{(0)}$. The latter then
satisfies the equations
$$P_i^{(0)} = \sum_j W_{ji} P_j^{(0)}. \tag{2}$$
This can be solved explicitly, if $W_{ij}$ is known. The solution can be proved to be always
1st MS ↓ \ 2nd MS →    S1       S2       S3       S4
S1                     0.869    0.112    0.017    0.002
S2                     0.221    0.623    0.152    0.004
S3                     0.033    0.333    0.575    0.058
S4                     0        0        0.273    0.727
Table 2. USA: Co-occurrence probability of four market states (MS) (first is followed
by second).
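As an illustration of equations (1) and (2), the sketch below treats the co-occurrence probabilities of Table 2 as the transition probabilities between consecutive market states and computes the corresponding equilibrium distribution. Identifying the table with the transition matrix, renormalizing the rows to remove rounding error, and the function names are all assumptions made for this example.

```python
import numpy as np

# Table 2 (USA): row = first market state, column = second market state.
W = np.array([
    [0.869, 0.112, 0.017, 0.002],
    [0.221, 0.623, 0.152, 0.004],
    [0.033, 0.333, 0.575, 0.058],
    [0.000, 0.000, 0.273, 0.727],
])
W = W / W.sum(axis=1, keepdims=True)   # remove rounding error in the rows

def stationary_distribution(W):
    """Left eigenvector of W for eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(W.T)
    idx = np.argmin(np.abs(vals - 1.0))
    p = np.real(vecs[:, idx])
    return p / p.sum()

def iterate(W, p0, steps):
    """Apply P_i(n+1) = sum_j W_ji P_j(n) repeatedly, starting from p0."""
    p = np.asarray(p0, dtype=float)
    for _ in range(steps):
        p = p @ W
    return p

p_eq = stationary_distribution(W)                       # solves equation (2)
p_long = iterate(W, [1.0, 0.0, 0.0, 0.0], steps=500)    # equation (1), many steps
```

Starting from the calm state S1 and iterating equation (1) many times gives the same vector as solving equation (2) directly, which is the independence from n stated above.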
1st MS ↓ \ 2nd MS →    S1       S2       S3       S4       S5
S1                     0.809    0.155    0.023    0.009    0.005
S2                     0.150    0.634    0.179    0.033    0.004
S3                     0.014    0.234    0.603    0.120    0.029
S4                     0.011    0.075    0.330    0.511    0.075
S5                     0.036    0        0.107    0.393    0.464
Table 3. JPN: Co-occurrence probability of five market states (MS) (first is followed
by second).
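Tables of this kind can be estimated from a sequence of market-state labels, one label per epoch. The sketch below counts consecutive pairs and normalizes each row; reading "first is followed by second" as consecutive epochs is an assumption, and the label sequence is a toy example rather than data from the paper.

```python
import numpy as np

def cooccurrence_probabilities(states, n_states):
    """Row-normalized counts of consecutive pairs (first state -> second state)."""
    counts = np.zeros((n_states, n_states))
    for first, second in zip(states[:-1], states[1:]):
        counts[first, second] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that are never followed by anything stay at zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)

# Toy label sequence, one label per epoch (0 = S1, 1 = S2, 2 = S3).
states = [0, 0, 0, 1, 1, 0, 0, 1, 2, 2, 1, 0]
W_hat = cooccurrence_probabilities(states, n_states=3)
```

Each row of `W_hat` sums to one, as do the rows of Tables 2 and 3 up to rounding.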
In summary, we have studied the identification of market states and long-term precursors
to critical states (crashes) in financial markets, based on the probabilistic occurrences of
correlation patterns, determined using noise-suppressed short-time correlation matrices.
We analyzed and compared the data of the S&P 500 (USA) and Nikkei 225 (JPN) stock
markets over a 32-year period. We used the power mapping method to reduce the
noise of the singular correlation matrices and obtained distinct and denser clusters in
the two/three dimensional MDS maps. The effects are prominent also on the similarity
matrices and the corresponding MDS maps. The evolution of the market can be followed
by the dynamical transitions between the market states. Using multidimensional scaling
maps, we applied k-means clustering to divide the clusters of similar correlation patterns
of different time-epochs into k groups or market states. We showed that based on
the cluster radii we could have a fairly robust determination of the optimal number
of clusters. In each market, the value of optimal number of clusters was chosen by
keeping the standard deviation of the intra-cluster distance ‘minimum’ and number of
clusters ‘highest’. Thus, based on the modified prescription of finding similar clusters of
correlation patterns, we characterized USA by four market states and JPN by five.
It should be mentioned that this method yields the correlation frames that correspond
to the critical states (or crashes). We have verified that these indeed correspond to
the well-known financial market crashes; the properties of the emerging spectrum and
the characterization of the critical states (catastrophic instabilities) are studied
specifically in Refs. [10, 15]. We also analyzed the co-occurrence probabilities of the paired market
states. We observed that the probability of remaining in the same state is much higher
than that of a transition to a different state. This implies that market states also exhibit
an “inertia”, i.e., they tend to stay in the same state for a long time. Also, the most
probable transitions are nearest-neighbor transitions, and the co-occurrence tables show
that the probability decreases rapidly as one moves away from the diagonal. Hence, the
transitions to other states mainly occurred between immediately adjacent states, with a few
rare intermittent transitions to remote states. The state adjacent to the critical state (crash) behaved
like a long-term precursor for the critical state, and this prescription could be helpful in
constructing an early warning system for financial market crashes.
Acknowledgments
References
[1] Vemuri V 1978 Modeling of Complex Systems: An Introduction (Academic Press, New York)
[2] Gell-Mann M 1995 Complexity 1 16–19
[3] Bar-Yam Y 2002 Encyclopedia of Life Support Systems (EOLSS), UNESCO, EOLSS Publishers,
Oxford, UK
[4] Mantegna R N and Stanley H E 2007 An introduction to econophysics: correlations and complexity
in finance (Cambridge University Press, Cambridge)
[5] Bouchaud J P and Potters M 2003 Theory of Financial Risk and Derivative Pricing: from
Statistical Physics to Risk Management (Cambridge University Press)
[6] Sinha S, Chatterjee A, Chakraborti A and Chakrabarti B K 2010 Econophysics: an introduction
(John Wiley & Sons)
[7] Chakraborti A, Muni Toke I, Patriarca M and Abergel F 2011 Quantitative Finance 11 991–1012
[8] Chakraborti A, Muni Toke I, Patriarca M and Abergel F 2011 Quantitative Finance 11 1013–1041
[9] Chakraborti A, Challet D, Chatterjee A, Marsili M, Zhang Y C and Chakrabarti B K 2015 Physics
Reports 552 1–25
[10] Chakraborti A, Sharma K, Pharasi H K, Das S, Chatterjee R and Seligman T H 2018 arXiv
preprint arXiv:1801.07213
[11] Sornette D 2004 Why Stock Markets Crash: Critical Events in Complex Financial Systems
(Princeton University Press)
[12] Buchanan M 2000 Ubiquity: Why Catastrophes Happen (Three Rivers Press, New York)
[13] Münnix M C, Shimada T, Schäfer R, Leyvraz F, Seligman T H, Guhr T and Stanley H E 2012
Scientific reports 2 644
[14] Chetalova D, Schäfer R and Guhr T 2015 Journal of Statistical Mechanics: Theory and Experiment
2015 P01029
[15] Pharasi H K, Sharma K, Chakraborti A and Seligman T H 2018 Complex market dynamics
in the light of random matrix theory New Perspectives and Challenges in Econophysics and
Sociophysics ed Abergel F, Chakrabarti B, Chakraborti A, Deo N and Sharma K (Springer New
Economic Windows)
[16] Schäfer R, Seligman T H et al. 2013 Physical Review E 88 032115
[17] Laloux L, Cizeau P, Bouchaud J P and Potters M 1999 Physical review letters 83 1467
[18] Plerou V, Gopikrishnan P, Rosenow B, Amaral L A N and Stanley H E 1999 Physical review letters
83 1471
[19] Guhr T and Kälber B 2003 Journal of Physics A: Mathematical and General 36 3009
[20] Bouchaud J P and Potters M 2000 Theory of Financial Risks (Cambridge University Press,
Cambridge)
[21] Schmitt T A, Schäfer R, Wied D and Guhr T 2016 Empirical Economics 50 1091–1109
[22] Vinayak and Seligman T H 2014 AIP Conference Proceedings 1575 196
[23] Borg I and Groenen P 1997 Modern Multidimensional Scaling: Theory and Applications Springer
series in statistics (Springer)
[24] 2017 Yahoo Finance database, accessed on 7th July 2017 using the R open source programming
language and software environment for statistical computing and graphics. URL
https://finance.yahoo.co.jp/
[25] Mantegna R N 1999 The European Physical Journal B - Condensed Matter and Complex Systems
11 193–197
Supplementary information
Figure S1. Plots of the mean correlation without noise-suppression (blue) and with
high noise-suppression of ε = 0.6 (magenta), for (a) USA and (b) JPN. The USA market
was relatively calm up to 2002 and became turbulent with high mean correlation from
2002 onward; the JPN market became turbulent from 1990 onward.
Table S1. List of all stocks of USA market (S&P 500) considered for the analysis.
The first column has the serial number, the second column has the abbreviation, the
third column has the full name of the stock, and the fourth column specifies the sector
as given in the S&P 500.
S.No. Code Company Name Sector
1 CMCSA Comcast Corp. Consumer Discretionary
2 DIS Walt Disney Co. Consumer Discretionary
3 F Ford Motor Consumer Discretionary
4 GPC Genuine Parts Consumer Discretionary
5 GPS Gap (The) Consumer Discretionary
6 GT Goodyear Tire & Rubber Consumer Discretionary
7 HAS Hasbro Inc. Consumer Discretionary
8 HD Home Depot Consumer Discretionary
9 HRB Block H&R Consumer Discretionary
10 IPG Interpublic Group Consumer Discretionary
11 JCP Penney (J.C.) Consumer Discretionary
12 JWN Nordstrom Consumer Discretionary
13 LEG Leggett & Platt Consumer Discretionary
14 LEN Lennar Corp. Consumer Discretionary
15 LOW Lowe’s Cos. Consumer Discretionary
16 MAT Mattel Inc. Consumer Discretionary
17 MCD McDonald’s Corp. Consumer Discretionary
18 NKE NIKE Inc. Consumer Discretionary
19 SHW Sherwin-Williams Consumer Discretionary
20 TGT Target Corp. Consumer Discretionary
21 VFC V.F. Corp. Consumer Discretionary
22 WHR Whirlpool Corp. Consumer Discretionary
23 ADM Archer-Daniels-Midland Co Consumer Staples
24 AVP Avon Products Consumer Staples
25 CAG ConAgra Foods Inc. Consumer Staples
26 CL Colgate-Palmolive Consumer Staples
27 CPB Campbell Soup Consumer Staples
28 CVS CVS Caremark Corp. Consumer Staples
29 GIS General Mills Consumer Staples
30 HRL Hormel Foods Corp. Consumer Staples
31 HSY The Hershey Company Consumer Staples
32 K Kellogg Co. Consumer Staples
33 KMB Kimberly-Clark Consumer Staples
34 KO Coca Cola Co. Consumer Staples
35 KR Kroger Co. Consumer Staples
36 MKC McCormick & Co. Consumer Staples
Table S2. List of all stocks of Japan market (Nikkei 225) considered for the analysis.
The first column has the serial number, the second column has the abbreviation, the
third column has the full name of the stock, and the fourth column specifies the sector
as given in the Nikkei 225.
S. No. Code Company Name Sector
1 S-8801 MITSUI FUDOSAN CO., LTD. Capital Goods
2 S-8802 MITSUBISHI ESTATE CO., LTD. Capital Goods
3 S-8804 TOKYO TATEMONO CO., LTD. Capital Goods
4 S-8830 SUMITOMO REALTY & DEVELOPMENT CO., LTD. Capital Goods
5 S-7003 MITSUI ENG. & SHIPBUILD. CO., LTD. Capital Goods
6 S-7012 KAWASAKI HEAVY IND., LTD. Capital Goods
7 S-9202 ANA HOLDINGS INC. Capital Goods
8 S-1801 TAISEI CORP. Capital Goods
9 S-1802 OBAYASHI CORP. Capital Goods
10 S-1803 SHIMIZU CORP. Capital Goods
11 S-1808 HASEKO CORP. Capital Goods
12 S-1812 KAJIMA CORP. Capital Goods
13 S-1925 DAIWA HOUSE IND. CO., LTD. Capital Goods
14 S-1928 SEKISUI HOUSE, LTD. Capital Goods
15 S-1963 JGC CORP. Capital Goods
16 S-5631 THE JAPAN STEEL WORKS, LTD. Capital Goods
17 S-6103 OKUMA CORP. Capital Goods
18 S-6113 AMADA HOLDINGS CO., LTD. Capital Goods
19 S-6301 KOMATSU LTD. Capital Goods
20 S-6302 SUMITOMO HEAVY IND., LTD. Capital Goods
21 S-6305 HITACHI CONST. MACH. CO., LTD. Capital Goods
22 S-6326 KUBOTA CORP. Capital Goods
23 S-6361 EBARA CORP. Capital Goods
24 S-6366 CHIYODA CORP. Capital Goods
25 S-6367 DAIKIN INDUSTRIES, LTD. Capital Goods
26 S-6471 NSK LTD. Capital Goods
27 S-6472 NTN CORP. Capital Goods
28 S-6473 JTEKT CORP. Capital Goods
29 S-7004 HITACHI ZOSEN CORP. Capital Goods
30 S-7011 MITSUBISHI HEAVY IND., LTD. Capital Goods
31 S-7013 IHI CORP. Capital Goods
32 S-7911 TOPPAN PRINTING CO., LTD. Capital Goods
33 S-7912 DAI NIPPON PRINTING CO., LTD. Capital Goods
34 S-7951 YAMAHA CORP. Capital Goods
35 S-1332 NIPPON SUISAN KAISHA, LTD. Consumer Goods
36 S-2002 NISSHIN SEIFUN GROUP INC. Consumer Goods