
Non-Linear Factor Returns

in the U.S. Equity Market

Roger Clarke, Harindra de Silva, and Steven Thorley


March 19, 2024

Forthcoming in the Financial Analysts Journal

Abstract

We examine non-linear return-to-characteristic relationships for five equity market
factors: Value, Momentum, Small Size, Low Beta, and Profitability. Our study
employs monthly returns and characteristics for the largest one thousand U.S.
stocks from 1964 to 2023 with a focus on average active returns over the last twenty
years. Beyond simplicity in modeling the return generating process we find no
reason to assume a linear relationship between characteristics and security returns.
Allowance for non-linearity leads to increases in Information Ratios for some factor
portfolios neutralized with respect to non-linear exposure to the other factors. The
return to the pure Profitability characteristic across stocks is highly non-linear, as
is the alpha return within the Low Beta portfolio. Non-linearity of the Small Size
return across the largest one thousand stocks is complex and changed around the
turn of the century. In contrast, over the last twenty years, the return to earnings
yield has been linear and flat across the entire Value spectrum.

Roger Clarke, PhD, is the former President of Ensign Peak Advisors. Harindra de Silva, PhD,
CFA is a Portfolio Manager at Allspring Global Investments. Steven Thorley, PhD, CFA is an
Emeritus Professor of Finance at the Marriott School of Management.

Electronic copy available at: https://ssrn.com/abstract=4709397


Non-Linear Factor Returns in the U.S. Equity Market

Individual stock characteristics for investable equity market factors are based on
accounting or statistical measures assigned to each stock. For example, the Small Size
characteristic is typically measured as the log of market capitalization with a negative sign to
capture smallness instead of largeness. The log transformation provides a more normal
distribution of characteristics across stocks but does not address the linearity of stock returns to
the size characteristic. The stock measure for another popular equity market factor, Value, is now
often the inverse P/E ratio or earnings yield. As in the older Fama and French (1993) book-to-
market measure of Value, the motivation for inverting market-to-book is the mathematical
properties of a ratio. Specifically, the P/E ratio for some stocks is undefined due to zero or negative
earnings, and equal changes in high versus low P/E ratios are not economically equivalent. But
beyond simplicity in modeling the return generating process, there is no reason to assume a linear
relationship between either earnings yield or book-to-market and realized security returns.

A dichotomous distinction between positive and negative price momentum has received
attention, but little research has been published about non-linear relationships between security
returns and the Momentum characteristic after controlling for other common factors like Value
and Profitability. For any given factor, does a stock with a scored characteristic of, say, 2.0 have
twice the average active return of a stock with a score of 1.0? Does a stock with a scored
characteristic of -1.0 have a negative active return of the same magnitude? De Boer (2020), Zhang
(2022), Bollerslev, Patton, and Quaedvlieg (2023), Kagkadis et al. (2023), and Didisheim et al.
(2024) provide background on recent findings of non-linear returns in investment management. This
study provides a systematic examination of non-linearities in the return-to-characteristic
relationship for the pure version of five well-known equity market factors.

Our research data set is the largest one-thousand U.S. stocks each month from 1964 to
2023, with a focus on the last twenty years, 2004 to 2023. Data on characteristics are gathered for
five popular equity market factors that have emerged over time: Value, Momentum, Small Size,
Low Beta, and Profitability. The start date and factor choices come from the availability of
comprehensive return and characteristic data in CRSP and Compustat for at least one thousand
individual stocks. We use trailing annual earnings yield for Value, the logged version of Carhart’s
(1997) prior annual return less prior month return for Momentum, negative log capitalization at
the beginning of the month for Small Size, and one minus the trailing 36-month S&P 500 beta for
Low Beta. The Profitability characteristic is the prior year gross profit margin as specified by
Novy-Marx (2013) with an asset adjustment to allow for financial stocks as described in Clarke,
de Silva, and Thorley (2020).

The Value and Size factors have extensive research records in academic literature, as
introduced by Fama and French (1992), although we use the practitioner definition of earnings
yield for Value instead of book-to-market. The Momentum factor was introduced by Jegadeesh
(1990) and Jegadeesh and Titman (1993) with asymmetric properties discussed in Barroso and
Santa-Clara (2015) and the relationship to other factors explored by Ehsani and Linnainmaa
(2020). The Low Beta anomaly was implicit in early academic research on the CAPM and more
recently popularized by Frazzini and Pedersen (2014). We use Low Beta as a more precise and
portfolio-relevant characteristic of the “low volatility” factor. Novy-Marx (2013) introduced the
Profitability factor into academic literature. Quality as defined by many practitioners is shown by
Hsu, Kalesnik, and Kose (2019) to be primarily dependent on gross margin, so readers interested
in Quality can make inferences from the Profitability factor.

Basic equilibrium economics in Sharpe (1964) and Jensen, Black, and Scholes (1972)
initiated the belief that factors with positive market-relative performance must represent a reward
for taking on systematic risk. In contrast, most financial economists now believe that factors with
a significant positive long-term alpha are informational or behavioral anomalies that have not been or
cannot be arbitraged away; see, for example, Shleifer and Vishny (1997). Others worry that some of
the historical results are the outcome of data mining the factor “zoo” examined in Feng, Giglio, and
Xiu (2020), among others. We mitigate the problem of ex-post data mining by examining factors
that gained popularity prior to the turn of the century, and except for Value, by using the stock
characteristic definitions proposed by the original academic researchers. We focus on the most
recent twenty-year sample, 2004 to 2023, but also report on two earlier twenty-year periods that
show changes in the structure of the anomalies over time.

The primary focus of this study is non-linearity in the return-to-characteristic relationship
across stocks, but we also examine non-linearity in characteristic-to-characteristic relationships
between stocks. We report on the performance of optimal single-factor portfolios constructed to
be pure, meaning purged of both linear and non-linear relationships to the other factors. The paper
provides tables on pure portfolio returns and visual representations of the non-linear
return-to-characteristic profiles within portfolios using capitalization-weighted Fama-MacBeth regressions.
The monthly cross-sectional regressions of security returns on characteristics are multivariate,
simultaneously incorporating all five factors, including orthogonalized squared and cubic
characteristics to allow for non-linearities. We also examine non-linearities in return by defining
four category portfolios for each factor based on positive versus negative security characteristic
scores, and scores with an absolute value greater than one. Results for non-pure, linearly pure,
and fully pure factor categories are compared to better understand the source of non-linearity in
returns.
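
The four score-based categories described above can be illustrated with a short sketch. The function name and the treatment of scores that fall exactly on a boundary (-1, 0, or 1) are assumptions made for illustration; the text does not specify them.

```python
import numpy as np

def score_category(scores):
    """Assign each standardized score to one of four categories.

    0: score < -1, 1: -1 <= score < 0, 2: 0 <= score < 1, 3: score >= 1.
    Boundary handling is an assumption, not taken from the paper.
    """
    return np.digitize(scores, bins=[-1.0, 0.0, 1.0])

# Hypothetical standardized characteristic scores for five stocks
scores = np.array([-2.3, -0.4, 0.0, 0.7, 1.5])
categories = score_category(scores)
```

Each category can then be treated as a portfolio whose average active return is compared across the four groups.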

Economically and statistically significant non-linear return-to-characteristic relationships
are documented for some factors after purging non-linear characteristic-to-characteristic
relationships. The most complex non-linear average return patterns are found along the size
spectrum, with gross margin and market beta also being non-linear in return over the last twenty
years. In this study, size exposure is relative within the largest one thousand U.S. stocks. Outside
of that range, stocks have less liquidity and little market capitalization significance and are
excluded from our analysis. Despite substantial prior research on kinked Momentum payoffs, the
return-to-characteristic relationship for the pure version of the Momentum factor is more linear.
Profitability is both the highest performing and the most significantly non-linear factor return over
the last twenty years. Unfortunately, the Value factor has had a near zero average return profile
along the entire exposure spectrum. Value for the last twenty years has been linear but flat with
little performance from Glamour stocks to Deep Value stocks.

Linear Factor Portfolio Returns


This section introduces econometric methodologies used throughout the paper including
multivariate linear regressions, and then presents the results for non-linear cubic regressions. The
five-factor Fama and MacBeth (1973) linear regression specification is

$$r_i = c_1 s_{1,i} + c_2 s_{2,i} + c_3 s_{3,i} + c_4 s_{4,i} + c_5 s_{5,i} + \varepsilon_i \qquad (1)$$

where $r_i$ are $i = 1$ to 1000 one-month security returns and $s_{k,i}$ are beginning-of-month scored
characteristics to the $k = 1$ to 5 factors. The cross-sectional regression observations are weighted
by beginning-of-month security capitalization, as motivated in the Technical Appendix. The
security returns on the left-hand side of Equation 1 are active, meaning differential to the market
return, although total returns and an intercept term can also be used without impacting the slope
coefficients. Using total returns, an intercept term $c_0$, if included in Equation 1, is the market
portfolio return each month due to capitalization weighting of the regressions.
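
The claim about the intercept can be checked numerically. The following is a minimal sketch of one month's capitalization-weighted cross-sectional regression on synthetic data; the helper names, the simulated capitalizations, and the simulated returns are all hypothetical, and the standardization shown is a plain weighted z-score rather than the procedure in the Technical Appendix.

```python
import numpy as np

def weighted_standardize(x, w):
    """Rescale x to a weighted mean of zero and a weighted variance of one."""
    mean = np.sum(w * x)
    var = np.sum(w * (x - mean) ** 2)
    return (x - mean) / np.sqrt(var)

def weighted_regression(y, X, w):
    """Weighted least squares: minimize sum_i w_i * (y_i - X_i c)^2."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

rng = np.random.default_rng(0)
n, k = 1000, 5
cap = rng.lognormal(mean=0.0, sigma=1.5, size=n)   # synthetic market capitalizations
w = cap / cap.sum()                                 # market portfolio weights
raw = rng.normal(size=(n, k))                       # synthetic raw characteristics
S = np.column_stack([weighted_standardize(raw[:, j], w) for j in range(k)])

# Synthetic one-month total returns with small factor payoffs plus noise
r = 0.01 + S @ np.array([0.0, 0.001, 0.0, 0.001, 0.002]) + rng.normal(scale=0.08, size=n)

market_return = np.sum(w * r)
# Active returns, no intercept (the form of Equation 1)
c_active = weighted_regression(r - market_return, S, w)
# Total returns with an intercept column
c_total = weighted_regression(r, np.column_stack([np.ones(n), S]), w)
```

Because the scores have a capitalization-weighted mean of zero, the intercept in `c_total` equals the market return and the slopes match `c_active`.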

The five sets of scores in Equation 1, $s_{k,i}$, are standardized characteristics to earnings yield
(i.e., inverse trailing P/E), log price momentum, negative log market capitalization, negative
trailing 36-month market beta, and prior-year gross-margin. The characteristics are cross-
sectionally standardized each month to weighted mean-zero unit-variance variables as described
in the Technical Appendix. For notational convenience, Equation 1 does not have time subscripts,
t, because the exposure data for each cross-sectional regression is taken at one point in time,
specifically the beginning of each month. One insight from weighted regressions is that the
composition of the securities in the portfolio that generate the factor return can be inferred from
the estimated coefficients. As shown in the Technical Appendix, the estimated coefficients in
Equation 1 represent active (i.e., market differential) returns to optimally constructed single-factor
portfolios. The factor portfolios are fully invested and primarily composed of long security
positions, in contrast to the long/short portfolios used to examine factors in many academic
studies.1
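
The link between regression coefficients and portfolios follows from the algebra of weighted least squares, and can be sketched on synthetic data. Here the rows of $(S'WS)^{-1}S'W$ are read as the active weights of the five factor portfolios; this reading is an illustration based on standard regression algebra, not a reproduction of the Technical Appendix, and all inputs are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 5

cap = rng.lognormal(sigma=1.5, size=n)   # synthetic market capitalizations
w = cap / cap.sum()                      # market portfolio weights

# Synthetic characteristics, scored to weighted mean zero and unit variance
raw = rng.normal(size=(n, k))
S = raw - w @ raw
S = S / np.sqrt(w @ S**2)

# Row j of A holds the active weights implied by coefficient j
SW = S.T * w                             # shape (k, n), i.e. S'W
A = np.linalg.solve(SW @ S, SW)          # (S'WS)^{-1} S'W

r = rng.normal(scale=0.08, size=n)       # synthetic one-month returns
coef = A @ (r - w @ r)                   # coefficients = active portfolio returns

exposures = A @ S                        # exposure of each portfolio to each factor
net_active_weight = A.sum(axis=1)        # active weights sum to zero per portfolio
```

In this sketch `exposures` is the identity matrix, so each portfolio has a unit exposure to its own factor and no linear exposure to the other four, and `net_active_weight` is zero, so each portfolio is fully invested.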

The five pure single-factor portfolios are neutralized with respect to linear and non-linear
exposure to the other four characteristics as discussed in Clarke, de Silva, and Thorley (2017), in
contrast to Fama-French style factors which are only partially neutralized to the size characteristic.
In practice, multi-factor strategies based on the pure single factor portfolios contain little shorting
due to offsets for each security between factors as well as reduced active risk devoted to any single
factor. Thus, multi-factor portfolios are either long-only or can be constrained to be long-only
without materially impacting performance. We capture the monthly portfolio returns or regression
coefficients in Equation 1, and then summarize the average return for each factor over the last
twenty years (i.e., 240 months) with statistical inference using the coefficient standard error and
economic significance based on realized Information Ratios.2

1 The factor portfolios specified by the regression coefficients in Equation 1 have a one standard
deviation exposure to the factor of interest. The portfolios are commonly known as 120/20 long-short
where the amount of shorted security capitalization varies from about 10 to 30 percent.

Table 1 reports the results of Equation 1 applied to the 240 months from January 2004 to
December 2023 with market returns in excess of the contemporaneous risk-free rate (one month
T-bill) in the first column. The monthly returns are annualized in Table 1 by multiplying the mean
return by 12 and the return standard deviation by the square root of 12. For example, the first
column shows that the average excess market return was 9.33 percent over 20 years, with a
standard deviation of 15.01 percent, giving a Sharpe Ratio (mean divided by standard deviation)
of 0.622. The Sharpe Ratio entries in the other five columns are for active returns in contrast to
the more common Sharpe Ratio definition of average excess return over excess return standard
deviation. The pure active returns in the next five columns have a one-standard deviation exposure
to the factor of interest due to standardized scoring of the security characteristics in Equation 1,
resulting in variation in risk between the factors. For example, the active return standard deviation
of the Value portfolio is 2.56 percent while the Momentum portfolio active return standard
deviation is almost twice as large at 4.75 percent.
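
The annualization convention can be written out as a small helper. This is a sketch: the function name is hypothetical and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator is used.

```python
import numpy as np

def annualize(monthly_returns):
    """Annualize monthly returns: mean times 12, standard deviation times sqrt(12)."""
    m = np.asarray(monthly_returns, dtype=float)
    mean = 12.0 * m.mean()
    std = np.sqrt(12.0) * m.std(ddof=1)   # ddof=1 is an assumption
    return mean, std, mean / std

# Illustrative monthly returns (hypothetical numbers)
mean, std, ratio = annualize([0.01, 0.03, -0.02, 0.02])

# Ratio of the annualized market figures reported in Table 1
sharpe_market = 9.33 / 15.01
```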

Table 1: Linear Factor Portfolio Returns from 2004 to 2023

                     Market    Value   Momentum   Small Size   Low Beta   Profitability
Mean                  9.33%   -0.14%      1.07%        0.16%     -1.09%           1.51%
Standard Dev.        15.01%    2.56%      4.75%        3.59%      5.66%           3.05%
Sharpe Ratio          0.622   -0.056      0.226        0.044     -0.192           0.497
Market Beta           1.000     0.01      -0.04         0.04      -0.25           -0.03
Market Alpha                  -0.22%      1.42%       -0.24%      1.20%           1.79%
Active Risk                    2.56%      4.71%        3.54%      4.29%           3.02%
Information Ratio             -0.087      0.301       -0.068      0.280           0.592

2 Panel regressions that simultaneously include all 20*12*1000 = 240 thousand observations using
market capitalization weights have similar results for average active returns with larger t-statistics.
However, the estimated coefficients do not equate to identifiable portfolios and the panel
regressions do not provide a time-series active risk parameter.

As is now widely reported, Value factor portfolio performance has been non-existent over
the last two decades, with an annualized return of -14 basis points as shown in the first row of
Table 1. Performance is even worse for the non-pure Value portfolio (not shown) constructed
without neutralization to the other factors. Specifically, the non-pure Value factor average active
return with just the Value characteristic on the right-hand side of Equation 1 has a mean return of
-0.75 percent compared to -0.14 percent. Non-pure performance is worse because the Value
characteristic has a negative correlation across securities with the Profitability characteristic, and
Profitability has performed well over the last twenty years. In general, non-pure portfolio
performance is worse as measured by an Information Ratio for all five factors because return
standard deviations are higher due to non-neutralized exposure to other important factors. We
explain more about linear and non-linear factor portfolio purification later in the paper.

The realized active market beta of the pure factor portfolios is reported in the fourth row
of Table 1. For example, the Value portfolio’s active beta of 0.01 is almost zero, meaning the
portfolio’s total realized market beta is 1.00 + 0.01 = 1.01, slightly greater than one. The other
factors also have total betas close to one, except for the Low Beta portfolio which is by design
tilted towards low beta stocks. The market beta of the Low Beta portfolio in Table 1 is 1.00 - 0.25
= 0.75, and the portfolio’s alpha, calculated by active return minus active beta times market return,
is 1.20 percent.3 The other alpha calculations in Table 1 are close to the mean active return for
each factor. For example, the Value alpha is -22 basis points, slightly under the -14 basis point
mean return due to a market beta that is slightly higher than one.

3 The security return model $r_i = \alpha_i + \beta_i r_M$ and portfolio active weights of $w_{M,i} s_i$ give the
portfolio's alpha as $\alpha_P = r_P - \beta_P r_M$ where $\beta_P = \left( \sum_{i=1}^{N} w_{M,i} s_i \beta_i \right) - 1$ is the portfolio's active beta.

The active risks reported in the second to the last row of Table 1 use the realized market
betas, in other words calculated as the standard deviation of the alpha return. The Information
Ratio for each factor in the final row is alpha divided by active risk. The IR for the Profitability
portfolio is 0.592, about twice the magnitude of the Momentum and Low Beta portfolio IRs of
0.301 and 0.280, respectively. Like the Value portfolio, the alpha of the linear Small Size portfolio
has been slightly negative over the last twenty years, leading to a negative IR. Information Ratios
measure economic significance and are best evaluated with respect to the magnitude of other IRs.
To check the statistical significance of the alphas, the annualized Information Ratios can be
multiplied by the square-root of the number of years, in this case 20, to get t-statistics for the null
hypothesis that the realized annual alpha is zero. The alpha for the pure linear Profitability factor
in Table 1 has a t-statistic of $0.592 \times 20^{1/2} = 2.6$, significant at the 99-percent confidence level,
while the t-statistics for the other four factors indicate statistical insignificance.
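
The chain from monthly returns to alpha, active risk, Information Ratio, and t-statistic can be sketched as follows. The function name and the simulated series are hypothetical, and the beta is estimated here by a simple time-series covariance ratio, which is one plausible reading of "realized market beta" rather than a confirmed detail of the study.

```python
import numpy as np

def active_performance(active, market):
    """Annualized alpha, active risk, IR, and t-statistic from monthly series.

    active: monthly active returns of a factor portfolio
    market: monthly market excess returns
    """
    active = np.asarray(active, dtype=float)
    market = np.asarray(market, dtype=float)
    # Realized active beta (assumed estimator: covariance over variance)
    beta = np.cov(active, market, ddof=1)[0, 1] / np.var(market, ddof=1)
    alpha_series = active - beta * market            # monthly alpha return
    alpha = 12.0 * alpha_series.mean()               # annualized alpha
    active_risk = np.sqrt(12.0) * alpha_series.std(ddof=1)
    ir = alpha / active_risk
    years = len(active) / 12.0
    return {"beta": beta, "alpha": alpha, "active_risk": active_risk,
            "ir": ir, "t_stat": ir * np.sqrt(years)}

# Synthetic 240-month example with a true active beta of -0.25
rng = np.random.default_rng(2)
market = rng.normal(0.008, 0.043, size=240)
active = 0.001 - 0.25 * market + rng.normal(0.0, 0.01, size=240)
stats = active_performance(active, market)

# The Profitability t-statistic quoted in the text, from the rounded IR
t_profitability = 0.592 * np.sqrt(20)
```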

An optimal multi-factor portfolio can be constructed by weighting the individual pure
portfolios with the objective of maximizing the combined Information Ratio. As described in
Clarke, de Silva and Thorley (2020), those weights are proportional to the individual factor
portfolio Information Ratios under the assumption of uncorrelated factor returns. The optimal
five-factor portfolio (not shown in Table 1) has an alpha of 1.44 percent and active risk of 2.06
percent, for an Information Ratio of 0.698. This multi-factor IR is slightly lower than Sharpe’s IR
approximation formula

$$IR_P = \sqrt{\sum_{j=1}^{K} IR_j^2} \qquad (2)$$

of 0.729 because of the non-zero correlations between the realized returns reported in Table 2. For
example, Value has a realized return correlation of -0.271 to Profitability, even though both
portfolios come from the same linear regression.
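
As a quick arithmetic check of Equation 2, the five linear Information Ratios from Table 1 can be combined directly. The function name is hypothetical.

```python
import math

def combined_ir(irs):
    """Equation 2: square root of the sum of squared Information Ratios."""
    return math.sqrt(sum(ir ** 2 for ir in irs))

# Information Ratios from the final row of Table 1
linear_irs = [-0.087, 0.301, -0.068, 0.280, 0.592]
approx_linear = combined_ir(linear_irs)
```

With these rounded inputs the approximation evaluates to about 0.729, which sits above the realized 0.698 as the text describes.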

Table 2: Factor Portfolio Active Return Correlations from 2004 to 2023

                  Value   Momentum   Small Size   Low Beta   Profitability
Value             1.000     -0.083        0.224      0.114          -0.271
Momentum                     1.000       -0.078      0.163           0.192
Small Size                                1.000     -0.178          -0.248
Low Beta                                             1.000           0.046
Profitability                                                        1.000

The time-series correlations between returns are lower than those for non-pure factor
portfolios, produced by Equation 1 with just one score set on the right-hand side, but they are not
zero. For example, the return correlation (not shown in Table 2) between non-pure Value and
non-pure Profitability is -0.312 compared to -0.271, and the largest correlation between non-pure
portfolios is -0.579 for Profitability and Small Size, instead of -0.248 as reported in Table 2. The
correlation coefficients in Table 2 with absolute values greater than four standard errors,
$4 \times 1/240^{1/2} = 0.258$, are bolded for emphasis.
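
The four-standard-error screen can be reproduced from the Table 2 entries. This sketch assumes the standard error of a correlation is approximated by one over the square root of the number of months, as the arithmetic in the text implies; the dictionary layout is only for illustration.

```python
import math

n_months = 240
threshold = 4 / math.sqrt(n_months)   # four standard errors

# Off-diagonal correlations from Table 2
table2 = {
    ("Value", "Momentum"): -0.083,
    ("Value", "Small Size"): 0.224,
    ("Value", "Low Beta"): 0.114,
    ("Value", "Profitability"): -0.271,
    ("Momentum", "Small Size"): -0.078,
    ("Momentum", "Low Beta"): 0.163,
    ("Momentum", "Profitability"): 0.192,
    ("Small Size", "Low Beta"): -0.178,
    ("Small Size", "Profitability"): -0.248,
    ("Low Beta", "Profitability"): 0.046,
}

# Pairs whose absolute correlation exceeds the threshold
flagged = [pair for pair, rho in table2.items() if abs(rho) > threshold]
```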

Non-Linear Factor Portfolio Returns


Non-linear returns within factor portfolios use an expansion of Equation 1 that includes
orthogonalized squared and cubed characteristic scores for each factor. The non-pure single-factor
portfolio regression specification is

$$r_i = c_1 s_i + c_2 s_i^2 + c_3 s_i^3 + \varepsilon_i \qquad (3)$$

where $s_i$ is the scored characteristic, and $s_i^2$ is based on the squared characteristic. As described
in the Technical Appendix, the squared characteristic is analytically orthogonalized to the linear
score, making them cross-sectionally uncorrelated. The cubed characteristic, $s_i^3$, is jointly
orthogonalized to both the linear score and squared characteristics. The squared and cubed terms
in Equation 3 are rescored after orthogonalization to a weighted mean of zero and weighted
variance of one. The orthogonalization process adds precision to the examination of non-linear
return patterns using cubic regressions because the right-hand side variables would otherwise be
highly correlated, leading to larger correlations in realized portfolio returns. Specifically, the slope
coefficients $c_1$, $c_2$, and $c_3$ in Equation 3 are equivalently estimated by three individual univariate
regressions. As with Equation 1, an intercept term if included is exactly zero, or equal to the
market portfolio return if total security returns are used on the left-hand side. As with the
regression using just linear characteristics in Equation 1, the coefficients from the 15-variable
regression represent portfolio returns with the security weights in each portfolio being a by-product.
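
The orthogonalization step can be sketched with a capitalization-weighted Gram-Schmidt procedure. This is an assumption made for illustration: the Technical Appendix describes an analytic orthogonalization that is not reproduced here and may differ in detail. Function names and the simulated inputs are hypothetical.

```python
import numpy as np

def wstandardize(x, w):
    """Rescale x to a weighted mean of zero and a weighted variance of one."""
    x = x - np.sum(w * x)
    return x / np.sqrt(np.sum(w * x ** 2))

def orthogonal_powers(s, w):
    """Score, squared, and cubed terms made mutually uncorrelated under weights w."""
    s1 = wstandardize(s, w)
    q = s1 ** 2
    q = q - np.sum(w * q)                   # remove weighted mean
    q = q - np.sum(w * q * s1) * s1         # remove projection on the linear score
    s2 = wstandardize(q, w)
    c = s1 ** 3
    c = c - np.sum(w * c)                   # remove weighted mean
    c = c - np.sum(w * c * s1) * s1 - np.sum(w * c * s2) * s2
    s3 = wstandardize(c, w)
    return s1, s2, s3

rng = np.random.default_rng(3)
n = 1000
cap = rng.lognormal(sigma=1.5, size=n)      # synthetic market capitalizations
w = cap / cap.sum()
s1, s2, s3 = orthogonal_powers(rng.normal(size=n), w)

Z = np.column_stack([s1, s2, s3])
gram = (Z.T * w) @ Z                        # weighted cross-product matrix
```

Because `gram` is the identity matrix, the weighted multivariate regression on the three terms gives the same slopes as three separate univariate regressions.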

Table 3 reports on a regression like Equation 3 with 15 right-hand side variables, where
the five scores are included along with their orthogonalized squared and cubed terms. Table 3
does not include return means and standard deviations to save space but reports on the other rows
in Table 1. For example, the pure linear score coefficient for the Value portfolio has an average
alpha of -32 basis points, while the alpha for the characteristic squared term is 36 basis points, and
the cubed term alpha is 20 basis points. The active risks at a one standard deviation exposure for
the three Value portfolios are 2.47 percent, 1.61 percent, and 1.58 percent, respectively, leading to
Information Ratios of -0.128, 0.225, and 0.127 for the three Value portfolios.

Table 3: Non-Linear Factor Portfolio Returns from 2004 to 2023

Value                  Score   Squared    Cubed   Combined
Market Beta             0.00      0.02     0.00       0.02
Market Alpha          -0.32%     0.36%    0.20%      0.69%
Active Risk            2.47%     1.61%    1.58%      2.56%
Information Ratio     -0.128     0.225    0.127      0.268
Optimal Weight          -28%       98%      30%

Momentum               Score   Squared    Cubed   Combined
Market Beta            -0.04      0.09    -0.02      -0.09
Market Alpha           1.36%    -0.80%    0.96%      2.94%
Active Risk            4.58%     2.61%    1.76%      4.71%
Information Ratio      0.298    -0.306    0.544      0.624
Optimal Weight           19%      -18%      99%

Small Size             Score   Squared    Cubed   Combined
Market Beta             0.02      0.01     0.00      -0.02
Market Alpha           0.09%     0.07%   -0.68%      1.40%
Active Risk            3.58%     2.25%    2.05%      3.54%
Information Ratio      0.026     0.030   -0.332      0.396
Optimal Weight           -1%      -33%     -66%

Low Beta               Score   Squared    Cubed   Combined
Market Beta            -0.22     -0.04     0.01      -0.17
Market Alpha           1.37%    -0.04%   -0.52%      1.96%
Active Risk            4.11%     2.11%    1.51%      4.29%
Information Ratio      0.334    -0.019   -0.346      0.457
Optimal Weight           50%       -7%    -143%

Profitability          Score   Squared    Cubed   Combined
Market Beta            -0.03      0.02     0.00      -0.03
Market Alpha           1.85%    -1.54%    0.11%      2.45%
Active Risk            2.90%     2.13%    1.70%      3.02%
Information Ratio      0.638    -0.723    0.067      0.812
Optimal Weight           24%      -88%     -36%

An ex-post optimal non-linear factor portfolio can be constructed by combining the linear,
squared, and cubed portfolios for each factor. The combined column in Table 3 uses the three
return columns for each factor with weights that sum to plus or minus 100 percent. The weights
in the combined portfolio are close to the relative magnitude of the three individual portfolio
Sharpe Ratios times the inverse return correlation matrix reported later in Table 4. For example,
the non-linear portfolio return for the Value factor has a -28 percent weight of the score portfolio
return, a 98 percent weight of the squared portfolio return, and a 30 percent weight of the cubed
portfolio return. The combined non-linear factor portfolios are then scaled to the same active risk
as the corresponding linear portfolio in Table 1. The largest contributor to the combined non-
linear Value factor portfolio is the squared term leaving the combined non-linear Value portfolio
IR at 0.268. Alternatively, the largest contributor to the combined non-linear Momentum portfolio
is the cubed term, giving an IR of 0.624. Small Size has a cubed weight of -66 percent and squared
weight of -33 percent, for combined non-linear portfolio IR of 0.396.

The Low Beta factor in Table 3 has a very large weight on the cubed portfolio of -143
percent, offset by a 50 percent weight on the score portfolio, yielding a combined non-linear factor
portfolio IR of 0.457. The non-linear Profitability portfolio combines -88 percent of the squared
portfolio and -36 percent of the cubed portfolio, offset by 24 percent on linear score portfolio,
leading to a large combined non-linear Profitability portfolio IR of 0.812. In other words, the non-
linear Profitability portfolio has an extraordinary alpha of 2.45 percent over the last twenty years
at an active risk of just over three percent.
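
The combination weights can be approximated from the reported figures. The sketch below takes the Value entries from Tables 3 and 4, multiplies the Information Ratios by the inverse correlation matrix, and then divides by each portfolio's active risk before normalizing. Dividing by active risk is an assumption based on standard mean-variance reasoning; the text only says the weights are "close to" the ratios times the inverse correlation matrix. The function name is hypothetical.

```python
import numpy as np

def combine(irs, risks, corr):
    """Approximate combination weights and combined IR for one factor.

    irs, risks: Information Ratios and active risks of the score, squared,
    and cubed portfolios; corr: their return correlation matrix.
    """
    irs = np.asarray(irs, dtype=float)
    risks = np.asarray(risks, dtype=float)
    corr = np.asarray(corr, dtype=float)
    x = np.linalg.solve(corr, irs)          # inverse correlation matrix times IRs
    raw = x / risks                         # assumed conversion to return weights
    weights = raw / abs(raw.sum())          # weights sum to plus or minus 100 percent
    combined_ir = float(np.sqrt(irs @ x))
    return weights, combined_ir

# Value factor inputs from Table 3 (IR, active risk) and Table 4 (correlations)
value_irs = [-0.128, 0.225, 0.127]
value_risks = [2.47, 1.61, 1.58]
value_corr = [[1.000, 0.030, -0.491],
              [0.030, 1.000, 0.040],
              [-0.491, 0.040, 1.000]]
weights, ir = combine(value_irs, value_risks, value_corr)
```

For Value, the resulting weights and combined IR land near the reported -28, 98, and 30 percent and 0.268. Other factors may match less closely because the inputs are rounded and the reported weights are described only as approximate.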

Statistical tests of non-linearity in factor returns can be based on t-statistics for either the
squared or cubed annualized alpha for each factor, computed by IR times the square root of 20
years. For example, the t-statistic for the cubed Momentum portfolio is $0.544 \times 20^{1/2} = 2.4$, and the
t-statistic for the squared Profitability portfolio with a sign change is $0.723 \times 20^{1/2} = 3.2$. But a
complete test of non-linearity is based on the improvement in non-linear versus linear IR for each
factor

$$t\text{-statistic} = \sqrt{\left( IR_{non}^2 - IR_{lin}^2 \right) N} \qquad (4)$$

where $IR_{non}$ is the combined portfolio IR in Table 3, $IR_{lin}$ is from Table 1, and $N$ is the number of
years. For example, the t-statistic in Equation 4 for Value non-linearity is
$((0.268^2 - 0.087^2) \times 20)^{1/2} = 1.2$. The complete
non-linear return tests for other factors give t-statistics of 2.4 for Momentum, 1.8 for Small Size,
2.4 for Low Beta, and 4.5 for Profitability. Return non-linearity is thus statistically significant for
the Momentum and Low Beta portfolios, weakly significant for the Small Size portfolio, and
highly significant for the Profitability portfolio.
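
Both t-statistics can be written as short functions. The names are hypothetical, and because the inputs below are the rounded values printed in Tables 1 and 3, results computed this way may differ somewhat from the reported figures.

```python
import math

def t_stat_ir(ir, years):
    """t-statistic for a single annualized IR: IR times sqrt(years)."""
    return ir * math.sqrt(years)

def t_stat_nonlinearity(ir_non, ir_lin, years):
    """Equation 4: test of the IR improvement from allowing non-linearity.

    Requires ir_non**2 >= ir_lin**2; otherwise the square root is undefined.
    """
    return math.sqrt((ir_non ** 2 - ir_lin ** 2) * years)

# Cubed Momentum and squared Profitability (sign changed) from Table 3
t_cubed_momentum = t_stat_ir(0.544, 20)
t_squared_profitability = t_stat_ir(0.723, 20)

# Momentum: combined non-linear IR (Table 3) versus linear IR (Table 1)
t_momentum = t_stat_nonlinearity(0.624, 0.301, 20)
```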

Table 4: Non-Linear Portfolio Active Return Correlations from 2004 to 2023
(1 = score, 2 = squared, 3 = cubed)

                          Value                 Momentum               Small Size
                      1       2       3       1       2       3       1       2       3
Value 1           1.000   0.030  -0.491  -0.055   0.012  -0.021   0.225  -0.132   0.119
Value 2                   1.000   0.040  -0.095   0.135  -0.019  -0.013   0.001   0.001
Value 3                           1.000   0.040  -0.033   0.081  -0.258   0.028   0.020
Momentum 1                                1.000  -0.253   0.059  -0.005   0.107  -0.195
Momentum 2                                        1.000  -0.276  -0.016  -0.005   0.021
Momentum 3                                                1.000  -0.134   0.063   0.100
Small Size 1                                                      1.000  -0.522   0.229
Small Size 2                                                              1.000  -0.601
Small Size 3                                                                      1.000

                         Low Beta              Profitability
                      1       2       3       1       2       3
Value 1           0.096  -0.154  -0.025  -0.205   0.133   0.086
Value 2          -0.192  -0.074   0.003  -0.071  -0.020   0.153
Value 3          -0.057   0.062  -0.049   0.172  -0.051  -0.067
Momentum 1        0.170   0.048   0.143   0.151  -0.262   0.165
Momentum 2       -0.406  -0.223   0.104  -0.124   0.077  -0.052
Momentum 3        0.249   0.051   0.061   0.075   0.031  -0.120
Small Size 1     -0.156  -0.194   0.151  -0.145  -0.010   0.202
Small Size 2     -0.023   0.085  -0.125   0.031   0.063  -0.012
Small Size 3     -0.018  -0.083  -0.012  -0.165   0.269  -0.171
Low Beta 1        1.000   0.152  -0.164   0.017   0.012  -0.017
Low Beta 2                1.000  -0.084   0.106  -0.150  -0.037
Low Beta 3                        1.000   0.119  -0.160   0.019
Profitability 1                           1.000  -0.497  -0.142
Profitability 2                                   1.000  -0.468
Profitability 3                                           1.000

Table 4 reports time-series correlations between 15 portfolio returns, i.e., the coefficients in
Equation 3 with 15 rather than 3 right-hand side variables. As in Table 2, correlations with
absolute values greater than four standard errors, 0.258, are bolded for emphasis. For example,
the -0.491 correlation between the score and cubed score portfolios for the Value factor is relatively
large, as are two of the correlations within the Small Size and Profitability portfolios. On the other
hand, the correlations among the Low Beta portfolios are relatively small, but the market beta
portfolio has a large correlation of -0.406 with the squared Momentum portfolio. While these are
significant time-series return correlations, they would be much higher, in some cases with absolute
values of 0.700 to 0.800, without orthogonalizing the characteristics within each factor prior to
running the monthly cross-sectional regressions. Realized return correlations of that magnitude
make it difficult to identify the source of performance in non-linear factor portfolios or establish
the weights required for each factor’s optimal combination.

Figure 1: Non-Linear Factor Portfolios in Active Return Space

[Figure: scatter plot of alpha (vertical axis, -0.5% to 3.0%) against active risk (horizontal axis, 0.0% to 7.0%), subtitled "Average monthly returns from 2003 to 2024". Plotted portfolios: Value, Momentum, Small Size, Low Beta, Profitability, and Five-Factor, together with Linear Value, Linear Momentum, Linear Small Size, Linear Low Beta, Linear Profitability, and Linear Five-Factor.]

The results of the “market plus 15” Fama-MacBeth monthly regressions can be visualized
in several ways, but we start with the market relative performance of the optimal non-linear factor
portfolios. Figure 1 plots the performance of the five combined non-linear factor portfolios in
active space, with alpha on the vertical axis and active risk on the horizontal axis. The relevant
aspect of each portfolio’s performance is the IR-based slope of a line that connects the portfolio
position in Figure 1 to the point of zero alpha and zero active risk. For example, the combined
non-linear Value portfolio has an IR of 0.268 as reported in Table 3, compared to a negative IR
slope of -0.087 reported in Table 1. The non-linear Momentum portfolio has an IR of 0.624, as
shown in Figure 1 by the portfolio position in the upper right corner, scaled to the same 4.71
percent active risk as the linear factor portfolio with an IR of 0.301.

Small Size has an economically significant IR increase in Figure 1, going from -0.068 as
reported in Table 1, to a non-linear combined portfolio IR of 0.396 as reported in Table 3. The
Low Beta and Profitability non-linear factor portfolios have Information Ratios of 0.457 and 0.812,
respectively, both incrementally better than their linear counterparts, placing the non-linear
Profitability portfolio alpha at about 2.5 percent with an active risk of about 3.0 percent. Figure 1
also plots a multi-factor portfolio created by combining the non-linear factor portfolio returns
weighted by their respective Information Ratios. The five-factor portfolio optimally (i.e., using
return cross-correlations) combines the performance of the five individual non-linear factor
portfolios and has an extraordinary alpha of just under 2.5 percent at an active risk of about 2.0
percent, an Information Ratio of 1.191. In contrast, the optimal linear five-factor portfolio has an
alpha of just under 1.5 percent with the same level of active risk. Although impressive, the actual
Information Ratio of 1.191 for the optimal non-linear five-factor portfolio is slightly less than the
Sharpe’s rule approximation of 1.219 from Equation 2 because of the non-zero cross-correlations
between the 15 return columns reported in Table 4.

The visualizations in Figures 2 and 3 provide more perspective on the nature of the five
linear and non-linear pure factor portfolios. Figure 2 plots the security active weights as a ratio to
market portfolio weight for each security for the five linear factors based on the regression reported
in Table 1. The individual security dots are sized using their market capitalization in mid-year
2023 for perspective on the capitalization-weighted nature of the Fama-Macbeth cross-sectional
regressions. For example, the linear Small-size portfolio has security weights that decline linearly
with the Small Size score using the Information Ratio of -0.068 as reported in Table 1. The largest
two dots on the Small Size characteristic line are for Apple and Alphabet and indicate slight market
over-weights for these stocks in the linear Small Size portfolio. The same two stocks are somewhere in
the middle of the other four factor lines.

Figure 2: Linear Active Return to Security Characteristics
Average monthly return from 2004 to 2023
[Figure omitted. Series: Value, Momentum, Small Size, Low Beta, Profitability. Vertical axis: Active/Market Weight (-150% to 150%). Horizontal axis: Scored Exposure (-3.0 to 3.0).]

The vertical axis in Figure 2 is based on 2004 to 2023 fitted non-linear scores, scaled so
that dots above zero are over-weights compared to the market portfolio and dots below zero are
under-weights compared to the market portfolio. The adjustment of the vertical scale employs the
factor portfolio relationship

$$w_{P,i} = w_{M,i}\,(1 + s_i) \tag{5}$$

discussed in the Technical Appendix, where $w_{P,i}$ and $w_{M,i}$ are the weights of security i in the factor
and market portfolios, respectively, and $s_i$ is the security's standardized factor characteristic.
Specifically, the fitted score is converted into active portfolio weight, $w_{P,i} - w_{M,i}$, divided by
market weight, or $w_{P,i} / w_{M,i} - 1$. The 100 percent level on the vertical axis indicates an active
weight that is twice the market portfolio weight, while the minus 100 percent level indicates a
negative active weight equal to the market weight, a total weight of zero. Dots below the -100
percent level indicate short sells in the linear optimal factor portfolio. For example, a few securities
are shorted on the left-hand end of the optimal linear Profitability portfolio.
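As a minimal sketch of how the vertical axis relates to Equation 5, the snippet below converts a score into a total portfolio weight and an active-to-market weight ratio. The function names and the 20 basis point market weight are hypothetical and chosen only for illustration.

```python
def factor_weight(market_weight, score):
    """Total factor portfolio weight from Equation 5: w_P = w_M * (1 + s)."""
    return market_weight * (1.0 + score)


def active_over_market(market_weight, score):
    """Vertical-axis value in Figure 2: (w_P - w_M) / w_M."""
    return (factor_weight(market_weight, score) - market_weight) / market_weight


market_weight = 0.002  # hypothetical stock at 20 basis points of the market
for score in (1.0, 0.0, -1.0, -1.5):
    weight = factor_weight(market_weight, score)
    ratio = active_over_market(market_weight, score)
    position = "short" if weight < 0 else "long or flat"
    print(f"score {score:+.1f}: ratio {ratio:+.0%}, weight {weight:.4f}, {position}")
```

A score of 1.0 doubles the market weight, a score of -1.0 removes the stock entirely, and any score below -1.0 produces a negative total weight, which is the short-sale region described above.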

Figure 2 is designed to display a lot of information about the individual securities including
the cross-sectional distribution of standardized characteristics in mid-year 2023. For example, the
distribution of characteristics for the Profitability factor is close to normal, with few securities
outside the range of -2.0 to 2.0. In contrast, the cross-sectional distribution of the Value
characteristic has many securities outside the two standard deviation range, and the two largest
stocks Apple and Alphabet are just to the left of the middle of the distribution. All five of the
linear factor curves cross at the origin, where active weights are zero, because of capitalization
weighting of the individual regression observations in Equation 1.

Figure 3: Non-Linear Active Return to Security Characteristics
Average monthly return from 2004 to 2023
[Figure omitted. Series: Value, Momentum, Small Size, Low Beta, Profitability. Vertical axis: Active/Market Weight (-150% to 150%). Horizontal axis: Scored Exposure (-3.0 to 3.0).]


Figure 3 is like Figure 2, but for the optimal non-linear factor portfolios as reported in the
last column of Table 3. Non-linear returns to the Profitability factor are parabolic downward based
on the large Information Ratio of -0.723 for the squared term, but with a flatter left tail. The non-
linear Low Beta factor return graph is also curved downward for scores above about 1.0, meaning
stocks with very low market betas, but curved upward for high beta stocks on the left-hand side of
Figure 3, based on a linear characteristic IR of 0.334 and a cubed characteristic IR of -0.346, as
reported in Table 3. Small Size has the most complex shape in Figure 3 due primarily to the cubed
characteristic IR of -0.332. The largest two stocks, Apple and Alphabet, are over-weighted in the
non-linear Small Size portfolio, with lower weights for the “Magnificent Seven” and then a gradual
increase in market-relative active weights for the other 993 stocks that turn down again at a score
of about 1.0, meaning the smallest stocks within the largest one thousand. Note that the curves in
Figure 3 do not cross at zero because of capitalization weighting of a non-linear function.
Momentum is parabolic downward, although less so on the right-hand side, with shorting of the
extreme negative momentum stocks. Value is primarily linear and even more downward sloping
than in Figure 2, except for stocks with very high earnings yield (i.e., very low P/E ratios) known
as “Deep Value” stocks, where the non-linear Value factor curve turns upward in Figure 3.

Figure 4: Cumulative Risk-adjusted Alpha
3.0 percent Active Risk from 2004 to 2023
[Figure omitted. Series: Value 1, Value 2, Value 3, Moment 1, Moment 2, Moment 3, Small 1, Small 2, Small 3, Beta 1, Beta 2, Beta 3, Profit 1, Profit 2, Profit 3. Vertical axis: Cumulative Alpha (-50% to 50%). Horizontal axis: year, 2003 to 2023.]


Another way to visualize the regression results in Table 3 is the plot of cumulative alphas
for all 15 portfolios shown in Figure 4. The monthly alpha for each portfolio is accumulated over
the 20 years from 2003 to 2023, where each alpha is adjusted to an ex-post 3.0 percent active risk
level for direct comparison between portfolios. The annualized risk-adjusted alpha for the “Profit
1” portfolio at the end of the plot is 38.25 percent divided by 20 years = 1.91 percent, which
translates into the large Information Ratio of 1.91/3.00 = 0.638 given at the bottom of the first
column in Table 3. An even more impressive cumulative alpha on the negative side of the plot is
the Profit 2 portfolio, which translates to the -0.723 Information Ratio as reported at the bottom of
the second column in Table 3. The combination of the large positive and large negative returns
for these two Profitability portfolios results in the downward parabolic shape on the non-linear
Profitability return in Figure 3, with the additional contribution from the cubed characteristic
leading to a less downward slope on the left-hand side.
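The arithmetic that links a plotted end point to an Information Ratio can be sketched as follows. The rescaling function is an illustration under two assumptions not stated in the text: that alphas are accumulated by simple summation rather than compounding, and that ex-post risk is the sample standard deviation of monthly alphas annualized by the square root of 12.

```python
import math
import statistics


def cumulative_risk_adjusted_alpha(monthly_alphas, target_risk=0.03):
    """Rescale monthly alphas to an ex-post annualized active risk, then sum them."""
    realized_risk = statistics.stdev(monthly_alphas) * math.sqrt(12.0)
    scale = target_risk / realized_risk
    running, path = 0.0, []
    for alpha in monthly_alphas:
        running += alpha * scale  # simple (non-compounded) accumulation is assumed
        path.append(running)
    return path


# Reading a plotted end point back into an Information Ratio (figures from the text).
ending_alpha = 38.25                    # percent, "Profit 1" at the end of 2023
years = 20
annual_alpha = ending_alpha / years     # percent per year
information_ratio = annual_alpha / 3.0  # alpha per unit of 3.0 percent active risk

print(f"{annual_alpha:.2f} {information_ratio:.4f}")  # 1.91 0.6375
```

Because every curve is scaled to the same 3.0 percent active risk, the ending height of each line in Figure 4 is proportional to that portfolio's Information Ratio.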

The next largest cumulative alphas in Figure 4 are for the cubed Momentum portfolio at
about 33 percent, called Moment 3, followed by Beta 1 and Moment 1 each ending at about 20
percent. Large cumulative alphas, positive or negative, are also seen at the bottom of Figure 4,
specifically the Moment 2, Small 3, and Beta 3 portfolios, all ending at about -20 percent. The
associated squared Momentum, cubic Small Size, and cubic Low Beta IRs are -0.306, -0.332 and
-0.346, respectively, as given in Table 3. The cumulative alphas and associated Information Ratios
do not completely convey the combined magnitude of any given non-linear factor portfolio
because of non-zero return correlations between the three parts. For example, the large negative
correlation between the Profit 1 and Profit 2 returns of -0.497 in Table 4 is visually evident in
Figure 4 with the large opposing moves of those two plots late in the calendar year 2008.

We can also use the monthly Fama-Macbeth regression from the 15-variable version of
Equation 3 to draw more accurate non-linear active return patterns than Figure 3. Each month
fitted active returns are generated for hypothetical stocks with scores that range from -2.0 to 2.0
along the factor of interest, with scores fixed at zero for the other four factors. The 240 fitted
return observations from 2004 to 2023 are then averaged over time, with the annualized time-series
return standard deviation measuring risk at each score, and after dividing by the square root of
240, providing a standard error for the average. Figure 5 plots the average active return from a
score of -2.0 to 2.0 for each of the five factors.4 For example, the non-linear Small Size factor has
the same two-twist shape in Figure 5 as shown in Figure 3, while the Low Beta curve dips down
on the right-hand side and the Profitability curve has a significant non-linear dip towards the lower
left hand corner, all similar shapes to those shown in Figure 3. Notably, the fitted average returns
to Value from 2004 to 2023 plot on an almost flat line along the vertical axis in Figure 5.
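The procedure behind Figure 5 can be sketched for a single factor as below. This is a simplified illustration, not the regression specification itself: it assumes the squared score is demeaned by a weighted mean of 1.0, ignores the orthogonalization of the squared and cubed terms, and uses hypothetical monthly coefficients. The standard error follows the description in the text (annualized risk divided by the square root of the number of months).

```python
import math
import statistics


def fitted_active_return(coefs, score, mean_sq=1.0):
    """One month's fitted active return at `score`, other four factor scores at zero.

    coefs   : (f1, f2, f3), hypothetical monthly coefficients on s, s^2, s^3
    mean_sq : assumed weighted mean removed from the squared score
    """
    f1, f2, f3 = coefs
    value = f1 * score + f2 * (score ** 2 - mean_sq) + f3 * score ** 3
    at_zero = f2 * (0.0 - mean_sq)  # fitted return of a stock that scores zero
    return value - at_zero          # shift so the curve passes through the origin


def summarize(monthly_coefs, score):
    """Annualized average, risk, and standard error of the fitted return at `score`."""
    fitted = [fitted_active_return(c, score) for c in monthly_coefs]
    average = 12.0 * statistics.mean(fitted)
    risk = math.sqrt(12.0) * statistics.stdev(fitted)
    std_error = risk / math.sqrt(len(fitted))  # as described in the text
    return average, risk, std_error


# Two hypothetical months of coefficients; the paper averages 240 of them.
monthly_coefs = [(0.001, -0.0005, 0.0), (0.003, -0.0015, 0.0)]
for score in (-2.0, 0.0, 2.0):
    print(score, summarize(monthly_coefs, score))
```

Subtracting the fitted value at a score of zero each month is what forces both the average return and the risk to be exactly zero at the origin.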

Figure 5: Average Active Return to Pure Score 2004 to 2023
[Figure omitted. Series: Value, Momentum, Small Size, Low Beta, Profitability. Vertical axis: Average Return (-10% to 10%). Horizontal axis: Pure Score (-2.0 to 2.0).]

Figure 6 uses the same 240 fitted returns as Figure 5, but plots active return risk at each
score for the factor of interest, holding the other four scores constant at zero. Active risk is mostly
linear and symmetric on the negative and positive score side for the Value factor, but slightly
higher for positive compared to negative Low Beta scores, and higher on the lower end of the
Momentum, Profitability, and especially the Small Size spectrum. For example, the risks at scores
of 1.0 and -1.0 for Small Size are both about 5 percent, but the risk associated with Mega Cap
stocks with a score of around -2.0 is about 22 percent, more than 3 times the risk of about 7 percent
for the smallest stocks with a score of 2.0. In a supplemental appendix we provide separate charts
for each of the five factors, with two standard error lower and upper bounds, as well as the
distribution of capitalization weight along each score spectrum.

4 As in Figure 3, the non-linear fitted return curves from the weighted Fama-Macbeth regressions
do not cross at zero each month due to squared characteristic scores that are adjusted to have a
mean of zero. The fitted return at zero is subtracted along the entire score spectrum each month
so the average active return and risk are exactly zero at the origin in Figures 5 and 6.

Figure 6: Active Risk to Pure Score 2004 to 2023
[Figure omitted. Series: Value, Momentum, Small Size, Low Beta, Profitability. Vertical axis: Active Risk (0% to 25%). Horizontal axis: Pure Score (-2.0 to 2.0).]

Factor Portfolio Returns in Older Data


The CRSP and Compustat databases allow for examination of at least one thousand U.S.
stocks with fully populated characteristic data back to April 1964. We extend the sample back
three additional months, January to March 1964, with cross-sectional stock counts in the high 900s,
in order to have a full 60 years of market history. Our initial choice of dividing the sample into three
twenty-year periods was a trade-off between relatively long two-decade time periods in which to
fit average returns while allowing for changes in market dynamics over time. Indeed, a key
assumption behind the 240-month coefficient averages is stability in market and anomaly structure
over twenty years.



In this section, we review the earlier two twenty-year time periods by examining
cumulative alpha charts like Figure 4 and average active return plots like Figure 5. Figures 7 and
9 are based on monthly 15-variable cross-sectional regressions, as in the 2004 to 2023 period, with
ending plot points that translate into the twenty-year annualized Information Ratios, the key
performance metric in Table 3. The charts also allow for subjective (i.e., not statistically rigorous)
observations of when changes in the structure of factor returns appear to have occurred. To save
space in this paper, we do not include the other return tables or visual representations of the earlier
two 20-year periods.

Figure 7: Cumulative Risk-adjusted Alpha
3.0 Percent Active Risk from 1984 to 2003
[Figure omitted. Series: Value 1, Value 2, Value 3, Moment 1, Moment 2, Moment 3, Small 1, Small 2, Small 3, Beta 1, Beta 2, Beta 3, Profit 1, Profit 2, Profit 3. Vertical axis: Cumulative Alpha (-50% to 70%). Horizontal axis: year, 1983 to 2003.]

Figure 7 covers the middle 20-year period with 240 monthly returns from January 1984 to
December 2003. By far the best portfolio performance over these twenty years is for the earnings
yield or Value score, ending with a cumulative alpha of about 60 percent, which translates into an
extraordinary single-factor Information Ratio of about 60/(3×20) = 1.000. Much has been said
about the demise of the Value factor in recent decades, which historically was the best performing
of the five factors, but fixing a date for the “death of Value” is subjective. Figure 7 does indicate
a burst of Value portfolio performance around the turn of the century, and Figure 4 shows smaller
incremental performance up to the 2008 financial crisis, after which pure Value has been in
decline. Notably, the Value 3 portfolio based on cubed earnings yield also has a burst in
performance (acknowledging the sign change) at the turn-of-the-century but is negatively
correlated over time with the Value 1 portfolio.

Figure 8: Average Active Return to Pure Score 1984 to 2003
[Figure omitted. Series: Value, Momentum, Small Size, Low Beta, Profitability. Vertical axis: Average Return (-10% to 10%). Horizontal axis: Pure Score (-2.0 to 2.0).]

Another subjective observation of factor dynamics in Figure 7 is the steady performance
of the Profit 1 portfolio, and with a sign change, the Profit 3 portfolio. Together with Figure 9 for
the 1964 to 1983 period, the indication is that the Profitability anomaly has been a consistent non-
linear performer over time in roughly the same parabolic shape as indicated in Figure 3. Another
notable visual change in Figure 7 is the reversal of the Small 1 portfolio plot from declining to
advancing in about 1999. The most recent 20-year chart in Figure 4 shows a decline in Small 1
starting in about 2018, as well as a contemporaneous decline in Small 3 and rise in Small 2. Small
Size within the largest 1000 stocks appears to be both non-linear and dynamic over time, with a
complex non-linear pattern as shown in Figure 3.

Figure 8 plots the average active return to pure score for each of the five factors over the
twenty years from 1984 to 2003, as in Figure 5 for the most recent twenty years. Value and
Momentum are the best performing factors but are both almost linear despite allowing for non-
linearity in the return-to-characteristic relationship. Small size has two twists, as in the most recent
twenty years, but the shape is essentially flipped vertically so that the curve dips down on the left-
hand side. The non-linear pattern for Profitability is also different than in the most recent twenty
years, with most of the value added coming from stocks with positive Profitability scores rather
than underweighting those with negative scores that had on average positive active returns. The
Low Beta factor plot for 1984 to 2003 is quite linear and flat, but Low Beta alpha was strong in
those years after accounting for the downward-sloping Low Beta version of the SML for the
traditional CAPM.

Figure 9: Cumulative Risk-adjusted Alpha
3.0 Percent Active Risk from 1964 to 1983
[Figure omitted. Series: Value 1, Value 2, Value 3, Moment 1, Moment 2, Moment 3, Small 1, Small 2, Small 3, Beta 1, Beta 2, Beta 3, Profit 1, Profit 2, Profit 3. Vertical axis: Cumulative Alpha (-50% to 70%). Horizontal axis: year, 1963 to 1983.]


Figure 9 covers the 1964 to 1983 period and shows the strong performance of the
Momentum anomaly in those early years, with a two-decade Information Ratio of about 1.110 for
the Moment 1 portfolio, as well as strong performance with a sign change for the Moment 2
portfolio. The combination of these two portfolios, with less cumulative return activity for the
Moment 3 portfolio, may have been the basis of the dichotomous distinction between positive and
negative price momentum in early anomalies research. Another “early years” observation on
classic CAPM testing can be based on the large negative cumulative alpha and Information Ratio
of the Beta 3 portfolio, a phenomenon which also carries over to recent decades as seen by the
Beta 3 portfolio in Figure 4.

Figure 10: Average Active Return to Pure Score 1964 to 1983
[Figure omitted. Series: Value, Momentum, Small Size, Low Beta, Profitability. Vertical axis: Average Return (-10% to 10%). Horizontal axis: Pure Score (-2.0 to 2.0).]

Figure 10 plots the average active return to pure score for the earliest twenty years from
1964 to 1983, as in Figure 5 for the most recent twenty years. Interestingly, the active factors were
generally more linear back then, except for the Low Beta anomaly, which followed the predicted
downward-sloping SML on the low beta side (i.e., large positive Low Beta scores) but not on the
high beta side. As for the most recent twenty years, a supplemental appendix to this paper contains
figures for each separate factor for the first two twenty-year periods with additional risk and
capitalization distribution information.

Non-linear Relationships between Security Characteristics


In this section we examine non-linear pair-wise characteristic-to-characteristic
relationships across stocks. While the non-linear return-to-characteristic pattern for each factor is
the focus of the study, documenting the non-linear relationships between characteristics motivates
the inclusion of beginning-of-month squared and cubed characteristics in the weighted Fama-
Macbeth regressions. Non-linear factors that are pure for each factor require the inclusion of other
factor squared and cubed characteristics in Equation 3. Figures 11 and 12 show the current (i.e.,
mid-year 2023) relationship between earnings yield and the four other characteristics using the
cross-sectional regression

$$s_{k,i} = \beta_0 + \beta_1 s_{1,i} + \beta_2 s_{1,i}^2 + \beta_3 s_{1,i}^3 + \varepsilon_i \tag{6}$$

where $s_{k,i}$ is the scored characteristic of stock i for factor k. To first analyze linear relationships,
the k = 2 to 5 applications of Equation 6 only include the earnings yield score, $s_{1,i}$. As in Equation
1, the estimated intercept term, $\beta_0$, in the univariate regressions is exactly zero because of market
capitalization weighting. Figure 11 plots the fitted scores from these four pair-wise regressions on
the vertical axis and earnings yield score on the horizontal axis. The lines are composed of
individual dots for each stock with many outside the -2.0 to 2.0 score range based on the
distribution of the earnings yield characteristic across stocks.
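The zero-intercept property of the univariate case can be illustrated with a small weighted regression. The sketch below uses a hypothetical five-stock market with made-up weights, earnings yields, and gross margins; the helper names are not from the paper.

```python
import math


def weighted_standardize(values, weights):
    """Subtract the weighted mean and divide by the weighted standard deviation."""
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return [(v - mean) / math.sqrt(var) for v in values]


def weighted_linear_fit(x, y, weights):
    """Weighted least squares of y on x with an intercept; returns (intercept, slope)."""
    total = sum(weights)
    x_bar = sum(w * a for w, a in zip(weights, x)) / total
    y_bar = sum(w * b for w, b in zip(weights, y)) / total
    cov = sum(w * (a - x_bar) * (b - y_bar) for w, a, b in zip(weights, x, y))
    var = sum(w * (a - x_bar) ** 2 for w, a in zip(weights, x))
    slope = cov / var
    return y_bar - slope * x_bar, slope


# Hypothetical five-stock market: weights, earnings yields, and gross margins.
market_weights = [0.35, 0.25, 0.20, 0.12, 0.08]
earnings_yield = [0.03, 0.04, 0.06, 0.08, 0.10]
gross_margin = [0.60, 0.45, 0.50, 0.30, 0.25]

s1 = weighted_standardize(earnings_yield, market_weights)
sk = weighted_standardize(gross_margin, market_weights)
intercept, slope = weighted_linear_fit(s1, sk, market_weights)
print(intercept, slope)
```

Because both sets of scores have a capitalization-weighted mean of zero, the fitted intercept vanishes up to floating-point error, and with unit weighted variance the slope equals the weighted correlation between the two characteristics.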

The largest linear relationship in Figure 11 is the negative correlation between earnings
yield and gross margin, indicating that higher accounting gross margins are currently found among
lower earnings yield stocks. On the other hand, earnings yield is positively correlated to the Low
Beta characteristic, indicating that higher yielding stocks currently tend to have lower market
betas. Figure 12 is like Figure 11 but with the right-hand side of Equation 6 including the squared
and cubed terms. The squared and cubed scores could be, but are not, orthogonalized in Equation 6
because in this section we are interested in overall fit, not the individual coefficients.

Figure 11: Earnings Yield Linear Relationships in 2023
[Figure omitted. Series: Price Momentum, Small Size, Low Beta, Gross Margin. Vertical axis: Fitted Exposure (-1.0 to 1.0). Horizontal axis: Exposure Score (-3.0 to 3.0).]

Figure 12: Earnings Yield Non-Linear Relationships in 2023
[Figure omitted. Series: Price Momentum, Small Size, Low Beta, Gross Margin. Vertical axis: Fitted Exposure (-1.0 to 1.0). Horizontal axis: Exposure Score (-3.0 to 3.0).]


The relationship between the earnings yield and gross margin exposures in Figure 12 is
highly non-linear, with the highest margins currently found in the middle range of earnings yield
stocks. Statistical tests indicate that the assumption of linearity is rejected at high levels of
significance for most factor pairs in most months, specifically F-tests that the squared and cubed
terms in Equation 6 are jointly zero. Figure 12 shows that earnings yield is currently (i.e., mid-
year 2023) concave with respect to price momentum, low beta, and gross margin. Interestingly,
earnings yield is convex with respect to the Small Size characteristic, but the otherwise parabolic
shape flattens out towards the right-hand side of Figure 12.

Figure 13 captures the current linear relationships of gross margin rather than earnings
yield to the other four characteristics. Note that the cross-sectional distribution of gross margin is
positively skewed with a long right tail, as seen by the individual dots on the right-hand side of
Figure 13, but no dots below -2.0. The earnings yield fit is negatively sloped, a reconfirmation of
the negative relationship between these two sets of exposures in Figure 11. The strongest linear
correspondence to gross margin is the -0.335 correlation to the Small Size characteristic, which is the
value of the fitted line at a characteristic score of 1.0.

Figure 13: Gross Margin Linear Relationships in 2023
[Figure omitted. Series: Earnings Yield, Price Momentum, Small Size, Low Beta. Vertical axis: Fitted Exposure (-1.0 to 1.0). Horizontal axis: Exposure Score (-3.0 to 3.0).]


Figure 14 plots the current non-linear relationships between gross margin and the other
four factors, as in Figure 12 for earnings yield. Two of the fitted plots, price momentum and small
size, have significant cubed terms leading to more complex curves, while the relationship to the
low beta characteristic is relatively linear. In other words, both the squared and cubed market beta
terms on the right-hand side of Equation 6 are relatively unimportant in 2023. The relationship of
gross margin to earnings yield has a significant squared but not cubed coefficient, as in the reverse
plot for earnings yield against gross margin in Figure 12, so the earnings yield curve in Figure 14
has just one twist.

Figure 14: Gross Margin Non-Linear Relationships 2023
[Figure omitted. Series: Earnings Yield, Price Momentum, Small Size, Low Beta. Vertical axis: Fitted Exposure (-1.0 to 1.0). Horizontal axis: Exposure Score (-3.0 to 3.0).]

Non-linear characteristic-to-characteristic relationships among stocks change gradually
over time like the better-known linear (i.e., correlation coefficient) relationships. Specifically, an
otherwise identical point-in-time perspective on the market twenty years ago (i.e., mid-year 2003)
bears almost no similarities to the relationships between any given pair of exposures shown in
Figures 12 and 14. To illustrate the dynamic nature of characteristic-to-characteristic
relationships, we focus on simple linear correlations to gross margin over the last 60 years in Figure
15. Earnings yield, which often has a negative correlation to gross margin, has fluctuated between
0.2 and -0.2 in recent decades. The correlation is currently (i.e., year-end 2023) slightly negative,
a reconfirmation of the negative slope in Figure 13, but in the late 1970s the correlation was much
larger in magnitude at about -0.6.

Figure 15: Gross Margin Exposure Linear Correlations Over Time
[Figure omitted. Series: Earnings Yield, Price Momentum, Small Size, Low Beta. Vertical axis: Correlation Coefficient (-0.6 to 0.6). Horizontal axis: year, 1963 to 2023.]

Gross margin across stocks typically has a negative correlation to the Small Size
characteristic as shown in Figure 15, meaning that larger stocks tend to have lower financial
accounting gross margins. But the current correlation of almost -0.4 is by far the largest ever in
this entire 60-year history. In other words, Profitability exposure, which is often understood to be
almost the opposite of Value exposure across stocks, is currently more negatively correlated with
Small Size. The correlation of the gross margin characteristic, or any other characteristic, to price
momentum across stocks is erratic because the momentum characteristic definitionally changes
each year for each stock. Figure 15 shows that the correlation of gross margin with the Low Beta
characteristic across stocks has fluctuated around 0.2 in recent decades but reached -0.3 in the late
1970s and 1980s.

Other charts like Figure 15 could be shown for the linear relationship between, for example,
Small Size and the other four factors, as well as charts for the various non-linear relationships. But
they all lead to the same empirical conclusion that the pair-wise relationships between factor
exposures are both non-linear and dynamic over time. Statements in equity factor research about
one set of characteristics being positively or negatively related to another set should be thought of
as specific to a point in time, not permanent characterizations of market structure. For example,
the more complex non-linear relationships shown in Figure 14 for gross margin also change
dynamically over time, as they do in charts like Figure 14 for the other four characteristics.

Factor Portfolio Category Performance


In this section we use a different methodology for documenting non-linear performance
patterns within factor portfolios. Rather than multivariate regressions, we examine non-linear
return-to-characteristic relationships using categories within portfolios constructed from factor
scores. Converting raw characteristics into scores is a well-established process of subtracting the
cross-sectional mean of all N (e.g., 1000) securities and dividing by the cross-sectional standard
deviation. In this study, we use means and variances that are benchmark-anchored, meaning that
beginning-of-period market weights are employed in the mean and variance calculations. As explained in
the Technical Appendix, simultaneously orthogonalizing K (e.g., 5) sets of exposures with respect
to the other K - 1 sets requires the matrix algebra expression,

S = B[B(WM B)]−1 (7)

where S is an N-by-K matrix of pure scores, WM is a vector of market weights, and B is an N-by-

K matrix of exposures with their weighted-average means subtracted. Conceptually, matrix B on


the right-hand side of Equation 7 is converted into matrix S by the inverse weighted covariance
matrix shown in square brackets. Note that after orthogonalization, the exposures must be rescaled
to unit variance because of non-zero correlations between the exposure sets.
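A minimal sketch of Equation 7 for K = 2 is given below, using plain Python lists so that the 2-by-2 inverse can be written out by hand. It assumes that the product of the weight vector and B means weighting each row of B by that security's market weight; the data and function names are hypothetical, and the final rescaling to unit variance is not shown.

```python
def weighted_demean(column, weights):
    """Subtract the weighted-average mean from one column of exposures."""
    mean = sum(w * v for w, v in zip(weights, column)) / sum(weights)
    return [v - mean for v in column]


def weighted_cross(A, B, weights):
    """Return the 2-by-2 matrix A' diag(weights) B for two N-by-2 matrices."""
    return [[sum(w * a[j] * b[k] for w, a, b in zip(weights, A, B))
             for k in range(2)] for j in range(2)]


def pure_scores(B, weights):
    """Equation 7 for K = 2: S = B [B'(W B)]^{-1}, with W applied as row weights."""
    m = weighted_cross(B, B, weights)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    inv = [[m[1][1] / det, -m[0][1] / det],
           [-m[1][0] / det, m[0][0] / det]]
    return [[b[0] * inv[0][k] + b[1] * inv[1][k] for k in range(2)] for b in B]


# Hypothetical four-stock market with two raw characteristics.
market_weights = [0.40, 0.30, 0.20, 0.10]
col_a = weighted_demean([1.2, -0.3, -0.8, 0.9], market_weights)
col_b = weighted_demean([0.5, 0.1, -1.0, -0.6], market_weights)
B = [list(row) for row in zip(col_a, col_b)]

S = pure_scores(B, market_weights)
check = weighted_cross(B, S, market_weights)  # should be the identity matrix
print(check)
```

The identity result is the sense in which the scores are pure under this reading: a portfolio with active weights built from column k of S has unit weighted exposure to characteristic k and zero weighted exposure to the other characteristic.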



The innovation in this paper compared to Clarke, de Silva, and Thorley (2017) is that
squared and cubed characteristics are included in matrix B leading to 15 rather than 5 columns,
with the first 5 columns of the output matrix S being fully purified scores. As explained in the
Technical Appendix, the columns of squared and cubed characteristics should be analytically
orthogonalized with respect to the column of scores and each other, as in the regression
applications. In this section, we refer to non-pure scores as standardized characteristics that are
not adjusted by Equation 7, linear pure scores as the output matrix S with 5 columns in matrix B,
and fully pure scores as the first five columns of matrix S with 15 columns in matrix B.

For all three types of scores, non-pure, linear pure, and fully pure, the weights for factor
portfolio construction are simply,

$$w_{Pi} = w_{Mi}\, s_i \tag{8}$$

where $w_{Mi}$ are market weights and $w_{Pi}$ are the active (i.e., market differential) weights on the i = 1
to N securities in the portfolio. Conceptually, the scores in Equation 8 convert market portfolio
weights into active factor portfolio weights that sum to zero rather than one. As discussed in the
Technical Appendix, single-factor portfolios constructed by Equation 8 are mean-variance optimal
under the assumption of a “market plus one active factor” return generating process and
homogeneous idiosyncratic risk. The portfolio returns are calculated each month by the
summation,

$$r_P = \sum_{i=1}^{N} w_{Mi}\, s_i\, r_i \tag{9}$$

where $r_i$ are the security returns.
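Equations 8 and 9 can be sketched in a few lines. The four-stock market, characteristic values, and monthly returns below are hypothetical, and the scoring helper assumes market weights that sum to one.

```python
import math


def benchmark_scores(values, market_weights):
    """Standardize with market-weighted mean and variance (weights sum to one)."""
    mean = sum(w * v for w, v in zip(market_weights, values))
    var = sum(w * (v - mean) ** 2 for w, v in zip(market_weights, values))
    return [(v - mean) / math.sqrt(var) for v in values]


# Hypothetical inputs for a four-stock market.
market_weights = [0.40, 0.30, 0.20, 0.10]
characteristic = [0.06, 0.05, 0.01, 0.12]
monthly_returns = [0.010, 0.004, -0.020, 0.030]

scores = benchmark_scores(characteristic, market_weights)
active_weights = [w * s for w, s in zip(market_weights, scores)]             # Equation 8
active_return = sum(a * r for a, r in zip(active_weights, monthly_returns))  # Equation 9

# A score below -1 makes the total weight w_M * (1 + s) negative, i.e. a short.
shorted = [i for i, s in enumerate(scores) if s < -1.0]
print(sum(active_weights), active_return, shorted)
```

With benchmark-anchored scores the active weights sum to zero up to rounding, so the summation in Equation 9 is a return differential to the market rather than a total return.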

The portfolio returns in Equation 9 are active, meaning differential to the market return,
and used to measure factor performance with an additional adjustment for non-unitary market betas
as described in Footnote 3. The factor portfolios constructed by Equation 8 are not long-only
because securities with exposure scores less than -1 are shorted. The construction of strictly long-
only portfolios removes the lower end of exposures, obfuscating an examination of non-linearity
in the return-to-exposure relationship. The portfolios are commonly called 120/20 long/short,
meaning that an additional 20 percent of the notional portfolio value is invested in long positions financed by
short positions. As with portfolio returns based on regression coefficients in prior sections, the
actual amount of capitalization shorted varies from about 10 percent to 30 percent depending on
the factor and time period.

The first column of Table 5 reports the total alpha for the fully pure Value factor portfolio
in each sub-period, followed by characteristic categories that contain securities with exposure
scores delimited by minus one, zero, and positive one. Specifically, the four category returns are
conditional summations that separate the total portfolio active return in Equation 9 into

$$r_{\text{large under-weight}} = \sum_{s_i < -1.0} w_{Mi}\, s_i\, r_i \tag{10}$$

and three other categories where the conditions under the summation are the -1.0 to 0.0 score range
(under-weight category), the 0.0 to 1.0 score range (over-weight category), and the greater than
1.0 (large over-weight category) score range, respectively. The categories are themselves
portfolios with securities weighted by score times market capitalization but have not been adjusted
to be fully invested portfolios. This ensures that the total portfolio alpha is the simple sum of four
category portfolio alphas, a property which would not hold if each portfolio were divided by the
sum of security capitalization weights in that category each month.
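The additive property can be seen in a short sketch of the category split. The inputs are hypothetical, and the treatment of scores that fall exactly on a boundary is an assumption, since the text does not specify it.

```python
def category_returns(market_weights, scores, returns):
    """Split the Equation 9 summation into the four score ranges of Equation 10."""
    totals = {"large under-weight": 0.0, "under-weight": 0.0,
              "over-weight": 0.0, "large over-weight": 0.0}
    for w, s, r in zip(market_weights, scores, returns):
        if s < -1.0:
            name = "large under-weight"
        elif s < 0.0:
            name = "under-weight"
        elif s <= 1.0:          # boundary handling is an assumption
            name = "over-weight"
        else:
            name = "large over-weight"
        totals[name] += w * s * r   # no rescaling to a fully invested portfolio
    return totals


# Hypothetical four-stock market.
market_weights = [0.40, 0.30, 0.20, 0.10]
scores = [0.25, -0.10, -1.50, 2.30]
returns = [0.010, 0.004, -0.020, 0.030]

parts = category_returns(market_weights, scores, returns)
total = sum(w * s * r for w, s, r in zip(market_weights, scores, returns))
print(parts, total)

# The same additivity holds for the 2004 to 2023 alphas in Table 5, in basis points.
table5_recent = [14, -5, 5, -45]
print(sum(table5_recent))  # -31
```

Dividing each category by its own capitalization weight would turn the parts into stand-alone portfolios but would break the simple sum to the total.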

We give the four categories factor-specific names borrowed from current investment
management jargon associated with the range of exposures, for example Glamour (scores less than
minus one), Expensive (minus one to zero), Shallow Value (zero to one), and Deep Value (scores
greater than one) for the Value categories. The small -31 basis point Value portfolio alpha in Table
5 for the most recent 2004 to 2023 period is decomposed into similarly small category alphas of
14, -5, 5, and -45 basis points, indicating relative linearity across the Value exposure spectrum.
As previously documented using the Fama-Macbeth cross-sectional regressions in the first section,
nothing along the Value spectrum has significantly outperformed or underperformed the market
portfolio in the last 20 years. While one could create more categories (i.e., quintile or decile
portfolios) with equal capitalization in each, zero is the natural boundary between over- and under-
weighting securities compared to the market portfolio. Similarly, minus one is the boundary
between securities that are not only underweighted but shorted in the full-spectrum single-factor
portfolio.


Table 5: Fully Pure Value Active Return Categories

1964 to 1983           Total           Glamour         Expensive       Shallow Value   Deep Value
Alpha                  2.10%           0.35%           0.46%           0.37%           0.92%
Active Risk            2.92%           1.16%           0.61%           0.59%           1.68%
Information Ratio      0.721           0.298           0.752           0.628           0.549

1984 to 2003           Total           Glamour         Expensive       Shallow Value   Deep Value
Alpha                  2.51%           0.74%           0.58%           0.59%           0.60%
Active Risk            2.50%           1.65%           0.55%           0.60%           1.59%
Information Ratio      1.001           0.447           1.053           0.985           0.378

2004 to 2023           Total           Glamour         Expensive       Shallow Value   Deep Value
Alpha                 -0.31%           0.14%          -0.05%           0.05%          -0.45%
Active Risk            2.21%           1.55%           0.53%           0.51%           1.38%
Information Ratio     -0.140           0.089          -0.086           0.097          -0.327

Table 5 also reports active risks and Information Ratios for each category. Note that even
with return-to-exposure linearity, the alpha and active risk numbers are larger in magnitude in the
tail categories, for example Glamour and Deep Value in Table 5, because of greater factor
exposure. The average exposures would be -1.53, -0.46, 0.46, and 1.53, respectively, under a
perfectly normal cross-sectional distribution of characteristics (see footnote 5). Thus, a general rule of thumb is
a 3-to-1 (i.e., 1.53 compared to 0.46) magnitude in tail versus middle category alphas and active
risks because they have three times more exposure to the factor. For example, the Glamour
category active risk of 1.55 percent is about three times larger than the Expensive category active
risk of 0.53 percent. On the other hand, Information Ratios calculated by alpha divided by active
risk do have directly comparable magnitudes across the categories.
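
The average exposures quoted above follow from the mean of a truncated standard normal distribution. The sketch below is a check under the stated normality assumption, using only Python's standard library; the helper names `pdf`, `cdf`, and `truncated_mean` are ours.

```python
import math


def pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_mean(lo, hi):
    """Mean of a standard normal variable conditional on lo < z < hi."""
    return (pdf(lo) - pdf(hi)) / (cdf(hi) - cdf(lo))


middle = truncated_mean(0.0, 1.0)      # average score in the 0 to 1 category
tail = truncated_mean(1.0, math.inf)   # average score in the above 1 category
middle_share = cdf(1.0) - cdf(0.0)     # fraction of weight in a middle category
tail_share = 1.0 - cdf(1.0)            # fraction of weight in a tail category
```

Under exact normality the middle and tail averages come out at roughly 0.46 and 1.53, a ratio of a little over three, and the weight shares are close to the 35 percent and 15 percent figures given in footnote 5.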

5
The weighted cross-sectional characteristic distributions vary by factor and month, but have
approximately normal distributions, with total capitalization of about 15 percent in the tail
categories and 35 percent in the middle categories. The pure earnings yield and pure momentum
characteristics tend to be thin tailed compared to normal with about 10 percent and 40 percent in
the tail versus middle categories.

Table 6 reports Information Ratios from Table 5 for the Value factor along with category
Information Ratios for the other four factors in this study. The factor portfolios in Table 6 are
constructed from fully pure scores applied to Equation 8. Using the $1/\sqrt{20} = 0.224$ standard error
rule for 20-year annualized Information Ratios, adjacent category IRs have a less than (i.e., <) or
greater than (i.e., >) sign for a one standard error difference and double inequality signs for two or
more standard errors (i.e., a t-statistic of 2.0 or greater.) The most statistically significant non-
linearity in the older Value portfolio returns is that middle category IRs are better than the tail
category IRs, with three out of the four pairs significantly different. But none of the Value
categories in the most recent twenty years have large Information Ratios except for Deep Value at
-0.327. The Momentum results in Table 6 confirm the decline in performance in the most recent
20-year period found through regression analysis, with an Information Ratio of just 0.352
compared to previous Information Ratios of 1.165 and 0.593. Momentum performance in the first
twenty years (1964 to 1983) is stronger in the tails, particularly on the positive momentum side
with IRs of 0.492 and 0.975, a property that holds on the negative momentum side in the
subsequent 20-year period. In the most recent twenty-year period (2004 to 2023) the Information
Ratio difference between the Bear and Negative Momentum categories has the most significance.
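
The inequality-sign convention can be expressed as a small helper. This is a sketch of our reading of the rule: one sign when adjacent IRs differ by at least one standard error (0.224) and two signs at two standard errors or more. Whether the thresholds are inclusive is an assumption, and the function name `sign_between` is ours.

```python
import math

YEARS = 20
SE = 1.0 / math.sqrt(YEARS)  # about 0.224 for a 20-year annualized IR


def sign_between(ir_left, ir_right, se=SE):
    """Return '', '<', '>', '<<', or '>>' for two adjacent category IRs."""
    diff = ir_left - ir_right
    mark = ">" if diff > 0 else "<"
    if abs(diff) >= 2.0 * se:   # two or more standard errors
        return mark * 2
    if abs(diff) >= se:         # at least one standard error
        return mark
    return ""


# Momentum category IRs for 2004 to 2023, as reported in Table 6.
momentum = [0.525, 0.075, 0.407, 0.083]
signs = [sign_between(a, b) for a, b in zip(momentum, momentum[1:])]
```

Applied to the 2004 to 2023 Momentum row, the helper returns `>>`, `<`, and `>`, matching the signs printed in Table 6.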

Table 6 next reports on the pure Small Size factor, with the now popular Mega Cap moniker
associated with stocks that have exposure scores below negative one. The other three categories
contain stocks with larger, medium, and smaller capitalization within the largest 1000 stocks in
the U.S. market. What are commonly called small-cap stocks, for example the Russell 2000, are
not included in this study, so smallness is relative within our data set that approximates the Russell
1000. In the first twenty-year period (1964 to 1983) the Small Size factor under this definition
had good performance with an IR of 0.417. However, in the middle twenty years, the IR dropped
to 0.118 due to negative IRs in the middle categories of -0.256 and -0.201. Then in the most recent
20 years (2004 to 2023) shorting the Mega Cap category became quite problematic with an IR of
-0.133, while under-weighting Larger Cap stocks became profitable with an IR of 0.289. The
category results confirm the large cubed term IR of -0.332 in Table 3 and the complex shape of
the non-linear Small Size factor in Figure 3. The Small Size categories show a dramatic shift in
the pattern of non-linearity over the last twenty years compared to 1984 to 2003, with the Mega
Cap IR dropping from 0.249 to -0.133, while the Larger Cap IR rose from -0.256 to 0.289.


Table 6: Category Information Ratios in three 20-year Periods

Value Factor           Total    Glamour             Expensive           Shallow Value       Deep Value
IR from 1964 to 1983   0.721    0.298  <<           0.752               0.628               0.549
IR from 1984 to 2003   1.001    0.447  <<           1.053               0.985  >>           0.378
IR from 2004 to 2023  -0.140    0.089              -0.086               0.097  >           -0.327

Momentum Factor        Total    Bear                Negative Momentum   Positive Momentum   Bull
IR from 1964 to 1983   1.165    0.863  >            0.539               0.492  <<           0.975
IR from 1984 to 2003   0.593    0.625  >            0.231               0.339               0.397
IR from 2004 to 2023   0.352    0.525  >>           0.075  <            0.407  >            0.083

Small Size Factor      Total    Mega Cap            Larger Cap          Medium Cap          Smaller Cap
IR from 1964 to 1983   0.417    0.190               0.288               0.106  <            0.518
IR from 1984 to 2003   0.118    0.249  >>          -0.256              -0.201  <            0.202
IR from 2004 to 2023   0.033   -0.133  <            0.289               0.105               0.112

Low Beta Factor        Total    Top Beta            Higher Beta         Lower Beta          Bottom Beta
IR from 1964 to 1983   0.400    0.260               0.138               0.115  <            0.419
IR from 1984 to 2003   0.521    0.417               0.442               0.422               0.370
IR from 2004 to 2023   0.379    0.404  >            0.052  <            0.311               0.196

Profitability Factor   Total    No Margin           Lower Margin        Profitable          Highly Profitable
IR from 1964 to 1983   0.520    0.312               0.259               0.223  <            0.448
IR from 1984 to 2003   0.596    0.127  <            0.486  >>           0.037  <<           0.705
IR from 2004 to 2023   0.661    0.753  >>           0.164               0.027  <            0.466

Table 6 next reports on the Low Beta factor with the Top Beta category for standardized
exposures less than -1.0, then Higher Beta, Lower Beta, and Bottom Beta for exposures greater than
1.0. The range of realized total market betas goes from about 1.2 on the low end of the Low Beta
exposure range to about 0.8 on the high end of the exposure spectrum. The 36-month historical
beta characteristic range across securities is a bit wider, from about 0.6 to 1.4. In other words,
security betas exhibit the well-documented shrinkage towards one in realized versus predicted
values, popularly known as the Scholes and Williams (1977) correction. The strong IR
performance of the total Low Beta factor in the first two twenty-year periods contributed to
consternation in early testing of the classic CAPM. Translated into our terminology, the CAPM
prediction was that the IR for the Low Beta factor would be zero, but the IR of 0.521 from 1984
to 2003 is more than two standard errors above zero. The empirically observed pure Security
Market Line (SML) from 1964 to 1983 was not only too flat but inverted. The category IRs in
both the first and second periods indicate that the classic CAPM empirical failures were fairly
spread out across the market beta spectrum but larger in the Bottom Beta tail category from 1964
to 1983. However, in the most recent twenty-year period, the Higher Beta category IR dropped to
just 0.052, compared to 0.404 and 0.311 for the categories on either side.

Table 6 indicates that unlike Value, the Profitability factor has maintained strong
performance over time, with the total factor IR increasing from 0.520 in the first twenty years, to
0.596, and then 0.661 in the most recent 2004 to 2023 period. Except for the early 1964 to 1983
period, the most consistent observation of non-linearity for the Profitability factor is that IR
performance is concentrated in the tail categories, for example 0.753 and 0.466 in the most recent
twenty years, compared to 0.164 and 0.027 for the middle categories. This pattern of
non-linearity is consistent with the large negative IR of -0.723 for the squared Profitability
characteristic at the bottom of Table 3, and the downward parabolic shape for the non-linear
Profitability factor in Figure 3.

We next explore the impact of linear and non-linear factor portfolio purification using non-
pure, linear pure, and fully pure scores for each factor, where Table 6 was for fully pure scores.
Table 7 reports the total and category Information Ratios in the same format as Table 6, but only
for the last twenty years to save space. The first section of Table 7 reports on portfolios constructed
using simple scored characteristics with lower Information Ratios as reported in Clarke, de Silva,
and Thorley (2017). For example, the total Profitability factor IR in the 2004 to 2023 period drops
from 0.661 in Table 6 to 0.595 in Table 7. Almost all the total factor IRs are lower in the first
section of Table 7 because the non-neutralized factor portfolios are contaminated with exposures
to the other factors, increasing the active risk without an increase in alpha. In some cases, the IRs
are also lower because of declines in alpha due to non-neutralized exposure and thus larger realized
return correlation to other factors.


Table 7: Category Information Ratios by Score Purity from 2004 to 2023

Non-Pure Scores        Total    Short Sell          Under Weight        Over Weight         Large Overweight
Value                 -0.212    0.040              -0.077              -0.025  >           -0.414
Momentum               0.232    0.472  >>          -0.061               0.151               0.079
Small Size            -0.145   -0.122              -0.125  <            0.106  >           -0.178
Low Beta               0.224    0.309  >            0.012               0.213               0.005
Profitability          0.595    0.564               0.598  >            0.341               0.369
Multi-Factor           0.724

Linear Pure Scores     Total    Short Sell          Under Weight        Over Weight         Large Overweight
Value                 -0.091    0.105              -0.047               0.075  >           -0.307
Momentum               0.338    0.731  >>          -0.063  <<           0.502  >>           0.047
Small Size            -0.071   -0.137               0.112               0.022              -0.037
Low Beta               0.315    0.366  >            0.061  <            0.343  >            0.096
Profitability          0.605    0.577               0.512  >            0.192               0.381
Multi-Factor           0.777

Fully Pure Scores      Total    Short Sell          Under Weight        Over Weight         Large Overweight
Value                 -0.140    0.089              -0.086               0.097  >           -0.327
Momentum               0.352    0.525  >>           0.075  <            0.407  >            0.083
Small Size             0.033   -0.133  <            0.289               0.105               0.112
Low Beta               0.379    0.404  >            0.052  <            0.311               0.196
Profitability          0.661    0.753  >>           0.164               0.027  <            0.466
Multi-Factor           0.851

The multi-factor IR at the end of each total portfolio column in Table 7 is calculated by
Sharpe’s IR rule as given in Equation 2 and measures the combined impact of all five portfolios.
The Information Ratio for the optimal combined five-factor portfolio is only approximate given
some correlation between the realized factor returns. The multi-factor IR increases by 0.777 –
0.724 = 0.053 through neutralizing linear characteristic relationships between factors, with an
additional 0.851 – 0.777 = 0.074 increase by also neutralizing non-linear relationships. Table 7
indicates that most of the increase in the Momentum and Low Beta factor IRs comes from linear
purification, while most of the increase in the Profitability factor IR comes from the non-linear
purification.
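
As a rough check on the multi-factor rows, the sketch below combines the five total IRs in each section of Table 7. It assumes that Equation 2, which is not reproduced in this section, is the usual rule that squared Information Ratios add when active returns are uncorrelated; the function name `combined_ir` is ours, and the inputs are the rounded values printed in the table.

```python
import math


def combined_ir(irs):
    """Root-sum-of-squares combination of IRs, assuming uncorrelated active returns."""
    return math.sqrt(sum(ir * ir for ir in irs))


# Total IRs by factor (Value, Momentum, Small Size, Low Beta, Profitability),
# copied from the three sections of Table 7.
table7_total_irs = {
    "non-pure": [-0.212, 0.232, -0.145, 0.224, 0.595],
    "linear pure": [-0.091, 0.338, -0.071, 0.315, 0.605],
    "fully pure": [0.140 * -1, 0.352, 0.033, 0.379, 0.661],
}

combined = {name: combined_ir(irs) for name, irs in table7_total_irs.items()}
```

With these rounded inputs the non-pure and fully pure results agree with the reported 0.724 and 0.851 to within about 0.001. The linear pure inputs give roughly 0.770 rather than the reported 0.777, a gap that we cannot resolve from the table alone.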


The previously discussed concentration of Profitability factor performance in the tail
categories in the bottom section of Table 7 is less evident for linear pure scores in the middle
section, and almost non-existent for the non-pure score construction where the tail and middle
category IRs on both the left and right sides of the table are almost equal. The Small Size category
over the last twenty years also shows material changes, with the Under Weight category (called
Larger Cap in Table 6) dropping from 0.289 for fully pure scores to 0.112 for linear pure scores
and -0.125 for non-pure scores. Table 7 indicates that the return non-linearities for the Small Size
and Profitability factor portfolios are more evident after being neutralized with respect to both
linear and non-linear exposures to the other factors.

Other Applications of Non-linear Returns


The focus of this paper is on the average non-linear return-to-characteristic aspect of
popular equity market factors. We document that most factors have a statistically and
economically material non-linear structure by fitting returns within 20-year periods. The same ex-
post average return analysis was used by academic researchers in the past to establish linear factor
performance, but within sample fitting does not directly address the issue of return forecastability.
Fama and French drew attention to the linear Value and Small Size factors by examining realized
average stock returns over extended periods. The choice of the Book-to-Market and log
capitalization characteristics was not motivated by economic theory, but simply reflected the
general awareness of Value and Size being important properties of the U.S. equity market, for
example as shown in the well-known 3-by-3 Morningstar charts also introduced in the 1990s.

Four of the five now popular linear factors were explored by academic and practitioner
research after observing realized returns from the 1960s to the 1990s. Indeed, the only factor
characteristic motivated by equilibrium economic theory rather than being discovered ex-post was
market beta, and the classic CAPM prediction turned out to have little empirical backing.
Specifically, higher beta stocks were not found to have higher average returns, but rather about the
same return as lower beta stocks, a phenomenon now called the Low Beta anomaly. Similarly,
this study shows that most factors have non-linear returns across characteristics by examining 60
years of return data after the fact. The study does not directly address the issue of out-of-sample
return forecasting, although there is promise based on the strength and persistence of the non-linear
patterns. We have looked at monthly portfolio return forecasting using prior 10-year non-linear
return patterns with strong results compared to linear forecasting, as well as what is now called
“factor momentum” as discussed in Ehsani and Linnainmaa (2020) using prior 3-month
forecasting. But a report on direct non-linear return forecasting is outside of the scope of this
paper.

We also do not address performance attribution for multi-factor portfolio strategies in this
paper. Specifically, one could envision a process where the returns for a given month are explained
not just by the performance of linear Value, Profitability, and Size portfolios, but by returns to
category portfolios like Deep Value, Highly Profitable and Mega Cap. Indeed, market observers
frequently summarize daily or monthly market results with references to Mega Caps, which could
be modeled with a mega-cap dummy variable. For factors with material non-linear returns, more
of the realized performance of active strategies is naturally explained by four category returns
rather than one portfolio return. Performance attribution could also be enhanced by the inclusion
of squared and cubed characteristics, as in the monthly Fama-MacBeth cross-sectional regressions
in the first section. While non-linear performance attribution is outside the scope of this paper, we
briefly comment on the traditional academic objective of explaining the cross-section of realized
security returns using R-squared statistics.

Figure 16 plots the rolling 36-month R-squared for all 60 years in this study from Equation
1 with 5 right-hand side variables, as well as R-squared for the market plus 15 version of Equation
3 (see footnote 6). The linear and cubic R-squared plots are also matched by a plot of R-squared from a LOESS
regression on Equation 1, a methodology that allows for non-parametric forms for non-linear
relationships between security returns and characteristics (see footnote 7).

6
The direct formula for R-squared in observationally weighted cross-sectional regressions is
$1 - \sum_{i=1}^{N} w_i e_i^2 \,/\, \sum_{i=1}^{N} w_i r_i^2$, where $e_i$ are the residual returns and $w_i$ are the weights. Adjusted R-squared
is often employed in multivariate regressions to account for the number of independent variables.
As a practical matter the change in degrees of freedom using 15 rather than 5 regressors has little
impact because of the large sample size, N = 1000, with an Effective N due to observational weighting
of about 100 to 200 a month.
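
The weighted R-squared in this footnote is straightforward to compute. The sketch below uses hypothetical weights, returns, and residuals for three securities. The `effective_n` helper uses the common definition of one over the sum of squared weights, which is our assumption since the text does not state how Effective N is calculated.

```python
def weighted_r_squared(weights, returns, residuals):
    """1 - sum(w * e^2) / sum(w * r^2), the uncentered weighted R-squared."""
    sse = sum(w * e * e for w, e in zip(weights, residuals))
    sst = sum(w * r * r for w, r in zip(weights, returns))
    return 1.0 - sse / sst


def effective_n(weights):
    """Assumed definition: 1 / sum(w^2) for weights that sum to one."""
    return 1.0 / sum(w * w for w in weights)


# Hypothetical three-security example (not from the paper).
w = [0.5, 0.3, 0.2]
r = [0.02, -0.01, -0.025]     # security active returns
e = [0.01, -0.005, 0.0]       # regression residuals

r2 = weighted_r_squared(w, r, e)
n_eff = effective_n(w)
```

With equal weights of 1/N the assumed Effective N reduces to N, so concentration of capitalization in the largest stocks is what pulls it well below 1000.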
7
LOESS (locally estimated scatterplot smoothing) allows for a non-polynomial shape in the
regression curves that replace the $c_k$ in Equation 1. LOESS is computationally difficult to
implement in our weighted-observation large-sample context and less known in financial markets
research.

The cubic and LOESS approaches to modeling non-linear returns result in almost
equivalent increases in the 36-month rolling R-squared, as shown by the overlapping of cubic and
LOESS plots in Figure 16. Both non-linear return approaches increase the R-squared by about 6
percentage points over time, although the difference to linear R-squared has increased in recent
years to about 10 percentage points as shown on the right-hand side of Figure 16. Specifically,
the LOESS and cubic R-squared plots at year-end 2023 are at 29 and 26 percent,
respectively, while the linear R-squared plot ends at just 19 percent.

Figure 16: Monthly Security Return Cross-sectional R-squared (Rolling 36 months)
[Line chart: R-squared (vertical axis, 0% to 35%) by year (1963 to 2023) for the Cubic, LOESS, and Linear regressions.]

Much has been published in the academic literature about increases in explanatory power
by adding additional factor characteristics, going up to 30 or 40 percent in some studies. The
results in Figure 16 suggest that allowing for non-linearity in these five popular factors has a
substantially greater impact than adding one or even several less commonly used factors.
Similarly, commercial risk model providers like Axioma have begun to include squared and cubed
terms, specifically for the size characteristic.

Summary and Conclusions


The implications of non-linear factor portfolio returns for investors and portfolio managers
are significant and multifaceted. Portfolio performance is enhanced by weighting securities not
just by their linear factor characteristics, but by the non-linear patterns given by orthogonalized
squared and cubed characteristics in cross-sectional regressions. Table 3 documents large
Information Ratio increases in single-factor and multi-factor strategies that incorporate the patterns
of non-linear average returns visualized in Figures 3 and 5. For example, the already large alpha
of 1.79 percent over the last twenty years for the linear Profitability portfolio is increased by 66
basis points to 2.45 percent for a non-linear Profitability portfolio at the same level of active risk.
On the other hand, little can apparently be done about the poor-performing Value factor, except to
make sure investors construct Value portfolios that are pure and thus not as negatively correlated
with the high performing Profitability factor. An updated and much better performing version of
the Small Size factor based on non-linear patterns within the largest 1000 stocks underweights
larger stocks but not Mega Cap stocks, and overweights smaller but not the smallest stocks.

While more complicated than exploiting the linear version of well-known market
anomalies, methodologies for incorporating non-linear analysis in portfolio construction are
provided in the body and Technical Appendix of this study. Other researchers have used squared
and cubed characteristics to explore non-linearity in returns, but our more precise implications
come from the methodological innovation of employing squared and cubed scores that are
orthogonalized with respect to the linear stock characteristic. Non-linear portfolio strategies are
also informed by the factor characteristic categories used later in the paper, motivating dummy
variable or piece-wise linear regression analysis and portfolio construction. The simplest action
for quantitative analysts may be to specifically add cubed log market capitalization and squared
gross margin to their list of otherwise linear stock characteristics.

The motivating research question in this paper was whether stocks with a 2.0 scored characteristic
have on average twice the realized active return as stocks with a 1.0 scored characteristic, and whether
stocks with a scored exposure of -1.0 have a negative average active return of the same magnitude
as the stocks with a 1.0 score. For four of the five popular factors in this study, the answer for
pure factor portfolios is no. Beyond the implications for non-linear factor portfolio construction,
the results suggest that increasing the active weights in one part of the exposure spectrum does not
have the same impact as moving up the spectrum. With the Low Beta anomaly for example, the
change in portfolio return created by increasing the under-weights on stocks with slightly higher
betas does not have the same result as under-weighting stocks with much higher betas.
Underweighting stocks with active (i.e., differential to one) betas of 1.0 does not have the same
impact on a portfolio’s return as overweighting stocks with an active beta of -1.0. The underlying
source of active returns can be materially misstated in the Low Beta and other factor portfolios
by the assumed and often unstated linear extrapolation of factor exposures.

Separate from the basic finding of non-linear return-characteristic patterns for four popular
factors, this study shows that non-linear as well as linear purification of factor exposures is needed
because of highly significant non-linear pair-wise characteristic-to-characteristic patterns that
change gradually over time. Table 7, which focuses on the last twenty years, shows that the
incremental increase in multi-factor portfolio performance from purging non-linear relationships
is larger than the increase from purging simple linear correlation coefficient patterns. The increase
from non-linear purification is much more significant than from linear purification for the Profitability
factor.

Our empirical study of non-linear returns illustrates the need for benchmark-anchoring in
cross-sectional stock statistics like exposures, standard deviations, and correlations, as in the well-
known weighted return calculation for a portfolio. Benchmark anchoring motivates the use of
capitalization-weighted rather than equally weighted monthly Fama-MacBeth regressions.
Equally weighted regressions produce coefficients that are not optimal portfolios and whose returns
are active with respect to an equally weighted market-wide portfolio, not the market benchmark. Equally
weighted regression results are driven by the 80 percent of stock observations that only represent
about 20 percent of market capitalization.

We mention but do not fully explore the issues of non-linear return forecasting and
performance attribution. Commercial risk modelers have already introduced squared and cubed
characteristics for some factors, but non-linear performance attribution has only been heuristically
incorporated into media and analyst coverage of factor performance. For example, daily and
monthly characterizations of U.S. equity market performance often include comments about the
mega-cap space or the Magnificent Seven as separate explanatory factors without a linear
extrapolation to the rest of the size spectrum. Similarly, analysts and active portfolio managers
speak in terms of Deep Value or Glamour stock strategies that are distinct from a simple linear
extrapolation of the middle-range Value spectrum. Performance attribution could be significantly
improved as shown by the increases in R-squared plotted in Figure 16. Better market commentary
could come with an awareness of non-linear return-to-exposure patterns for popular factors.

With some hesitation about further populating the “factor zoo” we note that the only factors
identified and emphasized by academic researchers in the 1990s were those with a long track
record of linear performance. A vertically downward or upward parabolic average return stock
characteristic, for example, with strong non-linear long-term performance, would not have been
identified given the prevailing use of linear research methodologies. An important extension of
our research on U.S. stock returns will be return forecasting methods based on persistent non-
linear patterns within factor portfolio spectrums. Other questions require the extension of non-
linear factor return analysis outlined in this paper to international equity markets. The CRSP and
Compustat databases only include about half the global public equity market. We anticipate that
other researchers will apply our methodologies including weighted Fama-MacBeth regressions
with orthogonalized squared and cubed characteristics to European and Asian markets. Is a non-
linear pure Value factor still “alive” in non-U.S. markets? Do the high performing pure
Profitability and Low Beta factors have the same non-linear properties in global markets?



Technical Appendix

Benchmark Anchoring
This paper uses the “benchmark anchoring” concept that cross-sectional references to the
market or other portfolios should be weighted by the size of the constituent securities within the
portfolio. Most investors understand that the portfolio return is a capitalization-weighted average
of the security returns but may not employ weighted statistics in other contexts. Another example
of the need for weighted statistics in portfolio theory is that the equally weighted average market
beta (which includes second moment and correlation statistics) across stocks is only approximately
one. The capitalization-weighted beta across the stocks that comprise the market portfolio is
exactly one.

We weight all cross-sectional statistics, means, standard deviations, and correlations, by
beginning of period security capitalization to account for the importance or contribution of each
security in the portfolio. We do not supplement the empirical analysis in this paper with equally
weighted regressions, scores, or other cross-sectional statistics under the perspective that such
results are dominated by stocks that have little capitalization impact on the factor portfolios and
thus detract from rather than enhance perspectives on performance.

Scores and Weights


In this paper, total portfolio weights are related to security exposure scores, $s_i$, by

$$ w_{TPi} = w_{Mi}\,(1 + s_i) \tag{A1} $$

where $w_{Mi}$ is the market weight of the security. The scores are benchmark-anchored with a
capitalization-weighted mean of zero and a capitalization-weighted variance of one. Alternatively,
the weights for a pure (i.e., zero exposure to other factors) portfolio in Equation A1 can be used to
infer a set of pure factor scores. Pure scores can be calculated directly from K sets of raw exposures
by the matrix equation

$$ S = B\,[B'(W_M B)]^{-1} \tag{A2} $$

where S is an N-by-K matrix with K columns of pure factor scores, and B is an N-by-K matrix of
raw measures (e.g., earnings yield) for factor exposures with their weighted-average means
subtracted.
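
The matrix equation for pure scores can be illustrated for two factors without a linear algebra library. The sketch below reads the matrix equation as B times the inverse of B-transpose times W_M times B, with W_M taken to be the diagonal matrix of market weights; both readings are assumptions on our part, since the transpose and the definition of W_M are not explicit in this section. The four securities and their raw characteristics are hypothetical.

```python
def demean(column, weights):
    """Subtract the capitalization-weighted mean from a raw characteristic."""
    mean = sum(w * x for w, x in zip(weights, column))
    return [x - mean for x in column]


def pure_scores_two_factors(raw1, raw2, weights):
    """Columns of S = B (B' W B)^-1 for K = 2, with W = diag(weights)."""
    b1, b2 = demean(raw1, weights), demean(raw2, weights)
    # Entries of the symmetric 2-by-2 matrix B' W B.
    a11 = sum(w * x * x for w, x in zip(weights, b1))
    a12 = sum(w * x * y for w, x, y in zip(weights, b1, b2))
    a22 = sum(w * y * y for w, y in zip(weights, b2))
    det = a11 * a22 - a12 * a12
    # The inverse of [[a11, a12], [a12, a22]] is [[a22, -a12], [-a12, a11]] / det.
    s1 = [(x * a22 - y * a12) / det for x, y in zip(b1, b2)]
    s2 = [(-x * a12 + y * a11) / det for x, y in zip(b1, b2)]
    return b1, b2, s1, s2


# Hypothetical data for four securities (not from the paper).
weights = [0.4, 0.3, 0.2, 0.1]
raw1 = [0.05, 0.08, 0.02, 0.10]   # e.g., an earnings-yield-like measure
raw2 = [1.2, 0.9, 1.1, 0.7]       # e.g., a beta-like measure

b1, b2, s1, s2 = pure_scores_two_factors(raw1, raw2, weights)
```

Under this reading each column of pure scores has a weighted exposure of one to its own raw characteristic and zero to the other, which is the sense in which the resulting portfolio is pure.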

Linear in Exposure
Our main empirical research question is where along the spectrum of exposures the average
active return to well-known factors originates. While we and others have examined this question
for the low beta and price momentum anomalies, little has been done and published with respect
to the linearity of the other factors using capitalization-weighting and pure factor methodologies.
The term “non-linear” can have a lot of meanings, but here we specifically ask whether a change
in score from say -1.0 to 0.0, has the same impact on average active return as a change in score
from say 0.0 to +1.0.

We are not examining the “long side” and “short side” of the factor exposure spectrum.
The methodology is primarily long-only since deviations in security weights are relative to
benchmark weights. The basic weight formula in Equation A1 over-weights and under-weights
securities compared to the market benchmark. Using Equation A1, the active (difference to
benchmark) security weight, $w_{Pi}$, is the simple product,

$$ w_{Pi} = w_{Mi}\, s_i . \tag{A3} $$

The non-linear portfolio construction equation analogous to Equation A3 is

$$ w_{Pi} = w_{Mi}\, f(s_i) \tag{A4} $$

where $f(s_i)$ is some non-linear function of $s_i$ with the property that $\sum_i w_{Mi}\, f(s_i) = 0$. Few total
security weights in combined multi-factor linear or non-linear portfolios are negative at reasonable
levels of active risk (e.g., 3 percent) suggesting that short selling is rare in actual application.
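
The weight formulas can be checked numerically. In the sketch below the four securities are hypothetical, and the non-linear function f(s) = s^2 - 1 is chosen only because it satisfies the zero weighted-sum property when scores have a weighted variance of one; it is not the orthogonalized construction used in the paper.

```python
import math


def standardize(raw, market_weights):
    """Scores with capitalization-weighted mean zero and variance one."""
    mean = sum(w * x for w, x in zip(market_weights, raw))
    var = sum(w * (x - mean) ** 2 for w, x in zip(market_weights, raw))
    return [(x - mean) / math.sqrt(var) for x in raw]


def total_weights(market_weights, scores):
    """Equation A1: market weight times one plus score."""
    return [w * (1.0 + s) for w, s in zip(market_weights, scores)]


def active_weights(market_weights, scores, f=lambda s: s):
    """Equation A3 when f is the identity, Equation A4 otherwise."""
    return [w * f(s) for w, s in zip(market_weights, scores)]


# Hypothetical market weights and raw characteristic values.
w_m = [0.40, 0.30, 0.20, 0.10]
raw = [0.5, 0.2, -0.3, -2.5]

s = standardize(raw, w_m)
linear_active = active_weights(w_m, s)
nonlinear_active = active_weights(w_m, s, f=lambda x: x * x - 1.0)  # illustrative f only
w_total = total_weights(w_m, s)
```

Both sets of active weights sum to zero and the total weights sum to one. Only the last security, whose score is below -1.0, ends up with a negative total weight, which is the short-sell case.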

The realized active (i.e., market differential) factor portfolio return is the sum-product,

$$ r_P = \sum_{i=1}^{N} w_{Mi}\, s_i\, r_i \tag{A5} $$

where the $r_i$ are security active returns. We decompose the active return for a given portfolio in
Equation A5 into categories using the conditional summation,

$$ r_{\text{Large under-weight}} = \sum_{s_i < -1} w_{Mi}\, s_i\, r_i \tag{A6} $$

and three similar equations where the condition is for scores in the -1 to 0, 0 to 1, and greater than
1 range. The ranges are generically labeled the large under-weight, under-weight, over-weight,
and large over-weight categories.

Note that the scores in the large underweight category shown in Equation A6 are less than
-1.0, so they represent the short sells. Thus, the total factor portfolio in Equation A5 is often called
a 120/20 long-short portfolio, where the amount of shorting varies from almost zero to as high as
40 percent. If the security active returns, ri , in the large underweight category are on average
negative, then the category return in Equation A6 will be positive, generating a positive
contribution to the total portfolio active return.

Optimality of Simple Factor Portfolios


The market-benchmark plus one factor model for security returns is,

$$ r_i = r_M + \alpha_i + \varepsilon_i \tag{A7} $$

where $r_M$ is the market portfolio return, $\alpha_i$ is the security-specific “alpha” component of returns,
and $\varepsilon_i$ is the residual return. We qualify the term alpha because return models typically have a
security-specific market-beta multiplier in front of the market return. In expectation, the residuals
are uncorrelated with each other and with the market return, distributed with zero mean and a
security-specific (i.e., heterogeneous) variance of $\sigma_i^2$. Internal consistency for the forecasted values
for the alpha component dictates that they also have a market-weighted sum of zero,

$$ \sum_{i=1}^{N} w_{Mi}\, \alpha_i = 0 . \tag{A8} $$

Technically speaking, Equation A8 is an ex-ante condition on the rationality of the security return
forecasts, separate from the ex-post concept that the realized value of the residuals has a cross-
sectional weighted sum of zero.

Under the model for security returns in Equation A7, the Markowitz (1952) mean-variance
optimal active (i.e., market differential) weight on each security is,



$$ w_{Pi} = \frac{w_{Mi}\, \alpha_i}{\sigma_i^2} \; \frac{\sigma_A}{\sqrt{\sum_{i=1}^{N} \dfrac{w_{Mi}^2\, \alpha_i^2}{\sigma_i^2}}} \tag{A9} $$
where the non-subscripted parameter $\sigma_A$ is the targeted level of active portfolio risk. Equation A9
is a simple algebraic expression of the general multi-factor active-space mean-variance optimal
portfolio, expressed as $w \propto \Omega^{-1} \mu$ under matrix notation. Using the Grinold (1989) process for
calculating forecasted security returns, $\alpha_i = \mathrm{IC}\, s_i\, \sigma_i$, the optimal portfolio weights become,

$$ w_{Pi} = \frac{w_{Mi}\, s_i\, \sigma_A}{\sigma_i \sqrt{\sum_{i=1}^{N} w_{Mi}^2\, s_i^2}} \tag{A10} $$
and the expected portfolio return (i.e., sum product of expected security returns and weights) is
$$ \alpha_P = \mathrm{IC}\, \sqrt{\mathrm{BR}}\; \sigma_A . \tag{A11} $$
Equation A11 is known as the fundamental law of active portfolio management, where the square
root of point-in-time breadth (BR) is one over the radical in Equation A10. Under the assumption of
homogeneous idiosyncratic risk, $\sigma_i = \sigma$, the optimal security weights in Equation A10 are the product of
market weights times scores, multiplied by a constant,
$$ w_{Pi} = w_{Mi}\, s_i \left( \frac{\sqrt{\mathrm{BR}}\; \sigma_A}{\sigma} \right) . \tag{A12} $$
The conditions for the mean-variance optimality of the factor portfolio defined by Equation A3
are 1) a market-plus-one-factor security return generating process, 2) homogeneous idiosyncratic
risk, and 3) an unspecified level of active risk.
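
The relationships among the optimal weights, the expected portfolio return, and the homogeneous-risk special case can be checked numerically. The sketch below is only an illustration under stated assumptions: the four-stock inputs, the IC and active-risk values, and the helper names are hypothetical, breadth is taken as one over the sum under the radical, and active risk is computed assuming uncorrelated residuals as in the return model above.

```python
import math


def standardize(x, w_m):
    """Return scores with market-weighted mean 0 and variance 1."""
    mean = sum(w * v for w, v in zip(w_m, x))
    var = sum(w * (v - mean) ** 2 for w, v in zip(w_m, x))
    return [(v - mean) / math.sqrt(var) for v in x]


def optimal_active_weights(w_m, s, sigma, sigma_a):
    """Active weights w_Mi * s_i * sigma_A / (sigma_i * radical)."""
    radical = math.sqrt(sum((w * v) ** 2 for w, v in zip(w_m, s)))
    return [w * v * sigma_a / (sg * radical) for w, v, sg in zip(w_m, s, sigma)]


# Hypothetical inputs (not data from the paper).
w_m = [0.4, 0.3, 0.2, 0.1]                     # market weights
s = standardize([1.2, -0.3, 0.8, -2.0], w_m)   # scored characteristic
sigma = [0.20, 0.25, 0.30, 0.40]               # idiosyncratic risks
ic, sigma_a = 0.05, 0.03                       # assumed IC and active risk target

w_p = optimal_active_weights(w_m, s, sigma, sigma_a)
alpha = [ic * v * sg for v, sg in zip(s, sigma)]            # alpha_i = IC * s_i * sigma_i
expected_return = sum(a * w for a, w in zip(alpha, w_p))
breadth = 1.0 / sum((w * v) ** 2 for w, v in zip(w_m, s))   # assumed definition of BR
active_risk = math.sqrt(sum((w * sg) ** 2 for w, sg in zip(w_p, sigma)))

# Homogeneous risk: weights collapse to a constant times w_Mi * s_i.
w_p_homog = optimal_active_weights(w_m, s, [0.25] * 4, sigma_a)
ratios = [wp / (w * v) for wp, w, v in zip(w_p_homog, w_m, s)]
```

With these inputs, `expected_return` agrees with `ic * math.sqrt(breadth) * sigma_a`, `active_risk` equals the target `sigma_a`, and every entry of `ratios` is the same constant.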

Orthogonalized Squared and Cubed Regressors


The first six capitalization-weighted ($w_{Mi}$) central moments of the cross-sectional
distribution of a point-in-time scored stock characteristic ($s_i$) are:

1) Mean             $\mathrm{Mean} = \sum_{i=1}^{N} w_{Mi}\, s_i$      (standardized as 0.0)
2) Variance         $\mathrm{Var} = \sum_{i=1}^{N} w_{Mi}\, s_i^2$     (standardized as 1.0)
3) Skewness         $\mathrm{Skew} = \sum_{i=1}^{N} w_{Mi}\, s_i^3$
4) Kurtosis         $\mathrm{Kurt} = \sum_{i=1}^{N} w_{Mi}\, s_i^4$
5) Hyper Skewness   $\mathrm{Hyper} = \sum_{i=1}^{N} w_{Mi}\, s_i^5$
6) Tailedness       $\mathrm{Tail} = \sum_{i=1}^{N} w_{Mi}\, s_i^6$

We first derive a transformation of squared score that is orthogonal to linear score and has weighted
zero mean and unit variance. With some algebra it can be shown that
$$ s2_i = \frac{s_i^2 - \mathrm{Skew}\, s_i - 1}{\left( \mathrm{Kurt} - \mathrm{Skew}^2 - 1 \right)^{1/2}} \tag{A13} $$
has these properties. For example, for a perfectly normal distribution with Skew = 0 and Kurt = 3,
Equation A13 simplifies to
$$ s2_i = \frac{1}{\sqrt{2}} \left( s_i^2 - 1 \right) . \tag{A14} $$
Equation A13 has a zero mean because the first term in the numerator goes to 1 when weighted,
the second term goes to zero when weighted, and the third term is -1. Similarly, the denominator
is constructed to ensure unit variance of $s2_i$.
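
The stated properties of the squared transformation can be verified on simulated data. The following is a minimal sketch, assuming randomly generated market weights and a deliberately skewed raw characteristic; the helper names and the simulated inputs are illustrative and not taken from the paper.

```python
import math
import random


def wmoment(w_m, s, k):
    """Capitalization-weighted k-th moment of the scores."""
    return sum(w * v ** k for w, v in zip(w_m, s))


def standardize(x, w_m):
    """Return scores with market-weighted mean 0 and variance 1."""
    mean = sum(w * v for w, v in zip(w_m, x))
    var = sum(w * (v - mean) ** 2 for w, v in zip(w_m, x))
    return [(v - mean) / math.sqrt(var) for v in x]


def squared_transform(w_m, s):
    """Squared score made orthogonal to the linear score (Equation A13)."""
    skew = wmoment(w_m, s, 3)
    kurt = wmoment(w_m, s, 4)
    scale = math.sqrt(kurt - skew ** 2 - 1.0)
    return [(v * v - skew * v - 1.0) / scale for v in s]


# Simulated universe (hypothetical, for illustration only).
random.seed(0)
n = 200
caps = [random.lognormvariate(0.0, 1.0) for _ in range(n)]
w_m = [c / sum(caps) for c in caps]
s = standardize([random.expovariate(1.0) for _ in range(n)], w_m)

s2 = squared_transform(w_m, s)
mean_s2 = wmoment(w_m, s2, 1)                             # should be ~0
var_s2 = wmoment(w_m, s2, 2)                              # should be ~1
cross = sum(w * a * b for w, a, b in zip(w_m, s, s2))     # should be ~0
```

Up to floating-point error, `mean_s2` and `cross` are zero and `var_s2` is one, even though the simulated scores are far from normal.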

The cubed score transformation that is orthogonal to just the linear score is a bit more
complicated to solve algebraically but can be shown to be
$$ s3_i = \frac{s_i^3 - \mathrm{Kurt}\, s_i - \mathrm{Skew}}{\left( \mathrm{Tail} - \mathrm{Kurt}^2 - \mathrm{Skew}^2 \right)^{1/2}} . \tag{A15} $$

The cubed score transformation that is jointly orthogonal to both the linear score and the squared
transformation $s2_i$ involves substantial algebra, but can be solved as
$$ s3_i = \frac{s_i^3 - \mathrm{Kurt}\, s_i - \mathrm{Skew} - \gamma \left( s_i^2 - \mathrm{Skew}\, s_i - 1 \right)}{\left( \mathrm{Tail} - \mathrm{Kurt}^2 - \mathrm{Skew}^2 - \gamma^2 \left( \mathrm{Kurt} - \mathrm{Skew}^2 - 1 \right) \right)^{1/2}} \tag{A16} $$
where
$$ \gamma = \frac{\mathrm{Hyper} - \mathrm{Skew} \left( \mathrm{Kurt} + 1 \right)}{\mathrm{Kurt} - \mathrm{Skew}^2 - 1} . \tag{A17} $$
For example, for a normal distribution with Skew = 0, Kurt = 3, Hyper = 0, and Tail = 15,
Equations A15 and A16 both simplify to
$$ s3_i = \frac{1}{\sqrt{6}} \left( s_i^3 - 3\, s_i \right) . \tag{A18} $$

In other words, deviations from a normal cross-sectional distribution of score create much of the
complexity of the orthogonal squared and cubed transformations in Equations A13 and A16
compared to Equations A14 and A18.
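
A numerical check of the joint orthogonalization can be sketched as follows. The projection coefficient is computed here as (Hyper - Skew (Kurt + 1)) / (Kurt - Skew^2 - 1), which is what the weighted inner product of the cubed numerator with the squared numerator, divided by the weighted variance of the squared numerator, works out to. The simulated weights and scores, the variable name `gamma`, and the helper functions are assumptions made for the illustration.

```python
import math
import random


def wmoment(w_m, s, k):
    """Capitalization-weighted k-th moment of the scores."""
    return sum(w * v ** k for w, v in zip(w_m, s))


def standardize(x, w_m):
    """Return scores with market-weighted mean 0 and variance 1."""
    mean = sum(w * v for w, v in zip(w_m, x))
    var = sum(w * (v - mean) ** 2 for w, v in zip(w_m, x))
    return [(v - mean) / math.sqrt(var) for v in x]


def orthogonal_transforms(w_m, s):
    """Return (s2, s3): squared and cubed scores orthogonal to s and to each other."""
    skew, kurt, hyper, tail = (wmoment(w_m, s, k) for k in (3, 4, 5, 6))
    u_var = kurt - skew ** 2 - 1.0                     # variance of squared numerator
    gamma = (hyper - skew * (kurt + 1.0)) / u_var      # projection coefficient
    s3_var = tail - kurt ** 2 - skew ** 2 - gamma ** 2 * u_var
    s2 = [(v * v - skew * v - 1.0) / math.sqrt(u_var) for v in s]
    s3 = [(v ** 3 - kurt * v - skew - gamma * (v * v - skew * v - 1.0))
          / math.sqrt(s3_var) for v in s]
    return s2, s3


# Simulated universe (hypothetical, for illustration only).
random.seed(1)
n = 300
caps = [random.lognormvariate(0.0, 1.0) for _ in range(n)]
w_m = [c / sum(caps) for c in caps]
s = standardize([random.expovariate(1.0) for _ in range(n)], w_m)

s2, s3 = orthogonal_transforms(w_m, s)
basis = [[1.0] * n, s, s2, s3]
# Weighted Gram matrix of (constant, s, s2, s3); identity if all are orthonormal.
gram = [[sum(w * a * b for w, a, b in zip(w_m, x, y)) for y in basis] for x in basis]
```

The weighted Gram matrix comes out as the identity up to rounding, so each regressor has zero mean, unit variance, and zero weighted correlation with the others.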

As a numerical example of the orthogonalization calculations, moments for the five
characteristics on the largest 1000 U.S. stocks at the end of June 2023 are given in Table A1 below.
Minus the log of market capitalization, or Small Size, is the closest to a normal distribution, with
skewness of just 0.07 and kurtosis of 2.21, not far from the normal value of 3.00. On the other hand,
Momentum is positively skewed at 2.02, while both Value and Momentum are leptokurtic with
kurtosis of 8.49 and 9.04, respectively.

Table A1: Mid-year 2023 Characteristic Distributions

               Value   Momentum   Small Size   Low Beta   Profitability
Mean            0.00       0.00         0.00       0.00            0.00
Variance        1.00       1.00         1.00       1.00            1.00
Skewness        0.94       2.02         0.07      -0.71           -0.07
Kurtosis        8.49       9.04         2.21       4.04            2.45
Hyper Skew     16.17      33.52         0.63      -7.52            0.23
Tailedness    135.48     138.39         6.21      26.87            8.81

Taking the Value central moments in Table A1 as an example, the weighted cross-sectional
correlation between the score and squared score is 0.343. The correlation between the score and
cubed score is much higher at 0.732, and the correlation between squared and cubed scores is
0.480. Alternatively, the three weighted correlations using the squared and cubic transformations
in Equations 1 and 4 are exactly zero. Thus, the slope coefficients in a weighted multivariate



regression of stock returns on linear score, $s2_i$, and $s3_i$ will be identical to the coefficients in
separate univariate regressions. Because the cross-sectional regressions are capitalization
weighted, the intercept term is identical to the market portfolio return.
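
The three Value correlations quoted above follow directly from the Table A1 moments, because the scores are standardized to zero mean and unit variance. The sketch below reproduces them; the dictionary layout and function name are illustrative choices rather than anything specified in the paper.

```python
import math

# Value column of Table A1 (mean 0 and variance 1 by construction).
value = {"skew": 0.94, "kurt": 8.49, "hyper": 16.17, "tail": 135.48}


def raw_power_correlations(m):
    """Weighted correlations among s, s^2, and s^3 from standardized moments."""
    sd_s2 = math.sqrt(m["kurt"] - 1.0)              # mean of s^2 is 1
    sd_s3 = math.sqrt(m["tail"] - m["skew"] ** 2)   # mean of s^3 is Skew
    corr_s_s2 = m["skew"] / sd_s2                   # cov(s, s^2) = Skew
    corr_s_s3 = m["kurt"] / sd_s3                   # cov(s, s^3) = Kurt
    corr_s2_s3 = (m["hyper"] - m["skew"]) / (sd_s2 * sd_s3)  # cov = Hyper - Skew
    return corr_s_s2, corr_s_s3, corr_s2_s3


c12, c13, c23 = raw_power_correlations(value)
print(f"{c12:.3f} {c13:.3f} {c23:.3f}")  # prints "0.343 0.732 0.480"
```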



References

Ang, Andrew, Robert Hodrick, Yuhang Xing, and Xiaoyan Zhang. 2006. “The Cross-Section of
Volatility and Expected Returns.” Journal of Finance 61 (1): 259–99.

Arnott, Robert, Vitali Kalesnik, and Juhani Linnainmaa. 2023. “Factor Momentum.” The Review
of Financial Studies 36 (8): 3034–3070.

Asness, Clifford, Andrea Frazzini, Ronen Israel, Tobias Moskowitz, and Lasse Pedersen. 2018.
“Size Matters, If You Control Your Junk.” Journal of Financial Economics 129 (3): 479-509.

De Boer, Sanne. 2020. “Nonlinear Factor Attribution.” Journal of Investment Consulting 20 (1):
21-29.

Barroso, Pedro, and Pedro Santa-Clara. 2015. “Momentum Has Its Moments.” Journal of
Financial Economics 116 (1): 111–20.

Bollerslev, Tim, Andrew Patton, and Rogier Quaedvlieg. 2023. “Granular Betas and Risk Premium
Functions.” NBER and European Central Bank working paper.

Carhart, Mark. 1997. “On Persistence in Mutual Fund Performance.” Journal of Finance 52 (1):
57-82.

Cederburg, Scott, Michael O’Doherty, Feifei Wang, and Xuemin Yan. 2019. “On the Performance
of Volatility Managed Portfolios.” Journal of Financial Economics, forthcoming.

Clarke, Roger, Harindra de Silva, and Steven Thorley. 2017. “Pure Factor Portfolios and
Multivariate Regression Analysis.” Journal of Portfolio Management 43 (3): 16-31.

Clarke, Roger, Harindra de Silva, and Steven Thorley. 2020. “Risk Management and Optimal
Combination of Equity Market Factors.” Financial Analysts Journal 76 (3): 57-79.

Didisheim, Antoine, Shikun Ke, Bryan Kelly, and Semyon Malamud. 2024. “Complexity in
Factor Pricing Models.” Swiss Finance Institute Research Paper No. 23-19.

Ehsani, Sina, and Juhani Linnainmaa. 2020. “Factor Momentum and the Momentum Factor.”
Journal of Finance forthcoming.

Fama, Eugene, and Kenneth French. 1992. “The Cross-Section of Expected Stock Returns.”
Journal of Finance 47 (2): 427–65.

Fama, Eugene, and Kenneth French. 1993. “Common Risk Factors in the Returns on Stocks and
Bonds.” Journal of Financial Economics 33 (1): 3-56.

Fama, Eugene, and James MacBeth. 1973. “Risk, Return, and Equilibrium: Empirical Tests.”
Journal of Political Economy 81 (3): 607-636.



Feng, Guanhao, Stefano Giglio, and Dacheng Xiu. 2020. “Taming the Factor Zoo: A Test of New
Factors.” Journal of Finance 75 (3): 1327-1370.

Frazzini, Andrea, and Lasse Pedersen. 2014. “Betting Against Beta.” Journal of Financial
Economics 111 (1): 1-25.

Grinold, Richard. 1989. “The Fundamental Law of Active Management.” Journal of Portfolio
Management 15 (3): 30–37.

Hsu, Jason, Vitali Kalesnik, and Engin Kose. 2019. “What is Quality?” Financial Analysts
Journal, 75 (2): 44-61.

Israel, Ronen, Kristoffer Laursen, and Scott Richardson. 2021. “Is Systematic Value Investing
Dead?” Journal of Portfolio Management 47 (2): 38-62.

Jegadeesh, Narasimhan. 1990. “Evidence of Predictable Behavior of Security Returns.” Journal
of Finance 45 (3): 881–898.

Jegadeesh, Narasimhan, and Sheridan Titman. 1993. “Returns to Buying Winners and Selling
Losers: Implications for Stock Market Efficiency.” Journal of Finance 48 (1): 65–91.

Jensen, Michael C., Fischer Black, and Myron Scholes. 1972. “The Capital Asset Pricing Model:
Some Empirical Tests.” Studies in the Theory of Capital Markets, edited by Michael C.
Jensen, 79–121.

Kagkadis, Anastasios, Harald Lohre, Ingmar Nolte, Sandra Nolte (Lechner), and Nikolaos Vasilas.
2023. “Power Sorting.” SSRN working paper, August.

Novy-Marx, Robert. 2013. “The Other Side of Value: The Gross Profitability Premium.” Journal
of Financial Economics 108 (1): 1-28.

Markowitz, Harry. 1952. “Portfolio Selection.” Journal of Finance 7 (1): 77–91.

Moskowitz, Tobias, and Mark Grinblatt. 1999. “Do Industries Explain Momentum?” Journal of
Finance 54 (4): 1249-1290.

Scholes, Myron, and Joseph Williams. 1977. “Estimating betas from nonsynchronous data.”
Journal of Financial Economics 5 (3): 309-327.

Shleifer, Andrei, and Robert Vishny. 1997. “The Limits of Arbitrage.” Journal of Finance 52
(1): 35-55.

Sharpe, William. 1964. “Capital Asset Prices: A Theory of Market Equilibrium under Conditions
of Risk.” Journal of Finance 19 (3): 425–42.

Treynor, Jack, and Fischer Black. 1973. “How to Use Security Analysis to Improve Portfolio
Selection.” Journal of Business 46 (1): 66–86.



Zhang, Shao. 2022. “Factor Construction Zoo: Are Factor Exposures Created Equal?” Journal of
Portfolio Management 48 (2): 105-118.
