Robeco's One-Legged Vol Factor
just ignore high vol
Two months ago, Robeco’s Amar Soebhag, Guido Baltussen, and Pim van Vliet posted a new empirical paper, Factoring in the Low-Volatility Factor. I consider Pim a good friend, and he is one of the original low-vol portfolio managers, having started his conservative fund at Robeco around 2006 (the others were Analytic Investors, Acadian, and Unigestion). He says he was introduced to the low-vol factor via Bob Haugen’s 1996 paper, which presented the low-vol factor premium among dozens of other factors. I ran into Haugen in the early 2000s while at the Deephaven hedge fund, and he was selling a factor model that included low vol buried among 50 other factors. Pim deserves all the credit for seeing the importance of low vol in data that even its author did not appreciate.
It’s fun to see low vol used so prominently now, not merely as an investment class, but as a factor. I wrote my dissertation on the low-vol effect in 1994. I did not get any job offers because while I wrote about finance, specifically asset pricing, none of Northwestern’s asset pricing professors were promoting me (including Bob Hodrick, who would later go on to co-author one of the seminal papers on the low-vol effect in 2006). I submitted an empirical paper to the Journal of Finance that created a low-vol factor and showed how it rejected the standard capital asset pricing model (CAPM). The reviewer, Bill Schwert, angrily told me to stop wasting his and my time, as I had no model that would explain my results as a measure of risk: asset return factors had to come from risk aversion, and certainly could not disprove risk aversion.
I figured it was their loss, and would pitch people in New York whenever I visited there for risk conferences while working as a risk manager at KeyBank in Cleveland. Practitioners objected to the lack of support from other prominent asset pricing academics for my findings. You need at least one prominent person supporting your Big New Idea, as I am experiencing with my solution to the AMM LP unprofitability.
In the 90s, there was a big debate about whether these factors reflect risk (Fama and French 1993) or are just characteristics that reflect investor biases (Lakonishok, Shleifer, and Vishny 1994; Daniel and Titman 1997). The debate died without anyone claiming victory or admitting defeat, but clearly the characteristics interpretation won, because when anyone sees a factor premium, they think it means one should go long it. If it were a risk, it would reflect an arbitrary preference, with half of investors gladly forgoing the extra return to avoid the risk.
The conventional factor is proxied by a long-short portfolio that is long stocks in the top decile and short those in the bottom decile when sorted by some characteristic. Thus, Fama and French’s original factors were size, where SMB is the ‘small minus big’ market-cap portfolio, and value, where HML is the ‘high minus low’ book-market ratio portfolio. Factors are constructed so that the average factor-mimicking portfolio return is positive.
The following factors dominate current factor models. They are generated by sorting by the following variables, and creating portfolio proxy returns that capture these factors:
Market: no sorting, just the value-weighted market return
Size: market cap
Value: book-market ratio
Profitability: Cash-flow profits, accrual profits, or net income/book value
Growth: change in total assets divided by total assets, share issuance
Momentum: past 12-month return (lagged a month)
For each metric, you sort companies each month from high to low and create a long-short portfolio. Often, one controls for size by performing these sorts within market capitalization ranges to avoid the mistake of finding a factor that merely highlights the prominent size factor in a different form. The initial empirical results (Black, Jensen, and Scholes 1972, Fama and MacBeth 1973) that confirmed the capital asset pricing model taught in standard MBA classes were solely due to the omitted variable size: beta only worked because it picked up the size factor. When Fama and French (1992) first cross-tabbed the data by size and then sorted by beta, there was no correlation between beta and future returns.
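To make the sorting step concrete, here is a minimal sketch for a single month. The function name, the equal weighting, and the toy 20-stock universe are illustrative assumptions, not anything from the papers cited; real implementations typically value-weight and sort within size buckets, which this sketch leaves out.

```python
from statistics import mean

def long_short_return(characteristic, next_return, n_groups=10):
    """Equal-weighted return of the top group minus the bottom group.

    characteristic: dict ticker -> sorting variable known at formation date
    next_return:    dict ticker -> return over the following month
    """
    # Rank tickers from high to low on the characteristic.
    ranked = sorted(characteristic, key=characteristic.get, reverse=True)
    size = len(ranked) // n_groups  # number of stocks per group
    top, bottom = ranked[:size], ranked[-size:]
    long_leg = mean(next_return[t] for t in top)
    short_leg = mean(next_return[t] for t in bottom)
    return long_leg - short_leg

# Hypothetical universe: the characteristic runs 1..20 and the next-month
# return is 0.1% per unit of characteristic (made-up numbers).
chars = {f"S{i}": float(i) for i in range(1, 21)}
rets = {f"S{i}": 0.001 * i for i in range(1, 21)}

spread = long_short_return(chars, rets)  # roughly 0.018 for this toy data
print(f"top-minus-bottom decile return: {spread:.4f}")
```

For a characteristic where the low end earns more (size, or volatility), you would flip the sign so the factor's average return comes out positive.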
You then regress each stock’s returns against these factors in a multivariate regression. The resulting betas of each asset on each factor are called factor loadings, and the factor premiums are positive numbers reflecting the return generated by each factor. For example, the market factor is the expected market premium over the risk-free rate, about 5% annually. Unlike the market betas, which are always positive and centered around 1.0, the other factor loadings are symmetrically distributed around zero.
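As a sketch of the loading regression, the snippet below fits one stock's returns on a market factor and a value factor with ordinary least squares. The factor returns and the stock's "true" loadings are made-up numbers chosen so the answer is known in advance, and the hand-rolled solver is only there to keep the example dependency-free; in practice you would reach for a library such as numpy or statsmodels.

```python
def ols(y, X):
    """Least-squares coefficients for y = X b, via the normal equations."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical monthly factor returns (not real data).
mkt = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01]
hml = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02]

# A stock built to have market loading 1.2 and value loading -0.5.
stock = [0.001 + 1.2 * m - 0.5 * h for m, h in zip(mkt, hml)]

# Design matrix: intercept, market, value.
X = [[1.0, m, h] for m, h in zip(mkt, hml)]
alpha, b_mkt, b_hml = ols(stock, X)
print(f"alpha={alpha:.4f}  market loading={b_mkt:.2f}  value loading={b_hml:.2f}")
```

The negative value loading is the point: a growth stock loads negatively on HML, which is why non-market loadings scatter around zero while market betas cluster near one.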
The low-minus-high volatility factor is rarely used because its marginal contribution to a multifactor long-short portfolio is weak. However, if you isolate the low-vol stocks, omit the high-vol stocks, and add that leg to a long-only portfolio built on standard factor models, the Sharpe ratio rises significantly. Nothing insane, about a 0.12 Sharpe lift, but it seems robust. For institutional portfolios handling billions of dollars, that is a lot of alpha in USD terms. Below are data from their Table 3, where various factor models are listed in the columns, along with the Sharpe ratios without (SRw/o) and with (SRw) the volatility factor (Sharpe is (annualized return - risk-free rate)/volatility). Note the first set of data refers to the standard long-short factors, while the bottom is for the long-only version (adjusted for market beta).
Table 3
Sharpe Ratios without and with the low-vol factor
The nice thing about this approach is that it exactly matches what quants target. They want to show their boss or potential clients how their new factor increases the Sharpe ratio of some base strategy. The portfolio with the highest Sharpe ratio maximizes return for a given annualized volatility. Alternatively, for any desired return, a levered version of that portfolio generates the lowest volatility. With dozens of plausible factors, quants spend their time finding and justifying their special combination of secret factors, like how every fried chicken recipe has its take on KFC’s secret blend of 11 herbs and spices. Most use models with 3 to 8 factors.
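The with/without comparison can be illustrated with the simplest case of a base strategy plus one new return stream. The formula below is the textbook mean-variance result for the best attainable Sharpe ratio from two streams with correlation rho; whether the paper computes its numbers exactly this way is an assumption on my part, and the returns, volatilities, and zero correlation are hypothetical.

```python
from math import sqrt

def sharpe(annual_return, risk_free, annual_vol):
    """(annualized return - risk-free rate) / volatility."""
    return (annual_return - risk_free) / annual_vol

def combined_sharpe(sr_base, sr_new, rho):
    """Best Sharpe from optimally mixing two streams with correlation rho."""
    return sqrt((sr_base**2 - 2 * rho * sr_base * sr_new + sr_new**2)
                / (1 - rho**2))

# Hypothetical inputs, not taken from Table 3.
sr_base = sharpe(0.09, 0.03, 0.12)   # base strategy
sr_new = sharpe(0.06, 0.03, 0.10)    # candidate new factor
sr_with = combined_sharpe(sr_base, sr_new, rho=0.0)
lift = sr_with - sr_base

# Leverage scales excess return and volatility by the same multiple,
# so the Sharpe ratio is unchanged (ignoring financing frictions).
leverage = 2.0
sr_levered = (leverage * (0.09 - 0.03)) / (leverage * 0.12)

print(f"without: {sr_base:.3f}  with: {sr_with:.3f}  lift: {lift:.3f}")
print(f"levered base Sharpe: {sr_levered:.3f}")
```

Note how much the correlation matters: a new factor with a modest standalone Sharpe adds a lot if it is uncorrelated with what you already hold, and almost nothing if it is a repackaging of an existing factor.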
One problem with the long-short factors is that often one side of the portfolio’s returns is not ‘real,’ in that transaction costs would kill them. For example, when the size factor was initially discovered, many also touted the related low-price factor, as stock price and market cap were correlated. It turned out that the low-price anomaly was especially susceptible to measurement error and price impact. By the end of the 80s, most of the low-priced funds were gone. An uninvestible factor return could still be interesting, like overnight returns, just not an investible strategy. Robeco’s paper confirms that their results hold when these costs are applied to the various factors.
Another significant problem with some prominent long-short factor-mimicking portfolios is that they have ambiguous betas. Fama-French’s size factor had a negative beta in the 1950s, which is weird, but whatever. However, their value factor has had a positive beta for half of its life. When a factor’s market beta is sometimes positive, sometimes negative, it is difficult to determine the extent to which the factor return comes from the market.
The market return dominates stock return data because it’s large, around 5% annually, and its factor loading averages 1.0, not zero. Frazzini and Pedersen (2014) implemented a beta-neutral version of the volatility factor by adjusting the size of the short so their low-beta factor is beta-neutral (beta and volatility sorts are highly correlated).
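Here is a minimal sketch of that beta-neutral sizing. In the Frazzini and Pedersen construction each leg is scaled by the inverse of its beta, so both legs have a beta of one and the short ends up smaller in dollar terms than the long. The betas and returns below are hypothetical, and the function names are mine.

```python
def beta_neutral_weights(beta_low, beta_high):
    """Dollar weights that bring each leg to a market beta of one."""
    return 1.0 / beta_low, 1.0 / beta_high

def beta_neutral_return(r_low, r_high, beta_low, beta_high, rf=0.0):
    """Long the low-beta leg, short the high-beta leg, both in excess of rf."""
    w_long, w_short = beta_neutral_weights(beta_low, beta_high)
    return w_long * (r_low - rf) - w_short * (r_high - rf)

# Hypothetical ex-ante betas for the two legs.
beta_low, beta_high = 0.7, 1.4
w_long, w_short = beta_neutral_weights(beta_low, beta_high)

# Net market exposure of the long-short portfolio.
net_beta = w_long * beta_low - w_short * beta_high

# Hypothetical month: market up 2%, both legs earn exactly beta * market.
market = 0.02
factor_ret = beta_neutral_return(beta_low * market, beta_high * market,
                                 beta_low, beta_high)

print(f"long ${w_long:.2f}, short ${w_short:.2f}, net beta {net_beta:.2f}")
print(f"factor return with zero alpha: {factor_ret:.4f}")
```

With no alpha in either leg the factor earns nothing regardless of the market move, which is what beta-neutral is supposed to mean.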
That’s an interesting metric of sentiment, but as a strategy, the problem is that high volatility/beta stocks are neither good longs nor shorts, regardless of their weighting. High vol stocks aren’t good longs because their returns are low; they aren’t good shorts because their volatility is too high. Using a dataset with 30 years of US stock data that excludes small-cap stocks (about 1500 stocks per year), I created portfolios sorted by historical volatility and subtracted their ex-ante beta returns. The resulting time series of returns generated various Sharpe ratios for these deciles, and the graph below shows that while the highest volatility decile had the lowest return, its high volatility puts it well within the efficient frontier of investment possibilities. The loVol-minus-hiVol portfolio—beta adjusted—generates the market return with double the volatility (green dot). A good factor should have a good Sharpe.
High volatility stocks are miniature versions of most crypto coins: negative expected returns with high volatility. I wouldn’t short Fartcoin even though I am confident it has a negative expected return (even if you love flatulence jokes, they get old). Most coins are like 5-delta calls with a 50% premium on their implied volatility and an unhedgeable underlying asset; their optimal weighting is zero.
Robeco models the low-vol factor using just the long leg and removing the beta-implied market return. This avoids the effect of high vol stocks.
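The one-legged construction is simple enough to sketch in a few lines: take the low-vol portfolio's excess return and subtract beta times the market's excess return. The monthly numbers and the 0.7 beta below are hypothetical, and using a single fixed beta rather than a rolling ex-ante estimate is a simplification.

```python
from math import sqrt
from statistics import mean, stdev

def beta_adjusted(excess_returns, market_excess, beta):
    """Remove the beta-implied market return from each period's return."""
    return [r - beta * m for r, m in zip(excess_returns, market_excess)]

def annualized_sharpe(monthly_excess):
    """Monthly mean over monthly standard deviation, scaled by sqrt(12)."""
    return mean(monthly_excess) / stdev(monthly_excess) * sqrt(12)

# Hypothetical monthly excess returns (not real data).
market_excess = [0.020, -0.030, 0.010, 0.040, -0.010, 0.000]
low_vol_excess = [0.016, -0.019, 0.008, 0.030, -0.006, 0.001]

# Long leg only: no high-vol short anywhere in the calculation.
low_vol_factor = beta_adjusted(low_vol_excess, market_excess, beta=0.7)

print([round(x, 4) for x in low_vol_factor])
print(f"annualized Sharpe of the one-legged factor: "
      f"{annualized_sharpe(low_vol_factor):.2f}")
```

Six months of toy data says nothing about the real Sharpe, of course; the point is only that the high-vol decile never enters the factor.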
They add some color to their finding by noting the high-vol short is highly correlated with low profitability and high-investment stocks. Lousy stocks are subject to delusional hope: unprofitable companies with high growth virtually all have high volatility.
They also perform a neat exercise I have not seen before, varying the selection criteria to highlight the factor’s robustness. As mentioned, when creating a factor, one usually sorts within various size groupings. However, there are several ways to do this, such as with groups segregated by the 30th and 70th percentiles of market cap, or using size quintiles. One can also neutralize the industry weights in a factor portfolio. Using 12 different criteria of this sort, they generated 4096 possible specifications, which produce a nice set of Sharpe ratio histograms. It’s comforting if the factor’s returns are not too sensitive to these criteria. Interestingly, ROE (net income/book) and post-earnings announcement drift (PEAD) generated bimodal distributions, implying these factors work very differently in different industries.
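The 4096 figure is what you get if each of the 12 criteria is a two-way choice (2^12 = 4096), which I am assuming here. A sketch of enumerating the grid and bucketing the results follows; the `backtest` argument is a hypothetical placeholder for whatever routine turns one specification into a Sharpe ratio, not anything from the paper.

```python
from collections import Counter
from itertools import product

def all_specifications(n_choices=12):
    """Every on/off combination of n_choices construction decisions."""
    return list(product((False, True), repeat=n_choices))

def sharpe_histogram(specs, backtest, bin_width=0.05):
    """Count specifications per Sharpe-ratio bin.

    backtest: caller-supplied function mapping one specification
              (a tuple of booleans) to a Sharpe ratio.
    """
    return Counter(int(backtest(spec) // bin_width) for spec in specs)

specs = all_specifications()
print(len(specs), "specifications")

# Placeholder backtest so the sketch runs end to end: it returns the same
# made-up Sharpe for every specification and carries no information.
hist = sharpe_histogram(specs, backtest=lambda spec: 0.5)
print(sum(hist.values()), "specifications binned")
```

A tight, single-peaked histogram means the factor does not care much how you build it; two humps mean one of the choices is flipping the result.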
When evaluating long-only portfolios, perhaps other factors would be more effective in their long-only form.





