*Matt Roberts-Sklar*

Often when analysing financial markets, we want to know the statistical distribution of some financial market prices, yields or returns. But the ‘true’ distribution is unknown and unknowable. So we estimate the distribution, based on what we’ve observed in the past. In financial markets, adding one data point can make a huge difference. Sharp moves in Italian bond yields in May 2018 are a case in point – in this blog I show how a single day’s trading drastically alters the estimated distribution of returns. This is important to keep in mind when modelling financial market returns, e.g. for risk management purposes or financial stability monitoring.

It is a well-established – but often forgotten – fact that financial markets are not normal. I don’t mean that in a pejorative sense. Statistically, financial market returns do not follow a normal distribution. When data are normally distributed, extreme values are very unlikely.

But in financial markets, we know that very large moves can and do happen – one data point can change everything. Many an investor has learned the hard way that years of steady returns can evaporate in a heartbeat (1929, 1987, 2008 etc).

These aren’t new points (see Mandelbrot, Taleb etc). But recent moves in Italian bond markets give us a neat reminder of the difference a day can make.

Over a few days in May 2018, yields on Italian government bonds (roughly speaking the return an investor gets when buying the bond) rose sharply, following increased political uncertainty. On 29 May 2018, the yield on Italian bonds with a residual maturity of two years increased from 0.90% to 2.77% – an increase of over 180 basis points (**Chart 1**). That one-day change was far larger than anything seen over the past 25 years.

**Chart 1 – Italian 2-year government bond yield – level and daily change**

*Source: Bloomberg and author calculations.*

How surprising was this large move in Italian yields, based on previous data? Well, that depends on your model and what data you used to estimate it.

One approach would be to use all the data available to you at each point in time. You might start by looking at the first few moments of the data. **Chart 2** does just this, showing the standard deviation, skewness and kurtosis of the daily change in Italian 2-year government bond yields, re-estimated each day using all available data since 1993.

There is a small increase in the full sample standard deviation, but this metric is not designed to pick up tails of the distribution. Even taking into account 29 May, the standard deviation is 9bps. So the daily change on 29 May was 22 times larger than the standard deviation. If changes in Italian bond yields followed a normal distribution, the probability of such a move would have been astronomically small.
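To put a rough number on ‘astronomically small’, here is a minimal sketch of the normal tail probability for a move of that size. It assumes a zero-mean normal distribution and takes the 22-standard-deviation multiple quoted above at face value; the function name is my own.

```python
import math

def normal_upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

multiple = 22  # size of the 29 May move in standard deviations, as quoted in the text

p_tail = normal_upper_tail(multiple)
print(f"P(move > {multiple} sigma) under normality: {p_tail:.3e}")
```

Under these assumptions the probability comes out at the order of 10^-107, which is another way of saying that the normal model would rule the move out entirely.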

But looking at the skewness and kurtosis of the daily changes in Italian 2-year yields, you can see the distribution is a) not symmetric (skewness is not zero), and b) fat-tailed (kurtosis is large, and jumps from about 30 to 60; a normal distribution has a kurtosis of 3). That was true before May 2018, but the additional data points dramatically changed the estimates of these moments of the distribution.

**Chart 2 – Moments of the distribution of Italian 2-year government bond yields, based on full sample at each point in time**

*Source: Bloomberg and author calculations.*
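The sensitivity of full-sample moments to a single observation can be illustrated with a short sketch. The data here are hypothetical (simulated quiet days plus one extreme day sized like the 29 May move), not the Italian yield series, and I use simple population moments, which may differ from the estimator behind Chart 2.

```python
import math
import random

def moments(xs):
    """Return (standard deviation, skewness, kurtosis) using population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical data: 1,000 quiet days of yield changes (in bps), drawn from a normal.
rng = random.Random(0)
changes = [rng.gauss(0.0, 5.0) for _ in range(1000)]

before = moments(changes)
after = moments(changes + [187.0])  # add one extreme day, sized like the 29 May move

print("before: std=%.1f skew=%.2f kurt=%.1f" % before)
print("after:  std=%.1f skew=%.2f kurt=%.1f" % after)
```

The standard deviation moves modestly, while skewness and kurtosis, which weight observations by their third and fourth powers, change by far more.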

So what? That financial market variables are fat-tailed is well-known. But models based on empirical distributions are widely used, e.g. in risk management, regulatory limits and algorithmic trading strategies. Chart 2 shows that even when the whole distribution is used to estimate model parameters or backtest model performance, an additional data point can make a material difference.

And in many applications the whole sample is not used. Instead a rolling window of the past one, two or five years of data is used. Whilst a short window of recent data helps models pick up changes in relationships (e.g. changing correlations between different assets), it throws away past examples of large moves, perhaps giving false comfort on the likelihood of tail events.

One well-known example is Value at Risk (VaR), a risk management tool designed to give an idea of how much money a given portfolio could lose over a certain period, with a given probability. For example, in UCITS – a set of European investment fund regulations – funds can report their VaR based on holding a portfolio for one month, using a 99% one-tailed confidence interval, estimated using at least one year of data.

VaR models can provide a useful summary statistic of the riskiness of a given portfolio. For example, a VaR of £100mn, estimated using a 99% one-tailed confidence interval with a one month holding period, means that the losses on the portfolio over a month should only exceed £100mn in one month out of a hundred.

Many different methods are used to estimate the distribution in VaR models. These can broadly be split into those that assume a given distribution for the data (parametric e.g. normal distribution), and those that sample from historical data (non-parametric, e.g. historical simulation).
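As a rough illustration of the two families, the sketch below computes a 99% quantile of daily changes both ways: from a fitted normal distribution, and from the empirical distribution of the sample. The function names and the simulated fat-tailed sample are my own assumptions, and real VaR systems are considerably more involved.

```python
import math
import random
from statistics import NormalDist, mean, pstdev

def parametric_var(changes, confidence=0.99):
    """Parametric (normal) VaR: the `confidence` quantile of a fitted normal."""
    fitted = NormalDist(mean(changes), pstdev(changes))
    return fitted.inv_cdf(confidence)

def historical_var(changes, confidence=0.99):
    """Historical-simulation VaR: the empirical `confidence` quantile of the sample."""
    ordered = sorted(changes)
    index = math.ceil(confidence * len(ordered)) - 1
    return ordered[index]

# Hypothetical fat-tailed sample: mostly quiet days, with occasional turbulent ones.
rng = random.Random(1)
sample = [rng.gauss(0, 30) if rng.random() < 0.02 else rng.gauss(0, 3)
          for _ in range(2000)]

print(f"parametric 99% VaR: {parametric_var(sample):.1f}bp")
print(f"historical 99% VaR: {historical_var(sample):.1f}bp")
```

Both estimates depend entirely on the sample they are given: neither can reflect a move larger than anything in the window used to estimate it.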

In **Chart 3** I’ve shown a very naïve version of a VaR-type calculation, using a normal distribution fitted to daily changes in Italian 2-year bond yields. Fitting this distribution only requires the mean and standard deviation, which I’ve estimated using a two year rolling window.

**Chart 3 – Value at Risk of daily change in Italian 2-year government bond yields, with daily percentage change in VaR**

*Estimated using a 99% VaR, 1 month holding period, 2 year rolling window, based on a normal distribution*

*Source: Bloomberg and author calculations.*
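A calculation of this type can be sketched as follows. This is an illustration rather than the exact method behind Chart 3: I assume 504 trading days for the two-year window, 21 trading days for the one-month horizon, and independent daily changes so that the standard deviation scales with the square root of time. The input series is hypothetical.

```python
import math
from statistics import NormalDist, mean, pstdev

Z_99 = NormalDist().inv_cdf(0.99)  # 99% one-tailed normal quantile, about 2.33

def rolling_normal_var(changes, window=504, horizon_days=21):
    """99% normal VaR, re-estimated each day on the trailing `window` observations.

    Assumes i.i.d. daily changes: the mean scales with the horizon and the
    standard deviation with its square root.
    """
    out = []
    for end in range(window, len(changes) + 1):
        recent = changes[end - window:end]
        mu, sigma = mean(recent), pstdev(recent)
        out.append(horizon_days * mu + Z_99 * math.sqrt(horizon_days) * sigma)
    return out

# Hypothetical series: two quiet years of +/-2bp moves, then one 187bp day.
changes = [2.0, -2.0] * 252 + [187.0]
var_series = rolling_normal_var(changes)
jump = var_series[-1] / var_series[-2] - 1
print(f"VaR before: {var_series[-2]:.1f}bp, after: {var_series[-1]:.1f}bp, "
      f"change: {jump:.0%}")
```

Because the quiet window has such a low standard deviation, one large observation entering it is enough to multiply the estimate.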

What a difference a day makes. On 29 May 2018 the estimated ‘VaR’ rose sharply. Whilst it doesn’t quite reach levels seen during the euro area sovereign debt crisis, the daily change is over 170%.

Part of the reason for the jump in the estimated VaR is the very low volatility in the prior few years – a much discussed phenomenon more generally across financial markets. When the underlying distribution is estimated on a relatively short period of recent data, at a time of low volatility, the probability of a large move will be hugely underestimated. That is true regardless of the approach used to estimate the distribution using a short backrun of data.

The use of short, rolling windows of data in VaR models can lead to so-called procyclical behaviour. For example, suppose the amount of risk you are allowed to take is based on not exceeding a certain VaR limit. Then a fall in volatility will mean a given portfolio has a lower VaR. So you can achieve the same VaR with a bigger portfolio. And when volatility rises, the VaR of the portfolio rises, perhaps exceeding the limit. So a VaR limit leads investors to take more risk when volatility is low (e.g. buy more bonds) and de-risk when volatility is high (e.g. sell bonds). Such behaviour can amplify moves – a so-called VaR shock – e.g. as seen in Japanese government bond markets in 2003.
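The mechanism can be made concrete with a small sketch: under a normal VaR model, the largest position that fits within a fixed VaR limit is inversely proportional to estimated volatility. The limit, the volatilities and the function name below are hypothetical, and the square-root-of-time scaling is an assumption.

```python
import math
from statistics import NormalDist

def max_position(var_limit, daily_vol, horizon_days=21, confidence=0.99):
    """Largest position (in currency units) whose normal VaR stays within `var_limit`.

    `daily_vol` is the daily return volatility as a fraction (0.01 = 1%).
    """
    z = NormalDist().inv_cdf(confidence)
    var_per_unit = z * daily_vol * math.sqrt(horizon_days)
    return var_limit / var_per_unit

limit = 100e6  # hypothetical VaR limit of 100mn
for vol in (0.002, 0.004, 0.008):  # hypothetical daily volatilities
    print(f"daily vol {vol:.1%}: max position {max_position(limit, vol) / 1e6:,.0f}mn")
```

Each doubling of estimated volatility halves the permitted position, which is the forced selling that can amplify a move.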

We often need to estimate the distribution of financial market variables, and not just for VaR models. Using a long backrun of data can help, but the past can only tell you so much. As with all modelling, it is important to understand the assumptions used and limitations involved. For example, whilst Value at Risk has a place as an indicator of portfolio risk, it is only as good as the estimated distribution used.

*Matt Roberts-Sklar works in the Bank’s Capital Markets Division.*

*If you want to get in touch, please email us at bankunderground@bankofengland.co.uk or leave a comment below.*


*Bank Underground is a blog for Bank of England staff to share views that challenge – or support – prevailing policy orthodoxies. The views expressed here are those of the authors, and are not necessarily those of the Bank of England, or its policy committees.*

Financial series reflect the operation of markets, and in general tend to show scaling behaviour, with similar patterns appearing over small and large scales of time observation.

For such series, probability distributions that also have scaling characteristics, such as the Breit-Wigner (Cauchy) distribution, are appropriate.

The table below shows the probabilities of multiple sigma events for the Gaussian and Breit-Wigner distributions.

Probability of exceeding n-sigma (one-sided)

| Multiple of sigma | Normal | Breit-Wigner |
| --- | --- | --- |
| 1 | 15.87% | 15.87% |
| 2 | 2.28% | 8.46% |
| 3 | 0.135% | 5.71% |
| 4 | 0.0032% | 4.30% |
| 5 | 0.000029% | 3.45% |
| 6 | 0.00000010% | 2.88% |

Under the Breit-Wigner distribution, the one-sided probability of an event more than 6 sigma from the mean expectation is 2.9%, significantly greater than “the once in the lifetime of the universe” likelihood suggested by one senior banker.
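The Breit-Wigner column can be reproduced with a short sketch. The comment does not state how the distribution was scaled; the figures are consistent with choosing the scale parameter so that the one-sided 1-sigma probability matches the normal, and that calibration is an inference from the table, not something stated above.

```python
import math

def normal_tail(n):
    """One-sided P(Z > n) for a standard normal."""
    return 0.5 * math.erfc(n / math.sqrt(2.0))

# Assumed calibration: choose the Cauchy scale gamma so that P(X > 1) equals
# the normal 1-sigma tail probability (about 15.87%).
gamma = 1.0 / math.tan(math.pi * (0.5 - normal_tail(1)))

def cauchy_tail(n):
    """One-sided P(X > n) for a zero-centred Cauchy (Breit-Wigner) with scale gamma."""
    return 0.5 - math.atan(n / gamma) / math.pi

for n in range(1, 7):
    print(f"{n}  normal {normal_tail(n):.8%}  Breit-Wigner {cauchy_tail(n):.2%}")
```

The normal tail shrinks faster than exponentially, whereas the Breit-Wigner tail shrinks only in proportion to 1/n, which is why the two columns diverge so quickly.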