Structural Breaks

This implementation is based on Chapter 17 of the book Advances in Financial Machine Learning. Structural breaks, such as the transition from one market regime to another, represent shifts in the behaviour of market participants.

The first market participant to notice a change in the market can adapt to it before others and, consequently, gain an advantage over participants who have not yet noticed the regime change.

To quote Marcos Lopez de Prado, “Structural breaks offer some of the best risk/rewards”.

We can classify structural break tests into two general categories:

  1. CUSUM tests: These test whether the cumulative forecasting errors significantly deviate from white noise.

  2. Explosiveness tests: Beyond deviation from white noise, these test whether the process exhibits exponential growth or collapse, as this is inconsistent with a random walk or stationary process, and it is unsustainable in the long run.

Note

Underlying Literature

The following sources, all referenced in this section, elaborate extensively on the topic:

  • Advances in Financial Machine Learning, Chapter 17, by Marcos Lopez de Prado.

  • Testing for Speculative Bubbles in Stock Markets: A Comparison of Alternative Methods, by Ulrich Homm and Jörg Breitung.

  • Tests for Parameter Instability and Structural Change With Unknown Change Point, by Donald W. K. Andrews.

  • Testing for Multiple Bubbles: Historical Episodes of Exuberance and Collapse in the S&P 500, by Peter C. B. Phillips, Shuping Shi, and Jun Yu.

CUSUM tests

Chu-Stinchcombe-White CUSUM Test on Levels

We are given a set of observations \(t = 1, ... , T\) and we assume an array of features \(x_{t}\) to be predictive of a value \(y_{t}\):

\[y_{t} = \beta_{t}x_{t} + \epsilon_{t}\]

The authors of the paper Testing for Speculative Bubbles in Stock Markets: A Comparison of Alternative Methods suggest assuming \(H_{0} : \beta_{t} = 0\) and therefore forecasting \(E_{t-1}[\Delta y_{t}] = 0\). This allows working directly with \(y_{t}\) instead of computing recursive least squares (RLS) estimates of \(\beta\).

We take \(y_{t}\) to be the log-price and calculate the standardized departure of \(y_{t}\) relative to \(y_{n}\) (the CUSUM statistic), with \(t > n\), as:

\[\begin{split}\begin{equation} \begin{split} S_{n,t} & = (y_{t}-y_{n})(\hat\sigma_{t}\sqrt{t-n})^{-1}, \ \ t>n \\ \hat\sigma_{t}^{2} & = (t-1)^{-1} \sum_{i=2}^{t}({\Delta y_{t_{i}}})^2 \\ \end{split} \end{equation}\end{split}\]

Under the null hypothesis \(H_{0} : \beta_{t} = 0\), \(S_{n, t} \sim N[0, 1]\).

We can test the null hypothesis by comparing the CUSUM statistic \(S_{n, t}\) with the critical value \(c_{\alpha}[n, t]\), which for a one-sided test is calculated as:

\[c_{\alpha}[n, t] = \sqrt{b_{\alpha} + \log{[t-n]}}\]

Using the Monte Carlo method, the authors of the above paper derived \(b_{0.05} = 4.6\).

The disadvantage of this method is that \(y_{n}\) is chosen arbitrarily, which may lead to inconsistent results. This can be fixed by estimating \(S_{n, t}\) on backward-shifting windows \(n \in [1, t]\) and picking:

\[\begin{equation} S_{t}= \sup_{n \in [1, t]} \{ S_{n, t}\} \end{equation}\]
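To make the mechanics concrete, below is a minimal single-threaded sketch of the one-sided statistic together with its critical values. The helper name is hypothetical; it illustrates the formulas above rather than the library's implementation, which is documented next.

import numpy as np
import pandas as pd

def csw_one_sided(log_prices: pd.Series, b_alpha: float = 4.6) -> pd.DataFrame:
    # Illustrative sketch: S_t = sup_n S_{n,t} over backward-shifting windows,
    # with one-sided critical value c_alpha[n, t] = sqrt(b_alpha + log(t - n)).
    y = log_prices.to_numpy()
    diffs = np.diff(y)
    rows = []
    for t in range(3, len(y) + 1):                       # 1-indexed time t
        # sigma_t^2 = (t - 1)^{-1} * sum of squared first differences up to t
        sigma_t = np.sqrt(np.mean(diffs[: t - 1] ** 2))
        n = np.arange(1, t)                              # candidate start points
        s_nt = (y[t - 1] - y[n - 1]) / (sigma_t * np.sqrt(t - n))
        best = int(np.argmax(s_nt))                      # sup over n in [1, t)
        c_t = np.sqrt(b_alpha + np.log(t - n[best]))
        rows.append((log_prices.index[t - 1], s_nt[best], c_t))
    return pd.DataFrame(rows, columns=["date", "stat", "critical_value"]).set_index("date")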

Implementation

get_chu_stinchcombe_white_statistics(series: Series, test_type: str = 'one_sided', num_threads: int = 8, verbose: bool = True) → Series

Multithread Chu-Stinchcombe-White test implementation, p.251.

Parameters:
  • series – (pd.Series) Series to get statistics for.

  • test_type – (str) Two-sided or one-sided test.

  • num_threads – (int) Number of cores.

  • verbose – (bool) Flag to report progress on async jobs.

Returns:

(pd.Series) Statistics.


Explosiveness tests

Chow-Type Dickey-Fuller Test

The Chow-Type Dickey-Fuller test is based on an \(AR(1)\) process:

\[y_{t} = \rho y_{t-1} + \varepsilon_{t}\]

where \(\varepsilon_{t}\) is white noise.

This test is used to detect whether a process changes from a random walk (\(\rho = 1\)) into an explosive process at some time \(\tau^{*}T\), \(\tau^{*} \in (0,1)\), where \(T\) is the number of observations.

So, the hypothesis \(H_{0}\) is tested against \(H_{1}\):

\[\begin{split}\begin{equation} \begin{split} H_{0} & : y_{t} = y_{t-1} + \varepsilon_{t} \\ H_{1} & : y_{t}=\begin{cases} y_{t-1} + \varepsilon_{t} \ \text{for} \ \ t= 1, ..., \tau^*T \\ \rho y_{t-1} + \varepsilon_{t} \ \text{for} \ \ t= \tau^*T+1, ..., T, \text{ with } \rho > 1 \end{cases} \end{split} \end{equation}\end{split}\]

To test these hypotheses, the following specification is fit:

\[\Delta y_{t} = \delta y_{t-1} D_{t}[\tau^*] + \varepsilon_{t}\]
\[\begin{split}\begin{equation} \begin{split} D_{t}[\tau^*] & = \begin{cases} 0 \ \text{if} \ \ t < \tau^*T \\ 1 \ \text{if} \ \ t \geq \tau^*T \end{cases} \end{split} \end{equation}\end{split}\]

So, the hypotheses tested are now transformed to:

\[\begin{split}\begin{equation} \begin{split} H_{0} & : \delta = 0 \\ H_{1} & : \delta > 0 \\ \end{split} \end{equation}\end{split}\]

And the Dickey-Fuller Chow (DFC) test statistic for \(\tau^*\) is:

\[DFC_{\tau^*} = \frac{\hat\delta}{\hat\sigma_{\delta}}\]

As described in Advances in Financial Machine Learning:

The first drawback of this method is that \(\tau^{*}\) is unknown, and the second one is that Chow’s approach assumes that there is only one break and that the bubble runs up to the end of the sample.

To address the first issue, in the work Tests for Parameter Instability and Structural Change With Unknown Change Point, Andrews proposed trying all possible \(\tau^{*}\) in an interval \(\tau^{*} \in [\tau_{0}, 1 - \tau_{0}]\).

For the unknown \(\tau^{*}\), the test statistic is the Supremum Dickey-Fuller Chow (SDFC), which is the maximum of all \(T(1 - 2\tau_{0})\) values of \(DFC_{\tau^{*}}\):

\[\begin{equation} SDFC = \sup_{\tau^* \in [\tau_0,1-\tau_0]} \{ DFC_{\tau^*}\} \end{equation}\]
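To illustrate, here is a minimal sketch of the DFC statistic and its supremum, assuming plain OLS with no constant and no lag terms. The function name is hypothetical; the library's implementation is documented below.

import numpy as np
import pandas as pd

def sdfc_stat(log_prices: pd.Series, min_length: int = 20) -> float:
    # Illustrative sketch: fit dy_t = delta * y_{t-1} * D_t[tau*] + eps for every
    # admissible break date tau* and return the supremum of the DFC t-statistics.
    y = log_prices.to_numpy()
    dy = np.diff(y)                                # delta y_t
    y_lag = y[:-1]                                 # y_{t-1}
    T = len(dy)
    dfc_stats = []
    for tau in range(min_length, T - min_length):  # keep min_length obs on each side
        x = y_lag.copy()
        x[:tau] = 0.0                              # dummy D_t[tau*]: zero before the break
        xtx = x @ x
        delta_hat = (x @ dy) / xtx                 # OLS estimate of delta
        resid = dy - delta_hat * x
        se = np.sqrt((resid @ resid) / (T - 1) / xtx)
        dfc_stats.append(delta_hat / se)           # DFC statistic for this tau*
    return max(dfc_stats)                          # SDFC = sup over tau*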

To address the second issue, the Supremum Augmented Dickey-Fuller test was introduced.

Implementation

get_chow_type_stat(series: Series, min_length: int = 20, num_threads: int = 8, verbose: bool = True) → Series

Multithread implementation of Chow-Type Dickey-Fuller Test, p.251-252.

Parameters:
  • series – (pd.Series) Series to test.

  • min_length – (int) Minimum sample length used to estimate statistics.

  • num_threads – (int) Number of cores to use.

  • verbose – (bool) Flag to report progress on async jobs.

Returns:

(pd.Series) Chow-Type Dickey-Fuller Test statistics.


Supremum Augmented Dickey-Fuller

This test was proposed by Phillips, Shi, and Yu in the work Testing for Multiple Bubbles: Historical Episodes of Exuberance and Collapse in the S&P 500. The advantage of this test is that it allows testing for multiple regime switches (random walk to bubble and back).

The test is based on the following regression:

\[\Delta y_{t} = \alpha + \beta y_{t-1} + \sum_{l=1}^{L}{\gamma_{l} \Delta y_{t-l}} + \varepsilon_{t}\]

And, the hypothesis \(H_{0}\) is tested against \(H_{1}\):

\[\begin{split}\begin{equation} \begin{split} H_{0} & : \beta \le 0 \\ H_{1} & : \beta > 0 \\ \end{split} \end{equation}\end{split}\]

The Supremum Augmented Dickey-Fuller fits the above regression for each end point \(t\) with backward expanding start points and calculates the test-statistic as:

\[\begin{equation} SADF_{t} = \sup_{t_0 \in [1, t-\tau]}\{ADF_{t_0, t}\} = \sup_{t_0 \in [1, t-\tau]} \Bigg\{\frac{\hat\beta_{t_0,t}}{\hat\sigma_{\beta_{t_0, t}}}\Bigg\} \end{equation}\]

where \(\hat\beta_{t_0,t}\) is estimated on the sample from \(t_{0}\) to \(t\), \(\tau\) is the minimum sample length in the analysis, \(t_{0}\) is the left bound of the backwards expanding window, and \(t\) iterates through \([\tau, ..., T]\).

In comparison to SDFC, which is computed only at time \(T\), the SADF is computed at each \(t \in [\tau, T]\), recursively expanding the sample \(t_{0} \in [1, t - \tau]\). By doing so, the SADF does not assume a known number of regime switches.
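A simplified sketch of the SADF computation at a single end point, using ordinary least squares and the specification with a constant, might look as follows. The helper name is hypothetical and the code is illustrative, not the library's parallelized implementation.

import numpy as np
import pandas as pd

def sadf_at_t(y: pd.Series, t: int, min_length: int = 20, lags: int = 5) -> float:
    # Illustrative sketch of SADF(t): the sup of ADF t-statistics of beta over
    # backward expanding windows [t0, t], for the constant + linear specification.
    dy = y.diff()
    data = {"const": 1.0, "y_lag": y.shift(1)}
    for lag in range(1, lags + 1):
        data[f"dy_lag_{lag}"] = dy.shift(lag)
    X = pd.DataFrame(data).dropna()
    target = dy.loc[X.index]                      # regression target: delta y_t
    adf_stats = []
    for t0 in range(0, t - min_length + 1):       # t, t0 index rows of X after trimming
        X_w = X.iloc[t0:t].to_numpy()
        y_w = target.iloc[t0:t].to_numpy()
        xtx_inv = np.linalg.inv(X_w.T @ X_w)
        beta = xtx_inv @ X_w.T @ y_w              # OLS coefficients
        resid = y_w - X_w @ beta
        beta_var = (resid @ resid) / (len(y_w) - X_w.shape[1]) * xtx_inv
        adf_stats.append(beta[1] / np.sqrt(beta_var[1, 1]))  # t-stat of y_{t-1} coef
    return max(adf_stats)                         # SADF(t) = sup over t0

Looping this helper over each admissible end point \(t\) reproduces the recursively expanding scheme described above; the documented get_sadf parallelizes that loop across cores.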

[Figure: SADF test statistic computed with 5 lags and the linear model. The SADF line spikes when prices exhibit bubble-like behavior and returns to low levels when the bubble bursts.]

The model and the add_const parameters of the get_sadf function allow for different specifications of the regression’s time trend component.

Linear model (model=’linear’) uses a linear time trend:

\[\Delta y_{t} = \beta y_{t-1} + \sum_{l=1}^{L}{\gamma_{l} \Delta y_{t-l}} + \varepsilon_{t}\]

Quadratic model (model=’quadratic’) uses a second-degree polynomial time trend:

\[\Delta y_{t} = \beta y_{t-1} + \sum_{l=1}^{L}{\gamma_{l} \Delta y_{t-l}} + \sum_{l=1}^{L}{\delta_{l}^2 \Delta y_{t-l}} + \varepsilon_{t}\]

Adding a constant (add_const=True) to those specifications results in:

\[\Delta y_{t} = \alpha + \beta y_{t-1} + \sum_{l=1}^{L}{\gamma_{l} \Delta y_{t-l}} + \varepsilon_{t}\]

and

\[\Delta y_{t} = \alpha + \beta y_{t-1} + \sum_{l=1}^{L}{\gamma_{l} \Delta y_{t-l}} + \sum_{l=1}^{L}{\delta_{l}^2 \Delta y_{t-l}} + \varepsilon_{t}\]

respectively.

Implementation

get_sadf(series: Series, model: str, lags: int | list, min_length: int, add_const: bool = False, phi: float = 0, num_threads: int = 8, verbose: bool = True) → Series

Advances in Financial Machine Learning, p. 258-259.

Multithread implementation of SADF.

SADF fits the ADF regression at each end point t with backwards expanding start points. For the estimation of SADF(t), the right side of the window is fixed at t. SADF recursively expands the beginning of the sample up to t - min_length, and returns the sup of this set.

When using the sub- or super-martingale tests, the variance of beta of a weak long-run bubble may be smaller than that of a strong short-run bubble, hence biasing the method towards long-run bubbles. To correct for this bias, the ADF statistic in samples with large lengths can be penalized with the coefficient phi in [0, 1] such that:

ADF_penalized = ADF / (sample_length ^ phi)
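For example, with phi = 0.5, an ADF statistic of 3.0 estimated on a 100-observation sample is penalized to 3.0 / 100^0.5 = 0.3, while the same statistic on a 4-observation sample only shrinks to 3.0 / 4^0.5 = 1.5.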

Parameters:
  • series – (pd.Series) Series for which SADF statistics are generated.

  • model – (str) Either ‘linear’, ‘quadratic’, ‘sm_poly_1’, ‘sm_poly_2’, ‘sm_exp’, ‘sm_power’.

  • lags – (int or list) Either number of lags to use or array of specified lags.

  • min_length – (int) Minimum number of observations needed for estimation.

  • add_const – (bool) Flag to add constant.

  • phi – (float) Coefficient to penalize large sample lengths when computing SMT, in [0, 1].

  • num_threads – (int) Number of cores to use.

  • verbose – (bool) Flag to report progress on async jobs.

Returns:

(pd.Series) SADF statistics.

The function used in the SADF test to estimate \(\hat\beta_{t_0,t}\) is:

get_betas(X: DataFrame, y: DataFrame) → Tuple[array, array]

Advances in Financial Machine Learning, Snippet 17.4, page 259.

Fits the ADF specification (returns the beta estimate and the variance of the estimate).

Parameters:
  • X – (pd.DataFrame) Features (factors).

  • y – (pd.DataFrame) Outcomes.

Returns:

(np.array, np.array) Betas and variances of estimates.
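The computation behind such a function is standard OLS algebra. A self-contained NumPy sketch consistent with the description above (illustrative, operating on arrays rather than DataFrames):

import numpy as np

def get_betas_sketch(X: np.ndarray, y: np.ndarray):
    # Standard OLS: coefficient estimates and the covariance matrix of the estimates.
    xx_inv = np.linalg.inv(X.T @ X)
    b_mean = xx_inv @ X.T @ y                     # beta_hat = (X'X)^{-1} X'y
    resid = y - X @ b_mean
    dof = X.shape[0] - X.shape[1]                 # degrees of freedom
    b_var = (resid @ resid) / dof * xx_inv        # covariance of beta_hat
    return b_mean, b_var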

Tip

The Advances in Financial Machine Learning book additionally describes why log prices are more appropriate to use in the above tests, discusses their computational complexity, and covers other details.

The SADF also allows for explosiveness testing that does not rely on the standard ADF specification. If the process is either a sub- or super-martingale, the hypotheses \(H_{0}: \beta = 0\), \(H_{1}: \beta \ne 0\) can be tested under these specifications:

Polynomial trend (model=’sm_poly_1’):

\[y_{t} = \alpha + \gamma t + \beta t^{2} + \varepsilon_{t}\]

Polynomial trend of log prices (model=’sm_poly_2’):

\[log[y_{t}] = \alpha + \gamma t + \beta t^{2} + \varepsilon_{t}\]

Exponential trend (model=’sm_exp’):

\[y_{t} = \alpha e^{\beta t} + \varepsilon_{t} \Rightarrow log[y_{t}] = log[\alpha] + \beta t + \xi_{t}\]

Power trend (model=’sm_power’):

\[y_{t} = \alpha t^{\beta} + \varepsilon_{t} \Rightarrow log[y_{t}] = log[\alpha] + \beta log[t] + \xi_{t}\]

Again, the SADF fits the above regressions for each end point \(t\) with backward expanding start points, but the test statistic is taken in absolute value, as we are testing for both explosive growth and collapse. This is described in more detail in the Advances in Financial Machine Learning book, p. 260.

The calculated test statistic (SMT, for Sub/Super-Martingale Tests) is:

\[SMT_{t} = \sup_{t_0 \in [1, t-\tau]} \Bigg\{\frac{ | \hat\beta_{t_0,t} | }{\hat\sigma_{\beta_{t_0, t}}}\Bigg\}\]

From the book:

The parameter phi in the range (0, 1) can be used (e.g., phi=0.5) to penalize large sample lengths (“this corrects for the bias that the \(\hat\sigma_{\beta_{t_0, t}}\) of a weak long-run bubble may be smaller than the \(\hat\sigma_{\beta_{t_0, t}}\) of a strong short-run bubble, hence biasing the method towards long-run bubbles”):

\[SMT_{t} = \sup_{t_0 \in [1, t-\tau]} \Bigg\{\frac{ | \hat\beta_{t_0,t} | }{\hat\sigma_{\beta_{t_0, t}}(t-t_{0})^{\phi}}\Bigg\}\]
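As an illustration, here is a minimal sketch of the penalized SMT statistic at the final observation under the ’sm_power’ specification. The helper is hypothetical, not the library's get_sadf.

import numpy as np
import pandas as pd

def smt_power(prices: pd.Series, min_length: int = 20, phi: float = 0.5) -> float:
    # Illustrative sketch of SMT at the last observation for the power trend
    # specification: log(y_t) = log(alpha) + beta * log(t) + xi_t.
    log_y = np.log(prices.to_numpy())
    time = np.arange(1, len(log_y) + 1)
    X_full = np.column_stack([np.ones(len(log_y)), np.log(time)])
    smt_vals = []
    for t0 in range(0, len(log_y) - min_length + 1):   # backward expanding windows
        X, yy = X_full[t0:], log_y[t0:]
        xtx_inv = np.linalg.inv(X.T @ X)
        beta = xtx_inv @ X.T @ yy                      # OLS fit of the trend model
        resid = yy - X @ beta
        var = (resid @ resid) / (len(yy) - 2) * xtx_inv
        t_stat = abs(beta[1]) / np.sqrt(var[1, 1])     # |beta_hat| / sigma_beta
        smt_vals.append(t_stat / len(yy) ** phi)       # penalty (t - t0)^phi
    return max(smt_vals)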

Example

>>> import pandas as pd
>>> import numpy as np
>>> from mlfinlab.structural_breaks.chow import get_chow_type_stat
>>> from mlfinlab.structural_breaks.sadf import get_sadf
>>> from mlfinlab.structural_breaks.cusum import get_chu_stinchcombe_white_statistics
>>> # Import data
>>> url = "https://raw.githubusercontent.com/hudson-and-thames/example-data/main/dollar_bars.csv"
>>> data = pd.read_csv(url, index_col="date_time")
>>> data.index = pd.to_datetime(data.index)
>>> # Use subset of data as example
>>> data = data.iloc[:300]
>>> # Change to log prices data
>>> log_prices = np.log(data.close)  # see p.253, 17.4.2.1 Raw vs Log Prices
>>> # Chu-Stinchcombe test (one-sided and two-sided)
>>> one_sided_test = get_chu_stinchcombe_white_statistics(
...     log_prices, test_type="one_sided", num_threads=1
... )
>>> two_sided_test = get_chu_stinchcombe_white_statistics(
...     log_prices, test_type="two_sided", num_threads=1
... )
>>> # Chow-type test
>>> chow_stats = get_chow_type_stat(log_prices, min_length=20, num_threads=1)
>>> # SADF test with linear model and a constant, lag of 5 and minimum sample length of 20
>>> linear_sadf = get_sadf(
...     log_prices, model="linear", add_const=True, min_length=20, lags=5, num_threads=1
... )
>>> # Polynomial trend SMT
>>> sm_poly_1_sadf = get_sadf(
...     log_prices,
...     model="sm_poly_1",
...     add_const=True,
...     min_length=20,
...     lags=5,
...     phi=0.5,
...     num_threads=1,
... )
>>> one_sided_test  
stat...
>>> two_sided_test  
stat...
>>> chow_stats  
2015-01-02 16:30:52.544   -0.175202
2015-01-02 16:43:33.673   -0.107024...
>>> linear_sadf  
2015-01-02 18:12:28.096   -1.274585
2015-01-02 18:31:31.004   -1.675116...
>>> sm_poly_1_sadf  
2015-01-02 18:12:28.096    1.193729
2015-01-02 18:31:31.004    0.960044...

Presentation Slides


Note

  • pg 1-14: Structural Breaks

  • pg 15-24: Entropy Features

  • pg 25-37: Microstructural Features

