mlfinlab.labeling.fixed_time_horizon

Implements the fixed-time horizon method from Chapter 3.2 of Advances in Financial Machine Learning, by M. L. de Prado.

The paper “Classification-based Financial Markets Prediction using Deep Neural Networks” by Dixon et al. (2016) describes how data labeled this way can be used to train deep neural networks to predict price movements.

Module Contents

Functions

fixed_time_horizon(prices[, threshold, resample_by, ...])

Fixed-Time Horizon Labeling Method.

fixed_time_horizon(prices, threshold=0, resample_by=None, lag=True, standardized=False, window=None)

Fixed-Time Horizon Labeling Method.

Originally described in the book Advances in Financial Machine Learning, Chapter 3.2, pp. 43-44.

Returns 1 if the return is greater than the threshold, -1 if it is less than the negative of the threshold, and 0 if it lies in between. If no threshold is provided, the label is simply the sign of the return.
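The labeling rule above can be sketched in plain pandas. This is a minimal illustration, not mlfinlab's implementation: the helper name `label_fixed_horizon` and the sample prices are hypothetical, and the sketch assumes a one-period forward return and a non-negative scalar threshold.

```python
import numpy as np
import pandas as pd


def label_fixed_horizon(prices: pd.Series, threshold: float = 0.0) -> pd.Series:
    """Hypothetical helper: label each row by its one-period forward return."""
    # Forward-looking return: the row at time t holds the return from t to t+1.
    forward_returns = prices.shift(-1) / prices - 1

    # Start from 0 (return inside the threshold band), then mark the tails.
    labels = pd.Series(0.0, index=prices.index)
    labels[forward_returns > threshold] = 1.0
    labels[forward_returns < -threshold] = -1.0
    # The last row has no next price, so its label is undefined.
    labels[forward_returns.isna()] = np.nan
    return labels


# Made-up prices on five consecutive business days.
prices = pd.Series(
    [100.0, 102.0, 101.5, 101.6, 99.0],
    index=pd.date_range("2021-01-04", periods=5, freq="B"),
)

labels = label_fixed_horizon(prices, threshold=0.005)
sign_labels = label_fixed_horizon(prices)  # threshold of 0: sign of the return
```

With a threshold of 0.005, the two small moves in the middle of the sample fall inside the band and are labeled 0. With the default threshold of 0, every non-zero return is labeled by its sign.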

Parameters:
  • prices – (pd.Series or pd.DataFrame) Time-indexed stock prices used to calculate returns.

  • threshold – (float or pd.Series) When the absolute value of the return exceeds the threshold, the observation is labeled 1 or -1, depending on the sign of the return; otherwise it is labeled 0. The threshold can be dynamic if passed as a pd.Series, in which case threshold.index must match prices.index. If resampling is used, the index of threshold must match the index of prices after resampling. If threshold is negative, the directionality of the labels is reversed. If no threshold is provided, it is assumed to be 0 and the sign of the return is used as the label.

  • resample_by – (str) If not None, the period to which price data is resampled before returns are calculated: ‘B’ = business day, ‘W’ = week, ‘M’ = month, etc. The last observation in each period is used. For the full list of aliases, see the pandas documentation on offset aliases.

  • lag – (bool) If True, returns are shifted back by one period to make them forward-looking, so the label at each time index reflects the return over the following period.

  • standardized – (bool) If True, returns are standardized using their rolling mean and standard deviation (see window).

  • window – (int) If standardized is True, the rolling window period for calculating the mean and standard deviation of returns.
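The resampling and standardization options can be illustrated with plain pandas. This is a rough sketch under stated assumptions, not the library's code: the prices are made up, the window length of 3 is arbitrary, and the exact order in which mlfinlab applies lagging and standardization is assumed rather than confirmed.

```python
import pandas as pd

# Hypothetical daily closing prices on ten consecutive business days.
prices = pd.Series(
    [100.0, 101.0, 99.0, 102.0, 103.0, 104.0, 102.0, 105.0, 107.0, 106.0],
    index=pd.date_range("2021-01-04", periods=10, freq="B"),
)

# Resampling: keep the last observation of each period (here, each week).
weekly_prices = prices.resample("W").last()

# Standardization: scale forward returns by a rolling mean and standard deviation.
window = 3  # assumed window length, for illustration only
forward_returns = prices.shift(-1) / prices - 1
rolling = forward_returns.rolling(window=window)
standardized_returns = (forward_returns - rolling.mean()) / rolling.std()
```

In this sketch the first `window - 1` standardized values are NaN because the rolling window is not yet full, and the final value is NaN because the last price has no forward return.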

Returns:

(pd.Series or pd.DataFrame) Labels of -1, 0, or 1 indicating whether the return for each observation is below, within, or above the threshold band at each corresponding time index. The last row is NaN when lag is True; otherwise the first row is NaN.
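The position of the NaN row follows from how the return is aligned to the time index. The snippet below is a small pandas illustration with made-up prices; it is a sketch of the alignment idea, not the library's code.

```python
import pandas as pd

prices = pd.Series(
    [100.0, 102.0, 101.0, 103.0],
    index=pd.date_range("2021-01-04", periods=4, freq="B"),
)

# Backward-looking return: the row at t holds the return from t-1 to t,
# so the first row has no previous price.
backward_returns = prices / prices.shift(1) - 1

# Forward-looking return: the row at t holds the return from t to t+1,
# so the last row has no next price.
forward_returns = backward_returns.shift(-1)

print(backward_returns.isna().tolist())  # [True, False, False, False]
print(forward_returns.isna().tolist())   # [False, False, False, True]
```

Rows with a NaN return cannot be labeled and are typically dropped before the labels are used for training.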