Return Versus Benchmark

Labeling versus benchmark is featured in the paper "Evaluating multiple classifiers for stock price direction prediction" by Ballings et al. (2015). In this paper, the authors label yearly forward stock returns against a predetermined benchmark, and use the labeled data to compare the performance of several machine learning algorithms in predicting long-term price movements.

Labeling against benchmark is a simple method of labeling financial data in which time-indexed returns are labeled according to whether they exceed a set value. The benchmark can be either a constant value, or a pd.Series of values with an index matching that of the returns. The labels can be the numerical value of how much each observation’s return exceeds the benchmark, or the sign of the excess.

At time \(t\), given that the price of stock \(n\) is \(p_{t,n}\) and the benchmark is \(B_t\), the return is:

\[r_{t,n} = \frac{p_{t,n}}{p_{t-1,n}} - 1\]

Note that \(B_t\) is a scalar value corresponding to the benchmark at time \(t\), while \(B\) is the vector of all benchmarks across all timestamps. The labels are:

\[L(r_{t,n}) = r_{t,n} - B_t\]

If categorical labels are desired:

\[L(r_{t,n}) = \begin{cases} -1 & \text{if} \ \ r_{t,n} < B_t\\ 0 & \text{if} \ \ r_{t,n} = B_t\\ 1 & \text{if} \ \ r_{t,n} > B_t \end{cases}\]
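As a minimal sketch of the two labeling formulas above, the following computes numerical and categorical labels directly with pandas. The price series and the constant benchmark of 0 are hypothetical values chosen for illustration; this is not the library's implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical daily closing prices for a single asset (illustrative values only).
prices = pd.Series(
    [100.0, 102.0, 101.0, 101.0, 104.0],
    index=pd.date_range("2020-03-02", periods=5, freq="B"),
)

# Simple returns r_t = p_t / p_{t-1} - 1; the first observation has no return.
returns = prices.pct_change()

# Constant benchmark B_t = 0 for every timestamp (an assumption for this sketch).
benchmark = 0.0

# Numerical labels: the excess return over the benchmark.
numerical_labels = returns - benchmark

# Categorical labels: the sign of the excess (-1, 0 or 1); missing values stay missing.
categorical_labels = np.sign(numerical_labels)
```

The third return is exactly zero because the price is unchanged, so its categorical label is 0 rather than -1 or 1.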

The simplest method of labeling is just returning the sign of the return. However, sometimes it is desirable to quantify the return compared to a benchmark to better contextualize the returns. This is commonly done by using the mean or median return of multiple stocks in the market. However, that data may not always be available, and sometimes the user might wish to specify a constant or more custom benchmark to compare returns against. Note that these benchmarks are unidirectional only. If the user would like a benchmark that captures the absolute value of the returns, then the fixed horizon method should be used instead.
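To illustrate the cross-sectional benchmark mentioned above, here is a small sketch that uses the median return across several stocks as \(B_t\). The tickers "A", "B" and "C" and their prices are hypothetical, and the approach shown is one plausible way to build such a benchmark rather than a prescribed one.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for three stocks (illustrative values only).
prices = pd.DataFrame(
    {
        "A": [100.0, 110.0, 121.0],
        "B": [50.0, 50.0, 51.0],
        "C": [20.0, 19.0, 19.0],
    },
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)

# Per-stock simple returns; the first row is NaN.
returns = prices.pct_change()

# B_t: the cross-sectional median return at each timestamp.
benchmark = returns.median(axis=1)

# Excess return of each stock over the median, aligned row by row.
excess = returns.sub(benchmark, axis=0)

# Categorical labels: sign of the excess return.
labels = np.sign(excess)
```

With an odd number of stocks, the stock whose return equals the median always receives the label 0 at that timestamp.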

If desired, the user can specify a resampling period to apply to the price data prior to calculating returns. The user can also lag the returns to make them forward-looking. In the paper by Ballings et al., the authors use yearly forward returns, and compare them to benchmark values of 15%, 25%, and 35%.
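The resampling and lagging steps described above can be sketched as follows. This assumes that "lagging" means shifting returns back by one period, so that the label at time \(t\) describes the return from \(t\) to \(t+1\); the library's internals may differ. The prices, the weekly period and the 5% benchmark are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical irregular daily prices spanning three weeks (illustrative values only).
prices = pd.Series(
    [100.0, 100.0, 105.0, 110.0, 108.0, 99.0],
    index=pd.to_datetime(
        ["2020-03-03", "2020-03-06", "2020-03-10",
         "2020-03-13", "2020-03-17", "2020-03-20"]
    ),
)

# Resample to weekly frequency, keeping the last observation of each week.
weekly = prices.resample("W").last()

# Weekly simple returns.
returns = weekly.pct_change()

# Shift back one period so each row holds the return over the NEXT week.
forward_returns = returns.shift(-1)

# Compare forward returns to a constant benchmark of 5% (assumed for this sketch).
labels = np.sign(forward_returns - 0.05)
```

The last row has no following period, so its forward return and label are missing.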

The following shows the returns for MSFT stock during March-April 2020, compared to the return of SPY as a benchmark during the same time period. Green dots represent days when MSFT outperformed SPY, and red dots represent days when MSFT underperformed SPY.

Figure: Comparison of MSFT return to SPY return.

Note

Underlying Literature

This labeling method is sourced from Chapter 5.5.1 of Machine Learning for Factor Investing, by Coqueret, G. and Guida, T. (2020).

Implementation

Return in excess of a given benchmark.

See Chapter 5 of Machine Learning for Factor Investing, by Coqueret and Guida (2020).

The paper “Evaluating multiple classifiers for stock price direction prediction” by Ballings et al. (2015) uses this method to label yearly returns against a predetermined value, and then compares the performance of several machine learning algorithms on the labeled data.

return_over_benchmark(prices, benchmark=0, binary=False, resample_by=None, lag=True)

Return over benchmark labeling method. Sourced from Chapter 5.5.1 of Machine Learning for Factor Investing, by Coqueret, G. and Guida, T. (2020).

Returns a Series or DataFrame of numerical or categorical returns over a given benchmark. The time index of the benchmark must match that of the price observations.

Parameters:

prices – (pd.Series/pd.DataFrame) Time indexed prices to compare returns against a benchmark.
benchmark – (pd.Series or float) Benchmark of returns to compare the returns from prices against for labeling. Can be a constant value, or a Series matching the index of prices. If no benchmark is given, then it is assumed to have a constant value of 0.
binary – (bool) If False, labels are given by their numerical value of return over benchmark. If True, labels are given according to the sign of their excess return.
resample_by – (str) If not None, the resampling period for price data prior to calculating returns. ‘B’ = per business day, ‘W’ = week, ‘M’ = month, etc. Will take the last observation for each period. For full details, see the pandas documentation on offset aliases.
lag – (bool) If True, returns will be lagged to make them forward-looking.

Returns:

(pd.Series/pd.DataFrame) Excess returns over benchmark. If binary, the labels are -1 if the return is below the benchmark, 1 if above, and 0 if it exactly matches the benchmark.
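Because a Series benchmark must share its index with the prices, it can help to check alignment before labeling. The sketch below shows one possible check using plain pandas; the data are hypothetical, and reindexing is only one of several reasonable ways to handle a mismatch.

```python
import pandas as pd

# Hypothetical price series on four business days.
idx = pd.date_range("2020-03-02", periods=4, freq="B")
prices = pd.Series([100.0, 101.0, 103.0, 102.0], index=idx)

# Hypothetical benchmark returns with one date missing.
benchmark = pd.Series([0.00, 0.01, 0.02], index=idx[[0, 1, 3]])

# True only if both indexes contain the same timestamps in the same order.
index_matches = prices.index.equals(benchmark.index)

# One possible fix: reindex the benchmark to the price index, leaving gaps as NaN.
aligned_benchmark = benchmark.reindex(prices.index)
```

Whether a gap should be left missing, filled, or dropped depends on the data, so the choice is left to the user here.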

Example

Below is an example of how to use the return over benchmark labeling technique on real data.

# Import packages
import yfinance as yf

# Import MlFinLab tools
from mlfinlab.labeling.return_vs_benchmark import return_over_benchmark

# Import price data
tickers = "AAPL SPY"
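# auto_adjust=False is passed so that the "Adj Close" column is returned
# (recent yfinance releases adjust prices by default and omit it)
data = yf.download(
    tickers, start="2010-01-01", end="2020-01-01", auto_adjust=False
)["Adj Close"]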

# Get returns in SPY to be used as a benchmark
benchmark_returns = data["SPY"].pct_change()
# Use Apple as asset to compare to benchmark
apple_prices = data["AAPL"]

# Create labels using SPY as a benchmark
numerical_labels = return_over_benchmark(
    prices=apple_prices, benchmark=benchmark_returns
)

# Create labels categorically
binary_labels = return_over_benchmark(
    prices=apple_prices, benchmark=benchmark_returns, binary=True
)

# Label yearly forward returns, with the benchmark being a 25% increase in price
yearly_labels = return_over_benchmark(
    prices=apple_prices, benchmark=0.25, binary=True, resample_by="Y", lag=True
)

                      

Research Notebook

The following research notebook can be used to better understand the return against benchmark labeling technique.

Return Over Benchmark Example

References

Ballings, M., Van den Poel, D., Hespeels, N. and Gryp, R., 2015. Evaluating multiple classifiers for stock price direction prediction. Expert Systems with Applications, 42(20), pp.7046-7056.