Sample Datasets
This module provides very small data samples that help users test and validate functions.
The three sample data sets are:
Tick data (2011/07/31 - 2011/07/31)
Dollar Bars Data Structure (2015/01/01 - 2015/01/29)
ETF Dataset (2008 - 2016)
Tick Data
MlFinLab provides a sample (2011/07/31 - 2011/07/31) of tick data for E-Mini S&P 500 futures, which can be used to test bar compression algorithms, microstructural features, etc. The tick data sample consists of Timestamp, Price, and Volume.
load_tick_sample() -> DataFrame
    Loads E-Mini S&P 500 futures tick data sample.
    Returns:
        (pd.DataFrame) Frame with tick data sample.
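As a rough illustration of the kind of bar compression the tick sample can be used to test, the sketch below groups ticks into bars of a fixed traded volume. It is not MlFinLab's implementation: the function name `simple_volume_bars` is hypothetical, the `Price` and `Volume` column names and the `Timestamp` index are assumptions about the sample's layout, and the data is synthetic so that the example runs without the library.

```python
import pandas as pd


def simple_volume_bars(ticks: pd.DataFrame, threshold: float) -> pd.DataFrame:
    """Group ticks into bars of roughly `threshold` traded volume each.

    Hypothetical helper for illustration only; assumes `Price` and
    `Volume` columns.
    """
    # Volume traded before each tick decides which bar the tick falls into.
    cum_before = ticks["Volume"].cumsum() - ticks["Volume"]
    bar_id = (cum_before // threshold).astype(int)

    grouped = ticks.groupby(bar_id)
    bars = pd.DataFrame({
        "open": grouped["Price"].first(),
        "high": grouped["Price"].max(),
        "low": grouped["Price"].min(),
        "close": grouped["Price"].last(),
        "cum_volume": grouped["Volume"].sum(),
    })
    bars.index.name = "bar_id"
    return bars


# Synthetic stand-in for the tick sample (layout is an assumption).
ticks = pd.DataFrame(
    {
        "Price": [100.0, 101.0, 99.5, 100.5, 102.0, 101.5],
        "Volume": [5, 5, 10, 4, 6, 10],
    },
    index=pd.date_range("2011-07-31 09:30", periods=6, freq="min", name="Timestamp"),
)

bars = simple_volume_bars(ticks, threshold=10)
print(bars)
```

With the real sample, the synthetic frame would be replaced by the result of `load_tick_sample()`, after confirming the actual column names.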
Dollar Bars
We also provide a sample (2015/01/01 - 2015/01/29) of dollar bars for E-Mini S&P 500 futures. Data set structure:
Open price (open)
High price (high)
Low price (low)
Close price (close)
Volume (cum_volume)
Dollar volume traded (cum_dollar)
Number of ticks inside the bar (cum_ticks)
Tip
You can find more information on dollar bars and other bar compression algorithms in Data Structures.
load_dollar_bar_sample() -> DataFrame
    Loads E-Mini S&P 500 futures dollar bars data sample.
    Returns:
        (pd.DataFrame) Frame with dollar bar data sample.
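One way to sanity-check a bar frame with the columns listed above is to verify that each bar is internally consistent and to derive a per-bar volume-weighted average price. The sketch below is a minimal example under stated assumptions: the helper names `check_bar_consistency` and `bar_vwap` are hypothetical, the column names are taken from the list above, and the two rows are synthetic (the second is deliberately inconsistent).

```python
import pandas as pd


def check_bar_consistency(bars: pd.DataFrame) -> pd.Series:
    """Return True for every bar whose fields are mutually consistent.

    Hypothetical helper for illustration only.
    """
    # The high must be at least as large as every other price in the bar.
    high_ok = bars["high"] >= bars[["open", "close", "low"]].max(axis=1)
    # The low must be no larger than every other price in the bar.
    low_ok = bars["low"] <= bars[["open", "close", "high"]].min(axis=1)
    # Cumulative counters should be strictly positive for a non-empty bar.
    positive = (bars[["cum_volume", "cum_dollar", "cum_ticks"]] > 0).all(axis=1)
    return high_ok & low_ok & positive


def bar_vwap(bars: pd.DataFrame) -> pd.Series:
    """Volume-weighted average price of each bar."""
    return bars["cum_dollar"] / bars["cum_volume"]


# Synthetic stand-in for the dollar bar sample.
bars = pd.DataFrame({
    "open": [2000.0, 2003.0],
    "high": [2005.0, 2002.0],  # second bar: high below open (inconsistent)
    "low": [1998.0, 2000.0],
    "close": [2003.0, 2001.0],
    "cum_volume": [100, 50],
    "cum_dollar": [200200.0, 100050.0],
    "cum_ticks": [40, 20],
})

consistent = check_bar_consistency(bars)
vwap = bar_vwap(bars)
print(consistent.tolist(), vwap.tolist())
```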
ETF Prices
The data set consists of close prices from 2008 to 2016 for:
EEM, EWG, TIP, EWJ, EFA, IEF, EWQ, EWU, XLB, XLE, XLF, LQD, XLK, XLU, EPP, FXI, VGK, VPL, SPY, TLT, BND, CSJ, DIA
It can be used to test and validate portfolio optimization techniques.
load_stock_prices() -> DataFrame
    Loads the stock prices data set, consisting of EEM, EWG, TIP, EWJ, EFA, IEF, EWQ, EWU, XLB, XLE, XLF, LQD, XLK, XLU, EPP, FXI, VGK, VPL, SPY, TLT, BND, CSJ, DIA from 2008 to 2016.
    Returns:
        (pd.DataFrame) The stock_prices data frame.
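To give a sense of how a close-price frame like this might feed a portfolio optimization test, the sketch below computes simple returns and inverse-variance weights. Inverse-variance weighting is used here only as a simple stand-in technique, not as a method this module prescribes. The function name `inverse_variance_weights` is hypothetical, and the prices are synthetic values for three of the listed tickers.

```python
import pandas as pd


def inverse_variance_weights(prices: pd.DataFrame) -> pd.Series:
    """Weight each asset in proportion to 1 / variance of its returns.

    Hypothetical helper for illustration only.
    """
    # Simple period-over-period returns; the first row is NaN and is dropped.
    returns = prices.pct_change().dropna()
    inv_var = 1.0 / returns.var()
    # Normalize so that the weights sum to one.
    return inv_var / inv_var.sum()


# Synthetic close prices (not the real sample values).
prices = pd.DataFrame(
    {
        "SPY": [100.0, 102.0, 101.0, 103.0],
        "TLT": [50.0, 50.5, 50.25, 50.75],
        "LQD": [110.0, 110.55, 110.0, 110.6],
    },
    index=pd.date_range("2008-01-02", periods=4, freq="D"),
)

weights = inverse_variance_weights(prices)
print(weights)
```

With the real sample, `prices` would be the frame returned by `load_stock_prices()`; lower-volatility assets receive larger weights under this scheme.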
Example Code
# Import the three sample-data loaders.
from mlfinlab.datasets import (load_tick_sample, load_stock_prices, load_dollar_bar_sample)

# Each loader returns a pandas DataFrame.
tick_df = load_tick_sample()
dollar_bars_df = load_dollar_bar_sample()
stock_prices_df = load_stock_prices()