mlfinlab.microstructural_features.feature_generator
Inter-bar feature generator which uses trades data and bars index to calculate inter-bar features.
Module Contents
Classes
MicrostructuralFeaturesGenerator: Class which is used to generate inter-bar features when bars are already compressed.
- class MicrostructuralFeaturesGenerator(trades_input: (str, pandas.DataFrame), tick_num_series: pandas.Series, batch_size: int = 20000000.0, volume_encoding: dict = None, pct_encoding: dict = None)
Class which is used to generate inter-bar features when bars are already compressed.
- Parameters:
  - trades_input – (str/pd.DataFrame) Path to the csv file, or a Pandas DataFrame, containing raw tick data in the format [date_time, price, volume].
  - tick_num_series – (pd.Series) Series of tick numbers at which each bar was formed.
  - batch_size – (int) Number of rows to read in from the csv, per batch.
  - volume_encoding – (dict) Encoding scheme for trade sizes, used to calculate entropy on encoded messages.
  - pct_encoding – (dict) Encoding scheme for log returns, used to calculate entropy on encoded messages.
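The role of an encoding dictionary can be illustrated with a small standalone sketch. It assumes that the dictionary maps numeric reference values to single-character symbols and that each observation is encoded with the symbol of the nearest key; this layout is an assumption made for illustration, not something this page confirms. The helper names `encode_values` and `shannon_entropy` are hypothetical and are not part of mlfinlab.

```python
import math
from collections import Counter


def encode_values(values, encoding):
    """Encode each value with the symbol of the nearest key in `encoding`.

    Assumption: `encoding` maps numeric reference values to one-character symbols.
    """
    keys = sorted(encoding)
    return "".join(encoding[min(keys, key=lambda k: abs(k - v))] for v in values)


def shannon_entropy(message):
    """Shannon entropy of a string, in bits per symbol."""
    counts = Counter(message)
    n = len(message)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


# Hypothetical encoding scheme for trade sizes.
volume_encoding = {1: "a", 10: "b", 100: "c"}
volumes = [1, 2, 12, 90, 150, 9]

message = encode_values(volumes, volume_encoding)  # "aabccb"
entropy = shannon_entropy(message)                 # three equally likely symbols
```

Each of the three symbols appears twice in the encoded message, so the entropy equals log2(3) bits per symbol.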
- get_features(verbose=True, to_csv=False, output_path=None)
Reads a csv file of ticks or a pd.DataFrame in batches and constructs the corresponding microstructural intra-bar features: average tick size, tick rule sum, VWAP, Kyle lambda, Amihud lambda, Hasbrouck lambda, and the Shannon, Lempel-Ziv, and Plug-in entropies of the tick rule, volume, and pct (log return) series when the corresponding mapping dictionaries (self.volume_encoding, self.pct_encoding) are provided. The csv file must have exactly 3 columns: date_time, price, and volume.
- Parameters:
  - verbose – (bool) If True, print a message after each processed batch.
  - to_csv – (bool) If True, write the generated features to a local csv file; otherwise return them as an in-memory DataFrame.
  - output_path – (str) Path to the results file, used if to_csv = True.
- Returns:
  - (DataFrame or None) Microstructural features for bar index.
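To make the simplest of these features concrete, the sketch below computes average tick size, tick rule sum, and VWAP per bar from a small in-memory tick list. It is a standalone illustration, not the library's implementation. It assumes that each entry in the tick number series is the 1-based number of the last tick in a bar (inclusive), and that the tick rule sign is 0 until the first price change. The function name `bar_features` is hypothetical.

```python
def bar_features(ticks, tick_nums):
    """Compute simple per-bar features.

    ticks: list of (price, volume) tuples in time order.
    tick_nums: 1-based tick numbers at which each bar closes (assumed inclusive).
    """
    features = []
    start = 0
    prev_price = None
    prev_sign = 0  # assumption: sign is 0 until the first price change
    for end in tick_nums:
        bar = ticks[start:end]
        signs = []
        for price, _ in bar:
            # Tick rule: sign of the price change, carried forward when the price is unchanged.
            if prev_price is not None and price != prev_price:
                prev_sign = 1 if price > prev_price else -1
            signs.append(prev_sign)
            prev_price = price
        total_volume = sum(volume for _, volume in bar)
        features.append({
            "avg_tick_size": total_volume / len(bar),
            "tick_rule_sum": sum(signs),
            "vwap": sum(price * volume for price, volume in bar) / total_volume,
        })
        start = end
    return features


ticks = [(100.0, 2), (101.0, 1), (101.0, 3), (100.0, 4), (99.0, 1), (100.0, 1)]
features = bar_features(ticks, tick_nums=[3, 6])  # two bars of three ticks each
```

The tick rule state is carried across bar boundaries, so the first tick of the second bar is signed relative to the last price of the first bar.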