mlfinlab.feature_importance.fingerprint

Implementation of the model fingerprint algorithm described in Yimou Li, David Turkington, and Alireza Yazdani, ‘Beyond the Black Box: An Intuitive Approach to Investment Prediction with Machine Learning’ (https://jfds.pm-research.com/content/early/2019/12/11/jfds.2019.1.023).

Module Contents

Classes

AbstractModelFingerprint

Model fingerprint constructor.

RegressionModelFingerprint

Fingerprint class for regression models.

ClassificationModelFingerprint

Fingerprint class for classification models.

class AbstractModelFingerprint

Bases: abc.ABC

Model fingerprint constructor.

This is an abstract base class for the RegressionModelFingerprint and ClassificationModelFingerprint classes.

__slots__ = ()

fit(model: object, X: pandas.DataFrame, num_values: int = 50, pairwise_combinations: list = None) → None

Estimate linear, non-linear and pairwise effects.

Parameters:
  • model – (object) Trained model.

  • X – (pd.DataFrame) Dataframe of features.

  • num_values – (int) Number of values used to estimate feature effect.

  • pairwise_combinations – (list) Tuples (feature_i, feature_j) to test pairwise effect.
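The following is a minimal sketch of the idea behind the linear and non-linear effects, following the partial dependence approach of the cited paper. It is not the library's code: the helper names, the dictionary-based rows, the evenly spaced grid, and the toy model are all assumptions made for illustration.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction over `rows` with `feature` forced to each grid value."""
    return [mean(predict({**row, feature: value}) for row in rows) for value in grid]


def linear_and_nonlinear_effect(grid, curve):
    """Split a partial dependence curve into a linear and a non-linear effect."""
    x_bar, y_bar = mean(grid), mean(curve)
    # Ordinary least squares line through the partial dependence curve.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(grid, curve))
    sxx = sum((x - x_bar) ** 2 for x in grid)
    fitted = [y_bar + (sxy / sxx) * (x - x_bar) for x in grid]
    # Linear effect: mean absolute deviation of the fitted line from its mean.
    linear = mean(abs(f - y_bar) for f in fitted)
    # Non-linear effect: mean absolute deviation of the curve from the line.
    non_linear = mean(abs(y - f) for y, f in zip(curve, fitted))
    return linear, non_linear


# Hypothetical stand-in for a trained model: linear in "a", quadratic in "b".
def toy_predict(row):
    return 2.0 * row["a"] + row["b"] ** 2


rows = [{"a": -1.0, "b": 1.0}, {"a": 0.0, "b": -1.0}, {"a": 1.0, "b": 0.0}]
num_values = 5  # the documented default is 50; kept small here
# Assumption: an evenly spaced grid; the library may choose values differently.
grid = [-1.0 + 2.0 * k / (num_values - 1) for k in range(num_values)]

lin_a, non_lin_a = linear_and_nonlinear_effect(
    grid, partial_dependence(toy_predict, rows, "a", grid)
)
lin_b, non_lin_b = linear_and_nonlinear_effect(
    grid, partial_dependence(toy_predict, rows, "b", grid)
)
```

In this toy setup the effect of "a" is entirely linear and the effect of "b" is entirely non-linear, which is the kind of separation the fingerprint is meant to expose.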

get_effects() → Tuple

Return the computed linear, non-linear and pairwise effects. Call fit() before using this method.

Returns:

(tuple) Linear, non-linear and pairwise effects, each a dictionary holding raw and normalized values.
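As a rough illustration of what a dictionary with raw and normalized values could look like, the sketch below normalizes one effect so that its values sum to one across features. The "raw" and "norm" keys, the sum-to-one normalization, and the numbers are assumptions for illustration, not taken from this page.

```python
def normalize(effects):
    """Scale effect values so they sum to one across all features."""
    total = sum(effects.values())
    return {name: value / total for name, value in effects.items()}


# Made-up raw linear effects for three hypothetical features.
raw_linear = {"a": 1.2, "b": 0.0, "c": 0.3}

# Assumed layout: one dictionary per effect, with raw and normalized values.
linear_effect = {"raw": raw_linear, "norm": normalize(raw_linear)}
```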

plot_effects(sort_by: str = 'lin', normalized=False) → matplotlib.pyplot.figure

Plot the linear and non-linear effects as bar plots, using raw or normalized values according to the normalized parameter. Pairwise effects are also plotted if they were calculated. Results are sorted by linear effect values by default.

Parameters:
  • sort_by – (str) Effect (‘lin’ or ‘non-lin’) to sort the results by.

  • normalized – (bool) Whether to plot normalized values. Values are normalized across all variables.

Returns:

(plt.figure) Plot figure.
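The effect of sort_by on bar order can be sketched without any plotting. The helper name bar_order, the descending order, and the numbers below are assumptions for illustration; the method itself may order bars differently.

```python
def bar_order(linear, non_linear, sort_by="lin"):
    """Return feature names in the order their bars would be drawn."""
    if sort_by not in ("lin", "non-lin"):
        raise ValueError("sort_by must be 'lin' or 'non-lin'")
    chosen = linear if sort_by == "lin" else non_linear
    # Assumption: largest effect first.
    return sorted(chosen, key=chosen.get, reverse=True)


# Made-up effect values for three hypothetical features.
linear = {"a": 0.8, "b": 0.0, "c": 0.2}
non_linear = {"a": 0.1, "b": 0.6, "c": 0.3}

order_by_linear = bar_order(linear, non_linear)
order_by_non_linear = bar_order(linear, non_linear, sort_by="non-lin")
```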

class RegressionModelFingerprint

Bases: AbstractModelFingerprint

Fingerprint class for regression models.

__slots__ = ()

fit(model: object, X: pandas.DataFrame, num_values: int = 50, pairwise_combinations: list = None) → None

Estimate linear, non-linear and pairwise effects.

Parameters:
  • model – (object) Trained model.

  • X – (pd.DataFrame) Dataframe of features.

  • num_values – (int) Number of values used to estimate feature effect.

  • pairwise_combinations – (list) Tuples (feature_i, feature_j) to test pairwise effect.
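A pairwise effect for one (feature_i, feature_j) tuple can be sketched as the part of the joint partial dependence that the two single-feature partial dependences do not explain, in the spirit of the cited paper. The helper names, dictionary-based rows, grids, and toy models below are assumptions for illustration, not the library's code.

```python
from statistics import mean


def pd_1d(predict, rows, feature, grid):
    """Partial dependence of the prediction on one feature."""
    return [mean(predict({**r, feature: v}) for r in rows) for v in grid]


def pd_2d(predict, rows, f_i, f_j, grid_i, grid_j):
    """Joint partial dependence on two features, as a nested list."""
    return [
        [mean(predict({**r, f_i: vi, f_j: vj}) for r in rows) for vj in grid_j]
        for vi in grid_i
    ]


def center(values):
    """Subtract the mean from every value."""
    m = mean(values)
    return [v - m for v in values]


def pairwise_effect(predict, rows, f_i, f_j, grid_i, grid_j):
    """Mean absolute gap between the joint and the summed individual curves."""
    ci = center(pd_1d(predict, rows, f_i, grid_i))
    cj = center(pd_1d(predict, rows, f_j, grid_j))
    joint = pd_2d(predict, rows, f_i, f_j, grid_i, grid_j)
    joint_mean = mean(v for row in joint for v in row)
    return mean(
        abs((joint[a][b] - joint_mean) - ci[a] - cj[b])
        for a in range(len(grid_i))
        for b in range(len(grid_j))
    )


rows = [{"a": -1.0, "b": 1.0}, {"a": 1.0, "b": -1.0}]
grid = [-1.0, 0.0, 1.0]
pairwise_combinations = [("a", "b")]

# Hypothetical models: one purely additive, one with an a*b interaction.
additive = {
    pair: pairwise_effect(lambda r: r["a"] + r["b"], rows, *pair, grid, grid)
    for pair in pairwise_combinations
}
interacting = {
    pair: pairwise_effect(lambda r: r["a"] * r["b"], rows, *pair, grid, grid)
    for pair in pairwise_combinations
}
```

The additive model yields a pairwise effect of zero, while the interacting model yields a positive one, so the measure reacts only to genuine interaction between the two features.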

get_effects() → Tuple

Return the computed linear, non-linear and pairwise effects. Call fit() before using this method.

Returns:

(tuple) Linear, non-linear and pairwise effects, each a dictionary holding raw and normalized values.

plot_effects(sort_by: str = 'lin', normalized=False) → matplotlib.pyplot.figure

Plot the linear and non-linear effects as bar plots, using raw or normalized values according to the normalized parameter. Pairwise effects are also plotted if they were calculated. Results are sorted by linear effect values by default.

Parameters:
  • sort_by – (str) Effect (‘lin’ or ‘non-lin’) to sort the results by.

  • normalized – (bool) Whether to plot normalized values. Values are normalized across all variables.

Returns:

(plt.figure) Plot figure.

class ClassificationModelFingerprint

Bases: AbstractModelFingerprint

Fingerprint class for classification models.

__slots__ = ()

fit(model: object, X: pandas.DataFrame, num_values: int = 50, pairwise_combinations: list = None) → None

Estimate linear, non-linear and pairwise effects.

Parameters:
  • model – (object) Trained model.

  • X – (pd.DataFrame) Dataframe of features.

  • num_values – (int) Number of values used to estimate feature effect.

  • pairwise_combinations – (list) Tuples (feature_i, feature_j) to test pairwise effect.
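For a classifier, the effects need a continuous model output to work with. One plausible choice, sketched below, is the predicted probability of the positive class from a scikit-learn style predict_proba. Whether ClassificationModelFingerprint does exactly this is an assumption, not something this page states; ToyClassifier and classification_output are hypothetical names.

```python
import math


class ToyClassifier:
    """Hypothetical stand-in exposing a scikit-learn style predict_proba."""

    def predict_proba(self, rows):
        out = []
        for row in rows:
            p = 1.0 / (1.0 + math.exp(-row["a"]))  # logistic in feature "a"
            out.append([1.0 - p, p])  # [P(class 0), P(class 1)]
        return out


def classification_output(model, rows, positive_class=1):
    """Continuous output for effect estimation: positive-class probability."""
    return [probs[positive_class] for probs in model.predict_proba(rows)]


rows = [{"a": -2.0}, {"a": 0.0}, {"a": 2.0}]
probs = classification_output(ToyClassifier(), rows)
```

Under this assumption, the rest of the estimation would proceed as for a regression model, with probabilities in place of predicted values.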

get_effects() → Tuple

Return the computed linear, non-linear and pairwise effects. Call fit() before using this method.

Returns:

(tuple) Linear, non-linear and pairwise effects, each a dictionary holding raw and normalized values.

plot_effects(sort_by: str = 'lin', normalized=False) → matplotlib.pyplot.figure

Plot the linear and non-linear effects as bar plots, using raw or normalized values according to the normalized parameter. Pairwise effects are also plotted if they were calculated. Results are sorted by linear effect values by default.

Parameters:
  • sort_by – (str) Effect (‘lin’ or ‘non-lin’) to sort the results by.

  • normalized – (bool) Whether to plot normalized values. Values are normalized across all variables.

Returns:

(plt.figure) Plot figure.