A. Interpretability and Explainability


Defining Explainability

We can loosely define an "explainable" model as one that can give a reasonable answer to the following questions:
Importance Measures - Why was that prediction made as a function of our inputs and their interactions?
Explanation Methods - Under what conditions would the outcome differ?
Some problems are just too complex to explain, e.g., a 20-layer neural network with 4,000 features.
It is exactly their intractability to the human mind that makes them well suited to equation-generating algorithms.
For that reason some firms don’t care much about explainability. Anecdotally, firms like RenTech sometimes have no idea why their models are doing what they are doing.

Importance Measures

Interpretable Models

For interpretable models you know the exact contribution of every feature to the final output. For uninterpretable models you only have an estimate of each feature’s contribution to the final output.
Interpretable (white-box) models are inherently explainable; we don’t need methods like permutation importance or Shapley value calculation to identify the feature effects.
Uninterpretable (black-box) models are not interpretable by nature, so we need explainability methods such as permutation importance or Shapley values.
Explainability methods seek to close the gap in understanding between white-box and black-box models.
No matter how many explainability methods you use, they will always be estimates and will never give you the intrinsic explanations of, say, a linear regression model, i.e., its parameter coefficients.
A well-performing model is a necessary condition for trusting the explainability outcomes. Features that are deemed of low importance in a bad model might be very important in a good model.
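To make "exact contribution" concrete, here is a minimal sketch on synthetic data. The data, the coefficient values, and all variable names are assumptions made for illustration. For a fitted linear regression, each feature's contribution to a prediction is its coefficient times its value, and the contributions plus the intercept add up to the prediction exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): three features, the third has no effect.
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Ordinary least squares with an intercept column.
X_design = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
intercept, coefs = beta[0], beta[1:]

# Exact per-feature contribution for one row: coefficient * feature value.
row = X[0]
contributions = coefs * row
prediction = intercept + contributions.sum()

print("coefficients:", coefs)
print("contributions for row 0:", contributions)
print("prediction for row 0:", prediction)
```

Nothing here is estimated after the fact: the decomposition is read straight off the fitted parameters, which is what makes the model white-box.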

Explanation Methods

Linear models are intrinsically interpretable, but often underperform on nonlinear problems.
Nonlinear models are powerful, but not intrinsically interpretable.
Explainability is the use of post hoc approaches that make ML models interpretable.
Explanation methods allow us to use more powerful models (e.g., gradient boosting machines, neural networks) while still understanding how they work.
They help turn what were previously black-box models into grey-box models, so we get the benefit of both model performance and model explainability.
Explainability methods are also sometimes called post hoc interpretability methods, because they are applied to models that lack the intrinsic interpretability of, say, a linear regression model.
Intrinsic or post hoc? This criterion distinguishes whether interpretability is achieved by restricting the complexity of the machine learning model (intrinsic) or by applying explainability methods that analyze the model after training (post hoc).
Explainability methods can also be applied to inherently interpretable models, e.g., permutation importance applied to a linear regression model, but this rarely makes sense because you are trading exact numbers for estimates.
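A minimal sketch of permutation importance, written from scratch with numpy rather than taken from any particular library, shows what a post hoc, model-agnostic estimate looks like. The function name, the synthetic data, and the choice of mean squared error as the score are all assumptions. The model is deliberately a linear regression, so the estimate can be set against the exact coefficients.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when one column at a time is shuffled."""
    rng = np.random.default_rng(seed)
    base_mse = np.mean((y - predict(X)) ** 2)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the target.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            increases.append(np.mean((y - predict(X_perm)) ** 2) - base_mse)
        importances[j] = np.mean(increases)
    return importances

# Synthetic data (hypothetical): the third feature is irrelevant.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=1000)

# Any object with a predict-like callable would do; here it is OLS.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(data):
    return beta[0] + data @ beta[1:]

estimated = permutation_importance(predict, X, y)
exact_coefs = beta[1:]

print("exact coefficients:", exact_coefs)
print("permutation importance (increase in MSE):", estimated)
```

The ranking of the estimates agrees with the ranking of the absolute coefficients, but the numbers are on a different scale and vary with the random seed, which illustrates the trade of exact numbers for estimates.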

Why explainability?

Why is the model working?
We don’t just want to know that Warren Buffett makes a lot of money, we want to know why he makes a lot of money.
In the same way, we don’t just want to know that the machine learning model is good, we also want to know why the model is good.
If we know why the model performs well we can more easily improve the model and learn under what conditions the model could improve more, or in fact struggle.
Why is the model failing?
During drawdown periods, the research team will want to explain why a model failed, which requires some degree of interpretability.
Is it due to abnormal transaction costs, a bug in the code, or is the market regime not suitable for this type of strategy?
With a better understanding of which features add value, a better answer to drawdowns can be provided. In this way models are not as ‘black box’ as previously described.
Should we trust the model?
Many people won't assume they can trust your model for important decisions without verifying some basic facts.
In practice, showing insights that fit their general understanding of the problem, e.g., past returns are predictive of future returns, will help build trust.
Being able to interpret the results of a machine learning model leads to better communication between quantitative portfolio managers and investors.
Clients feel much more comfortable when the research team can tell a story.
What data to collect?
Collecting and buying new types of data can be expensive or inconvenient, so firms want to know if it would be worth their while.
If your feature importance analysis shows that volatility features perform well and sentiment features do not, then you can collect more data on volatility.
Instead of randomly adding technical and fundamental indicators, it becomes a deliberate process of adding informative factors.
Feature selection?
We may also conclude that some features are not that informative for our model.
Fundamental features might look like noise to the model, whereas volatility features fit well.
As a result, we can exclude these fundamental features from the model and measure the change in performance.
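The exclude-and-measure step can be sketched as follows. The data are synthetic and the feature blocks are hypothetical stand-ins: `volatility` carries signal by construction, `fundamental` is pure noise by construction. The model is plain least squares, used only to keep the example short.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def r_squared(beta, X, y):
    """Coefficient of determination of the fitted model on (X, y)."""
    pred = beta[0] + X @ beta[1:]
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(42)
n = 600
volatility = rng.normal(size=(n, 2))     # hypothetical informative block
fundamental = rng.normal(size=(n, 50))   # hypothetical uninformative block
y = volatility @ np.array([1.0, -0.5]) + rng.normal(scale=1.0, size=n)

X_full = np.column_stack([volatility, fundamental])

# Split by position, as one would with time-ordered data (no shuffling).
train, test = slice(0, 150), slice(150, n)

r2_full = r_squared(fit_ols(X_full[train], y[train]), X_full[test], y[test])
r2_reduced = r_squared(fit_ols(volatility[train], y[train]), volatility[test], y[test])

print("out-of-sample R^2 with all features:", r2_full)
print("out-of-sample R^2 without fundamental features:", r2_reduced)
```

In this constructed case, dropping the uninformative block improves out-of-sample fit because the model no longer spends parameters on noise. On real data the comparison could go either way, which is why the performance is measured rather than assumed.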
Feature generation?
We can investigate feature interaction using partial dependence and feature contribution plots.
We might see that there are large interaction effects between volatility features and pricing data.
With this knowledge we can develop new features, such as the entropy of volatility values divided by the closing price.
We can also simply focus on a single feature and generate volatility with longer look-back periods, or measures that take the difference between volatility estimates, and so on.
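The single-feature variations mentioned above might look like the sketch below. The price series is simulated, the 20- and 60-period windows are arbitrary choices, and the volatility-to-price ratio is a simpler stand-in for the entropy-based feature, not an implementation of it.

```python
import numpy as np

def rolling_volatility(returns, window):
    """Rolling sample standard deviation; the first window-1 entries are NaN."""
    out = np.full(len(returns), np.nan)
    for t in range(window - 1, len(returns)):
        out[t] = np.std(returns[t - window + 1 : t + 1], ddof=1)
    return out

# Hypothetical closing prices from a simulated random walk in log space.
rng = np.random.default_rng(7)
close = 100.0 * np.exp(np.cumsum(rng.normal(scale=0.01, size=300)))
returns = np.diff(np.log(close))

vol_20 = rolling_volatility(returns, 20)   # short look-back
vol_60 = rolling_volatility(returns, 60)   # longer look-back
vol_spread = vol_20 - vol_60               # difference between estimates
vol_over_price = vol_20 / close[1:]        # volatility scaled by closing price

print(vol_spread[-5:])
print(vol_over_price[-5:])
```

Each window uses only observations up to and including time t, so the generated features do not look ahead.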
Empirical Discovery?
The interpretability of models and explainability of results have a central place in the use of machine learning for empirical discovery.
After assessing feature importance values you might identify that when a momentum and value factor are both low, higher returns are predicted.
In corporate bankruptcy prediction, after 2008, the importance of solvency ratios has taken center stage, replacing profitability ratios.

Explainability Types

Explainability methods can be classified along three criteria: whether they are (1) local or global, (2) model-specific or model-agnostic, and (3) numerical or visual.
Although not shown in the diagram above, all local methods can be turned into global methods. You can simply sum or average the local values across datapoints to create a global measure.
Local versus global: with local methods we calculate the contribution of every feature for every datapoint (row), with global methods we only have the aggregate importance value for every feature across the entire dataset.
Specific versus agnostic: model-specific explainability values arise from some internal characteristic of the model, e.g., only linear models have coefficients and only decision trees have feature splits, whereas agnostic methods can be applied to any model.
Numerical versus visual: some explanations are better communicated with visualizations, e.g., we can visualize how the output of a model changes with the progressive change of a feature value using Partial Dependence Plots.
It is my preference to work with model-agnostic methods that are both local and global; one method in particular, Shapley values, comes to mind.
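As a sketch of a method that is local, model-agnostic, and easily made global, the code below computes exact Shapley values by brute force over all feature coalitions. It is feasible only for a handful of features and is not how production libraries do it. The model, the data, and the choice to replace "absent" features with rows from a background sample are all assumptions.

```python
import numpy as np
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values for one row x (exponential in feature count).

    The value of a coalition is the average prediction when the features
    in the coalition are fixed to x and the rest come from `background`.
    """
    n = len(x)

    def value(coalition):
        data = background.copy()
        idx = list(coalition)
        if idx:
            data[:, idx] = x[idx]
        return predict(data).mean()

    phi = np.zeros(n)
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[j] += weight * (value(subset + (j,)) - value(subset))
    return phi

# Hypothetical black-box model with an interaction between features 0 and 1.
def predict(data):
    return data[:, 0] * data[:, 1] + 2.0 * data[:, 2]

rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))
X = rng.normal(size=(20, 3))

# Local: one row of feature contributions per datapoint.
local = np.array([shapley_values(predict, row, background) for row in X])

# Global: aggregate the local values per feature across the dataset.
global_importance = np.abs(local).mean(axis=0)

print("local contributions, first row:", local[0])
print("global importance:", global_importance)
```

The absolute value is taken before averaging because signed contributions can cancel across rows. For each row the local values sum to the prediction minus the average background prediction, which is a quick sanity check on any Shapley implementation.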

Common Misconceptions

Explaining the model ≠ explaining the data

Model inspection only tells you about the model.
The model might not accurately reflect the data.
If model performance is good, the model is more likely to reflect the data.
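A small constructed example shows the gap between model and data. The target depends entirely on x through a quadratic, yet a straight-line model assigns x almost no effect. The data and both models are assumptions chosen to make the point, not a claim about any real dataset.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-1.0, 1.0, size=20000)
y = x ** 2 + rng.normal(scale=0.05, size=20000)  # the data depend strongly on x

def fit_and_score(A, y):
    """Least-squares fit; returns coefficients and in-sample R^2."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ beta
    r2 = 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
    return beta, r2

# Poor model: a straight line cannot represent the symmetric relationship.
beta_line, r2_line = fit_and_score(np.column_stack([np.ones_like(x), x]), y)

# Better model: adding a squared term lets the model reflect the data.
beta_quad, r2_quad = fit_and_score(np.column_stack([np.ones_like(x), x, x ** 2]), y)

print("linear model: slope on x =", beta_line[1], " R^2 =", r2_line)
print("quadratic model: coefficient on x^2 =", beta_quad[2], " R^2 =", r2_quad)
```

Inspecting the straight-line model would suggest x is unimportant, which is a true statement about that model and a false one about the data.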

More Explainable + More Interpretable ≠ Better Decisions

You could be using the most interpretable and explainable models.
It doesn’t mean the performance of the model is great.
Nor does it mean that you are making the right predictions.
Neither does it guarantee the robustness of the relationships over time.

Explainable ≠ Understandable ≠ Trusted

A Random Forest model might be explainable, but do stakeholders understand and trust it?
Even when an explainable model is understandable, it doesn’t mean that it can be trusted.
Additional robustness tests have to be performed to establish that the model’s relationships hold up, for example across time periods.