Explainability & Transparency

To prevent
Inscrutable Evidence

A lack of interpretability and transparency can lead to algorithmic systems that are hard to control, monitor, and correct. This is the commonly cited ‘black-box’ issue.

A typical example is the “black box” model, where the criteria that lead to a prediction are inscrutable and humans are left guessing. In healthcare, an AI system could learn to make predictions based on factors that have little to do with the disease itself, such as the brand of MRI machine used, the time a blood test is taken or whether a patient was visited by a chaplain.

Tools for Explainability & Transparency

SHAP Model

This resource presents a unified approach to interpreting model predictions known as SHAP (SHapley Additive exPlanations), which combines the methodologies of Shapley regression values, layer-wise relevance propagation, LIME (local interpretable model-agnostic explanations), DeepLIFT (https://github.com/kundajelab/deeplift), Tree Interpreter, QII and Shapley sampling values to deliver a method that its authors claim can be used to explain the prediction of any machine learning model.
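
To make the idea behind Shapley-based attribution concrete, the following is a minimal brute-force sketch in plain Python. It is not the SHAP library's API or its approximation algorithms: the function name `shapley_values`, the toy model and the all-zeros baseline are assumptions chosen for illustration, and "missing" features are simply held at their baseline value.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley attributions for one prediction, by brute force.

    Features that have not been "revealed" yet are held at their baseline
    value. The cost grows factorially with the number of features, so this
    only works for toy examples.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous_output = model(current)
        for i in order:
            current[i] = x[i]  # reveal feature i
            output = model(current)
            phi[i] += output - previous_output  # marginal contribution of i
            previous_output = output
    # Average the marginal contributions over all feature orderings.
    return [total / len(orderings) for total in phi]


# Hypothetical model with an interaction between the second and third feature.
def toy_model(features):
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

For this toy model the attributions are 2.0, 3.0 and 3.0: the additive term goes entirely to the first feature, and the interaction term is split evenly between the other two. The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that SHAP builds on.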


InterpretML is an open-source toolkit by Microsoft aimed at improving explainability.

LIME & TreeInterpreters

The PwC report by Oxborough et al. titled ‘Explainable AI: Driving business value through greater understanding’ provides a high-level introduction to the range of techniques available to developers seeking to make their models more explainable. Among the more ‘hands-on’ techniques are LIME, a model-agnostic approach, and TreeInterpreter, an algorithm-specific method.
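
As an illustration of what an algorithm-specific method can look like, here is a minimal sketch of the path-decomposition idea used by tree interpreters: a decision tree's prediction is split into a bias (the value at the root) plus one contribution per split feature along the decision path. The tree, its feature names (`age`, `bmi`) and the function `explain_prediction` are hypothetical, and this is plain Python rather than the API of any particular package.

```python
# Each node stores the mean training target of the samples reaching it
# ("value"). Internal nodes also store a split feature, a threshold and
# two children. All numbers here are made up for illustration.
TREE = {
    "value": 0.50, "feature": "age", "threshold": 50,
    "left": {
        "value": 0.30, "feature": "bmi", "threshold": 30,
        "left": {"value": 0.10},
        "right": {"value": 0.60},
    },
    "right": {"value": 0.80},
}


def explain_prediction(tree, sample):
    """Return (prediction, bias, contributions) for one sample.

    Every step down the tree changes the node value; that change is
    credited to the feature used for the split at that step.
    """
    bias = tree["value"]
    contributions = {}
    node = tree
    while "feature" in node:  # leaves have no split feature
        feature = node["feature"]
        go_left = sample[feature] <= node["threshold"]
        child = node["left"] if go_left else node["right"]
        delta = child["value"] - node["value"]
        contributions[feature] = contributions.get(feature, 0.0) + delta
        node = child
    return node["value"], bias, contributions


prediction, bias, contributions = explain_prediction(TREE, {"age": 42, "bmi": 33})
print(prediction, bias, contributions)
```

In this example the sample lands in the leaf with value 0.60: the bias is 0.50, the `age` split lowers the estimate by roughly 0.2 and the `bmi` split raises it by roughly 0.3. By construction, the bias plus the contributions always equals the prediction. For a random forest, the same decomposition could be averaged over the trees.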


Random Forest Explainer (RFEX 2.0), by D. Petkovic, A. Alavi, D. Cai, J. Yang and S. Barlaskar, offers integrated model and novel sample explainability. RFEX 2.0 is designed in a user-centric way with non-AI experts in mind, and emphasises simplicity and familiarity, e.g. by providing a one-page tabular output and measures familiar to most users. RFEX is demonstrated in a case study from the collaboration of Petkovic et al. with the J. Craig Venter Institute (JCVI).


Alibi is an open-source Python library aimed at ML model inspection and interpretation. It focuses on providing the code needed to produce explanations for black-box algorithms. The goals of the library are to provide high-quality reference implementations of black-box ML model explanation algorithms, define a consistent API for interpretable ML models, support multiple use cases (e.g. tabular, text and image data classification, regression) and implement the latest model explanation, concept drift, algorithmic bias detection and other ML model monitoring and interpretation methods.


DeepLIFT (Deep Learning Important FeaTures) is a method for ‘explaining’ the predictions made by neural networks.
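
The core idea of DeepLIFT is to compare each neuron's activation with its activation on a reference input and to propagate the differences back to the inputs. The sketch below applies one of its propagation rules, the Rescale rule, to a tiny one-hidden-layer ReLU network in plain Python. It is an illustration under stated assumptions, not the kundajelab/deeplift API: the function names, the weights and the all-zeros reference input are all made up.

```python
def relu(z):
    return max(0.0, z)


def pre_activations(inputs, W, b):
    """Hidden-layer pre-activations z[j] = sum_i W[j][i] * inputs[i] + b[j]."""
    return [sum(w * xi for w, xi in zip(row, inputs)) + bj
            for row, bj in zip(W, b)]


def forward(inputs, W, b, v):
    """Network output: sum_j v[j] * relu(z[j])."""
    return sum(vj * relu(zj) for vj, zj in zip(v, pre_activations(inputs, W, b)))


def deeplift_rescale(x, reference, W, b, v):
    """Contribution of each input to (output(x) - output(reference))."""
    z = pre_activations(x, W, b)
    z_ref = pre_activations(reference, W, b)
    contributions = [0.0] * len(x)
    for j, (zj, zj_ref) in enumerate(zip(z, z_ref)):
        dz = zj - zj_ref
        if abs(dz) < 1e-12:
            # No change at this unit: fall back to the ReLU gradient.
            multiplier = 1.0 if zj > 0 else 0.0
        else:
            # Rescale rule: ratio of output difference to input difference.
            multiplier = (relu(zj) - relu(zj_ref)) / dz
        for i in range(len(x)):
            # Chain the linear-layer weights with the ReLU multiplier.
            contributions[i] += (x[i] - reference[i]) * W[j][i] * multiplier * v[j]
    return contributions


# Hypothetical weights for a network with 2 inputs and 2 hidden units.
W = [[1.0, -1.0], [0.5, 2.0]]
b = [0.0, -1.0]
v = [1.0, -2.0]
x = [2.0, 1.0]
reference = [0.0, 0.0]

contributions = deeplift_rescale(x, reference, W, b, v)
delta_output = forward(x, W, b, v) - forward(reference, W, b, v)
print(contributions, delta_output)
```

The contributions sum to the difference between the output for `x` and the output for the reference input. The choice of reference matters: a different reference generally gives different attributions for the same prediction.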

Visualisation of CNN representations

Among the techniques available for understanding neural networks are the visualisation of CNN representations, methods for diagnosing representations of pre-trained CNNs, approaches for disentangling pre-trained CNN representations, learning of CNNs with disentangled representations and middle-to-end learning based on model interpretability. These can be found in their Python implementation here:

Improving the Interpretability of Algorithms

An online toolkit providing a range of resources (e.g. codebooks) for improving the interpretability of an algorithm. Its authors have created a series of Jupyter notebooks using open source tools including Python, H2O, XGBoost, GraphViz, Pandas, and NumPy to outline practical explanatory techniques for machine learning models and results.

