1. Explainable & Interpretable AI
Case-based Reasoning
Reasoning based on similar cases from the past (e.g. from comparable past races it is possible to determine, accurately and in a personalized manner, how skaters should build up their lap times in order to achieve a faster end result).
Computer Metaphor
The human is a symbol processor, like a computer, with limited attention and memory capacities.
● Information-processing approach:
○ The mental process can be compared with computer operations.
○ The mental process can be interpreted as information progressing through a
system in stages.
● Serial processing and symbolic representation.
General Problem Solver
Means-ends analysis compares the current state with the goal state and chooses an
action that brings you closer to the goal.
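A minimal sketch of this idea, assuming a toy numeric state space; the names and operators are illustrative, not Newell & Simon's original GPS implementation:

```python
# Minimal sketch of means-ends analysis (illustrative; not Newell &
# Simon's original GPS): repeatedly pick the action that most reduces
# the difference between the current state and the goal state.

def means_ends_analysis(state, goal, actions, distance, max_steps=100):
    """Greedy difference reduction toward the goal state."""
    plan = []
    for _ in range(max_steps):
        if state == goal:
            return plan
        # Compare the current state with the goal state; choose the
        # action whose result lies closest to the goal.
        name, next_state = min(
            ((name, fn(state)) for name, fn in actions.items()),
            key=lambda pair: distance(pair[1], goal),
        )
        if distance(next_state, goal) >= distance(state, goal):
            return None  # stuck: no action reduces the difference
        plan.append(name)
        state = next_state
    return None

# Toy state space: reach 23 from 0 with two operators.
actions = {"add_10": lambda s: s + 10, "add_1": lambda s: s + 1}
print(means_ends_analysis(0, 23, actions, lambda a, b: abs(a - b)))
# -> ['add_10', 'add_10', 'add_1', 'add_1', 'add_1']
```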
Connectionist Approach
Rumelhart and McClelland (1986)
Connectionism / neural networks → parallel processing & distributed representation →
inspired current AI & deep learning techniques (which bring new challenges for explanation).
Why Do We Need Explainability?
Important for trust and actual use/deployment of AI / ML.
● Model validation: avoid bias, unfairness or overfitting, etc.
● Model debugging & improvement: improve model fit, adversarial learning, reliability &
robustness.
● Knowledge discovery: explanations provide feedback to data scientists.
● Trust & technology acceptance: explanations might convince users to adopt the
technology and give them more control.
What Is A Good Explanation?
Confalonieri et al. (2020) & Molnar (2020)
● Contrastive: why this, and not that? (counterfactual).
● Selective: focus on a few important causes.
● Social: should fit the mental model of the explainee / target audience and consider
social context + prior belief.
● Abnormalness: humans prefer rare, abnormal events as causes.
● Truthfulness: matters less to humans than selectiveness.
Important Properties of ML Explanations
● Accuracy: does the explanation predict unseen data? Is it as accurate as the model?
● Fidelity: does the explanation approximate the prediction of the model (important for
black-box)?
● Consistency: same explanations for different models?
● Stability: similar explanations for similar instances?
● Comprehensibility: do humans get it?
Types Of Explanations
Confalonieri et al. (2020)
● Symbolic reasoning systems: based on a knowledge base and productions
rules/logical inferences (inherently interpretable / explainable).
● Sub-symbolic reasoning: representations are distributed and explanations are
approximate models; focusing on causability / counterfactuals can help the user.
● Hybrid / neural-symbolic systems: use the symbolic system to explain models coming
from the sub-symbolic system.
▶ Explanations as lines of reasoning: domain knowledge as production rules.
● Q&A module: explanations on the knowledge base.
● Reasoning status checker: evaluates the sequence of rules used.
▶ Explanations as a problem-solving activity: explanations need different levels of
abstraction and should focus on explaining the problem solving of the task.
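As a toy illustration of explanations as lines of reasoning, the sketch below (all rules and facts hypothetical) runs a forward-chaining production system and keeps the trace of fired rules as the explanation:

```python
# Toy forward-chaining rule engine: the trace of fired production rules
# *is* the explanation (all rules and facts illustrative).
rules = [
    ("R1", {"fever", "cough"}, "flu_suspected"),
    ("R2", {"flu_suspected", "short_of_breath"}, "refer_to_doctor"),
]

def infer(facts):
    facts, trace = set(facts), []
    changed = True
    while changed:
        changed = False
        for name, conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append(f"{name}: {sorted(conditions)} -> {conclusion}")
                changed = True
    return facts, trace

facts, trace = infer({"fever", "cough", "short_of_breath"})
print("\n".join(trace))
# R1: ['cough', 'fever'] -> flu_suspected
# R2: ['flu_suspected', 'short_of_breath'] -> refer_to_doctor
```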
Machine Learning / AI Interpretability
● Methods:
○ Glass-box models (inherently interpretable): regression, decision trees, GAMs (generalized additive models).
○ Black-box models: neural networks, random forest → requires post-hoc
explanations.
○ Model-specific methods: explanation specific to ML techniques.
○ Model-agnostic methods: treat the ML model as a black box, using only its
inputs and outputs.
● Classifications:
Molnar et al. (2020)
○ Analyzing the components of the model (model-specific).
○ Explaining individual predictions (local explanation / counterfactuals).
○ Explaining global model behavior.
○ Surrogate models trained on the inputs / outputs (model-agnostic).
Confalonieri et al. (2020)
○ Global methods.
○ Local methods.
○ Introspective methods.
Analyzing Components of Interpretable Models
● Linear Regression: weighted sum of features.
● Decision trees: interpret the learned structure.
▶ This does not work for high-dimensional data; remedies include pruning decision trees or
shrinking coefficients in regression (LASSO).
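A brief sketch of both ideas, assuming scikit-learn is available; the dataset choice is illustrative. The printed tree rules and the (LASSO-shrunk) regression weights are the explanations themselves:

```python
# Glass-box models explained by their own components (scikit-learn
# assumed; the diabetes dataset is just an example).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.tree import DecisionTreeRegressor, export_text

X, y = load_diabetes(return_X_y=True, as_frame=True)

# Linear regression as a weighted sum: each (LASSO-shrunk) coefficient
# is a direct, global explanation; uninformative weights go to zero.
lasso = Lasso(alpha=1.0).fit(X, y)
for name, weight in zip(X.columns, lasso.coef_):
    print(f"{name:>6}: {weight:+7.2f}")

# Decision tree: the learned structure is readable as if-then rules;
# limiting the depth plays the role of pruning.
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```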
Analyzing Components of Complex Models
Molnar et al. (2020)
● Feature maps visualizing layers in CNN models.
● Analyze the structure of random forests (Gini importance; see the sketch below).
● Add interpretability constraints to the model.
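For instance, Gini importance can be read off a fitted random forest; a sketch assuming scikit-learn, with an illustrative dataset:

```python
# Sketch: reading Gini importance (mean impurity decrease) off a fitted
# random forest; scikit-learn assumed, dataset illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(data.data, data.target)

# feature_importances_ holds the normalized Gini importance per feature.
ranked = sorted(zip(data.feature_names, forest.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```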
Global Explanations
How does the model perform on average for the dataset?
● Generate symbolic representations: fit interpretable model on input / output relations
of the trained model.
● Feature importance ranks: permute / remove features and observe changes in
model output (see the sketch after this list).
● Feature effect: effect of a specific feature on the model outcome.
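A sketch of permutation feature importance, assuming scikit-learn; the model and dataset are placeholders for any fitted black box:

```python
# Global, model-agnostic explanation via permutation importance:
# shuffle one feature at a time and measure the drop in test score
# (scikit-learn assumed; model and data illustrative).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Each feature is permuted n_repeats times; a large score drop means
# the model relies on that feature.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[i]}: {result.importances_mean[i]:.3f}")
```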
Local Interpretable Model-agnostic Explanations (LIME)
An algorithm that can explain the predictions of any classifier or regressor by approximating
it locally with an interpretable model (sketched after the list below).
● Interpretable: provide a qualitative understanding of the relation between input and response.
● Local fidelity: explanations must be at least locally faithful.
● Model-agnostic: explainer should be able to explain any model.
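A minimal sketch using the `lime` package (assumed installed) on a tabular classifier; the model and dataset are illustrative:

```python
# Sketch with the `lime` package (pip install lime): perturb one
# instance, query the black box, and fit a local linear surrogate whose
# weights form the explanation. Model and data are illustrative.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Locally faithful explanation of a single prediction.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5)
print(explanation.as_list())
```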
Shapley Values
The average marginal contribution of a feature value across all possible coalitions of features.
Contrastive explanations (better than LIME in this respect), but high computational time.
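The definition can be computed exactly by brute force for a handful of features, which also shows why the computational time is high (the number of coalitions grows exponentially); the payoff function below is a toy assumption:

```python
# Brute-force Shapley value of one feature: average its marginal
# contribution over all coalitions of the remaining features. Exact but
# exponential, which is why practical libraries approximate it.
from itertools import combinations
from math import factorial

def shapley_value(value, features, target):
    """value(coalition) -> payoff; returns the Shapley value of `target`."""
    others = [f for f in features if f != target]
    n = len(features)
    total = 0.0
    for size in range(len(others) + 1):
        for coalition in combinations(others, size):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            # Marginal contribution of `target` to this coalition.
            total += weight * (value(set(coalition) | {target})
                               - value(set(coalition)))
    return total

# Toy payoff: 10 per feature, plus a 5-point synergy when "a" and "b"
# cooperate. The synergy is split fairly between them.
def payoff(coalition):
    return 10 * len(coalition) + (5 if {"a", "b"} <= coalition else 0)

for f in ["a", "b", "c"]:
    print(f, shapley_value(payoff, ["a", "b", "c"], f))
# a -> 12.5, b -> 12.5, c -> 10.0  (sums to the grand-coalition payoff)
```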
Counterfactuals
How does the output change when the input changes? Counterfactuals need to be actionable
(unchangeable features make bad counterfactuals: gender, race, etc.).
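A naive sketch of a counterfactual search under these constraints: perturb only the actionable features and keep the closest perturbation that flips the prediction. The function name and search strategy are illustrative; dedicated tools search far more carefully:

```python
# Naive counterfactual search (illustrative): randomly perturb only the
# actionable feature indices and keep the closest change that flips the
# model's prediction.
import numpy as np

def find_counterfactual(model, x, actionable, scale=0.5,
                        n_trials=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    original = model.predict([x])[0]
    best, best_dist = None, np.inf
    for _ in range(n_trials):
        candidate = x.copy()
        # Immutable features (e.g. gender, race) are never touched, so
        # the counterfactual stays actionable.
        candidate[actionable] += rng.normal(0.0, scale, size=len(actionable))
        if model.predict([candidate])[0] != original:
            dist = np.linalg.norm(candidate - x)
            if dist < best_dist:
                best, best_dist = candidate, dist
    return best  # None if no prediction flip was found
```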
Recommender Systems
Often black-box, but can be explained abstractly in terms of the underlying algorithm (a toy
sketch follows this list).
● Item-based: we recommend you … because you also like …
● User-based: people like you also like …
● Feature-based: you like this because you like aspects X and Y.
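A toy item-based sketch: recommend the unseen item most similar to the user's liked items, and name the most similar liked item as the explanation. All ratings and titles are made up:

```python
# Toy item-based recommendation with an explanation. Similar items are
# found via cosine similarity on the rating columns; the most similar
# liked item doubles as the explanation.
import numpy as np

items = ["Alien", "Aliens", "Blade Runner", "Notting Hill"]
ratings = np.array([  # rows: users, columns: items (0 = unseen)
    [5, 0, 4, 1],
    [4, 5, 4, 0],
    [5, 4, 0, 1],
    [0, 1, 0, 5],
], dtype=float)

# Item-item cosine similarity.
norm = ratings / (np.linalg.norm(ratings, axis=0, keepdims=True) + 1e-9)
sim = norm.T @ norm

user = 0
liked = [i for i in range(len(items)) if ratings[user, i] >= 4]
unseen = [i for i in range(len(items)) if ratings[user, i] == 0]

# Recommend the unseen item closest to something the user liked.
best = max(unseen, key=lambda i: sim[i, liked].max())
because = max(liked, key=lambda j: sim[best, j])
print(f"We recommend '{items[best]}' because you also like "
      f"'{items[because]}'.")
```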
Challenges
Molnar et al. (2020) & Confalonieri et al. (2020)
● Statistical uncertainty: most often not represented in the explanations, so how much
can we trust them?
● Causality: causal interpretation is usually the goal, but models typically capture
correlations, and so do the explanations.