Type
Master's thesis / Bachelor's thesis / supervised research
Prerequisites
- Knowledge of deep learning with image, natural language, or graph data
- Proficiency in Python and a deep learning framework (PyTorch or TensorFlow)
Description
Over the last decade, deep learning methods have been deployed in numerous real-world, often safety-critical, applications. A major and growing concern, however, is the explainability of neural network decisions. A neural network operates as a black box: a priori, one can only observe the input and the output of a decision, not the reasoning that leads to it. The field of explainable AI (XAI) develops explanation methods that “open the black box” and shed light on the reasoning behind neural network decisions.
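To make this concrete, the snippet below is a minimal sketch of one of the simplest post-hoc explanation methods: a vanilla-gradient saliency map for an image classifier. The choice of a torchvision ResNet-18 and the random placeholder input are assumptions made only so the example is self-contained; they are not part of the project description.

```python
import torch
from torchvision import models

# Pretrained classifier (placeholder choice; any image classifier would do).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

# `image` stands in for a preprocessed input of shape (1, 3, 224, 224);
# a random tensor is used here only so the snippet runs on its own.
image = torch.rand(1, 3, 224, 224, requires_grad=True)

# Forward pass and pick the predicted class.
logits = model(image)
predicted_class = logits.argmax(dim=1).item()

# Backpropagate the predicted-class score to the input pixels.
logits[0, predicted_class].backward()

# The saliency map is the maximum absolute gradient over colour channels:
# pixels with large values influenced the prediction most strongly.
saliency = image.grad.abs().max(dim=1).values  # shape (1, 224, 224)
```

Methods such as Grad-CAM or SHAP (see the references below) refine this basic idea with class-discriminative localization or game-theoretic attributions.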
References
- Comprehensive book covering many XAI methods: Interpretable Machine Learning, https://christophm.github.io/interpretable-ml-book/
- Classic work on using Shapley values for XAI: A Unified Approach to Interpreting Model Predictions, https://arxiv.org/abs/1705.07874
- One of the first works on concept-based interpretability: Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors, https://arxiv.org/abs/1711.11279
- A popular post-hoc method for image classifiers: Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization, https://arxiv.org/abs/1610.02391
- Popular tree- and prototype-based method for explainability: Neural Prototype Trees for Interpretable Fine-Grained Image Recognition, https://openaccess.thecvf.com/content/CVPR2021/papers/Nauta_Neural_Prototype_Trees_for_Interpretable_Fine-Grained_Image_Recognition_CVPR_2021_paper.pdf
- Popular concept-based method that is interpretable by design (see the sketch after this list): Concept Bottleneck Models, https://arxiv.org/abs/2007.04612
- Another interpretable-by-design method that is relevant to current XAI research at the chair: Variational Information Pursuit for Interpretable Predictions, https://arxiv.org/abs/2302.02876
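As a concrete illustration of interpretability by design, the following is a minimal sketch of a Concept Bottleneck Model in the spirit of Koh et al. (2020): the network first predicts human-interpretable concepts and then makes its final prediction from those concepts alone. The layer sizes, number of concepts, and loss weighting are illustrative assumptions, not values taken from the references above.

```python
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, num_features: int, num_concepts: int, num_classes: int):
        super().__init__()
        # g: input features -> concept predictions (e.g. "has wing bars").
        self.concept_predictor = nn.Sequential(
            nn.Linear(num_features, 128),
            nn.ReLU(),
            nn.Linear(128, num_concepts),
        )
        # f: concepts -> class label; the label depends on the input only
        # through the concept bottleneck, which makes the decision inspectable.
        self.label_predictor = nn.Linear(num_concepts, num_classes)

    def forward(self, x):
        concept_logits = self.concept_predictor(x)
        label_logits = self.label_predictor(torch.sigmoid(concept_logits))
        return concept_logits, label_logits

# Joint training optimises a concept loss plus the usual label loss.
model = ConceptBottleneckModel(num_features=512, num_concepts=112, num_classes=200)
x = torch.rand(4, 512)  # e.g. features from an image backbone (illustrative)
concept_logits, label_logits = model(x)
concept_loss = nn.BCEWithLogitsLoss()(concept_logits, torch.rand(4, 112).round())
label_loss = nn.CrossEntropyLoss()(label_logits, torch.randint(0, 200, (4,)))
total_loss = label_loss + 0.5 * concept_loss  # 0.5 is an illustrative weight
```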