This project evaluates multiple Explainable AI (XAI) techniques to understand how convolutional neural networks (CNNs) make classification decisions.
Using a pretrained VGG16 model on the MexCulture142/validation dataset, the framework compares explainability methods that produce saliency maps highlighting which image regions most influence the model's prediction.
The system provides tools to:
- Generate saliency maps using multiple XAI methods.
- Compare those maps with human gaze fixation data.
- Evaluate both faithfulness (model-based) and human-alignment (ground-truth-based) metrics.
- Visualize and analyze method performance across architectural categories.
The following XAI methods are compared:

| Method | Description |
|---|---|
| FEM | Gradient-free, feature-based explanation highlighting statistically rare and strong activations. |
| Grad-CAM | Gradient-weighted, class-specific visualization showing which regions influence a model’s decision. |
| LIME | Perturbation-based local surrogate model estimating regional feature importance. |
| RISE | Randomized input sampling approach estimating pixel importance probabilistically. |
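Of these, RISE is simple enough to sketch end to end: random binary masks are applied to the input, and each mask is weighted by the model's score on the masked image. The snippet below illustrates the core idea with a toy scoring function standing in for VGG16; all names here are illustrative sketches, not the project's `RISE.py` API.

```python
# Illustrative sketch of the RISE idea (not the project's RISE.py API):
# average random binary masks, weighting each by the model's score on
# the correspondingly masked input.
import numpy as np

def rise_saliency(score_fn, image, n_masks=1000, p_keep=0.5, seed=0):
    rng = np.random.default_rng(seed)
    h, w = image.shape
    saliency = np.zeros((h, w))
    total = 0.0
    for _ in range(n_masks):
        mask = (rng.random((h, w)) < p_keep).astype(float)
        score = score_fn(image * mask)  # model confidence on the masked input
        saliency += score * mask
        total += score
    return saliency / max(total, 1e-8)

# Toy "model": confidence is the mean intensity of the top-left quadrant.
def toy_score(img):
    return img[:4, :4].mean()

sal = rise_saliency(toy_score, np.ones((8, 8)))
# Pixels the toy model actually depends on should receive higher saliency.
print(sal[:4, :4].mean() > sal[4:, 4:].mean())
```

Because the method only needs model scores, not gradients, it works on any black-box classifier; the trade-off, noted below, is coarser maps unless many masks are sampled.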
The framework computes two faithfulness metrics and two human-alignment metrics:

- AUC Deletion → Measures the drop in model confidence as the most salient pixels are removed (lower = better faithfulness).
- AUC Insertion → Measures the rise in model confidence as the most salient pixels are added (higher = better faithfulness).
- PCC (Pearson Correlation Coefficient) → Linear correlation between a generated saliency map and the gaze-based map.
- SIM (Similarity) → Measures spatial overlap between generated and gaze maps.
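The metrics above can be sketched in plain NumPy. The real implementations live in `XAI_Metrics.py`; the function names and the toy inputs below are illustrative assumptions, not the project's actual API.

```python
# Hedged sketches of the evaluation metrics in plain NumPy; the real
# implementations live in XAI_Metrics.py and these names are illustrative.
import numpy as np

def pcc(sal_map, gaze_map):
    """Pearson correlation between flattened saliency and gaze maps."""
    return np.corrcoef(sal_map.ravel(), gaze_map.ravel())[0, 1]

def sim(sal_map, gaze_map):
    """Spatial overlap: sum of element-wise minima of the two maps,
    each normalized to sum to 1."""
    s = sal_map / sal_map.sum()
    g = gaze_map / gaze_map.sum()
    return np.minimum(s, g).sum()

def auc_deletion(score_fn, image, sal_map, n_steps=16):
    """Zero out pixels in decreasing-saliency order, tracking the model's
    score; the area under that curve is the deletion AUC (lower = better)."""
    order = np.argsort(sal_map.ravel())[::-1]
    img = image.astype(float).ravel()
    scores = [score_fn(img.reshape(image.shape))]
    step = max(1, len(order) // n_steps)
    for i in range(0, len(order), step):
        img[order[i:i + step]] = 0.0
        scores.append(score_fn(img.reshape(image.shape)))
    s = np.array(scores)
    return float(np.sum((s[:-1] + s[1:]) / 2) / (len(s) - 1))  # trapezoid rule

gaze = np.random.default_rng(0).random((8, 8))
print(round(pcc(gaze, gaze), 6))  # 1.0 -- identical maps correlate perfectly
print(round(sim(gaze, gaze), 6))  # 1.0 -- identical maps overlap completely
```

AUC Insertion is the mirror image: start from a blank image, add pixels in the same order, and score the resulting (increasing) confidence curve.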
Key findings:

- FEM achieves the best PCC and SIM scores, aligning most closely with human attention.
- LIME and RISE perform best in AUC Insertion, reflecting strong model fidelity.
- Each method balances interpretability and accuracy differently.
Strengths and limitations of each method:

| Method | Strengths | Limitations |
|---|---|---|
| FEM | Excellent localization and faithfulness; ideal for complex textures. | Slightly computationally intensive. |
| Grad-CAM | Clear, intuitive activations. | Can miss fine details and over-smooth results. |
| LIME | Strong for structured geometry. | Struggles with texture-rich or irregular scenes. |
| RISE | Consistent across samples, robust black-box method. | Coarser visual precision due to random masking. |
Execute the evaluation pipeline to generate saliency maps and raw XAI results:

```bash
python Evaluation.py
```

This will save the output data to `xai_results.pkl`.
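The layout of `xai_results.pkl` is defined by `Evaluation.py`; assuming it is a pickled dict of per-method metrics (an assumption, not confirmed here), it can be inspected directly:

```python
# Inspecting the saved results; a dict of per-method metrics is only an
# assumption here -- the real layout is whatever Evaluation.py pickles.
import pickle

results = {"FEM": {"PCC": 0.0, "SIM": 0.0}}  # placeholder stand-in data
with open("xai_results.pkl", "wb") as f:
    pickle.dump(results, f)

with open("xai_results.pkl", "rb") as f:
    loaded = pickle.load(f)

for method, metrics in sorted(loaded.items()):
    print(method, metrics)
```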
Process and visualize the results:

```bash
python Analyze_xai_results.py
```

This script produces:

- Quantitative summaries and evaluation metrics
- Visualization plots under `plots/`
Project structure:

```
XAI_Evaluation
├── Evaluation.py           # Runs evaluation and saves metrics
├── Analyze_xai_results.py  # Summarizes and visualizes results
├── XAI_Metrics.py          # Implements AUC Deletion, AUC Insertion, PCC, SIM
├── FEM.py                  # Feature-based, gradient-free explanation
├── GradCAM.py              # Gradient-weighted activation mapping
├── LIME.py                 # Local surrogate model explanations
├── RISE.py                 # Randomized input sampling explanation
└── plots/                  # Generated visualizations and curves
```
