This project is a web-based tool designed to analyze and highlight potential biases in a pre-trained image classification model (ResNet-18). The goal is not just to classify images, but to use Explainable AI (XAI) techniques to understand why the model makes certain decisions, which is the first step in identifying and mitigating unfair biases.
- Backend: Flask
- Machine Learning: PyTorch (using a pre-trained ResNet-18)
- Explainable AI: LIME (Local Interpretable Model-agnostic Explanations)
- Frontend: Basic HTML
A user can upload an image through the web interface. The Flask backend processes the image, gets a prediction from the ResNet model, and then uses LIME to generate a visual heatmap highlighting the pixels most influential to the model's decision.
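At the core of this flow is a classifier function that LIME can call on batches of perturbed images. The sketch below shows how the prediction and explanation steps fit together, assuming a recent torchvision and standard ImageNet preprocessing; the helper names (`batch_predict`, `classify_and_explain`) are illustrative, not the actual names used in `app.py`.

```python
# Minimal sketch of the classify-and-explain pipeline. Assumes a recent
# torchvision (for the weights API); helper names are illustrative.
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image
from lime import lime_image
from skimage.segmentation import mark_boundaries
from torchvision import models, transforms

weights = models.ResNet18_Weights.IMAGENET1K_V1
model = models.resnet18(weights=weights)
model.eval()
categories = weights.meta["categories"]  # human-readable ImageNet labels

# Standard ImageNet normalization, applied to every LIME perturbation.
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def batch_predict(images):
    """LIME classifier_fn: HxWx3 uint8 images -> class probabilities."""
    batch = torch.stack([preprocess(Image.fromarray(img)) for img in images])
    with torch.no_grad():
        probs = F.softmax(model(batch), dim=1)
    return probs.numpy()

def classify_and_explain(image_path, num_samples=1000):
    # Resize once so the original image and all perturbations match the model input.
    img = np.array(Image.open(image_path).convert("RGB").resize((224, 224)))
    explainer = lime_image.LimeImageExplainer()
    explanation = explainer.explain_instance(
        img, batch_predict, top_labels=1, hide_color=0, num_samples=num_samples)
    label = explanation.top_labels[0]
    temp, mask = explanation.get_image_and_mask(
        label, positive_only=True, num_features=5, hide_rest=False)
    # Overlay marks the superpixels that most supported the predicted class.
    overlay = mark_boundaries(temp / 255.0, mask)
    return categories[label], overlay
```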
This tool can be used to investigate specific hypotheses about model bias. For this project, we define the following scenario:
- Hypothesis: The ResNet-18 model, trained on the broad ImageNet dataset, may have learned societal biases associating certain professions with specific genders (e.g., "doctor" with male, "nurse" with female).
- Methodology: To test this, one would gather a controlled dataset of images, including pictures of female doctors and male nurses, and run each image through this tool (a batch-run sketch follows this list). The analysis would focus on two outputs:
  - Classification Accuracy: Does the model misclassify these "atypical" images at a higher rate than their stereotypical counterparts (e.g., labeling a female doctor a "nurse")?
  - LIME Explanation: When the model does classify correctly, which features is it looking at? For an image of a female doctor, does the LIME explanation highlight the stethoscope (a professional tool), or does it highlight gendered features such as hair or jewelry? A focus on irrelevant, gendered features would be evidence of bias.
- Expected Outcome for a Biased Model: A biased model would likely show lower accuracy for gender-atypical professions and produce LIME explanations that focus on gendered features rather than professional indicators. This tool provides the first step in uncovering such behavior.
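One way to run this methodology in batch is to loop over a labeled folder of images and record each group's top-1 prediction and confidence. The folder layout, group names, and CSV format below are assumptions for illustration; `batch_predict` is the helper from the earlier sketch.

```python
# Hypothetical batch run over a controlled dataset. The folder layout
#   dataset/female_doctor/*.jpg, dataset/male_doctor/*.jpg, dataset/male_nurse/*.jpg, ...
# and the CSV format are assumptions; batch_predict comes from the sketch above.
import csv
from pathlib import Path

import numpy as np
from PIL import Image

def evaluate_dataset(dataset_dir="dataset", out_csv="results.csv"):
    rows = []
    for image_path in sorted(Path(dataset_dir).glob("*/*.jpg")):
        group = image_path.parent.name  # e.g. "female_doctor"
        img = np.array(Image.open(image_path).convert("RGB").resize((224, 224)))
        probs = batch_predict([img])[0]
        top = int(probs.argmax())
        rows.append([group, image_path.name, top, round(float(probs[top]), 4)])
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["group", "image", "top1_class_index", "confidence"])
        writer.writerows(rows)
```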
- Clone the repository:
  git clone [Your-GitHub-Repo-URL]
- Create and activate a virtual environment:
  python3 -m venv venv
  source venv/bin/activate (on macOS/Linux) or venv\Scripts\activate (on Windows)
- Install the required dependencies:
  pip install -r requirements.txt
- Run the Flask application:
  export FLASK_APP=app.py (on macOS/Linux) or $env:FLASK_APP = "app.py" (on Windows)
  flask run
- Open your browser and go to http://127.0.0.1:5000.
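The dependency list itself is not reproduced here; for this stack, requirements.txt would likely contain something along these lines (package names only, unpinned; the exact list and versions depend on the project):

```text
flask
torch
torchvision
lime
scikit-image
numpy
Pillow
```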
This tool serves as a strong foundation for a more comprehensive responsible AI toolkit. Future enhancements could include:
- Implementing a Quantitative Bias Score: Calculate and display a simple metric to quantify bias, such as the difference in the model's confidence score between stereotypical and atypical image pairs (a minimal sketch follows this list).
- Adding More XAI Techniques: Integrate other explainability libraries like SHAP or Grad-CAM to provide a more holistic view of the model's behavior.
- Model Selection: Allow users to choose from various pre-trained models (e.g., VGG, MobileNet) to compare and contrast potential biases across different architectures.
- Expanding to Other Modalities: Generalize the tool's framework to analyze biases in other data types, such as text (using LIME for text on NLP models) or tabular data, turning it into a more universal toolkit.
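As a sketch of the confidence-gap score from the first bullet, one could compare mean top-1 confidence between a stereotypical group and its atypical counterpart using the CSV produced by the hypothetical batch run above; the group names and file name are assumptions.

```python
# Sketch of the confidence-gap score: mean top-1 confidence on stereotypical
# images minus the mean on their atypical counterparts. Group names and
# results.csv come from the (hypothetical) batch-run sketch above.
import csv
from statistics import mean

def confidence_gap(results_csv, stereotypical_group, atypical_group):
    by_group = {}
    with open(results_csv, newline="") as f:
        for row in csv.DictReader(f):
            by_group.setdefault(row["group"], []).append(float(row["confidence"]))
    return mean(by_group[stereotypical_group]) - mean(by_group[atypical_group])

# Example: a consistently large positive gap here would support the hypothesis.
# gap = confidence_gap("results.csv", "male_doctor", "female_doctor")
```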