Neural Network Experiments – Artificial Intelligence Fundamentals

A from-scratch implementation of feedforward neural networks in pure NumPy, built as part of the Artificial Intelligence Fundamentals course at the Università degli Studi di Parma.

The project systematically sweeps network architectures, activation functions, and dropout regularisation on the MNIST handwritten-digit dataset, logging all results to CSV and generating visualisation plots.
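
As a rough sketch of what such a sweep iterates over (the real orchestration lives in src/experiments.py; the loop below is illustrative only and its variable names are hypothetical):

import itertools

# Hypothetical grid: the actual values come from src/config.py and the
# actual training/logging is handled by src/experiments.py.
architectures = [[64], [128, 64]]
activations   = ["relu", "sigmoid", "tanh"]
dropout_rates = [0.2, 0.5]

for arch, act, rate in itertools.product(architectures, activations, dropout_rates):
    print(f"run: hidden={arch}, activation={act}, dropout={rate}")
    # each combination would be trained and its metrics appended to the CSV log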


Features

  • Pure NumPy implementation – no autograd framework, every gradient is hand-derived and backpropagated manually.
  • Mini-batch SGD with configurable batch size and learning rate.
  • Inverted dropout regularisation.
  • Numerical gradient check (finite differences) runs automatically before every sweep to verify backpropagation correctness; a sketch of the idea appears after this list.
  • Structured logging – INFO to console, DEBUG to a rotating log file.
  • CSV experiment log – all hyperparameters and per-epoch losses persisted for reproducibility.
  • Automatic result analysis – four publication-ready plots generated by analyzer.py.
  • Cross-platform: PowerShell (Windows), Make (Unix/macOS), Docker.
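
For illustration, this is the finite-difference idea behind that check (not the project's actual gradient_check() in src/training.py, whose signature is not shown here): a central-difference estimate is compared against the backpropagated gradient.

import numpy as np

def finite_difference_check(loss_fn, W, analytic_grad, eps=1e-5):
    """Compare an analytic gradient against a central-difference estimate.

    loss_fn       : callable taking the weight array and returning a scalar loss
    W             : weight array to perturb (modified in place, then restored)
    analytic_grad : gradient produced by backpropagation, same shape as W
    """
    numeric_grad = np.zeros_like(W)
    it = np.nditer(W, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        orig = W[idx]
        W[idx] = orig + eps
        loss_plus = loss_fn(W)
        W[idx] = orig - eps
        loss_minus = loss_fn(W)
        W[idx] = orig                       # restore the weight
        numeric_grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    # relative error between analytic and numeric gradients
    denom = np.linalg.norm(analytic_grad) + np.linalg.norm(numeric_grad) + 1e-12
    return np.linalg.norm(analytic_grad - numeric_grad) / denom

A small relative error (typically below 1e-6 for float64) indicates the hand-derived backward pass matches the numerical estimate.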

Project Structure

graph TD
    A[main.py] --> B[src/experiments.py]
    B --> C[src/config.py]
    B --> D[src/data.py]
    B --> E[src/training.py]
    B --> F[src/network.py]
    F --> G[src/layers.py]
    E --> F
    H[analyzer.py] --> C
Backpropagation analisys/
├── src/
│   ├── __init__.py        # Public re-exports
│   ├── config.py          # ExperimentConfig dataclass – all hyperparameters
│   ├── layers.py          # DenseLayer, DropoutLayer
│   ├── network.py         # NeuralNetwork container
│   ├── data.py            # MNIST loading, normalisation, noise injection
│   ├── training.py        # train(), evaluate(), gradient_check()
│   └── experiments.py     # Sweep orchestration, logging setup, CSV writing
├── main.py                # Entry point
├── analyzer.py            # Results visualisation
├── requirements.txt
├── Makefile               # Unix/macOS convenience targets
├── Dockerfile
├── run.ps1                # Windows PowerShell launcher
├── .gitignore
├── LICENSE
└── CONTRIBUTING.md

Getting Started

Prerequisites

  • Python 3.11+
  • pip
  • (Optional) make, Docker, or PowerShell 7+

Installation

Windows (PowerShell)

.\run.ps1 setup

Unix / macOS (Make)

make setup

Manual (any platform)

python -m venv .venv
source .venv/bin/activate        # Windows: .venv\Scripts\Activate.ps1
pip install -r requirements.txt

Docker

docker build -t fondamenti-ia .

Usage

Run experiments

Windows

.\run.ps1 start

Unix / macOS

make run

Docker

docker run --rm \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/results:/app/results \
  fondamenti-ia

Results are written to results/run_<timestamp>/:

  • experiment_log.csv – full metrics for every run.
  • experiment.log – detailed debug log.

Analyze results

Windows

.\run.ps1 analyze

Unix / macOS

make analyze

Or point the analyzer at a specific CSV:

python analyzer.py results/run_20250623_142953/experiment_log.csv

Generated plots:

File                                  Contents
grouped_accuracy_by_activation.png    Test accuracy by activation function
grouped_accuracy_by_arch.png          Test accuracy by architecture
loss_curves.png                       Per-run training loss over epochs
test_accuracy_per_run.png             Bar chart: test accuracy per run

Configuration

All hyperparameters live in src/config.py as fields of the ExperimentConfig dataclass. Edit the defaults there before running.

Parameter        Default                  Description
train_limit      5000                     Training samples (max 60 000)
test_limit       10000                    Test samples
val_fraction     0.2                      Validation fraction of training set
noise_rate       0.2                      Fraction of training labels corrupted
epochs           30                       Training epochs per run
learning_rate    0.05                     SGD learning rate
batch_size       64                       Mini-batch size
architectures    [[64],[128,64],…]        Hidden-layer sizes to sweep
activations      [relu, sigmoid, tanh]    Activations to sweep
dropout_rates    [0.2, 0.5]               Dropout rates to test
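
A minimal sketch of what the dataclass could look like, built only from the field names and defaults in the table above (the definition in src/config.py is authoritative, and the table elides additional architectures):

from dataclasses import dataclass, field

@dataclass
class ExperimentConfig:
    # Sketch only: names and defaults taken from the table above;
    # see src/config.py for the real definition.
    train_limit: int = 5000
    test_limit: int = 10000
    val_fraction: float = 0.2
    noise_rate: float = 0.2
    epochs: int = 30
    learning_rate: float = 0.05
    batch_size: int = 64
    architectures: list = field(default_factory=lambda: [[64], [128, 64]])  # table lists more
    activations: list = field(default_factory=lambda: ["relu", "sigmoid", "tanh"])
    dropout_rates: list = field(default_factory=lambda: [0.2, 0.5])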

Architecture Overview

flowchart LR
    Input["Input\n784 features"] --> H1["DenseLayer\n(He init)"]
    H1 --> D1["DropoutLayer\n(optional)"]
    D1 --> Hn["… hidden layers …"]
    Hn --> Out["DenseLayer\nSoftmax"]
    Out --> Loss["Cross-Entropy\nLoss"]
    Loss --> Back["Backprop\n∂L/∂W stored"]
    Back --> Upd["Weight update\nSGD"]
  • He initialisation for all hidden layers.
  • Softmax + cross-entropy with numerically stable combined gradient.
  • Inverted dropout: activations scaled by 1/(1-rate) at train time, no correction needed at inference.
  • Gradient / weight update decoupled: backward() stores gradients, update_weights() applies them – enabling gradient checking without corrupting weights.
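
A compact, illustrative sketch of the two numerical points above (inverted dropout scaling at train time and the combined softmax + cross-entropy gradient); the project's DenseLayer and DropoutLayer in src/layers.py are the reference, and the names below are hypothetical:

import numpy as np

def dropout_forward(a, rate, training=True):
    # Inverted dropout: scale surviving activations by 1/(1-rate) during
    # training so no correction is needed at inference time.
    if not training or rate == 0.0:
        return a, None
    mask = (np.random.rand(*a.shape) >= rate) / (1.0 - rate)
    return a * mask, mask            # mask is reused in the backward pass

def softmax_cross_entropy(logits, y_onehot):
    # Numerically stable softmax: subtract the row-wise max before exp.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    n = logits.shape[0]
    labels = y_onehot.argmax(axis=1)
    loss = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    # Combined gradient of softmax + cross-entropy w.r.t. the logits:
    grad_logits = (probs - y_onehot) / n
    return loss, grad_logits

Because the softmax and cross-entropy derivatives cancel, the gradient at the logits reduces to probs - y, which is both cheap to compute and numerically stable.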

License

MIT © 2025 Claudio Bendini
