This repository demonstrates end-to-end preprocessing, modeling, and evaluation workflows across four data modalities: text, images, audio, and tabular records. Each modality folder contains:
- Raw or auto-downloaded data assets
- Training scripts that implement preprocessing plus either simple from-scratch models or lightweight deep-learning architectures
- Persisted metrics and 50-epoch train/validation loss curves (SVG)
- A modality-specific README explaining design decisions, exploratory analysis, and outcomes
| Folder | Description |
|---|---|
| `text/` | BBC-style news snippets with bag-of-words, TF-IDF, and a shallow neural classifier. |
| `image/` | MNIST digit recognition using a handcrafted CNN and torchvision data utilities. |
| `audio/` | YESNO spoken-command classification with MFCC features and a 2D CNN. |
| `Tabular/` | Kaggle Titanic passenger-survival modeling via scratch logistic regression. |
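The text pipeline's actual vectorizer lives in `text/src/train_text_models.py`; as an illustration of the from-scratch spirit described above, a minimal TF-IDF computation might look like the sketch below (the `tfidf` helper and the toy documents are hypothetical, not taken from the repo):

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute sparse TF-IDF vectors for a list of tokenized documents."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Term frequency scaled by inverse document frequency.
        vectors.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors

docs = [["cheap", "flights", "deal"], ["match", "score", "goal"], ["cheap", "goal"]]
vecs = tfidf(docs)
```

Terms that appear in fewer documents receive higher weights, which is why `"flights"` outscores the more common `"cheap"` in the first document.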
- Consistent Splits: Every training script enforces explicit train/validation/test partitions to standardize evaluation.
- From-Scratch Models: Simple neural or statistical models are implemented without relying on prebuilt pipelines, keeping the focus on core math and preprocessing logic.
- Transparent Metrics: Accuracy, precision, recall, and F1 are logged per modality to contextualize performance across balanced and imbalanced settings.
- Training Curves: SVG plots visualize 50-epoch loss trajectories for reproducibility and quick diagnostics.
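A deterministic three-way split in the spirit of the scripts above can be sketched as follows (the `three_way_split` helper, the seed, and the split fractions are illustrative, not the repo's actual values):

```python
import random

def three_way_split(items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle deterministically, then carve out validation and test partitions."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps splits reproducible
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
```

Fixing the seed means every run evaluates against the same held-out examples, which is what makes metrics comparable across experiments.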
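The logged metrics can all be derived from the binary confusion counts; a from-scratch sketch consistent with that approach (the `binary_metrics` name is hypothetical — the repo's scripts have their own logging code) is:

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against zero denominators on degenerate predictions.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Reporting precision and recall alongside accuracy matters most in the imbalanced settings mentioned above, where accuracy alone can look deceptively good.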
- Create a Python virtual environment with PyTorch, Torchaudio, Torchvision, Pandas, NumPy, and Scikit-learn.
- Choose a modality folder and run the corresponding script under `src/` (e.g., `python text/src/train_text_models.py`).
- Review the generated `*_metrics.json` files and SVG loss curves; consult the modality README for interpretation guidance.
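The steps above can be condensed into a few commands; this is a minimal sketch assuming a POSIX shell and an unpinned `pip` install (the repo does not specify package versions here):

```shell
# Create and activate an isolated environment.
python -m venv .venv
source .venv/bin/activate

# Install the dependencies listed above.
pip install torch torchaudio torchvision pandas numpy scikit-learn

# Run one modality's pipeline; metrics and plots land in its folder.
python text/src/train_text_models.py
```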
Each pipeline is intentionally lightweight to encourage experimentation with feature engineering, model capacity, and evaluation techniques across different data types.