Credit Card Transaction Classifier

A machine learning pipeline designed to identify fraudulent transactions with high precision and recall. This project focuses on handling extreme class imbalance and preventing data leakage through a robust pipeline architecture.

Project Structure

The project is organised as follows:

credit-card-fraud-ml/
│
├── data/           # Source CSV data (e.g. creditcard.csv)
├── docs/           # Technical report and documentation
├── figures/        # Exported Precision-Recall curves and EDA plots
├── models/         # Serialised champion model (.pkl)
├── notebooks/      # EDA and benchmarking experiments
├── src/            # Python scripts and utilities
├── requirements.txt
└── README.md

Methodology

Anti-Leakage Pipeline: Uses imblearn.pipeline.Pipeline so that the RobustScaler is fitted strictly on training data, which prevents look-ahead bias from test-set statistics.
Cost-Sensitive Learning: Addresses the extreme class imbalance (about 0.17% of transactions are fraudulent) by using scale_pos_weight in XGBoost and class_weight='balanced' in the linear models. This proved more effective than naive scaling during experimentation.
Metric Focus: Prioritises AUPRC (Area Under the Precision-Recall Curve), which rewards high fraud detection (recall) while penalising false alarms (low precision). Accuracy is uninformative at this level of imbalance.
Model Suite: Compares Logistic Regression, Random Forest, Hist-Gradient Boosting, XGBoost, and Linear SVM.
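
The points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in src/: it uses a synthetic dataset in place of creditcard.csv, a Logistic Regression as the only model, and sklearn.pipeline.Pipeline as a stand-in (imblearn.pipeline.Pipeline is a subclass of it and can be swapped in unchanged). The "average_precision" scorer is scikit-learn's step-wise estimate of AUPRC.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline  # imblearn.pipeline.Pipeline is a drop-in subclass
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the real data (assumption): roughly 2% positives.
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98], random_state=0
)

# Scaler and classifier live in one pipeline, so each cross-validation fold
# refits the scaler on that fold's training split only.
pipeline = Pipeline([
    ("scaler", RobustScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

# Stratified folds keep the rare positive class present in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auprc_scores = cross_val_score(
    pipeline, X, y, cv=cv, scoring="average_precision"
)

# Ratio commonly passed to XGBoost as scale_pos_weight: negatives / positives.
n_pos = int(y.sum())
n_neg = int(len(y) - n_pos)
scale_pos_weight = n_neg / n_pos

print(f"mean AUPRC: {auprc_scores.mean():.3f}")
print(f"scale_pos_weight: {scale_pos_weight:.1f}")
```

Keeping the scaler inside the pipeline is the design choice that matters here: scaling the full dataset before splitting would let validation-fold medians and quantiles influence the training features.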

How to Run

Install dependencies:
```
pip install -r requirements.txt
```
Explore the Research: Review notebooks/eda.ipynb for data insights and notebooks/experiments.ipynb to see the benchmarking process and leakage analysis.
Train the Champion Model: Execute the training suite to benchmark all models and serialise the best performer to the models/ directory:
```
python src/train.py
```
Verify Results: Run the final verification script to load the serialised pipeline and test it against the held-out dataset. This will also save a final verification plot to figures/:
```
python src/test.py
```
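
As a rough picture of what the verification step involves, the sketch below serialises a fitted pipeline, reloads it, and scores it on a held-out split. It is an assumption-laden illustration rather than the contents of src/test.py: the data is synthetic, the file name champion.pkl is hypothetical, and the standard pickle module is assumed as the serialiser because the README only states that the model is stored as a .pkl file.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in for the real data (assumption): roughly 3% positives.
X, y = make_classification(
    n_samples=4000, n_features=10, weights=[0.97], random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=1
)

# Fit on the training split only; the test split stays untouched until scoring.
pipeline = Pipeline([
    ("scaler", RobustScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
]).fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "champion.pkl"  # hypothetical file name
    with model_path.open("wb") as fh:
        pickle.dump(pipeline, fh)
    with model_path.open("rb") as fh:
        restored = pickle.load(fh)

# Score the reloaded pipeline on the held-out data using fraud probabilities.
fraud_scores = restored.predict_proba(X_test)[:, 1]
auprc = average_precision_score(y_test, fraud_scores)
print(f"held-out AUPRC: {auprc:.3f}")
```

Because the scaler is stored inside the pickled pipeline, the reloaded object applies the training-set scaling to new data without any separate preprocessing step.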
