An ML-powered CLI that detects suspicious encrypted traffic from flow metadata and explains why a flow was flagged.
- Monitors encrypted network flow patterns without looking at packet payloads.
- Flags suspicious connections that behave differently from normal traffic.
- Explains each alert in plain terms so analysts can act faster.
- data_loader.py loads CIC-IDS2017 CSV files, cleans inf/NaN rows, and standardizes numeric features with StandardScaler.
- train_model.py trains an IsolationForest on sampled BENIGN flows (unsupervised normal-behavior learning), then scores sampled traffic and saves anomaly outputs.
- explain.py uses shap.TreeExplainer on the trained Isolation Forest to produce per-flow top feature impacts (top 3 by absolute SHAP value), including inverse-transformed original feature values.
- cli.py exposes the full workflow through train and analyze commands with Rich-formatted terminal output.
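The preprocessing and training steps described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code: the three feature columns, the sample sizes, and the IsolationForest settings are assumptions chosen for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for benign flow features (3 hypothetical numeric columns).
benign = rng.normal(loc=[100.0, 5.0, 0.5], scale=[10.0, 1.0, 0.1], size=(500, 3))

# Simulate the inf/NaN values that can appear in flow CSV exports.
benign[0, 0] = np.inf
benign[1, 1] = np.nan

# Cleaning step: keep only rows where every feature is finite.
clean = benign[np.isfinite(benign).all(axis=1)]

# Standardize, then fit the forest on benign traffic only.
scaler = StandardScaler().fit(clean)
model = IsolationForest(n_estimators=100, random_state=0)
model.fit(scaler.transform(clean))

# Score one typical-looking flow and one extreme flow.
flows = np.array([[100.0, 5.0, 0.5], [1000.0, 50.0, 5.0]])
scaled_flows = scaler.transform(flows)
scores = model.decision_function(scaled_flows)  # lower means more anomalous
labels = model.predict(scaled_flows)            # -1 = anomaly, 1 = normal

print(scores, labels)
```

Because the forest only ever sees benign rows, any flow whose features fall far outside that distribution gets a low score, which is the behavior the analyze step relies on.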
CIPHERWATCH/
├── cli.py
├── data_loader.py
├── train_model.py
├── explain.py
├── requirements.txt
├── README.md
├── .gitignore
├── data/ # CIC-IDS2017 MachineLearningCSV files (local)
├── model.pkl # Generated after training
├── scaler.pkl # Generated after preprocessing/training
├── results.csv # Generated scored sample
└── explanations.json # Generated SHAP explanation summary
git clone <your-repo-url>
cd bluff
python -m venv venv
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
pip install -r requirements.txt
python cli.py train --folder ./data
python cli.py analyze --results ./results.csv
┌─────────────────────────────────────────────┐
│ ⚠ ANOMALY DETECTED │
│ Flow #37166 Label: DoS Hulk │
│ Anomaly Score: -0.051 │
└─────────────────────────────────────────────┘
Why flagged:
→ Idle Min 98600000.0
→ Bwd IAT Std 56900000.0
→ Max Packet Length 11595.0
┌─────────────────────────────────────────────┐
│ Summary │
│ Total flows analyzed: 100000 │
│ Total anomalies: 1000 │
│ Attack types found: DoS Hulk, DDoS, ... │
│ Saved: explanations.json │
└─────────────────────────────────────────────┘
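The "Why flagged" lines in the output above list the top 3 features by absolute SHAP value alongside their original (unscaled) values. A rough sketch of that selection step is shown here. The function name `top_features`, the SHAP values, and the scaler statistics are all hypothetical; in the real pipeline the SHAP row would come from shap.TreeExplainer and the mean/scale from the fitted StandardScaler.

```python
import numpy as np

def top_features(shap_row, scaled_row, feature_names, mean, scale, k=3):
    """Return the k features with the largest absolute SHAP value.

    Each entry is (feature name, SHAP value, original feature value).
    """
    # Sort feature indices by descending absolute SHAP value.
    order = np.argsort(-np.abs(shap_row))[:k]
    # Undo standardization: x = z * scale + mean (what inverse_transform does).
    original = scaled_row * scale + mean
    return [(feature_names[i], float(shap_row[i]), float(original[i])) for i in order]

# Illustrative inputs only; none of these numbers come from a real run.
names = ["Idle Min", "Bwd IAT Std", "Max Packet Length", "Flow Duration"]
shap_row = np.array([-0.9, -0.6, -0.3, 0.1])
scaled_row = np.array([1.0, -0.5, 2.0, 0.0])
mean = np.array([10.0, 20.0, 30.0, 40.0])
scale = np.array([2.0, 4.0, 5.0, 10.0])

top = top_features(shap_row, scaled_row, names, mean, scale)
for name, impact, value in top:
    print(f"→ {name} {value} (SHAP {impact})")
```

Ranking by absolute value keeps features that push the score strongly in either direction, and reporting the unscaled value gives the analyst a number they can compare against raw flow records.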
- Dataset: CIC-IDS2017
- Source: Canadian Institute for Cybersecurity, University of New Brunswick (UNB)
- Isolation Forest is unsupervised and learns a notion of "normal," so some attack families with behavior close to benign traffic can be missed.
- In this project run, PortScan and Bot detection can appear near 0% because sampled traffic and model assumptions may not separate those patterns strongly from normal flow statistics.
- Only flow-level metadata is used (no packet payload inspection), so certain subtle threats are out of scope for this version.
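One way to check which attack families are being missed is to compute the fraction of flows flagged per label. The sketch below assumes each scored flow can be reduced to a (label, flagged) pair; the helper name `detection_rate_by_label` and the sample rows are hypothetical and do not reflect the actual layout of results.csv.

```python
from collections import defaultdict

def detection_rate_by_label(rows):
    """Compute the fraction of flows flagged as anomalous for each label.

    rows: iterable of (label, is_anomaly) pairs.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for label, is_anomaly in rows:
        totals[label] += 1
        if is_anomaly:
            flagged[label] += 1
    return {label: flagged[label] / totals[label] for label in totals}

# Made-up sample: PortScan flows are never flagged, mirroring the limitation above.
rows = [
    ("BENIGN", False), ("BENIGN", False), ("BENIGN", True), ("BENIGN", False),
    ("DoS Hulk", True), ("DoS Hulk", True),
    ("PortScan", False), ("PortScan", False), ("PortScan", False),
]

rates = detection_rate_by_label(rows)
for label, rate in rates.items():
    print(f"{label}: {rate:.0%} flagged")
```

A rate near 0% for an attack label suggests that its flow statistics sit inside the learned benign region, while the BENIGN rate gives a rough false-positive estimate.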