CipherWatch

An ML-powered CLI that detects suspicious encrypted traffic from flow metadata and explains why a flow was flagged.

What It Does

Monitors encrypted network flow patterns without looking at packet payloads.
Flags suspicious connections that behave differently from normal traffic.
Explains each alert in plain terms so analysts can act faster.

How It Works

data_loader.py loads CIC-IDS2017 CSV files, cleans inf/NaN rows, and standardizes numeric features with StandardScaler.
train_model.py trains an IsolationForest on a sample of BENIGN flows only, so the model learns normal behavior without using attack labels; it then scores a sample of traffic and saves the anomaly results (results.csv).
explain.py uses shap.TreeExplainer on the trained Isolation Forest to produce per-flow top feature impacts (top 3 by absolute SHAP value), including inverse-transformed original feature values.
cli.py exposes the full workflow through train and analyze commands with Rich-formatted terminal output.
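
The load, clean, scale, train, and score steps above can be sketched end to end as follows. This is a minimal sketch on synthetic data, not the project's actual code: the array shapes, the `contamination=0.01` setting, dropping inf/NaN rows outright, and fitting the scaler on all cleaned rows are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for flow features (hypothetical, not CIC-IDS2017 data).
benign = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
attack = rng.normal(loc=8.0, scale=1.0, size=(20, 4))
X = np.vstack([benign, attack])
labels = np.array(["BENIGN"] * 500 + ["ATTACK"] * 20)

# Inject one inf and one NaN to mimic bad rows in the raw CSV files.
X[0, 1] = np.inf
X[1, 2] = np.nan

# 1. Keep only rows where every feature is finite (assumed cleaning rule).
finite = np.isfinite(X).all(axis=1)
X, labels = X[finite], labels[finite]

# 2. Standardize numeric features to zero mean and unit variance.
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# 3. Fit the forest on benign flows only, so it models "normal" traffic.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
model.fit(X_scaled[labels == "BENIGN"])

# 4. Score all flows: lower decision_function values are more anomalous,
#    and predict() returns -1 for flows on the anomalous side.
scores = model.decision_function(X_scaled)
flagged = model.predict(X_scaled) == -1
```

In scikit-learn, negative `decision_function` values mark outliers and positive values mark inliers, so a negative anomaly score corresponds to a flagged flow if the project scores flows with the same call (an assumption).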

Project Structure

CIPHERWATCH/
├── cli.py
├── data_loader.py
├── train_model.py
├── explain.py
├── requirements.txt
├── README.md
├── .gitignore
├── data/                  # CIC-IDS2017 MachineLearningCSV files (local)
├── model.pkl              # Generated after training
├── scaler.pkl             # Generated after preprocessing/training
├── results.csv            # Generated scored sample
└── explanations.json      # Generated SHAP explanation summary

Setup

git clone <your-repo-url>
cd bluff
python -m venv venv
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate
pip install -r requirements.txt

Run

python cli.py train --folder ./data
python cli.py analyze --results ./results.csv
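
For orientation, a command layout like the one above could be wired up with `argparse` subcommands. This is only a sketch of the interface shape: the real cli.py may use a different argument parser, and the help strings and `build_parser` name are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with `train` and `analyze` subcommands (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="cli.py")
    sub = parser.add_subparsers(dest="command", required=True)

    # `train --folder ./data`: folder holding the dataset CSV files.
    train = sub.add_parser("train", help="train the model and score a sample")
    train.add_argument("--folder", required=True)

    # `analyze --results ./results.csv`: scored sample to explain.
    analyze = sub.add_parser("analyze", help="explain flagged flows")
    analyze.add_argument("--results", required=True)
    return parser


# Parse the same arguments used in the commands above.
train_args = build_parser().parse_args(["train", "--folder", "./data"])
analyze_args = build_parser().parse_args(["analyze", "--results", "./results.csv"])
```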

Example Output

┌─────────────────────────────────────────────┐
│  ⚠  ANOMALY DETECTED                        │
│  Flow #37166   Label: DoS Hulk              │
│  Anomaly Score: -0.051                      │
└─────────────────────────────────────────────┘
Why flagged:
→ Idle Min           98600000.0
→ Bwd IAT Std        56900000.0
→ Max Packet Length  11595.0

┌─────────────────────────────────────────────┐
│ Summary                                     │
│ Total flows analyzed: 100000                │
│ Total anomalies: 1000                       │
│ Attack types found: DoS Hulk, DDoS, ...     │
│ Saved: explanations.json                    │
└─────────────────────────────────────────────┘
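
The "Why flagged" lines list the top 3 features by absolute SHAP value, shown with their original (unscaled) values. A minimal sketch of that ranking step, assuming the SHAP values, scaled inputs, and scaler statistics are already available as plain lists, might look like this. The function name and all numbers below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def top_feature_impacts(shap_values, scaled_values, means, scales, names, k=3):
    """Rank features by |SHAP value| and report values in original units."""
    ranked = sorted(
        range(len(names)), key=lambda i: abs(shap_values[i]), reverse=True
    )
    return [
        {
            "feature": names[i],
            "shap": shap_values[i],
            # Undo standardization: original = scaled * scale + mean.
            "value": scaled_values[i] * scales[i] + means[i],
        }
        for i in ranked[:k]
    ]


# Illustrative inputs only (not taken from a real run).
names = ["Idle Min", "Bwd IAT Std", "Max Packet Length", "Flow Duration"]
shap_values = [-0.9, -0.6, 0.4, 0.1]
scaled_values = [2.0, 1.5, 3.0, 0.0]
means = [10.0, 20.0, 100.0, 5.0]
scales = [5.0, 4.0, 50.0, 1.0]

top3 = top_feature_impacts(shap_values, scaled_values, means, scales, names)
for item in top3:
    print(f"→ {item['feature']:<18} {item['value']}")
```

Ranking by absolute value keeps features that push the score in either direction, which is why the sign of the SHAP value is stored but not used for ordering.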

Dataset Credit

Dataset: CIC-IDS2017
Source: Canadian Institute for Cybersecurity, University of New Brunswick (UNB)

Limitations

Isolation Forest is unsupervised and learns a notion of "normal," so some attack families with behavior close to benign traffic can be missed.
In this project's run, detection rates for PortScan and Bot can be near 0%, likely because the flow statistics of those attacks in the sampled traffic are not clearly separated from benign traffic under the model's assumptions.
Only flow-level metadata is used (no packet payload inspection), so certain subtle threats are out of scope for this version.
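
One way to check per-attack coverage like this is to compute the share of flagged flows for each label. The sketch below assumes each scored flow is available as a (label, flagged) pair; the actual column names in results.csv are not specified here, and the sample rows are made up for illustration.

```python
from collections import defaultdict


def detection_rate_by_label(rows):
    """Return the fraction of flagged flows per label.

    `rows` is an iterable of (label, flagged) pairs, where `flagged`
    is True when the model marked the flow as an anomaly.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for label, flagged in rows:
        totals[label] += 1
        hits[label] += int(flagged)
    return {label: hits[label] / totals[label] for label in totals}


# Made-up sample: every DoS Hulk flow is flagged, no PortScan flow is.
sample = [
    ("DoS Hulk", True),
    ("DoS Hulk", True),
    ("PortScan", False),
    ("PortScan", False),
    ("BENIGN", False),
    ("BENIGN", True),
]
rates = detection_rate_by_label(sample)
```

A rate near 0.0 for an attack label indicates that the model treats those flows as normal, while the rate for BENIGN is the false-positive share.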
