Assistant-focused model suite for command understanding, hierarchical NLU, and runtime-safe execution
Start Here | Model Explorer | Quick Paths | Fair Benchmarks | Current Model List | Add New Model
For first-time visitors, begin with Janus:
- Featured model: JaneGPT-v2-Janus
- Full docs: JaneGPT-v2-Janus/README.md
- Try now: JaneGPT-v2-Janus/examples/demo_runtime.py
If you want a simpler intent-only baseline, compare the two tracks:
| Track | Best For | What You Get | Jump |
|---|---|---|---|
| JaneGPT-v2-Janus | Real assistant runtime flows | Hierarchical (domain, action, slots), clarifications, pending-slot fill, follow-up state | Open Janus |
| JaneGPT-v2 | Fast intent routing baseline | Lightweight 22-intent classifier with simple integration | Open v2 |
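The Janus track's hierarchical output and pending-slot behavior can be sketched roughly as follows. This is a minimal illustration only: the class and function names (`NLUResult`, `next_step`) are hypothetical, and the real runtime API lives in JaneGPT-v2-Janus/runtime/jane_nlu_runtime.py and may differ.

```python
from dataclasses import dataclass, field

# Hypothetical shape of a hierarchical NLU result (domain, action, slots);
# illustrative only, not the repo's actual API.
@dataclass
class NLUResult:
    domain: str                                         # e.g. "home_automation"
    action: str                                         # e.g. "turn_on_lights"
    slots: dict = field(default_factory=dict)           # filled slot values
    pending_slots: list = field(default_factory=list)   # slots still needed

def next_step(result: NLUResult) -> str:
    """Decide what the assistant should do with a parsed turn."""
    if result.pending_slots:
        # Ask a clarification question for the first missing slot;
        # a runtime would hold this state until the user answers.
        return f"clarify:{result.pending_slots[0]}"
    return f"execute:{result.domain}.{result.action}"

# A turn missing its "room" slot triggers a clarification...
partial = NLUResult("home_automation", "turn_on_lights", pending_slots=["room"])
print(next_step(partial))   # clarify:room

# ...and once the slot is filled, the command executes.
partial.slots["room"] = "kitchen"
partial.pending_slots.clear()
print(next_step(partial))   # execute:home_automation.turn_on_lights
```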
Janus quick navigation:
- Docs: JaneGPT-v2-Janus/README.md
- Runtime demo: JaneGPT-v2-Janus/examples/demo_runtime.py
- Inference demo: JaneGPT-v2-Janus/examples/demo_inference.py
- Runtime source: JaneGPT-v2-Janus/runtime/jane_nlu_runtime.py
v2 quick navigation:
- Docs: JaneGPT-v2/README.md
- Basic inference: JaneGPT-v2/examples/basic_inference.py
- Classifier wrapper: JaneGPT-v2/model/classifier.py
| I Want To... | Go Here |
|---|---|
| Understand Janus architecture and runtime behavior | JaneGPT-v2-Janus/README.md |
| Run assistant-style multi-turn behavior | JaneGPT-v2-Janus/examples/demo_runtime.py |
| Run pure model inference only | JaneGPT-v2-Janus/examples/demo_inference.py |
| Use a simpler classifier baseline | JaneGPT-v2/README.md |
| Benchmark classifier performance | JaneGPT-v2/examples/benchmark.py |
| View fair benchmark summary | JaneGPT-v2-Janus/reports/fair_benchmarks.md |
Only schema-aligned or schema-agnostic benchmarks are shown here.
| Fair Test | JaneGPT-v2 | JaneGPT-v2-Janus | Why It Is Fair |
|---|---|---|---|
| Latency (CUDA, batch=1) | 31.60 ms mean, 32 preds/sec | 25.31 ms mean, 34.60 ms p95 | Same local hardware and same benchmark pipeline |
| Runtime reliability suite (82 turns) | - | 67 local commands, 3 Llama routes, 12 clarifications, 0 errors | In-domain assistant behavior with strict pass/fail |
| OOD rejection on BANKING77 | OOD F1: 94.31% | OOD F1: 87.80% | Label-schema independent safety test |
| OOD rejection on CLINC OOS | OOD F1: 89.16% | OOD F1: 79.23% | Label-schema independent safety test |
Jane was trained on assistant commands such as "turn_on_lights" and "set_reminder".
MASSIVE and SNIPS use different label names ("light_on", "alarm_set", etc.), and only ~50% of their labels could be mapped, so cross-dataset accuracy scores would misrepresent Jane's real quality.
Instead, we report OOD safety tests (out-of-domain rejection), which measure how reliably the model refuses requests outside its training domain, regardless of label schema.
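A schema-agnostic OOD rejection test can be sketched as below: reject an input when the model's top-class confidence is low, then score rejection as a binary classification with F1. This is a minimal sketch under stated assumptions: the threshold value and the helper names (`is_ood`, `ood_f1`) are illustrative and are not the repo's actual benchmark code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def is_ood(logits, threshold=0.5):
    # Reject when the model's top-class confidence is below the threshold.
    return max(softmax(logits)) < threshold

def ood_f1(logits_batch, is_ood_labels, threshold=0.5):
    """F1 of OOD rejection, treating 'OOD' as the positive class."""
    preds = [is_ood(l, threshold) for l in logits_batch]
    tp = sum(p and y for p, y in zip(preds, is_ood_labels))
    fp = sum(p and not y for p, y in zip(preds, is_ood_labels))
    fn = sum((not p) and y for p, y in zip(preds, is_ood_labels))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

# Confident in-domain logits are accepted; flat "don't know" logits from
# an off-topic query (e.g. a banking question) are rejected as OOD.
in_domain = [5.0, 0.1, 0.2]     # high max probability -> accepted
off_topic = [0.4, 0.5, 0.45]    # nearly flat -> rejected as OOD
print(ood_f1([in_domain, off_topic], [False, True]))  # 1.0
```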
Understanding These Benchmarks:
| Benchmark | What It Tests | What It Means | Example |
|---|---|---|---|
| Latency | How fast Jane runs per prediction | Speed is critical for real-time assistants. Under 50ms = excellent; over 200ms = noticeable lag | User says "turn on lights" → model responds in ~25ms (Janus) or ~32ms (v2) |
| Runtime Reliability | Can Jane handle 82 multi-turn conversations without crashing? | 0 errors = production-ready; 10+ errors = unstable. Tests real assistant behavior (clarifications, slot filling, state changes) | Turn 1: "Set alarm" → Turn 45: "Change to 3pm" → Turn 82: Still perfect |
| OOD Safety (BANKING77) | Can Jane reject finance questions when trained on home automation? | Tests this model's judgment. ~90% F1 = excellent (rejects what it shouldn't handle). Under 60% = dangerous (would give wrong answers) | User asks "What's my account balance?" → Jane correctly says "I can't help with that" |
| OOD Safety (CLINC) | Can Jane reject random real-world off-topic requests? | Similar to BANKING77 but with diverse random questions. Proves this model knows its limits | User asks "What's the capital of France?" → Jane correctly rejects it |
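The latency methodology above (batch=1, mean and p95 over repeated single predictions) can be reproduced with a simple timing loop like this sketch. The `predict` function here is a stand-in placeholder, not either model's real inference call; plug in the actual model to get comparable numbers.

```python
import time
import statistics

def predict(text):
    """Placeholder for a single model inference call (batch=1)."""
    time.sleep(0.001)  # simulate ~1 ms of model work
    return "turn_on_lights"

def benchmark(texts, warmup=5, runs=100):
    for _ in range(warmup):            # warm-up iterations, excluded from stats
        predict(texts[0])
    times_ms = []
    for i in range(runs):
        start = time.perf_counter()
        predict(texts[i % len(texts)])
        times_ms.append((time.perf_counter() - start) * 1000.0)
    times_ms.sort()
    mean_ms = statistics.mean(times_ms)
    return {
        "mean_ms": mean_ms,
        "p95_ms": times_ms[int(0.95 * len(times_ms)) - 1],
        "preds_per_sec": 1000.0 / mean_ms,
    }

stats = benchmark(["turn on lights", "set alarm for 7am"])
print(stats)
```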
Bottom Line: Jane is SOLID ✅
- Fast enough for real users (25-31ms per prediction)
- Stable enough for production (0 crashes in 82 turns)
- Safe enough to deploy (87-94% OOD rejection F1)
Full detailed report: JaneGPT-v2-Janus/reports/fair_benchmarks.md
| Model | Purpose | Key Highlights | Recommended Entry |
|---|---|---|---|
| JaneGPT-v2-Janus | Hierarchical NLU with runtime state | 7.95M params, domain+action+slot output, clarification/pending-slot runtime | JaneGPT-v2-Janus/README.md |
| JaneGPT-v2 | Intent classification baseline | 7.8M params, 22 intents, fast lightweight classifier | JaneGPT-v2/README.md |
- assets - shared visual assets
- JaneGPT-v2-Janus - hierarchical NLU + runtime package
- JaneGPT-v2 - intent classifier package
Each model folder defines its own license.
Ravindu Senanayake
