The Nerves of the AGI OS: training, fine-tuning, experience collection, and model deployment
JARVIS Reactor (Reactor-Core) is the training and learning layer of the JARVIS AGI ecosystem. It provides ML training (DPO, RLHF, curriculum, meta-learning, world models, causal reasoning), model serving with hot-reload, experience collection from JARVIS Body, model deployment to JARVIS-Prime, and Trinity Protocol integration for cross-repo coordination. As of v244.0 (JARVIS Body-side), command lifecycle events now flow through TrinityEventBus providing richer training signals (intent, domain, execution outcomes) for DPO pair generation, and brain vacuum fallback properly classifies commands during J-Prime downtime (producing valid telemetry even during outages). It is started either standalone (run_reactor.py) or by the unified supervisor in JARVIS (python3 unified_supervisor.py).
This session extends Reactor-Core's Trinity ingestion surface so biometric unlock outcomes become first-class learning signals instead of being dropped or collapsed into generic events.
backend/core/telemetry/events.py now defines three unlock-specific events:
- VOICE_UNLOCK_GRANTED
- VOICE_UNLOCK_DENIED
- VOICE_UNLOCK_ROUTING
These events distinguish route correctness from auth outcome, enabling cleaner causal analysis in training data.
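A sketch of what this event separation might look like. The three event names come from events.py; the Enum shape and the string values are assumptions for illustration:

```python
from enum import Enum

class TelemetryEvent(str, Enum):
    # Hypothetical enum shape; the real events.py may organize these differently.
    VOICE_UNLOCK_GRANTED = "voice_unlock_granted"   # auth succeeded
    VOICE_UNLOCK_DENIED = "voice_unlock_denied"     # auth failed
    VOICE_UNLOCK_ROUTING = "voice_unlock_routing"   # route decision, separate from auth

# Route correctness and auth outcome stay separable downstream:
routing_events = {e for e in TelemetryEvent if e.name.endswith("_ROUTING")}
auth_events = set(TelemetryEvent) - routing_events
```

Keeping routing and auth in disjoint event types is what lets a trainer attribute a failure to the router versus the authenticator.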
backend/api/unified_command_processor.py::_handle_voice_unlock_action() now emits auth outcome telemetry after each unlock attempt using fire-and-forget semantics.
- Preserves unlock UX latency while still producing training telemetry.
- Captures both positive and negative authentication outcomes.
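A minimal sketch of the fire-and-forget pattern described above (function names are hypothetical; the real handler lives in unified_command_processor.py):

```python
import asyncio

async def emit_telemetry(event: str, payload: dict) -> None:
    # Stand-in for the real async telemetry sink.
    await asyncio.sleep(0)

def emit_fire_and_forget(event: str, payload: dict) -> "asyncio.Task":
    # Schedule emission without awaiting it, so the unlock path returns
    # to the user immediately; errors are consumed, never raised to UX.
    task = asyncio.create_task(emit_telemetry(event, payload))
    task.add_done_callback(lambda t: t.cancelled() or t.exception())
    return task

async def handle_unlock(granted: bool) -> str:
    event = "VOICE_UNLOCK_GRANTED" if granted else "VOICE_UNLOCK_DENIED"
    emit_fire_and_forget(event, {"granted": granted})
    return "unlocked" if granted else "denied"
```

The done-callback retrieves any exception so a failed emission cannot surface as an unhandled-task warning in the unlock path.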
reactor_core/ingestion/telemetry_ingestor.py now maps unlock event types to InteractionOutcome categories used by the training pipeline.
This allows unlock routing/auth behavior to be represented in downstream datasets with consistent labels rather than ad hoc parsing.
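The mapping might look like the following sketch. Only the unlock event names come from the source; the InteractionOutcome members shown here are assumed labels, not the pipeline's real set:

```python
from enum import Enum, auto

class InteractionOutcome(Enum):
    # Hypothetical subset of the training pipeline's label set.
    SUCCESS = auto()
    FAILURE = auto()
    ROUTING = auto()

UNLOCK_EVENT_OUTCOMES = {
    "VOICE_UNLOCK_GRANTED": InteractionOutcome.SUCCESS,
    "VOICE_UNLOCK_DENIED": InteractionOutcome.FAILURE,
    "VOICE_UNLOCK_ROUTING": InteractionOutcome.ROUTING,
}

def classify(event_type: str):
    # Unknown event types fall through to None instead of ad hoc parsing.
    return UNLOCK_EVENT_OUTCOMES.get(event_type)
```

A static table like this gives downstream datasets one consistent label per event type, which is the hygiene property the ingestor is after.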
With unlock-specific event separation, Reactor can now support:
- Routing quality analysis: detect cases where unlock intent took non-unlock routes before eventual correction.
- Auth outcome baselines: track unlock grant/deny trends over time.
- Preference data hygiene: avoid mixing biometric events into unrelated workspace/general quality metrics.
Cross-repo routing nuance tests for unlock phrasing passed 50/50 this session, indicating stable end-to-end routing for the tested command variants.
| Role | Repository | Responsibility |
|---|---|---|
| Body | JARVIS (JARVIS-AI-Agent) | macOS integration, computer use, unified supervisor, voice/vision |
| Mind | JARVIS-Prime | LLM inference, Neural Orchestrator Core, OpenAI-compatible API |
| Nerves | Reactor-Core (this repo) | Training, fine-tuning, experience collection, model deployment, Trinity coordination |
Reactor-Core is the nervous system: it trains and improves models, collects experience from JARVIS, and deploys models to JARVIS-Prime. The unified supervisor in JARVIS discovers and starts Reactor-Core (default port 8090) alongside JARVIS-Prime (8000) and the JARVIS backend (8010).
JARVIS Reactor is a production-grade ML infrastructure combining:
- Advanced Training Methods: DPO, RLHF, Constitutional AI, Curriculum Learning, Meta-Learning, World Models, Causal Reasoning
- Model Serving: Hot-reload model server with multi-backend support (vLLM, llama.cpp, MLX, Transformers)
- Async Infrastructure: Circuit breakers, backpressure, bulkheads, dead letter queues, structured concurrency
- API Platform: FastAPI server with telemetry, scheduling, model registry, health monitoring
- Trinity Orchestration: Multi-repo coordination with heartbeat monitoring and state sync
- Event Streaming: Real-time WebSocket/Redis pub-sub across JARVIS ecosystem
- GCP Integration: Spot VM resilience, Cloud SQL storage, auto-checkpointing
- MLForge C++ Core: High-performance ML primitives (optional submodule)
- Unified Supervisor: One-command startup for the entire AGI OS ecosystem (python3 run_supervisor.py)
- Connection Pooling: Efficient HTTP/Redis connection management with automatic lifecycle
- Dynamic Configuration: Zero hardcoding, XDG-compliant paths, environment-driven config
- Structured Concurrency: Python 3.11+ TaskGroup patterns for robust async operations
- Architecture
- Key Features
- Installation
- Quick Start
- Unified Supervisor (One-Command Startup)
- Advanced Features
- Integration Architecture
- Configuration Guide
- API Documentation
- Troubleshooting
- Development Guide
- Version History
- Roadmap
- Links
+--------------------------------------------------------------+
|               AGI OS UNIFIED SUPERVISOR v92.0                |
|                  (Central Coordination Hub)                  |
|                  python3 run_supervisor.py                   |
+--------------------------------------------------------------+
                               |
           +-------------------+--------------------+
           |                   |                    |
           v                   v                    v
     +-----------+      +--------------+      +-----------+
     |  JARVIS   |----->|   TRINITY    |----->|  J-PRIME  |
     |  (Body)   |Events| ORCHESTRATOR |Events|  (Mind)   |
     |           |      |              |      |           |
     |   macOS   |      |  Heartbeats  |      |    LLM    |
     |  Actions  |      |   Commands   |      | Inference |
     +-----------+      |  State Sync  |      +-----------+
                        +--------------+
                               |
             +-----------------+-----------------+
             |                 |                 |
             v                 v                 v
      +--------------+  +--------------+  +--------------+
      | REACTOR CORE |  |    ONLINE    |  | DISTRIBUTED  |
      |   (Nerves)   |  |   LEARNING   |  |   TRAINING   |
      |              |  |              |  |              |
      |   Training   |  |  Experience  |  |   Multi-VM   |
      |   Learning   |  |    Replay    |  |   Gradient   |
      |   Serving    |  |  EWC/Drift   |  |     Sync     |
      +--------------+  +--------------+  +--------------+
+----------------------------------------------------------------------+
|             REACTOR CORE v2.10.0 (AGI OS Nervous System)             |
+----------------------------------------------------------------------+

  UNIFIED API SERVER (v77.0)
    - Telemetry Collector (+ WebSocket)
    - Night Scheduler (+ Cron Jobs)
    - Model Registry (+ A/B Testing)
    - Health Aggregator
    - Cost Tracker

  HOT-RELOAD MODEL SERVER (v77.1)
    - Multi-backend: vLLM, llama.cpp, MLX, Transformers
    - Zero-downtime model swaps via file watcher
    - LRU cache with memory-aware eviction
    - Priority request queue for SLA compliance
    - Semantic response caching (hash-based deduplication)
    - Circuit breaker for backend failure protection

  ADVANCED TRAINING ENGINE (v76.0-v80.0)
    Experience Buffer -> Data Selector -> Training Router
    - DPO Trainer: preference learning, memory efficient
    - RLHF Pipeline: PPO algorithm, reward modeling, value functions
    - Constitutional AI: self-supervised safety alignment
    - Curriculum Learning: progressive difficulty, adaptive scheduling
    - Meta-Learning: MAML/Reptile, few-shot learning, task adaptation
    - World Models: latent dynamics, planning, counterfactual reasoning
    - Causal Reasoning: SCMs, do-calculus, causal discovery
    - FSDP Training: multi-GPU/node, gradient sharding, memory efficient
    - Federated Learning: cross-repo, Byzantine-robust, privacy-preserving

  ASYNC INFRASTRUCTURE (v76.1, v92.0)
    - CircuitBreaker, Backpressure, DeadLetterQueue
    - Bulkhead, HealthMonitor, AdaptiveRateLimiter
    - TimeoutPolicy, MetricsCollector, AsyncRetry
    - StructuredTaskGroup, ConnectionPool, AsyncBarrier
    - AsyncContextGroup, AsyncLatch, ScatterGather

  TRINITY ORCHESTRATOR (v75.0)
    - Multi-repo heartbeat monitoring (JARVIS, Prime, Reactor)
    - Command routing with intelligent load balancing
    - State reconciliation across the distributed system
    - Dead Letter Queue for failed commands with auto-retry
    - Atomic file I/O (zero-corruption operations)
    - Circuit breakers for fault tolerance

  ONLINE LEARNING & DATA VERSIONING (v91.0)
    - Prioritized experience replay with importance sampling
    - Elastic Weight Consolidation (EWC) - prevents forgetting
    - Concept Drift Detection (Page-Hinkley test)
    - Data Versioning: content-addressed storage (DVC compatible)
    - Lineage tracking and reproducibility

  DISTRIBUTED TRAINING (v91.0)
    - Multi-VM coordination with gradient compression
    - GCP Spot VM checkpointing with predictive preemption
    - Dynamic resource allocation with cost-aware decisions
    - Gradient checksum validation

  EVENT STREAMING (v10.3)
    - WebSocket real-time events with priority queues
    - Redis pub/sub (optional) for scale
    - Safety audit trail with kill switch
    - Cost tracking & budget alerts
    - Multi-transport: WebSocket, file-watching, Redis

         |                     |                      |
         v                     v                      v
  +--------------+      +--------------+      +---------------+
  | MLForge C++  |      |  Cloud SQL   |      |  GCP Storage  |
  |  (Optional)  |      | (Events DB)  |      | (Checkpoints) |
  |   pybind11   |      |  PostgreSQL  |      |  GCS Bucket   |
  +--------------+      +--------------+      +---------------+
JARVIS-Reactor/
├── reactor_core/
│   ├── training/                    # Advanced training methods
│   │   ├── advanced_training.py     # DPO, RLHF, Constitutional AI (2,899 lines)
│   │   ├── unified_pipeline.py      # End-to-end training orchestration
│   │   ├── trainer.py               # Base trainer class
│   │   └── lora.py                  # LoRA/QLoRA implementations
│   ├── serving/                     # Model serving infrastructure
│   │   ├── model_server.py          # Hot-reload model server (1,545 lines)
│   │   └── inference_engine.py      # Multi-backend inference (1,891 lines)
│   ├── api/                         # REST API server
│   │   ├── server.py                # FastAPI endpoints (2,252 lines)
│   │   ├── telemetry.py             # Metrics & observability (1,128 lines)
│   │   ├── scheduler.py             # Night Shift scheduler (1,030 lines)
│   │   ├── model_registry.py        # Model versioning (1,301 lines)
│   │   └── health_aggregator.py     # Health monitoring (999 lines)
│   ├── orchestration/               # Trinity coordination
│   │   └── trinity_orchestrator.py  # Multi-repo orchestrator
│   ├── utils/                       # Core utilities
│   │   ├── async_helpers.py         # Async patterns (1,746 lines)
│   │   └── dependencies.py          # Dependency injection (913 lines)
│   ├── integration/                 # Cross-repo integration
│   │   ├── event_bridge.py          # Event streaming
│   │   ├── cost_bridge.py           # Cost tracking
│   │   ├── jarvis_connector.py      # JARVIS integration
│   │   └── prime_connector.py       # Prime integration
│   ├── eval/                        # Model evaluation
│   │   └── advanced_evaluation.py   # Comprehensive eval suite (1,536 lines)
│   ├── data/                        # Data loading & preprocessing
│   ├── gcp/                         # GCP Spot VM support
│   └── config/                      # Configuration management
├── run_supervisor.py                # AGI OS unified supervisor (1,635 lines)
├── mlforge/                         # C++ ML core (submodule)
├── docker/                          # Docker configurations
├── scripts/                         # Utility scripts
└── tests/                           # Test suite
Total: ~18,996+ lines of production code added in v75.0-v77.1
- DPO (Direct Preference Optimization): Preference learning without reward models
- RLHF (Reinforcement Learning from Human Feedback): Full PPO pipeline
- Constitutional AI: Self-supervised safety alignment
- Curriculum Learning: Progressive difficulty scheduling
- Memory Management: Dynamic batch sizing, gradient checkpointing, CPU offloading
- FSDP Support: Fully Sharded Data Parallel for large models
- Experience Replay: Priority-based sampling from interaction logs
- CircuitBreaker: Automatic failure detection and recovery
- Backpressure: Adaptive load management with queue shedding
- Bulkhead: Failure isolation between components
- DeadLetterQueue: Failed operation tracking and replay
- HealthMonitor: Real-time component health tracking
- AdaptiveRateLimiter: Dynamic rate limiting based on success rates
- TimeoutPolicy: Configurable timeouts with fallback strategies
- MetricsCollector: Comprehensive observability
- FastAPI Server: Production-grade REST API with auto-docs
- Telemetry Collector: Real-time metrics ingestion with WebSocket streaming
- Night Shift Scheduler: Automated training during off-peak hours
- Model Registry: Version management, A/B testing, rollback support
- Health Aggregator: Multi-service health dashboard
- Cost Tracking: Budget alerts and spend analytics
- WebSocket Events: Real-time training progress streaming
- Hot-Reload: Zero-downtime model updates via file watcher
- Multi-Backend Support: vLLM, llama.cpp, MLX, Transformers
- LRU Model Cache: Memory-aware model eviction
- Priority Queue: Request prioritization for SLA compliance
- Semantic Caching: Hash-based response deduplication
- Circuit Breaker: Backend failure protection
- Async Loading: Non-blocking model initialization
- Version Management: Seamless model version switching
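For illustration, semantic response caching with hash-based deduplication and LRU eviction can be sketched as follows (the class name and API are hypothetical, not the real model_server internals):

```python
import hashlib
from collections import OrderedDict

class SemanticResponseCache:
    """Identical (model, prompt, params) requests are served from cache
    instead of re-running inference. A sketch; the real server likely
    adds TTLs and memory-aware eviction on top of this."""

    def __init__(self, max_entries: int = 1024):
        self._cache: OrderedDict[str, str] = OrderedDict()
        self._max = max_entries

    @staticmethod
    def _key(model_id: str, prompt: str, max_tokens: int) -> str:
        # NUL separators prevent ambiguous concatenations from colliding.
        raw = f"{model_id}\x00{prompt}\x00{max_tokens}".encode()
        return hashlib.sha256(raw).hexdigest()

    def get(self, model_id: str, prompt: str, max_tokens: int):
        key = self._key(model_id, prompt, max_tokens)
        if key in self._cache:
            self._cache.move_to_end(key)  # LRU touch
            return self._cache[key]
        return None

    def put(self, model_id: str, prompt: str, max_tokens: int, response: str):
        key = self._key(model_id, prompt, max_tokens)
        self._cache[key] = response
        self._cache.move_to_end(key)
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)  # evict least recently used
```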
- Multi-Repo Coordination: Heartbeat monitoring across JARVIS, Prime, Reactor
- Command Routing: Intelligent load balancing with priority queues
- State Reconciliation: Consistent state across distributed system
- Dead Letter Queue: Failed command tracking and retry
- Atomic File I/O: Zero-corruption file operations (v73.0)
- Self-Heartbeat: Liveness monitoring (v72.0)
- Circuit Breakers: Fault tolerance with automatic recovery
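The atomic file I/O guarantee can be illustrated with the standard temp-file-plus-rename pattern (a sketch of the technique, not Reactor's actual implementation):

```python
import json
import os
import tempfile
from pathlib import Path

def atomic_write_json(path: Path, data: dict) -> None:
    """Write JSON so readers never observe a partial file: write to a
    temp file in the same directory, fsync, then os.replace(), which is
    atomic on POSIX when source and destination share a filesystem."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # data on disk before the rename
        os.replace(tmp, path)     # atomic rename over the target
    except BaseException:
        os.unlink(tmp)            # never leave a stray temp file behind
        raise
```

Writing into the same directory matters: renaming across filesystems is not atomic, so the temp file must live next to its destination.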
- WebSocket Streaming: Real-time event broadcasting
- Redis Pub/Sub: Optional Redis backend for scale
- Event Deduplication: Hash-based duplicate prevention
- Priority System: Safety-critical event prioritization
- Safety Audit Trail: Comprehensive action logging
- Cost Events: Budget tracking with alerts
- Multi-Transport: WebSocket, file-watching, Redis
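Hash-based event deduplication over a sliding window can be sketched as follows (hypothetical class; the real stream presumably hashes a canonical event envelope):

```python
import hashlib
import json
from collections import deque

class EventDeduplicator:
    """Drop events whose canonical hash was seen within the window."""

    def __init__(self, window: int = 10_000):
        self._seen: set = set()
        self._order: deque = deque()
        self._window = window

    def accept(self, event: dict) -> bool:
        # Canonical JSON so key order does not change the hash.
        digest = hashlib.sha256(
            json.dumps(event, sort_keys=True).encode()
        ).hexdigest()
        if digest in self._seen:
            return False  # duplicate: drop
        self._seen.add(digest)
        self._order.append(digest)
        if len(self._order) > self._window:
            self._seen.discard(self._order.popleft())  # age out oldest
        return True
```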
- Spot VM Resilience: Auto-resume from preemption
- Cloud SQL Storage: Event and metric persistence
- GCS Checkpointing: Distributed checkpoint storage
- Auto-Detection: M1 local vs GCP remote environment detection
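A rough sketch of how M1-vs-GCP auto-detection could work (the heuristics and the REACTOR_FORCE_ENV override are assumptions, not the real detector, which would likely also probe the GCE metadata server):

```python
import os
import platform

def detect_environment() -> str:
    """Best-effort guess of the runtime environment."""
    if os.environ.get("REACTOR_FORCE_ENV"):  # explicit override wins
        return os.environ["REACTOR_FORCE_ENV"]
    # Apple Silicon reports Darwin/arm64; GCP training VMs are Linux.
    if platform.system() == "Darwin" and platform.machine() == "arm64":
        return "m1_local"
    if platform.system() == "Linux":
        return "gcp_remote"
    return "unknown"
```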
pip install jarvis-reactor

# Clone with submodules
git clone --recursive https://github.com/drussell23/JARVIS-Reactor.git
cd JARVIS-Reactor
# Install dependencies (requires CMake and pybind11)
pip install pybind11 cmake
# Build and install
pip install -e .

# For local development (M1 Mac)
pip install jarvis-reactor[local]
# For GCP training (32GB+ VM)
pip install jarvis-reactor[gcp]
# For full development (includes testing, linting, docs)
pip install -e ".[dev]"

# Build Docker image
docker-compose build
# Run API server
docker-compose up api
# Run model server
docker-compose up model-server
# Run unified supervisor
docker-compose up supervisor

# From JARVIS-AI-Agent repo: starts Body + Prime + Reactor-Core
cd /path/to/JARVIS-AI-Agent
python3 unified_supervisor.py

Reactor-Core will start on port 8090 and register with Trinity. Health: http://localhost:8090/health.
# From Reactor-Core repo
cd /path/to/Reactor-Core
python3 run_reactor.py --port 8090

from reactor_core import Trainer, TrainingConfig
from reactor_core.gcp import SpotVMCheckpointer
# Configure training
config = TrainingConfig(
model_name="llama-2-7b",
use_lora=True,
lora_rank=16,
num_epochs=3,
batch_size=4,
gradient_checkpointing=True,
)
# Auto-detect environment (M1 local vs GCP remote)
trainer = Trainer(config)
# Train with auto-resume on Spot VM preemption
trainer.train("./data/train.jsonl")

from reactor_core.training.advanced_training import (
DPOTrainer,
DPOConfig,
PreferenceDataset,
)
# Configure DPO
dpo_config = DPOConfig(
model_name="llama-2-7b",
beta=0.1, # KL divergence penalty
learning_rate=5e-7,
max_length=512,
batch_size=4,
)
# Initialize DPO trainer
dpo_trainer = DPOTrainer(dpo_config)
# Train on preference pairs
await dpo_trainer.train(
preference_dataset=PreferenceDataset(
chosen_responses=chosen_data,
rejected_responses=rejected_data,
),
num_epochs=3,
)

from reactor_core.serving.model_server import ModelServer, ModelServerConfig
# Configure model server
config = ModelServerConfig(
models_dir="/path/to/models",
enable_hot_reload=True,
backend="vllm", # or "transformers", "llamacpp", "mlx"
max_cached_models=3,
)
# Initialize server
server = ModelServer(config)
await server.start()
# Serve inference requests
response = await server.predict(
prompt="What is machine learning?",
model_id="llama-2-7b",
max_tokens=256,
)
print(response.text)
# Hot-reload: just update the model file; the server auto-reloads!

# Start API server
uvicorn reactor_core.api.server:app --host 0.0.0.0 --port 8003 --reload

import requests
# Trigger training via API
response = requests.post(
"http://localhost:8003/training/trigger",
json={
"model_name": "llama-2-7b",
"training_type": "dpo",
"config": {
"num_epochs": 3,
"batch_size": 4,
"learning_rate": 5e-7,
},
},
)
# Schedule nightly training
response = requests.post(
"http://localhost:8003/scheduler/schedule",
json={
"name": "nightly_dpo_training",
"schedule_type": "cron",
"cron_expression": "0 2 * * *", # 2 AM daily
"job_config": {
"training_type": "dpo",
"model_name": "llama-2-7b",
},
},
)

from reactor_core.orchestration.trinity_orchestrator import (
initialize_orchestrator,
get_orchestrator,
)
# Initialize orchestrator
orchestrator = await initialize_orchestrator()
# Dispatch command to JARVIS/Prime
await orchestrator.dispatch_command(
intent="start_surveillance",
payload={
"app_name": "Chrome",
"trigger_text": "bouncing ball",
},
target_components=["jarvis"],
)
# Check component health
health = await orchestrator.get_health_status()
print(f"JARVIS: {health['jarvis'].status}")
print(f"Prime: {health['prime'].status}")
print(f"Reactor: {health['reactor'].status}")

| Entry Point | Purpose | When to Use |
|---|---|---|
| Unified Supervisor (JARVIS) | python3 unified_supervisor.py in JARVIS-AI-Agent | Recommended: starts Body + Prime + Reactor-Core with Trinity coordination; discovers Reactor via REACTOR_CORE_REPO_PATH or the default path |
| run_reactor.py | Trinity-integrated Reactor entry point | Standalone Reactor, or when the supervisor calls it (e.g. python3 run_reactor.py --port 8090) |
| run_supervisor.py (in this repo) | Legacy/alternative supervisor in the Reactor repo | When running orchestration from the Reactor repo instead of JARVIS |
The unified supervisor lives in JARVIS-AI-Agent. It starts Reactor-Core by running run_reactor.py (or the configured script) in this repo, typically on port 8090. Reactor exposes /health for supervisor health checks and Trinity state sync.
The unified supervisor is in JARVIS-AI-Agent (unified_supervisor.py). It is the single entry point for the entire AGI OS ecosystem and automatically discovers, starts, and coordinates JARVIS (Body), JARVIS-Prime (Mind), and Reactor-Core (Nerves).
# From JARVIS-AI-Agent repo β start entire AGI OS ecosystem (recommended)
python3 unified_supervisor.py
# With options (see JARVIS-AI-Agent unified_supervisor.py for full CLI)
# python3 unified_supervisor.py --mode supervisor --skip-trinity ...

What the Supervisor Does (in JARVIS-AI-Agent):
- Component Discovery: Automatically finds JARVIS, JARVIS Prime, and Reactor Core repos
- Health Monitoring: Continuous health checks with automatic recovery
- Event Bridge: Sets up real-time event streaming between components
- Trinity Orchestration: Initializes multi-repo coordination
- Service Startup: Starts all Reactor Core services (API, Model Server, Training, etc.)
- Experience Collection: Continuous learning from JARVIS interactions
- Graceful Shutdown: Clean shutdown of all components on Ctrl+C
Startup Phases:
Phase 1: Initialize Trinity Orchestrator
Phase 2: Initialize Event Bridge
Phase 3: Discover Components
Phase 4: Start Reactor Core Services
Phase 5: Initialize v91.0 Advanced Services
Phase 6: Start JARVIS (Body)
Phase 7: Start J-Prime (Mind)
Phase 8: Start Background Tasks
Phase 9: Wait for Component Health
Output Example:
======================================================================
AGI OS UNIFIED SUPERVISOR - PROJECT TRINITY
======================================================================
[Phase 1] Initializing Trinity Orchestrator...
[OK] Trinity Orchestrator running
[Phase 2] Initializing Event Bridge...
[OK] Event Bridge running
[Phase 3] Discovering components...
Found JARVIS at /path/to/JARVIS-AI-Agent
Found J-Prime at /path/to/jarvis-prime
Reactor Core at /path/to/reactor-core
[Phase 4] Starting Reactor Core services...
[OK] Telemetry Collector started
[OK] Model Registry initialized (5 models)
[OK] Health Aggregator started
[OK] Scheduler started (daily/weekly training)
[OK] Model Server started
[Phase 5] Initializing v91.0 Advanced Services...
[OK] Online Learning Engine started
[OK] Distributed Coordinator started
[OK] Data Version Controller started
[OK] Spot VM Checkpointer started
[Phase 6] Starting JARVIS (Body)...
[OK] JARVIS started (PID: 12345)
[Phase 7] Starting J-Prime (Mind)...
[OK] J-Prime started (PID: 12346)
[Phase 8] Starting background services...
[OK] Health monitoring started
[OK] Experience collection started
[OK] Event processing started
[Phase 9] Waiting for component health...
======================================================================
AGI OS READY - All Systems Operational
======================================================================
Component Status:
  JARVIS:        [OK] Running (http://localhost:8000)
  J-Prime:       [OK] Running (http://localhost:8001)
  Reactor API:   [OK] Running (http://localhost:8003)
  Model Server:  [OK] Running (http://localhost:8004)

Background Services:
  Health Monitor:               [OK] Active
  Experience Collector:         [OK] Active (0 experiences collected)
  Event Processor:              [OK] Active
  Trinity Experience Receiver:  [OK] Active

Press Ctrl+C to shutdown gracefully...
Train models on preference pairs without reward models:
from reactor_core.training.advanced_training import DPOTrainer, DPOConfig
config = DPOConfig(
model_name="llama-2-7b",
beta=0.1, # KL divergence penalty
learning_rate=5e-7,
max_length=512,
)
trainer = DPOTrainer(config)
await trainer.train(
preference_dataset=PreferenceDataset(
chosen_responses=chosen_data,
rejected_responses=rejected_data,
),
num_epochs=3,
)

Variants Supported:
- Standard DPO
- IPO (Identity Preference Optimization)
- KTO (Kahneman-Tversky Optimization)
- ORPO (Odds Ratio Preference Optimization)
Full PPO pipeline with reward modeling:
from reactor_core.training.advanced_training import RLHFTrainer, RLHFConfig
config = RLHFConfig(
model_name="llama-2-7b",
reward_model_name="reward-model",
ppo_config={
"clip_epsilon": 0.2,
"value_coef": 0.1,
"entropy_coef": 0.01,
},
)
trainer = RLHFTrainer(config)
await trainer.train(
preference_dataset=preference_data,
num_epochs=3,
)

Progressive difficulty scheduling for faster convergence:
from reactor_core.training.curriculum_learning import CurriculumLearner
curriculum = CurriculumLearner(
model=model,
dataset=dataset,
difficulty_metric="perplexity",
progression_strategy="exponential", # or "linear", "adaptive"
)
# Automatic difficulty progression
await curriculum.train(num_epochs=10)

Benefits: 30-50% faster convergence, better generalization
Few-shot learning with MAML, Reptile, Meta-SGD:
from reactor_core.training.meta_learning import MAMLTrainer
maml = MAMLTrainer(
model=model,
inner_lr=0.01,
outer_lr=0.001,
adaptation_steps=5,
)
# Learn to learn from few examples
await maml.meta_train(
tasks=task_distribution,
meta_batch_size=4,
num_meta_iterations=1000,
)

Learn latent dynamics for planning and counterfactual reasoning:
from reactor_core.training.world_model_training import WorldModelTrainer
world_model = WorldModelTrainer(
latent_dim=512,
action_dim=128,
reward_dim=1,
)
await world_model.train(
trajectories=trajectory_data,
num_epochs=100,
)
# Counterfactual reasoning: "What if I had done X?"
counterfactual = await world_model.imagine_rollout(
initial_state=state,
alternative_action=action,
horizon=10,
)

Understand cause-effect relationships:
from reactor_core.training.causal_reasoning import CausalReasoner
reasoner = CausalReasoner(
model=model,
causal_graph=graph,
)
# Do-calculus: P(Y | do(X))
interventional_prob = await reasoner.interventional_inference(
intervention={"X": value},
query="Y",
)
# Causal discovery
discovered_graph = await reasoner.discover_causality(data)

Python 3.11+ compatible structured concurrency with TaskGroup:
from reactor_core.utils.async_helpers import StructuredTaskGroup, run_in_task_group
# Structured task group with automatic error handling
async with StructuredTaskGroup(
name="training_pipeline",
max_concurrent=5,
cancel_on_error=True,
timeout_seconds=3600.0,
) as tg:
tg.create_task(load_data(), name="data_loading")
tg.create_task(preprocess_data(), name="preprocessing")
tg.create_task(train_model(), name="training")
tg.create_task(validate_model(), name="validation")
# Get results
results = tg.results
for result in results:
if result.success:
print(f"{result.name}: {result.result}")
else:
print(f"{result.name}: {result.exception}")
# Convenience function
results = await run_in_task_group(
[fetch_url(url) for url in urls],
names=[f"fetch_{i}" for i in range(len(urls))],
max_concurrent=10,
)

Efficient HTTP and Redis connection management:
from reactor_core.config.unified_config import (
HTTPConnectionPool,
RedisConnectionPool,
ConnectionPoolConfig,
)
# HTTP connection pool
pool = await HTTPConnectionPool.get_instance("api_client")
async with pool.request("GET", "https://api.example.com/data") as response:
data = await response.json()
# Redis connection pool
redis_pool = await RedisConnectionPool.get_instance()
client = await redis_pool.get_client(host="localhost", port=6379)
await client.set("key", "value")

Features:
- Singleton pattern per configuration
- Automatic session lifecycle management
- Connection reuse with keepalive
- Configurable pool sizes via environment variables
Automatic failure detection and recovery:
from reactor_core.utils.async_helpers import CircuitBreaker
breaker = CircuitBreaker(
failure_threshold=5,
recovery_timeout=60.0,
half_open_max_calls=3,
)
@breaker.protect
async def risky_operation():
# This will be protected by circuit breaker
return await external_api_call()
# Circuit states: CLOSED -> OPEN -> HALF_OPEN -> CLOSED

Prevents memory exhaustion under high load:
from reactor_core.utils.async_helpers import BackpressureController
controller = BackpressureController(
max_queue_size=1000,
queue_full_strategy="reject", # or "block", "drop_oldest"
)
async def process_item(item):
await controller.acquire()
try:
await process(item)
finally:
controller.release()

Failed operation tracking and automatic retry:
from reactor_core.utils.async_helpers import DeadLetterQueue
dlq = DeadLetterQueue(
name="training_operations",
persist_path=Path("/tmp/dlq"),
auto_retry_interval=300.0, # Retry every 5 minutes
)
# Register operation for retry
dlq.register_operation("publish_model_ready", publish_model_ready_func)
# Add failed operation
await dlq.add(
operation="publish_model_ready",
args=(model_name, model_path),
kwargs={},
exception=exception,
)
# Automatic retry with exponential backoff

Learn continuously from JARVIS interactions:
from reactor_core.training.online_learning import OnlineLearningEngine
engine = OnlineLearningEngine(
buffer_size=100000,
importance_sampling=True,
ewc_lambda=100.0, # Elastic Weight Consolidation
)
# Add experiences from JARVIS
await engine.add_experience({
"user_input": "Hello",
"assistant_output": "Hi there!",
"feedback": "positive",
})
# Trigger incremental update
await engine.incremental_update(
model=model,
batch_size=32,
num_steps=100,
)

Automatic model adaptation when data distribution changes:
from reactor_core.training.online_learning import DriftDetector
detector = DriftDetector(
threshold=0.1,
window_size=1000,
test_type="page_hinkley", # or "adwin", "kswin"
)
# Monitor for drift
drift_detected = await detector.check_drift(
current_batch=recent_data,
reference_batch=historical_data,
)
if drift_detected:
# Trigger model retraining
await retrain_model()

Content-addressed storage with lineage tracking:
from reactor_core.data.versioning import DataVersionController
controller = DataVersionController(
version_store_path=Path("/data/versions"),
)
# Version a dataset
version = await controller.create_version(
dataset_path=Path("/data/train.jsonl"),
metadata={"source": "jarvis_interactions", "date": "2025-01-15"},
)
# Get version lineage
lineage = await controller.get_lineage(version.id)
print(f"Version {version.id} derived from {lineage.parent_id}")
# Reproduce exact dataset
dataset = await controller.load_version(version.id)

Train across multiple GCP Spot VMs with gradient compression:
from reactor_core.training.distributed_coordinator import DistributedCoordinator
coordinator = DistributedCoordinator(
num_workers=8,
gradient_compression="fp16", # or "int8", "sparse"
checkpoint_interval=300, # seconds
)
# Start distributed training
await coordinator.start_training(
model=model,
dataset=dataset,
num_epochs=10,
)
# Automatic checkpoint/resume on VM preemption

Predictive preemption detection and automatic resume:
from reactor_core.gcp.checkpointer import SpotVMCheckpointer
checkpointer = SpotVMCheckpointer(
gcs_bucket="my-checkpoints",
checkpoint_interval=300,
enable_preemption_prediction=True,
)
# Automatic checkpointing during training
async with checkpointer.protect_training():
await train_model()
# Resume from latest checkpoint
await checkpointer.resume_training()

Preemption Signals Monitored:
- GCP metadata API warnings
- System load spikes
- Network latency increases
- Memory pressure indicators
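The first of these signals, the GCE metadata API, can be polled directly. The endpoint and Metadata-Flavor header below are GCE's documented preemption interface; the function wrapper itself is an illustrative sketch, not the checkpointer's code:

```python
import urllib.request

# Documented GCE metadata flag; returns "TRUE" once the VM is preempted.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
)

def is_preempted(timeout: float = 1.0) -> bool:
    """Poll the GCE metadata server's 'preempted' flag; off-GCP the
    request fails and we conservatively report False."""
    req = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode().strip() == "TRUE"
    except OSError:
        return False  # not on GCP, or metadata server unreachable
```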
Reactor-Core is the Nerves in the three-repo Trinity architecture. It is started and monitored by the JARVIS unified supervisor and coordinates with JARVIS-Prime for inference and model deployment.
How JARVIS (Body) uses Reactor-Core:
- Discovery: Supervisor resolves REACTOR_CORE_REPO_PATH (or the default ~/Documents/repos/Reactor-Core).
- Startup: Supervisor runs run_reactor.py (or the configured script) with port 8090; Reactor starts its HTTP server and health endpoint.
- Health: Supervisor polls GET /health on port 8090; Reactor reports training readiness and Trinity connection state.
- State: Reactor reads/writes shared state under ~/.jarvis/ (e.g. Trinity state, experience queue) for coordination.
How Reactor-Core uses JARVIS-Prime:
- Inference: Reactor can call Prime's OpenAI-compatible API for generation during training, evaluation, or distillation.
- Model deployment: Trained/updated models can be deployed to Prime (e.g. hot swap, model registry).
- Trinity Protocol: Events and heartbeats flow via file IPC and/or WebSocket; Reactor participates in Trinity state sync and experience collection from JARVIS Body.
- DPO Training from Telemetry (v238.0+): JARVIS Body's `TelemetryEmitter` captures every interaction: query, complexity classification, response, latency, and source. Reactor-Core uses this telemetry to build DPO preference pairs (e.g., chosen: `"10"`, rejected: `"Of course, the sum of five and five is ten..."`) for fine-tuning Mistral-7B. This training loop makes the v236.0/v238.0 adaptive prompt system's conciseness enforcement permanent by encoding terse-vs-detailed behavior in the model's weights instead of relying on prompt instructions. See the JARVIS-Prime README for the full training loop architecture.
- Voice Conversation Training Data (v238.0+): JARVIS Body's real-time voice conversation pipeline generates a new class of training data: multi-turn conversation traces (20-turn sliding window sessions), barge-in events (a proxy for response quality), turn detection accuracy logs, and conversation-mode-specific telemetry (`session_type: "conversation"`, `time_to_first_audio_ms`, `barge_in_count`). This data enables conversational DPO pairs (sustained-engagement vs. conversation-ending responses), conciseness training (shorter responses preferred in voice mode), and turn-detection classifier training (replacing the heuristic V1 with an ML-based V2). See the v248.0 roadmap below.
- Autonomy Event Ingestion (Phase 2): JARVIS Body emits 7 canonical autonomy lifecycle events (`intent_written`, `committed`, `failed`, `policy_denied`, `deduplicated`, `superseded`, `no_journal_lease`) through the existing experience forwarder. Reactor ingests these via `AutonomyEventIngestor` with strict validation (7 required metadata keys), composite-key deduplication (50K window), and disk-based quarantine for malformed events. The centralized `AutonomyEventClassifier` maps each event type to a training label: only `committed` and `failed` feed the training pipeline; infrastructure events (`policy_denied`, `no_journal_lease`) are excluded via `InteractionOutcome.INFRASTRUCTURE`.
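The label mapping described above can be sketched as a table-driven classifier (a minimal sketch: the real `AutonomyEventClassifier` lives in `reactor_core/ingestion/autonomy_classifier.py`, and this reconstruction of its mapping is an assumption based solely on the event list above):

```python
from enum import Enum

class InteractionOutcome(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    INFRASTRUCTURE = "infrastructure"

# Assumed mapping from autonomy event type to (outcome, trainable),
# following the classification table in this README.
_LABELS = {
    "committed":        (InteractionOutcome.POSITIVE, True),
    "failed":           (InteractionOutcome.NEGATIVE, True),
    "policy_denied":    (InteractionOutcome.INFRASTRUCTURE, False),
    "no_journal_lease": (InteractionOutcome.INFRASTRUCTURE, False),
    "deduplicated":     (InteractionOutcome.NEUTRAL, False),
    "intent_written":   (InteractionOutcome.NEUTRAL, False),
    "superseded":       (InteractionOutcome.NEUTRAL, False),
}

def classify(event_type: str) -> tuple:
    """Return (outcome, trainable); unknown event types are never trainable."""
    return _LABELS.get(event_type, (InteractionOutcome.NEUTRAL, False))
```

Keeping the mapping in one dict is what makes the classifier a "single source of truth": ingestion and dataset building consult the same table.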
Reactor-Core serves as the ingestion and classification layer for autonomy events. It ensures only well-formed, non-duplicate, trainable events reach the training pipeline.
```
REACTOR AUTONOMY ROLE

Inbound (from Body via ExperienceForwarder):

  ExperienceEvent (type=METRIC)
    .metadata = {
      autonomy_event_type: "committed",
      autonomy_schema_version: "1.0",
      idempotency_key: "...",
      trace_id: "...",
      correlation_id: "...",
      action: "workspace:send_email",
      request_kind: "autonomous"
    }
        |
        v
  AutonomyEventIngestor
    Step 1: VALIDATE
      - Check 7 required keys present
      - Verify event_type in the known set
      - Verify schema_version supported
      - Reject -> quarantine to disk
    Step 2: DEDUPLICATE
      - Composite key: (idempotency_key, autonomy_event_type, trace_id)
      - 50K sliding window
      - Duplicate -> skip silently
    Step 3: CLASSIFY (AutonomyEventClassifier.classify())
      - committed        -> POSITIVE        (trainable=true)
      - failed           -> NEGATIVE        (trainable=true)
      - policy_denied    -> INFRASTRUCTURE  (trainable=false)
      - no_journal_lease -> INFRASTRUCTURE  (trainable=false)
      - deduplicated     -> NEUTRAL         (trainable=false)
      - intent_written   -> NEUTRAL         (trainable=false)
      - superseded       -> NEUTRAL         (trainable=false)
    Step 4: BUILD RawInteraction
      - Passes to UnifiedPipeline
        |
        v
  UnifiedPipeline._build_dataset()
    Training Exclusion Filter:
      if outcome in {INFRASTRUCTURE, DEFERRED}:
          skip (not trainable)
      else:
          include in DPO/LoRA dataset

Quarantine: ~/.jarvis/quarantine/autonomy_events/
  - Retention: 7 days
  - Max size: 100 MB
  - Alert threshold: 10 malformed events
```
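Step 2's composite-key deduplication can be sketched with an insertion-ordered dict capped at the 50K window (a minimal sketch; the key fields come from the diagram above, while the class name `DedupWindow` is hypothetical):

```python
from collections import OrderedDict

class DedupWindow:
    """Sliding-window deduplication over composite autonomy-event keys."""

    def __init__(self, max_size: int = 50_000):
        self.max_size = max_size
        self._seen = OrderedDict()  # insertion-ordered; oldest evicted first

    def is_duplicate(self, metadata: dict) -> bool:
        # Composite key per the diagram: (idempotency_key, event_type, trace_id).
        key = (
            metadata["idempotency_key"],
            metadata["autonomy_event_type"],
            metadata["trace_id"],
        )
        if key in self._seen:
            return True  # duplicate -> caller skips silently
        self._seen[key] = None
        if len(self._seen) > self.max_size:
            self._seen.popitem(last=False)  # evict the oldest key
        return False
```

The composite key matters: the same `idempotency_key` can legitimately appear under two event types (e.g. `intent_written` then `committed`), so deduplicating on the key alone would drop valid lifecycle transitions.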
New files:
- `reactor_core/ingestion/autonomy_classifier.py`: centralized `AutonomyEventClassifier` (single source of truth for training eligibility)
- `reactor_core/ingestion/autonomy_event_ingestor.py`: full `AbstractIngestor` with validation, dedup, and quarantine

Modified files:
- `reactor_core/ingestion/base_ingestor.py`: added `INFRASTRUCTURE` to the `InteractionOutcome` enum
- `reactor_core/training/unified_pipeline.py`: training exclusion filter for non-trainable outcomes
- `reactor_core/ingestion/__init__.py`: exports for the new modules
run_reactor.py:
- Trinity-integrated entry point for Reactor-Core. Designed to be started by the unified supervisor (`python3 run_reactor.py --port 8090`).
- Exposes health (`/health`) for supervisor monitoring and training/API endpoints for the ecosystem.
- Environment: `REACTOR_PORT` (default 8090), `JARVIS_PRIME_URL`, `TRINITY_ENABLED`, `MODEL_OUTPUT_DIR`, `LOG_LEVEL`.
```
JARVIS AGI ECOSYSTEM

  JARVIS-AI-Agent (Claude Body)   <-- Events -->   JARVIS Prime (LLM Mind)
    - Computer Use                                   - Local LLM
    - macOS Control                                  - Reasoning
    - Voice Auth                                     - Context
            \                                       /
             Event Bridge (WebSocket/Redis)
                          |
                          v
  Reactor Core (Nervous System)
    Trinity Orchestrator
      - Heartbeat monitoring
      - Command routing
      - State reconciliation
    Training & Serving
      - DPO, RLHF, Constitutional AI
      - Hot-reload model server
      - Night Shift scheduler
    Event Streaming
      - Safety audit trail
      - Cost tracking
      - Telemetry collection
            |                  |
            v                  v
  Cloud SQL (Events DB)   GCP Storage (Checkpoints)
```
JARVIS Reactor uses environment variables for all configuration (zero hardcoding):
```bash
# Path Configuration (XDG-compliant defaults)
export JARVIS_EVENTS_DIR="/custom/path/events"
export TRINITY_EVENTS_DIR="/custom/path/trinity/events"
export EXPERIENCE_QUEUE_DIR="/custom/path/experience_queue"
export MODEL_REGISTRY_PATH="/custom/path/models"
export DATA_VERSION_PATH="/custom/path/data_versions"

# API Configuration
export AGI_API_PORT=8003
export AGI_SERVING_PORT=8001
export AGI_JPRIME_PORT=8000

# Connection Pooling
export HTTP_POOL_SIZE=100
export HTTP_POOL_PER_HOST=10
export HTTP_KEEPALIVE_TIMEOUT=30.0
export REDIS_POOL_SIZE=10

# Training Configuration
export REACTOR_EXPERIENCE_BUFFER_THRESHOLD=100
export REACTOR_AUTO_TRAINING_THRESHOLD=1000
export REACTOR_CHECKPOINT_INTERVAL=300

# GCP Configuration
export GCP_PROJECT_ID="my-project"
export GCP_CHECKPOINT_BUCKET="my-checkpoints"
export GCP_SPOT_VM_ENABLED=true

# Feature Flags
export REACTOR_ENABLE_ONLINE_LEARNING=true
export REACTOR_ENABLE_DISTRIBUTED_TRAINING=true
export REACTOR_ENABLE_DATA_VERSIONING=true
```

Configuration is loaded in this priority order:

1. Environment variables (highest priority)
2. `~/.jarvis/reactor/config.json` (user config)
3. `reactor_core/config/default_config.json` (defaults)
Example config file:
```json
{
  "api": {
    "port": 8003,
    "host": "0.0.0.0"
  },
  "training": {
    "default_model": "llama-2-7b",
    "use_lora": true,
    "lora_rank": 16
  },
  "serving": {
    "max_cached_models": 5,
    "enable_hot_reload": true,
    "default_backend": "auto"
  },
  "trinity": {
    "heartbeat_interval": 5.0,
    "health_check_timeout": 10.0
  }
}
```

All paths are resolved dynamically with XDG compliance:

1. Environment variable (if set)
2. `base_config.resolve_path()` (if available)
3. `$XDG_DATA_HOME/jarvis/` (fallback)

No hardcoded `Path.home()` calls; fully portable across systems.
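The resolution order can be sketched as follows (a minimal sketch; the function name `resolve_data_dir` is hypothetical, and step 2, the `base_config.resolve_path()` hook, is elided since it only applies when that helper is available):

```python
import os
from pathlib import Path

def resolve_data_dir(env_var: str, subdir: str = "jarvis") -> Path:
    """Resolve a data directory: explicit env var first, XDG fallback last."""
    explicit = os.environ.get(env_var)
    if explicit:
        return Path(explicit)              # 1. environment variable wins
    # (2. base_config.resolve_path() would be consulted here if available.)
    xdg = os.environ.get("XDG_DATA_HOME")  # 3. XDG fallback
    base = Path(xdg) if xdg else Path.home() / ".local" / "share"
    return base / subdir
```

Note the fallback uses `~/.local/share` only when `XDG_DATA_HOME` is unset, which is the default the XDG Base Directory spec prescribes.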
Symptoms: `run_supervisor.py` shows component failures

Solutions:

```bash
# Check component paths
python3 run_supervisor.py --dev --log-level DEBUG

# Verify component health
curl http://localhost:8003/health

# Check logs
tail -f ~/.jarvis/reactor/logs/supervisor.log
```

Symptoms: Out of memory errors during training
Solutions:
```python
# Enable gradient checkpointing
config = TrainingConfig(
    gradient_checkpointing=True,
    use_qlora=True,    # 4-bit quantization
    cpu_offload=True,  # offload to CPU
)

# Use a smaller batch size
config.batch_size = 1
config.gradient_accumulation_steps = 8
```

Symptoms: Model updates don't appear in the server
Solutions:
```python
# Verify the file watcher is enabled
config = ModelServerConfig(
    enable_hot_reload=True,
    watch_directories=["/path/to/models"],
)
```

```bash
# Check file permissions
ls -la /path/to/models

# Verify model format
# Server supports: .gguf, .safetensors, .bin
```

Symptoms: Events not flowing between JARVIS, Prime, and Reactor
Solutions:
```bash
# Check event bridge status
curl http://localhost:8003/api/v1/events/status

# Verify event directories exist
ls -la ~/.jarvis/events/
ls -la ~/.jarvis/trinity/events/

# Check the WebSocket connection
# Open browser console: ws://localhost:8003/ws
```

Symptoms: Training stuck at barrier synchronization
Solutions:
```python
# Check network connectivity
await coordinator.check_connectivity()

# Verify all workers are healthy
health = await coordinator.get_worker_health()

# Enable gradient checksum validation
coordinator.enable_gradient_verification = True
```

Enable comprehensive debugging:
```bash
# Set debug environment
export REACTOR_DEBUG=true
export REACTOR_LOG_LEVEL=DEBUG

# Run with debug flags
python3 run_supervisor.py --dev --log-level DEBUG

# Check debug logs
tail -f ~/.jarvis/reactor/logs/debug.log
```

```bash
# Clone the repository
git clone --recursive https://github.com/drussell23/JARVIS-Reactor.git
cd JARVIS-Reactor

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest tests/

# Run linting
black reactor_core/
ruff check reactor_core/
```

```
reactor_core/
├── training/        # Training methods and pipelines
├── serving/         # Model serving infrastructure
├── api/             # REST API endpoints
├── orchestration/   # Trinity coordination
├── integration/     # Cross-repo integration
├── utils/           # Utilities (async_helpers, etc.)
├── config/          # Configuration management
├── data/            # Data processing and versioning
├── eval/            # Model evaluation
└── gcp/             # GCP-specific features
```
1. Create a feature branch:

   ```bash
   git checkout -b feature/my-feature
   ```

2. Follow code style:
   - Use `black` for formatting
   - Follow type hints (use `mypy`)
   - Add docstrings (Google style)

3. Write tests:

   ```python
   # tests/test_my_feature.py
   import pytest

   from reactor_core.my_module import MyFeature

   @pytest.mark.asyncio
   async def test_my_feature():
       feature = MyFeature()
       result = await feature.do_something()
       assert result is not None
   ```

4. Update documentation:
   - Add to README.md
   - Update the API docs
   - Add examples

5. Submit a PR:
   - Ensure all tests pass
   - Update the version in `__init__.py`
   - Add to CHANGELOG.md
```bash
# Run all tests
pytest

# Run a specific test file
pytest tests/test_training.py

# Run with coverage
pytest --cov=reactor_core --cov-report=html

# Run integration tests
pytest tests/integration/ -v
```

```bash
# Format code
black reactor_core/

# Lint code
ruff check reactor_core/

# Type checking
mypy reactor_core/

# Security scanning
bandit -r reactor_core/
```

JARVIS Body v238.0 introduced a complete real-time voice conversation pipeline: continuous, bidirectional, streaming voice dialogue that transforms JARVIS from a command-response system into a conversational AI companion. This has significant implications for Reactor Core's training pipeline.
Architecture (7 layers in JARVIS Body):
- Layer -1: FullDuplexDevice (synchronized mic + speaker on the same sample clock) + PlaybackRingBuffer
- Layer 0: AudioBus singleton (all audio I/O flows through it) + AEC via speexdsp
- Layer 1: Streaming TTS (Piper, local neural TTS, ~50ms time-to-first-audio)
- Layer 2: Streaming STT (faster-whisper, partial + final transcripts)
- Layer 3: Turn Detector (adaptive silence thresholds: 300/600/900ms)
- Layer 4: Barge-In Controller (user interrupts JARVIS mid-sentence via the AEC-cleaned signal)
- Layer 5: Conversation Pipeline (full orchestrator with a 20-turn sliding context)
- Layer 6: Mode Dispatcher (COMMAND / CONVERSATION / BIOMETRIC routing)
- Bootstrap: Two-phase factory (AudioBus before the narrator, pipeline after Intelligence)
Impact on Reactor Core training data:
| Data Type | Source | Training Application |
|---|---|---|
| Multi-turn conversation traces | ConversationSession (20-turn sliding window) | Fine-tune on conversational coherence: multi-turn dialogue pairs instead of single-turn Q&A |
| Barge-in events | BargeInController logs | Train models to produce "checkpoint sentences": natural pause points where interruption feels organic |
| Turn detection accuracy | TurnDetector heuristic vs. actual user behavior | Training signal for an ML-based turn detection classifier (v248.0 in JARVIS Body) |
| Conversation mode LLM telemetry | J-Prime `/v1/chat/completions` with conversation context | DPO pairs from multi-turn context: "which response kept the conversation flowing vs. which caused a restart?" |
| Self-voice echo transcripts (dropped) | STT hallucination guard `conversation_mode` filter | Negative examples for STT fine-tuning: these are NOT user speech and should be classified as noise |
New telemetry fields flowing to Reactor Core:
- `session_type`: `"conversation"` vs. `"command"`; enables per-mode DPO pair generation
- `conversation_turn_count`: tracks where in a multi-turn session the response occurred
- `barge_in_count`: how many times the user interrupted (a proxy for response quality)
- `time_to_first_audio_ms`: end-to-end latency metric for the streaming pipeline
What Reactor Core should prepare for (v248.0+):
- Conversational DPO pairs: responses that sustained multi-turn engagement ("chosen") vs. responses that ended the conversation ("rejected")
- Latency-aware training: shorter, more concise responses score higher in conversation mode (unlike detailed analysis mode)
- Turn-detection classifier training data: when the heuristic V1 makes mistakes (too-early or too-late turn detection), log the error as training data for an ML-based V2
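The turn-detection error logging described above can be sketched as a small record builder (a minimal sketch; the field names `silence_threshold_ms`, `detected_end_of_turn`, and `user_kept_speaking` are assumptions chosen to match the adaptive thresholds mentioned earlier):

```python
from dataclasses import dataclass, asdict

@dataclass
class TurnDetectionError:
    """One V1-heuristic decision, logged as training data for the ML V2."""
    session_id: str
    silence_threshold_ms: int   # 300, 600, or 900 per the adaptive thresholds
    detected_end_of_turn: bool  # what the heuristic decided
    user_kept_speaking: bool    # ground truth observed shortly afterwards

    @property
    def error_type(self) -> str:
        if self.detected_end_of_turn and self.user_kept_speaking:
            return "too_early"  # heuristic cut the user off
        if not self.detected_end_of_turn and not self.user_kept_speaking:
            return "too_late"   # heuristic waited through a real end of turn
        return "correct"

    def to_record(self) -> dict:
        """Flatten to a JSONL-ready dict for the training pipeline."""
        return {**asdict(self), "error_type": self.error_type}
```

Labeling both error directions matters: a classifier trained only on "too early" examples would learn to wait longer everywhere, inflating response latency.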
- JARVIS Body v238.0 fixes degenerate LLM responses ("...") via a 3-layer defense-in-depth
- SIMPLE classification narrowed: "what is X?" queries promoted to MODERATE (512 tokens)
- Backend degenerate response detection with safe retry using MODERATE parameters
- Client-side degenerate suppression with zombie timeout re-arming
- `requestId` echo in WebSocket responses enables frontend deduplication
- Reactor-Core's DPO training pipeline receives improved telemetry (complexity + source fields) for preference pair generation
- Note: v238.0 changes are in JARVIS (Body) and documented here for ecosystem coherence
- Structured Concurrency: Python 3.11+ TaskGroup patterns for robust async operations
- Connection Pooling: Efficient HTTP/Redis connection management with automatic lifecycle
- Dynamic Path Resolution: Zero hardcoding, XDG-compliant paths, environment-driven config
- Atomic File Writes: Prevents checkpoint corruption from partial writes
- Circuit Breaker Pattern: Protects external service calls with auto-recovery
- Backpressure Control: Prevents memory exhaustion under high load
- Proper Async Patterns: Deadlock-free async/await with timeouts
- Gradient Verification: Checksum validation for distributed training
- Memory Pressure Awareness: Adaptive behavior under resource constraints
- Unified Error Handling: Centralized error classification and routing
- Online/Incremental Learning: Prioritized experience replay with importance sampling
- Elastic Weight Consolidation (EWC): Prevents catastrophic forgetting during updates
- Concept Drift Detection: Page-Hinkley test for automatic model adaptation
- Data Versioning: Content-addressed storage with lineage tracking (DVC compatible)
- GCP Spot VM Checkpointing: Predictive preemption with multi-signal detection
- Distributed Training: Multi-VM coordination with gradient compression
- Dynamic Resource Allocation: Auto-scaling with cost-aware decisions
- MLForge C++ Bindings: High-performance matrix/neural ops with pybind11
- Hot-reload model server with zero-downtime updates (1,545 lines)
- Multi-backend inference engine: vLLM, llama.cpp, MLX, Transformers (1,891 lines)
- Unified supervisor for one-command AGI OS startup (1,635 lines)
- LRU model cache with memory-aware eviction
- Priority request queue for SLA compliance
- Semantic response caching with hash-based deduplication
- Telemetry collection system with WebSocket streaming (1,128 lines)
- Night Shift scheduler for automated training (1,030 lines)
- Model registry with versioning and A/B testing (1,301 lines)
- Health aggregator with multi-service dashboard (999 lines)
- Enhanced FastAPI server (2,252 lines)
- Advanced async patterns library (1,746 lines)
- Circuit breaker, backpressure, bulkhead patterns
- Dead letter queue, health monitor, adaptive rate limiter
- Dependency injection system (913 lines)
- DPO, RLHF, Constitutional AI, Curriculum Learning (2,899 lines)
- Memory manager with dynamic batch sizing
- Advanced evaluation suite (1,536 lines)
- World model training with latent dynamics and planning
- Causal reasoning with SCMs and do-calculus
- Advanced data preprocessing with quality gates
- Synthetic data generation (3-10x augmentation)
- Active learning for efficient labeling
- Curriculum learning with progressive difficulty
- Meta-learning (MAML, Reptile, Meta-SGD)
- Dependency injection framework
- DLQ for failed/expired commands
- Automatic retry with exponential backoff
- Zero-corruption file operations via atomic renames
- Safety audit trail and kill switch mechanism
- Real-time event streaming across JARVIS ecosystem
- PyTorch-first ML training framework
- LoRA/QLoRA, DPO, FSDP support
- GCP Spot VM resilience
Status: Infrastructure ~95% built. Schemas verified identical across repos. All code exists. Zero training jobs have ever run. This version activates the pipeline with ~200-400 lines of wiring changes, zero new Python files.
Approach: Supervisor-Driven Activation. The unified supervisor already has `ReactorCoreClient` with `trigger_training()`, `stream_experience()`, `get_experience_count()`, and health monitoring. v239.0 wires these existing methods into the startup and runtime loops.
What v239.0 changes:
1. Wire supervisor startup (~40 lines in `unified_supervisor.py`)
   - Call `initialize_reactor_core()` during Phase 5 (Trinity)
   - Start `ReactorCoreWatcher` as a background task
   - Check for in-progress training jobs on startup (prevents duplicates after restart)
   - Wire shutdown calls for clean cleanup

2. Verify HTTP connectivity (debug, no code change)
   - Confirm the Reactor Core API server binds to port 8090 and accepts POSTs
   - Verify the `/api/v1/experiences/stream` endpoint is registered and reachable
   - Manual `curl` test before any code changes

3. First training job (manual trigger)
   - Trigger via `ReactorCoreClient.trigger_training()` using accumulated telemetry
   - Validate that the `DatasetBuilder` → `LoRATrainer` → `GGUFExporter` chain works
   - Goal: `jobs.json` goes from `{}` to having one completed job

4. Deployment smoke test gate (~60 lines in `reactor_core_watcher.py`)
   - Before deploying a GGUF to Prime, load the model in a subprocess (avoids OOM on a 16GB Mac)
   - Run 5 test prompts, verify non-garbage output
   - Block deployment if the smoke test fails
   - Extensible `DeploymentGate` interface for future JARVIS-Bench integration

5. Deployment feedback loop (~30 lines in `reactor_core_watcher.py`)
   - After the hot-swap notification to Prime, wait 10s, then check Prime health
   - Write `deployment_status.json` to `~/.jarvis/reactor/feedback/` with this schema:

     ```json
     {
       "schema_version": "1.0",
       "model_path": "...",
       "deployed_at": "2026-02-15T10:30:00Z",
       "smoke_test_passed": true,
       "hot_swap_notified": true,
       "health_verified": true,
       "previous_model": "qwen2.5-coder-7b-v1",
       "deployment_latency_ms": 12400
     }
     ```

   - Reactor Core consumes these to track deployment success rate
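Writing the feedback file benefits from the atomic-rename pattern this README lists under production hardening (a minimal sketch; `write_deployment_status` is a hypothetical helper, and the schema fields come from the feedback-loop description above):

```python
import json
import os
import tempfile
from pathlib import Path

FEEDBACK_DIR = Path.home() / ".jarvis" / "reactor" / "feedback"

def write_deployment_status(status: dict, feedback_dir: Path = FEEDBACK_DIR) -> Path:
    """Atomically write deployment_status.json (no partial files on crash)."""
    feedback_dir.mkdir(parents=True, exist_ok=True)
    final = feedback_dir / "deployment_status.json"
    # Write to a temp file in the same directory, then rename atomically;
    # os.replace is atomic on POSIX when source and target share a filesystem.
    fd, tmp = tempfile.mkstemp(dir=feedback_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(status, f, indent=2)
    os.replace(tmp, final)
    return final
```

Because readers only ever see the old file or the complete new one, Reactor Core can poll the feedback directory without guarding against half-written JSON.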
When this works, the full loop is:

```
User → JARVIS Body → J-Prime (inference + telemetry capture)
  → ~/.jarvis/telemetry/ (JSONL logs with model_id, task_type)
  → Reactor Core TelemetryIngestor (via file watch or HTTP POST)
  → Experience accumulation (threshold: 100 weighted experiences)
  → Supervisor auto-triggers training via ReactorCoreClient
  → LoRA fine-tuning (SFT first, DPO in v242.0)
  → GGUF export → Smoke test gate (subprocess)
  → Deploy to J-Prime → Feedback file
  → Models improve, automatically
```
Status: Depends on v239.0 pipeline activation. DPO pair generation code exists in `dpo_pair_generator.py` but has never run on real data.

What v242.0 adds on top of v239.0:

1. Automatic DPO preference pairs from multi-model routing
   - v241.1's task-type routing creates implicit quality comparisons:

     ```
     Query: "solve 5x+3=18" routed to Mistral-7B  → "x=11" (wrong)
     Same query type routed to Qwen-Math-7B       → "x=3"  (correct)
     → Automatic DPO pair: {prompt, chosen: "x=3", rejected: "x=11"}
     ```

   - Multi-model routing IS the labeling mechanism. No human annotation needed.
   - `model_id` in telemetry (via the `X-Model-Id` response header) enables per-model performance tracking

2. Ground truth sources for DPO pairs (not just self-assessment)
   - User corrections: when a user re-asks or explicitly corrects, the correction is "chosen" and the original is "rejected"
   - Claude-as-judge: use the Claude API to evaluate which of two outputs is better (a stronger model judging weaker ones)
   - Objective metrics: for code tasks, does the code compile/run? For math, is the answer correct?
   - Avoids circular reasoning (the system training on its own quality judgments)

3. Fine-tune and export
   - `UnifiedTrainingPipeline` supports DPO training with LoRA/QLoRA
   - Training requires full-precision FP16 base models (~14 GB for 7B), not the GGUFs
   - Elastic Weight Consolidation (EWC) prevents catastrophic forgetting
   - Per-task-type regression tests run after every training run (all task types, not just the one trained on)
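Pair construction from routed outputs can be sketched as follows (a minimal sketch; the `build_dpo_pair` helper and its `correct_model` argument are hypothetical, standing in for whatever verdict an objective check or judge model produces):

```python
def build_dpo_pair(prompt: str, outputs: dict, correct_model: str) -> dict:
    """Build one DPO preference pair from two models' answers to a prompt.

    outputs maps model_id -> response text; correct_model names the model
    whose answer was verified correct (e.g. by an objective math check).
    """
    chosen = outputs[correct_model]
    rejected = next(v for k, v in outputs.items() if k != correct_model)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

# Example mirroring the routed-query comparison above.
pair = build_dpo_pair(
    "solve 5x+3=18",
    {"mistral-7b": "x=11", "qwen-math-7b": "x=3"},
    correct_model="qwen-math-7b",
)
```

The verdict step is the hard part in practice; the pair assembly itself, shown here, is mechanical once a winner is known.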
A comprehensive three-way architectural audit of the JARVIS ecosystem was conducted across JARVIS Body, JARVIS Prime, and Reactor Core. Three independent analyses were cross-verified against actual code, producing the corrected status below.
The training data pipeline from JARVIS Body → J-Prime → Reactor Core is ~95% built but never activated. All infrastructure exists, schemas are verified identical, and handoff code is implemented. The gap is operational: nobody has run the pipeline.
| Component | Location | Status | Verified State (Feb 2026) |
|---|---|---|---|
| `TelemetryEmitter` | JARVIS Body | Built and active | Writes JSONL to `~/.jarvis/telemetry/`. Telemetry files confirmed present (e.g., `interactions_20260210.jsonl`). |
| `TelemetryIngestor` | Reactor Core | Built | Reads from `~/.jarvis/telemetry/`. Schema verified byte-identical to TelemetryEmitter output (v1.0 canonical). Not actively polling; only runs when UnifiedTrainingPipeline is explicitly invoked. |
| `ReactorCoreBridge.upload_training_data()` | J-Prime | Fully implemented | 992 LOC, v242.0. Includes batch upload, file fallback, job tracking. |
| `ExperienceEvent` schema | All 3 repos | Unified | One canonical `ExperienceEvent` dataclass with 5 adapter functions for legacy formats. |
| `UnifiedTrainingPipeline` | Reactor Core | Built | DatasetBuilder → LoRATrainer → GGUFExporter chain exists. Zero training jobs have ever run (`jobs.json` is empty). |
| `HotSwapManager` | J-Prime | Built | Accepts GGUF files for zero-downtime swap. ReactorCoreWatcher in Prime detects new model files. |
| `ModelDeploymentManager` | Reactor Core | Built | GGUF export and deployment signaling exist. Untested end-to-end. |
| `initialize_reactor_core()` | JARVIS Body | Built but never called | Function exists in `backend/autonomy/reactor_core_integration.py` but the supervisor does not invoke it during startup. |
| `start_reactor_core_watcher()` | JARVIS Body | Built but never called | Function exists in `backend/autonomy/reactor_core_watcher.py` but the supervisor does not start it. |
Root Cause (Corrected): The pipeline is not "broken"; it was never turned on. The schemas match, the code is written, the APIs exist. The supervisor needs to call `initialize_reactor_core()` and `start_reactor_core_watcher()` during its startup sequence, and Reactor Core's API server needs to be verified as actively listening and accepting experience POSTs on port 8090. This is a wiring problem, not an architecture problem. Target: v239.0 Supervisor-Driven Pipeline Activation.
What v239.0 will wire:
- Supervisor calls `initialize_reactor_core()` during Phase 5 (Trinity)
- Supervisor starts `ReactorCoreWatcher` as a background task
- Verify the Reactor Core API accepts POSTs on port 8090
- First manual training job triggered via `ReactorCoreClient.trigger_training()`
- Deployment feedback file (`~/.jarvis/reactor/feedback/deployment_status.json`) closes the loop
- Smoke test gate validates the GGUF before deployment (runs in a subprocess to avoid OOM)
The v245.0 Google Workspace fixes in JARVIS Body have a direct impact on Reactor Core's future training data:
- Draft email body generation now works: previously, silent failures meant no email body generation telemetry was captured. Now every draft email request generates a real LLM inference call (with `X-Model-Id`), producing training-relevant interaction data.
- Agent singleton fix eliminates noise: the 49s recreation bug caused timeout errors that would have polluted training data with failed interactions. Clean request/response pairs are now the norm.
- Task-type metadata flows correctly: workspace commands now carry proper task-type metadata, enabling Reactor Core to generate per-model DPO pairs from workspace interactions.
All 9 LangGraph reasoning graphs in JARVIS Body are dead code because `langgraph` is not installed. This means:
- The reasoning engine uses a linear fallback (analysis → planning → validation → execution → reflection → learning) instead of conditional graph routing
- The `route_after_reflection()` loop-back (for iterative reasoning on low confidence) has never executed
- Training data from the reasoning engine reflects single-pass linear thinking, not the intended iterative, graph-based reasoning
- Impact on Reactor Core: when the training pipeline activates, the quality of reasoning traces available for fine-tuning will be lower than designed until LangGraph is installed (v246.0 in JARVIS Body)
The JARVIS Body Unified Agent Runtime will generate a new class of training data for Reactor Core:
- Multi-step goal traces: complete autonomous workflows (sense → think → act → verify → reflect) with sub-step decomposition, producing rich sequential decision-making data
- Cross-agent coordination traces: when the Runtime dispatches work to Neural Mesh agents, the coordination patterns become training data for improving multi-agent orchestration
- Failure recovery traces: when autonomous goals fail and the Runtime retries or replans, the recovery patterns become training data for improving resilience
- Human escalation signals: when the Runtime escalates to the user for approval, the decision boundary becomes a training signal for the safety classifier
New training data types Reactor Core should prepare for:
| Data Type | Source | Training Method |
|---|---|---|
| Goal decomposition traces | Agent Runtime THINK phase | Supervised fine-tuning on planning |
| Sub-step success/failure | Agent Runtime VERIFY phase | DPO pairs (successful vs. failed approaches) |
| Escalation decisions | Agent Runtime escalation protocol | Constitutional AI for safety boundaries |
| Multi-agent coordination | Neural Mesh dispatch logs | Curriculum learning on orchestration complexity |
✅ v243.0/v243.1: Command Lifecycle Events + Event Bus Lifecycle (COMPLETED, JARVIS Body-side)
v243.0/v243.1 shipped as Command Lifecycle Events and Event Infrastructure Lifecycle Management in the JARVIS Body repo. This directly impacts Reactor Core because command lifecycle events create a new source of training data.
What this means for Reactor Core:
Command lifecycle events (`command.received`, `command.classified`, `command.completed`, `command.failed`) now flow through TrinityEventBus. NeuralMesh's Knowledge Graph subscribes to these events, building semantic memory of command patterns. This creates richer training signals for the DPO pipeline:

```
BEFORE v243.0:
  User command → J-Prime inference → response
  Training data: (query, response) pairs only

AFTER v243.0:
  User command → command.received event
    → J-Prime inference → command.classified event
    → Execution → command.completed/failed event
  Training data: (query, response, intent, domain, execution_outcome, latency)

→ Reactor Core TelemetryIngestor can now consume:
  - Successful vs. failed executions as quality signals
  - Intent classification accuracy as routing feedback
  - Domain distribution for curriculum learning
  - Latency metrics for performance optimization
```
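Folding the lifecycle events back into per-command training records can be sketched by grouping on a correlation id (a minimal sketch; `correlation_id` appears in the autonomy metadata earlier in this README, and reusing it here to join command events, along with the per-event field names, is an assumption):

```python
from collections import defaultdict

def build_training_records(events: list) -> list:
    """Fold command.* lifecycle events into per-command training records.

    Each event is a dict with at least "correlation_id" and "type";
    classified events carry intent/domain, terminal events the outcome.
    """
    by_command = defaultdict(dict)
    for ev in events:
        rec = by_command[ev["correlation_id"]]
        if ev["type"] == "command.received":
            rec["query"] = ev.get("query")
        elif ev["type"] == "command.classified":
            rec["intent"] = ev.get("intent")
            rec["domain"] = ev.get("domain")
        elif ev["type"] in ("command.completed", "command.failed"):
            rec["outcome"] = ("success" if ev["type"] == "command.completed"
                              else "failure")
            rec["latency_ms"] = ev.get("latency_ms")
    # Only commands that reached a terminal event are usable for training.
    return [r for r in by_command.values() if "outcome" in r]
```

Dropping commands without a terminal event mirrors the exclusion-filter idea: an interaction that never resolved carries no quality signal.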
Impact on Reactor Core training pipeline:
- DPO pair quality improvement: command outcomes (success/failure) provide ground truth for preference pairs. A response that correctly classified `intent="action"` and executed successfully is a stronger "chosen" signal than one based solely on response text quality.
- Curriculum learning data: domain distribution from `command.classified` events enables a data-driven curriculum: train on high-frequency domains first (general, system), then expand to rare domains (smart_home, media).
- Drift detection signals: `command.failed` events with `intent` metadata enable per-domain quality monitoring. A spike in failures for `domain="workspace"` suggests the workspace model needs retraining.
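The per-domain monitoring could reuse the Page-Hinkley test this README already lists under concept drift detection (a minimal sketch of the standard test; the class here is a from-scratch illustration, not the repo's `PageHinkleyDriftDetector`, and the `delta`/`threshold` values are assumptions that would need tuning):

```python
class PageHinkley:
    """Page-Hinkley test for an upward drift in a failure-rate stream."""

    def __init__(self, delta: float = 0.005, threshold: float = 2.0):
        self.delta = delta          # tolerated deviation from the mean
        self.threshold = threshold  # alarm threshold (lambda)
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # cumulative deviation statistic
        self.cum_min = 0.0          # running minimum of the statistic

    def update(self, x: float) -> bool:
        """Feed one observation (1.0 = command.failed, 0.0 = success).

        Returns True when drift is detected, suggesting retraining.
        """
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return (self.cum - self.cum_min) > self.threshold
```

One detector instance per domain (keyed by the `domain` field of `command.failed` events) would turn a sustained failure spike into a retraining trigger while ignoring isolated failures.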
Event infrastructure lifecycle (v243.1):
- TrinityEventBus explicitly started in Phase 4 (before any subscriber)
- Health checks registered with HealthAggregator
- Graceful shutdown in correct dependency order
- Boot-order races eliminated (NeuralMesh no longer needs 10s retry)
Files modified (all in JARVIS Body repo):
unified_supervisor.pyβ Event state tracking, explicit startup, health checks, shutdownbackend/core/trinity_event_bus.pyβ Command lifecycle event typesbackend/api/unified_command_processor.pyβ Event emission at each command stagebackend/neural_mesh/neural_mesh_coordinator.pyβ Knowledge Graph subscription
✅ v244.0: Startup Warning Root Fix + Brain Vacuum Classification (COMPLETED, JARVIS Body-side)
v244.0 shipped in the JARVIS Body repo with three fix categories. The brain vacuum classification fix is most relevant to Reactor Core's training pipeline:
Brain Vacuum Classification Fix:
When J-Prime is unreachable, `_brain_vacuum_fallback()` in `jarvis_prime_client.py` now includes a classification prompt prefix. The fallback LLM (Claude/Gemini) outputs a `CLASSIFICATION: {"intent", "domain", "requires_action", "suggested_actions"}` line before its response. This means:
- Better training data during downtime: fallback responses now include proper intent/domain classification, not a hardcoded `intent="answer"`. Telemetry events from brain vacuum periods produce valid DPO pairs.
- Action commands execute: "Lock my screen" during J-Prime downtime returns `intent="action"` and actually executes, instead of becoming a text explanation.
Other v244.0 changes:
- 858 lines of dead code removed (orphaned tiered routing system)
- Cloud SQL proxy startup reduced from ~47s to ~3-5s (learning_database initializes faster)
Impact on Reactor Core: Faster Cloud SQL proxy startup means the learning_database (which stores voiceprints, command history, and training metadata) initializes sooner, reducing the window where telemetry events might be lost during boot.
Support the training side of JARVIS self-programming:
- Code quality evaluation – Evaluate generated code diffs for correctness, style, and security. Feed scores back as DPO signals.
- Self-programming telemetry – Capture Ouroboros cycles (architect plan → generated code → verifier review → human decision) as training data.
- Architect/Implementer specialization – Fine-tune DeepSeek-R1-14B on architectural reasoning traces and Qwen-Coder-14B on code generation from plans, using Ouroboros interaction data.
- Constitutional AI for code – Apply Constitutional AI training to code generation: "Is this code safe? Does it follow the existing patterns? Does it handle errors?"
- Night Shift automation – `NightShiftScheduler` already exists. Wire it to trigger DPO training runs during off-peak hours using accumulated telemetry.
- Concept drift detection – `PageHinkleyDriftDetector` already exists. Monitor model performance metrics and trigger retraining when quality degrades.
- A/B model testing – `ModelRegistry` supports versioned models and A/B testing. Deploy fine-tuned models alongside originals, compare performance, promote winners.
- Curriculum learning – Start fine-tuning on easy tasks (general chat), then progressively add harder tasks (math, code, reasoning) using the curriculum learning infrastructure already built in v79.0.
- Multi-VM gradient aggregation – v91.0 built distributed training with gradient compression. Activate it for 14B model fine-tuning, which exceeds single-VM memory.
- Spot VM resilience – Predictive preemption with checkpoint save is already built. Test with real training runs.
- Cost-aware scheduling – Train on spot VMs during cheap hours, pause during expensive hours. `DynamicResourceAllocator` has the framework.
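The drift-detection item rests on the Page-Hinkley test over a streaming quality metric. A self-contained sketch of the statistic (illustrative only; the repo's `PageHinkleyDriftDetector` may expose a different interface):

```python
class PageHinkley:
    """Minimal Page-Hinkley mean-shift detector over a streaming metric.
    Illustrative sketch, not the repo's PageHinkleyDriftDetector."""

    def __init__(self, delta: float = 0.005, threshold: float = 0.5):
        self.delta = delta          # tolerated drift magnitude per step
        self.threshold = threshold  # alarm level for the PH statistic
        self.mean = 0.0             # running mean of the metric
        self.cum = 0.0              # cumulative deviation m_t
        self.min_cum = 0.0          # historical minimum of m_t
        self.n = 0

    def update(self, x: float) -> bool:
        """Feed one observation (e.g. eval loss); True means drift alarm."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


detector = PageHinkley()
stream = [0.30] * 50 + [0.55] * 10   # steady eval loss, then a regression
alarms = [detector.update(v) for v in stream]
# no alarms while the metric is stable; alarms appear shortly after the shift
```

When `update()` returns True, the scheduler would enqueue a retraining job instead of waiting for the next fixed-interval run.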
Ingest and process training data from JARVIS Body's real-time voice conversation infrastructure (v238.0):
- Conversation trace schema – Define a JSONL schema for multi-turn conversation sessions: session_id, turns (role + text + audio_duration_ms), barge_in_events, turn_detection_errors, latency metrics. Compatible with the existing `TelemetryIngestor`.
- Conversational DPO pairs – Responses that sustained multi-turn engagement (user continued the conversation) are "chosen"; responses that ended the conversation (user said "goodbye" or restarted with a new topic) are "rejected". Barge-in count per response is a quality proxy (more interruptions → worse response).
- Conciseness training for conversation mode – In conversation mode, shorter responses feel more natural. Generate DPO pairs where concise, direct responses are "chosen" over verbose, over-explained responses. This is the inverse of detailed-analysis mode training.
- Turn detection classifier training – Log `TurnDetector` heuristic decisions (silence_duration_ms, threshold_used, was_correct) as training data. When the heuristic triggers too early (user wasn't done) or too late (awkward pause), these errors become labeled examples for a small ML classifier.
- Self-voice echo negative examples – Transcripts dropped by the `stt_hallucination_guard` in conversation mode are negative examples for STT fine-tuning. These represent JARVIS's own speech (imperfect AEC residual) and should be classified as noise.
- Session-level metrics for curriculum learning – Use conversation session duration, turn count, and user engagement as difficulty metrics for curriculum learning: short exchanges first, then longer multi-topic conversations.
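The conversational labeling rule above can be sketched as a pure function over one session trace. Field names (`turns`, `barge_ins`) mirror the schema bullet and are assumptions, not a fixed contract:

```python
GOODBYE_CUES = {"goodbye", "bye", "that's all"}  # illustrative end-of-session cues


def label_assistant_turns(session: dict) -> list[dict]:
    """Label assistant turns for conversational DPO pair generation:
    'chosen' when the user kept talking without interrupting, 'rejected'
    when the turn was barged in on or ended the session. Hypothetical
    field names following the sketched JSONL schema."""
    turns = session["turns"]
    labels = []
    for i, turn in enumerate(turns):
        if turn["role"] != "assistant":
            continue
        nxt = turns[i + 1] if i + 1 < len(turns) else None
        ended = nxt is None or nxt["text"].strip().lower() in GOODBYE_CUES
        interrupted = turn.get("barge_ins", 0) > 0
        labels.append({
            "session_id": session["session_id"],
            "text": turn["text"],
            "label": "rejected" if (ended or interrupted) else "chosen",
        })
    return labels
```

Downstream pair generation would then match "chosen" and "rejected" turns that answered similar prompts before emitting actual DPO records.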
Prepare Reactor Core to ingest and process training data from the JARVIS Body Unified Agent Runtime:
- Goal trace schema – Define a JSONL schema for multi-step autonomous goal traces (goal → sub-steps → outcomes → reflections) compatible with `TelemetryIngestor`.
- Sequential DPO pairs – Generate preference pairs from goal execution sequences: successful multi-step approaches vs. failed approaches for the same goal type.
- Escalation boundary training – Use human escalation decisions (approve/reject) as Constitutional AI training signals for the safety classifier.
- Multi-agent coordination curriculum – Build a progressive difficulty curriculum from simple single-agent tasks to complex multi-agent workflows.
- Failure recovery fine-tuning – Fine-tune reasoning models on recovery traces: when a sub-step fails, which replanning strategies worked vs. didn't.
- Cross-model comparison at scale – With the Agent Runtime generating higher request volume across all specialist models, DPO pair generation becomes more statistically significant.
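The sequential DPO idea can be sketched by bucketing goal traces by type and pairing successes against failures. Field names (`goal_type`, `steps`, `outcome`) are assumptions following the goal trace bullet:

```python
from collections import defaultdict
from itertools import product


def sequential_dpo_pairs(traces: list[dict]) -> list[dict]:
    """Pair successful vs. failed goal traces of the same goal type into
    DPO preference records. Hypothetical schema, for illustration."""
    by_type: dict = defaultdict(lambda: {"success": [], "failure": []})
    for trace in traces:
        bucket = "success" if trace["outcome"] == "success" else "failure"
        by_type[trace["goal_type"]][bucket].append(trace)
    pairs = []
    for goal_type, group in by_type.items():
        # every (success, failure) combination for the same goal type
        for won, lost in product(group["success"], group["failure"]):
            pairs.append({
                "prompt": f"Plan steps to achieve goal type: {goal_type}",
                "chosen": " -> ".join(won["steps"]),
                "rejected": " -> ".join(lost["steps"]),
            })
    return pairs


traces = [
    {"goal_type": "file_cleanup", "outcome": "success",
     "steps": ["scan", "classify", "delete"]},
    {"goal_type": "file_cleanup", "outcome": "failure",
     "steps": ["delete", "scan"]},
]
pairs = sequential_dpo_pairs(traces)
# one pair: the successful step sequence is "chosen", the failed one "rejected"
```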
Cross-repo verification and integration testing. Many items previously planned here were resolved during the Feb 2026 audit:
- JSONL format contract – ~~Define and enforce shared schema~~ VERIFIED: Schemas are byte-identical across all three repos (v1.0 canonical `ExperienceEvent`). No action needed.
- Implement `ReactorCoreBridge.upload_training_data()` – ~~Broken link in J-Prime~~ VERIFIED: Fully implemented (992 LOC, v242.0) with batch upload, fallback, and job tracking. No action needed.
- Deployment signal verification – Test the Reactor Core → Trinity Protocol → J-Prime `HotSwapManager` path end-to-end with a dummy GGUF (partially addressed in v239.0 smoke test).
- Integration test suite – Automated test that writes a telemetry event in JARVIS Body format, ingests it in Reactor Core, runs a mock training step, exports a GGUF, and signals J-Prime for hot swap.
- Monitoring dashboard – Track pipeline health: events written/day, events ingested/day, training runs completed, models deployed, deployment feedback success rate.
- Model lineage tracking – Every deployed model records: base model, training method, training steps, dataset hash, evaluation scores, previous model scores.
- Data versioning activation – Content-addressed dataset storage with DVC-compatible lineage tracking (infrastructure exists, needs activation).
Once the API server is running (`python3 run_supervisor.py`), access:
- API Base URL: `http://localhost:8003`
- Interactive Docs: `http://localhost:8003/docs` (Swagger UI)
- ReDoc: `http://localhost:8003/redoc`
- Health Check: `http://localhost:8003/health`
# Trigger training
POST /api/v1/training/trigger
{
"model_name": "llama-2-7b",
"training_type": "dpo",
"config": {
"num_epochs": 3,
"batch_size": 4
}
}
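A stdlib-only Python client for the trigger endpoint might look like this (a sketch; it assumes the API server is reachable on port 8003 as documented above):

```python
import json
import urllib.request

REACTOR_API = "http://localhost:8003/api/v1"  # base URL from the docs above


def build_training_request(model_name: str, training_type: str = "dpo",
                           **config) -> dict:
    """Assemble the JSON body shown above; kept separate so it can be
    tested without a running server."""
    return {"model_name": model_name, "training_type": training_type,
            "config": config}


def trigger_training(model_name: str, **config) -> dict:
    """POST a training job and return the server's JSON reply.
    Requires a live Reactor-Core API server."""
    body = json.dumps(build_training_request(model_name, **config)).encode("utf-8")
    req = urllib.request.Request(
        f"{REACTOR_API}/training/trigger",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Usage: `trigger_training("llama-2-7b", num_epochs=3, batch_size=4)` would submit the same job as the raw request shown above.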
# Get training status
GET /api/v1/training/status/{job_id}
# Cancel training
POST /api/v1/training/cancel/{job_id}

# List models
GET /api/v1/models
# Get model info
GET /api/v1/models/{model_id}
# Register model
POST /api/v1/models/register
{
"model_id": "my-model-v1",
"model_path": "/path/to/model",
"metadata": {...}
}

# Schedule job
POST /api/v1/scheduler/schedule
{
"name": "nightly_training",
"schedule_type": "cron",
"cron_expression": "0 2 * * *",
"job_config": {...}
}
# List scheduled jobs
GET /api/v1/scheduler/jobs

# Submit telemetry
POST /api/v1/telemetry/submit
{
"event_type": "interaction",
"data": {...}
}
# Query metrics
GET /api/v1/telemetry/metrics?start_time=...&end_time=...

Connect to `ws://localhost:8003/ws` for real-time events:
const ws = new WebSocket('ws://localhost:8003/ws');
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
console.log('Event:', data.type, data.payload);
};
// Subscribe to training events
ws.send(JSON.stringify({
type: 'subscribe',
channels: ['training:progress', 'training:complete']
}));

Model server runs on port 8001 (configurable):
# Inference
POST http://localhost:8001/predict
{
"prompt": "What is machine learning?",
"model_id": "llama-2-7b",
"max_tokens": 256,
"temperature": 0.7
}
# List loaded models
GET http://localhost:8001/models
# Load model
POST http://localhost:8001/models/load
{
"model_id": "my-model",
"model_path": "/path/to/model",
"backend": "vllm"
}

| Role | Repository | URL |
|---|---|---|
| Body | JARVIS (JARVIS-AI-Agent) | https://github.com/drussell23/JARVIS-AI-Agent |
| Mind | JARVIS-Prime | https://github.com/drussell23/jarvis-prime |
| Nerves | Reactor-Core (this repo) | https://github.com/drussell23/JARVIS-Reactor |
| C++ Core | MLForge | https://github.com/drussell23/MLForge |
- Architecture Docs: See `ARCHITECTURE_ADVANCED.md`
- Trinity Integration: See `TRINITY_INTEGRATION_COMPLETE.md`
- Version History: See `CHANGELOG.md` (if available)
- Issues: https://github.com/drussell23/JARVIS-Reactor/issues
- Discussions: https://github.com/drussell23/JARVIS-Reactor/discussions
We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes following our code style
- Add tests for new features
- Update documentation
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Formatting: Black (line length 100)
- Linting: Ruff
- Type Hints: Required for all functions
- Docstrings: Google style
Reactor-Core is the learning and adaptation layer for autonomous Gmail triage. It does not triage inboxes directly; it consumes outcome signals and feeds safe, bounded improvements back into scoring behavior.
- Ingest behavioral outcomes (opened, replied, ignored, relabeled) with confidence controls.
- Track sender/domain reputation and outcome distributions over time.
- Drive bounded adaptive weight proposals for Body-side scoring.
- Preserve auditability: explainable adaptation events, rollback capability, and deterministic safety bounds.
flowchart TD
A[Gmail triage decision in JARVIS Body] --> B[User behavior outcome observed]
B --> C[OutcomeCollector classification]
C --> D[Experience queue + reputation updates]
D --> E[Reactor-Core learning pipelines]
E --> F[Bounded adapted weights]
F --> G[Shadow validation + drift checks]
G --> H[Safe activation in Body scoring]
- Adaptation is not applied immediately or unguarded; by default it runs in bounded mode with guardrails.
- Low-confidence outcomes are excluded from adaptation input.
- Weight changes remain bounded (no runaway drift), and disagreements trigger rollback behavior.
- User-facing notifications and UI delivery continue through Body-side channels; Reactor-Core influences prioritization quality over time.
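The bounded-adaptation guardrail above can be sketched as a clamped update step: each proposed weight moves at most a fixed amount per cycle and stays inside hard bounds. Parameter names and limits are illustrative, not the repo's actual safety configuration:

```python
def propose_weights(current: dict, observed_delta: dict,
                    max_step: float = 0.05,
                    floor: float = 0.0, ceil: float = 1.0) -> dict:
    """Bounded adaptive weight proposal: clamp each learned delta to
    [-max_step, +max_step] per cycle, then clamp the result to
    [floor, ceil] so no weight can drift out of range. Illustrative
    sketch of the guardrail, not the actual implementation."""
    proposed = {}
    for key, weight in current.items():
        step = max(-max_step, min(max_step, observed_delta.get(key, 0.0)))
        proposed[key] = max(floor, min(ceil, weight + step))
    return proposed


weights = {"sender_reputation": 0.40, "recency": 0.25, "reply_history": 0.35}
delta = {"sender_reputation": 0.30, "recency": -0.02}  # raw learned deltas
new_weights = propose_weights(weights, delta)
# sender_reputation moves only by +max_step despite the large raw delta;
# recency moves by its small delta; reply_history is unchanged
```

Because each cycle is bounded, a bad batch of outcome signals can shift scoring only slightly, and rollback just restores the previous weight set.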
MIT License - See LICENSE file for details.
Built with ❤️ for the JARVIS AGI Ecosystem
Special Thanks:
- PyTorch team for the excellent ML framework
- Hugging Face for transformers and PEFT
- FastAPI for the amazing async web framework
- All contributors and users of the JARVIS ecosystem
Version: 2.12.0 (v239.0 target)
Last Updated: February 2026
Status: ✅ Infrastructure Complete | ⏳ Pipeline Activation In Progress (v239.0 – wiring existing components, ~200-400 lines across 4 files, zero new Python files)
Feb 2026 Audit Corrections: A three-way cross-verification against actual code corrected several previously reported issues: ReactorCoreBridge.upload_training_data() IS fully implemented (992 LOC), experience schemas ARE byte-identical across repos, and ExperienceEvent IS the single canonical schema with legacy adapters. The remaining gap is operational activation, not missing code.
- Training data pipeline built but never activated – All infrastructure exists and schemas are verified identical across repos (v1.0 canonical). `TelemetryEmitter`, `TelemetryIngestor`, `ReactorCoreBridge.upload_training_data()` (992 LOC, v242.0), `UnifiedTrainingPipeline`, and `ReactorCoreWatcher` are all built. The gap is activation: zero training jobs have ever run (`jobs.json` is empty). The Reactor Core API server on port 8090 needs to be verified as accepting POSTs, and the supervisor needs to call `initialize_reactor_core()` and `start_reactor_core_watcher()` during startup. Target: v239.0 (wiring, not building).
- Deployment feedback loop is one-way – Reactor Core can export GGUF and deploy to J-Prime, and `ReactorCoreWatcher` in Prime can detect new models. But there is no feedback mechanism: Prime never tells Reactor Core "model deployed successfully" or "model caused regression, rolling back." A `deployment_status.json` feedback file is needed. Target: v239.0.
- No deployment quality gate – The pipeline goes Training → GGUF export → Deploy with no validation step. A smoke test (load model, run test inference, verify non-garbage output) must be inserted before deployment. Must run in a subprocess to avoid OOM on a 16GB Mac. Target: v239.0.
- No real production training data yet – `UnifiedTrainingPipeline` has never run on actual user interaction data. Telemetry JSONL files exist in `~/.jarvis/telemetry/` but have never been ingested into a training run. Target: v239.0 (first run).
- LangGraph reasoning traces unavailable – JARVIS Body's reasoning engine produces linear fallback traces, not rich graph-based reasoning data (depends on JARVIS Body v246.0).
- Agent Runtime training data schema undefined – When autonomous goal pursuit generates multi-step traces, Reactor Core needs a new ingestion schema (v246.0 target).
- Voice conversation training data schema undefined – JARVIS Body v238.0 generates multi-turn conversation traces, barge-in events, and turn detection logs. Reactor Core needs a conversation trace ingestion schema and a conversational DPO pair generator (v248.0 target).
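The missing quality gate could be prototyped as a subprocess-isolated smoke test, so a hung or memory-hungry model load cannot take down the pipeline process itself. The child script below is a placeholder for the real load-and-infer step:

```python
import subprocess
import sys
import textwrap


def smoke_test_model(model_path: str, timeout: int = 300) -> bool:
    """Run a pre-deployment smoke test in a child process and gate on its
    exit code. The child body is a placeholder; a real gate would load
    the GGUF at model_path and run a test prompt. Illustrative sketch."""
    child = textwrap.dedent(f"""
        import sys
        # Placeholder for the real gate: load {model_path!r}, run a test
        # prompt, and exit non-zero if the output looks like garbage.
        output = "Paris is the capital of France."
        ok = len(output.split()) >= 3 and any(c.isalpha() for c in output)
        sys.exit(0 if ok else 1)
    """)
    try:
        result = subprocess.run([sys.executable, "-c", child], timeout=timeout)
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False  # a hung model load counts as a failed gate


# Deployment would proceed only when the gate passes:
# if smoke_test_model("/path/to/exported.gguf"): deploy(...)
```

Running the check in a subprocess means an OOM kill or crash during model load surfaces as a non-zero exit code rather than bringing down the supervisor.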