Tracker adapters translate a signac job + its linked Limina artifact into a
remote-backend run, without owning any canonical state. The ABC is
TrackerAdapter; two adapters ship in v1.

    class TrackerAdapter(ABC):
        name: str  # short backend name: "noop", "wandb", ...

        def init_run(self, *, project, group, tags, config, notes, offline) -> RunHandle: ...
        def log(self, handle, metrics) -> None: ...
        def log_artifact(self, handle, name, path) -> None: ...
        def finish(self, handle, *, exit_code=0) -> None: ...
        def list_runs(self, *, project, group_prefix) -> list[RunRecord]: ...

Return RunHandle from init_run with id, url (optional), project,
group, and backend-specific state in extra={}. The caller's
bind_tracker writes a TrackerBinding dict to job.doc["tracker"]:

    {
        "backend": "wandb",
        "run_id": "abcdef12",
        "url": "https://wandb.ai/...",
        "project": "my-project",
        "group": "H012/E018/full",
    }

That round-trips: you can find a signac job from a W&B run via
config.job_id, and find the W&B run from a signac job via
job.doc["tracker"]["url"].
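
As a rough illustration of how a handle could map onto the binding dict, here is a minimal sketch. The RunHandle dataclass is a local stand-in whose fields follow the description above, make_binding is a hypothetical helper (not part of the package), and a plain dict stands in for job.doc.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class RunHandle:
    """Local stand-in for the real RunHandle; field names follow the text."""
    id: str
    project: str
    group: str
    url: Optional[str] = None  # optional: offline or local backends may have none
    extra: dict = field(default_factory=dict)  # backend-specific state


def make_binding(backend: str, handle: RunHandle) -> dict:
    """Hypothetical helper: build the dict stored under job.doc["tracker"]."""
    return {
        "backend": backend,
        "run_id": handle.id,
        "url": handle.url,
        "project": handle.project,
        "group": handle.group,
    }


handle = RunHandle(id="abcdef12", project="my-project", group="H012/E018/full")
job_doc: dict = {}  # plain dict standing in for job.doc
job_doc["tracker"] = make_binding("wandb", handle)
```

Keeping the binding a flat dict of plain values means it can be stored in the job document without the backend SDK being importable.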
NoopAdapter writes a JSONL event log instead of hitting a network. Events:

    {"timestamp": "...", "event": "init_run", "project": "...", "group": "...", "tags": [...], "config": {...}, "notes": "...", "offline": false}
    {"timestamp": "...", "event": "log", "metrics": {"loss": 0.1}}
    {"timestamp": "...", "event": "log_artifact", "name": "out", "path": "...", "size_bytes": 1234}
    {"timestamp": "...", "event": "finish", "exit_code": 0}

Default log location: <job_workspace>/tracker_log/<run_id>/events.jsonl.
Pass log_root=<path> to NoopAdapter(...) for tests that aren't operating
inside a real signac job (e.g. unit tests on the adapter itself).
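
The event log is plain JSONL, so it can be written and read back with the standard library alone. The sketch below is not the adapter's implementation: append_event and read_events are hypothetical helpers, the ISO 8601 timestamp is an assumption (the format is elided above), and a temporary directory with a made-up run id stands in for a real log_root.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_event(log_path: Path, event: str, **fields) -> None:
    """Append one JSON object per line, creating parent directories as needed."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed format
        "event": event,
        **fields,
    }
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_events(log_path: Path) -> list:
    """Parse the log back into a list of dicts, skipping blank lines."""
    with log_path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


log_root = Path(tempfile.mkdtemp())  # stands in for log_root=<path>
log_path = log_root / "tracker_log" / "run-0" / "events.jsonl"  # "run-0" is made up
append_event(log_path, "init_run", project="demo", group="H012/E018/full", offline=False)
append_event(log_path, "log", metrics={"loss": 0.1})
append_event(log_path, "finish", exit_code=0)
events = read_events(log_path)
```

A test can then assert on the order of events[i]["event"] values without any network access.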
Install: pip install agentic-experiments[wandb].

    from aexp.trackers import WandbAdapter

    adapter = WandbAdapter(entity="my-team")  # entity optional

Key behaviors:
- Lazy-imports wandb at construction; raises TrackerInitError if the package is missing. Nothing at package import time requires wandb.
- init_run(..., offline=True) forwards mode="offline" to wandb.init, for HPC nodes without internet. See Offline + sync below.
- init_run always passes dir=<job_workspace> to wandb.init, so every run's local wandb state (offline-run dirs, logs, caches) lives under the signac job. One directory per run, with no sprawl into the consumer repo's CWD.
- list_runs(project=..., group_prefix=...) queries W&B's API for matching runs. Regex filter on group.
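
The lazy-import behavior in the first bullet can be sketched as follows. This is an illustration of the pattern, not the WandbAdapter source: TrackerInitError is redefined locally as a stand-in, LazyBackendAdapter is a hypothetical class, and the stdlib json module plays the role of an installed SDK so the example runs without wandb.

```python
import importlib


class TrackerInitError(RuntimeError):
    """Local stand-in for the package's TrackerInitError."""


class LazyBackendAdapter:
    """Hypothetical adapter that imports its SDK only when constructed."""

    def __init__(self, module_name: str):
        try:
            # Import happens here, not at module load, so importing this
            # file never requires the backend package.
            self._sdk = importlib.import_module(module_name)
        except ImportError as exc:
            raise TrackerInitError(
                f"backend package {module_name!r} is not installed"
            ) from exc


adapter = LazyBackendAdapter("json")  # stdlib module standing in for an SDK

try:
    LazyBackendAdapter("aexp_example_missing_sdk")  # deliberately nonexistent
    missing_raised = False
except TrackerInitError:
    missing_raised = True
```

Raising a package-specific error at construction gives callers one exception type to catch, regardless of which backend is missing.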
Runs execute on compute nodes with no internet; you sync from a login node afterward. Because the adapter co-locates wandb state with the signac workspace, offline runs land at predictable paths:
<repo>/.runs/workspace/<job_id>/wandb/offline-run-YYYYMMDD_HHMMSS-<id>/

    # Python
    from aexp import create_run, bind_tracker, WandbAdapter, run_lifecycle

    job = create_run(experiment_id="E018", hypothesis_id="H012",
                     statepoint={"condition": "full", "seed": 0})
    bind_tracker(job, WandbAdapter(), project="ecg-inquiry-eval", offline=True)
    with run_lifecycle(job):
        # ... work; all wandb writes go to job.path/wandb/ ...
        pass

Or via the CLI:

    aexp new-run --experiment E018 --hypothesis H012 --sp condition=full,seed=0
    aexp bind-tracker <job_id> --backend wandb --project ecg-inquiry-eval --offline

    # One command: walks .runs/workspace/*/wandb/, calls wandb sync on every offline run.
    aexp sync-offline
    # Preview without syncing:
    aexp sync-offline --dry-run

Or, if you'd rather drive wandb directly:

    wandb sync --sync-all .runs/

Run IDs are stable between offline and online, so the synced runs show up
in W&B with the same id, group (H012/E018/full), tags, and full Limina
config (limina.experiment_id, limina.hypothesis_id, etc.).
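
To make the directory walk concrete, here is a minimal sketch of what a dry run over that layout might look like. It is not the aexp sync-offline implementation: find_offline_run_dirs and sync_commands are hypothetical helpers, the job id and run directory name are made up, and the commands are only built, never executed.

```python
import tempfile
from pathlib import Path


def find_offline_run_dirs(root: Path) -> list:
    """Collect offline-run-* directories under <root>/workspace/<job_id>/wandb/."""
    pattern = "workspace/*/wandb/offline-run-*"
    return sorted(p for p in root.glob(pattern) if p.is_dir())


def sync_commands(root: Path) -> list:
    """Build one `wandb sync <dir>` command per offline run (dry run only)."""
    return [["wandb", "sync", str(p)] for p in find_offline_run_dirs(root)]


# Fake workspace with one offline run and one unrelated directory.
runs_root = Path(tempfile.mkdtemp()) / ".runs"
wandb_dir = runs_root / "workspace" / "job_a" / "wandb"
(wandb_dir / "offline-run-20240101_000000-abc12345").mkdir(parents=True)
(wandb_dir / "logs").mkdir()  # not an offline run; should be ignored

commands = sync_commands(runs_root)
```

To actually sync, each command could be passed to subprocess.run on a login node with network access.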

    from aexp import find_offline_runs, sync_offline_runs

    paths = find_offline_runs(".runs")
    results = sync_offline_runs(".runs", dry_run=False)
    for r in results:
        if not r.ok:
            print(r.path, r.stderr)

To add a new adapter:
- Subclass TrackerAdapter, set name.
- Lazy-import the backend SDK inside __init__ or the methods, never at module load.
- Preserve the contract: init_run returns a RunHandle, subsequent methods take it. Store any backend handle in handle.extra.
- Register in aexp/trackers/__init__.py if you want it importable from the package root.
- Add tests: mock the SDK (see tests/test_trackers_wandb.py for the pattern). Assert the init kwargs and that the adapter tolerates a missing backend (raises TrackerInitError).
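
The contract in the steps above can be sketched with a toy in-memory backend. Everything here is illustrative: RunHandle and TrackerAdapter are local stand-ins (trimmed to two methods, with abstract methods assumed), and MemoryAdapter is a hypothetical adapter that keeps its backend state in handle.extra.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunHandle:
    """Local stand-in for the real RunHandle."""
    id: str
    project: str
    group: str
    url: Optional[str] = None
    extra: dict = field(default_factory=dict)


class TrackerAdapter(ABC):
    """Local stand-in for the ABC, trimmed to two methods for brevity."""
    name: str

    @abstractmethod
    def init_run(self, *, project, group, tags, config, notes, offline) -> RunHandle: ...

    @abstractmethod
    def log(self, handle, metrics) -> None: ...


class MemoryAdapter(TrackerAdapter):
    """Hypothetical adapter that stores logged metrics in memory."""
    name = "memory"

    def init_run(self, *, project, group, tags, config, notes, offline) -> RunHandle:
        # Backend state lives in handle.extra, not on the adapter itself.
        return RunHandle(id="run-0", project=project, group=group, extra={"rows": []})

    def log(self, handle, metrics) -> None:
        handle.extra["rows"].append(dict(metrics))


adapter = MemoryAdapter()
handle = adapter.init_run(project="demo", group="H012/E018/full",
                          tags=[], config={}, notes="", offline=True)
adapter.log(handle, {"loss": 0.1})
```

Because all per-run state travels in the handle, one adapter instance can serve several runs without them interfering.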
Weave and OpenTelemetry were both considered. Weave was rejected: the runtime is Claude Code /
Claude Desktop, which invokes the model inside a closed binary. Our Python
never touches anthropic.messages.create(), so Weave's auto-instrumented
prompt/completion capture never fires. What's left is a generic function
tracer that doesn't justify the W&B-account + SDK weight.
OpenTelemetry is a plausible v1.1 extra (pip install agentic-experiments[otel]): Claude Code itself emits OTEL under
CLAUDE_CODE_ENABLE_TELEMETRY=1, so our spans could land in the same
collector and correlate by session id. It is not shipping in v1 because we don't yet
know whether structured JSON logs to stderr (which the Limina hooks already
produce) are enough.