# ts-bench

Reproducible benchmark CLI for comparing AI coding agents on TypeScript workloads. Numbers are directional, not lab-grade.

Interface: run `bun src/index.ts --help` to see the available agents and providers, the `--dataset v1|v2` switch, and the distinction between exercises (v1) and tasks (v2).

| Dataset | Role |
| --- | --- |
| v1 (default) | Exercism practice exercises |
| v2 | SWE-Lancer (Docker, large monorepo); run `./scripts/setup-v2-env.sh` first |

- Handbook (setup, secrets, CI, methodology): `specs/000-project-handbook/README.md`
- Cursor / runner caveats: `AGENTS.md`
- Spec Kit (SDD): `.specify/`, `specs/`; local `/speckit.*` commands live under `.cursor/` (gitignored; run `specify init --here --ai cursor-agent --force` after cloning)

The frozen v1 baseline for reproducibility is the tag `v1-final` (commit `2b3bc94`); see Releases.

```sh
bun install
bun src/index.ts --agent claude --model <model>               # v1 default
bun src/index.ts --dataset v2 --task <id> --agent claude ...  # v2 (Docker)
```
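As a sketch of how the two invocations above differ, the helper below assembles the command line for either dataset. `my-model` and the task id are placeholders (not real identifiers from this repo), and the function only prints the command rather than executing it:

```shell
# Assemble a ts-bench invocation; v1 is the default dataset, so only v2
# needs the --dataset and --task flags. Prints the command (dry run).
build_cmd() {
  dataset="$1"; model="$2"; task="$3"
  if [ "$dataset" = "v1" ]; then
    echo "bun src/index.ts --agent claude --model $model"
  else
    echo "bun src/index.ts --dataset v2 --task $task --agent claude --model $model"
  fi
}

build_cmd v1 my-model        # v1 default run
build_cmd v2 my-model 123    # v2 run against one task (Docker required)
```

Because v2 runs inside Docker against a large monorepo, remember to run `./scripts/setup-v2-env.sh` before the first v2 invocation.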

Workflows: v1 · v2. To browse the SWE-Lancer task UI, run `bun run build:swelancer-pages`, then open `docs/swelancer-tasks/` (see `docs/README.md`).
