feat(cli, migrate): add stash encrypt commands + @cipherstash/migrate #357

Merged
calvinbrewer merged 22 commits into main from encryption-migrations on May 4, 2026

@coderdan
Contributor

@coderdan coderdan commented Apr 23, 2026

Summary

Adds the production-shape flow for migrating an existing plaintext column to eql_v2_encrypted — and wires it up so the primary way users encounter it is through stash init's agent handoff, not direct CLI use.

The intended UX:

  1. User runs stash init and picks a handoff (Claude / Codex / Cursor / Windsurf / Cline / wizard).
  2. The agent reads the orient-and-route setup prompt + the loaded skills, opens the conversation by asking which flow the user wants ("add a new encrypted column" or "migrate an existing column to encrypted"), then drives the schema edits, application-code wiring, and stash encrypt lifecycle commands itself.
  3. The CLI commands (stash db push, stash db activate, stash encrypt {backfill,cutover,drop,status,plan}) are the primitives the agent calls. Users can invoke them directly — they're documented and the help output is solid — but the expected ergonomic surface is the conversation, not the terminal.

This shape is why so much of the PR is prompt + skill + doctrine work: the lifecycle on its own is just a sequence of SQL operations, but a column migration spans multiple deploys and several deploy-ordering footguns, and the right place to walk a user through that is in the agent conversation that already has the project context loaded.

What ships:

  • A new stash encrypt command group (backfill, cutover, drop, status, plan).
  • New stash db push / stash db activate semantics matching the EQL pending → encrypting → active state machine.
  • A new @cipherstash/migrate package exposing the same primitives as a library (so users embedding backfill in their own workers/cron don't go through the CLI).
  • Setup-prompt + AGENTS.md doctrine + skill rewrite so the post-init agent orients the user, picks the right flow, and drives the lifecycle commands itself. Refines the handoff scaffolding from feat(cli): hand off init to local coding agents (Claude / Codex / AGENTS.md / wizard) #395.
  • Drizzle integration polish so cutover and drop don't desync drizzle-kit's journal/snapshot.

Lifecycle

Each column walks through:

schema-added → dual-writing → backfilling → backfilled → cut-over → dropped

Two real flows for the agent to pick between:

  1. Add a new encrypted column: no plaintext predecessor. Schema-add → db push → db activate (if a config already exists). No backfill, no cutover.
  2. Migrate an existing populated column — the full lifecycle. Adds a <col>_encrypted twin, dual-writes from the app, backfills existing rows, then renames the twin into <col> (the app's read path is unchanged) and drops the old plaintext.

In-place conversion is explicitly unsupported — there's no way to swap the type of a populated column atomically without corrupting data.
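
As a rough illustration of the migrate-existing lifecycle above, the phases can be modelled as an ordered list with forward-only, one-step transitions. The phase names come from this description; `nextPhase` and `canTransition` are hypothetical helpers, not part of the CLI or of @cipherstash/migrate, and the one-step rule is an assumption made for the sketch.

```typescript
// Phase names are taken from the lifecycle above; helper names are hypothetical.
const PHASES = [
  'schema-added',
  'dual-writing',
  'backfilling',
  'backfilled',
  'cut-over',
  'dropped',
] as const

type Phase = (typeof PHASES)[number]

// Returns the phase that follows `phase`, or null once the column is dropped.
function nextPhase(phase: Phase): Phase | null {
  return PHASES[PHASES.indexOf(phase) + 1] ?? null
}

// Assumption: a migrating column only ever moves forward one step at a time.
function canTransition(from: Phase, to: Phase): boolean {
  return nextPhase(from) === to
}
```

Under this model `canTransition('backfilled', 'cut-over')` holds, while skipping straight from `dual-writing` to `cut-over` does not.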

State model (three layers, kept separate on purpose)

  • Repo manifest (.cipherstash/migrations.json): desired columns, index set, target phase. Code-reviewable intent. Written by backfill (records the column entry) and drop (bumps target phase to dropped).
  • EQL intent (eql_v2_configuration): unchanged. Proxy continues to read it as its source of truth. db push writes new column-sets as pending when an active config already exists, or directly as active for the first push. db activate and encrypt cutover both transition pending → encrypting → active in a single transaction.
  • Runtime state, new (cipherstash.cs_migrations): append-only event log recording per-column phase, backfill cursor, and rows processed. Installed by stash db install. Designed to be upstreamed into EQL as eql_v2_migrations later so Stack and Proxy own it jointly.

Why a new table instead of reusing eql_v2_configuration: its CHECK constraint rejects custom metadata, its state enum is global (only one {active, pending, encrypting} at a time) so it can't represent multiple columns in different phases, and backfill-cadence writes would collide with Proxy's 60s config refresh. Full reasoning in the design doc.
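
To make the "multiple columns in different phases" point concrete, here is a minimal in-memory sketch of an append-only event log whose current state is derived per column. The `MigrationEvent` fields and the `EventLog` class are assumptions for illustration only; the real cs_migrations columns are not listed in this description.

```typescript
// Hypothetical event shape; not the actual cs_migrations schema.
interface MigrationEvent {
  table: string
  column: string
  phase: string
  cursor: string | null // last primary key processed by backfill, if any
  rowsProcessed: number
}

class EventLog {
  private readonly events: MigrationEvent[] = []

  // Append-only: existing events are never updated or deleted.
  append(event: MigrationEvent): void {
    this.events.push({ ...event })
  }

  // The current state of each column is its most recent event.
  latestByColumn(): Map<string, MigrationEvent> {
    const latest = new Map<string, MigrationEvent>()
    for (const e of this.events) latest.set(`${e.table}.${e.column}`, e)
    return latest
  }
}
```

Because state is keyed by table and column, one column can sit in backfilled while another is still dual-writing, which a single global state enum cannot express.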

CLI commands (the primitives the agent calls)

| Command | Purpose |
| --- | --- |
| stash db push | Register new EQL columns. Writes pending when an active config exists; first-ever push writes active directly. |
| stash db activate | Promote pending → active without renames. Use after db push for the add-new-column flow. |
| stash encrypt status | Per-column table: phase, EQL state, indexes, progress, drift flags. Phase-aware progress framing (no more 0/0 (100%) for completed columns). |
| stash encrypt plan | Diff intent (.cipherstash/migrations.json) vs observed state. |
| stash encrypt backfill | Chunked, resumable, idempotent; txn-per-chunk with atomic checkpoint; SIGINT-safe; auto-detects single-column PK. Prompts to confirm dual-writes are deployed (or accepts --confirm-dual-writes-deployed non-interactively). --force re-encrypts every plaintext row regardless of state; this is the recovery path for drift. |
| stash encrypt cutover | eql_v2.rename_encrypted_columns() + migrate_config() + activate_config() in one transaction. Optional Proxy refresh via CIPHERSTASH_PROXY_URL. For Drizzle projects, scaffolds an idempotent follow-up migration so drizzle-kit's journal/snapshot stays in sync (no-op on the source DB; applies on a fresh restore). |
| stash encrypt drop | Generates a DROP COLUMN <col>_plaintext migration. For Drizzle projects, scaffolds via drizzle-kit generate --custom so the file is journaled. |

stash encrypt advance (in earlier drafts of this PR) is gone — phase transitions are now driven implicitly by the lifecycle commands, not by a separate "record the transition" command.
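
The "chunked, resumable, idempotent" behaviour of backfill can be sketched with an in-memory stand-in. This is not the library's runBackfill: the `Row` and `Checkpoint` types, `fakeEncrypt`, and the `maxChunks` interruption knob are all assumptions made for illustration, and the real command commits each chunk and its checkpoint in one database transaction.

```typescript
interface Row { id: number; plaintext: string; encrypted: string | null }
interface Checkpoint { cursor: number; rowsProcessed: number }

// Stand-in for the encryption client; this is NOT real encryption.
const fakeEncrypt = (s: string): string => `enc(${s})`

// Processes rows with id > checkpoint.cursor in chunks, skipping rows that are
// already encrypted. `maxChunks` simulates an interruption after N chunks.
function backfill(
  rows: Row[],
  checkpoint: Checkpoint,
  chunkSize: number,
  maxChunks = Number.POSITIVE_INFINITY,
): Checkpoint {
  let { cursor, rowsProcessed } = checkpoint
  for (let n = 0; n < maxChunks; n++) {
    // Keyset pagination: the next page is ordered by id, strictly after the cursor.
    const page = rows
      .filter((r) => r.id > cursor && r.encrypted === null)
      .sort((a, b) => a.id - b.id)
      .slice(0, chunkSize)
    const last = page[page.length - 1]
    if (last === undefined) break // nothing left to encrypt
    // Write the chunk, then advance the checkpoint past it.
    for (const r of page) r.encrypted = fakeEncrypt(r.plaintext)
    cursor = last.id
    rowsProcessed += page.length
  }
  return { cursor, rowsProcessed }
}
```

Calling `backfill` again with the returned checkpoint resumes where the previous run stopped, and calling it once everything is encrypted leaves the checkpoint unchanged.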

@cipherstash/migrate package

Exposes the same primitives (runBackfill, appendEvent, progress, renameEncryptedColumns, migrateConfig, activateConfig, discardPendingConfig, upsertManifestColumn, setManifestTargetPhase, …) so users can embed backfill in their own workers or cron jobs without the CLI. Example in packages/migrate/README.md.

Init agent handoff (refines #395)

stash init's post-handoff prompt was a fixed TODO list that drove the agent past the user's actual intent. Replaced with an orient-and-route shape:

  1. Agent's first response is a short orientation message + a routing question (which flow does the user want).
  2. Two end-user-facing flows named: "Add a new encrypted column" and "Migrate an existing column to encrypted". No internal "Path 1/2/3" jargon.
  3. Migrate flow opens with a one-line note on why it's staged (parallel twin + dual-write + rename) before the steps.
  4. Steps name the concrete CLI commands inline, runner-aware (pnpm dlx stash … / npx stash … / bunx stash … per the project's package manager).
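
The runner-aware substitution in step 4 might look something like the following. The three prefixes are the ones named above; the `PackageManager` type, the `stashCommand` helper, and the assumption that npm projects map to `npx` are illustrative only.

```typescript
// Hypothetical helper; only the three runner prefixes come from the PR description.
type PackageManager = 'pnpm' | 'npm' | 'bun'

const RUNNERS: Record<PackageManager, string> = {
  pnpm: 'pnpm dlx stash',
  npm: 'npx stash',
  bun: 'bunx stash',
}

// Builds the command line an agent would be told to run for a given project.
function stashCommand(pm: PackageManager, ...args: string[]): string {
  return [RUNNERS[pm], ...args].join(' ')
}
```

For a pnpm project, `stashCommand('pnpm', 'encrypt', 'status')` yields `pnpm dlx stash encrypt status`.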

The interactive column-picker that #395 shipped is removed — schema decisions now happen post-handoff in the agent conversation, so the picker added friction without value. build-schema.ts always writes a heavily-commented placeholder; the agent edits the user's real schema files. (PR #398, which polished the picker, is closed as superseded.)

The doctrine in AGENTS.md (the durable invariants — "never log plaintext", "encrypted columns are nullable jsonb at creation") gains an explicit bundler-exclusion invariant: @cipherstash/stack wraps a native FFI module and must be excluded from any bundler (Next.js serverExternalPackages, webpack externals, esbuild external, Vite SSR ssr.external). The agent missed this in spike testing; promoting it to doctrine + the orient-and-route prompt's first step + the stash-encryption skill closes the gap.
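
For reference, the bundler-exclusion invariant translates into configuration along these lines. This is a sketch showing only the relevant keys; in a real project each object lives in its own config file (next.config.ts, the esbuild build call, vite.config.ts), and the constant names here are placeholders.

```typescript
// Packages that wrap native modules and must stay out of the bundle.
const NATIVE_PACKAGES = ['@cipherstash/stack']

// Next.js: `serverExternalPackages` keeps these out of the server bundle.
const nextConfig = {
  serverExternalPackages: NATIVE_PACKAGES,
}

// esbuild: `external` lists module specifiers that are left unbundled.
const esbuildOptions = {
  external: NATIVE_PACKAGES,
}

// Vite: `ssr.external` externalizes dependencies for SSR builds.
const viteConfig = {
  ssr: { external: NATIVE_PACKAGES },
}
```

Older Next.js releases may need a different key for the same effect, so check the option name against the installed version.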

Skill content

  • stash-encryption — adds the column-migration lifecycle section (the source of truth for the migrate-existing flow), the bundler-exclusion callout in Installation, and an updated CLI sequence including db push between backfill and cutover.
  • stash-drizzle — adds a 150-line "Migrating an Existing Column to Encrypted" worked example covering schema-add, dual-write, backfill, cutover, and drop in Drizzle terms.
  • stash-cli — rewrites the db push section with the pending/active behavior and a decision table for the next command, adds db activate, updates encrypt cutover with the full state-machine description, updates encrypt backfill with the new flags.

Post-db install UX

The "What next" panel that prints after db install was a stale wizard-centric snippet (hand-rolled client.encryptModel(record, table).run() calls). Replaced with two canonical "ask your agent X" phrasings — one per flow — so it bridges install → agent handoff cleanly in both the init pipeline and standalone db install.

Phase 1 scope / Phase 2 follow-ups

  • Phase 1 (this PR): Protect/Stack client-side backfill — CLI dynamic-imports the user's encryption client, encrypts in-process, writes payloads directly.
  • Phase 2: Proxy-mode backfill (SQL-through-Proxy using the same cs_migrations state), stash db introspect --json / stash env set CLI subcommands, upstream cs_migrations → eql_v2_migrations in EQL.

Cross-PR follow-ups tracked in docs/plans/encryption-migrations-followups.md (working doc) — including deploy-ordering safeguards for the migrate-existing-column flow (self-guarding generated SQL + opt-in event trigger), live coverage check in encrypt status during dual-writing, smarter db push for additive-only changes, and stash encrypt update for re-encrypting after EQL config changes.

Test plan

  • pnpm --filter @cipherstash/migrate test — manifest round-trip, SQL identifier quoting, state DAO, integration tests against real Postgres (skipped without POSTGRES_URL_FOR_TESTS).
  • pnpm --filter stash test — 167 tests pass, including new orient-and-route prompt assertions and install-skills + build-agents-md coverage.
  • pnpm -w build — full workspace builds clean.
  • pnpm exec biome check <changed files> — clean.
  • ./dist/bin/stash.js --help shows the new encrypt, db push, and db activate commands.
  • Manual e2e against a spike project seeded with live data: stash init → agent handoff → migrate-existing-column flow end-to-end (schema-add → dual-write → backfill → re-push → cutover → drop). Drizzle journal/snapshot stays in sync.
  • Verify Proxy interop after cutover: SELECT email FROM users via Proxy returns plaintext, direct Postgres returns ciphertext JSON.

Design doc

docs/plans/encryption-migrations.md — full architecture including state-layer rationale, index-on-backfill implications, Proxy compatibility gotchas, and phased rollout.

Summary by CodeRabbit

  • New Features

    • Added stash encrypt command group (status, plan, backfill, cutover, drop) including resumable, chunked backfill with dual-write confirmation and transactional cutover (optional proxy reload)
    • Introduced @cipherstash/migrate library and automated migration tracking installed by db install
    • Added stash db activate to promote pending encryption configs
  • Documentation

    • Expanded docs, guides and examples (CLI, Drizzle, Sequelize) and bundler-exclusion guidance for native deps

@changeset-bot

changeset-bot Bot commented Apr 23, 2026

🦋 Changeset detected

Latest commit: f418563

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
| Name | Type |
| --- | --- |
| stash | Minor |
| @cipherstash/migrate | Minor |
| @cipherstash/e2e | Patch |

@coderabbitai

coderabbitai Bot commented Apr 23, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Adds end-to-end plaintext→encrypted-column migration tooling: a new stash encrypt CLI group (status/plan/backfill/cutover/drop), a reusable @cipherstash/migrate library (state, backfill, cursor, eql wrappers, manifest), runtime tracking table cipherstash.cs_migrations, manifest .cipherstash/migrations.json, CLI wiring, tests, docs, and init/install adjustments.

Changes

Encryption Migrations

| Layer | File(s) | Summary |
| --- | --- | --- |
| Schema / Data Shape | packages/migrate/src/install.ts, packages/migrate/src/state.ts, packages/migrate/src/manifest.ts | Adds MIGRATIONS_SCHEMA_SQL and creates cipherstash.cs_migrations; defines Migration types, event/phase model, and a Zod-validated .cipherstash/migrations.json manifest structure. |
| Cursor / SQL Helpers | packages/migrate/src/cursor.ts, packages/migrate/src/sql.ts | Keyset pagination helpers (fetchUnencryptedPage, countUnencrypted, qualifyTable) and quoteIdent for safe SQL identifier quoting. |
| Backfill Core | packages/migrate/src/backfill.ts | Implements chunked, resumable, idempotent runBackfill with per-chunk transaction, checkpointing, encryption-client bulk encrypt validation, force overwrite mode, progress events, and chunk write helper. |
| EQL / Proxy Wrappers | packages/migrate/src/eql.ts | Wrappers for EQL functions: select pending, ready_for_encryption, rename_encrypted_columns, migrate_config, activate_config, discard pending, reload config, and count-encrypted (returns bigint). |
| State DAO | packages/migrate/src/state.ts | appendEvent, latestByColumn, progress to read/write append-only migration events and decode rows. |
| Manifest API | packages/migrate/src/manifest.ts | manifestPath, readManifest, writeManifest, upsertManifestColumn, setManifestTargetPhase for repo intent manifest management. |
| Library Entry & Packaging | packages/migrate/src/index.ts, packages/migrate/package.json, packages/migrate/tsup.config.ts, packages/migrate/tsconfig.json | Public barrel exports, package metadata, build config, TS config, and externals (pg, @cipherstash/stack). |
| Library Tests | packages/migrate/src/__tests__/* | Unit + integration tests covering backfill resumability/idempotency/error handling, manifest behaviors, SQL helpers, and state DAO. |
| CLI Dependency & Scripts | packages/cli/package.json, packages/cli/scripts/e2e-encrypt.sh, packages/cli/scripts/fixtures/seed-users.sql | CLI depends on @cipherstash/migrate; adds e2e smoke script and SQL fixture to seed 5,000 users for backfill interrupt/resume test. |
| CLI Dispatcher & Help | packages/cli/src/bin/stash.ts | Adds encrypt top-level command, documents new subcommands, and wires lazy-loaded subcommand routing. |
| Encryption Context Loader | packages/cli/src/commands/encrypt/context.ts | loadEncryptionContext() dynamically imports user encryption client, discovers encrypted tables (duck-typing + drizzle fallback), and exposes requireTable. |
| CLI Commands (status/plan/backfill/cutover/drop) | packages/cli/src/commands/encrypt/status.ts, plan.ts, backfill.ts, cutover.ts, drop.ts | status merges manifest/EQL/runtime/physical columns; plan diffs intent vs runtime; backfill performs dual-write precondition, manifest upsert, chunked resumable backfill with signal handling and --force; cutover runs transactional rename + config promotion and optional Proxy reload and scaffolds Drizzle migration; drop generates the DROP COLUMN migration file and records a dropped event. |
| Drizzle Integration Helper | packages/cli/src/commands/encrypt/drizzle-helper.ts | scaffoldDrizzleMigration runs drizzle-kit generate --custom --name and overwrites latest matching migration file with provided SQL. |
| DB Install / Push / Activate | packages/cli/src/commands/db/install.ts, packages/cli/src/commands/db/push.ts, packages/cli/src/commands/db/activate.ts | db install attempts to install cs_migrations; generated Drizzle migrations now include MIGRATIONS_SCHEMA_SQL; db push writes active on first push or inserts pending (transactional) on subsequent pushes; new db activate promotes pending → active inside a transaction. |
| Init / UX Adjustments | packages/cli/src/commands/init/* | Init no longer introspects DB for schemas; writes placeholder encryption client and allows empty schemas in context; prompt rewritten to orient-and-route with skill index and bundler-exclusion guidance. |
| Docs & Skills | docs/plans/encryption-migrations.md, packages/migrate/README.md, skills/* | Design plan, package README, and skill docs describing lifecycle, CLI usage, invariants, bundler-exclusion guidance, and worker example for runBackfill. |
| Examples | examples/sequelize-*.ts, examples smoke package | Sequelize integration and smoke examples demonstrating type parsing, model hooks for encrypt/decrypt, and encrypted query helpers. |
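
As an illustration of what the SQL helpers in the table are for, identifier quoting in PostgreSQL wraps a name in double quotes and doubles any embedded double quote. The bodies below are a guess at the general technique, not the code in packages/migrate/src/sql.ts, and this `qualifyTable` naively splits on dots, so it would mishandle identifiers that themselves contain a dot.

```typescript
// Quote one SQL identifier PostgreSQL-style: wrap it in double quotes and
// double any embedded double quote so it cannot terminate the identifier early.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`
}

// Hypothetical: quote each part of an optionally schema-qualified table name.
// Naive about dots inside identifiers; shown only to illustrate the idea.
function qualifyTable(table: string): string {
  return table.split('.').map(quoteIdent).join('.')
}
```

With this sketch, `qualifyTable('public.users')` yields `"public"."users"`, and a hostile name such as `x"; DROP TABLE users; --` stays inside a single quoted identifier.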

Sequence Diagram(s)

sequenceDiagram
    participant CLI
    participant DB
    participant EncryptionClient
    participant Proxy
    CLI->>DB: read manifest (.cipherstash/migrations.json) & cs_migrations state
    CLI->>EncryptionClient: import dynamic client (loadEncryptionContext)
    CLI->>DB: runBackfill -> countUnencrypted / fetchUnencryptedPage
    CLI->>EncryptionClient: bulkEncryptModels(chunk)
    EncryptionClient-->>CLI: encrypted payloads
    CLI->>DB: BEGIN + writeEncryptedChunk + appendEvent(checkpoint) + COMMIT
    CLI->>DB: appendEvent(backfilled)
    CLI->>DB: cutover -> renameEncryptedColumns() (EQL)
    CLI->>DB: migrateConfig() + activateConfig()
    DB->>Proxy: (optionally) reload config via Proxy URL
    Proxy-->>CLI: reload result (warning on failure)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • calvinbrewer
  • auxesis

"🐰
I hopped through schemas, events, and code,
Wove checkpoints, renames, and a safe backflow.
Dual-writes confirmed, chunks stitched tight—
Now plaintext sleeps and ciphertext takes flight."

@coderdan coderdan force-pushed the encryption-migrations branch from 700009a to 90519c2 Compare May 4, 2026 02:00
coderdan added a commit that referenced this pull request May 4, 2026
Two doc updates in support of #357 now that the rulebook package is
gone:

- `docs/plans/encryption-migrations.md`: drop "rulebook" references
  (5 of them) and the stale `packages/cli/src/commands/wizard/lib`
  paths. Re-point the agent-handoff bits at the post-#395
  architecture: Claude / Codex / AGENTS.md handoffs from
  `init/steps/handoff-*.ts`, with the integration skill installed by
  init providing the per-stack guidance. Repoint
  `introspectDatabase` to its current home in `init/lib/introspect.ts`.

- `/skills/stash-cli/SKILL.md`: add an `encrypt` section documenting
  every subcommand (`status`, `plan`, `advance`, `backfill`,
  `cutover`, `drop`) with flags, examples, and a one-line note on
  runner-prefix substitution so the docs are not pinned to npm.

- `/skills/stash-encryption/SKILL.md`: add a "Column Migration
  Lifecycle" section covering the six-phase model
  (schema-added → dual-writing → backfilling → backfilled →
  cut-over → dropped), the three-source state model
  (`migrations.json` / `eql_v2_configuration` / `cs_migrations`), the
  CLI sequence, and the library `runBackfill` shape. Agents reading
  this skill now have the migration vocabulary they need.

No CLI behaviour changes. Buckets 3+ from the audit (advance handoff
integration, runner-aware help in encrypt commands, setup-prompt
recommending `stash encrypt`, AGENTS-doctrine pointing at the CLI
path) deferred until the encrypt-step UX has been reviewed.
@coderdan coderdan force-pushed the encryption-migrations branch from 4989081 to 671c2e0 Compare May 4, 2026 07:49
coderdan added a commit that referenced this pull request May 4, 2026
The post-install panel still recommended `stash wizard` as the
headline path and showed a hand-rolled `client.encryptModel(record,
table).run()` snippet — both stale post-#395 and post-#357.

Replace with brief guidance that bridges install → agent handoff:
two canonical "ask your agent X" phrasings (one per real path,
migrate-existing vs add-new), a short note that the agent will do
the schema edits and run the lifecycle commands, and a pointer at
the skills + public docs.

Same panel runs from any `db install` invocation — including the
one init triggers in install-eql — so the new copy makes sense
both during init's handoff and when `db install` is run
standalone (where "your agent" can be any agent the user has
open, or someone reading the lifecycle commands directly).
@coderdan coderdan force-pushed the encryption-migrations branch from 22d5e3d to 194875a Compare May 4, 2026 12:42
@coderdan coderdan marked this pull request as ready for review May 4, 2026 12:56
@coderdan coderdan requested a review from a team as a code owner May 4, 2026 12:56

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/cli/src/commands/init/steps/build-schema.ts (1)

97-113: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't wipe existing schema context on the "keep existing file" path.

When keepExisting is true, this still forces nextState.schemas = [], and writeBaselineContextFile() immediately persists that. Because write-context.ts writes state.schemas ?? [] into .cipherstash/context.json, re-running stash init in a repo that already has encrypted tables will replace the existing schema list with an empty one even though the user chose to keep the current client. Preserve the previous schemas on this branch, or skip the baseline write unless you actually generated a new placeholder.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/cli/src/commands/init/steps/build-schema.ts` around lines 97 - 113,
The current branch always sets nextState.schemas = [] and immediately calls
writeBaselineContextFile(nextState, cwd, envKeys), which wipes existing schema
context when keepExisting is true; change the logic so that when keepExisting is
true you preserve the prior schemas (e.g. nextState.schemas = state.schemas) or
skip calling writeBaselineContextFile entirely unless a new placeholder/schema
was generated; update the code paths around nextState, schemas, keepExisting and
the writeBaselineContextFile call so re-running stash init does not overwrite
existing .cipherstash/context.json with an empty list.
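
A minimal sketch of the first option suggested above (preserve the prior schemas when the user keeps the existing file). The `InitState` shape and `schemasToPersist` helper are hypothetical and only stand in for the real state handling in build-schema.ts.

```typescript
// Hypothetical state shape; the real init state has more fields.
interface InitState {
  schemas?: unknown[]
}

// Decide which schema list to persist into .cipherstash/context.json.
// keepExisting = true: carry the previous list forward instead of wiping it.
// keepExisting = false: a fresh placeholder was generated, so start empty.
function schemasToPersist(prev: InitState, keepExisting: boolean): unknown[] {
  return keepExisting ? (prev.schemas ?? []) : []
}
```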
🧹 Nitpick comments (2)
packages/migrate/src/__tests__/manifest.test.ts (1)

52-77: ⚡ Quick win

Exercise the actual defaults in this test.

This fixture still provides castAs, indexes, and targetPhase, so the test would pass even if the .default(...) calls disappeared. Omit those keys and assert that the round-tripped manifest fills in 'text', [], and 'cut-over'.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/migrate/src/__tests__/manifest.test.ts` around lines 52 - 77, Update
the test in manifest.test.ts so it actually verifies defaulting behavior: in the
manifest passed to writeManifest remove the per-column keys castAs, indexes, and
targetPhase from the users column fixture (so only column: 'email' remains),
then after readManifest assert that read?.tables.users?.[0]?.castAs === 'text',
read?.tables.users?.[0]?.indexes equals [] and
read?.tables.users?.[0]?.targetPhase === 'cut-over'; keep using the existing
writeManifest and readManifest helpers and the same tmp setup/teardown.
packages/cli/src/bin/stash.ts (1)

243-316: ⚡ Quick win

encrypt subcommands can't be targeted with --database-url

runEncryptCommand doesn't plumb values['database-url'] through to the individual command implementations, unlike runDbCommand. Developers can only target a non-default DB via the environment variable, which is inconsistent with the db command group.

♻️ Proposed fix
 async function runEncryptCommand(
   sub: string | undefined,
   flags: Record<string, boolean>,
   values: Record<string, string>,
 ) {
+  const databaseUrl = values['database-url']
   switch (sub) {
     case 'status': {
       const { statusCommand } = await requireStack(
         () => import('../commands/encrypt/status.js'),
       )
-      await statusCommand()
+      await statusCommand({ databaseUrl })
       break
     }
     // ...repeat for plan, backfill, cutover, drop
   }
 }

Each encrypt command would need its options interface extended with databaseUrl?: string and pass it into loadStashConfig() / new pg.Client(...) calls.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/cli/src/bin/stash.ts` around lines 243 - 316, runEncryptCommand
currently ignores values['database-url'], so the encrypt subcommands
(statusCommand, planCommand, backfillCommand, cutoverCommand, dropCommand)
cannot target a non-default DB; update each command's options interface to
accept databaseUrl?: string and pass values['database-url'] into the respective
calls from runEncryptCommand (e.g., when invoking backfillCommand,
cutoverCommand, dropCommand, statusCommand, planCommand) and ensure the
implementations forward that databaseUrl into loadStashConfig() / new
pg.Client(...) where the DB connection is created.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/encryption-migrations.md`:
- Around line 134-149: Update the “stash encrypt cutover” plan to match the
shipped flow: in the cutover transaction, after calling
eql_v2.rename_encrypted_columns() also create/promote the pending config and
call migrate_config() followed by activate_config() (all within the same
transaction), then record the cut_over event; retain the proxy refresh step
(eql_v2.reload_config()) if a Proxy URL is configured. Reference the existing
symbols eql_v2.rename_encrypted_columns(), migrate_config(), activate_config(),
eql_v2.reload_config(), and the cut_over event so the doc matches the actual
implementation and can be used as a faithful implementation reference.

In `@packages/cli/src/commands/db/push.ts`:
- Around line 86-124: Queries that reference the table eql_v2_configuration must
be schema-qualified to avoid search_path issues; update every SQL string that
uses "eql_v2_configuration" (the SELECT EXISTS query, the INSERT for 'active',
the INSERT for 'pending' inside the transaction, and the calls referenced in
activate.ts and migrate/eql.ts) to use public.eql_v2_configuration instead,
keeping the same parameters and surrounding logic (e.g., in the block that calls
client.query, and in the function discardPendingConfig) so the exact same table
is targeted regardless of the DB search_path.

In `@packages/cli/src/commands/encrypt/backfill.ts`:
- Around line 114-123: The signal handlers and AbortController creation must be
included in the same try/finally that closes the pool so they are cleaned up if
pool.connect() throws; move the AbortController, onSignal definition and
process.on('SIGINT'|'SIGTERM') registrations to occur after entering the try (or
move the try to start before creating controller/handlers) so that the finally
block always runs to call pool.end() and remove the handlers; ensure cleanup in
finally removes the handlers (process.removeListener) and closes db connection
if acquired (db.release()/db.end()) and calls pool.end(), referencing
controller, onSignal, process.on, pool.connect(), db, and pool.end() in your
changes.
- Around line 251-253: The catch block in the backfill command currently logs
raw exception text via p.log.error(error instanceof Error ? error.message :
'Backfill failed.'), which can leak plaintext; change the catch to emit a
generic/redacted error message (e.g., "Backfill failed — sensitive details
redacted") instead of error.message, and if needed capture the full Error object
at a non-CLI debug level or send it to a secure internal logger. Update the
catch surrounding the backfill flow (the block using p.log.error and
process.exit(1)) to avoid printing error.message while preserving exit behavior.

In `@packages/cli/src/commands/encrypt/context.ts`:
- Around line 150-164: The requireTable function currently only looks up
ctx.tables by the raw tableName key, which fails for schema-qualified inputs
like "public.users"; update requireTable to handle schema.table by first trying
ctx.tables.get(tableName) then, if not found and tableName contains a dot,
extract the segment after the last dot and attempt to find a table whose
EncryptedTable.tableName equals that segment (e.g.,
Array.from(ctx.tables.values()).find(t => t.tableName === shortName)); fall back
to the existing error message/process.exit if still not found. Ensure you
reference EncryptionContext.tables and EncryptedTable.tableName when
implementing the secondary lookup.

In `@packages/cli/src/commands/encrypt/cutover.ts`:
- Around line 145-154: The proxy reload error should not abort the
already-committed cutover: wrap the call to reloadConfig(proxy) in its own
try/catch so that failures only log a warning instead of bubbling to the outer
catch and causing exit(1); locate the block that constructs proxy from proxyUrl
and replace the single try/finally around connect/reloadConfig/end with a try {
await proxy.connect(); try { await reloadConfig(proxy); p.log.success('Proxy
config reloaded.'); } catch (err) { p.log.warn(`Proxy config reload failed:
${err}`); } } finally { await proxy.end(); } so connection teardown still runs
but reloadConfig errors are downgraded to warnings.
- Around line 180-196: buildRenameMigrationSql currently treats a possibly
schema-qualified table string like "public.users" as a single quoted identifier,
which breaks both the ALTER TABLE and the information_schema probe; update
buildRenameMigrationSql to parse the incoming table string (from options.table)
into schema and table parts (if there is a dot, split into schema and table;
otherwise treat schema as current/default), then build the IF EXISTS check using
information_schema.table_schema = '<schema>' AND information_schema.table_name =
'<table>' and produce ALTER TABLE using a schema-qualified identifier
("<schema>"."<table>") with proper quoting for each identifier; ensure the
column rename logic is unchanged but uses the parsed table identifier so the
generated SQL works for both "table" and "schema.table" inputs.
- Around line 71-97: The preflight currently only checks for any pending EQL
config but then runs renameEncryptedColumns, migrateConfig, and activateConfig
which affect the entire pending configuration; change the preflight to fetch the
pending configuration rows and ensure the pending set only includes the column
you intend to cut over (or else abort). Specifically, replace the EXISTS query
with a SELECT of pending config rows from eql_v2_configuration
(state='pending'), validate that the returned rows count is 1 and that the row's
table/column match options.table and options.column (or, if you allow multiple,
ensure all rows correspond to the same table/column and are safe to cut over),
and if the check fails call p.log.error with the explanatory message and
process.exit(1) before calling renameEncryptedColumns, migrateConfig, or
activateConfig.
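
For the schema-qualified table comment above, one way to split and quote the identifier is sketched below. The helper names (`quoteIdent`, `parseTableRef`, `qualifiedName`) are hypothetical, and falling back to `public` when no schema is given is an assumption; the review only says "current/default".

```typescript
// Quote a PostgreSQL identifier: wrap it in double quotes and double any
// embedded double quote.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Split "schema.table" at the first dot. The "public" fallback is an
// assumption made for this sketch.
function parseTableRef(
  table: string,
  defaultSchema = "public",
): { schema: string; table: string } {
  const dot = table.indexOf(".");
  if (dot === -1) return { schema: defaultSchema, table };
  return { schema: table.slice(0, dot), table: table.slice(dot + 1) };
}

// Produce a schema-qualified identifier with each part quoted separately.
function qualifiedName(table: string): string {
  const ref = parseTableRef(table);
  return `${quoteIdent(ref.schema)}.${quoteIdent(ref.table)}`;
}
```

The parsed `schema` and `table` parts can also feed the `information_schema` probe as separate bound values. Identifiers that themselves contain a dot are not handled by this sketch.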

In `@packages/cli/src/commands/encrypt/drizzle-helper.ts`:
- Around line 42-58: The execSync call in drizzle-helper.ts that runs `npx
drizzle-kit generate --custom --name=${opts.name}` is vulnerable to shell
injection via opts.name; replace execSync with child_process.spawnSync to pass
arguments as an array (e.g.
['npx','drizzle-kit','generate','--custom',`--name=${opts.name}`] or better
['npx','drizzle-kit','generate','--custom','--name',opts.name] depending on CLI)
so no shell is used, capture stdout/stderr from the spawnSync result, and
preserve the existing error handling logic (use the spawnSync result.status or
error output to build the thrown Error message similar to the current stderr
extraction).
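
A minimal sketch of the shell-free invocation suggested above. `buildGenerateArgs` and `runNoShell` are hypothetical helper names; the `drizzle-kit` flags are the ones quoted in the comment.

```typescript
import { spawnSync } from "node:child_process";

// Build the argument vector for drizzle-kit. Each array element reaches the
// child process as a single argv entry, so shell metacharacters in `name`
// are never interpreted.
function buildGenerateArgs(name: string): string[] {
  return ["drizzle-kit", "generate", "--custom", `--name=${name}`];
}

// Run a command without a shell; throw with stderr (or the exit status) on
// failure and return stdout on success.
function runNoShell(command: string, args: string[]): string {
  const result = spawnSync(command, args, { encoding: "utf8" });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    const stderr = (result.stderr ?? "").trim();
    throw new Error(stderr || `${command} exited with status ${result.status}`);
  }
  return result.stdout;
}
```

Usage would look like `runNoShell("npx", buildGenerateArgs(opts.name))`. On Windows, `npx` resolves to a `.cmd` shim that may need separate handling when no shell is involved.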

In `@packages/cli/src/commands/encrypt/plan.ts`:
- Around line 52-57: The catch block currently calls process.exit(1) which
prevents the async finally from running (so client.end() never executes); change
the flow so the exit is deferred until after cleanup: in the try/catch
surrounding the logic in the function (the block that uses p.log.error and
client), remove the direct process.exit(1) call from the catch, instead set an
error flag or capture the exit code/message in a local variable inside the catch
(use the same error handling with p.log.error(error instanceof Error ?
error.message : 'Plan failed.')), let execution continue to the finally where
await client.end() is called, and after the finally, if the error flag/exit code
is set call process.exit(1) (or throw the error) so the client is always
gracefully disconnected; reference the existing catch that logs and the finally
that awaits client.end().
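
The deferred-exit flow described above could look roughly like this. `runThenCleanup` and the `Disconnectable` interface are hypothetical names for illustration; the real command works with a `pg` client and `p.log.error`.

```typescript
// Hypothetical minimal client shape; only the async end() method matters here.
interface Disconnectable {
  end(): Promise<void>;
}

// Run `work`, always disconnect the client, and report an exit code instead
// of calling process.exit inside the catch block.
async function runThenCleanup(
  client: Disconnectable,
  work: () => Promise<void>,
  logError: (message: string) => void,
): Promise<number> {
  let exitCode = 0;
  try {
    await work();
  } catch (error) {
    logError(error instanceof Error ? error.message : "Plan failed.");
    exitCode = 1;
  } finally {
    // Runs on both success and failure because nothing has exited yet.
    await client.end();
  }
  return exitCode;
}
```

The caller then exits only after cleanup has finished, for example `const code = await runThenCleanup(client, work, p.log.error); if (code !== 0) process.exit(code);`.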

In `@packages/cli/src/commands/encrypt/status.ts`:
- Around line 70-83: The current split of keys using key.split('.') in the loop
over stateMap yields incorrect table/column pairs for schema-qualified table
names (e.g., "migrate_test.users.email"); change the split logic to locate the
last dot (e.g., lastIndexOf('.')) and set tableName = key.slice(0, lastDot) and
columnName = key.slice(lastDot + 1) before calling renderRow so
eqlConfig.get(key) and physicalCols.get(tableName) resolve correctly; update the
block that iterates stateMap/seen/rows.push (where renderRow, eqlConfig,
physicalCols are used) to use this last-dot split approach.
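
The last-dot split can be sketched as a small helper. `splitColumnKey` is a hypothetical name; throwing on a key without a dot is an assumption about how malformed keys should be treated.

```typescript
// Split "<table>.<column>" at the LAST dot so a schema-qualified table name
// such as "migrate_test.users" stays intact.
function splitColumnKey(key: string): { tableName: string; columnName: string } {
  const lastDot = key.lastIndexOf(".");
  if (lastDot === -1) {
    throw new Error(`Expected "<table>.<column>", got "${key}"`);
  }
  return {
    tableName: key.slice(0, lastDot),
    columnName: key.slice(lastDot + 1),
  };
}
```

With `key.split('.')`, the same input would yield three segments and the first two would be read as table and column.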

In `@packages/cli/src/commands/init/lib/setup-prompt.ts`:
- Line 223: The prompt's step 5 incorrectly claims "no read-path code change"
after `encrypt cutover` — update the text to state that after cutover `<col>`
contains ciphertext and application read paths must be updated to decrypt values
(e.g., using `decryptModel()` as shown in `skills/stash-drizzle/SKILL.md`/Phase
4), and insert a new step between the current steps 5 and 7 that instructs
engineers to update their read logic (example reference `rows[0].email`) and
verify reads return plaintext before dropping the plaintext column; keep
`encrypt cutover` and the rename behavior but make clear the cutover is the
point that breaks reads until `decryptModel()` (or equivalent) is applied.

In `@packages/migrate/src/__tests__/backfill.integration.test.ts`:
- Around line 1-27: Add an import for dotenv/config at the very top of this test
file so PG_TEST_URL is loaded into process.env before PG_URL is read;
specifically, insert "import 'dotenv/config'" as the first import line above the
existing imports (so the PG_URL constant and any calls that reference
process.env.PG_TEST_URL, such as the PG_URL declaration used by the tests that
call runBackfill/installMigrationsSchema/latestByColumn/progress, see the PG_URL
constant) to ensure environment variables are available during test execution.

In `@packages/migrate/src/backfill.ts`:
- Around line 377-386: The thrown Error in backfill.ts leaks plaintext by
including the `preview` of `value` when `isEncryptedPayload(value)` is false;
remove the JSON.stringify preview and any direct inclusion of `value` in the
message and instead throw a safe, redacted message that references only
non-sensitive identifiers (e.g. `options.schemaColumnKey` and the `pk` from
`row.__pk` or `page.rows[i]?.pk`) and instructs to verify the
schema/--schema-column-key; also drop the `preview` variable and ensure any code
that persists migration error details (the migration error/event path that
records to cs_migrations.details) will no longer receive plaintext from this
exception.
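
A redacted error along the lines suggested above might be built like this. `buildRedactedError` is a hypothetical helper and the message wording is illustrative, not the text used in `backfill.ts`.

```typescript
// Build an error that names only non-sensitive identifiers (the schema column
// key and the row's primary key) and never includes the row value itself.
function buildRedactedError(schemaColumnKey: string, pk: unknown): Error {
  return new Error(
    `Encryption returned a non-encrypted value for schema column key ` +
      `"${schemaColumnKey}" (primary key: ${String(pk)}). ` +
      `Verify the schema and --schema-column-key.`,
  );
}
```

Because the value is never passed to the helper, it cannot end up in anything that persists the error message.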

In `@packages/migrate/src/eql.ts`:
- Around line 148-157: countEncryptedWithActiveConfig currently converts a
BIGINT DB result to a JavaScript number causing silent precision loss for values
> Number.MAX_SAFE_INTEGER; update the function signature and implementation to
preserve precision (return a string or bigint) by reading result.rows[0]?.count
without Number(...) and returning it as a string/bigint, or perform an explicit
range check on Number(result.rows[0]?.count) and throw/handle when it exceeds
Number.MAX_SAFE_INTEGER; refer to countEncryptedWithActiveConfig and the query
result access result.rows[0]?.count to locate where to change the conversion and
adjust the Promise<number> return type accordingly (e.g., Promise<string> or
Promise<bigint>).
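
Both options from the comment above, sketched under the assumption that the driver hands back the BIGINT count as a string. `parseCount` and `toSafeNumber` are hypothetical helper names.

```typescript
// Keep full precision by converting the raw count to a bigint. A missing row
// is treated as zero (an assumption for this sketch).
function parseCount(raw: string | number | bigint | undefined): bigint {
  if (raw === undefined) return BigInt(0);
  return BigInt(raw);
}

// Alternative: keep a number return type but refuse values that a JavaScript
// number cannot represent exactly.
function toSafeNumber(count: bigint): number {
  if (count > BigInt(Number.MAX_SAFE_INTEGER)) {
    throw new RangeError(`Count ${count.toString()} exceeds Number.MAX_SAFE_INTEGER`);
  }
  return Number(count);
}
```

Returning `bigint` changes the function's public type, so callers that do arithmetic on the count would need to be updated; the range-check variant keeps `Promise<number>` intact.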

In `@packages/migrate/src/manifest.ts`:
- Around line 20-26: The castAs schema currently allows any string; tighten it
by replacing castAs: z.string().default('text') with an explicit enum validator
listing the supported EQL types (e.g.,
"text","int","small_int","big_int","real","double","boolean","date","jsonb","json","float","decimal","timestamp")
and keep the default as "text"; update the manifest schema in
packages/migrate/src/manifest.ts (the castAs field) to use z.enum(...) (or an
equivalent zod union) so readManifest() will reject invalid values up front and
adjust any places that import the manifest type if their types change.

In `@skills/stash-drizzle/SKILL.md`:
- Line 482: The doc uses the wrong helper name "createProtectOperators" in the
guidance; update that reference to the correct function name
"createEncryptionOperators" (and any other ambiguous mentions in the same
document) so examples and the sentence about encrypted query operators use
createEncryptionOperators (e.g., where it lists eq, like, gte) to match the rest
of the file and existing documentation.

In `@skills/stash-encryption/SKILL.md`:
- Around line 664-677: Update the runBackfill example to use the actual
parameter names and required values: replace table -> tableName, column ->
provide schemaColumnKey, plaintextColumn and encryptedColumn, and client ->
encryptionClient; also add missing tableSchema (e.g., usersTable) and pkColumn
(e.g., 'id'); ensure the Encryption(...) call remains but pass that result as
encryptionClient to runBackfill and reference the users schema/table symbols
(users and usersTable) when filling tableSchema and schemaColumnKey.

---

Outside diff comments:
In `@packages/cli/src/commands/init/steps/build-schema.ts`:
- Around line 97-113: The current branch always sets nextState.schemas = [] and
immediately calls writeBaselineContextFile(nextState, cwd, envKeys), which wipes
existing schema context when keepExisting is true; change the logic so that when
keepExisting is true you preserve the prior schemas (e.g. nextState.schemas =
state.schemas) or skip calling writeBaselineContextFile entirely unless a new
placeholder/schema was generated; update the code paths around nextState,
schemas, keepExisting and the writeBaselineContextFile call so re-running stash
init does not overwrite existing .cipherstash/context.json with an empty list.

---

Nitpick comments:
In `@packages/cli/src/bin/stash.ts`:
- Around line 243-316: runEncryptCommand currently ignores
values['database-url'], so the encrypt subcommands (statusCommand, planCommand,
backfillCommand, cutoverCommand, dropCommand) cannot target a non-default DB;
update each command's options interface to accept databaseUrl?: string and pass
values['database-url'] into the respective calls from runEncryptCommand (e.g.,
when invoking backfillCommand, cutoverCommand, dropCommand, statusCommand,
planCommand) and ensure the implementations forward that databaseUrl into
loadStashConfig() / new pg.Client(...) where the DB connection is created.

In `@packages/migrate/src/__tests__/manifest.test.ts`:
- Around line 52-77: Update the test in manifest.test.ts so it actually verifies
defaulting behavior: in the manifest passed to writeManifest remove the
per-column keys castAs, indexes, and targetPhase from the users column fixture
(so only column: 'email' remains), then after readManifest assert that
read?.tables.users?.[0]?.castAs === 'text', read?.tables.users?.[0]?.indexes
equals [] and read?.tables.users?.[0]?.targetPhase === 'cut-over'; keep using
the existing writeManifest and readManifest helpers and the same tmp
setup/teardown.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 679ed635-dd44-45b0-9341-2c80183741c5

📥 Commits

Reviewing files that changed from the base of the PR and between 538d5b1 and 194875a.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (41)
  • .changeset/encryption-migrations.md
  • docs/plans/encryption-migrations.md
  • packages/cli/package.json
  • packages/cli/scripts/e2e-encrypt.sh
  • packages/cli/scripts/fixtures/seed-users.sql
  • packages/cli/src/bin/stash.ts
  • packages/cli/src/commands/db/activate.ts
  • packages/cli/src/commands/db/install.ts
  • packages/cli/src/commands/db/push.ts
  • packages/cli/src/commands/encrypt/backfill.ts
  • packages/cli/src/commands/encrypt/context.ts
  • packages/cli/src/commands/encrypt/cutover.ts
  • packages/cli/src/commands/encrypt/drizzle-helper.ts
  • packages/cli/src/commands/encrypt/drop.ts
  • packages/cli/src/commands/encrypt/plan.ts
  • packages/cli/src/commands/encrypt/status.ts
  • packages/cli/src/commands/init/doctrine/AGENTS-doctrine.md
  • packages/cli/src/commands/init/lib/__tests__/setup-prompt.test.ts
  • packages/cli/src/commands/init/lib/setup-prompt.ts
  • packages/cli/src/commands/init/lib/write-context.ts
  • packages/cli/src/commands/init/steps/build-schema.ts
  • packages/cli/src/commands/init/utils.ts
  • packages/migrate/README.md
  • packages/migrate/package.json
  • packages/migrate/src/__tests__/backfill.integration.test.ts
  • packages/migrate/src/__tests__/manifest.test.ts
  • packages/migrate/src/__tests__/sql.test.ts
  • packages/migrate/src/__tests__/state.test.ts
  • packages/migrate/src/backfill.ts
  • packages/migrate/src/cursor.ts
  • packages/migrate/src/eql.ts
  • packages/migrate/src/index.ts
  • packages/migrate/src/install.ts
  • packages/migrate/src/manifest.ts
  • packages/migrate/src/sql.ts
  • packages/migrate/src/state.ts
  • packages/migrate/tsconfig.json
  • packages/migrate/tsup.config.ts
  • skills/stash-cli/SKILL.md
  • skills/stash-drizzle/SKILL.md
  • skills/stash-encryption/SKILL.md

coderdan added 18 commits May 4, 2026 23:27
Adds first-class support for migrating existing plaintext columns to
`eql_v2_encrypted` in production databases — the flow that currently has
no good answer in either Stack or Proxy land.

Per-column lifecycle:
  schema-added → dual-writing → backfilling → backfilled → cut-over → dropped

State lives in three layers so Proxy interop stays clean:
  - `.cipherstash/migrations.json` — repo-side intent (indexes, target phase)
  - `eql_v2_configuration` — EQL intent, unchanged; Proxy reads as before
  - `cipherstash.cs_migrations` — NEW append-only event log for per-column
    runtime state (phase, backfill cursor, rows processed). Installed by
    `stash db install`. Designed to upstream into EQL as `eql_v2_migrations`
    in a later release so Stack and Proxy own it jointly.

New CLI commands under `stash encrypt`:
  - status    per-column table: phase, EQL state, indexes, progress, drift
  - plan      diff intent vs observed
  - advance   record a phase transition (dual-writing is user-declared)
  - backfill  chunked, resumable, idempotent; txn-per-chunk with checkpoint;
              SIGINT-safe; uses user's encryption client via jiti dynamic
              import; auto-detects single-column PK
  - cutover   `eql_v2.rename_encrypted_columns()` in a txn; optional Proxy
              refresh via CIPHERSTASH_PROXY_URL
  - drop      generates a DROP COLUMN <col>_plaintext migration file

New package `@cipherstash/migrate` exposes the same primitives as a library
(`runBackfill`, `appendEvent`, `progress`, `renameEncryptedColumns`, …) so
users can embed backfill in their own workers/cron without the CLI process.

Design doc: docs/plans/encryption-migrations.md
Manual e2e script: packages/cli/scripts/e2e-encrypt.sh

Phase 1 scope: Protect/Stack client-side backfill. Proxy-mode backfill
(SQL-through-Proxy using the same cs_migrations state) is Phase 2.
Expand TypeDoc across the @cipherstash/migrate public API and the stash
encrypt command option interfaces. No behaviour change — docs only.

Highlights:
  - BackfillOptions: each field now explains the three separate name
    spaces (physical table/column vs. schema column key) and common
    defaults (chunkSize = 1000, encryptedColumn = <col>_encrypted).
  - BackfillCommandOptions: CLI flag semantics with an example of when
    schemaColumnKey needs to differ from column.
  - MigrationEvent / MigrationPhase: describes the event-vs-phase
    mapping and the backfill_started/backfill_checkpoint distinction.
  - EQL wrappers: explain that renameEncryptedColumns is the cut-over
    primitive, and that reloadConfig must run through Proxy.
  - installMigrationsSchema: documents why cs_migrations is kept
    separate from eql_v2_configuration (CHECK constraint, global
    state enum, write-frequency mismatch).
  - Manifest: field-level documentation of cast_as values, index kinds,
    and how targetPhase interacts with advance/plan/drop.
  - Module-level @packageDocumentation in src/index.ts for TypeDoc's
    package overview.
…stgres

Adds packages/migrate/src/__tests__/backfill.integration.test.ts —
gated on PG_TEST_URL so it skips in CI without a Postgres available.

Covers the full backfill state machine against a real transactional
Postgres using a stub encryption client (no CipherStash credentials
required):

  - happy-path completion + correct terminal state event
  - idempotency on re-run (row-level hash unchanged; zero new writes)
  - resume from checkpoint after mid-run AbortSignal
  - error event recorded + exception rethrown on encrypt failure
  - pre-encrypted rows preserved (the `encrypted IS NULL` guard)
  - empty-table fast path
  - event log ordering (backfill_started → checkpoint* → backfilled)
  - latestByColumn / progress readbacks

Run locally:
  cd local && docker compose up -d
  PG_TEST_URL=postgres://cipherstash:password@localhost:5432/cipherstash \
    pnpm -F @cipherstash/migrate test backfill.integration
…ation

`stash db install --drizzle` now appends the cipherstash.cs_migrations
schema DDL to the generated EQL migration file, so `drizzle-kit migrate`
rolls the tracking table out to every environment alongside EQL itself.

Before this change the drizzle path only wrote EQL SQL; the cs_migrations
schema was installed directly against the connected DB (in the non-drizzle
branch) and never appeared in migration history. That meant prod deploys
running from drizzle migrations alone got EQL but no cs_migrations, and
`stash encrypt ...` would fail with "schema cipherstash does not exist"
until someone ran an out-of-band install.

Also exports MIGRATIONS_SCHEMA_SQL from @cipherstash/migrate so other
consumers can embed the DDL in their own migration pipelines.
…orts

loadEncryptionContext used to require the user's encryption client file
to export an EncryptedTable-shaped object (tableName + build()). Users
following the drizzle pattern typically only export the pgTable and the
initialised client, leaving the extractEncryptionSchema(...) result as
a non-exported const — which the loader couldn't see. Backfill would
then fail with "Table X was not found in the encryption client exports.
Available: (none)".

Now the loader does a second pass over module exports, detects drizzle
pgTables via Symbol.for('drizzle:Name'), dynamic-imports
@cipherstash/stack/drizzle, and calls extractEncryptionSchema() on each
to derive the EncryptedTable on the fly. Silently no-ops if the drizzle
subpath isn't installed (Supabase / generic projects are unaffected).

Manually-exported EncryptedTables still win over auto-derived ones
(the set-if-absent check preserves the explicit export).
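
The detection and set-if-absent behaviour described in this commit message can be sketched as below. `drizzleTableName` and `deriveTables` are hypothetical names, and the sketch stores the raw export where the real loader would store the result of `extractEncryptionSchema()`.

```typescript
// Drizzle tables carry their name under this well-known global symbol.
const DRIZZLE_NAME = Symbol.for("drizzle:Name");

// Return the drizzle table name if `value` looks like a pgTable.
function drizzleTableName(value: unknown): string | undefined {
  if (typeof value !== "object" || value === null) return undefined;
  const name = (value as Record<symbol, unknown>)[DRIZZLE_NAME];
  return typeof name === "string" ? name : undefined;
}

// Second pass over module exports: add auto-detected tables only when no
// explicitly exported table with the same name is already present.
function deriveTables(
  moduleExports: Record<string, unknown>,
  explicit: Map<string, unknown>,
): Map<string, unknown> {
  const merged = new Map(explicit);
  for (const value of Object.values(moduleExports)) {
    const name = drizzleTableName(value);
    if (name !== undefined && !merged.has(name)) merged.set(name, value);
  }
  return merged;
}
```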
Two correctness bugs in the backfill path, diagnosed from a real run
that wrote plaintext values through to the encrypted column:

1) The CLI defaulted `schemaColumnKey` to the plaintext column name
   (`--column`). But under the drizzle convention the EncryptedTable's
   column keys are the *encrypted* column names — because that's what
   the user declared via `encryptedType('foo_encrypted', ...)`. With
   the wrong key, `bulkEncryptModels` saw a model key that didn't
   match any configured encrypted column and returned the models
   unchanged. The runner then wrote the plaintext into the encrypted
   column, which Postgres rendered as `(82.60)`-shaped composite values
   because `eql_v2_encrypted` is a composite type. Default now uses
   the encrypted column name.

2) Added a leak guard inside runBackfill: after bulkEncryptModels
   returns, inspect `data[0][schemaColumnKey]`. Real ciphertext is
   always an object (the EQL envelope with c/k/v fields); if we see
   a primitive, throw with an actionable message that names the key
   the schema should use. Prevents any future schema/key mismatch
   from silently corrupting data — it fails loudly on the first chunk
   before any write commits.

Updated the TypeDoc on BackfillOptions to make the two conventions
(drizzle-extracted vs handwritten encryptedTable) explicit.
… leak guard

Replace the hand-rolled object-shape check in runBackfill with the
canonical isEncryptedPayload helper already exported by @cipherstash/stack.
The helper checks for the actual EQL envelope shape (v, i, and either
c or sv) rather than just `typeof === 'object'`, so it also catches
non-null objects that happen to lack ciphertext fields.

Also validates every row in the returned chunk (not just the first)
and reports the offending primary key in the error message so a user
hitting a partial failure knows which row to look at.

Integration test stubs updated to return valid-shaped payloads
({v, i, c}) so they still exercise the write path under the new guard.
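
To illustrate the shape check this commit describes, here is a stand-in written from the description alone (v, i, and either c or sv). It is not the `isEncryptedPayload` helper from `@cipherstash/stack`, whose exact checks may differ; `looksLikeEncryptedPayload` and `firstUnencryptedIndex` are hypothetical names.

```typescript
// Stand-in structural check: an object with `v`, `i`, and either `c` or `sv`.
function looksLikeEncryptedPayload(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  return "v" in obj && "i" in obj && ("c" in obj || "sv" in obj);
}

// Validate every value in a chunk and return the index of the first one that
// fails the check, or -1 when the whole chunk passes.
function firstUnencryptedIndex(values: unknown[]): number {
  return values.findIndex((value) => !looksLikeEncryptedPayload(value));
}
```

The returned index is what lets the caller look up and report the offending row's primary key.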
…ryption

pg's node driver returns `numeric` as a JS string (to preserve
precision), but an EncryptedTable schema declaring `dataType('number')`
expects a JS number — so bulkEncryptModels errored out with "Cannot
convert String to Float. String values can only be used with Utf8Str".

Fix is split across both packages:

- @cipherstash/migrate: new optional `transformPlaintext` callback on
  BackfillOptions. Invoked on each row's plaintext before it goes into
  the model passed to bulkEncryptModels. Library stays generic; does
  not know anything about schemas.

- @cipherstash/cli: new `buildPlaintextCoercer` inspects
  `tableSchema.build().columns[schemaColumnKey].cast_as` and returns
  an appropriate coercer:
    number / double / real / int / decimal → Number(string)
    bigint / big_int                        → BigInt(string)
    date / timestamp                        → new Date(string)
    boolean                                 → "true"/"false" → boolean
    string / text / json / jsonb / unknown  → identity

Null and undefined are always passed through unchanged.
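
The coercion table above could be implemented roughly as follows. This is a sketch, not the CLI's `buildPlaintextCoercer`: `buildCoercer` and `pickConverter` are hypothetical names, and leaving non-string inputs untouched is an assumption.

```typescript
type Coercer = (value: unknown) => unknown;

// Choose a converter from the cast_as value, following the table above.
function pickConverter(castAs: string): Coercer {
  switch (castAs) {
    case "number":
    case "double":
    case "real":
    case "int":
    case "decimal":
      return (v) => (typeof v === "string" ? Number(v) : v);
    case "bigint":
    case "big_int":
      return (v) => (typeof v === "string" ? BigInt(v) : v);
    case "date":
    case "timestamp":
      return (v) => (typeof v === "string" ? new Date(v) : v);
    case "boolean":
      return (v) => (v === "true" ? true : v === "false" ? false : v);
    default:
      // string / text / json / jsonb / unknown: identity
      return (v) => v;
  }
}

// Wrap the converter so null and undefined always pass through unchanged.
function buildCoercer(castAs: string): Coercer {
  const convert = pickConverter(castAs);
  return (value) => (value === null || value === undefined ? value : convert(value));
}
```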
The backfill "Backfilling x.y → y_enc" log line now also prints the
schema's cast_as value so a user diagnosing a type-coercion issue can
see immediately whether the coercer is reading the right dataType from
the EncryptedTable (vs. falling through to identity).

Refactored buildPlaintextCoercer to return { transform, castAs } so
the caller can log the detected value; behaviour unchanged.
… by protect-ffi

Investigation into "Cannot convert String to Date" for a column with
cast_as: 'date' turned up a genuine protect-ffi 0.21.2 limitation:
its JsPlaintext wire enum has only String/Number/Boolean/JsonB
variants — no JS Date representation. napi-rs serialises JS Date to
ISO string via Date.toJSON, and the Rust side then refuses it because
string values are only valid for Utf8Str columns. The Rust-internal
NaiveDate / Timestamp types exist but have no JS-visible wire format.

Not a tool bug; not fixable here. But running a backfill that will
inevitably fail on the first chunk is a poor UX. Add a pre-flight
check: if the schema declares cast_as 'date' or 'timestamp', print a
warning explaining the FFI limitation and the mitigation (change to
dataType: 'string' / ISO strings) and prompt before continuing.
Accepts --yes-style confirmation via the standard clack confirm UI.
Three doc updates in support of #357 now that the rulebook package is
gone:

- `docs/plans/encryption-migrations.md`: drop "rulebook" references
  (5 of them) and the stale `packages/cli/src/commands/wizard/lib`
  paths. Re-point the agent-handoff bits at the post-#395
  architecture: Claude / Codex / AGENTS.md handoffs from
  `init/steps/handoff-*.ts`, with the integration skill installed by
  init providing the per-stack guidance. Repoint
  `introspectDatabase` to its current home in `init/lib/introspect.ts`.

- `/skills/stash-cli/SKILL.md`: add an `encrypt` section documenting
  every subcommand (`status`, `plan`, `advance`, `backfill`,
  `cutover`, `drop`) with flags, examples, and a one-line note on
  runner-prefix substitution so the docs are not pinned to npm.

- `/skills/stash-encryption/SKILL.md`: add a "Column Migration
  Lifecycle" section covering the six-phase model
  (schema-added → dual-writing → backfilling → backfilled →
  cut-over → dropped), the three-source state model
  (`migrations.json` / `eql_v2_configuration` / `cs_migrations`), the
  CLI sequence, and the library `runBackfill` shape. Agents reading
  this skill now have the migration vocabulary they need.

No CLI behaviour changes. Buckets 3+ from the audit (advance handoff
integration, runner-aware help in encrypt commands, setup-prompt
recommending `stash encrypt`, AGENTS-doctrine pointing at the CLI
path) deferred until the encrypt-step UX has been reviewed.
…-force

The `stash encrypt advance --to <phase>` command was over-modelled: it
only ever did meaningful work for one transition (`dual-writing`), and
the user had to remember to invoke it as a prerequisite to backfill —
easy to miss, and a missed invocation just produced a confusing failure
mode where backfill couldn't tell whether dual-writes were live.

Fold the dual-write confirmation into `backfill` itself. The first run
against a column either prompts the user (interactive) or accepts
`--confirm-dual-writes-deployed` (non-interactive, with a loud warning),
appends the `dual_writing` event to `cs_migrations`, and proceeds. Re-runs
/ resumes are no-ops for the prompt — the bookmark is persisted.

Add `--force` for the recovery path. Drops the `<col>_encrypted IS NULL`
guard from both the SELECT and UPDATE so every plaintext row is
re-encrypted, including ones that already have a (potentially stale)
ciphertext. This handles the "I confirmed dual-writes but they weren't
actually live" failure mode where rows landed in plaintext only mid-
backfill, or where the application updated plaintext without dual-
writing the encrypted twin. Not destructive — re-encrypting a correctly-
encrypted value just rewrites the same payload — but expensive enough
that backfill prompts for explicit confirmation when --force is set.
The recovery run is recorded with `details.force = true` in
cs_migrations so audit-log queries can spot it.

Library changes (`@cipherstash/migrate`):
- `KeysetPageOptions.force` and `countUnencrypted(..., force)` drop the
  encrypted-IS-NULL clause from the WHERE.
- `BackfillOptions.force` plumbs the flag through `runBackfill` and the
  chunk writer; UPDATE WHERE drops `t.<enc> IS NULL` when force.
- The `backfill_started` event includes `force: true` in details.
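
How the `force` flag changes the row filter can be sketched as below. The function name, the option names, the keyset condition on `$1`, and the `IS NOT NULL` check on the plaintext column are all assumptions for illustration; only the dropping of the encrypted-IS-NULL clause comes from the commit message.

```typescript
// Quote an identifier for PostgreSQL.
function ident(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

interface PageFilterOptions {
  plaintextColumn: string;
  encryptedColumn: string;
  pkColumn: string;
  force: boolean;
}

// Build the WHERE clause for one keyset page. Without force, only rows whose
// encrypted column is still NULL are selected; with force, that guard is
// omitted so every plaintext row is re-encrypted.
function buildPageFilter(opts: PageFilterOptions): string {
  const clauses = [
    `${ident(opts.pkColumn)} > $1`,
    `${ident(opts.plaintextColumn)} IS NOT NULL`,
  ];
  if (!opts.force) {
    clauses.push(`${ident(opts.encryptedColumn)} IS NULL`);
  }
  return clauses.join(" AND ");
}
```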

CLI changes:
- Delete `commands/encrypt/advance.ts`.
- `commands/encrypt/backfill.ts` gains `confirmDualWritesDeployed`,
  `force`, and an `ensureDualWritesDeployed` guard that handles all
  three paths (already advanced, interactive prompt, non-interactive
  flag) and appends the `dual_writing` event on first acceptance.
- `bin/stash.ts` drops the `advance` route, parses
  `--confirm-dual-writes-deployed` and `--force` for `backfill`,
  removes `encrypt advance` from the help banner.
- `commands/encrypt/{status,plan}.ts` prose hints updated to point at
  `backfill` instead of `advance`.

Docs / skills:
- `docs/plans/encryption-migrations.md` — drop the standalone advance
  section; add a "Dual-write confirmation, folded into backfill"
  section explaining the rationale; update the verification flow.
- `skills/stash-cli/SKILL.md` — drop the `encrypt advance` subsection;
  rewrite the `encrypt backfill` subsection with the new flags and the
  dual-write precondition explanation.
- `skills/stash-encryption/SKILL.md` — update the lifecycle CLI
  sequence to fold phases 2 + 3 into one `backfill` command; add the
  `--force` recovery path.
- `packages/cli/scripts/e2e-encrypt.sh` — drop the `advance` step;
  use `--confirm-dual-writes-deployed` in the non-interactive backfill.
- `.changeset/encryption-migrations.md` — describe the new shape;
  also fix the package name in the frontmatter (`@cipherstash/cli` →
  `stash` post-rename).

Out of scope: `stash encrypt update` for re-encrypting an already
cut-over column when the EQL configuration changes — handled in the
next change.
… drizzle lifecycle worked example

Three coupled changes that fix the failure mode reported on the spike
project where the agent stopped at the "stop and ask" rule because the
setup prompt didn't route it toward `stash encrypt` for live-data
columns.

setup-prompt.ts is rewritten from "imperative TODO list" to
"orient and ask". The new prompt:
- Tells the agent its FIRST response is a routing question, not an
  edit. The agent must orient the user with the two paths and ask
  which they want before touching anything.
- Names every installed skill with a one-line purpose so the user
  can see what's available.
- Describes path 1 (new encrypted column from scratch) and path 3
  (migrate an existing populated column via `stash encrypt
  backfill/cutover/drop`) explicitly, with the right CLI commands
  inline (runner-aware, per package manager).
- Names path 2 (convert in place) as not supported and explains
  why, so the agent routes to path 3 if the user asks for it.
- Preserves the stop-and-ask invariants but ties them to the
  unsupported-path 2 case.

build-schema.ts no longer prompts the user to pick which columns
to encrypt during init. Deciding which columns to encrypt is the
user's choice in conversation with their agent — not a question to
answer at init time, because path 1 and path 3 need different
treatment and init can't tell which the user wants. Init now
always writes a placeholder encryption client; introspection-based
codegen is removed.

utils.ts:generatePlaceholderClient is rewritten. Used to synthesise
a fully-formed `pgTable('users', { email, name })` mirror of the
DB. That left users with two parallel definitions (real schema
file + synthesised stub) that the agent had to reconcile blind.
The new placeholder is a heavily-commented file showing the
encryption-client patterns inline (path 1 and path 3 examples
for both Drizzle and generic), exporting `Encryption({ schemas:
[] })` so the encrypt commands surface a clear error pointing
back at this file. The agent's job is to declare encrypted
columns directly in the user's real schema files and update this
file to reference them.

write-context.ts:buildContextFile no longer throws on empty
schemas. Init's `state.schemas` is now `[]` after the refactor.

skills/stash-drizzle/SKILL.md gets a new "Migrating an Existing
Column to Encrypted" section with a phase-by-phase Drizzle worked
example: schema-add (encryptedType twin column, nullable, generate
+ apply migration), dual-write (insert/update code change),
backfill (`stash encrypt backfill`), cutover (rename swap, switch
schema and read paths), drop (generated migration removes
plaintext). Mirrors the lifecycle vocabulary in stash-encryption.

setup-prompt tests rewritten to match the new orient-and-route
shape (12 tests). 165 unit tests pass; biome clean.

Out of scope (follow-ups tracked in conversation):
- stash-supabase skill needs the same worked example.
- Public docs repo needs the migration tool + lifecycle covered.
- Wizard's gateway prompt template needs the orient-and-route
  vocabulary update.
- `stash encrypt update` for re-encrypting after EQL config
  changes.
- `loadStashConfig` re-export from @cipherstash/migrate.
- AGENTS.md handoff validation in Cursor / Windsurf.
- Richer Codex skill structure (`scripts/`, `references/`).
…invariant

Two fixes from a smoke-test run on the supatest spike project.

Fix 1: backfill / drop never wrote `.cipherstash/migrations.json`

The manifest was modelled as the *intent* leg of the three-source
state model (intent in repo, EQL config in DB, runtime state in
cs_migrations) but no CLI command actually wrote the file —
`writeManifest` was exported from @cipherstash/migrate but never
called from the CLI. Plan and status emitted "no manifest" forever
and the drift-detection features were dead code.

Wired:
- New `upsertManifestColumn(table, column, cwd?)` in
  @cipherstash/migrate. Reads the existing manifest (or starts
  fresh), replaces the matching column entry under the named table,
  writes back. Preserves entries for other columns / other tables.
- New `setManifestTargetPhase(table, columnName, phase, cwd?)` —
  no-op when the column isn't tracked yet, used by `drop` to bump
  intent forward.
- `backfill.ts` calls `upsertManifestColumn` after the dual-write
  confirmation. The entry is derived from the encryption client's
  EncryptedTable schema (cast_as → manifest.castAs, configured
  index kinds → manifest.indexes); pkColumn flows through when
  the user passed `--pk-column`. targetPhase defaults to
  `cut-over`. Idempotent — re-runs replace the same entry.
- `drop.ts` calls `setManifestTargetPhase(... 'dropped')` after
  the migration file is written, so the manifest reflects the
  user's commitment to fully removing the plaintext column.

Cutover doesn't touch the manifest (current state lives in
cs_migrations; the manifest is only intent).

10 new tests in @cipherstash/migrate covering upsert idempotence,
target-phase update, and the no-op-when-untracked path.
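
A minimal sketch of the upsert and target-phase semantics described above, written as pure functions over an in-memory manifest. The `Manifest` / `ManifestColumn` shapes and the function names `upsertColumn` / `setTargetPhase` are illustrative assumptions, not the actual `@cipherstash/migrate` types; the real helpers also read and write `.cipherstash/migrations.json`.

```typescript
// Hypothetical manifest shape; the real schema in @cipherstash/migrate may differ.
type Phase = "cut-over" | "dropped";

interface ManifestColumn {
  column: string;
  castAs: string;
  indexes: string[];
  pkColumn?: string;
  targetPhase: Phase;
}

interface Manifest {
  tables: Record<string, ManifestColumn[]>;
}

// Replace the matching column entry under `table` (or append it),
// leaving other columns and other tables untouched.
function upsertColumn(manifest: Manifest, table: string, entry: ManifestColumn): Manifest {
  const existing = manifest.tables[table] ?? [];
  const found = existing.some((c) => c.column === entry.column);
  const next = found
    ? existing.map((c) => (c.column === entry.column ? entry : c))
    : [...existing, entry];
  return { tables: { ...manifest.tables, [table]: next } };
}

// Bump intent forward; returns the same manifest when the column is untracked.
function setTargetPhase(manifest: Manifest, table: string, column: string, phase: Phase): Manifest {
  const existing = manifest.tables[table];
  if (!existing || !existing.some((c) => c.column === column)) return manifest;
  const next = existing.map((c) => (c.column === column ? { ...c, targetPhase: phase } : c));
  return { tables: { ...manifest.tables, [table]: next } };
}
```

Running `upsertColumn` twice with the same entry yields an identical manifest, which is the idempotence the new tests are described as covering.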

Fix 2: bundler-exclusion invariant promoted

The skill did mention that `@cipherstash/stack` must be excluded from
bundling (it wraps a native FFI module), but only in a single line
buried in Installation. Claude missed it on the smoke test, then hit
the runtime crash.

- AGENTS-doctrine.md gains it as invariant #7 — the seventh
  "never break this" rule, alongside never-log-plaintext and
  jsonb-null-on-creation. Concrete config snippets for Next.js,
  webpack, esbuild, and Vite SSR included so the agent doesn't have
  to guess the field names.
- stash-encryption skill's Installation section gets a more
  prominent callout (`> [!IMPORTANT]`) plus the same per-bundler
  snippets.
- setup-prompt.ts adds it to path 1 step 1 ("if this is the first
  encrypted column in the project, configure the bundler exclusion
  first") and to path 3 schema-add as the same precondition.

The exclusion now appears at every layer the agent reads: doctrine,
skill, and project-specific action prompt. Test asserts
`serverExternalPackages` and `@cipherstash/protect-ffi` appear in
the rendered prompt.
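
For illustration, a Next.js exclusion might look like the sketch below. `serverExternalPackages` is the Next.js config field named in the test assertion above; the exact package list is an assumption based on the two package names mentioned in this commit, and the snippets shipped in the doctrine and skill may differ.

```typescript
// next.config.ts (sketch). The package list is an assumption drawn from
// the names mentioned in this PR, not copied from the shipped doctrine.
const nextConfig = {
  // Keep the native FFI module out of the server bundle so it is
  // resolved from node_modules at runtime instead of being inlined.
  serverExternalPackages: ["@cipherstash/stack", "@cipherstash/protect-ffi"],
};

export default nextConfig;
```
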
`stash encrypt cutover` failed with "No pending configuration exists to
encrypt" because `stash db push` wrote configs straight to `active`,
skipping `pending` entirely. EQL's `rename_encrypted_columns()` requires
a pending row to compute rename targets, so the documented six-phase
lifecycle was unrunnable end-to-end. Reported in detail by Dan after a
spike on the supatest project.

Aligned the SDK with the EQL extension's native pending → encrypting →
active state machine (the same flow Proxy uses for hot-reloads):

- `db push` now writes the new config as `pending` when an `active`
  config already exists. First push (no active config) still writes
  directly to `active` since there's nothing to rename. Prints a clear
  "next step" note routing the user to the appropriate finalisation
  command.

- `cutover` now runs the full lifecycle in one transaction:
  rename_encrypted_columns → migrate_config → activate_config. Pending
  is promoted to active alongside the physical rename. Verifies pending
  exists upfront with a clear error if not.

- New `stash db activate` command for non-rename activations (path 1:
  brand-new encrypted column added to a project that already has an
  active config). Chains migrate_config + activate_config without any
  rename. Use after `db push` when no `<col>_encrypted` twin needs
  swapping.

- `@cipherstash/migrate` exports new `migrateConfig`,
  `activateConfig`, and `discardPendingConfig` wrappers around the
  corresponding EQL functions. The `renameEncryptedColumns` docstring
  was wrong — it claimed idempotency when no renames are pending, but
  the underlying SQL throws if there's no pending row at all. Fixed.

- `setup-prompt.ts` updated: path 1 now includes `db push → db
  activate`; path 3 walks through the schema flip + re-push between
  backfill and cutover so the agent knows to update the pending row.

- Skill updates (stash-cli, stash-drizzle, stash-encryption) document
  the new pending/active flow explicitly. The stash-cli `db push`
  section gained a decision table for "what to run next" based on
  whether the change is additive or includes a rename.

After this, the user's acceptance criterion holds: a clean `init →
schema edit → db push → encrypt backfill → encrypt cutover` flow ends
with the rename applied, the prior config marked inactive, the new
config active, and a `cut_over` event in `cs_migrations`.
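
The push decision and the cutover transaction described above can be sketched as follows. This is an illustration under assumptions, not the shipped code: `SqlClient` is a hypothetical stand-in for a Postgres client, the `state = 'pending'` filter assumes the configuration table exposes a `state` column, and `migrate_config` / `activate_config` are assumed to live in the `eql_v2` schema alongside `rename_encrypted_columns`.

```typescript
// Hypothetical minimal client interface (a `pg` Client could be adapted to it).
interface SqlClient {
  query(sql: string): Promise<{ rows: Array<Record<string, unknown>> }>;
}

// `db push`: the first push goes straight to active; later pushes stage a pending row.
function pushTargetState(hasActiveConfig: boolean): "active" | "pending" {
  return hasActiveConfig ? "pending" : "active";
}

// `encrypt cutover`: verify a pending row exists, then rename and promote atomically.
async function cutover(client: SqlClient): Promise<void> {
  const pending = await client.query(
    "SELECT 1 FROM public.eql_v2_configuration WHERE state = 'pending' LIMIT 1",
  );
  if (pending.rows.length === 0) {
    throw new Error("No pending EQL config found. Run `stash db push` first.");
  }
  await client.query("BEGIN");
  try {
    await client.query("SELECT eql_v2.rename_encrypted_columns()");
    await client.query("SELECT eql_v2.migrate_config()");
    await client.query("SELECT eql_v2.activate_config()");
    await client.query("COMMIT");
  } catch (err) {
    // Any failure leaves both the column names and the config state untouched.
    await client.query("ROLLBACK");
    throw err;
  }
}
```

`stash db activate` would follow the same shape minus the rename call.
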
Three issues from the spike's lifecycle smoke-test, all in the
encrypt CLI. Bundled into one commit because they touch adjacent
files.

Issue 1: `encrypt drop` wrote a self-named timestamped migration
(`20260504112456_drop_*.sql`) that drizzle-kit migrate refused to
pick up: it had no journal entry and the wrong prefix. `db install
--drizzle` already handles the same problem correctly by shelling out
to `drizzle-kit generate --custom`.

Fix: detect drizzle and route through a new
`packages/cli/src/commands/encrypt/drizzle-helper.ts` that wraps
`drizzle-kit generate --custom --name=...`, locates the generated
file, and writes the drop SQL into it. The migration now lands
with a journal entry; `drizzle-kit migrate` applies it like any
other migration. Non-drizzle projects keep the timestamped-file
fallback (Prisma / raw-SQL paths planned).

Issue 2: `encrypt cutover` ran `eql_v2.rename_encrypted_columns()`
live and never told drizzle. Drizzle's `meta/_journal.json` and
snapshot stayed pinned to the pre-rename shape, so the next
`drizzle-kit generate` against the source produced a confused
diff trying to recreate the old layout.

Fix: after cutover succeeds (transaction committed), scaffold a
follow-up custom drizzle migration containing idempotent
`ALTER TABLE … RENAME COLUMN` statements wrapped in a `DO` block
that checks whether `<col>_encrypted` still exists. On the source
DB the rename already ran, so the block is a no-op and Drizzle's
journal still records the migration; on a fresh restore the block
performs the rename. Same file, both behaviours, reproducible.
Non-drizzle projects skip the resync step. If scaffolding fails, the
CLI only logs a warning.
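
A rough sketch of how such an idempotent rename block could be generated. The function name `buildRenameSql`, the `<col>_plaintext` name for the displaced column, and the default `public` schema are assumptions for illustration; the shipped `buildRenameMigrationSql` may differ.

```typescript
// Quote a Postgres identifier, doubling any embedded double quotes.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Escape a value for use inside a single-quoted SQL string literal.
function quoteLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

function buildRenameSql(table: string, column: string): string {
  // Accept either `users` or `public.users`; assume `public` when unqualified.
  const dot = table.indexOf(".");
  const schema = dot === -1 ? "public" : table.slice(0, dot);
  const name = dot === -1 ? table : table.slice(dot + 1);
  const target = `${quoteIdent(schema)}.${quoteIdent(name)}`;
  const encrypted = `${column}_encrypted`;
  const plaintext = `${column}_plaintext`; // assumed name for the old column
  return [
    "DO $$",
    "BEGIN",
    // Only rename while the `<col>_encrypted` twin still exists.
    "  IF EXISTS (",
    "    SELECT 1 FROM information_schema.columns",
    `    WHERE table_schema = ${quoteLiteral(schema)}`,
    `      AND table_name = ${quoteLiteral(name)}`,
    `      AND column_name = ${quoteLiteral(encrypted)}`,
    "  ) THEN",
    `    ALTER TABLE ${target} RENAME COLUMN ${quoteIdent(column)} TO ${quoteIdent(plaintext)};`,
    `    ALTER TABLE ${target} RENAME COLUMN ${quoteIdent(encrypted)} TO ${quoteIdent(column)};`,
    "  END IF;",
    "END $$;",
  ].join("\n");
}
```

On a database where cutover already ran, the `IF EXISTS` probe finds no `<col>_encrypted` column and the block does nothing; on a fresh restore it performs both renames.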

Issue 3: `encrypt status` rendered `rowsProcessed/rowsTotal (pct%)`
uniformly across every phase. The same fraction means different
things at different points in the lifecycle, and `0/0 (100%)` for
a `backfilled` column that needed no encrypting reads as
nonsense.

Fix: phase-aware framing for the PROGRESS column. `schema-added`
shows `—`. `dual-writing` shows `(awaiting backfill)`. `backfilling`
keeps the fraction. `backfilled` / `cut-over` / `dropped` show a
plain completion marker instead of a degenerate ratio. Same data,
phase-appropriate label.
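
The phase-aware framing can be sketched roughly as below. The phase names and the two placeholder strings come from the description above; the `"complete"` marker text and the function name `formatProgress` are assumptions, since the commit does not spell out the exact completion label.

```typescript
type Phase =
  | "schema-added"
  | "dual-writing"
  | "backfilling"
  | "backfilled"
  | "cut-over"
  | "dropped";

function formatProgress(phase: Phase, rowsProcessed: number, rowsTotal: number): string {
  switch (phase) {
    case "schema-added":
      return "\u2014"; // dash placeholder: nothing to count yet
    case "dual-writing":
      return "(awaiting backfill)";
    case "backfilling": {
      // The fraction is only meaningful while rows are actually being encrypted.
      const pct = rowsTotal === 0 ? 0 : Math.floor((rowsProcessed / rowsTotal) * 100);
      return `${rowsProcessed}/${rowsTotal} (${pct}%)`;
    }
    default:
      // backfilled / cut-over / dropped: avoid degenerate ratios like 0/0 (100%).
      return "complete"; // assumed label
  }
}
```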

Wire `--migrations-dir <path>` through to cutover for projects with
non-default drizzle out dirs.

166 tests pass; biome clean.

Coverage check during dual-writing (the bug report's bonus
suggestion — show "rows-with-both-columns / total" rather than
just "awaiting backfill") needs a live SELECT against the user's
table, not just the cs_migrations data we already have. Tracked
as a follow-up; today's status surfaces phase awareness without
new queries.
The post-install panel still recommended `stash wizard` as the
headline path and showed a hand-rolled `client.encryptModel(record,
table).run()` snippet — both stale post-#395 and post-#357.

Replace with brief guidance that bridges install → agent handoff:
two canonical "ask your agent X" phrasings (one per real path,
migrate-existing vs add-new), a short note that the agent will do
the schema edits and run the lifecycle commands, and a pointer at
the skills + public docs.

Same panel runs from any `db install` invocation — including the
one init triggers in install-eql — so the new copy makes sense
both during init's handoff and when `db install` is run
standalone (where "your agent" can be any agent the user has
open, or someone reading the lifecycle commands directly).
The orient-and-route prompt the agent reads after `stash init` referred
to the two supported flows as "Path 1" and "Path 3" (with "Path 2" as
the not-supported in-place case). The agent then surfaced that
numbering verbatim to end users, who have no context for it — the gaps
in the numbering came from an internal conversation about the
scenario taxonomy, not anything the user should care about.

Replace the labels with the two intended actions ("Add a new encrypted
column" and "Migrate an existing column to encrypted"), and reframe
the not-supported case as a brief "Converting in place is not
supported" callout rather than a third numbered path. The migrate
flow now also opens with a one-line note on why it's staged (parallel
twin + dual-write + rename) so the user has the model before reading
the steps.

Tests updated to assert the new headings and the staged-twin mention.
@coderdan coderdan force-pushed the encryption-migrations branch from 194875a to 552bc58 Compare May 4, 2026 13:28
Aggregates the actionable items from CodeRabbit's review of #357. None
introduce new behaviour; each closes a specific defect.

Security / correctness:

- drizzle-helper: replace `execSync` with `spawnSync` so caller-provided
  migration names can't escape into the shell.
- migrate/backfill: drop the value preview from the leak-guard error
  message — the path that hits this branch is precisely where the
  encryption client passed plaintext through, so the value is sensitive.
- cli/backfill: emit a generic catch-block error instead of bubbling
  `error.message` (same plaintext-leak risk via upstream library
  exception text).
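
To illustrate the first item: passing arguments as an array means no shell ever parses the migration name. The sketch below injects the process runner so it stays self-contained; `RunCommand`, `drizzleGenerateArgs`, and the use of `npx` to locate drizzle-kit are assumptions, not the helper's actual code.

```typescript
// Shape of the single call needed here; Node's `spawnSync` can be adapted to it.
type RunCommand = (command: string, args: string[]) => { status: number | null };

// Each element becomes one argv entry, so characters like `;` or `$(...)`
// in `name` reach drizzle-kit as literal text rather than shell syntax.
function drizzleGenerateArgs(name: string): string[] {
  return ["drizzle-kit", "generate", "--custom", `--name=${name}`];
}

function generateCustomMigration(name: string, run: RunCommand): void {
  const result = run("npx", drizzleGenerateArgs(name));
  if (result.status !== 0) {
    throw new Error(`drizzle-kit generate exited with status ${String(result.status)}`);
  }
}
```

In Node this could be wired up as `generateCustomMigration(name, (cmd, args) => spawnSync(cmd, args, { stdio: "inherit" }))` with `spawnSync` imported from `node:child_process`. By contrast, interpolating `name` into an `execSync` command string hands it to the shell for parsing.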

Schema-qualified table names:

- requireTable: fall back to unqualified-name lookup so
  `--table public.users` resolves to a schema whose `tableName === 'users'`.
- status: use lastIndexOf('.') when splitting `${table}.${column}` keys
  so schema-qualified table names don't drop the column segment.
- cutover.buildRenameMigrationSql + drop.dropSql: split `schema.table`
  into separate quoted identifiers so the generated `ALTER TABLE` is
  valid and the `information_schema` probe matches.
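
A small sketch of the key-splitting and fallback-lookup fixes. The helper names are hypothetical. With a naive `key.split(".")`, a key like `public.users.email` destructured as `[table, column]` yields table `public` and column `users`, silently dropping `email`.

```typescript
// Split "<table>.<column>" on the LAST dot so a schema-qualified table
// name such as "public.users" stays intact.
function splitStatusKey(key: string): { table: string; column: string } {
  const i = key.lastIndexOf(".");
  if (i === -1) throw new Error(`Expected "<table>.<column>", got "${key}"`);
  return { table: key.slice(0, i), column: key.slice(i + 1) };
}

// Strip an optional schema prefix, for the unqualified-name fallback lookup.
// lastIndexOf returns -1 when there is no dot, so slice(0) keeps the whole name.
function unqualifiedName(table: string): string {
  return table.slice(table.lastIndexOf(".") + 1);
}
```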

EQL state / SQL:

- push, activate, cutover, eql.discardPendingConfig, status: schema-
  qualify `eql_v2_configuration` as `public.eql_v2_configuration` so a
  custom `search_path` doesn't shadow or hide the table.
- countEncryptedWithActiveConfig: return `bigint` instead of `number`. A
  BIGINT row count converted to `number` silently loses precision past
  `Number.MAX_SAFE_INTEGER`, on exactly the large tables this is meant
  to sanity-check.
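
The precision issue behind the `bigint` change can be shown in a few lines. The `parseRowCount` helper is hypothetical; it assumes the driver hands back the count as a decimal string, which is how `pg` returns BIGINT values by default.

```typescript
// Parse a BIGINT count delivered as a decimal string without going
// through a double-precision `number`.
function parseRowCount(raw: string): bigint {
  if (!/^\d+$/.test(raw)) throw new Error(`Not a non-negative integer: ${raw}`);
  return BigInt(raw);
}

const rawCount = "9007199254740993"; // Number.MAX_SAFE_INTEGER + 2
const asNumber = Number(rawCount); // rounds to the nearest representable double
const asBigInt = parseRowCount(rawCount); // exact

console.log(String(asNumber)); // "9007199254740992" (off by one)
console.log(asBigInt.toString()); // "9007199254740993"
```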

Process / lifecycle hygiene:

- Every CLI handler that did `process.exit(1)` inside a try/catch with
  an async `finally`: replace with an `exitCode` flag and a single
  `if (exitCode) process.exit(exitCode)` after the finally, so
  `client.end()` actually runs on the error path.
- backfill: move `pool.connect()` inside the same try/finally that
  registers the SIGINT handlers, so handlers/pool are cleaned up if
  connection acquisition fails.
- cutover: catch proxy `reloadConfig()` failures separately and warn —
  reload runs after the cutover transaction commits, so a transient
  Proxy connectivity blip shouldn't make the outer catch report the
  whole cutover as failed.
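
The exit-code pattern from the first item, as a minimal sketch. `Closable` and `runHandler` are hypothetical names; the point is that the handler records the failure and lets the async `finally` finish before anything calls `process.exit`.

```typescript
// Hypothetical stand-in for a database client with an async close.
interface Closable {
  end(): Promise<void>;
}

async function runHandler(client: Closable, work: () => Promise<void>): Promise<number> {
  let exitCode = 0;
  try {
    await work();
  } catch {
    // Generic message only; upstream error text may contain plaintext.
    console.error("Command failed.");
    exitCode = 1; // record the failure instead of calling process.exit() here
  } finally {
    // Runs to completion on both paths because nothing has exited yet.
    await client.end();
  }
  return exitCode;
}
```

A CLI entry point would then do `const code = await runHandler(client, work); if (code) process.exit(code);`, so the exit happens only after the connection has been closed.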

Schema validation:

- manifest: replace `castAs: z.string()` with a `z.enum(...)` of the
  supported EQL types so a typo (`timestampz`, `strnig`) fails at
  manifest read instead of at use-time.

Doc + skill accuracy:

- setup-prompt: drop the incorrect "no read-path code change" claim
  from the migrate-existing-column flow. Add an explicit step 6
  pointing at `decryptModel` / `encryptedSupabase` — without it the
  agent ships read paths returning raw ciphertext to end users.
- stash-drizzle: `createProtectOperators` → `createEncryptionOperators`
  (the only correct name; every other reference in the file already
  uses it).
- stash-encryption: fix the `runBackfill` example's parameter names
  (`table` → `tableName`, `column` → `schemaColumnKey/plaintextColumn/
  encryptedColumn`, `client` → `encryptionClient`, plus the missing
  `tableSchema` and `pkColumn`).
- design doc: cutover section now reflects the shipped flow
  (`migrate_config()` + `activate_config()` inside the transaction
  alongside the rename), and the read-path claim matches the skill.

Test infrastructure:

- backfill.integration.test.ts: import `dotenv/config` so PG_TEST_URL
  loads from `.env`; add `dotenv` to devDependencies.

Deferred (separate follow-up entry):

- The cutover preflight only verifies one column's phase, but the
  underlying EQL function promotes the whole pending config in one
  call — already tracked as item 3.8 in the working follow-ups doc.
…grations

These files belong on a separate branch — they were swept into the
linting-fix commit by mistake and broke CI's frozen-lockfile install
because pnpm-lock.yaml didn't reflect the new package.json deps.
@calvinbrewer calvinbrewer merged commit 3627535 into main May 4, 2026
6 checks passed
@calvinbrewer calvinbrewer deleted the encryption-migrations branch May 4, 2026 16:50