adapter/ddl-perf: eliminate O(n^2) operation replay in DDL transactions #35303

Draft

aljoscha wants to merge 4 commits into MaterializeInc:main from aljoscha:push-vpoqwtsppwyq
Conversation

@aljoscha (Contributor) commented Mar 2, 2026

DDL transactions (e.g., `BEGIN; CREATE TABLE FROM SOURCE t1 ...; CREATE
TABLE FROM SOURCE t2 ...; COMMIT`) replayed ALL previously executed
operations for every new statement. For N statements, total work was
1 + 2 + ... + N = O(N²).

Root cause: `catalog_transact_with_ddl_transaction` built
`all_ops = txn_ops + new_ops + TransactionDryRun` and processed everything
from scratch through the full `catalog.transact()` pipeline each time.

Fix: instead of replaying all ops, process only the NEW ops against the
accumulated `CatalogState` from the previous dry run. A new method,
`transact_incremental_dry_run`, opens a fresh storage transaction, advances
the OID allocator past previously allocated OIDs, runs `transact_inner`
with only the new ops against the accumulated state, and drops the
transaction without committing. This reduces per-statement work from
O(N) to O(1), making the overall transaction O(N) instead of O(N²).
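The shape of the change can be sketched with a toy model. Everything below is an illustrative stand-in, not the real adapter code: `apply` plays the role of one catalog op application, and the two functions contrast the old full-replay strategy with the new incremental one.

```rust
// Toy model contrasting full replay with incremental application.
// `apply` stands in for applying one catalog op to a state.
fn apply(state: &mut Vec<u32>, op: u32) {
    state.push(op);
}

/// Old behavior: each new statement replays ALL prior ops from scratch,
/// so statement n costs n applications: 1 + 2 + ... + N = O(N^2) total.
fn replay_all(ops: &[u32]) -> usize {
    let mut applications = 0;
    for n in 1..=ops.len() {
        let mut state = Vec::new(); // fresh dry run for the n-th statement
        for &op in &ops[..n] {
            apply(&mut state, op);
            applications += 1;
        }
    }
    applications
}

/// New behavior: keep the accumulated state and apply only the new op,
/// so each statement costs one application: O(N) total.
fn incremental(ops: &[u32]) -> usize {
    let mut state = Vec::new();
    let mut applications = 0;
    for &op in ops {
        apply(&mut state, op);
        applications += 1;
    }
    applications
}

fn main() {
    let ops: Vec<u32> = (0..500).collect();
    // 500 statements: 500 * 501 / 2 = 125_250 applications vs. 500.
    assert_eq!(replay_all(&ops), 125_250);
    assert_eq!(incremental(&ops), 500);
    println!("replay: {}, incremental: {}", replay_all(&ops), incremental(&ops));
}
```

Both strategies end with the same final state; the difference is purely in how many op applications it takes to get there.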

## Experiments

Setup: optimized build, PostgreSQL source with ~2500 upstream tables,
~6900 existing objects in Materialize. Each batch runs
`CREATE TABLE ... FROM SOURCE` N times inside a single `BEGIN`/`COMMIT`
DDL transaction. Results averaged over 2 repetitions per batch size.

**Per-table cost (ms/table):**

| batch_size | Baseline | With Fix | Speedup |
|------------|----------|----------|---------|
| 1          | 273      | 237      | 1.15x   |
| 5          | 161      | 121      | 1.33x   |
| 10         | 147      | 106      | 1.39x   |
| 25         | 146      | 98       | 1.49x   |
| 50         | 131      | 93       | 1.41x   |
| 100        | 154      | 86       | 1.79x   |
| 200        | 225      | 88       | 2.56x   |
| 300        | 296      | 90       | 3.29x   |
| 500        | 451      | 90       | 5.01x   |

**Total transaction time (ms):**

| batch_size | Baseline   | With Fix  | Speedup |
|------------|------------|-----------|---------|
| 1          | 273        | 237       | 1.15x   |
| 5          | 811        | 606       | 1.34x   |
| 10         | 1,477      | 1,064     | 1.39x   |
| 25         | 3,674      | 2,463     | 1.49x   |
| 50         | 6,597      | 4,694     | 1.41x   |
| 100        | 15,515     | 8,647     | 1.79x   |
| 200        | 45,110     | 17,758    | 2.54x   |
| 300        | 89,073     | 27,408    | 3.25x   |
| 500        | 225,734    | 45,327    | 4.98x   |

Baseline per-table cost grows with batch size due to O(N²) replay
(131ms at batch=50 → 451ms at batch=500, a 3.4x increase). With the
fix, per-table cost is roughly constant at ~86–98ms regardless of batch
size. At batch=500 this is a 5x total speedup (226s → 45s).
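The baseline numbers fit the quadratic model well. As a back-of-the-envelope check (the cost model and fitted constants below are my own reading of the table, not measurements from this PR): if statement i costs `f + r*i` (fixed work plus replaying i earlier ops), per-table cost at batch size N is `f + r*(N+1)/2`, i.e. linear in N.

```rust
// Fit f (fixed per-statement cost) and r (per-replayed-op cost) from
// two baseline rows, then check the model against a third row.
// Constants here are derived from the table above, purely illustrative.
fn per_table(f: f64, r: f64, n: f64) -> f64 {
    f + r * (n + 1.0) / 2.0
}

fn main() {
    // Solve for (f, r) using the batch=50 (131 ms) and batch=500 (451 ms) rows.
    let (n1, c1) = (50.0_f64, 131.0_f64);
    let (n2, c2) = (500.0_f64, 451.0_f64);
    let r = (c2 - c1) / ((n2 + 1.0) / 2.0 - (n1 + 1.0) / 2.0); // ~1.42 ms/op
    let f = c1 - r * (n1 + 1.0) / 2.0; // ~95 ms

    // The fit lands close to the measured batch=300 row (296 ms/table),
    // and f is consistent with the fixed ~86-98 ms/table seen with the fix.
    let pred_300 = per_table(f, r, 300.0);
    assert!((pred_300 - 296.0).abs() < 15.0);
    println!("r = {r:.2} ms/op, f = {f:.1} ms, predicted(300) = {pred_300:.0} ms");
}
```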

@github-actions bot commented Mar 2, 2026

Thanks for opening this PR! Here are a few tips to help make the review process smooth for everyone.

PR title guidelines

  • Use imperative mood: "Fix X" not "Fixed X" or "Fixes X"
  • Be specific: "Fix panic in catalog sync when controller restarts" not "Fix bug" or "Update catalog code"
  • Prefix with area if helpful: `compute:`, `storage:`, `adapter:`, `sql:`

Pre-merge checklist

  • The PR title is descriptive and will make sense in the git log.
  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).

aljoscha force-pushed the push-vpoqwtsppwyq branch from 416bdac to 33c249e on March 3, 2026 at 07:03

aljoscha force-pushed the push-vpoqwtsppwyq branch 2 times, most recently from 144c60f to 4210d74 on March 3, 2026 at 12:42
The incremental dry run created a fresh durable transaction for each
statement but processed ops against an accumulated `CatalogState` from
previous dry runs. For the 2nd+ statement, the tx (reflecting durable
storage) and the state (reflecting accumulated changes) were out of
sync, causing "retraction does not match existing value" panics when
applying diffs.

Fix: after each dry run, export the transaction's current state as a
`Snapshot` (`Transaction::current_snapshot`). On the next dry run,
initialize the transaction from this saved snapshot via
`DurableCatalogState::transaction_from_snapshot`, keeping tx and state
in sync. This preserves the O(1)-per-statement optimization.
aljoscha force-pushed the push-vpoqwtsppwyq branch from 4210d74 to 28500ee on March 3, 2026 at 13:03
@aljoscha (Contributor, Author) commented Mar 3, 2026
@def- could you maybe take a peek at the nightlies there? Feels like they're all unrelated to this change, but I didn't want to just ignore them.
